@ryanaguilar
Contributor

feature/sedm: Add SEDM (Special Education Data Model) staging models

Description & motivation

This PR introduces a complete set of staging models for the Special Education Data Model (SEDM) domain, following the established patterns for Ed-Fi 3.x base and stage models. The implementation includes 7 base models and 7 corresponding stage models for IEP (Individualized Education Program) related resources, enabling staging and analysis of special education data.

All models follow the established conventions:

  • Base models: Extract and rename columns from raw sources, preserve metadata (timestamps, tenant codes, file names), handle descriptors, and maintain reference objects. Base models are materialized as views.
  • Stage models: Generate surrogate keys using gen_skey() macro and dbt_utils.generate_surrogate_key(), deduplicate records using dbt_utils.deduplicate(), filter deleted records, and extract extensions. Stage models are materialized as tables.

The SEDM domain is enabled via the feature flag src:domain:sedm:enabled.

PR Merge Priority:

  • Low
  • Medium
  • High

Changes to existing files:

  • dbt_project.yml: Added SEDM domain configuration with base models materialized as views and stage models as tables. This follows the same pattern as existing edfi_3 domains, ensuring consistency across the project structure.

  • macros/gen_skey.sql: Added two new key definitions to support SEDM models:

    • k_student_iep: Annualized key for a student IEP. Includes iep_finalized_date, iep_servicing_ed_org_id, student_iep_association_id, and student_unique_id. This key is used across multiple SEDM models to link records to their IEP.
    • k_student_iep_service_prescription: Non-annualized key for IEP service prescriptions. Includes iep_service_id, iep_servicing_ed_org_id, service_prescription, service_prescription_date, student_iep_association_id, and student_unique_id. This key enables proper linking of service prescription records.
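To make the annualized versus non-annualized distinction concrete, here is a minimal Python sketch of how a key registry like this could behave. It is an illustration, not the gen_skey() macro itself: the names SKEY_DEFS and build_skey are hypothetical, and the assumptions that every key is prefixed with tenant_code, that api_year is added only for annualized keys, and that the parts are MD5-hashed are mine, not taken from the macro source.

```python
import hashlib

# Hypothetical registry mirroring the two key definitions described above.
# The real definitions live in macros/gen_skey.sql; this layout is an assumption.
SKEY_DEFS = {
    "k_student_iep": {
        "columns": [
            "iep_finalized_date",
            "iep_servicing_ed_org_id",
            "student_iep_association_id",
            "student_unique_id",
        ],
        "annualize": True,
    },
    "k_student_iep_service_prescription": {
        "columns": [
            "iep_service_id",
            "iep_servicing_ed_org_id",
            "service_prescription",
            "service_prescription_date",
            "student_iep_association_id",
            "student_unique_id",
        ],
        "annualize": False,
    },
}


def build_skey(key_name, row):
    """Hash the configured columns of `row` into a single key string."""
    definition = SKEY_DEFS[key_name]
    parts = [row["tenant_code"]]  # assumption: keys are scoped to a tenant
    if definition["annualize"]:
        parts.append(row["api_year"])  # annualized keys also vary by year
    parts.extend(row[column] for column in definition["columns"])
    joined = "-".join(str(part) for part in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()
```

Under these assumptions, the same IEP loaded from two API years produces two different k_student_iep values, while k_student_iep_service_prescription stays the same across years.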

New files created:

Base Models (Views)

  • base_sedm__idea_events: Extracts IDEA (Individuals with Disabilities Education Act) events from the idea_events source. Includes event begin/end dates, event narrative, event reason and compliance descriptors, and references to students and education organizations.

  • base_sedm__student_iep_accommodations: Extracts student IEP accommodation collections from the student_iep_accommodations source. Includes key columns for student IEP association ID, student unique ID, and IEP servicing education organization ID. Preserves the accommodations list for further processing in downstream models.

  • base_sedm__student_iep_disabilities: Extracts student IEP disability collections from the student_iep_disability_collections source. Includes key columns for student IEP association ID, student unique ID, and IEP servicing education organization ID. Preserves the disabilities list for further processing.

  • base_sedm__student_iep_goals: Extracts student IEP goals from the student_iep_goals source. Includes goal ID, goal details, achievement period dates, goal descriptor, and references to students, IEPs, and IDEA events.

  • base_sedm__student_iep_service_deliveries: Extracts student IEP service deliveries from the student_iep_service_deliveries source. Includes service delivery ID, service delivery date and descriptor, service provider information, references to service prescriptions, and lists of service providers and external service providers.

  • base_sedm__student_iep_service_prescriptions: Extracts student IEP service prescriptions from the student_iep_service_prescriptions source. Includes service ID, prescription date and descriptor, begin/end dates, duration and frequency information, service location type, and references to students, IEPs, staff, and IDEA events.

  • base_sedm__student_ieps: Extracts student IEPs (main IEP records) from the student_ieps source. Includes IEP association ID, finalized date, begin/end dates, amended date, status and reason exited descriptors, special education setting descriptor, medically fragile and multiply disabled flags, hours per week information, and references to education organizations, students, and special education program associations.

Stage Models (Tables)

  • stg_sedm__student_iep_accommodations: Creates deduplicated and keyed version of IEP accommodations. Generates primary surrogate key k_student_iep_accommodation_collection using tenant_code, api_year, student_iep_association_id, student_unique_id, and iep_servicing_ed_org_id. Uses gen_skey() to generate foreign keys for k_student, k_student_xyear, and k_student_iep. Deduplicates by primary key, keeping the most recent version based on last_modified_timestamp and pull_timestamp. Filters out deleted records.

  • stg_sedm__student_iep_disabilities: Creates deduplicated and keyed version of IEP disabilities. Generates primary surrogate key k_student_iep_disability_collection using tenant_code, api_year, student_iep_association_id, student_unique_id, and iep_servicing_ed_org_id. Uses gen_skey() to generate foreign keys for k_student, k_student_xyear, and k_student_iep. Adds school_year column. Deduplicates and filters deleted records.

  • stg_sedm__student_iep_goals: Creates deduplicated and keyed version of IEP goals. Generates primary surrogate key k_student_iep_goal using tenant_code, api_year, iep_goal_id, student_unique_id, and student_iep_association_id. Uses gen_skey() to generate foreign keys for k_student, k_student_xyear, and k_student_iep. Adds school_year column. Deduplicates and filters deleted records.

  • stg_sedm__student_iep_service_deliveries: Creates deduplicated and keyed version of IEP service deliveries. Generates primary surrogate key using tenant_code, api_year, and natural key components. Uses gen_skey() to generate foreign keys for k_student, k_student_xyear, and k_student_iep. Deduplicates and filters deleted records.

  • stg_sedm__student_iep_service_prescriptions: Creates deduplicated and keyed version of IEP service prescriptions. Generates primary surrogate key k_student_iep_service_prescription using tenant_code, api_year, iep_service_id, iep_servicing_ed_org_id, service_prescription, service_prescription_date, student_iep_association_id, and student_unique_id. Uses gen_skey() to generate foreign keys for k_student, k_student_xyear, k_student_iep, and k_staff (using service_provider_staff_reference). Adds school_year column. Deduplicates and filters deleted records.

  • stg_sedm__student_ieps: Creates deduplicated and keyed version of IEPs. Generates primary surrogate key k_student_iep using tenant_code, api_year, student_iep_association_id, iep_servicing_ed_org_id, iep_finalized_date, and student_unique_id. Uses gen_skey() to generate foreign keys for k_student and k_student_xyear. Uses edorg_ref() macro for education organization references (non-annualized). Adds school_year column. Deduplicates and filters deleted records.

All stage models follow consistent patterns:

  • Primary keys use dbt_utils.generate_surrogate_key() with tenant_code, api_year, and natural key components
  • String keys are lowercased for case-insensitive uniqueness
  • Foreign keys use gen_skey() macro for standardized references
  • Deduplication uses dbt_utils.deduplicate() partitioned by primary key, ordered by last_modified_timestamp desc, pull_timestamp desc
  • All models filter out records where is_deleted = true
  • Extensions are extracted using extract_extension() macro
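The key, deduplication, and delete-filter steps listed above can be sketched in plain Python. This is a rough model of the behavior rather than the models' actual SQL: the function names surrogate_key and stage are hypothetical, the hashing follows my understanding of dbt_utils.generate_surrogate_key() (values cast to strings, nulls replaced by a placeholder, joined with "-", then MD5-hashed), and applying the deleted-record filter after deduplication is an assumption about ordering.

```python
import hashlib

# Placeholder substituted for nulls before hashing (assumed to match dbt_utils).
NULL_TOKEN = "_dbt_utils_surrogate_key_null_"


def surrogate_key(values):
    """Join stringified values with '-' and return the MD5 hex digest."""
    parts = [NULL_TOKEN if value is None else str(value) for value in values]
    return hashlib.md5("-".join(parts).encode("utf-8")).hexdigest()


def stage(rows, natural_key_cols):
    """Key each row, keep the latest version per key, then drop deleted rows."""
    latest = {}
    for row in rows:
        # String key parts are lowercased so 'G1' and 'g1' collapse to one key.
        natural = [
            row[col].lower() if isinstance(row[col], str) else row[col]
            for col in natural_key_cols
        ]
        pk = surrogate_key([row["tenant_code"], row["api_year"]] + natural)
        # Equivalent to: order by last_modified_timestamp desc, pull_timestamp desc.
        rank = (row["last_modified_timestamp"], row["pull_timestamp"])
        if pk not in latest or rank > latest[pk][0]:
            latest[pk] = (rank, {**row, "pk": pk})
    # Assumption: the delete filter runs after deduplication, so a key whose
    # newest version is deleted disappears entirely.
    return [row for _, row in latest.values() if not row["is_deleted"]]
```

For example, calling stage(rows, ["iep_goal_id"]) on rows that differ only in the casing of iep_goal_id returns a single row, the one with the latest last_modified_timestamp.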

Tests and QC done:

  • Ran dbt run for all SEDM base and stage models and confirmed successful compilation
  • Verified that all models follow established naming conventions and patterns
  • Confirmed key generation logic matches patterns used in other domain models (edfi_3, tpdm)
  • Validated that deduplication logic is consistently applied across all stage models
  • Verified configuration in dbt_project.yml matches the structure of other domains
  • Confirmed that gen_skey() macro additions follow the existing pattern and include proper annualization flags

Future ToDos & Questions:

  • These models were run in Stadium Boston dev
  • TO DO: run in Stadium South Carolina dev

@ryanaguilar ryanaguilar marked this pull request as draft December 1, 2025 21:33