Exact Path Diagnostic Assessment
Reading
Summary
The Exact Path Diagnostic Reading Assessment is a computer-adaptive assessment that can be administered up to five times per academic year to screen and identify students in need of intervention. The Diagnostic Assessment efficiently pinpoints where students are ready to start learning and measures their growth between assessments. The assessment uses a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The assessment delivers an overall placement score for each content area, plus a score for each domain, all reported in real time. Next, the assessment generates an individualized learning path based on a student’s unique domain levels. Learning paths take students to content appropriate for their instructional level, regardless of grade level. Highlights: 1) Valid and reliable assessment that diagnoses each student’s strengths and needs and pinpoints exactly where the student is ready to start learning. 2) Efficient measurement. Students in kindergarten and grade 1 have median times under 15 minutes. For students in grades 2–12, median times range from 23 to 53 minutes. 3) Real-time reporting of Lexile® and Quantile® measures, national percentile ranks, grade-level proficiency, and student growth upon each successive administration. 4) Universal accessibility tools including read-aloud, text magnification, and highlighting. 5) Easy to schedule and administer. Administrators and teachers can auto-schedule tests for all students, or they can manually adjust the schedule for specific students.
- Where to Obtain:
- Edmentum, Inc.
- info@edmentum.com
- 5600 West 83rd Street, Suite 300, 8200 Tower, Bloomington, MN 55437
- 800.447.5286
- https://info.edmentum.com/get-a-quote.html
- Initial Cost:
- $7.00 per student
- Replacement Cost:
- $7.00 per student per year
- Included in Cost:
- Edmentum’s professional services team, in tandem with our support team, works with our district and school partners through the implementation process and beyond to ensure your program is a success. In addition to the included implementation support, professional learning and engagement experiences are also available. Our detailed Edmentum Professional Services and Consulting Catalog can be found at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Edmentum offers unparalleled flexibility in program cost structure, with per-student and districtwide options. Understanding that no two schools or districts are alike, we offer licensing options to meet your unique needs and will partner with your stakeholders to provide the best fit. With our variety of packaging selections, we will find the perfect program to meet your specific needs, fit within your budget, and help your students thrive. Access options include individual content areas and multi-subject core bundles. Licenses and subscriptions are generally provided in 12-month increments, with unlimited teacher licenses included at no cost. Note that Edmentum’s standard minimum subscription term is 12 months. To further support our valued customers, we offer discounts based on the volume of student licenses. Basic pricing includes the assessment for the selected content area(s), all of which appear on a single platform; unlimited teacher licenses; administrator licenses; administrator and educator dashboards to control diagnostic assessment timing, frequency, and administration (e.g., monitoring testing, resetting a test, alerts); interactive reporting functionality for administrators and educators (including data export, dashboards, and aggregate and individual reports); report access for students, parents, educators, and administrators; unlimited 24/7 access to the embedded Guided Access and Help Center searchable support and troubleshooting tool as well as all relevant guidance and support materials, manuals, and webinars; an award-winning customer support team providing support via phone, email, web, and live office hours; and dedicated support pages and resources for families and caregivers. Available as an add-on student-level license is the learning path solution, which supports intervention, individualized learning, and acceleration through personalized instructional pathways natively integrated with the Diagnostic Assessments; it also includes resources for educators to provide targeted intervention support through whole-group, small-group, or one-on-one instruction, including lesson plans, printable activities, and instructional videos.
- The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds, including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, avoiding words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, English learner accommodations, and audio and visual accommodations.
- Training Requirements:
- 1-4 hours
- Qualified Administrators:
- No minimum qualifications specified.
- Access to Technical Support:
- From the moment our partners engage with us, we provide personal support to ensure they have the best experience possible. Our live, U.S.-based customer support team offers superior technical support as well as high-value instructional support to help educators gain the full value of their Edmentum programs. Our customer support team provides full phone and email support for our programs to all users during business hours. Edmentum offers a variety of training and support modalities, including online and offline training, videos, webinars, and documentation for online system users and administrators. We offer an extensive library of prerecorded videos and webinars that are accessible 24/7 via our website and YouTube channel. Our public webinars are scheduled weekly, and users can register for them at any time. Exact Path has both online documentation within the program and up-to-date downloadable user guides; both of these resources are found within Exact Path’s Help Center on Edmentum’s Support page. Teachers and administrators each have their own user manual with guidance on how to use the system. There is also an embedded on-demand Help Center with searchable help and troubleshooting. Not only is context-specific help available, but there are also page tours that walk users through actions they may want to take. Other time-specific guides direct teachers to important reports and new features.
- Assessment Format:
-
- Scoring Time:
- Scoring is automatic
- Scores Generated:
- Raw score
- Percentile score
- IRT-based score
- Developmental benchmarks
- Developmental cut points
- Equated
- Lexile score
- Administration Time:
- 45 minutes per student / group
- Scoring Method:
- Automatically (computer-scored)
- Technology Requirements:
- Computer or tablet
- Internet connection
- Accommodations:
- The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds, including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, avoiding words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, English learner accommodations, and audio and visual accommodations.
Descriptive Information
- Please provide a description of your tool:
- The Exact Path Diagnostic Reading Assessment is a computer-adaptive assessment that can be administered up to five times per academic year to screen and identify students in need of intervention. The Diagnostic Assessment efficiently pinpoints where students are ready to start learning and measures their growth between assessments. The assessment uses a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The assessment delivers an overall placement score for each content area, plus a score for each domain, all reported in real time. Next, the assessment generates an individualized learning path based on a student’s unique domain levels. Learning paths take students to content appropriate for their instructional level, regardless of grade level. Highlights: 1) Valid and reliable assessment that diagnoses each student’s strengths and needs and pinpoints exactly where the student is ready to start learning. 2) Efficient measurement. Students in kindergarten and grade 1 have median times under 15 minutes. For students in grades 2–12, median times range from 23 to 53 minutes. 3) Real-time reporting of Lexile® and Quantile® measures, national percentile ranks, grade-level proficiency, and student growth upon each successive administration. 4) Universal accessibility tools including read-aloud, text magnification, and highlighting. 5) Easy to schedule and administer. Administrators and teachers can auto-schedule tests for all students, or they can manually adjust the schedule for specific students.
ACADEMIC ONLY: What skills does the tool screen?
- Please describe specific domain, skills or subtests:
- The Exact Path Reading assessment includes the domains of Reading Foundations (K–3); Reading Informational Text; Reading Literature; and Language and Vocabulary.
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
-
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
Acquisition and Cost Information
Administration
- Are norms available?
- Yes
- Are benchmarks available?
- Yes
- If yes, how many benchmarks per year?
- The Diagnostic Assessment may be administered up to five times per year, though three times per year is recommended.
- If yes, for which months are benchmarks available?
- The Diagnostic Assessment may be administered any time of year. However, administration is recommended in each of the following windows: August 15 – October 14, December 1 – January 31, and April 1 – May 31.
- BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
- If yes, how many students can be rated concurrently?
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- 1-4 hours
- Please describe the minimum qualifications an administrator must possess.
- No minimum qualifications
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- In addition to the on-demand resources available 24/7 and embedded in Exact Path for educator support, training, and reference, Edmentum offers a variety of professional learning and engagement experiences detailed in our Edmentum Professional Services and Consulting Catalog at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Our dedicated team is ready to design and deliver services tailored to each partner’s unique needs and goals.
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- From the moment our partners engage with us, we provide personal support to ensure they have the best experience possible. Our live, U.S.-based customer support team offers superior technical support as well as high-value instructional support to help educators gain the full value of their Edmentum programs. Our customer support team provides full phone and email support for our programs to all users during business hours. Edmentum offers a variety of training and support modalities, including online and offline training, videos, webinars, and documentation for online system users and administrators. We offer an extensive library of prerecorded videos and webinars that are accessible 24/7 via our website and YouTube channel. Our public webinars are scheduled weekly, and users can register for them at any time. Exact Path has both online documentation within the program and up-to-date downloadable user guides; both of these resources are found within Exact Path’s Help Center on Edmentum’s Support page. Teachers and administrators each have their own user manual with guidance on how to use the system. There is also an embedded on-demand Help Center with searchable help and troubleshooting. Not only is context-specific help available, but there are also page tours that walk users through actions they may want to take. Other time-specific guides direct teachers to important reports and new features.
Scoring
- Do you provide basis for calculating performance level scores?
- Yes
- Does your tool include decision rules?
- Yes
- If yes, please describe.
- Risk Benchmarks: Districts and schools can use the National Percentile Rank (NPR) reporting provided through the Exact Path Diagnostic to identify students at risk. Consistent with NCII guidance, we recommend that schools use an NPR of 20 to identify students who need intensive intervention. Depending on the needs of their students, districts and schools may also use other NPR thresholds for purposes such as identifying moderate risk or gifted and talented students.
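As an illustration, the decision rule above reduces to a simple threshold check on the reported NPR. The sketch below is a hypothetical rendering, not Edmentum’s code: only the at-risk cut of 20 comes from the documentation, while the `classify_risk` helper and the moderate-risk cut of 40 are invented examples of a district-chosen secondary threshold.

```python
# Hypothetical sketch of the published decision rule. Only the at-risk
# threshold (NPR below 20) comes from the documentation; the moderate-risk
# cut of 40 is an invented example of a district-chosen secondary threshold.

def classify_risk(npr: int, at_risk_cut: int = 20, moderate_cut: int = 40) -> str:
    """Map a national percentile rank (1-99) to a risk tier."""
    if npr < at_risk_cut:
        return "at risk"
    if npr < moderate_cut:
        return "moderate risk"
    return "not at risk"

print(classify_risk(12))   # at risk
print(classify_risk(35))   # moderate risk
print(classify_risk(75))   # not at risk
```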
- Can you provide evidence in support of multiple decision rules?
- Yes
- If yes, please describe.
- Risk Benchmarks: NPRs are provided for all students in grades K–8 for reading. NPRs were derived using national samples and a weighting methodology that adjusted the sample to be representative of the national student population. The norm study was conducted using data from the 2018–19 academic year. The NPR of 20 is recommended by RTI experts including NCII as an appropriate threshold for establishing at-risk classifications.
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- The Exact Path Diagnostic Assessments include dichotomously scored multiple-choice and technology-enhanced item types. All the items on the Diagnostic Assessments are machine scored in real time. The Diagnostic Assessments provide multiple scores to describe student learning levels and student progress throughout the year. The scale scores were developed with the 1-parameter Rasch item response theory model and are placed on a vertical scale from 500 to 1500 spanning all grades K–12 for each subject. Students also receive a raw score for each domain (i.e., number of items answered correctly out of number of items delivered). In addition, student reports contain growth scores (between administrations), Lexile®/Quantile® measures, and national percentile ranks. Performance levels, called Grade Level Proficiency classifications, categorize students into four performance levels, with the top two levels indicating on-grade level achievement in mathematics, language arts, and reading. Upon completion of the Diagnostic Assessments, Exact Path generates an individualized learning path based on students’ unique proficiency by domain, providing students with access to content appropriate for their instructional level, regardless of their grade level.
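For readers unfamiliar with the Rasch model named above, the sketch below shows the response function that underlies the scale scores. It is a minimal illustration: the model itself is standard, but the linear transformation onto the 500–1500 vertical scale uses hypothetical slope and intercept values, since the actual transformation is not published here.

```python
import math

# The 1-parameter Rasch model: the probability of a correct response depends
# only on the gap between student ability (theta) and item difficulty (b).
def rasch_p_correct(theta: float, b: float) -> float:
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(rasch_p_correct(0.0, 0.0))   # 0.5: ability exactly matches difficulty
print(rasch_p_correct(1.0, 0.0))   # ~0.73: relatively easy item

# Ability estimates (in logits) are then mapped onto the reported vertical
# scale. The slope and intercept below are hypothetical placeholders; the
# real transformation yielding the 500-1500 range is not published here.
def to_scale_score(theta: float, slope: float = 100.0, intercept: float = 1000.0) -> float:
    return slope * theta + intercept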
- Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
- Exact Path provides computer-adaptive assessments in math, reading, and language arts that can be administered up to five times per academic year (though three times is most common) to efficiently pinpoint where students are ready to start learning and to measure their growth between assessments. The assessments use a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The algorithm selects the first question based on either a student’s enrolled grade level (if testing for the first time) or the student’s previous diagnostic score. As the student progresses, each item presented depends on whether the previous item was answered correctly (a more difficult item follows) or incorrectly (an easier item follows). In this way, students receive assessments tailored to their skill levels, resulting in precise, accurate results for each content area that can be used to inform instruction and interventions. The adaptive algorithm uses consistent stopping rules for all learners based on the precision of the student score, so that scores are highly reliable for low-performing students, average-performing students, and high-performing or gifted students. The Exact Path Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds, including race, ethnicity, gender, culture, language, age, and socioeconomic status. Item writers are trained according to Edmentum’s internal Assessment Item Writing Guide and Item Specifications, which include Fairness, Bias, and Sensitivity guidance. Each item begins with a task model containing all parameters for that item, from standards, depth of knowledge (DOK), and readability to considerations for bias and sensitivity. Once written, each item undergoes two rounds of review and revision, including bias and sensitivity reviews. Furthermore, extensive accommodations are available for use both within and outside of the Exact Path platform to support the diverse needs of students, including students from linguistically and culturally diverse backgrounds as well as students with disabilities. More information about accommodations for students with disabilities is provided at the end of the Descriptive Information section. Teachers can make appropriate accommodations for students who are English language learners, such as providing a dictionary, helping to pronounce words, and offering any other accommodation students receive instructionally. However, teachers should not give substantive help interpreting text. Exact Path has been awarded WIDA Prime V2 Correlation, indicating our ability to address English language learners’ listening, speaking, reading, and writing needs. Exact Path includes built-in text-to-speech functionality, closed captions for videos, and highlighted vocabulary words with built-in tools for translation, definition, and audio support. EdMetric (2022) conducted an independent study of differential item functioning (DIF) in Edmentum’s item bank to examine the impact of four grouping variables (gender, race, socioeconomic status, and pandemic effect) on items. The investigation utilized the Mantel-Haenszel (MH) procedure, which enables the use of the classification system established by Educational Testing Service to separate items into differing levels of DIF: negligible DIF (A-level), moderate DIF (B-level), and large DIF (C-level). Items flagged with B- or C-level DIF would indicate that students in the groups of interest perform differently on the item. No items in Edmentum’s item bank were flagged for B- or C-level DIF, indicating that items in Edmentum’s item bank measure student achievement from different groups in a similar manner and providing some evidence that the items are fair for different groups. This evidence reflects the attention to fairness and the measures taken to avoid bias and sensitivity issues throughout the item development process. The study is accessible at https://www.edmentum.com/resources/efficacy/exact-path-independent-study-differential-item-analysis-edmentums-item-bank.
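The adaptive mechanics described at the start of this answer (harder item after a correct response, easier item after a miss, a precision-based stopping rule) can be made concrete with a small simulation. This is a toy sketch under stated assumptions, not Edmentum’s implementation: a Rasch item bank in logits, a crude Newton-Raphson ability estimator, and a 0.4-logit SEM target standing in for the 40-vertical-scale-point stopping rule cited in the Reliability section (exact only under a hypothetical 100-points-per-logit slope).

```python
import math, random

# A self-contained toy CAT: a harder item after a correct answer, an easier
# one after a miss, and a variable-length stopping rule tied to the standard
# error of measurement. All values here are illustrative assumptions.

def p_correct(theta, b):
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_ability(responses):
    """Crude Newton-Raphson MLE of ability from (difficulty, correct) pairs."""
    theta, info = 0.0, 1e-6
    for _ in range(25):
        info = sum(p_correct(theta, b) * (1.0 - p_correct(theta, b)) for b, _ in responses)
        grad = sum(c - p_correct(theta, b) for b, c in responses)
        if info < 1e-6:
            break
        theta = max(-4.0, min(4.0, theta + grad / info))  # clamp early extremes
    return theta, 1.0 / math.sqrt(max(info, 1e-6))

bank = [i / 10.0 for i in range(-40, 41)]        # item difficulties in logits
true_theta, theta, responses = 0.8, 0.0, []
while True:
    b = min(bank, key=lambda d: abs(d - theta))  # most informative remaining item
    bank.remove(b)
    correct = random.random() < p_correct(true_theta, b)
    responses.append((b, correct))
    theta, sem = estimate_ability(responses)
    if sem <= 0.4 or not bank:                   # stop once the score is precise
        break
print(f"estimate={theta:.2f}, sem={sem:.2f}, items={len(responses)}")
```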
Technical Standards
Classification Accuracy & Cross-Validation Summary
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Classification Accuracy Fall | | | | | | |
Classification Accuracy Winter | | | | | | |
Classification Accuracy Spring | | | | | | |
Wisconsin Forward Exam
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Wisconsin’s Statewide Achievement Assessment, called the Forward Exam, is administered in grades 3–8 in English Language Arts and Mathematics. The Forward Exam criterion measure is completely independent from the screening measure. The Forward Exam is developed by DRC in collaboration with the state of Wisconsin. However, the Forward Exam is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics Forward Exam and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts Forward Exam and the Exact Path Diagnostic Reading Assessment.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Consistent with NCII’s guidance, the 20th percentile on the Forward Exam was used as the criterion cut-point. The scale scores on the Forward Exam that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the Forward Exam classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
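A minimal sketch of this cut-score search and the resulting indices, assuming paired records of (screener scale score, criterion at-risk flag). The function names and toy data are illustrative only; the actual study computed its indices with NCII’s classification worksheet.

```python
# Illustrative records: (screener scale score, criterion "at risk" flag).
records = [(920, True), (945, True), (1001, False), (960, False),
           (890, True), (1055, False), (975, False), (1030, False)]

def overall_agreement(records, cut):
    # "At risk" on the screener means score below the cut; agreement counts
    # true positives plus true negatives.
    return sum((score < cut) == at_risk for score, at_risk in records) / len(records)

def best_cut(records):
    # Search the observed scores for the cut maximizing overall agreement.
    return max(sorted({s for s, _ in records}), key=lambda c: overall_agreement(records, c))

def indices(records, cut):
    tp = sum(s < cut and r for s, r in records)
    fp = sum(s < cut and not r for s, r in records)
    fn = sum(s >= cut and r for s, r in records)
    tn = sum(s >= cut and not r for s, r in records)
    return {"overall": (tp + tn) / len(records),
            "sensitivity": tp / (tp + fn),        # assumes both outcomes occur
            "specificity": tn / (tn + fp),
            "positive predictive power": tp / (tp + fp),
            "negative predictive power": tn / (tn + fn)}

cut = best_cut(records)
print(cut, indices(records, cut))
```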
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- Wisconsin districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the 2020–21 school year.
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Arizona Statewide Assessment, AzMerit 2 (AzM2)
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Arizona’s Statewide Achievement Assessment, the AzMerit 2 (AzM2), was administered in grades 3–8 in English Language Arts and Mathematics during the 2020–21 academic year. AzM2 is completely independent from the screening measure and was developed by Cambium Assessment in collaboration with the state of Arizona. However, the AzM2 is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics AzM2 and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts AzM2 Exam and the Exact Path Diagnostic Reading Assessment.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- AzM2 is administered to students in Arizona once per year in the spring. Exact Path Diagnostic Assessments are administered in the fall, winter, and spring. When evaluating the classification accuracy of Exact Path Diagnostic Assessments, the spring administration often takes place quite close to the AzM2 spring administration. Thus, the classification results can be considered concurrent. However, the classification results for the fall and winter Exact Path scores are from administrations timed several months before the AzM2 administration. Thus, these classification accuracy results can be considered predictive.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Consistent with NCII’s guidance, the 20th percentile on the AzM2 was used as the criterion cut-point. The scale scores on the AzM2 that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the AzM2 classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the 2020–21 school year.
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Ohio State Test (OST)
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Ohio’s Statewide Achievement Assessment, the Ohio State Test (OST), was administered in grades 3–8 in English Language Arts and Mathematics during the 2020–21 and 2021-22 academic years. OST is completely independent from the screening measure and was developed by Cambium Assessment in collaboration with the state of Ohio. However, the OST is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics OST and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts OST Exam and the Exact Path Diagnostic Reading Assessment.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- OST is administered to students in Ohio once per year in the spring. Exact Path Diagnostic Assessments are administered in the fall, winter, and spring. When evaluating the classification accuracy of Exact Path Diagnostic Assessments, the spring administration often takes place quite close to the OST spring administration. Thus, the classification results can be considered concurrent. However, the classification results for the fall and winter Exact Path scores are from administrations timed several months before the OST administration. Thus, these classification accuracy results can be considered predictive.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Consistent with NCII’s guidance, the 20th percentile on the OST was used as the criterion cut-point. The scale scores on the OST that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the OST classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The Ohio districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the school year.
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Classification Accuracy - Fall
Evidence | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Criterion measure | Wisconsin Forward Exam | Wisconsin Forward Exam | Wisconsin Forward Exam | Wisconsin Forward Exam | Wisconsin Forward Exam | Ohio State Test (OST) |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 510 | 533 | 551 | 561 | 577 | 669 |
Cut Points - Corresponding performance score (numeric) on screener measure | 933 | 968 | 1010 | 1056 | 1052 | 1070 |
Classification Data - True Positive (a) | 26 | 22 | 44 | 38 | 26 | 72 |
Classification Data - False Positive (b) | 44 | 36 | 44 | 45 | 49 | 42 |
Classification Data - False Negative (c) | 8 | 4 | 5 | 4 | 2 | 12 |
Classification Data - True Negative (d) | 152 | 196 | 202 | 205 | 208 | 281 |
Area Under the Curve (AUC) | 0.81 | 0.90 | 0.89 | 0.91 | 0.90 | 0.90 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.74 | 0.84 | 0.85 | 0.88 | 0.85 | 0.87 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.89 | 0.97 | 0.93 | 0.95 | 0.95 | 0.94 |
Statistics | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Base Rate | 0.15 | 0.10 | 0.17 | 0.14 | 0.10 | 0.21 |
Overall Classification Rate | 0.77 | 0.84 | 0.83 | 0.83 | 0.82 | 0.87 |
Sensitivity | 0.76 | 0.85 | 0.90 | 0.90 | 0.93 | 0.86 |
Specificity | 0.78 | 0.84 | 0.82 | 0.82 | 0.81 | 0.87 |
False Positive Rate | 0.22 | 0.16 | 0.18 | 0.18 | 0.19 | 0.13 |
False Negative Rate | 0.24 | 0.15 | 0.10 | 0.10 | 0.07 | 0.14 |
Positive Predictive Power | 0.37 | 0.38 | 0.50 | 0.46 | 0.35 | 0.63 |
Negative Predictive Power | 0.95 | 0.98 | 0.98 | 0.98 | 0.99 | 0.96 |
Sample | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Date | 2020-21 | 2020-21 | 2020-21 | 2020-21 | 2020-21 | 2020-21 & 2021-22 |
Sample Size | 230 | 258 | 295 | 292 | 285 | 407 |
Geographic Representation | East North Central (WI) | East North Central (WI) | East North Central (WI) | East North Central (WI) | East North Central (WI) | East North Central (OH) |
Male | 48.7% | 50.0% | 41.0% | 47.6% | 43.2% | 55.0% |
Female | 45.2% | 43.8% | 44.1% | 43.2% | 49.1% | 45.0% |
Other | ||||||
Gender Unknown | ||||||
White, Non-Hispanic | 78.7% | 77.1% | 68.8% | 73.3% | 77.9% | 76.9% |
Black, Non-Hispanic | 2.2% | 1.2% | 2.4% | 0.7% | 1.8% | 12.0% |
Hispanic | 10.4% | 12.8% | 11.2% | 13.7% | 9.8% | 2.9% |
Asian/Pacific Islander | 0.4% | 1.9% | 0.3% | 1.4% | 1.1% | 2.0% |
American Indian/Alaska Native | 0.9% | 0.7% | 0.3% | 0.7% | ||
Other | 0.9% | 0.8% | 1.7% | 1.4% | 1.1% | 5.9% |
Race / Ethnicity Unknown | 0.4% | |||||
Low SES | 30.4% | 26.0% | 26.4% | 27.4% | 23.9% | 30.0% |
IEP or diagnosed disability | 10.9% | 13.6% | 9.5% | 13.0% | 9.1% | 18.9% |
English Language Learner | 7.0% | 5.4% | 4.4% | 3.4% | 2.8% | 0.5% |
Classification Accuracy - Winter
Evidence | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Criterion measure | Arizona Statewide Assessment, AzMerit 2 (AzM2) | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 2456 | 666 | 678 | 674 | 684 | 676 |
Cut Points - Corresponding performance score (numeric) on screener measure | 880 | 1003 | 1067 | 1110 | 1112 | 1133 |
Classification Data - True Positive (a) | 77 | 150 | 147 | 117 | 156 | 134 |
Classification Data - False Positive (b) | 69 | 145 | 142 | 103 | 117 | 112 |
Classification Data - False Negative (c) | 19 | 36 | 31 | 12 | 14 | 19 |
Classification Data - True Negative (d) | 296 | 579 | 566 | 413 | 530 | 496 |
Area Under the Curve (AUC) | 0.87 | 0.89 | 0.90 | 0.93 | 0.93 | 0.91 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.83 | 0.86 | 0.88 | 0.91 | 0.91 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.90 | 0.91 | 0.92 | 0.95 | 0.94 | 0.94 |
Statistics | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Base Rate | 0.21 | 0.20 | 0.20 | 0.20 | 0.21 | 0.20 |
Overall Classification Rate | 0.81 | 0.80 | 0.80 | 0.82 | 0.84 | 0.83 |
Sensitivity | 0.80 | 0.81 | 0.83 | 0.91 | 0.92 | 0.88 |
Specificity | 0.81 | 0.80 | 0.80 | 0.80 | 0.82 | 0.82 |
False Positive Rate | 0.19 | 0.20 | 0.20 | 0.20 | 0.18 | 0.18 |
False Negative Rate | 0.20 | 0.19 | 0.17 | 0.09 | 0.08 | 0.12 |
Positive Predictive Power | 0.53 | 0.51 | 0.51 | 0.53 | 0.57 | 0.54 |
Negative Predictive Power | 0.94 | 0.94 | 0.95 | 0.97 | 0.97 | 0.96 |
Sample | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Date | 2020-21 | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 |
Sample Size | 461 | 910 | 886 | 645 | 817 | 761 |
Geographic Representation | Mountain (AZ) | East North Central (OH) | East North Central (OH) | East North Central (OH) | East North Central (OH) | East North Central (OH) |
Male | 46.0% | 54.0% | 53.0% | 51.9% | 53.0% | 51.0% |
Female | 54.0% | 46.0% | 47.0% | 48.1% | 47.0% | 49.0% |
Other | ||||||
Gender Unknown | ||||||
White, Non-Hispanic | 83.1% | 69.0% | 73.0% | 69.0% | 66.0% | 74.0% |
Black, Non-Hispanic | 5.0% | 15.1% | 14.0% | 11.0% | 10.0% | 14.1% |
Hispanic | 88.1% | 3.0% | 1.0% | 2.0% | 2.0% | 2.0% |
Asian/Pacific Islander | 1.1% | 3.0% | 1.0% | 2.0% | 1.0% | 3.0% |
American Indian/Alaska Native | 1.1% | 0.1% | ||||
Other | 0.7% | 10.0% | 10.0% | 9.0% | 7.0% | 7.0% |
Race / Ethnicity Unknown | 7.0% | 14.0% | ||||
Low SES | 87.0% | 23.0% | 19.0% | 27.0% | 23.0% | 23.0% |
IEP or diagnosed disability | 15.1% | 14.0% | 16.0% | 11.0% | 14.1% | |
English Language Learner | 24.9% | 2.0% | 1.0% | 0.9% | 0.5% | 0.3% |
Classification Accuracy - Spring
Evidence | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Criterion measure | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) | Ohio State Test (OST) |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 662 | 666 | 679 | 672 | 680 | 678 |
Cut Points - Corresponding performance score (numeric) on screener measure | 999 | 1039 | 1057 | 1078 | 1112 | 1128 |
Classification Data - True Positive (a) | 172 | 169 | 161 | 179 | 211 | 177 |
Classification Data - False Positive (b) | 156 | 153 | 130 | 121 | 162 | 128 |
Classification Data - False Negative (c) | 35 | 30 | 33 | 25 | 30 | 35 |
Classification Data - True Negative (d) | 653 | 616 | 590 | 691 | 737 | 687 |
Area Under the Curve (AUC) | 0.89 | 0.90 | 0.90 | 0.93 | 0.90 | 0.91 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.87 | 0.88 | 0.88 | 0.91 | 0.88 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.92 | 0.92 | 0.92 | 0.95 | 0.92 | 0.93 |
Statistics | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Base Rate | 0.20 | 0.21 | 0.21 | 0.20 | 0.21 | 0.21 |
Overall Classification Rate | 0.81 | 0.81 | 0.82 | 0.86 | 0.83 | 0.84 |
Sensitivity | 0.83 | 0.85 | 0.83 | 0.88 | 0.88 | 0.83 |
Specificity | 0.81 | 0.80 | 0.82 | 0.85 | 0.82 | 0.84 |
False Positive Rate | 0.19 | 0.20 | 0.18 | 0.15 | 0.18 | 0.16 |
False Negative Rate | 0.17 | 0.15 | 0.17 | 0.12 | 0.12 | 0.17 |
Positive Predictive Power | 0.52 | 0.52 | 0.55 | 0.60 | 0.57 | 0.58 |
Negative Predictive Power | 0.95 | 0.95 | 0.95 | 0.97 | 0.96 | 0.95 |
Sample | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Date | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 | 2020-21 & 2021-22 |
Sample Size | 1016 | 968 | 914 | 1016 | 1140 | 1027 |
Geographic Representation | East North Central (OH) | East North Central (OH) | East North Central (OH) | East North Central (OH) | East North Central (OH) | East North Central (OH) |
Male | 53.0% | 54.0% | 52.0% | 53.0% | 52.0% | 52.0% |
Female | 47.0% | 46.0% | 48.0% | 47.0% | 48.0% | 48.0% |
Other | ||||||
Gender Unknown | ||||||
White, Non-Hispanic | 69.0% | 70.0% | 72.0% | 67.0% | 66.0% | 77.0% |
Black, Non-Hispanic | 14.0% | 15.0% | 15.0% | 9.0% | 10.0% | 12.0% |
Hispanic | 2.0% | 3.0% | 2.0% | 1.0% | 1.0% | 2.0% |
Asian/Pacific Islander | 2.0% | 3.0% | 1.0% | 1.0% | 1.0% | 2.0% |
American Indian/Alaska Native | 0.2% | 0.3% | 0.3% | |||
Other | 12.0% | 10.0% | 11.1% | 9.0% | 7.0% | 7.0% |
Race / Ethnicity Unknown | 12.0% | 15.0% | ||||
Low SES | 25.0% | 22.0% | 19.0% | 20.0% | 17.0% | 18.0% |
IEP or diagnosed disability | 18.0% | 15.0% | 15.0% | 15.0% | 11.0% | 14.0% |
English Language Learner | 2.0% | 2.0% | 1.0% | 1.0% | 1.0% | 0.5% |
Reliability
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Rating | | | | | | |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- The analyses considered in this study included split-half and marginal reliability. These measures both assess the internal consistency of the tests under consideration, split-half from the context of classical test theory and marginal from the context of item response theory. Marginal reliability utilizes the item response theory (IRT) ability estimates and standard errors of the ability estimates to create a weighted average index of reliability akin to a test-retest correlation under classical test theory. The marginal reliability is a ratio of the variance of the estimated latent abilities relative to the sum of the variance of the latent ability and the expected error variance. Split-half reliability provides an estimate of alternate form reliability by dividing the test into equal halves, correlating the scores from the shortened forms, and using the Spearman-Brown formula to estimate the alternative form reliability for full-length test forms. Split-half is a more appropriate type of internal consistency reliability metric than coefficient alpha because the Exact Path Diagnostic Assessment is a computer adaptive assessment rather than a fixed-form assessment. The Exact Path Diagnostic Assessment is a variable length adaptive assessment where the test terminates once the standard error of measurement is less than or equal to 40 scale score points for reading. The stopping rule ensures that the standard error of measurement is consistent across the scale: the scores of low-, average-, and high-achieving students all have the same measurement precision.
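Restated in symbols (our notation, matching the definitions above):

```latex
% Marginal reliability under IRT: latent-ability variance over itself plus
% the expected error variance (approximated by the mean squared SEM).
\rho_{\text{marginal}}
  = \frac{\sigma^{2}_{\hat{\theta}}}{\sigma^{2}_{\hat{\theta}} + \overline{\mathrm{SEM}^{2}}}

% Spearman-Brown step-up of the half-test correlation r to full test length.
\rho_{\text{split-half}} = \frac{2r}{1 + r}
```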
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- Students who took the Exact Path Diagnostic Reading Assessment during the 2020–21 school year are included in the analysis. The sample is not strictly nationally representative, but students from nearly all 50 states are included in the dataset. As shown in the reliability table, sample sizes by grade and subject ranged from approximately 75,000 to over 90,000.
- *Describe the analysis procedures for each reported type of reliability.
- Split-half reliability coefficients were estimated for each subject and grade combination. Split-half reliability resembles a test-retest condition when a single test has been administered. For a CAT, the odd and even items are used to create the two half-length forms. The correlation (r) between scores on the two forms represents the consistency of the measure. Split-half reliability is then determined for the whole test by using the Spearman-Brown formula (ρ = 2r/(1 + r)) to adjust the correlation to account for the full length of the test. Marginal reliability coefficients were also computed for each subject and grade combination. Traditional reliability estimators are grounded in classical test theory (CTT), which defines reliability as the ratio of true-score variance to observed-score variance, that is, the ratio of true-score variance to the sum of true-score and error variance. Under CTT, error variance is set to be constant across all true scores, while in item response theory (IRT) error varies as a function of the latent ability. Because of this difference, a single overall reliability in the context of IRT is an oversimplification of the reliability of the scores produced by the test. However, methods have been developed to approximate the traditional reliability in the IRT context. To account for the varying error across the latent ability distribution, the error variance can be integrated (Green, Bock, Humphreys, Linn, & Reckase, 1984). This can be further simplified by taking the mean of the squared standard error of measurement (SEM; Sireci, Thissen, & Wainer, 1991). Thus, the marginal reliability for IRT scores is the ratio of the variance of the estimated latent abilities relative to the sum of the variance of the latent ability and the expected error variance. To compute confidence intervals around split-half and marginal reliability, a bootstrapping approach was used.
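A sketch of both estimators described above, assuming per-student inputs: odd-half and even-half scores for split-half reliability, and IRT ability estimates with their standard errors for marginal reliability. All names are illustrative, and the operational analysis additionally bootstrapped confidence intervals, which this sketch omits.

```python
import statistics

def pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def split_half_reliability(odd_scores, even_scores):
    r = pearson(odd_scores, even_scores)
    return 2 * r / (1 + r)                      # Spearman-Brown step-up

def marginal_reliability(thetas, sems):
    var_theta = statistics.pvariance(thetas)    # variance of ability estimates
    exp_error = statistics.fmean(s ** 2 for s in sems)  # mean squared SEM
    return var_theta / (var_theta + exp_error)
```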
*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- Provide citations for additional published studies.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- Provide citations for additional published studies.
Validity
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Rating | | | | | | |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- The Exact Path Diagnostic Assessment has extensive research supporting the validity of the assessment. Validity evidence is collected and evaluated according to the recommendations in the Standards for Educational and Psychological Testing (https://www.testingstandards.net/). The Exact Path technical report describes evidence based on test content, response processes, internal structure, relations to other variables, and consequences. In this section, we provide validity evidence in terms of relations to other variables, i.e., criterion validity. Four criterion measures are included in the correlations provided: Arizona’s end-of-year summative assessments from 2020–21 (AzM2) and 2021–22 (AASA); Indiana’s summative assessment from 2018–19 through 2020–21 (ILEARN); and Wisconsin’s summative assessment from 2020–21 (Forward Exam). These four criterion measures are completely external to Edmentum’s Exact Path Diagnostic Assessment screening system. However, all four external criterion measures and Exact Path are measures of reading proficiency. Thus, while the measures are aligned to different blueprints and different sets of standards, correlations are expected to be moderate to high. By providing criterion measures across a sample of states, we demonstrate the generalizability of the Exact Path Diagnostic as a valid screener of reading proficiency. Similar validity coefficients have been observed across other states (see https://www.edmentum.com/resources/research for more information).
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- The Arizona AzM2 and Wisconsin Forward Exam samples are from the 2020–21 academic year. The Indiana sample includes students from the 2018–19 and 2020–21 academic years. The Arizona AASA sample is from the 2021–22 academic year. The number of students per grade and criterion measure ranges from 216 to over 2800.
- *Describe the analysis procedures for each reported type of validity.
- Students’ scale scores from state summative assessments are merged with scale scores from the Exact Path Diagnostic Assessment. For concurrent validity correlation coefficients, Exact Path and state scale scores are both from the spring testing window. For predictive validity correlation coefficients, the scale scores are from the same academic year, but Exact Path scale scores are from the fall testing window, while the criterion state scale scores are from the spring testing window. Validity coefficients are Pearson correlations, and the Fisher z-transformation was used to determine the 95 percent confidence interval.
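The Fisher z-transformation step described above can be sketched in a few lines. This is a generic illustration of the standard method, not the study’s code; `n` is the paired-sample size and the example values are invented.

```python
import math

# 95% CI around a Pearson validity coefficient via the Fisher z-transform.
def fisher_ci(r, n, z_crit=1.96):
    z = math.atanh(r)                      # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)            # standard error on the z scale
    lo, hi = z - z_crit * se, z + z_crit * se
    return math.tanh(lo), math.tanh(hi)    # back-transform to the r scale

print(fisher_ci(0.75, 400))   # -> roughly (0.70, 0.79)
```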
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- Yes
- Provide citations for additional published studies.
- Edmentum Exact Path and the Lexile Framework Linking Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/Exact%20Path%20and%20Lexile%20Linking%20Study%20Abstract%201.9.20.pdf
- Exact Path Diagnostic and the State of Texas Assessment of Academic Readiness (STAAR) Correlational Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/TX%20correlation%20report%20Exact%20Path%20and%20STAAR.pdf
- Exact Path Diagnostic and Pennsylvania System of School Assessment (PSSA) Correlational Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/PA-correlational-study-XP-and-PSSA.pdf
- Describe the degree to which the provided data support the validity of the tool.
- The Exact Path Diagnostic Reading Assessment can be used to screen students who are at risk for poor reading outcomes. The state summative assessment scores are typically the achievement outcomes of most importance to each state. Thus, having a screener that correlates well with these end-of-year tests is very important. The lower bound of the 95 percent confidence interval is well above 0.6 for both concurrent and predictive validity coefficients between the Exact Path Diagnostic Assessment and all of the criterion measures provided. In fact, some of the coefficients are above 0.8. These are very strong correlations despite differences in blueprint, test design, administration conditions, and test purposes. These data support the validity of Exact Path Diagnostic Assessment as a screener tool, and the generalizability of validity across various states.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- Provide citations for additional published studies.
Bias Analysis
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|
Rating | Yes | Yes | Yes | Yes | Yes | Yes |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- Yes
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- A third-party differential item functioning (DIF) study was conducted by EdMetric to evaluate the Exact Path Diagnostic Assessment item pool. This study involved completing a series of DIF analyses using student responses to items from Edmentum’s item bank and delivered through Edmentum’s Exact Path Diagnostic Computerized Adaptive Test. All the analyses were performed using difR (Magis, Beland, Tuerlinckx, & De Boeck, 2010) and tidyverse (v1.3.0; Wickham et al., 2019) packages in R. The presence of DIF was investigated using the Mantel–Haenszel (MH) procedure (Clauser & Mazor, 1998). This method allowed for detecting uniform DIF without requiring an item response theory model. The MH procedure has a straightforward implementation and enabled the use of the classification system established by Educational Testing Service (Zwick & Ercikan, 1989).
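The MH effect size and the ETS A/B/C bands can be sketched as follows. This is a simplified, hypothetical illustration: it applies only the effect-size part of the ETS rule and omits the significance tests that the full procedure (e.g., as implemented in the difR package) performs, and all data are invented. Each stratum is a 2x2 table for one matched ability level: (reference correct, reference incorrect, focal correct, focal incorrect).

```python
import math

# Mantel-Haenszel common odds ratio across ability strata, then the ETS
# delta-scale effect size D = -2.35 * ln(alpha_MH). Simplified sketch only.

def mh_alpha(strata):
    num = sum(rc * fw / (rc + rw + fc + fw) for rc, rw, fc, fw in strata)
    den = sum(rw * fc / (rc + rw + fc + fw) for rc, rw, fc, fw in strata)
    return num / den

def ets_level(strata):
    delta = -2.35 * math.log(mh_alpha(strata))   # MH D-DIF on the ETS delta scale
    if abs(delta) < 1.0:
        return "A"    # negligible DIF
    if abs(delta) >= 1.5:
        return "C"    # large DIF (the full rule also requires significance)
    return "B"        # moderate DIF

# One stratum per ability level; here the two groups perform similarly.
strata = [(80, 20, 78, 22), (60, 40, 58, 42), (40, 60, 41, 59)]
print(ets_level(strata))   # -> "A"
```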
- b. Describe the subgroups for which bias analyses were conducted:
- The data examined bias in relation to gender, race, socioeconomic status, and pandemic effect. 1) Gender. Gender classification was available in the student-level data set for almost half of the students (there is no gender indicated for the remaining students). The data were fairly evenly split between males and females, with slightly under half identified as female and slightly over half identified as male. 2) Race. Because Edmentum’s student-level data files provide very limited demographic information, race values were assigned to each student based on their school district. The account-level data provide the percentage of white students in each school district. Students were assigned to a high-majority district (coded as 1) if 50 percent or more of the students in the school were white, and to a low-majority district (coded as 0) otherwise. Nearly two-thirds of the students were from majority districts while approximately one-third were from nonmajority districts. 3) Socioeconomic Status. The account-level data provided the percentages of children in the district from families below the poverty line. The poverty data were sourced from the U.S. Census Bureau’s Small Area Income and Poverty Estimates (SAIPE) program. The poverty percentage used in this study identified districts and public schools by the actual percentage of children in the district who come from families below the poverty line, calculated as the ratio of children in a district from families below the poverty line to all children in the district. Students were considered part of a high-poverty district (coded as 1) if more than 17 percent of students were living in poverty, and part of a low-poverty district (coded as 0) otherwise. Originally, the intention was to assign high-SES districts using a 50 percent cutoff; however, very few districts had more than 50 percent of students living in poverty, so the average percentage of students in poverty was used to divide the data. Nearly 60 percent of students were from school districts classified as high-poverty districts while nearly 40 percent were from low-poverty districts. 4) Pandemic Effect. The pandemic grouping variable was obtained by appending the pre-pandemic data (all items administered prior to March 2020) to the pandemic data (all items administered after March 2020). The pre-pandemic data combined the 2018–2019 and 2019–2020 data sets, while the pandemic data combined any responses from administrations after March 2020.
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
- When conducting DIF studies with the ETS classification system, items were classified as A-, B-, or C-level DIF. Items classified with A-level DIF have “little or no difference between the two matched groups” (Zieky, 2003). Items flagged with B- and C-level DIF are typically evaluated for potential bias. Despite the large number of items in Edmentum’s item bank, no items were flagged for B- or C-level DIF. Thus, given the four groups considered for the DIF analysis, the Edmentum items appear to be unbiased.
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.