Exact Path Diagnostic Assessment
Mathematics

Summary

Descriptive Information

Exact Path’s Diagnostic Mathematics Assessment is a computer-adaptive assessment that can be administered up to five times per academic year to screen and identify students in need of intervention. The Diagnostic Assessment efficiently pinpoints where students are ready to start learning and measures their growth between assessments. The assessment uses a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The assessment delivers an overall placement score, plus a score for each domain, all reported in real time. Next, the assessment generates an individualized learning path based on a student's unique domain levels. Learning paths take students to content appropriate for their instructional level, regardless of grade level. Highlights: 1) Valid and reliable assessment that diagnoses each student’s strengths and needs and pinpoints exactly where the student is ready to start learning. 2) Efficient measurement. Students in kindergarten and grade 1 have median times under 15 minutes. For students in grades 2–12, median times range from 23 to 53 minutes. 3) Real-time reporting of Quantile® measures, national percentile ranks, grade-level proficiency, and student growth upon each successive administration. 4) Universal accessibility tools including read-aloud, text magnification, and highlighting. 5) Easy to schedule and administer. Administrators and teachers can auto-schedule tests for all students, or they can manually adjust the schedule for specific students.

Acquisition & Cost

Where to Obtain:: Edmentum, Inc.; info@edmentum.com; 5600 West 83rd Street, Suite 300, 8200 Tower, Bloomington, MN 55437; 800.447.5286; https://info.edmentum.com/get-a-quote.html

Initial Cost:: $7.00 per student

Replacement Cost:: $7.00 per student per year

Included in Cost:: Edmentum’s professional services team, in tandem with our support team, works with our district and school partners through the implementation process and beyond to ensure your program is a success. In addition to the included implementation support, professional learning and engagement experiences are also available. Our detailed Edmentum Professional Services and Consulting Catalog can be found at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Edmentum offers unparalleled flexibility in program cost structure, with per-student and districtwide options. Understanding that no two schools or districts are alike, we offer licensing options to meet your unique needs and will partner with your stakeholders to provide the best fit. With our variety of packaging selections, we will find the perfect program to meet your specific needs, fit within your budget, and help your students thrive. Access options include individual content areas and multi-subject core bundles. Licenses and subscriptions are generally provided in 12-month increments, with unlimited teacher licenses included at no cost. Note that Edmentum’s standard minimum subscription term is 12 months. To further support our valued customers, we have discounts available based on the volume of student licenses. 
Basic pricing includes the assessment for the selected content area(s), all of which appears on a single platform; unlimited teacher licenses; administrator licenses; administrator and educator dashboards to control diagnostic assessment timing, frequency, and administration (e.g., monitoring testing, resetting a test, alerts); interactive reporting functionality for administrators and educators (including data export, dashboards, aggregate and individual reports, etc.); report access for students, parents, educators, and administrators; unlimited access available 24/7 to the embedded Guided Access and Help Center searchable support and troubleshooting tool as well as all relevant guidance and support materials, manuals, and webinars; award-winning customer support team providing support via phone, email, web, and live office hours; and dedicated support pages and resources for family/caregivers. Available as an add-on student-level license is the learning path solution capable of supporting intervention, individualized learning, and acceleration needs through the creation of personalized instructional pathways and natively integrated with the Diagnostic Assessments, as well as resources for educators to provide targeted intervention support to students through whole group, small group, or one-on-one instruction inclusive of lesson plans, printable activities, and instructional videos.; The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 
3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, including the avoidance of using words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, calculator use, English learner accommodations, and audio and visual accommodations.

Training & Technical Support

Training Requirements:: 1-4 hours

Qualified Administrators:: No minimum qualifications specified.

Access to Technical Support:: From the moment our partners engage with us, we provide personal support to ensure they have the best experience possible. Our live, U.S.-based customer support team offers superior technical support as well as high-value instructional support to help educators gain the full value of their Edmentum programs. Our customer support team provides full phone and email support for our programs to all users during business hours. Edmentum offers a variety of training and support modalities, including online and offline training, videos, webinars, and documentation for online system users and administrators. We offer an extensive library of online and offline prerecorded and recorded videos and webinars that are accessible 24/7 via our website and YouTube channel. Our series of public webinars are scheduled weekly, and users can register for them anytime. Exact Path has both online documentation within the program and up-to-date downloadable user guides; both of these resources are found within Exact Path’s Help Center on Edmentum’s Support page. Teachers and administrators each have their own user manual with guidance on how to use the system. There is also an embedded on-demand Help Center with searchable help/troubleshooting. Not only is context-specific help available, but there are also page tours that walk users through actions they may want to take. Other time-specific guides direct teachers to important reports and new features.

Administration

Assessment Format:

Scoring Time:

Scoring is automatic

Scores Generated:

Raw score
Percentile score
IRT-based score
Developmental benchmarks
Developmental cut points
Equated
Subscale/subtest scores

Administration Time:

45 minutes per student/group

Scoring Method:

Automatically (computer-scored)

Technology Requirements:

Computer or tablet
Internet connection

Accommodations:: The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, including the avoidance of using words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, calculator use, English learner accommodations, and audio and visual accommodations.

Descriptive Information

Please provide a description of your tool:: Exact Path’s Diagnostic Mathematics Assessment is a computer-adaptive assessment that can be administered up to five times per academic year to screen and identify students in need of intervention. The Diagnostic Assessment efficiently pinpoints where students are ready to start learning and measures their growth between assessments. The assessment uses a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The assessment delivers an overall placement score, plus a score for each domain, all reported in real time. Next, the assessment generates an individualized learning path based on a student's unique domain levels. Learning paths take students to content appropriate for their instructional level, regardless of grade level. Highlights: 1) Valid and reliable assessment that diagnoses each student’s strengths and needs and pinpoints exactly where the student is ready to start learning. 2) Efficient measurement. Students in kindergarten and grade 1 have median times under 15 minutes. For students in grades 2–12, median times range from 23 to 53 minutes. 3) Real-time reporting of Quantile® measures, national percentile ranks, grade-level proficiency, and student growth upon each successive administration. 4) Universal accessibility tools including read-aloud, text magnification, and highlighting. 5) Easy to schedule and administer. Administrators and teachers can auto-schedule tests for all students, or they can manually adjust the schedule for specific students.

The tool is intended for use with the following grade(s).

Preschool / Pre - kindergarten
selected

Kindergarten
selected

First grade
selected

Second grade
selected

Third grade
selected

Fourth grade
selected

Fifth grade
selected

Sixth grade
selected

Seventh grade
selected

Eighth grade
selected

Ninth grade
selected

Tenth grade
selected

Eleventh grade
selected

Twelfth grade

The tool is intended for use with the following age(s).

0-4 years old
selected

5 years old
selected

6 years old
selected

7 years old
selected

8 years old
selected

9 years old
selected

10 years old
selected

11 years old
selected

12 years old
selected

13 years old
selected

14 years old
selected

15 years old
selected

16 years old
selected

17 years old
selected

18 years old

The tool is intended for use with the following student populations.

Students in general education
selected

Students with disabilities
selected

English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading

Phonological processing:

RAN

Memory

Awareness

Letter sound correspondence
not selected

Phonics

Structural analysis

Word ID

Accuracy

Speed

Nonword

Accuracy

Speed

Spelling

Accuracy

Speed

Passage

Accuracy

Speed

Reading comprehension:

Multiple choice questions
not selected

Cloze

Constructed Response
not selected

Retell

Maze

Sentence verification
not selected

Other (please describe):

Listening comprehension:

Multiple choice questions
not selected

Cloze

Constructed Response
not selected

Retell

Maze

Sentence verification
not selected

Vocabulary
not selected

Expressive
not selected

Receptive

Mathematics

Global Indicator of Math Competence

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Early Numeracy

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Mathematics Concepts

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Mathematics Computation

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Mathematic Application

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Fractions/Decimals

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Algebra

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Geometry

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Other (please describe):
Item types: cloze, fill in the blank, drag and drop, and hot spot

Please describe specific domain, skills or subtests:: The Exact Path Mathematics assessment includes the domains of Counting & Cardinality (K); Fractions & Ratios (grades 3–7); Functions (grades 8–HS); Algebra & Expressions; Geometry; Numbers & Operations; and Measurement, Data & Statistics.

BEHAVIOR ONLY: Which category of behaviors does your tool target?: Internalizing
Externalizing
Internalizing and Externalizing

BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:

Email Address: info@edmentum.com
Address: 5600 West 83rd Street, Suite 300, 8200 Tower, Bloomington, MN 55437
Phone Number: 800.447.5286
Website: https://info.edmentum.com/get-a-quote.html

Initial cost for implementing program:

Cost: $7.00
Unit of cost: student

Replacement cost per unit for subsequent use:

Cost: $7.00
Unit of cost: student
Duration of license: year

Additional cost information:

Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.: Edmentum’s professional services team, in tandem with our support team, works with our district and school partners through the implementation process and beyond to ensure your program is a success. In addition to the included implementation support, professional learning and engagement experiences are also available. Our detailed Edmentum Professional Services and Consulting Catalog can be found at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Edmentum offers unparalleled flexibility in program cost structure, with per-student and districtwide options. Understanding that no two schools or districts are alike, we offer licensing options to meet your unique needs and will partner with your stakeholders to provide the best fit. With our variety of packaging selections, we will find the perfect program to meet your specific needs, fit within your budget, and help your students thrive. Access options include individual content areas and multi-subject core bundles. Licenses and subscriptions are generally provided in 12-month increments, with unlimited teacher licenses included at no cost. Note that Edmentum’s standard minimum subscription term is 12 months. To further support our valued customers, we have discounts available based on the volume of student licenses. 
Basic pricing includes the assessment for the selected content area(s), all of which appears on a single platform; unlimited teacher licenses; administrator licenses; administrator and educator dashboards to control diagnostic assessment timing, frequency, and administration (e.g., monitoring testing, resetting a test, alerts); interactive reporting functionality for administrators and educators (including data export, dashboards, aggregate and individual reports, etc.); report access for students, parents, educators, and administrators; unlimited access available 24/7 to the embedded Guided Access and Help Center searchable support and troubleshooting tool as well as all relevant guidance and support materials, manuals, and webinars; award-winning customer support team providing support via phone, email, web, and live office hours; and dedicated support pages and resources for family/caregivers. Available as an add-on student-level license is the learning path solution capable of supporting intervention, individualized learning, and acceleration needs through the creation of personalized instructional pathways and natively integrated with the Diagnostic Assessments, as well as resources for educators to provide targeted intervention support to students through whole group, small group, or one-on-one instruction inclusive of lesson plans, printable activities, and instructional videos.

Provide information about special accommodations for students with disabilities.: The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, including the avoidance of using words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, calculator use, English learner accommodations, and audio and visual accommodations.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?

General education teacher
not selected

Special education teacher
not selected

Parent

Child

External observer
not selected

Other

If other, please specify:

What is the administration setting?

Direct observation
not selected

Rating scale
not selected

Checklist

Performance measure
not selected

Questionnaire
not selected

Direct: Computerized
not selected

One-to-one
not selected

Other

If other, please specify:

Does the tool require technology?

Yes

If yes, what technology is required to implement your tool? (Select all that apply)

Computer or tablet
selected

Internet connection
not selected

Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?

Individual
selected

Small group If small group, n=
selected

Large group If large group, n=
selected

Computer-administered
selected

Other

If other, please specify:

Administration may take place in person as well as remotely.

What is the administration time?

Time in minutes

per (student/group/other unit)

student/group

Additional scoring time:

Time in minutes

per (student/group/other unit)

student/group

ACADEMIC ONLY: What are the discontinue rules?

No discontinue rules provided
not selected

Basals

Ceilings

Other

If other, please specify:

Are norms available?: Yes

Are benchmarks available?: Yes
If yes, how many benchmarks per year?: The Diagnostic Assessment may be administered up to five times per year, though three times per year is recommended.
If yes, for which months are benchmarks available?: The Diagnostic Assessment may be administered any time of year. However, administration is recommended in each of the following windows: August 15 – October 14, December 1 – January 31, and April 1 – May 31.

BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?: Yes

Describe the time required for administrator training, if applicable:: 1-4 hours

Please describe the minimum qualifications an administrator must possess.: No minimum qualifications

Are training manuals and materials available?: Yes

Are training manuals/materials field-tested?: Yes

Are training manuals/materials included in cost of tools?: Yes
If No, please describe training costs:: In addition to the on-demand resources available 24/7 and embedded in Exact Path for educator support, training, and reference, Edmentum offers a variety of professional learning and engagement experiences detailed in our Edmentum Professional Services and Consulting Catalog at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Our dedicated team is ready to design and deliver services tailored to each partner’s unique needs and goals.

Can users obtain ongoing professional and technical support?: Yes
If Yes, please describe how users can obtain support:: From the moment our partners engage with us, we provide personal support to ensure they have the best experience possible. Our live, U.S.-based customer support team offers superior technical support as well as high-value instructional support to help educators gain the full value of their Edmentum programs. Our customer support team provides full phone and email support for our programs to all users during business hours. Edmentum offers a variety of training and support modalities, including online and offline training, videos, webinars, and documentation for online system users and administrators. We offer an extensive library of online and offline prerecorded and recorded videos and webinars that are accessible 24/7 via our website and YouTube channel. Our series of public webinars are scheduled weekly, and users can register for them anytime. Exact Path has both online documentation within the program and up-to-date downloadable user guides; both of these resources are found within Exact Path’s Help Center on Edmentum’s Support page. Teachers and administrators each have their own user manual with guidance on how to use the system. There is also an embedded on-demand Help Center with searchable help/troubleshooting. Not only is context-specific help available, but there are also page tours that walk users through actions they may want to take. Other time-specific guides direct teachers to important reports and new features.

Scoring

How are scores calculated?

Manually (by hand)
selected

Automatically (computer-scored)
not selected

Other

If other, please specify:

Do you provide basis for calculating performance level scores?: Yes

What is the basis for calculating performance level and percentile scores?

Age norms

Grade norms
not selected

Classwide norms
not selected

Schoolwide norms
not selected

Stanines

Normal curve equivalents

What types of performance level scores are available?

Raw score

Standard score
selected

Percentile score
not selected

Grade equivalents
selected

IRT-based score
not selected

Age equivalents
not selected

Stanines

Normal curve equivalents
selected

Developmental benchmarks
selected

Developmental cut points
selected

Equated

Probability
not selected

Lexile score
not selected

Error analysis
not selected

Composite scores
selected

Subscale/subtest scores
not selected

Other

If other, please specify:

Quantile score; Scale score; LPEG (Learning Path Entry Grade) score, which identifies the grade level proficiency and entry point of a student in a particular skill for placement on an individualized learning path.

Does your tool include decision rules?: Yes
If yes, please describe.: Risk Benchmarks: Districts and schools can use the National Percentile Ranks (NPR) reporting provided through the Exact Path Diagnostic to identify students “at risk.” Consistent with NCII, we recommend that schools use the NPR of 20 to identify students who need intensive intervention. Depending on the needs of their students, districts and schools are also able to use other NPR thresholds for purposes such as identification of moderate risk and gifted and talented.
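The decision rule above can be illustrated with a minimal sketch. The function name `risk_flag` is hypothetical, and treating an NPR strictly below 20 as "at risk" is an assumption; the description does not say how a student exactly at the 20th percentile is classified.

```python
def risk_flag(npr: int, intensive_cut: int = 20) -> str:
    """Flag a student from a national percentile rank (NPR).

    Assumption: NPRs run from 1 to 99 and a student strictly below
    the threshold is flagged; the boundary case is not specified
    in the source text.
    """
    if not 1 <= npr <= 99:
        raise ValueError("NPR is expected to fall between 1 and 99")
    return "at risk" if npr < intensive_cut else "not at risk"
```

A district applying a different threshold, for example to identify moderate risk, would pass its own value for `intensive_cut`.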

Can you provide evidence in support of multiple decision rules?: Yes
If yes, please describe.: Risk Benchmarks: NPRs are provided for all students in grades K–8 for math. NPRs were derived using national samples and a weighting methodology that adjusted the sample to be representative of the national student population. The norm study was conducted using data from the 2018–19 academic year. The NPR of 20 is recommended by RTI experts including NCII as an appropriate threshold for establishing at-risk classifications.

Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.: The Exact Path Diagnostic Assessments include dichotomously scored multiple-choice and technology-enhanced item types. All the items on the Diagnostic Assessments are machine scored in real time. The Diagnostic Assessments provide multiple scores to describe student learning levels and student progress throughout the year. The scale scores were developed with the 1-parameter Rasch item response theory model and are placed on a vertical scale from 500 to 1500 spanning all grades K–12 for each subject. Students also receive a raw score for each domain (i.e., number of items answered correctly out of number of items delivered). In addition, student reports contain growth scores (between administrations), Lexile®/Quantile® measures, and national percentile ranks. Performance levels, called Grade Level Proficiency classifications, categorize students into four performance levels, with the top two levels indicating on-grade level achievement in mathematics, language arts, and reading. Upon completion of the Diagnostic Assessments, Exact Path generates an individualized learning path based on students’ unique proficiency by domain, providing students with access to content appropriate for their instructional level, regardless of their grade level.
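For readers unfamiliar with the Rasch model mentioned above, the sketch below shows the standard 1-parameter response function and the raw domain score as described (items correct out of items delivered). The function names are illustrative, and ability and difficulty are expressed in logits; the transformation from logits to the 500–1500 vertical scale is not given here, so it is deliberately omitted.

```python
import math


def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch (1PL) model.

    theta: student ability in logits; b: item difficulty in logits.
    """
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def raw_domain_score(responses: list[bool]) -> tuple[int, int]:
    """Raw score for one domain: (items correct, items delivered)."""
    return sum(responses), len(responses)
```

When ability equals difficulty the model gives a 50% chance of a correct answer, which is why adaptive tests tend to deliver items near the current ability estimate.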

Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.: Exact Path provides computer-adaptive assessments in math, reading, and language arts that can be administered up to five times per academic year (though three times is most common) to efficiently pinpoint where students are ready to start learning and to measure their growth between assessments. The assessments use a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The algorithm selects the first question based on either a student’s enrolled grade level (if first time testing) or the student's previous diagnostic score. As the student progresses, each item presented depends on whether they answer the previous item correctly (receive more difficult item) or incorrectly (receive easier item). In this way, students receive assessments tailored to their skill levels, resulting in delivery of precise, accurate results for each content area that can be used to inform instruction and interventions. The adaptive algorithm uses consistent stopping rules for all learners that are based on the precision of the student score, so that scores are highly reliable for low performing students, average performing students, and high performing or gifted students. The Exact Path Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Item writers are trained according to Edmentum’s internal Assessment Item Writing Guide and Item Specifications, which includes Fairness, Bias, and Sensitivity guidance. 
Each item begins with a task model containing all parameters for that item, from standards, depth of knowledge (DOK), and readability to considerations for bias and sensitivity. Once written, each item undergoes two rounds of review and revision, including bias and sensitivity reviews. Furthermore, extensive accommodations are available for use both within and outside of the Exact Path platform to support the diverse needs of students, including students from linguistically and culturally diverse backgrounds as well as students with disabilities. More information about accommodations for students with disabilities is provided at the end of the Descriptive Information section. Teachers can make appropriate accommodations for students who are English language learners, such as providing a dictionary, helping to pronounce words, and offering any other accommodation students receive instructionally. However, teachers should not give substantive help interpreting text. Exact Path has been awarded WIDA Prime V2 Correlation, indicating our ability to address English language learners’ listening, speaking, reading, and writing needs. Exact Path includes built-in text-to-speech functionality, closed captions for videos, and highlighted vocabulary words with built-in tools for translation, definition, and audio support. EdMetric (2022) conducted an independent study of DIF in Edmentum’s item bank to examine the impact of four grouping variables (gender, race, socioeconomic status, and pandemic effect) on items. Investigation utilized the Mantel-Haenszel (MH) procedure, which enables the use of the classification system by Educational Testing Service to separate items into differing levels of DIF including negligible DIF (A-level), moderate DIF (B-level), and large DIF (C-level). Items flagged with B- and C-level DIF would indicate students in the groups of interest perform differently on the item. 
No items in Edmentum’s item bank were flagged for B- or C-level DIF, indicating that items in Edmentum’s item bank measure student achievement from different groups in a similar manner and providing some evidence that the items are fair for different groups. This evidence reflects the attention to fairness and the measures taken to avoid bias and sensitivity throughout the item development process. The study is accessible online at https://www.edmentum.com/resources/efficacy/exact-path-independent-study-differential-item-analysis-edmentums-item-bank.
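The adaptive delivery described in this response (start from a prior estimate, move toward harder items after correct answers and easier items after incorrect ones, stop once the score is precise enough) can be sketched as follows. This is a generic illustration under stated assumptions, not Edmentum's algorithm: the nearest-difficulty selection rule, the damped Newton update, the `se_target` threshold, and all names are hypothetical, and values are in logits rather than on the reported scale.

```python
import math


def rasch_p(theta, b):
    """Rasch probability of a correct response (logit units)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def information(theta, difficulties):
    """Fisher information of the administered items at theta."""
    return sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)


def run_adaptive_test(pool, answer_fn, start_theta=0.0, se_target=0.5, max_items=30):
    """Deliver items until the standard error reaches se_target.

    pool: item difficulties; answer_fn(b) -> bool simulates a response.
    Returns (theta, standard_error, [(difficulty, correct), ...]).
    """
    remaining = list(pool)
    theta = start_theta          # e.g. from grade level or a prior score
    administered = []
    se = float("inf")
    while remaining and len(administered) < max_items:
        # Assumed selection rule: unused item closest to the current estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        administered.append((b, bool(answer_fn(b))))
        given = [d for d, _ in administered]
        # One Newton step toward the maximum-likelihood estimate,
        # clamped to +/-1 logit so early all-correct runs stay finite.
        residual = sum((1.0 if c else 0.0) - rasch_p(theta, d) for d, c in administered)
        step = residual / information(theta, given)
        theta += max(-1.0, min(1.0, step))
        # Precision-based stopping rule: stop once the SE is small enough.
        se = 1.0 / math.sqrt(information(theta, given))
        if se <= se_target:
            break
    return theta, se, administered
```

For example, `run_adaptive_test([x / 2 for x in range(-6, 7)], lambda b: b <= 1.0)` simulates a student who answers every item at or below difficulty 1.0 correctly; the first items delivered rise in difficulty until the student misses one.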

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grade	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Classification Accuracy Fall
Classification Accuracy Winter
Classification Accuracy Spring

Legend

Convincing evidence

Partially convincing evidence

Unconvincing evidence

Data unavailable

^dDisaggregated data available

Wisconsin Forward Exam

Classification Accuracy

Select time of year

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.: Wisconsin’s Statewide Achievement Assessment, called the Forward Exam, is administered in grades 3–8 in English Language Arts and Mathematics. The Forward Exam criterion measure is completely independent from the screening measure. The Forward Exam is developed by DRC in collaboration with the state of Wisconsin. However, the Forward Exam is an appropriate criterion measure because there is substantial overlap in the content assessed on the Mathematics Forward Exam and the Exact Path Diagnostic Mathematics Assessment, and likewise substantial overlap in the content assessed on the English Language Arts Forward Exam and the Exact Path Diagnostic Reading Assessment.

Do the classification accuracy analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.

Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).: Consistent with NCII’s guidance, the 20th percentile on the Forward Exam was used as the criterion cut-point. The scale scores on the Forward Exam that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the Forward Exam classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
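
The cut-score search described above can be sketched as follows. This is a minimal illustration of one reading of the description (choose the screener cut that maximizes the share of true positives plus true negatives), not the procedure actually used in the study. The function name `best_screener_cut`, the candidate-cut grid, and the tie-breaking rule are assumptions made for the example.

```python
from typing import Sequence, Tuple


def best_screener_cut(
    screener_scores: Sequence[float],
    criterion_at_risk: Sequence[bool],
) -> Tuple[float, float]:
    """Return (cut, accuracy) for the cut that maximizes overall accuracy.

    A student is flagged "at risk" when the screener score is below the cut.
    criterion_at_risk[i] is True when student i scored below the criterion
    cut-point (for example, the 20th percentile on the state test).
    """
    if len(screener_scores) != len(criterion_at_risk):
        raise ValueError("inputs must have the same length")
    if not screener_scores:
        raise ValueError("inputs must not be empty")

    n = len(screener_scores)
    # Assumed candidate grid: every observed score, plus one value above the
    # maximum so that "flag every student" is also considered.
    candidates = sorted(set(screener_scores))
    candidates.append(candidates[-1] + 1)

    best_cut, best_acc = candidates[0], -1.0
    for cut in candidates:
        # Count agreements: true positives plus true negatives.
        correct = sum(
            (score < cut) == at_risk
            for score, at_risk in zip(screener_scores, criterion_at_risk)
        )
        acc = correct / n
        if acc > best_acc:  # strict ">" keeps the lowest cut on ties (assumption)
            best_cut, best_acc = cut, acc
    return best_cut, best_acc
```

In this sketch the search would be run once per grade, on that grade's students only, which mirrors the statement that the process was applied separately by grade.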

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?: Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.: The Wisconsin districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the 2020–21 school year.

Cross-Validation

Has a cross-validation study been conducted?: No
If yes,

Select time of year.

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.

Do the cross-validation analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.

Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Arizona Statewide Achievement Assessment - AzMerit 2 (AzM2)

Classification Accuracy

Select time of year

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.: Arizona’s Statewide Achievement Assessment, the AzMerit 2 (AzM2), was administered in grades 3–8 in English Language Arts and Mathematics during the 2020–21 academic year. AzM2 is completely independent from the screening measure and was developed by Cambium Assessment in collaboration with the state of Arizona. However, the AzM2 is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics AzM2 and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts AzM2 Exam and the Exact Path Diagnostic Reading Assessment.

Do the classification accuracy analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.: AzM2 is administered to students in Arizona once per year in the spring. Exact Path Diagnostic Assessments are administered in the fall, winter, and spring. When evaluating the classification accuracy of Exact Path Diagnostic Assessments, the spring administration often takes place quite close to the AzM2 spring administration. Thus, the classification results can be considered concurrent. However, the classification results for the fall and winter Exact Path scores are from administrations timed several months before the AzM2 administration. Thus, these classification accuracy results can be considered predictive.

Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).: Consistent with NCII’s guidance, the 20th percentile on the AzM2 was used as the criterion cut-point. The scale scores on the AzM2 that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the AzM2 classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?: Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.: The Arizona districts that supplied data for this study had access to the Exact Path Learning Path which provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the 2020–21 school year.

Cross-Validation

Has a cross-validation study been conducted?: No
If yes,

Select time of year.

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.

Do the cross-validation analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.

Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Ohio State Test (OST)

Classification Accuracy

Select time of year

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.: Ohio’s Statewide Achievement Assessment, the Ohio State Test (OST), was administered in grades 3–8 in English Language Arts and Mathematics during the 2020–21 and 2021–22 academic years. OST is completely independent from the screening measure and was developed by Cambium Assessment in collaboration with the state of Ohio. However, the OST is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics OST and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts OST Exam and the Exact Path Diagnostic Reading Assessment.

Do the classification accuracy analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.: OST is administered to students in Ohio once per year in the spring. Exact Path Diagnostic Assessments are administered in the fall, winter, and spring. When evaluating the classification accuracy of Exact Path Diagnostic Assessments, the spring administration often takes place quite close to the OST spring administration. Thus, the classification results can be considered concurrent. However, the classification results for the fall and winter Exact Path scores are from administrations timed several months before the OST administration. Thus, these classification accuracy results can be considered predictive.

Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).: Consistent with NCII’s guidance, the 20th percentile on the OST was used as the criterion cut-point. The scale scores on the OST that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the OST classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?: Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.: The Ohio districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the school year.

Cross-Validation

Has a cross-validation study been conducted?: No
If yes,

Select time of year.

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.

Do the cross-validation analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.

Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Classification Accuracy - Fall

Evidence	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Criterion measure	Ohio State Test (OST)	Ohio State Test (OST)	Arizona Statewide Achievement Assessment - AzMerit 2 (AzM2)	Wisconsin Forward Exam	Wisconsin Forward Exam	Ohio State Test (OST)
Cut Points - Percentile rank on criterion measure	20	20	20	20	20	20
Cut Points - Performance score on criterion measure	665	676	3512	560	579	677
Cut Points - Corresponding performance score (numeric) on screener measure	856	908	899	1001	1019	1064
Classification Data - True Positive (a)	119	150	21	42	28	87
Classification Data - False Positive (b)	107	134	15	30	40	101
Classification Data - False Negative (c)	27	37	3	3	4	21
Classification Data - True Negative (d)	439	537	107	221	214	295
Area Under the Curve (AUC)	0.88	0.87	0.93	0.92	0.91	0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound	0.85	0.84	0.88	0.87	0.87	0.80
AUC Estimate’s 95% Confidence Interval: Upper Bound	0.91	0.89	0.98	0.98	0.95	0.88

Statistics	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Base Rate	0.21	0.22	0.16	0.15	0.11	0.21
Overall Classification Rate	0.81	0.80	0.88	0.89	0.85	0.76
Sensitivity	0.82	0.80	0.88	0.93	0.88	0.81
Specificity	0.80	0.80	0.88	0.88	0.84	0.74
False Positive Rate	0.20	0.20	0.12	0.12	0.16	0.26
False Negative Rate	0.18	0.20	0.13	0.07	0.13	0.19
Positive Predictive Power	0.53	0.53	0.58	0.58	0.41	0.46
Negative Predictive Power	0.94	0.94	0.97	0.99	0.98	0.93
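
As a worked check, the statistics in this table follow directly from the four classification counts (a, b, c, d) reported in the evidence table above. The sketch below uses the standard 2×2 definitions; the class name `Confusion` and the dictionary keys are illustrative and are not taken from NCII's classification worksheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    tp: int  # a: flagged by the screener and at risk on the criterion
    fp: int  # b: flagged by the screener but not at risk on the criterion
    fn: int  # c: not flagged by the screener but at risk on the criterion
    tn: int  # d: not flagged and not at risk

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def indices(self) -> dict:
        return {
            "base_rate": (self.tp + self.fn) / self.n,
            "overall_classification_rate": (self.tp + self.tn) / self.n,
            "sensitivity": self.tp / (self.tp + self.fn),
            "specificity": self.tn / (self.tn + self.fp),
            "false_positive_rate": self.fp / (self.fp + self.tn),
            "false_negative_rate": self.fn / (self.fn + self.tp),
            "positive_predictive_power": self.tp / (self.tp + self.fp),
            "negative_predictive_power": self.tn / (self.tn + self.fn),
        }


# Grade 3 fall counts from the evidence table above.
grade3_fall = Confusion(tp=119, fp=107, fn=27, tn=439)
for name, value in grade3_fall.indices().items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these values agree with the Grade 3 column of the fall statistics table, and the four counts sum to the reported sample size of 692.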

Sample	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Date	2020-21 & 2021-22	2020-21 & 2021-22	2020-21	2020-21	2020-21	2020-21 & 2021-22
Sample Size	692	858	146	296	286	504
Geographic Representation	East North Central (OH)	East North Central (OH)	Mountain (AZ)	East North Central (WI)	East North Central (WI)	East North Central (OH)
Male	50.7%	54.0%	52.1%	47.0%	43.0%	50.0%
Female	48.8%	46.0%	47.9%	43.2%	49.0%	49.0%
Other						0.2%
Gender Unknown						0.2%
White, Non-Hispanic	68.6%	70.0%	4.1%	72.6%	77.6%	74.0%
Black, Non-Hispanic	14.9%	15.0%	5.5%	0.7%	1.7%	15.1%
Hispanic	2.0%	3.0%	86.3%	13.5%	9.8%	2.0%
Asian/Pacific Islander	2.0%	2.0%	0.7%	1.0%	1.4%	1.0%
American Indian/Alaska Native			2.1%	0.7%	0.7%
Other	13.0%	10.0%	1.4%	1.4%	1.0%	6.9%
Race / Ethnicity Unknown
Low SES	19.9%	22.0%		27.0%	23.4%	19.0%
IEP or diagnosed disability	18.9%	14.0%	10.3%	13.2%	9.1%	18.1%
English Language Learner	2.0%	2.0%	20.5%	3.4%	2.8%	0.4%

Classification Accuracy - Winter

Evidence	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Criterion measure	Ohio State Test (OST)	Ohio State Test (OST)	Wisconsin Forward Exam	Wisconsin Forward Exam	Arizona Statewide Achievement Assessment - AzMerit 2 (AzM2)	Ohio State Test (OST)
Cut Points - Percentile rank on criterion measure	20	20	20	20	20	20
Cut Points - Performance score on criterion measure	665	676	557	560	3574	677
Cut Points - Corresponding performance score (numeric) on screener measure	887	929	986	1034	1001	1068
Classification Data - True Positive (a)	141	149	30	42	25	106
Classification Data - False Positive (b)	129	115	32	45	24	114
Classification Data - False Negative (c)	27	35	3	2	2	32
Classification Data - True Negative (d)	514	591	210	206	99	394
Area Under the Curve (AUC)	0.89	0.88	0.94	0.94	0.89	0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound	0.86	0.85	0.91	0.91	0.83	0.80
AUC Estimate’s 95% Confidence Interval: Upper Bound	0.92	0.91	0.98	0.97	0.95	0.87

Statistics	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Base Rate	0.21	0.21	0.12	0.15	0.18	0.21
Overall Classification Rate	0.81	0.83	0.87	0.84	0.83	0.77
Sensitivity	0.84	0.81	0.91	0.95	0.93	0.77
Specificity	0.80	0.84	0.87	0.82	0.80	0.78
False Positive Rate	0.20	0.16	0.13	0.18	0.20	0.22
False Negative Rate	0.16	0.19	0.09	0.05	0.07	0.23
Positive Predictive Power	0.52	0.56	0.48	0.48	0.51	0.48
Negative Predictive Power	0.95	0.94	0.99	0.99	0.98	0.92

Sample	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Date	2020-21 & 2021-22	2020-21 & 2021-22	2020-21	2020-21	2020-21	2020-21 & 2021-22
Sample Size	811	890	275	295	150	646
Geographic Representation	East North Central (OH)	East North Central (OH)	East North Central (WI)	East North Central (WI)	Mountain (AZ)	East North Central (OH)
Male	51.0%	54.0%	40.7%	47.1%	49.3%	49.1%
Female	49.0%	46.0%	45.1%	43.4%	50.7%	50.9%
Other
Gender Unknown						0.2%
White, Non-Hispanic	70.0%	70.0%	69.5%	72.9%	6.0%	74.0%
Black, Non-Hispanic	14.1%	15.1%	2.5%	0.7%	6.7%	15.0%
Hispanic	3.0%	2.0%	11.3%	13.6%	83.3%	2.0%
Asian/Pacific Islander	2.0%	3.0%	0.4%	1.0%	0.7%	0.9%
American Indian/Alaska Native			0.7%	0.7%	2.0%
Other	12.0%	10.0%	1.5%	1.4%	2.7%	8.0%
Race / Ethnicity Unknown
Low SES	27.0%	23.0%	26.5%	27.1%		22.0%
IEP or diagnosed disability	18.0%	15.1%	10.2%	13.2%	10.7%	15.9%
English Language Learner	2.0%	2.0%	4.7%	3.4%	20.0%	0.9%

Classification Accuracy - Spring

Evidence	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Criterion measure	Ohio State Test (OST)	Ohio State Test (OST)	Wisconsin Forward Exam	Wisconsin Forward Exam	Wisconsin Forward Exam	Ohio State Test (OST)
Cut Points - Percentile rank on criterion measure	20	20	20	20	20	20
Cut Points - Performance score on criterion measure	665	676	557	560	579	678
Cut Points - Corresponding performance score (numeric) on screener measure	918	977	1031	1042	1079	1084
Classification Data - True Positive (a)	170	185	34	43	33	61
Classification Data - False Positive (b)	117	146	38	29	43	51
Classification Data - False Negative (c)	22	18	0	1	3	14
Classification Data - True Negative (d)	600	619	164	195	184	224
Area Under the Curve (AUC)	0.92	0.92	0.96	0.96	0.93	0.86
AUC Estimate’s 95% Confidence Interval: Lower Bound	0.91	0.90	0.93	0.94	0.89	0.82
AUC Estimate’s 95% Confidence Interval: Upper Bound	0.94	0.94	0.98	0.98	0.96	0.90

Statistics	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Base Rate	0.21	0.21	0.14	0.16	0.14	0.21
Overall Classification Rate	0.85	0.83	0.84	0.89	0.83	0.81
Sensitivity	0.89	0.91	1.00	0.98	0.92	0.81
Specificity	0.84	0.81	0.81	0.87	0.81	0.81
False Positive Rate	0.16	0.19	0.19	0.13	0.19	0.19
False Negative Rate	0.11	0.09	0.00	0.02	0.08	0.19
Positive Predictive Power	0.59	0.56	0.47	0.60	0.43	0.54
Negative Predictive Power	0.96	0.97	1.00	0.99	0.98	0.94

Sample	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Date	2020-21 & 2021-22	2020-21 & 2021-22	2020-21	2020-21	2020-21	2020-21 & 2021-22
Sample Size	909	968	236	268	263	350
Geographic Representation	East North Central (OH)	East North Central (OH)	East North Central (WI)	East North Central (WI)	East North Central (WI)	East North Central (OH)
Male	51.0%	54.0%	47.5%	51.9%	46.8%	51.1%
Female	49.0%	45.0%	52.5%	47.8%	53.2%	49.1%
Other
Gender Unknown
White, Non-Hispanic	68.0%	70.0%	80.9%	80.2%	84.4%	77.1%
Black, Non-Hispanic	16.0%	15.0%	3.0%	0.7%	1.9%	13.1%
Hispanic	4.0%	3.0%	13.1%	14.9%	10.6%
Asian/Pacific Islander	2.0%	3.0%	0.4%	1.1%	1.5%	0.9%
American Indian/Alaska Native		0.2%	0.8%	0.7%	0.8%
Other	12.0%	10.0%	1.7%	1.5%	1.1%	9.1%
Race / Ethnicity Unknown
Low SES	27.0%	22.0%	30.9%	29.9%	25.5%
IEP or diagnosed disability	17.1%	15.0%	11.9%	14.6%	9.9%	17.1%
English Language Learner	2.0%	2.0%	5.5%	3.7%	3.0%	1.1%

Reliability

Grade	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Rating

*Offer a justification for each type of reliability reported, given the type and purpose of the tool.: The analyses considered in this study included split-half and marginal reliability. These measures both assess the internal consistency of the tests under consideration, split-half from the context of classical test theory and marginal from the context of item response theory. Marginal reliability utilizes the item response theory (IRT) ability estimates and standard errors of the ability estimates to create a weighted average index of reliability akin to a test-retest correlation under classical test theory. The marginal reliability is a ratio of the variance of the estimated latent abilities relative to the sum of the variance of the latent ability and the expected error variance. Split-half reliability provides an estimate of alternate form reliability by dividing the test into equal halves, correlating the scores from the shortened forms, and using the Spearman-Brown formula to estimate the alternative form reliability for full-length test forms. Split-half is a more appropriate type of internal consistency reliability metric than coefficient alpha because the Exact Path Diagnostic Assessment is a computer adaptive assessment rather than a fixed-form assessment. The Exact Path Diagnostic Assessment is a variable-length adaptive assessment where the test terminates once the standard error of measurement is less than or equal to 40 scale score points for mathematics. The stopping rule ensures that the standard error of measurement is consistent across the scale: the scores of low-, average-, and high-achieving students all have the same measurement precision.

*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.: Students who took the Exact Path Diagnostic Mathematics Assessment during the 2020–21 school year are included in the analysis. The sample is not strictly nationally representative, but students from nearly all 50 states are included in the dataset. As shown in the reliability table, sample sizes by grade and subject ranged from approximately 75,000 to over 90,000.

*Describe the analysis procedures for each reported type of reliability.: Split-half reliability coefficients were estimated for each subject and grade combination. Split-half reliability resembles a test-retest condition when a single test has been administered. For a CAT, the odd and even items are used to create the two half-length forms. The correlation (r) between scores on the two forms represents the consistency of the measure. Split-half reliability is then determined for the whole test by using the Spearman-Brown formula (ρ = 2r/(1 + r)) to adjust the correlation to account for the full length of the test. Marginal reliability coefficients were also computed for each subject and grade combination. Traditional reliability estimators are based on the classical test theory (CTT) definition of reliability as the ratio of true score variance to observed score variance, which is operationalized as the ratio of the true score variance to the sum of the true score and error variances. Under CTT, error variance is set to be constant across all true scores, while in item response theory (IRT) error varies as a function of the latent ability. Because of this difference, a single overall reliability in the context of IRT is an oversimplification of the reliability of the scores produced by the test. However, methods have been developed to approximate the traditional reliability in the IRT context. To account for the varying error across the latent ability distribution, the error variance can be integrated (Green, Bock, Humphreys, Linn, & Reckase, 1984). This can be further simplified by taking the mean of the squared standard error of measurement (SEM; Sireci, Thissen, & Wainer, 1991). Thus, the marginal reliability for IRT scores is the ratio of the variance of the estimated latent abilities relative to the sum of the variance of the latent ability and the expected error variance. 
To compute confidence intervals around split-half and marginal reliability, a bootstrapping approach was used.
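
The procedures above can be sketched in a few lines. This is a hedged illustration, not the code used for the reported coefficients: the function names are hypothetical, the marginal reliability follows the ratio as worded above (variance of ability estimates over that variance plus the mean squared SEM), and a percentile bootstrap is assumed because the text does not say which bootstrap variant was used.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half_reliability(odd: Sequence[float], even: Sequence[float]) -> float:
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    r = pearson_r(odd, even)
    return 2 * r / (1 + r)


def marginal_reliability(theta: Sequence[float], sem: Sequence[float]) -> float:
    """var(theta_hat) / (var(theta_hat) + mean(SEM^2)), as worded in the text."""
    var_theta = statistics.pvariance(theta)
    mean_error_var = statistics.fmean([s ** 2 for s in sem])
    return var_theta / (var_theta + mean_error_var)


def bootstrap_ci(
    rows: Sequence,
    stat: Callable[[List], float],
    reps: int = 1000,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile 95% interval: resample students with replacement (assumed)."""
    rng = random.Random(seed)
    n = len(rows)
    estimates = sorted(
        stat([rows[rng.randrange(n)] for _ in range(n)]) for _ in range(reps)
    )
    return estimates[int(0.025 * reps)], estimates[int(0.975 * reps) - 1]
```

To bootstrap a split-half coefficient with this sketch, `rows` would hold one (odd-half score, even-half score) pair per student and `stat` would unpack the resampled pairs before calling `split_half_reliability`.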

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of reliability analysis not compatible with above table format:

Manual cites other published reliability studies:

Provide citations for additional published studies.

Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?: No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of reliability analysis not compatible with above table format:

Manual cites other published reliability studies:

Provide citations for additional published studies.

Validity

Grade	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Rating

*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.: The Exact Path Diagnostic Assessment has extensive research supporting the validity of the assessment. Validity evidence is collected and evaluated according to the recommendations in the Standards for Educational and Psychological Testing (https://www.testingstandards.net/). The Exact Path technical report describes evidence based on test content, response processes, internal structure, relations to other variables, and consequences. In this section, we provide validity evidence in terms of relations to other variables, i.e., criterion validity. Four criterion measures are included in the correlations provided: Arizona’s end-of-year summative assessments from 2020–21 (AzM2) and 2021–22 (AASA), Indiana’s summative assessment from 2018–19 through 2020–21 (ILEARN), and Wisconsin’s summative assessment from 2020–21 (Forward Exam). These four criterion measures are completely external to Edmentum’s Exact Path Diagnostic Assessment screening system. However, all four external criterion measures and Exact Path are measures of mathematics proficiency. Thus, while the measures are aligned to different blueprints and different sets of standards, correlations are expected to be moderate to high. By providing criterion measures across a sample of states, we demonstrate the generalizability of the Exact Path Diagnostic Assessment as a valid screener of mathematics proficiency. Similar validity coefficients have been observed across other states (see https://www.edmentum.com/resources/research for more information).

*Describe the sample(s), including size and characteristics, for each validity analysis conducted.: The Arizona AzM2 and Wisconsin Forward Exam samples are from the 2020–21 academic year. The Indiana sample includes students from the 2018–19 and 2020–21 academic years. The Arizona AASA sample is from the 2021–22 academic year. The number of students per grade and criterion measure ranges from 216 to over 2,800.

*Describe the analysis procedures for each reported type of validity.: Students’ scale scores from state summative assessments are merged with scale scores from the Exact Path Diagnostic Assessment. For concurrent validity correlation coefficients, Exact Path and state scale scores are both from the spring testing window. For predictive validity correlation coefficients, the scale scores are from the same academic year, but Exact Path scale scores are from the fall testing window, while the criterion state scale scores are from the spring testing window. Validity coefficients are Pearson correlations, and the Fisher z-transformation was used to determine the 95 percent confidence interval.
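
The confidence interval step can be illustrated with a short sketch of the Fisher z-transformation for a Pearson correlation. The function name `fisher_ci` and the example values are assumptions for illustration; they are not results from the study.

```python
import math
from statistics import NormalDist
from typing import Tuple


def fisher_ci(r: float, n: int, level: float = 0.95) -> Tuple[float, float]:
    """Confidence interval for a Pearson correlation via Fisher's z."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)       # standard error of z
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    # Build the interval on the z scale, then transform back to r.
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Hypothetical coefficient of 0.80 with n = 216, the smallest sample size
# mentioned in the sample description.
lower, upper = fisher_ci(0.80, 216)
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")
```

Because the interval is built on the z scale and transformed back, it is asymmetric around r and always stays within the range -1 to 1.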

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of validity analysis not compatible with above table format:

Manual cites other published reliability studies:: Yes

Provide citations for additional published studies.:
Edmentum Exact Path and the Quantile Framework Linking Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/Exact%20Path%20and%20Quantile%20Linking%20Study%20Abstract%201.9.20.pdf
Exact Path Diagnostic and the State of Texas Assessment of Academic Readiness (STAAR) Correlational Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/TX%20correlation%20report%20Exact%20Path%20and%20STAAR.pdf
Exact Path Diagnostic and Pennsylvania System of School Assessment (PSSA) Correlational Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/PA-correlational-study-XP-and-PSSA.pdf

Describe the degree to which the provided data support the validity of the tool.: The Exact Path Diagnostic Mathematics Assessment can be used to screen students who are at risk for poor mathematics outcomes. The state summative assessment scores are typically the achievement outcomes of most importance to each state. Thus, having a screener that correlates well with these end-of-year tests is very important. The lower bound of the 95 percent confidence interval is well above 0.6 for both concurrent and predictive validity coefficients between the Exact Path Diagnostic Assessment and all of the criterion measures provided. In fact, some of the coefficients are above 0.8. These are very strong correlations despite differences in blueprint, test design, administration conditions, and test purposes. These data support the validity of Exact Path Diagnostic Assessment as a screener tool, and the generalizability of validity across various states.

Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?: No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of validity analysis not compatible with above table format:

Manual cites other published reliability studies:

Provide citations for additional published studies.

Bias Analysis

Grade	Grade 3	Grade 4	Grade 5	Grade 6	Grade 7	Grade 8
Rating	Yes	Yes	Yes	Yes	Yes	Yes

Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.: Yes

If yes,
a. Describe the method used to determine the presence or absence of bias:: A third-party differential item functioning (DIF) study was conducted by EdMetric to evaluate the Exact Path Diagnostic Assessment item pool. The study consisted of a series of DIF analyses of student responses to items drawn from Edmentum’s item bank and delivered through Edmentum’s Exact Path Diagnostic Computerized Adaptive Test. All analyses were performed using the difR (Magis, Beland, Tuerlinckx, & De Boeck, 2010) and tidyverse (v1.3.0; Wickham et al., 2019) packages in R. The presence of DIF was investigated using the Mantel-Haenszel (MH) procedure (Clauser & Mazor, 1998). This method detects uniform DIF without requiring an item response theory model. The MH procedure is straightforward to implement and supports the classification system established by Educational Testing Service (Zwick & Ercikan, 1989).
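The study itself used the difR package in R. The following Python sketch only illustrates the core Mantel-Haenszel computation for a single item: responses are grouped into matched ability strata, a common odds ratio is pooled across strata, and the result is placed on the ETS delta scale. The function name, the record layout, and the choice of strata are illustrative assumptions, not details of the EdMetric analysis.

```python
import math
from collections import defaultdict


def mantel_haenszel_delta(records):
    """Return (alpha_MH, delta_MH) for one item.

    records: iterable of (stratum, group, correct) tuples, where group is
    "ref" or "focal" and correct is 0 or 1. How students are matched into
    strata (for example, binned ability estimates) is an assumption here.
    """
    # Per stratum: [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for stratum, group, correct in records:
        idx = (0 if group == "ref" else 2) + (0 if correct else 1)
        tables[stratum][idx] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n   # reference correct x focal incorrect
        den += b * c / n   # reference incorrect x focal correct
    if num == 0 or den == 0:
        raise ValueError("odds ratio is undefined for these data")

    alpha = num / den                  # MH common odds ratio
    delta = -2.35 * math.log(alpha)    # ETS delta scale (MH D-DIF)
    return alpha, delta
```

A delta near zero indicates that matched reference and focal students have similar odds of answering the item correctly; negative values indicate the item is relatively harder for the focal group.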

b. Describe the subgroups for which bias analyses were conducted:: The analyses examined bias in relation to gender, race, socioeconomic status, and pandemic effect. 1) Gender. Gender classification was available in the student-level data set for almost half of the students; no gender was indicated for the remaining students. The data were fairly evenly split between males and females, with slightly under half identified as female and slightly over half identified as male. 2) Race. Because Edmentum’s student-level data files provide very limited demographic information, race was assigned at the district level. The account-level data provide the percentage of white students in each school district, and this percentage was assigned to each student based on their school district. Students were considered part of a high-majority district (coded as 1) if 50 percent or more of its students were white, and part of a low-majority district (coded as 0) otherwise. Nearly two-thirds of the students were from high-majority districts, while approximately one-third were from low-majority districts. 3) Socioeconomic Status. The account-level data provided the percentage of children in the district from families below the poverty line. The poverty data were sourced from the U.S. Census Bureau's Small Area Income and Poverty Estimates (SAIPE) program. This percentage was calculated as the ratio of children in a district from families below the poverty line to all children in the district. Students were considered part of a high-poverty district (coded as 1) if more than 17 percent of students in the district were living in poverty, and part of a low-poverty district (coded as 0) otherwise. Originally, the intention was to identify high-poverty districts using a 50 percent cutoff; however, very few districts had more than 50 percent of students living in poverty. Therefore, the average percentage of students in poverty was used to divide the data instead. Nearly 60 percent of students were from school districts classified as high-poverty districts, while nearly 40 percent were from low-poverty districts. 4) Pandemic Effect. The pandemic grouping variable was obtained by appending the pre-pandemic data (all items administered prior to March 2020) to the pandemic data (all items administered after March 2020). The pre-pandemic data combined data from the 2018–2019 and 2019–2020 data sets, while the pandemic data combined any responses from administrations after March 2020.
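The district-level coding rules described above can be sketched in Python as follows. The `Response` fields and the `code_groups` function are hypothetical names introduced for illustration. The exact boundary date for the pandemic split is also an assumption, since the description only says "prior to" and "after" March 2020.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: responses from March 1, 2020 onward count as pandemic data.
PANDEMIC_START = date(2020, 3, 1)


@dataclass
class Response:
    district_white_pct: float     # percent of white students in the district
    district_poverty_pct: float   # percent of children below the poverty line
    administered: date            # date the item was administered


def code_groups(r: Response) -> dict:
    """Code one response into the binary grouping variables used for DIF."""
    return {
        # 1 = high-majority district: 50 percent or more white students
        "majority": 1 if r.district_white_pct >= 50 else 0,
        # 1 = high-poverty district: more than 17 percent in poverty
        "high_poverty": 1 if r.district_poverty_pct > 17 else 0,
        # 1 = pandemic data, 0 = pre-pandemic data
        "pandemic": 1 if r.administered >= PANDEMIC_START else 0,
    }
```

Gender is omitted from the sketch because it was recorded at the student level rather than derived from a district cutoff.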

c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.: Under the ETS classification system, items are classified as showing A-, B-, or C-level DIF. Items classified with A-level DIF have “little or no difference between the two matched groups” (Zieky, 2003). Items flagged with B- and C-level DIF are typically evaluated for potential bias. Despite the large number of items in Edmentum’s item bank, no items were flagged for B- or C-level DIF. Thus, across the four grouping variables considered in the DIF analysis, the Edmentum items appear to be unbiased.
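For readers unfamiliar with the A/B/C categories, the sketch below shows a commonly cited form of the ETS rules applied to an item's MH D-DIF value (the Mantel-Haenszel statistic on the ETS delta scale) and its standard error. The function name and the specific z thresholds are assumptions for illustration; the study's exact flagging criteria are not given in this submission.

```python
def classify_ets(delta, se, z_two_sided=1.96, z_one_sided=1.645):
    """Classify one item as "A", "B", or "C" under simplified ETS rules.

    delta: MH D-DIF value for the item.
    se: standard error of delta (must be positive).
    The z thresholds are assumptions; implementations differ in detail.
    """
    if se <= 0:
        raise ValueError("se must be positive")
    magnitude = abs(delta)
    # A: negligible DIF, either small or not significantly different from 0.
    if magnitude < 1.0 or magnitude / se <= z_two_sided:
        return "A"
    # C: large DIF, at least 1.5 and significantly greater than 1.0.
    if magnitude >= 1.5 and (magnitude - 1.0) / se > z_one_sided:
        return "C"
    # B: everything in between.
    return "B"
```

Under rules of this kind, the reported result corresponds to every item in the pool returning "A" for each of the four grouping variables.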

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.
