Exact Path Diagnostic Assessment
Mathematics

Summary

Exact Path’s Diagnostic Mathematics Assessment is a computer-adaptive assessment that can be administered up to five times per academic year to screen and identify students in need of intervention. The Diagnostic Assessment efficiently pinpoints where students are ready to start learning and measures their growth between assessments. The assessment uses a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The assessment delivers an overall placement score, plus a score for each domain, all reported in real time. Next, the assessment generates an individualized learning path based on a student’s unique domain levels. Learning paths take students to content appropriate for their instructional level, regardless of grade level. Highlights: 1) Valid and reliable assessment that diagnoses each student’s strengths and needs and pinpoints exactly where the student is ready to start learning. 2) Efficient measurement. Students in kindergarten and grade 1 have median times under 15 minutes. For students in grades 2–12, median times range from 23 to 53 minutes. 3) Real-time reporting of Quantile® measures, national percentile ranks, grade-level proficiency, and student growth upon each successive administration. 4) Universal accessibility tools including read-aloud, text magnification, and highlighting. 5) Easy to schedule and administer. Administrators and teachers can auto-schedule tests for all students, or they can manually adjust the schedule for specific students.

Where to Obtain:
Edmentum, Inc.
info@edmentum.com
5600 West 83rd Street, Suite 300, 8200 Tower, Bloomington, MN 55437
800.447.5286
https://info.edmentum.com/get-a-quote.html
Initial Cost:
$7.00 per student
Replacement Cost:
$7.00 per student per year
Included in Cost:
Edmentum’s professional services team, in tandem with our support team, works with our district and school partners through the implementation process and beyond to ensure your program is a success. In addition to the included implementation support, professional learning and engagement experiences are also available. Our detailed Edmentum Professional Services and Consulting Catalog can be found at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Edmentum offers unparalleled flexibility in program cost structure, with per-student and districtwide options. Understanding that no two schools or districts are alike, we offer licensing options to meet your unique needs and will partner with your stakeholders to provide the best fit. With our variety of packaging selections, we will find the perfect program to meet your specific needs, fit within your budget, and help your students thrive. Access options include individual content areas and multi-subject core bundles. Licenses and subscriptions are generally provided in 12-month increments, with unlimited teacher licenses included at no cost. Note that Edmentum’s standard minimum subscription term is 12 months. To further support our valued customers, we have discounts available based on the volume of student licenses. 
Basic pricing includes the assessment for the selected content area(s), all of which appears on a single platform; unlimited teacher licenses; administrator licenses; administrator and educator dashboards to control diagnostic assessment timing, frequency, and administration (e.g., monitoring testing, resetting a test, alerts); interactive reporting functionality for administrators and educators (including data export, dashboards, aggregate and individual reports, etc.); report access for students, parents, educators, and administrators; unlimited access available 24/7 to the embedded Guided Access and Help Center searchable support and troubleshooting tool as well as all relevant guidance and support materials, manuals, and webinars; award-winning customer support team providing support via phone, email, web, and live office hours; and dedicated support pages and resources for family/caregivers. Available as an add-on student-level license is the learning path solution capable of supporting intervention, individualized learning, and acceleration needs through the creation of personalized instructional pathways and natively integrated with the Diagnostic Assessments, as well as resources for educators to provide targeted intervention support to students through whole group, small group, or one-on-one instruction inclusive of lesson plans, printable activities, and instructional videos.
Training Requirements:
1-4 hours
Qualified Administrators:
No minimum qualifications specified.
Access to Technical Support:
From the moment our partners engage with us, we provide personal support to ensure they have the best experience possible. Our live, U.S.-based customer support team offers superior technical support as well as high-value instructional support to help educators gain the full value of their Edmentum programs. Our customer support team provides full phone and email support for our programs to all users during business hours. Edmentum offers a variety of training and support modalities, including online and offline training, videos, webinars, and documentation for online system users and administrators. We offer an extensive library of online and offline prerecorded and recorded videos and webinars that are accessible 24/7 via our website and YouTube channel. Our series of public webinars are scheduled weekly, and users can register for them anytime. Exact Path has both online documentation within the program and up-to-date downloadable user guides; both of these resources are found within Exact Path’s Help Center on Edmentum’s Support page. Teachers and administrators each have their own user manual with guidance on how to use the system. There is also an embedded on-demand Help Center with searchable help/troubleshooting. Not only is context-specific help available, but there are also page tours that walk users through actions they may want to take. Other time-specific guides direct teachers to important reports and new features.
Assessment Format:
Scoring Time:
  • Scoring is automatic
Scores Generated:
  • Raw score
  • Percentile score
  • IRT-based score
  • Developmental benchmarks
  • Developmental cut points
  • Equated
  • Subscale/subtest scores
Administration Time:
  • 45 minutes per student/group
Scoring Method:
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection
Accommodations:
The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, avoiding words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, calculator use, English learner accommodations, and audio and visual accommodations.

Descriptive Information

Please provide a description of your tool:
Exact Path’s Diagnostic Mathematics Assessment is a computer-adaptive assessment that can be administered up to five times per academic year to screen and identify students in need of intervention. The Diagnostic Assessment efficiently pinpoints where students are ready to start learning and measures their growth between assessments. The assessment uses a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The assessment delivers an overall placement score, plus a score for each domain, all reported in real time. Next, the assessment generates an individualized learning path based on a student’s unique domain levels. Learning paths take students to content appropriate for their instructional level, regardless of grade level. Highlights: 1) Valid and reliable assessment that diagnoses each student’s strengths and needs and pinpoints exactly where the student is ready to start learning. 2) Efficient measurement. Students in kindergarten and grade 1 have median times under 15 minutes. For students in grades 2–12, median times range from 23 to 53 minutes. 3) Real-time reporting of Quantile® measures, national percentile ranks, grade-level proficiency, and student growth upon each successive administration. 4) Universal accessibility tools including read-aloud, text magnification, and highlighting. 5) Easy to schedule and administer. Administrators and teachers can auto-schedule tests for all students, or they can manually adjust the schedule for specific students.
The tool is intended for use with the following grade(s).
not selected Preschool / Pre - kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
selected Sixth grade
selected Seventh grade
selected Eighth grade
selected Ninth grade
selected Tenth grade
selected Eleventh grade
selected Twelfth grade

The tool is intended for use with the following age(s).
not selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
selected 10 years old
selected 11 years old
selected 12 years old
selected 13 years old
selected 14 years old
selected 15 years old
selected 16 years old
selected 17 years old
selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading
Phonological processing:
not selected RAN
not selected Memory
not selected Awareness
not selected Letter sound correspondence
not selected Phonics
not selected Structural analysis

Word ID
not selected Accuracy
not selected Speed

Nonword
not selected Accuracy
not selected Speed

Spelling
not selected Accuracy
not selected Speed

Passage
not selected Accuracy
not selected Speed

Reading comprehension:
not selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
not selected Sentence verification
not selected Other (please describe):


Listening comprehension:
not selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
not selected Sentence verification
not selected Vocabulary
not selected Expressive
not selected Receptive

Mathematics
Global Indicator of Math Competence
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Early Numeracy
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Mathematics Concepts
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Mathematics Computation
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Mathematics Application
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Fractions/Decimals
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Algebra
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

Geometry
selected Accuracy
not selected Speed
selected Multiple Choice
not selected Constructed Response

selected Other (please describe):
Item types: cloze, fill in the blank, drag and drop, and hot spot

Please describe specific domain, skills or subtests:
The Exact Path Mathematics assessment includes the domains of Counting & Cardinality (K); Fractions & Ratios (grades 3–7); Functions (grades 8–HS); Algebra & Expressions; Geometry; Numbers & Operations; and Measurement, Data & Statistics.
BEHAVIOR ONLY: Which category of behaviors does your tool target?


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:
Email Address
info@edmentum.com
Address
5600 West 83rd Street, Suite 300, 8200 Tower, Bloomington, MN 55437
Phone Number
800.447.5286
Website
https://info.edmentum.com/get-a-quote.html
Initial cost for implementing program:
Cost
$7.00
Unit of cost
student
Replacement cost per unit for subsequent use:
Cost
$7.00
Unit of cost
student
Duration of license
year
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
Edmentum’s professional services team, in tandem with our support team, works with our district and school partners through the implementation process and beyond to ensure your program is a success. In addition to the included implementation support, professional learning and engagement experiences are also available. Our detailed Edmentum Professional Services and Consulting Catalog can be found at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Edmentum offers unparalleled flexibility in program cost structure, with per-student and districtwide options. Understanding that no two schools or districts are alike, we offer licensing options to meet your unique needs and will partner with your stakeholders to provide the best fit. With our variety of packaging selections, we will find the perfect program to meet your specific needs, fit within your budget, and help your students thrive. Access options include individual content areas and multi-subject core bundles. Licenses and subscriptions are generally provided in 12-month increments, with unlimited teacher licenses included at no cost. Note that Edmentum’s standard minimum subscription term is 12 months. To further support our valued customers, we have discounts available based on the volume of student licenses. 
Basic pricing includes the assessment for the selected content area(s), all of which appears on a single platform; unlimited teacher licenses; administrator licenses; administrator and educator dashboards to control diagnostic assessment timing, frequency, and administration (e.g., monitoring testing, resetting a test, alerts); interactive reporting functionality for administrators and educators (including data export, dashboards, aggregate and individual reports, etc.); report access for students, parents, educators, and administrators; unlimited access available 24/7 to the embedded Guided Access and Help Center searchable support and troubleshooting tool as well as all relevant guidance and support materials, manuals, and webinars; award-winning customer support team providing support via phone, email, web, and live office hours; and dedicated support pages and resources for family/caregivers. Available as an add-on student-level license is the learning path solution capable of supporting intervention, individualized learning, and acceleration needs through the creation of personalized instructional pathways and natively integrated with the Diagnostic Assessments, as well as resources for educators to provide targeted intervention support to students through whole group, small group, or one-on-one instruction inclusive of lesson plans, printable activities, and instructional videos.
Provide information about special accommodations for students with disabilities.
The Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Implementation of these principles along with the test elements listed below occurs throughout the item and test development process to maximize accessibility and fairness of all assessments for all students. 1) Inclusive of all populations. 2) Precisely defined constructs. 3) Accessible and non-biased. 4) Amenable to accommodation. 5) Simple, clear, and intuitive instructions. 6) Maximum readability and comprehensibility: reducing wordiness, avoiding ambiguity, using reader-friendly construction and vocabulary, avoiding words with double meanings, and consistently applying concept names and geographic conventions. 7) Maximum legibility. The Exact Path Diagnostic Assessment provides universal accessibility tools including read-aloud, text magnification, and highlighting to ensure all students have access to supports as needed. Further modifications for students who may need additional support can include but are not limited to time considerations, display settings, calculator use, English learner accommodations, and audio and visual accommodations.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected General education teacher
not selected Special education teacher
not selected Parent
not selected Child
not selected External observer
not selected Other
If other, please specify:

What is the administration setting?
not selected Direct observation
not selected Rating scale
not selected Checklist
not selected Performance measure
not selected Questionnaire
not selected Direct: Computerized
not selected One-to-one
not selected Other
If other, please specify:

Does the tool require technology?
Yes

If yes, what technology is required to implement your tool? (Select all that apply)
selected Computer or tablet
selected Internet connection
not selected Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?
selected Individual
selected Small group   If small group, n=
selected Large group   If large group, n=
selected Computer-administered
selected Other
If other, please specify:
Administration may take place in person as well as remotely.

What is the administration time?
Time in minutes
45
per (student/group/other unit)
student/group

Additional scoring time:
Time in minutes
0
per (student/group/other unit)
student/group

ACADEMIC ONLY: What are the discontinue rules?
selected No discontinue rules provided
not selected Basals
not selected Ceilings
not selected Other
If other, please specify:


Are norms available?
Yes
Are benchmarks available?
Yes
If yes, how many benchmarks per year?
The Diagnostic Assessment may be administered up to five times per year, though three times per year is recommended.
If yes, for which months are benchmarks available?
The Diagnostic Assessment may be administered any time of year. However, administration is recommended in each of the following windows: August 15 – October 14, December 1 – January 31, and April 1 – May 31.
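The recommended windows above can be encoded as a small lookup, for example to label an administration date in local reporting. The sketch below is illustrative only: the labels "fall", "winter", and "spring" and the function name are assumptions, and because testing is allowed at any time of year, dates outside a window simply return None.

```python
from datetime import date

# Recommended windows as ((start_month, start_day), (end_month, end_day)).
# The season labels are assumed names for this example.
RECOMMENDED_WINDOWS = {
    "fall": ((8, 15), (10, 14)),
    "winter": ((12, 1), (1, 31)),
    "spring": ((4, 1), (5, 31)),
}


def recommended_window(day: date):
    """Return the label of the recommended window containing `day`, or None."""
    month_day = (day.month, day.day)
    for label, (start, end) in RECOMMENDED_WINDOWS.items():
        if start <= end:
            inside = start <= month_day <= end
        else:
            # The winter window wraps around the end of the calendar year.
            inside = month_day >= start or month_day <= end
        if inside:
            return label
    return None
```

A date such as November 1 falls between windows; the assessment may still be given then, it just is not inside a recommended window.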
BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
1-4 hours
Please describe the minimum qualifications an administrator must possess.
selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
Yes
If No, please describe training costs:
In addition to the on-demand resources available 24/7 and embedded in Exact Path for educator support, training, and reference, Edmentum offers a variety of professional learning and engagement experiences detailed in our Edmentum Professional Services and Consulting Catalog at www.edmentum.com/resources/brochures/professional-services-and-consulting-catalog. Our dedicated team is ready to design and deliver services tailored to each partner’s unique needs and goals.
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
From the moment our partners engage with us, we provide personal support to ensure they have the best experience possible. Our live, U.S.-based customer support team offers superior technical support as well as high-value instructional support to help educators gain the full value of their Edmentum programs. Our customer support team provides full phone and email support for our programs to all users during business hours. Edmentum offers a variety of training and support modalities, including online and offline training, videos, webinars, and documentation for online system users and administrators. We offer an extensive library of online and offline prerecorded and recorded videos and webinars that are accessible 24/7 via our website and YouTube channel. Our series of public webinars are scheduled weekly, and users can register for them anytime. Exact Path has both online documentation within the program and up-to-date downloadable user guides; both of these resources are found within Exact Path’s Help Center on Edmentum’s Support page. Teachers and administrators each have their own user manual with guidance on how to use the system. There is also an embedded on-demand Help Center with searchable help/troubleshooting. Not only is context-specific help available, but there are also page tours that walk users through actions they may want to take. Other time-specific guides direct teachers to important reports and new features.

Scoring

How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
Yes
What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
selected Percentile score
not selected Grade equivalents
selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
selected Developmental benchmarks
selected Developmental cut points
selected Equated
not selected Probability
not selected Lexile score
not selected Error analysis
not selected Composite scores
selected Subscale/subtest scores
not selected Other
If other, please specify:
Quantile score; Scale score; LPEG (Learning Path Entry Grade) score, which identifies the grade level proficiency and entry point of a student in a particular skill for placement on an individualized learning path.

Does your tool include decision rules?
Yes
If yes, please describe.
Risk Benchmarks: Districts and schools can use the National Percentile Ranks (NPR) reporting provided through the Exact Path Diagnostic to identify students “at risk.” Consistent with NCII, we recommend that schools use the NPR of 20 to identify students who need intensive intervention. Depending on the needs of their students, districts and schools are also able to use other NPR thresholds for purposes such as identification of moderate risk and gifted and talented.
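As a minimal sketch of this decision rule, the function below flags students by national percentile rank. It assumes that "at risk" means an NPR at or below the threshold (the text names the NPR of 20 but does not say whether the boundary is inclusive), and the student identifiers and function name are hypothetical.

```python
def flag_at_risk(npr_by_student, threshold=20):
    """Return the IDs of students whose NPR is at or below `threshold`.

    Treating the boundary as inclusive is an assumption of this sketch.
    """
    return sorted(
        student_id
        for student_id, npr in npr_by_student.items()
        if npr <= threshold
    )


# Hypothetical NPRs for three students.
flagged = flag_at_risk({"s1": 12, "s2": 20, "s3": 55})
```

Passing a different `threshold` covers the other uses mentioned, such as a higher cut for moderate risk.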
Can you provide evidence in support of multiple decision rules?
Yes
If yes, please describe.
Risk Benchmarks: NPRs are provided for all students in grades K–8 for math. NPRs were derived using national samples and a weighting methodology that adjusted the sample to be representative of the national student population. The norm study was conducted using data from the 2018–19 academic year. The NPR of 20 is recommended by RTI experts including NCII as an appropriate threshold for establishing at-risk classifications.
Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
The Exact Path Diagnostic Assessments include dichotomously scored multiple-choice and technology-enhanced item types. All the items on the Diagnostic Assessments are machine scored in real time. The Diagnostic Assessments provide multiple scores to describe student learning levels and student progress throughout the year. The scale scores were developed with the 1-parameter Rasch item response theory model and are placed on a vertical scale from 500 to 1500 spanning all grades K–12 for each subject. Students also receive a raw score for each domain (i.e., number of items answered correctly out of number of items delivered). In addition, student reports contain growth scores (between administrations), Lexile®/Quantile® measures, and national percentile ranks. Performance levels, called Grade Level Proficiency classifications, categorize students into four performance levels, with the top two levels indicating on-grade level achievement in mathematics, language arts, and reading. Upon completion of the Diagnostic Assessments, Exact Path generates an individualized learning path based on students’ unique proficiency by domain, providing students with access to content appropriate for their instructional level, regardless of their grade level.
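To make two of the quantities above concrete, the sketch below computes the Rasch (1-parameter) probability of a correct response and a per-domain raw score (items correct out of items delivered). The function names, the (domain, is_correct) data layout, and the sample responses are hypothetical. The transformation from Rasch logits to the 500 to 1500 reporting scale is not described here and is deliberately left out.

```python
import math
from collections import defaultdict


def rasch_probability(theta: float, difficulty: float) -> float:
    """Rasch (1PL) probability that a student with ability `theta`
    answers an item of the given difficulty correctly (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


def domain_raw_scores(responses):
    """Count items correct and items delivered per domain.

    `responses` is a list of (domain, is_correct) pairs; this layout is an
    assumption made for the example, not the product's data format.
    """
    tallies = defaultdict(lambda: [0, 0])  # domain -> [correct, delivered]
    for domain, is_correct in responses:
        tallies[domain][1] += 1
        if is_correct:
            tallies[domain][0] += 1
    return {domain: tuple(counts) for domain, counts in tallies.items()}


# Hypothetical responses from one administration.
sample = [
    ("Geometry", True),
    ("Geometry", False),
    ("Algebra & Expressions", True),
]
scores = domain_raw_scores(sample)
```

When ability equals item difficulty the Rasch probability is exactly 0.5, which is why adaptive tests tend to deliver items near the current ability estimate.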
Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Exact Path provides computer-adaptive assessments in math, reading, and language arts that can be administered up to five times per academic year (though three times is most common) to efficiently pinpoint where students are ready to start learning and to measure their growth between assessments. The assessments use a robust item pool to offer each student a unique testing experience that adjusts in real time based on student responses. The algorithm selects the first question based on either a student’s enrolled grade level (if testing for the first time) or the student’s previous diagnostic score. As the student progresses, each item presented depends on whether they answer the previous item correctly (the next item is more difficult) or incorrectly (the next item is easier). In this way, students receive assessments tailored to their skill levels, resulting in delivery of precise, accurate results for each content area that can be used to inform instruction and interventions. The adaptive algorithm uses consistent stopping rules for all learners that are based on the precision of the student score, so that scores are highly reliable for low-performing students, average-performing students, and high-performing or gifted students. The Exact Path Diagnostic Assessment is designed to support the principles of Universal Design: to be fair, accessible, and appropriate for all students, including students with different abilities, disabilities, and backgrounds including race, ethnicity, gender, culture, language, age, and socioeconomic status. Item writers are trained according to Edmentum’s internal Assessment Item Writing Guide and Item Specifications, which include Fairness, Bias, and Sensitivity guidance. Each item begins with a task model containing all parameters for that item, from standards, depth of knowledge (DOK), and readability to considerations for bias and sensitivity.
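The adaptive behavior described above (a starting point from grade level or a prior score, harder items after correct answers, easier items after incorrect ones, and a precision-based stopping rule) can be sketched as follows. This is not Edmentum's algorithm: the nearest-difficulty item selection, the grid-based EAP ability estimate, the standard-error target of 0.4 logits, the 40-item cap, and all names are assumptions chosen to keep the example short.

```python
import math


def p_correct(theta, b):
    """Rasch probability of a correct response (ability and difficulty in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap_estimate(responses, prior_mean=0.0, prior_sd=1.5):
    """Posterior mean and SD of ability on a fixed grid (EAP estimate).

    `responses` is a list of (item_difficulty, is_correct) pairs.
    """
    grid = [prior_mean - 6 + 0.05 * i for i in range(241)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        for b, correct in responses:
            p = p_correct(theta, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, math.sqrt(var)


def run_adaptive_test(pool, answer_fn, start_theta, se_target=0.4, max_items=40):
    """Deliver items until the ability estimate is precise enough.

    `pool` holds item difficulties; `answer_fn(b)` returns True for a correct
    response. `start_theta` stands in for the grade-level or prior-score start.
    """
    remaining = list(pool)
    responses = []
    theta, se = start_theta, float("inf")
    while remaining and len(responses) < max_items and se > se_target:
        # Pick the unused item closest to the current estimate, so a correct
        # answer (which raises the estimate) leads to a harder next item.
        item_b = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item_b)
        responses.append((item_b, answer_fn(item_b)))
        theta, se = eap_estimate(responses, prior_mean=start_theta)
    return theta, se, responses


# Hypothetical pool and a simulated student who answers correctly
# whenever the item difficulty is below 1.0 logits.
pool = [round(-3 + 0.1 * i, 1) for i in range(61)]
theta, se, responses = run_adaptive_test(pool, lambda b: b < 1.0, start_theta=0.0)
```

Stopping on the standard error rather than on a fixed item count is what lets test length vary by student while holding score precision roughly constant across ability levels.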
Once written, each item undergoes two rounds of review and revision, including bias and sensitivity reviews. Furthermore, extensive accommodations are available for use both within and outside of the Exact Path platform to support the diverse needs of students, including students from linguistically and culturally diverse backgrounds as well as students with disabilities. More information about accommodations for students with disabilities is provided at the end of the Descriptive Information section. Teachers can make appropriate accommodations for students who are English language learners, such as providing a dictionary, helping to pronounce words, and offering any other accommodation students receive instructionally. However, teachers should not give substantive help interpreting text. Exact Path has been awarded WIDA Prime V2 Correlation, indicating our ability to address English language learners’ listening, speaking, reading, and writing needs. Exact Path includes built-in text-to-speech functionality, closed captions for videos, and highlighted vocabulary words with built-in tools for translation, definition, and audio support. EdMetric (2022) conducted an independent study of differential item functioning (DIF) in Edmentum’s item bank to examine the impact of four grouping variables (gender, race, socioeconomic status, and pandemic effect) on items. The investigation used the Mantel-Haenszel (MH) procedure, which enables the use of the classification system by Educational Testing Service to separate items into differing levels of DIF including negligible DIF (A-level), moderate DIF (B-level), and large DIF (C-level). Items flagged with B- and C-level DIF would indicate students in the groups of interest perform differently on the item.
No items in Edmentum’s item bank were flagged for B- or C-level DIF, indicating that items in Edmentum’s item bank measure student achievement from different groups in a similar manner and providing some evidence that the items are fair for different groups. This evidence reflects the attention to fairness and the measures taken to avoid bias and sensitivity throughout the item development process. The study is accessible online at https://www.edmentum.com/resources/efficacy/exact-path-independent-study-differential-item-analysis-edmentums-item-bank.
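For readers unfamiliar with the Mantel-Haenszel procedure and the A/B/C levels mentioned above, the sketch below computes the MH common odds ratio across ability strata, converts it to the ETS delta metric (MH D-DIF = -2.35 ln α), and assigns a level from the magnitude alone. It is a simplified illustration, not the code used in the EdMetric study: the operational ETS rules also apply statistical significance tests, which are omitted here, and the counts and function names are hypothetical.

```python
import math


def mh_d_dif(strata):
    """Mantel-Haenszel D-DIF for one item.

    `strata` is a list of 2x2 tables, one per matched ability level, given as
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect).
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty stratum carries no information
        numerator += a * d / n
        denominator += b * c / n
    if numerator == 0 or denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    alpha = numerator / denominator  # MH common odds ratio
    return -2.35 * math.log(alpha)


def ets_level_by_magnitude(d_dif):
    """Magnitude-only version of the ETS levels (significance tests omitted)."""
    size = abs(d_dif)
    if size < 1.0:
        return "A"  # negligible DIF
    if size < 1.5:
        return "B"  # moderate DIF
    return "C"      # large DIF


# Hypothetical counts: equal odds in both groups, then a large group difference.
balanced = mh_d_dif([(40, 10, 40, 10), (30, 20, 30, 20)])
skewed = mh_d_dif([(40, 10, 20, 30)])
```

In the balanced case the odds ratio is 1 and D-DIF is 0, which falls in level A; the skewed case has an odds ratio of 6 and lands in level C under the magnitude rule.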

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grade Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Classification Accuracy Fall Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Partially convincing evidence
Classification Accuracy Winter Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Partially convincing evidence
Classification Accuracy Spring Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available

Wisconsin Forward Exam

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Wisconsin’s Statewide Achievement Assessment, called the Forward Exam, is administered in grades 3–8 in English Language Arts and Mathematics. The Forward Exam criterion measure is completely independent from the screening measure. The Forward Exam is developed by DRC in collaboration with the state of Wisconsin. However, the Forward Exam is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics Forward Exam and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts Forward Exam and the Exact Path Diagnostic Reading Assessment.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Consistent with NCII’s guidance, the 20th percentile on the Forward Exam was used as the criterion cut-point. The scale scores on the Forward Exam that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the Forward Exam classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
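The cut-point search described above can be sketched as follows. This is a minimal illustration, not Edmentum's actual code: it assumes that "maximized classification accuracy" means the overall proportion of true positives plus true negatives, that only observed screener scores are tried as candidate cuts, and that ties go to the lowest cut. The function name and the sample data are hypothetical.

```python
def best_screener_cut(screener_scores, criterion_at_risk):
    """Return (cut, accuracy) where students scoring below `cut` on the
    screener are flagged as at risk and accuracy is the share of students
    whose screener flag agrees with the criterion classification."""
    n = len(screener_scores)
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(screener_scores)):
        agree = sum(
            (score < cut) == at_risk
            for score, at_risk in zip(screener_scores, criterion_at_risk)
        )
        acc = agree / n
        if acc > best_acc:  # strict comparison keeps the lowest tied cut
            best_cut, best_acc = cut, acc
    return best_cut, best_acc

# Hypothetical grade-level data: screener scale scores and whether each
# student fell below the 20th percentile on the criterion measure.
scores = [850, 870, 900, 920, 950, 980]
at_risk = [True, True, False, True, False, False]
cut, acc = best_screener_cut(scores, at_risk)
```

In this toy data set a cut of 900 flags the two lowest-scoring students and agrees with the criterion classification for five of the six students. In the study, the search would be run separately for each grade.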
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
The Wisconsin districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the 2020–21 school year.

Cross-Validation

Has a cross-validation study been conducted?
No
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Arizona Statewide Achievement Assessment - AzMerit 2 (AzM2)

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Arizona’s Statewide Achievement Assessment, the AzMerit 2 (AzM2), was administered in grades 3–8 in English Language Arts and Mathematics during the 2020–21 academic year. AzM2 is completely independent from the screening measure and was developed by Cambium Assessment in collaboration with the state of Arizona. However, the AzM2 is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics AzM2 and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts AzM2 Exam and the Exact Path Diagnostic Reading Assessment.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
AzM2 is administered to students in Arizona once per year in the spring. Exact Path Diagnostic Assessments are administered in the fall, winter, and spring. When evaluating the classification accuracy of Exact Path Diagnostic Assessments, the spring administration often takes place quite close to the AzM2 spring administration. Thus, the classification results can be considered concurrent. However, the classification results for the fall and winter Exact Path scores are from administrations timed several months before the AzM2 administration. Thus, these classification accuracy results can be considered predictive.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Consistent with NCII’s guidance, the 20th percentile on the AzM2 was used as the criterion cut-point. The scale scores on the AzM2 that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the AzM2 classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
The Arizona districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the 2020–21 school year.

Cross-Validation

Has a cross-validation study been conducted?
No
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Ohio State Test (OST)

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Ohio’s Statewide Achievement Assessment, the Ohio State Test (OST), was administered in grades 3–8 in English Language Arts and Mathematics during the 2020–21 and 2021–22 academic years. OST is completely independent from the screening measure and was developed by Cambium Assessment in collaboration with the state of Ohio. However, the OST is an appropriate criterion measure because there is a substantial overlap in the content assessed on the Mathematics OST and the Exact Path Diagnostic Mathematics Assessment, and likewise a substantial overlap in the content assessed on the English Language Arts OST Exam and the Exact Path Diagnostic Reading Assessment.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
OST is administered to students in Ohio once per year in the spring. Exact Path Diagnostic Assessments are administered in the fall, winter, and spring. When evaluating the classification accuracy of Exact Path Diagnostic Assessments, the spring administration often takes place quite close to the OST spring administration. Thus, the classification results can be considered concurrent. However, the classification results for the fall and winter Exact Path scores are from administrations timed several months before the OST administration. Thus, these classification accuracy results can be considered predictive.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Consistent with NCII’s guidance, the 20th percentile on the OST was used as the criterion cut-point. The scale scores on the OST that corresponded to the 20th percentile were identified. Students with scale scores less than the cut-point associated with the 20th percentile were classified as “at risk” (true positive). Students at or above the 20th percentile were classified “not at risk” (true negative). The screener cut scores on the Exact Path Diagnostic vertical scale were determined by identifying the scale score that maximized classification accuracy with the OST classifications (that is, maximized the percentage of true positives and true negatives). Once screener cut points were identified, students with scale scores below the screener cut point were classified as “at risk” and students with scale scores above the screener cut point were classified “not at risk.” This process was applied separately by grade (3–8). Once student classifications on the criterion and screener measures were determined, classification indices were calculated using NCII’s classification worksheet.
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
The Ohio districts that supplied data for this study had access to the Exact Path Learning Path that provides supplemental personalized instruction. We do not know what other interventions the districts may have been using during the school year.

Cross-Validation

Has a cross-validation study been conducted?
No
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Classification Accuracy - Fall

Evidence Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Ohio State Test (OST) Ohio State Test (OST) Arizona Statewide Achievement Assessment - AzMerit 2 (AzM2) Wisconsin Forward Exam Wisconsin Forward Exam Ohio State Test (OST)
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 665 676 3512 560 579 677
Cut Points - Corresponding performance score (numeric) on screener measure 856 908 899 1001 1019 1064
Classification Data - True Positive (a) 119 150 21 42 28 87
Classification Data - False Positive (b) 107 134 15 30 40 101
Classification Data - False Negative (c) 27 37 3 3 4 21
Classification Data - True Negative (d) 439 537 107 221 214 295
Area Under the Curve (AUC) 0.88 0.87 0.93 0.92 0.91 0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.85 0.84 0.88 0.87 0.87 0.80
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.91 0.89 0.98 0.98 0.95 0.88
Statistics Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.21 0.22 0.16 0.15 0.11 0.21
Overall Classification Rate 0.81 0.80 0.88 0.89 0.85 0.76
Sensitivity 0.82 0.80 0.88 0.93 0.88 0.81
Specificity 0.80 0.80 0.88 0.88 0.84 0.74
False Positive Rate 0.20 0.20 0.12 0.12 0.16 0.26
False Negative Rate 0.18 0.20 0.13 0.07 0.13 0.19
Positive Predictive Power 0.53 0.53 0.58 0.58 0.41 0.46
Negative Predictive Power 0.94 0.94 0.97 0.99 0.98 0.93
Sample Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date 2020-21 & 2021-22 2020-21 & 2021-22 2020-21 2020-21 2020-21 2020-21 & 2021-22
Sample Size 692 858 146 296 286 504
Geographic Representation East North Central (OH) East North Central (OH) Mountain (AZ) East North Central (WI) East North Central (WI) East North Central (OH)
Male 50.7% 54.0% 52.1% 47.0% 43.0% 50.0%
Female 48.8% 46.0% 47.9% 43.2% 49.0% 49.0%
Other           0.2%
Gender Unknown           0.2%
White, Non-Hispanic 68.6% 70.0% 4.1% 72.6% 77.6% 74.0%
Black, Non-Hispanic 14.9% 15.0% 5.5% 0.7% 1.7% 15.1%
Hispanic 2.0% 3.0% 86.3% 13.5% 9.8% 2.0%
Asian/Pacific Islander 2.0% 2.0% 0.7% 1.0% 1.4% 1.0%
American Indian/Alaska Native     2.1% 0.7% 0.7%  
Other 13.0% 10.0% 1.4% 1.4% 1.0% 6.9%
Race / Ethnicity Unknown            
Low SES 19.9% 22.0%   27.0% 23.4% 19.0%
IEP or diagnosed disability 18.9% 14.0% 10.3% 13.2% 9.1% 18.1%
English Language Learner 2.0% 2.0% 20.5% 3.4% 2.8% 0.4%
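
The Statistics rows in the table above follow from the four classification counts (a, b, c, d) using standard 2x2 definitions. The sketch below is an illustration in Python rather than NCII's classification worksheet itself; the function and key names are hypothetical. It is applied to the Grade 3 fall counts reported above.

```python
def classification_indices(tp, fp, fn, tn):
    """Standard indices from a 2x2 screener-by-criterion table.

    tp = a (true positive), fp = b (false positive),
    fn = c (false negative), tn = d (true negative).
    """
    n = tp + fp + fn + tn
    return {
        "base_rate": (tp + fn) / n,
        "overall_classification_rate": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (tp + fn),
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (fn + tn),
    }

# Grade 3 fall counts from the table above (sample size 692).
grade3_fall = classification_indices(tp=119, fp=107, fn=27, tn=439)
for name, value in grade3_fall.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, these values agree with the Grade 3 column of the fall table: for example, sensitivity is 119/146 and specificity is 439/546.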

Classification Accuracy - Winter

Evidence Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Ohio State Test (OST) Ohio State Test (OST) Wisconsin Forward Exam Wisconsin Forward Exam Arizona Statewide Achievement Assessment - AzMerit 2 (AzM2) Ohio State Test (OST)
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 665 676 557 560 3574 677
Cut Points - Corresponding performance score (numeric) on screener measure 887 929 986 1034 1001 1068
Classification Data - True Positive (a) 141 149 30 42 25 106
Classification Data - False Positive (b) 129 115 32 45 24 114
Classification Data - False Negative (c) 27 35 3 2 2 32
Classification Data - True Negative (d) 514 591 210 206 99 394
Area Under the Curve (AUC) 0.89 0.88 0.94 0.94 0.89 0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.86 0.85 0.91 0.91 0.83 0.80
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.92 0.91 0.98 0.97 0.95 0.87
Statistics Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.21 0.21 0.12 0.15 0.18 0.21
Overall Classification Rate 0.81 0.83 0.87 0.84 0.83 0.77
Sensitivity 0.84 0.81 0.91 0.95 0.93 0.77
Specificity 0.80 0.84 0.87 0.82 0.80 0.78
False Positive Rate 0.20 0.16 0.13 0.18 0.20 0.22
False Negative Rate 0.16 0.19 0.09 0.05 0.07 0.23
Positive Predictive Power 0.52 0.56 0.48 0.48 0.51 0.48
Negative Predictive Power 0.95 0.94 0.99 0.99 0.98 0.92
Sample Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date 2020-21 & 2021-22 2020-21 & 2021-22 2020-21 2020-21 2020-21 2020-21 & 2021-22
Sample Size 811 890 275 295 150 646
Geographic Representation East North Central (OH) East North Central (OH) East North Central (WI) East North Central (WI) Mountain (AZ) East North Central (OH)
Male 51.0% 54.0% 40.7% 47.1% 49.3% 49.1%
Female 49.0% 46.0% 45.1% 43.4% 50.7% 50.9%
Other            
Gender Unknown           0.2%
White, Non-Hispanic 70.0% 70.0% 69.5% 72.9% 6.0% 74.0%
Black, Non-Hispanic 14.1% 15.1% 2.5% 0.7% 6.7% 15.0%
Hispanic 3.0% 2.0% 11.3% 13.6% 83.3% 2.0%
Asian/Pacific Islander 2.0% 3.0% 0.4% 1.0% 0.7% 0.9%
American Indian/Alaska Native     0.7% 0.7% 2.0%  
Other 12.0% 10.0% 1.5% 1.4% 2.7% 8.0%
Race / Ethnicity Unknown            
Low SES 27.0% 23.0% 26.5% 27.1%   22.0%
IEP or diagnosed disability 18.0% 15.1% 10.2% 13.2% 10.7% 15.9%
English Language Learner 2.0% 2.0% 4.7% 3.4% 20.0% 0.9%

Classification Accuracy - Spring

Evidence Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Ohio State Test (OST) Ohio State Test (OST) Wisconsin Forward Exam Wisconsin Forward Exam Wisconsin Forward Exam Ohio State Test (OST)
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 665 676 557 560 579 678
Cut Points - Corresponding performance score (numeric) on screener measure 918 977 1031 1042 1079 1084
Classification Data - True Positive (a) 170 185 34 43 33 61
Classification Data - False Positive (b) 117 146 38 29 43 51
Classification Data - False Negative (c) 22 18 0 1 3 14
Classification Data - True Negative (d) 600 619 164 195 184 224
Area Under the Curve (AUC) 0.92 0.92 0.96 0.96 0.93 0.86
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.91 0.90 0.93 0.94 0.89 0.82
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.94 0.94 0.98 0.98 0.96 0.90
Statistics Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.21 0.21 0.14 0.16 0.14 0.21
Overall Classification Rate 0.85 0.83 0.84 0.89 0.83 0.81
Sensitivity 0.89 0.91 1.00 0.98 0.92 0.81
Specificity 0.84 0.81 0.81 0.87 0.81 0.81
False Positive Rate 0.16 0.19 0.19 0.13 0.19 0.19
False Negative Rate 0.11 0.09 0.00 0.02 0.08 0.19
Positive Predictive Power 0.59 0.56 0.47 0.60 0.43 0.54
Negative Predictive Power 0.96 0.97 1.00 0.99 0.98 0.94
Sample Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date 2020-21 & 2021-22 2020-21 & 2021-22 2020-21 2020-21 2020-21 2020-21 & 2021-22
Sample Size 909 968 236 268 263 350
Geographic Representation East North Central (OH) East North Central (OH) East North Central (WI) East North Central (WI) East North Central (WI) East North Central (OH)
Male 51.0% 54.0% 47.5% 51.9% 46.8% 51.1%
Female 49.0% 45.0% 52.5% 47.8% 53.2% 49.1%
Other            
Gender Unknown            
White, Non-Hispanic 68.0% 70.0% 80.9% 80.2% 84.4% 77.1%
Black, Non-Hispanic 16.0% 15.0% 3.0% 0.7% 1.9% 13.1%
Hispanic 4.0% 3.0% 13.1% 14.9% 10.6%  
Asian/Pacific Islander 2.0% 3.0% 0.4% 1.1% 1.5% 0.9%
American Indian/Alaska Native   0.2% 0.8% 0.7% 0.8%  
Other 12.0% 10.0% 1.7% 1.5% 1.1% 9.1%
Race / Ethnicity Unknown            
Low SES 27.0% 22.0% 30.9% 29.9% 25.5%  
IEP or diagnosed disability 17.1% 15.0% 11.9% 14.6% 9.9% 17.1%
English Language Learner 2.0% 2.0% 5.5% 3.7% 3.0% 1.1%

Reliability

Grade Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Rating Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
The analyses considered in this study included split-half and marginal reliability. These measures both assess the internal consistency of the tests under consideration, split-half from the context of classical test theory and marginal from the context of item response theory. Marginal reliability utilizes the item response theory (IRT) ability estimates and standard errors of the ability estimates to create a weighted average index of reliability akin to a test-retest correlation under classical test theory. The marginal reliability is a ratio of the variance of the estimated latent abilities relative to the sum of the variance of the latent ability and the expected error variance. Split-half reliability provides an estimate of alternate form reliability by dividing the test into equal halves, correlating the scores from the shortened forms, and using the Spearman-Brown formula to estimate the alternative form reliability for full-length test forms. Split-half is a more appropriate type of internal consistency reliability metric than coefficient alpha because the Exact Path Diagnostic Assessment is a computer adaptive assessment rather than a fixed-form assessment. The Exact Path Diagnostic Assessment is a variable-length adaptive assessment where the test terminates once the standard error of measurement is less than or equal to 40 scale score points for mathematics. The stopping rule ensures that the standard error of measurement is consistent across the scale: the scores of low-, average-, and high-achieving students all have the same measurement precision.
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Students who took the Exact Path Diagnostic Mathematics Assessment during the 2020–21 school year are included in the analysis. The sample is not strictly nationally representative, but students from nearly all 50 states are included in the dataset. As shown in the reliability table, sample sizes by grade and subject ranged from approximately 75,000 to over 90,000.
*Describe the analysis procedures for each reported type of reliability.
Split-half reliability coefficients were estimated for each subject and grade combination. Split-half reliability resembles a test-retest condition when a single test has been administered. For a computer-adaptive test (CAT), the odd and even items are used to create the two half-length forms. The correlation (r) between scores on the two forms represents the consistency of the measure. Split-half reliability is then determined for the whole test by using the Spearman-Brown formula (ρ = 2r/(1 + r)) to adjust the correlation to account for the full length of the test. Marginal reliability coefficients were also computed for each subject and grade combination. Traditional reliability estimators are based on the classical test theory (CTT) definition of reliability as the ratio of true score variance to observed score variance, which is operationalized as the ratio of the true score variance to the sum of the true score and error variances. Under CTT, error variance is set to be constant across all true scores, while in item response theory (IRT) error varies as a function of the latent ability. Because of this difference, a single overall reliability in the context of IRT is an oversimplification of the reliability of the scores produced by the test. However, methods have been developed to approximate the traditional reliability in the IRT context. To account for the varying error across the latent ability distribution, the error variance can be integrated (Green, Bock, Humphreys, Linn, & Reckase, 1984). This can be further simplified by taking the mean of the squared standard error of measurement (SEM; Sireci, Thissen, & Wainer, 1991). Thus, the marginal reliability for IRT scores is the ratio of the variance of the estimated latent abilities relative to the sum of the variance of the latent ability and the expected error variance. To compute confidence intervals around split-half and marginal reliability, a bootstrapping approach was used.
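The procedures above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Edmentum's code: the function names are hypothetical, the marginal reliability follows the ratio as worded above (variance of the ability estimates over that variance plus the mean squared SEM), and the confidence interval uses a simple percentile bootstrap because the source does not specify which bootstrap variant was used. Ability estimates and SEMs must be on the same scale.

```python
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half_reliability(odd_scores, even_scores):
    """Correlate half-test scores, then apply Spearman-Brown: 2r / (1 + r)."""
    r = pearson_r(odd_scores, even_scores)
    return 2 * r / (1 + r)

def marginal_reliability(theta_hat, sem):
    """Variance of ability estimates over that variance plus mean squared SEM."""
    var_theta = statistics.pvariance(theta_hat)
    mean_error_var = statistics.fmean(s ** 2 for s in sem)
    return var_theta / (var_theta + mean_error_var)

def bootstrap_ci(stat, columns, n_boot=1000, seed=0):
    """95% percentile bootstrap CI; resamples students with replacement."""
    rng = random.Random(seed)
    n = len(columns[0])
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        estimates.append(stat(*[[col[i] for i in idx] for col in columns]))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]
```

For example, `bootstrap_ci(split_half_reliability, [odd_scores, even_scores])` would return lower and upper bounds for the split-half coefficient, where `odd_scores` and `even_scores` are per-student scores on the two half-length forms.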

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Validity

Grade Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Rating Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
Extensive research supports the validity of the Exact Path Diagnostic Assessment. Validity evidence is collected and evaluated according to the recommendations in the Standards for Educational and Psychological Testing (https://www.testingstandards.net/). The Exact Path technical report describes evidence based on test content, response processes, internal structure, relations to other variables, and consequences. In this section, we provide validity evidence in terms of relations to other variables, i.e., criterion validity. Four criterion measures are included in the correlations provided: Arizona’s end-of-year summative assessments from 2020–21 (AzM2) and 2021–22 (AASA), Indiana’s summative assessment from 2018–19 through 2020–21 (ILEARN), and Wisconsin’s summative assessment from 2020–21 (Forward Exam). These four criterion measures are completely external to Edmentum’s Exact Path Diagnostic Assessment screening system. However, all four external criterion measures and Exact Path are measures of mathematics proficiency. Thus, while the measures are aligned to different blueprints and different sets of standards, correlations are expected to be moderate to high. By providing criterion measures across a sample of states, we demonstrate the generalizability of the Exact Path Diagnostic Assessment as a valid screener of mathematics proficiency. Similar validity coefficients have been observed across other states (see https://www.edmentum.com/resources/research for more information).
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
The Arizona AzM2 and Wisconsin Forward Exam samples are from the 2020–21 academic year. The Indiana sample includes students from the 2018–19 and 2020–21 academic years. The Arizona AASA sample is from the 2021–22 academic year. The number of students per grade and criterion measure ranges from 216 to over 2,800.
*Describe the analysis procedures for each reported type of validity.
Students’ scale scores from state summative assessments are merged with scale scores from the Exact Path Diagnostic Assessment. For concurrent validity correlation coefficients, Exact Path and state scale scores are both from the spring testing window. For predictive validity correlation coefficients, the scale scores are from the same academic year, but Exact Path scale scores are from the fall testing window, while the criterion state scale scores are from the spring testing window. Validity coefficients are Pearson correlations, and the Fisher z-transformation was used to determine the 95 percent confidence interval.
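The Fisher z-transformation interval mentioned above can be sketched as follows. The function name is hypothetical, and the correlation of 0.80 in the usage line is an illustrative value rather than a reported result; 216 is the smallest per-grade sample size mentioned in this section.

```python
import math

def fisher_ci(r, n, z_crit=1.959964):
    """95% confidence interval for a Pearson correlation via Fisher's z."""
    z = math.atanh(r)            # transform r to the z scale
    se = 1 / math.sqrt(n - 3)    # standard error on the z scale
    lower = math.tanh(z - z_crit * se)  # back-transform the bounds to r
    upper = math.tanh(z + z_crit * se)
    return lower, upper

# Illustrative only: a hypothetical correlation of 0.80 with n = 216.
lower, upper = fisher_ci(0.80, 216)
print(f"{lower:.2f} to {upper:.2f}")
```

Because the interval is built on the z scale and then back-transformed, it is asymmetric around r and always stays within -1 and 1.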

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published reliability studies:
Yes
Provide citations for additional published studies.
Edmentum Exact Path and the Quantile Framework Linking Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/Exact%20Path%20and%20Quantile%20Linking%20Study%20Abstract%201.9.20.pdf
Exact Path Diagnostic and the State of Texas Assessment of Academic Readiness (STAAR) Correlational Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/TX%20correlation%20report%20Exact%20Path%20and%20STAAR.pdf
Exact Path Diagnostic and Pennsylvania System of School Assessment (PSSA) Correlational Study. Edmentum. https://www.edmentum.com/sites/edmentum.com/files/resource/media/PA-correlational-study-XP-and-PSSA.pdf
Describe the degree to which the provided data support the validity of the tool.
The Exact Path Diagnostic Mathematics Assessment can be used to screen students who are at risk for poor mathematics outcomes. The state summative assessment scores are typically the achievement outcomes of most importance to each state. Thus, having a screener that correlates well with these end-of-year tests is very important. The lower bound of the 95 percent confidence interval is well above 0.6 for both concurrent and predictive validity coefficients between the Exact Path Diagnostic Assessment and all of the criterion measures provided. In fact, some of the coefficients are above 0.8. These are very strong correlations despite differences in blueprint, test design, administration conditions, and test purposes. These data support the validity of Exact Path Diagnostic Assessment as a screener tool, and the generalizability of validity across various states.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Bias Analysis

Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Rating | Yes | Yes | Yes | Yes | Yes | Yes
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
Yes
If yes,
a. Describe the method used to determine the presence or absence of bias:
A third-party differential item functioning (DIF) study was conducted by EdMetric to evaluate the Exact Path Diagnostic Assessment item pool. The study consisted of a series of DIF analyses of student responses to items drawn from Edmentum’s item bank and delivered through Edmentum’s Exact Path Diagnostic Computerized Adaptive Test. All analyses were performed in R using the difR (Magis, Beland, Tuerlinckx, & De Boeck, 2010) and tidyverse (v1.3.0; Wickham et al., 2019) packages. The presence of DIF was investigated using the Mantel-Haenszel (MH) procedure (Clauser & Mazor, 1998), which detects uniform DIF without requiring an item response theory model. The MH procedure is straightforward to implement and supports the classification system established by Educational Testing Service (Zwick & Ercikan, 1989).
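As an illustration of the Mantel-Haenszel statistic named above, the following Python sketch computes the common odds ratio and the ETS delta-scale value (MH D-DIF) for a single item. The study itself used the difR package in R; the function name `mh_d_dif`, the tuple layout, and the group labels "ref" and "focal" are assumptions made for this example and are not taken from the study.

```python
import math
from collections import defaultdict


def mh_d_dif(responses):
    """Mantel-Haenszel common odds ratio and MH D-DIF for one item.

    responses: iterable of (matching_score, group, correct), where
    group is "ref" or "focal" and correct is 1 or 0. Examinees are
    stratified by matching_score (for example, a total or ability score).
    """
    # One 2x2 table per stratum, stored as [A, B, C, D]:
    # A = ref correct, B = ref incorrect, C = focal correct, D = focal incorrect
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for score, group, correct in responses:
        idx = (0 if correct else 1) + (0 if group == "ref" else 2)
        tables[score][idx] += 1

    num = den = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined for these data")

    alpha = num / den                  # common odds ratio
    delta = -2.35 * math.log(alpha)    # ETS delta scale (MH D-DIF)
    return alpha, delta
```

An odds ratio of 1 (MH D-DIF of 0) means that matched examinees in the two groups have the same odds of answering the item correctly; negative MH D-DIF values indicate an item that favors the reference group.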
b. Describe the subgroups for which bias analyses were conducted:
The analyses examined bias in relation to gender, race, socioeconomic status, and pandemic effect.
1) Gender. Gender classification was available in the student-level data set for almost half of the students; no gender was indicated for the remaining students. The data were fairly evenly split, with slightly under half of these students identified as female and slightly over half identified as male.
2) Race. Because Edmentum’s student-level data files provide very limited demographic information, race was assigned at the district level: the account-level data provide the percentage of white students in each school district, and each student was assigned the value for their district. Students were considered to be in a high-majority district (coded as 1) if 50 percent or more of the students in the district were white, and in a low-majority district (coded as 0) otherwise. Nearly two-thirds of the students were from high-majority districts, while approximately one-third were from low-majority districts.
3) Socioeconomic Status. The account-level data provided the percentage of children in each district from families below the poverty line. The poverty data were sourced from the U.S. Census Bureau's Small Area Income and Poverty Estimates (SAIPE) program. This percentage was calculated as the ratio of children in a district from families below the poverty line to all children in the district. Students were considered to be in a high-poverty district (coded as 1) if more than 17 percent of students in the district were living in poverty, and in a low-poverty district (coded as 0) otherwise. The original intention was to identify high-poverty districts using a 50 percent cutoff; however, very few districts had more than 50 percent of students living in poverty. Therefore, the average percentage of students in poverty was used to divide the data. Nearly 60 percent of students were from school districts classified as high-poverty districts, while nearly 40 percent were from low-poverty districts.
4) Pandemic Effect. The pandemic grouping variable was obtained by appending the pre-pandemic data (all items administered prior to March 2020) to the pandemic data (all items administered after March 2020). The pre-pandemic data combined data from the 2018–2019 and 2019–2020 data sets, while the pandemic data combined any responses from administrations after March 2020.
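The district-level coding rules described above can be sketched as two small Python functions. The cutoffs (50 percent or more white students, more than 17 percent of children in poverty) come from the description; the function names and the example percentages are hypothetical and do not reflect the study's actual code or data.

```python
def code_majority_district(pct_white):
    """Return 1 if 50 percent or more of district students are white, else 0."""
    return 1 if pct_white >= 50.0 else 0


def code_high_poverty(pct_poverty, cutoff=17.0):
    """Return 1 if more than `cutoff` percent of children live in poverty, else 0."""
    return 1 if pct_poverty > cutoff else 0


# Hypothetical districts: (percent white, percent of children in poverty)
districts = {"District X": (62.0, 12.5), "District Y": (38.0, 24.0)}
for name, (pct_white, pct_poverty) in districts.items():
    print(name, code_majority_district(pct_white), code_high_poverty(pct_poverty))
```

Note that the two rules treat the boundary differently: a district at exactly 50 percent white is coded 1, while a district at exactly 17 percent poverty is coded 0.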
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Under the ETS classification system, each item is classified as showing A-, B-, or C-level DIF. Items classified with A-level DIF have “little or no difference between the two matched groups” (Zieky, 2003). Items flagged with B- or C-level DIF are typically evaluated for potential bias. Despite the large number of items in Edmentum’s item bank, no items were flagged for B- or C-level DIF. Thus, across the four grouping variables considered in the DIF analysis, the Edmentum items appear to be unbiased.
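The A/B/C labels can be illustrated with a simplified, magnitude-only rule applied to the MH D-DIF statistic. The thresholds of 1.0 and 1.5 follow the commonly cited ETS convention; the full ETS rule also applies statistical significance tests, which this sketch omits. The function names and the example item values are hypothetical.

```python
def ets_category(mh_d_dif):
    """Simplified ETS DIF category based only on the size of MH D-DIF.

    A: |D| < 1.0, B: 1.0 <= |D| < 1.5, C: |D| >= 1.5.
    The significance tests in the full ETS rule are intentionally omitted.
    """
    magnitude = abs(mh_d_dif)
    if magnitude < 1.0:
        return "A"
    if magnitude < 1.5:
        return "B"
    return "C"


def flagged_items(d_dif_by_item):
    """Return the items that fall in category B or C and would need review."""
    return {item: ets_category(d) for item, d in d_dif_by_item.items()
            if ets_category(d) != "A"}


# Hypothetical MH D-DIF values for three items
example = {"item_1": 0.3, "item_2": -0.8, "item_3": 0.1}
print(flagged_items(example))  # empty dict: every example item is category A
```

A result with no flagged items corresponds to the outcome reported above, in which every item fell into category A.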

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.