DORA: Diagnostic Online Reading Assessment
Reading

Summary

The Diagnostic Online Reading Assessment (DORA) is a web-based diagnostic and screener that is part of the Let’s Go Learn integrated assessment and instruction platform. It provides a single vertically aligned grade-level equivalency score for grades K through 12 by assessing mastery across 7 sub-tests that represent foundational reading skills. These sub-tests are organized into scope-and-sequenced skills, word lists, and text scaled by readability, aligning with the progression in which reading is typically taught.

Key Features and Design: 1) Precise Scope and Sequence Alignment: DORA assesses skills and concepts in a manner consistent with instructional best practices, enabling it to pinpoint students’ mastery levels. 2) Adaptive Algorithm: The assessment automatically adjusts to each student’s performance, skipping mastered skills while probing deeper into areas of struggle. 3) Domains Covered: DORA evaluates phonological awareness, phonics, reading automaticity, oral vocabulary, spelling, and comprehension strategies.

Instructional Applications: 1) Special Education: DORA identifies present levels of academic achievement and functional performance (PLAAFP) for IEP development, offering precise diagnostic data. 2) Tiered Interventions: The assessment identifies skill gaps so educators can group students effectively and provide targeted instruction. 3) Progress Monitoring: By capturing growth over time, DORA supports ongoing instructional adjustments and tracking of student progress.

Administration Details: 1) Duration and Flexibility: The assessment typically takes 30 to 60 minutes, depending on the student’s grade level and skill breadth, and can be administered in multiple sittings. 2) Frequency: DORA is commonly administered three times a year, with 12–18 weeks of instruction recommended between administrations. 3) Targeted Use: It is suitable for screening all students or specific groups, including those at risk of academic failure or requiring individualized support.

Actionable Reporting: DORA generates detailed reports immediately upon completion. These reports include: 1) analysis of skill/concept gaps and strengths; 2) specific instructional recommendations for addressing deficiencies; 3) grouping tools to support differentiated instruction in small-group or whole-class settings; and 4) individualized learning paths when paired with the optional ELA Edge program, which prescribes targeted online lessons for each student.

Sophisticated Adaptivity: DORA’s design ensures that students are assessed only on relevant skills. For example, if a student demonstrates very low decoding mastery, the assessment starts with a lower-level comprehension passage even if the student is enrolled in a higher grade. Each skill/concept is assessed with multiple items to confirm mastery, supporting accurate instructional planning.

In summary, DORA serves as a powerful and flexible tool for screening, diagnosing, and monitoring student learning needs in reading. Its adaptive design and actionable reports make it practical and effective for educators, whether used as a universal screener or as a diagnostic tool for individualized learning plans.

Where to Obtain:
Let's Go Learn, Inc.
help@letsgolearn.com
705 Wellesley Ave., Kensington, CA 94708
888-618-7323
www.letsgolearn.com
Initial Cost:
$8.00 per student
Replacement Cost:
$8.00 per student per year
Included in Cost:
DORA operates on a per-student annual licensing fee, which varies with the scale of implementation and any additional features requested, such as progress monitoring. The cost ranges from $8 to $13 per student per year, with discounts available for multi-year subscriptions. The annual license fee includes:
• Student Access: online access to assessments and progress monitoring.
• Educator Tools: full access to management and reporting tools.
• User Resources: comprehensive asynchronous professional development resources.
• Infrastructure and Maintenance: account setup, secure hosting, and all program updates, enhancements, and maintenance during the active license term.
• Support Services: unlimited access to U.S.-based customer support via toll-free phone and email during business hours.
Professional development is required for implementation and is available at an additional cost of $500 to $2,500, depending on scope and delivery method.
DORA includes a variety of accommodations and accessibility features, ensuring its usability for most students, including those with disabilities. It incorporates Universal Accessibility Features that are available to all students without requiring intervention from educators to enable them. These features include untimed testing, the ability to navigate the assessment using a keyboard, and adjustable audio volume. Additionally, universally accessible audio support is integrated into all items for grades K-7, which not only reduces potential reading bias but also assists students who may require audio support for text comprehension. DORA also includes processes and tools specifically designed to support students who require accommodations as determined by Individualized Education Programs (IEPs) or 504 plans. Since fall 2015, DORA has met the Level AA standard under the Web Content Accessibility Guidelines (WCAG 2.0), with documented exceptions. While all students have access to Universal Accessibility Features, accommodations beyond these features are typically determined by IEP teams or other educational professionals. Let’s Go Learn provides guidance and training to educators on how to implement accommodations effectively, but the ultimate decision and application of accommodations rest with the educators working directly with individual students.
Training Requirements:
1 hour
Qualified Administrators:
No minimum qualifications specified.
Access to Technical Support:
Typically, additional training is provided after DORA has been administered to teach teachers how to use the diagnostic data, or to show special education teachers how to use DORA to write IEPs and set short-term goals and objectives.
Assessment Format:
  • Direct: Computerized
Scoring Time:
  • Scoring is automatic
Scores Generated:
  • Raw score
  • Percentile score
  • Grade equivalents
  • Developmental benchmarks
  • Developmental cut points
  • Lexile score
  • Composite scores
  • Other: DORA also provides present levels for IEPs and a next-skills/concepts gap report that teachers can use to set goals for students.
Administration Time:
  • 45 minutes per group
Scoring Method:
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection
Accommodations:
DORA includes a variety of accommodations and accessibility features, ensuring its usability for most students, including those with disabilities. It incorporates Universal Accessibility Features that are available to all students without requiring intervention from educators to enable them. These features include untimed testing, the ability to navigate the assessment using a keyboard, and adjustable audio volume. Additionally, universally accessible audio support is integrated into all items for grades K-7, which not only reduces potential reading bias but also assists students who may require audio support for text comprehension. DORA also includes processes and tools specifically designed to support students who require accommodations as determined by Individualized Education Programs (IEPs) or 504 plans. Since fall 2015, DORA has met the Level AA standard under the Web Content Accessibility Guidelines (WCAG 2.0), with documented exceptions. While all students have access to Universal Accessibility Features, accommodations beyond these features are typically determined by IEP teams or other educational professionals. Let’s Go Learn provides guidance and training to educators on how to implement accommodations effectively, but the ultimate decision and application of accommodations rest with the educators working directly with individual students.

Descriptive Information

Please provide a description of your tool:
The Diagnostic Online Reading Assessment (DORA) is a web-based diagnostic and screener that is part of the Let’s Go Learn integrated assessment and instruction platform. It provides a single vertically aligned grade-level equivalency score for grades K through 12 by assessing mastery across 7 sub-tests that represent foundational reading skills. These sub-tests are organized into scope-and-sequenced skills, word lists, and text scaled by readability, aligning with the progression in which reading is typically taught.

Key Features and Design: 1) Precise Scope and Sequence Alignment: DORA assesses skills and concepts in a manner consistent with instructional best practices, enabling it to pinpoint students’ mastery levels. 2) Adaptive Algorithm: The assessment automatically adjusts to each student’s performance, skipping mastered skills while probing deeper into areas of struggle. 3) Domains Covered: DORA evaluates phonological awareness, phonics, reading automaticity, oral vocabulary, spelling, and comprehension strategies.

Instructional Applications: 1) Special Education: DORA identifies present levels of academic achievement and functional performance (PLAAFP) for IEP development, offering precise diagnostic data. 2) Tiered Interventions: The assessment identifies skill gaps so educators can group students effectively and provide targeted instruction. 3) Progress Monitoring: By capturing growth over time, DORA supports ongoing instructional adjustments and tracking of student progress.

Administration Details: 1) Duration and Flexibility: The assessment typically takes 30 to 60 minutes, depending on the student’s grade level and skill breadth, and can be administered in multiple sittings. 2) Frequency: DORA is commonly administered three times a year, with 12–18 weeks of instruction recommended between administrations. 3) Targeted Use: It is suitable for screening all students or specific groups, including those at risk of academic failure or requiring individualized support.

Actionable Reporting: DORA generates detailed reports immediately upon completion. These reports include: 1) analysis of skill/concept gaps and strengths; 2) specific instructional recommendations for addressing deficiencies; 3) grouping tools to support differentiated instruction in small-group or whole-class settings; and 4) individualized learning paths when paired with the optional ELA Edge program, which prescribes targeted online lessons for each student.

Sophisticated Adaptivity: DORA’s design ensures that students are assessed only on relevant skills. For example, if a student demonstrates very low decoding mastery, the assessment starts with a lower-level comprehension passage even if the student is enrolled in a higher grade. Each skill/concept is assessed with multiple items to confirm mastery, supporting accurate instructional planning.

In summary, DORA serves as a powerful and flexible tool for screening, diagnosing, and monitoring student learning needs in reading. Its adaptive design and actionable reports make it practical and effective for educators, whether used as a universal screener or as a diagnostic tool for individualized learning plans.
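To make the adaptive behavior concrete, the sketch below illustrates the kind of ceiling-based progression the description implies: begin at a student's prior high point, advance while item sets are mastered, and stop once mastery is followed by failure at the next level. The mastery threshold, names, and the omission of downward regression are simplifying assumptions for illustration, not Let's Go Learn's actual implementation.

```python
# Illustrative sketch of ceiling-based adaptive progression through one
# sub-test's scope and sequence. All details are assumptions.

MASTERY_THRESHOLD = 4  # assumed: 4 of 5 items correct counts as mastery

def run_subtest(item_sets, answer_fn, start_level=0):
    """Walk one sub-test's ordered item sets (easiest to hardest).

    item_sets: list of item lists, one per skill/concept/level.
    answer_fn: callable(item) -> bool, True if answered correctly.
    start_level: a returning student's previous high point, so that
                 already-mastered content is skipped.
    Returns the index of the highest mastered level (-1 if none).
    """
    highest_mastered = start_level - 1
    level = start_level
    while level < len(item_sets):
        correct = sum(answer_fn(item) for item in item_sets[level])
        if correct >= MASTERY_THRESHOLD:
            highest_mastered = level
            level += 1        # mastered: advance to harder material
        else:
            break             # mastery followed by failure = ceiling
    return highest_mastered
```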
The tool is intended for use with the following grade(s).
not selected Preschool / Pre-kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
selected Sixth grade
selected Seventh grade
selected Eighth grade
selected Ninth grade
selected Tenth grade
selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
not selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
selected 10 years old
selected 11 years old
selected 12 years old
selected 13 years old
selected 14 years old
selected 15 years old
selected 16 years old
selected 17 years old
selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading
Phonological processing:
not selected RAN
not selected Memory
selected Awareness
selected Letter sound correspondence
selected Phonics
selected Structural analysis

Word ID
selected Accuracy
selected Speed

Nonword
selected Accuracy
selected Speed

Spelling
selected Accuracy
not selected Speed

Passage
selected Accuracy
not selected Speed

Reading comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
not selected Sentence verification
not selected Other (please describe):


Listening comprehension:
not selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
not selected Sentence verification
not selected Vocabulary
not selected Expressive
not selected Receptive

Mathematics
Global Indicator of Math Competence
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Early Numeracy
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Concepts
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Computation
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematic Application
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Fractions/Decimals
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Algebra
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Geometry
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

not selected Other (please describe):

Please describe specific domain, skills or subtests:
DORA assesses six domains (Phonological Awareness, Phonics, High Frequency Words, Vocabulary, Spelling, and Comprehension of Informational Text), each with corresponding subdomains. For the domain of Phonological Awareness (grades K-2), the subdomains are rhyme recognition; syllable blending and segmenting; onset and rime blending and segmenting; phoneme identification, isolation, and pronunciation; phoneme blending and segmentation; and phoneme addition and substitution. For the domain of Phonics, the subdomains are alphabetic knowledge, beginning consonants, short vowels, beginning blends, long vowels, diphthongs, r-controlled vowels, digraphs, vowel digraphs, and multi-syllable words. The High Frequency Words domain includes words from the Dolch and Fry lists. For the Vocabulary domain, the subdomains are academic and domain-specific vocabulary. For the domain of Comprehension of Informational Text, the subdomains are author’s purpose; categorize and classify; cause and effect; drawing conclusions/making inferences; fact and opinion; main idea and details; message; summarize; text structure; vocabulary in context; compare and contrast across different mediums; analysis of close reading of the text; and citing textual evidence. The Spelling domain includes regular and irregular word spelling patterns that reflect students’ exposure to reading and spelling.
BEHAVIOR ONLY: Which category of behaviors does your tool target?


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:
Email Address
help@letsgolearn.com
Address
705 Wellesley Ave., Kensington, CA 94708
Phone Number
888-618-7323
Website
www.letsgolearn.com
Initial cost for implementing program:
Cost
$8.00
Unit of cost
student
Replacement cost per unit for subsequent use:
Cost
$8.00
Unit of cost
student
Duration of license
year
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
DORA operates on a per-student annual licensing fee, which varies with the scale of implementation and any additional features requested, such as progress monitoring. The cost ranges from $8 to $13 per student per year, with discounts available for multi-year subscriptions. The annual license fee includes:
• Student Access: online access to assessments and progress monitoring.
• Educator Tools: full access to management and reporting tools.
• User Resources: comprehensive asynchronous professional development resources.
• Infrastructure and Maintenance: account setup, secure hosting, and all program updates, enhancements, and maintenance during the active license term.
• Support Services: unlimited access to U.S.-based customer support via toll-free phone and email during business hours.
Professional development is required for implementation and is available at an additional cost of $500 to $2,500, depending on scope and delivery method.
Provide information about special accommodations for students with disabilities.
DORA includes a variety of accommodations and accessibility features, ensuring its usability for most students, including those with disabilities. It incorporates Universal Accessibility Features that are available to all students without requiring intervention from educators to enable them. These features include untimed testing, the ability to navigate the assessment using a keyboard, and adjustable audio volume. Additionally, universally accessible audio support is integrated into all items for grades K-7, which not only reduces potential reading bias but also assists students who may require audio support for text comprehension. DORA also includes processes and tools specifically designed to support students who require accommodations as determined by Individualized Education Programs (IEPs) or 504 plans. Since fall 2015, DORA has met the Level AA standard under the Web Content Accessibility Guidelines (WCAG 2.0), with documented exceptions. While all students have access to Universal Accessibility Features, accommodations beyond these features are typically determined by IEP teams or other educational professionals. Let’s Go Learn provides guidance and training to educators on how to implement accommodations effectively, but the ultimate decision and application of accommodations rest with the educators working directly with individual students.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected General education teacher
not selected Special education teacher
not selected Parent
not selected Child
not selected External observer
not selected Other
If other, please specify:

What is the administration setting?
not selected Direct observation
not selected Rating scale
not selected Checklist
not selected Performance measure
not selected Questionnaire
selected Direct: Computerized
not selected One-to-one
not selected Other
If other, please specify:

Does the tool require technology?
Yes

If yes, what technology is required to implement your tool? (Select all that apply)
selected Computer or tablet
selected Internet connection
not selected Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?
selected Individual
selected Small group   If small group, n=
selected Large group   If large group, n=
not selected Computer-administered
not selected Other
If other, please specify:

What is the administration time?
Time in minutes
45
per (student/group/other unit)
group

Additional scoring time:
Time in minutes
0
per (student/group/other unit)

ACADEMIC ONLY: What are the discontinue rules?
not selected No discontinue rules provided
not selected Basals
not selected Ceilings
selected Other
If other, please specify:
Tests can be taken in multiple sittings. Tests terminate if they are not completed within 30 days, because diagnostic accuracy may be compromised if students receive ongoing instruction during the testing period. Because DORA is not considered a high-stakes assessment and is instead diagnostic, teachers can administer it with their students outside of normal testing windows.


Are norms available?
Yes
Are benchmarks available?
Yes
If yes, how many benchmarks per year?
Let’s Go Learn recommends administering DORA three times a year. DORA provides benchmark scores in the form of grade-level equivalency scores at the sub-test, domain, and overall assessment levels. DORA also provides intervention tiers for schools using an RtI or MTSS academic framework.
If yes, for which months are benchmarks available?
The Grade-Level Placement benchmarks are available at any time of year because students are assigned a decimal grade level at the time they complete a benchmark. For example, a fourth-grade student completing an assessment in the fall might receive a decimal grade of 4.0 or 4.1, 4.5 at mid-year, and 4.8 or 4.9 at the end of the year.
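As a quick illustration of the decimal grade scale, the hypothetical helper below maps a grade and month of the school year onto the expected on-grade-level benchmark, assuming a 10-month school year; the function and mapping are ours, not Let's Go Learn's.

```python
def expected_decimal_grade(grade: int, month: int) -> float:
    """Expected on-grade-level benchmark score, assuming a 10-month
    school year (month 0 = fall start, month 9 = end of year)."""
    return grade + month / 10.0

expected_decimal_grade(4, 0)  # 4.0 in the fall
expected_decimal_grade(4, 5)  # 4.5 at mid-year
expected_decimal_grade(4, 9)  # 4.9 at the end of the year
```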
BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
1 hour
Please describe the minimum qualifications an administrator must possess.
selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
Yes
If No, please describe training costs:
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
Typically, additional training is provided after DORA has been administered to teach teachers how to use the diagnostic data, or to show special education teachers how to use DORA to write IEPs and set short-term goals and objectives.

Scoring

How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
Yes
What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
selected Percentile score
selected Grade equivalents
not selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
selected Developmental benchmarks
selected Developmental cut points
not selected Equated
not selected Probability
selected Lexile score
not selected Error analysis
selected Composite scores
not selected Subscale/subtest scores
selected Other
If other, please specify:
DORA also provides present levels for IEPs and a next-skills/concepts gap report that teachers can use to set goals for students.

Does your tool include decision rules?
No
If yes, please describe.
Can you provide evidence in support of multiple decision rules?
No
If yes, please describe.
Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
DORA employs a single vertical grade-level scoring algorithm spanning grades K through 12, designed to reflect the specific skills and concepts taught across the six domains of reading at each grade level. This scoring structure ensures that student performance can be measured with absolute comparability over time, making it particularly effective for tracking growth from kindergarten through grade 12. Unlike norm-referenced assessments, DORA’s scoring does not obscure growth trends, especially in middle school, where national performance data often depresses normed growth indicators. Each skill or concept is assessed with a minimum of five test items, and the assessment consists of 7 sub-tests, each representing a scope and sequence of related skills, concepts, or leveled words and passages. Students progress by mastering distinct sets of items grouped by common skill/concept/level and regress when they fail to demonstrate mastery, with scoring based on achieving a ceiling condition—mastery of a specific skill set followed by failure at the next level of difficulty. This approach produces 7 individual grade-level equivalency scores that are integrated into a weighted scoring algorithm. In total, DORA encompasses 69 item sets that span foundational reading for grades K through 12. The scoring structure is explicitly designed to test skills and concepts directly rather than predict scores, resulting in absolute rather than relative performance measures. As students demonstrate mastery of new skills, DORA dynamically adjusts starting points to previous high points, reducing inefficiencies by avoiding reassessment of already-mastered content. The adaptive logic also accounts for potential regression by requiring mastery of a ceiling condition in each administration, ensuring that scores accurately reflect each student’s current level of mastery.
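To show how 7 sub-test grade-equivalency scores could be integrated through a weighted algorithm as described, here is a minimal sketch. The sub-test names and weights are illustrative assumptions; DORA's actual weighting is its own.

```python
# Hypothetical weights over 7 sub-tests; values are assumptions that
# sum to 1.0 purely for illustration.
SUBTEST_WEIGHTS = {
    "phonological_awareness": 0.10,
    "phonics": 0.15,
    "high_frequency_words": 0.10,
    "word_recognition": 0.15,
    "oral_vocabulary": 0.15,
    "spelling": 0.10,
    "comprehension": 0.25,
}

def overall_grade_equivalency(subtest_scores: dict) -> float:
    """Combine per-sub-test grade-level scores (roughly 0.0-12.9 on the
    vertical scale) into one weighted overall score."""
    total = sum(SUBTEST_WEIGHTS[name] * score
                for name, score in subtest_scores.items())
    return round(total / sum(SUBTEST_WEIGHTS.values()), 1)

overall_grade_equivalency({
    "phonological_awareness": 2.9, "phonics": 4.5,
    "high_frequency_words": 4.8, "word_recognition": 4.2,
    "oral_vocabulary": 5.1, "spelling": 3.8, "comprehension": 4.4,
})  # -> 4.3
```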
Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
DORA includes intervention screening reports that classify students into three tiers: Tier 1, Tier 2, or Tier 3. Its grade-level equivalency scoring system simplifies interpretation by providing a decimal score that corresponds to the student’s grade level and time of year. This allows educators to calculate a student’s gap by subtracting the decimal grade at the time of testing from the DORA score (see the sketch below). Three intervention screening report formats are available: beginning-of-year, standard, and end-of-year, each categorizing students based on their chronological grade level. DORA also provides national norms expressed as percentiles for beginning-of-year, mid-year, and end-of-year testing, offering an alternative percentile-based classification approach for those who prefer that method. Let’s Go Learn ensures that DORA is appropriate for a diverse range of students, including those from culturally and linguistically diverse populations and students with disabilities. All items are designed to be developmentally, linguistically, and culturally appropriate. Audio support is included where appropriate so that items do not demand reading beyond the skill being tested, minimizing the potential for reading bias. For students who require accommodations, such as large print or additional time, DORA’s design ensures that most will not need further intervention to complete the assessment. The interface automatically optimizes text, image, and number scaling, and the assessment emphasizes maintaining standard administration and interpretation without compromising its purpose or validity. DORA was developed using universal design principles, with a commitment to fair, engaging, authentic, rigorous, and culturally inclusive assessments. A diverse review committee establishes standards that item writers adhere to, considering bias and sensitivity in accordance with Let’s Go Learn’s Sensitivity, Fairness, and Accessibility Guidelines. Items are constructed to focus students’ attention on the task without introducing bias or distraction. Every item undergoes rigorous review for sensitivity, fairness, and potential bias, including but not limited to cultural, linguistic, socioeconomic, religious, geographic, gender, and disability bias, as well as accessibility for students with color blindness. Items that fail to meet these criteria are revised, rejected, or removed. The Assessment Development team employs differential item functioning (DIF) analyses and p-value outlier screening to identify items with potential bias in performance metrics. Items exhibiting severe DIF are removed from the item bank, while flagged items are revised and re-piloted. These practices align with periodic quality reviews to maintain a high standard of fairness and accessibility in the assessment.
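A minimal sketch of the gap-based tier classification just described, assuming hypothetical gap thresholds (the published report formats apply their own cut points by chronological grade and time of year):

```python
def intervention_tier(dora_score: float, decimal_grade: float) -> int:
    """Tier a student by the gap between their DORA grade-level score
    and their decimal grade at the time of testing. Thresholds are
    illustrative assumptions, not DORA's published cut points."""
    gap = dora_score - decimal_grade   # negative = below grade level
    if gap >= -0.5:
        return 1                       # at or near grade level
    if gap >= -1.5:
        return 2                       # targeted (Tier 2) support
    return 3                           # intensive (Tier 3) intervention

intervention_tier(2.8, 4.5)  # mid-year fourth grader reading at 2.8 -> 3
```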

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grade Kindergarten
Grade 1
Grade 2
Grade 3
Grade 4
Grade 5
Grade 6
Grade 7
Grade 8
Classification Accuracy Fall Data unavailable Partially convincing evidence Partially convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Partially convincing evidence
Classification Accuracy Winter Partially convincing evidence Partially convincing evidence Partially convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Classification Accuracy Spring Partially convincing evidence Convincing evidence Partially convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available

Smarter Balanced Assessment (SBA) English Language Arts

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
The criterion measure is the Smarter Balanced Assessment (SBA) English Language Arts test for grades 3-8. The SBA is an end-of-year state summative assessment administered in the spring in various states. The scaled scores and performance bands defined in the Smarter Balanced 2018-2019 Summative Technical Report were used to classify students. Students who scored below the score corresponding to the 20th percentile on the SBA for the given grade were classified as at-risk, and students who scored at or above that score were classified as not-at-risk. These performance bands are established by the SBA, which is developed and administered independently of DORA. For grades K-2, the grade 3 SBA scores were used as the criterion for calculating predictive classification accuracy, as states do not administer the SBA before grade 3. As such, the criterion was administered 1-3 years after the DORA administration.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
For grades 3-8, the screening measure was administered at three time points in the 2018-19 academic year: Fall (between 08/01/2018 and 10/15/2018), Winter (between 12/01/2018 and 03/01/2019), and Spring (between 04/01/2019 and 06/15/2019). The criterion measure (SBA) was administered in the Spring of 2019. The Spring DORA scores, taken close in time to the SBA, represent concurrent classification accuracy, while the Fall and Winter scores represent predictive classification accuracy. For grades K to 2, the screening measure was administered at three time points in earlier school years, prior to the SBA being administered in the Spring of 2019.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Cut points on the criterion measure (SBA) were determined as the scale score corresponding to the 20th percentile defined in the Smarter Balanced 2018-2019 Summative Technical Report for the given subject and grade. This cut point follows the definition of students in need of intensive intervention provided by NCII’s Technical Review Committee. Students who scored below the score corresponding to the 20th percentile on the SBA for the given grade were classified as at-risk, and students who scored at or above that score were classified as not-at-risk. Cut points on the screening measure (DORA) were empirically identified as grade-level scores that best align with the SBA’s 20th percentile scores for each subject, grade, and testing window. Using these cut scores, students were classified as at-risk if they scored below the cut score on DORA for the given testing window, or not-at-risk if they scored at or above it. Classification indices between at-risk/not-at-risk on DORA and at-risk/not-at-risk on the SBA are calculated per the formulas in the classification worksheet (see the sketch below). For students in grades 3-8, screening scores in the Fall, Winter, and Spring of the 2018-19 academic year were used for at-risk classification on the criterion measure administered in Spring 2019 at the same grade level. For students in grade K, screening scores from the 2015-16 academic year were used for at-risk classification on the criterion measure for the same students in grade 3 in Spring 2019. For students in grade 1, Fall, Winter, and Spring screening scores from the 2016-17 academic year were used for at-risk classification on the criterion measure for the same students in grade 3 in Spring 2019. For students in grade 2, Fall, Winter, and Spring screening scores from the 2017-18 academic year were used for at-risk classification on the criterion measure for the same students in grade 3 in Spring 2019. A post-COVID note: we are submitting pre-COVID data. Given that neither our assessment nor current ELA standards have changed since the 2018-19 year, we believe these data remain the most valid for comparison to the SBA and its cut scores for at-risk/not-at-risk classification.
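For reference, the classification indices reported in the tables that follow use the standard 2x2 formulas. This sketch reproduces the Grade 3 Fall cell (a = 2352, b = 1455, c = 390, d = 9354); its outputs round to the sensitivity, specificity, and predictive power values shown below.

```python
def classification_indices(a: int, b: int, c: int, d: int) -> dict:
    """Standard screening indices from a 2x2 table: a = true positives,
    b = false positives, c = false negatives, d = true negatives."""
    n = a + b + c + d
    return {
        "base_rate": (a + c) / n,
        "overall_classification_rate": (a + d) / n,
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "false_positive_rate": b / (b + d),
        "false_negative_rate": c / (a + c),
        "positive_predictive_power": a / (a + b),
        "negative_predictive_power": d / (c + d),
    }

classification_indices(2352, 1455, 390, 9354)
# -> sensitivity 0.86, specificity 0.87, PPV 0.62, NPV 0.96
```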
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Some students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students. In addition, roughly 50% of students who take DORA may be using our reading intervention, but for these students we do not know the fidelity with which they received it.

Cross-Validation

Has a cross-validation study been conducted?
Yes
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
The criterion measure is the Smarter Balanced Assessment (SBA) English Language Arts test for grades 3-8. For grades K-2, SBA scores for the same students in 2019 were used as the criterion for calculating predictive classification accuracy. The SBA is an end-of-year state summative assessment administered in the spring in various states. The percentile scores defined in the Smarter Balanced 2018-2019 Summative Technical Report were used to classify students. Students who scored below the score corresponding to the 20th percentile on the SBA for the given grade were classified as at-risk, and students who scored at or above that score were classified as not-at-risk.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
For grades 3-8, the screening measure was administered at three time points in the 2018-19 academic year: Fall (between 08/01/2018 and 10/15/2018), Winter (between 12/01/2018 and 03/01/2019), and Spring (between 04/01/2019 and 06/15/2019). The criterion measure (SBA) was administered in the Spring of 2019. The Spring DORA scores, taken close in time to the SBA, represent concurrent classification accuracy, while the Fall and Winter scores represent predictive classification accuracy. For grades K to 2, the screening measure was administered at three time points in earlier school years, prior to the SBA being administered in the Spring of 2019.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
For the cross-validation study, we used a K-fold cross-validation method by splitting the sample into K=5 parts, using 4 parts (80% of the sample) for the classification accuracy study and 1 part (20% of the sample) for cross-validation. The timing of measure administration was therefore the same as in the main Classification Accuracy study. To validate our results, we used the same cut points as the main Classification Accuracy study for both the criterion measure (SBA) and the screening measure (DORA) when performing the classification analyses on the cross-validation sample. Cut points on the criterion measure (SBA) were determined as the scale score corresponding to the 20th percentile defined in the Smarter Balanced 2018-2019 Summative Technical Report for the given subject and grade. This cut point follows the definition of students in need of intensive intervention provided by NCII’s Technical Review Committee. Students who scored below the score corresponding to the 20th percentile on the SBA test for the given grade were classified as at-risk, and students who scored at or above that score were classified as not-at-risk. Cut points on the screening measure (DORA) were the same scores identified as cut points in the main Classification Accuracy study. Students were designated as actually “at risk” or “not at risk” by rank-ordering their SBA state scale scores and using the 20th percentile rank point within the study sample as the cut score, disaggregated within the sample by grade and subject area. Students actually “at risk” were so designated when their state scale scores fell below the 20th percentile rank point. Because DORA uses a grade-level equivalency score, we rank risk based on how far a student’s total DORA grade-level score falls below the decimal grade corresponding to their Fall, Winter, or Spring DORA assessment. For the purpose of this cross-validation analysis, the gap threshold per grade varied. A sketch of the split appears below.
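A minimal sketch of the K = 5 split described above, assuming a simple random permutation (the study's exact partitioning procedure is not specified):

```python
import numpy as np

def five_fold_split(n_students: int, seed: int = 0):
    """Yield (main, holdout) index arrays: four parts (80%) for the
    main classification accuracy analyses, one part (20%) held out for
    cross-validation. Seed and shuffling are assumptions."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_students), 5)
    for k in range(5):
        holdout = folds[k]
        main = np.concatenate([folds[j] for j in range(5) if j != k])
        yield main, holdout

# e.g., Grade 3 Fall sample (n = 13,551):
main_idx, cv_idx = next(five_fold_split(13551))
```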
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
Yes
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Some students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students. In addition, roughly 50% of students who take DORA may be using our reading intervention, but for these students we do not know the fidelity with which they received it.

Classification Accuracy - Fall

Evidence Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 2342 2342 2342 2379 2414 2435 2456 2470
Cut Points - Corresponding performance score (numeric) on screener measure 0.5 0.75 3.01 4.01 5 5.9 6.79 7.59
Classification Data - True Positive (a) 1245 1284 2352 4241 4350 4310 4330 2930
Classification Data - False Positive (b) 2157 2283 1455 2500 2510 2592 2697 2110
Classification Data - False Negative (c) 379 425 390 803 889 918 856 769
Classification Data - True Negative (d) 9241 9470 9354 15267 15303 15143 15240 10092
Area Under the Curve (AUC) 0.84 0.84 0.90 0.91 0.92 0.91 0.91 0.88
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.82 0.83 0.90 0.89 0.90 0.90 0.91 0.87
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.85 0.85 0.90 0.92 0.93 0.91 0.92 0.89
Statistics Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.12 0.13 0.20 0.22 0.23 0.23 0.22 0.23
Overall Classification Rate 0.81 0.80 0.86 0.86 0.85 0.85 0.85 0.82
Sensitivity 0.77 0.75 0.86 0.84 0.83 0.82 0.83 0.79
Specificity 0.81 0.81 0.87 0.86 0.86 0.85 0.85 0.83
False Positive Rate 0.19 0.19 0.13 0.14 0.14 0.15 0.15 0.17
False Negative Rate 0.23 0.25 0.14 0.16 0.17 0.18 0.17 0.21
Positive Predictive Power 0.37 0.36 0.62 0.63 0.63 0.62 0.62 0.58
Negative Predictive Power 0.96 0.96 0.96 0.95 0.95 0.94 0.95 0.93
Sample Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019
Sample Size 13022 13462 13551 22811 23052 22963 23123 15901
Geographic Representation East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
Male 48.2% 50.9% 49.9% 50.1% 50.7% 50.4% 50.7% 50.5%
Female 51.8% 49.1% 50.1% 49.9% 49.3% 49.6% 49.3% 49.5%
Other                
Gender Unknown                
White, Non-Hispanic 24.8% 27.0% 23.9% 24.2% 23.6% 22.0% 22.8% 23.9%
Black, Non-Hispanic 8.3% 7.0% 7.7% 6.6% 6.5% 9.7% 10.0% 9.2%
Hispanic 44.1% 42.5% 44.9% 50.3% 50.2% 50.1% 51.2% 51.4%
Asian/Pacific Islander 13.0% 13.5% 11.9% 12.0% 10.4% 11.0% 9.4% 8.4%
American Indian/Alaska Native 0.4% 0.3% 0.3% 0.2% 0.3% 0.2% 0.2% 0.3%
Other 8.1% 5.2% 6.5% 3.1% 5.9% 5.0% 3.9% 4.0%
Race / Ethnicity Unknown 1.3% 4.5% 4.8% 3.6% 3.1% 2.0% 2.5% 2.8%
Low SES                
IEP or diagnosed disability                
English Language Learner                

Classification Accuracy - Winter

Evidence Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 2342 2342 2342 2342 2379 2414 2435 2456 2470
Cut Points - Corresponding performance score (numeric) on screener measure 0.23 0.5 2 3.57 4.43 5.5 6.4 7.23 8.03
Classification Data - True Positive (a) 259 833 899 1403 1501 1678 1578 1621 1470
Classification Data - False Positive (b) 580 1497 1502 1103 1097 1112 1124 1247 1209
Classification Data - False Negative (c) 101 231 302 252 283 333 319 323 375
Classification Data - True Negative (d) 2340 6233 6202 5652 6048 5415 5580 5292 5145
Area Under the Curve (AUC) 0.77 0.83 0.83 0.87 0.89 0.88 0.88 0.87 0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.77 0.81 0.83 0.86 0.88 0.87 0.87 0.86 0.82
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.78 0.84 0.83 0.88 0.90 0.89 0.89 0.88 0.85
Statistics Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.11 0.12 0.13 0.20 0.20 0.24 0.22 0.23 0.23
Overall Classification Rate 0.79 0.80 0.80 0.84 0.85 0.83 0.83 0.81 0.81
Sensitivity 0.72 0.78 0.75 0.85 0.84 0.83 0.83 0.83 0.80
Specificity 0.80 0.81 0.81 0.84 0.85 0.83 0.83 0.81 0.81
False Positive Rate 0.20 0.19 0.19 0.16 0.15 0.17 0.17 0.19 0.19
False Negative Rate 0.28 0.22 0.25 0.15 0.16 0.17 0.17 0.17 0.20
Positive Predictive Power 0.31 0.36 0.37 0.56 0.58 0.60 0.58 0.57 0.55
Negative Predictive Power 0.96 0.96 0.95 0.96 0.96 0.94 0.95 0.94 0.93
Sample Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019
Sample Size 3280 8794 8905 8410 8929 8538 8601 8483 8199
Geographic Representation East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
Male 50.9% 49.1% 50.9% 50.1% 48.3% 50.1% 51.4% 49.3% 50.5%
Female 49.1% 50.9% 49.1% 49.9% 51.7% 49.9% 48.6% 50.7% 49.5%
Other                  
Gender Unknown                  
White, Non-Hispanic 16.1% 23.1% 29.1% 25.5% 26.8% 20.1% 25.1% 28.0% 25.9%
Black, Non-Hispanic 7.7% 8.9% 6.9% 7.9% 6.8% 6.9% 7.9% 5.4% 9.1%
Hispanic 52.2% 45.1% 41.2% 43.1% 44.9% 50.1% 44.3% 49.3% 50.2%
Asian/Pacific Islander 17.3% 15.0% 14.1% 12.3% 11.4% 11.1% 12.9% 11.9% 9.6%
American Indian/Alaska Native 0.4% 0.7% 0.5% 0.5% 0.1% 0.3% 0.5% 0.4% 0.4%
Other 5.3% 6.0% 5.0% 5.9% 2.9% 6.3% 5.4% 3.0% 3.0%
Race / Ethnicity Unknown 1.0% 1.2% 3.2% 4.8% 2.5% 5.2% 3.9% 2.0% 1.8%
Low SES                  
IEP or diagnosed disability                  
English Language Learner                  

Classification Accuracy - Spring

Evidence Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 2342 2342 2342 2342 2379 2414 2435 2456 2470
Cut Points - Corresponding performance score (numeric) on screener measure 0.23 1 2.4 3.9 4.9 5.8 6.7 7.5 8.3
Classification Data - True Positive (a) 801 1260 1245 2493 4203 4512 4602 4687 2930
Classification Data - False Positive (b) 1739 2209 2189 1483 2493 2710 2913 2970 2422
Classification Data - False Negative (c) 233 322 354 375 725 783 930 875 701
Classification Data - True Negative (d) 6528 9192 9112 9438 15209 16025 14372 14574 10446
Area Under the Curve (AUC) 0.79 0.84 0.84 0.91 0.92 0.91 0.90 0.90 0.89
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.78 0.83 0.84 0.90 0.91 0.90 0.89 0.89 0.88
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.80 0.85 0.85 0.91 0.92 0.92 0.91 0.91 0.89
Statistics Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.11 0.12 0.12 0.21 0.22 0.22 0.24 0.24 0.22
Overall Classification Rate 0.79 0.81 0.80 0.87 0.86 0.85 0.83 0.83 0.81
Sensitivity 0.77 0.80 0.78 0.87 0.85 0.85 0.83 0.84 0.81
Specificity 0.79 0.81 0.81 0.86 0.86 0.86 0.83 0.83 0.81
False Positive Rate 0.21 0.19 0.19 0.14 0.14 0.14 0.17 0.17 0.19
False Negative Rate 0.23 0.20 0.22 0.13 0.15 0.15 0.17 0.16 0.19
Positive Predictive Power 0.32 0.36 0.36 0.63 0.63 0.62 0.61 0.61 0.55
Negative Predictive Power 0.97 0.97 0.96 0.96 0.95 0.95 0.94 0.94 0.94
Sample Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019
Sample Size 9301 12983 12900 13789 22630 24030 22817 23106 16499
Geographic Representation East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
Male 48.1% 48.3% 50.9% 51.7% 50.1% 49.2% 51.1% 50.1% 51.3%
Female 51.9% 51.7% 49.1% 48.3% 49.9% 50.8% 48.9% 49.9% 48.7%
Other                  
Gender Unknown                  
White, Non-Hispanic 17.1% 23.1% 24.9% 28.1% 25.9% 23.9% 30.1% 22.9% 23.9%
Black, Non-Hispanic 9.1% 8.1% 9.1% 6.9% 6.0% 6.1% 6.6% 11.1% 8.3%
Hispanic 50.1% 43.9% 45.1% 42.0% 48.3% 48.1% 44.2% 51.5% 51.2%
Asian/Pacific Islander 16.2% 11.9% 13.0% 10.1% 11.0% 10.1% 11.5% 10.9% 7.8%
American Indian/Alaska Native 0.3% 0.4% 0.3% 0.4% 0.3% 0.3% 0.1% 0.4% 0.3%
Other 5.1% 9.3% 4.2% 6.9% 4.1% 5.3% 4.6% 3.1% 4.9%
Race / Ethnicity Unknown 2.1% 3.3% 3.4% 5.6% 4.4% 6.2% 2.9% 0.1% 3.6%
Low SES                  
IEP or diagnosed disability                  
English Language Learner                  

Cross-Validation - Fall

Evidence Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 2342 2342 2342 2379 2414 2435 2456 2470
Cut Points - Corresponding performance score (numeric) on screener measure 0.5 0.75 3.01 4.01 5 5.9 6.79 7.59
Classification Data - True Positive (a) 135 290 450 881 878 872 884 601
Classification Data - False Positive (b) 441 451 285 525 544 531 532 511
Classification Data - False Negative (c) 64 101 104 181 190 199 204 165
Classification Data - True Negative (d) 1962 1850 1871 2975 2998 2990 3004 1903
Area Under the Curve (AUC) 0.80 0.83 0.89 0.90 0.90 0.90 0.90 0.86
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.79 0.82 0.89 0.90 0.89 0.89 0.89 0.85
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.81 0.84 0.89 0.91 0.91 0.91 0.91 0.86
Statistics Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.08 0.15 0.20 0.23 0.23 0.23 0.24 0.24
Overall Classification Rate 0.81 0.79 0.86 0.85 0.84 0.84 0.84 0.79
Sensitivity 0.68 0.74 0.81 0.83 0.82 0.81 0.81 0.78
Specificity 0.82 0.80 0.87 0.85 0.85 0.85 0.85 0.79
False Positive Rate 0.18 0.20 0.13 0.15 0.15 0.15 0.15 0.21
False Negative Rate 0.32 0.26 0.19 0.17 0.18 0.19 0.19 0.22
Positive Predictive Power 0.23 0.39 0.61 0.63 0.62 0.62 0.62 0.54
Negative Predictive Power 0.97 0.95 0.95 0.94 0.94 0.94 0.94 0.92
Sample Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019
Sample Size 2602 2692 2710 4562 4610 4592 4624 3180
Geographic Representation East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
East North Central (MI)
New England (CT)
Pacific (CA, WA)
Male 48.2% 50.1% 49.8% 50.3% 50.2% 50.4% 49.7% 50.1%
Female 51.8% 49.9% 50.2% 49.7% 49.8% 49.6% 50.3% 49.9%
Other                
Gender Unknown                
White, Non-Hispanic 25.4% 26.2% 22.9% 25.2% 23.9% 23.0% 22.5% 25.1%
Black, Non-Hispanic 8.1% 7.8% 8.0% 6.9% 6.5% 9.2% 10.5% 9.1%
Hispanic 42.9% 42.0% 46.5% 48.9% 50.2% 49.1% 51.1% 48.1%
Asian/Pacific Islander 13.8% 14.2% 11.1% 12.4% 10.1% 10.1% 10.1% 8.9%
American Indian/Alaska Native 0.3% 0.3% 0.4% 0.5% 0.2% 0.4% 0.5% 0.4%
Other 7.3% 5.2% 6.7% 2.9% 5.9% 5.9% 3.0% 5.3%
Race / Ethnicity Unknown 2.2% 4.3% 4.4% 3.2% 3.2% 2.3% 2.3% 3.1%
Low SES                
IEP or diagnosed disability                
English Language Learner                

Cross-Validation - Winter

Evidence Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts Smarter Balanced Assessment (SBA) English Language Arts
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 2342 2342 2342 2342 2379 2414 2435 2456 2470
Cut Points - Corresponding performance score (numeric) on screener measure 0.23 0.5 2 3.57 4.43 5.5 6.4 7.23 8.03
Classification Data - True Positive (a) 51 149 201 290 314 330 327 301 291
Classification Data - False Positive (b) 109 289 290 232 219 239 251 301 279
Classification Data - False Negative (c) 20 42 66 50 59 67 68 60 73
Classification Data - True Negative (d) 476 1278 1224 1110 1111 1071 1074 1034 996
Area Under the Curve (AUC) 0.77 0.83 0.83 0.86 0.89 0.87 0.86 0.86 0.83
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.76 0.81 0.82 0.85 0.88 0.86 0.85 0.86 0.82
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.77 0.84 0.83 0.86 0.89 0.88 0.87 0.87 0.84
Statistics Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.11 0.11 0.15 0.20 0.22 0.23 0.23 0.21 0.22
Overall Classification Rate 0.80 0.81 0.80 0.83 0.84 0.82 0.81 0.79 0.79
Sensitivity 0.72 0.78 0.75 0.85 0.84 0.83 0.83 0.83 0.80
Specificity 0.81 0.82 0.81 0.83 0.84 0.82 0.81 0.77 0.78
False Positive Rate 0.19 0.18 0.19 0.17 0.16 0.18 0.19 0.23 0.22
False Negative Rate 0.28 0.22 0.25 0.15 0.16 0.17 0.17 0.17 0.20
Positive Predictive Power 0.32 0.34 0.41 0.56 0.59 0.58 0.57 0.50 0.51
Negative Predictive Power 0.96 0.97 0.95 0.96 0.95 0.94 0.94 0.95 0.93
Sample Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019
Sample Size 656 1758 1781 1682 1703 1707 1720 1696 1639
Geographic Representation East North Central (MI); New England (CT); Pacific (CA, WA) (same regions for all grade columns)
Male 52.0% 48.9% 50.9% 49.1% 50.9% 48.2% 50.1% 49.3% 50.3%
Female 48.0% 51.1% 49.1% 50.9% 49.1% 51.8% 49.9% 50.7% 49.7%
Other                  
Gender Unknown                  
White, Non-Hispanic 16.2% 23.1% 25.9% 25.7% 27.1% 21.2% 23.2% 22.1% 24.7%
Black, Non-Hispanic 8.1% 8.9% 6.5% 7.8% 7.0% 6.9% 8.1% 9.2% 9.1%
Hispanic 50.3% 45.3% 40.2% 43.1% 45.9% 50.4% 45.3% 50.0% 50.1%
Asian/Pacific Islander 19.1% 14.3% 14.1% 11.2% 11.9% 11.2% 12.2% 11.9% 10.1%
American Indian/Alaska Native 0.3% 0.3% 0.9% 0.5% 0.4% 0.5% 0.9% 0.3% 0.4%
Other 5.2% 6.6% 7.9% 6.1% 4.3% 5.3% 5.9% 3.1% 4.2%
Race / Ethnicity Unknown 1.1% 1.5% 4.5% 5.6% 3.4% 4.5% 4.4% 3.4% 1.4%
Low SES                  
IEP or diagnosed disability                  
English Language Learner                  

Cross-Validation - Spring

Evidence Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Criterion measure Smarter Balanced Assessment (SBA) English Language Arts (all grades)
Cut Points - Percentile rank on criterion measure 20 20 20 20 20 20 20 20 20
Cut Points - Performance score on criterion measure 2342 2342 2342 2342 2379 2414 2435 2456 2470
Cut Points - Corresponding performance score (numeric) on screener measure .23 1 2.4 3.9 4.9 5.8 6.7 7.5 8.3
Classification Data - True Positive (a) 185 250 250 490 810 899 881 921 578
Classification Data - False Positive (b) 345 450 479 354 601 658 722 738 581
Classification Data - False Negative (c) 60 59 72 72 144 141 178 170 132
Classification Data - True Negative (d) 1270 1838 1779 1841 2971 3108 2782 2792 2008
Area Under the Curve (AUC) 0.78 0.82 0.82 0.89 0.90 0.90 0.89 0.88 0.87
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.77 0.81 0.81 0.88 0.89 0.89 0.88 0.88 0.86
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.78 0.83 0.83 0.89 0.91 0.91 0.89 0.89 0.88
Statistics Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Base Rate 0.13 0.12 0.12 0.20 0.21 0.22 0.23 0.24 0.22
Overall Classification Rate 0.78 0.80 0.79 0.85 0.84 0.83 0.80 0.80 0.78
Sensitivity 0.76 0.81 0.78 0.87 0.85 0.86 0.83 0.84 0.81
Specificity 0.79 0.80 0.79 0.84 0.83 0.83 0.79 0.79 0.78
False Positive Rate 0.21 0.20 0.21 0.16 0.17 0.17 0.21 0.21 0.22
False Negative Rate 0.24 0.19 0.22 0.13 0.15 0.14 0.17 0.16 0.19
Positive Predictive Power 0.35 0.36 0.34 0.58 0.57 0.58 0.55 0.56 0.50
Negative Predictive Power 0.95 0.97 0.96 0.96 0.95 0.96 0.94 0.94 0.94
Sample Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 6 Grade 7 Grade 8
Date Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019 Spring 2019
Sample Size 1860 2597 2580 2757 4526 4806 4563 4621 3299
Geographic Representation East North Central (MI); New England (CT); Pacific (CA, WA) (same regions for all grade columns)
Male 49.0% 50.2% 48.3% 49.2% 49.4% 51.2% 49.9% 50.6% 52.1%
Female 51.0% 49.8% 51.7% 50.8% 50.6% 48.8% 50.1% 49.4% 47.9%
Other                  
Gender Unknown                  
White, Non-Hispanic 16.9% 22.0% 24.9% 27.9% 26.2% 24.1% 29.1% 22.8% 23.4%
Black, Non-Hispanic 7.8% 8.5% 8.1% 7.1% 6.2% 5.9% 6.9% 11.1% 9.0%
Hispanic 51.9% 43.8% 42.1% 41.2% 47.9% 48.9% 43.9% 52.3% 50.1%
Asian/Pacific Islander 16.0% 15.1% 14.5% 10.9% 12.0% 10.9% 12.2% 12.3% 9.9%
American Indian/Alaska Native 0.5% 0.9% 0.4% 0.4% 0.4% 0.4% 0.1% 0.3% 0.4%
Other 4.9% 6.2% 4.3% 7.9% 3.4% 5.0% 4.9% 1.1% 2.9%
Race / Ethnicity Unknown 2.0% 3.5% 5.7% 4.6% 4.1% 4.8% 2.9% 0.1% 4.3%
Low SES                  
IEP or diagnosed disability                  
English Language Learner                  

Reliability

Grade Kindergarten, Grade 1, Grade 2, Grade 3, Grade 4, Grade 5, Grade 6, Grade 7, Grade 8
Rating Unconvincing evidence (all grades, K-8)
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
The Diagnostic Online Reading Assessment (DORA) is a criterion-referenced, adaptive assessment designed to measure student mastery across multiple reading sub-tests. Marginal Reliability (Item Response Theory-Based, Applied to the Weighted Total Score): Evaluates the precision of student ability estimates using DORA’s weighted total score, rather than internal consistency of test items. Since DORA’s total score is a weighted composite of multiple sub-tests (e.g., Phonics, Word Recognition, Vocabulary, Reading Comprehension), it already accounts for the instructional importance of each reading skill at different grade levels. Instead of calculating marginal reliability separately for each sub-test, it is computed directly on the weighted total score, ensuring that the final reliability estimate accurately reflects the assessment’s structure. This approach prevents redundant calculations and ensures that the reliability estimate properly accounts for differences in skill emphasis across grades.
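To make the weighted-composite idea concrete, the sketch below combines sub-test scores into a single total. The sub-test names mirror the examples given above (Phonics, Word Recognition, Vocabulary, Reading Comprehension), but the weights are hypothetical placeholders; DORA's actual grade-level weights are not published here.

    # Hypothetical weights for one grade level; DORA's real weights vary
    # by grade to reflect instructional emphasis.
    weights = {
        "phonics": 0.35,
        "word_recognition": 0.25,
        "vocabulary": 0.20,
        "reading_comprehension": 0.20,
    }

    def weighted_total(sub_scores, weights):
        # Combine sub-test scores into one composite total score.
        assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
        return sum(weights[name] * score for name, score in sub_scores.items())

    student = {"phonics": 2.1, "word_recognition": 2.4,
               "vocabulary": 1.9, "reading_comprehension": 2.0}
    print(weighted_total(student, weights))  # 2.115, one composite estimate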
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Marginal Reliability Sample Size: The full DORA test dataset, typically including thousands of student responses across all sub-tests. Characteristics: Students across multiple grade levels, ensuring broad applicability. Each student’s weighted total score was analyzed, rather than separate sub-test reliability scores, to reflect how different reading skills contribute at different grade levels. The dataset reflects the adaptive nature of DORA, where students received different sets of items based on their performance, and marginal reliability was calculated to ensure precise ability estimates for the total reading score.
*Describe the analysis procedures for each reported type of reliability.
Marginal Reliability Analysis (Applied to the Weighted Total Score):
- Ability estimates (θ) were calculated for each student's total weighted reading score, rather than for each individual sub-test.
- The Standard Error of Measurement (SEM) was computed for each student's ability estimate, reflecting the precision of measurement.
- The variance of ability estimates across all students was calculated to determine how much scores naturally vary in the population.
- Marginal reliability was computed using the formula: R_marginal = 1 − Mean(SEM²) / Var(θ)
- By applying this calculation to the weighted total score, the reliability estimate reflects the combined contribution of all reading sub-tests, weighted by their instructional importance at each grade level.
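A minimal sketch of the computation described above, with randomly generated arrays standing in for the per-student ability estimates (θ) and SEMs (both hypothetical):

    import numpy as np

    def marginal_reliability(theta, sem):
        # R_marginal = 1 - mean(SEM^2) / variance of ability estimates
        theta = np.asarray(theta, dtype=float)
        sem = np.asarray(sem, dtype=float)
        return 1.0 - np.mean(sem ** 2) / np.var(theta, ddof=1)

    # Hypothetical inputs for illustration only
    rng = np.random.default_rng(0)
    theta = rng.normal(0.0, 1.0, size=5000)   # per-student ability estimates
    sem = np.full(5000, 0.33)                 # per-student standard errors
    print(round(marginal_reliability(theta, sem), 2))  # about 0.89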

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
Yes

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Marginal Reliability for K-8 Students: White: n=1769, r=.89; Hispanic: n=2389, r=.89; Black: n=675, r=.86; Asian: n=645, r=.86
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity

Grade Kindergarten, Grade 1, Grade 2, Grade 3, Grade 4, Grade 5, Grade 6, Grade 7, Grade 8
Rating Unconvincing evidence (Grades K-2); Convincing evidence (Grades 3-8)
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
The construct validity of a tool like DORA is supported by how well it measures what it claims to measure and how effectively it informs decision-making. DORA's design centers on assessing specific skills and concepts aligned with modern ELA foundational content standards, ensuring it is both relevant and precise. Its adaptive nature allows students to be assessed across a broad spectrum of abilities, irrespective of their enrolled grade level, so DORA captures each student's current mastery without being constrained by predefined grade boundaries. DORA's structure, with 7 criterion-referenced sub-tests, further enhances its validity: these sub-tests directly assess mastery of specific skills and concepts rather than predicting mastery from proxies, and combined with DORA's granular organization, they provide precise and actionable results. The vertical native scoring system ensures stability and continuity across assessments, eliminating floor and ceiling effects, so DORA can assess growth and performance accurately over time. Evidence of the tool's validity is further demonstrated through external correlation: DORA's total grade-equivalency score and the total scaled score of the Smarter Balanced Assessment (SBA) show a Pearson correlation coefficient of 0.71, indicating a good relationship between the two measures. This correlation underscores DORA's appropriateness as a diagnostic tool, as it aligns well with other established measures of student performance in reading. We do not expect the correlation to be higher: DORA, as a diagnostic, ranges as many years above or below grade level as needed to find each student's instructional point, whereas the SBA focuses mainly on grade-level items. The divergent objectives, diagnosis for DORA and grade-level accountability for the SBA, limit the correlation to good rather than excellent.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
For the concurrent validity analyses, the sample consisted of about 140,000 students who completed both the screening measure and the criterion measure during the Spring testing window. This sample included students across all performance levels, ensuring a comprehensive representation. For predictive validity analyses in grades K-2, with the Smarter Balanced Assessment (SBA) used as the criterion measure, the sample included students who completed the screening measure 12 to 36 months prior to the administration of the criterion measure. Sample size was about 13,000 students, encompassing a diverse range of performance levels. For predictive validity analyses in grades 3-8, the sample included students who completed the screening measure in the Fall testing window and the criterion measure in the subsequent Spring. Sample size was also about 140,000 students, with all performance levels represented.
*Describe the analysis procedures for each reported type of validity.
For both concurrent and predictive validity analyses, Pearson correlation coefficients were calculated to examine the relationship between the screening measure and the criterion measure, and 95% confidence intervals were computed around the Pearson r coefficients to ensure statistical precision. Correlations for grades K-2 were lower (0.60 to 0.65), as expected, because those screening scores were compared with the SBA administered when students were at the end of grade 3. For grades 3-8, predictive validity was assessed by correlating Fall screening scores with criterion scores from the Spring of the same academic year, and concurrent validity was analyzed by correlating Spring screening scores with Spring criterion scores from the same year.
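The procedure used to construct the 95% confidence intervals is not specified above; the Fisher z-transform is the standard approach for intervals around a Pearson r, sketched here under that assumption:

    import math

    def pearson_ci(r, n, z_crit=1.96):
        # 95% confidence interval for a Pearson r via the Fisher z-transform
        z = math.atanh(r)
        se = 1.0 / math.sqrt(n - 3)
        return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

    # Illustrative: r = 0.71 with n = 140,000 yields a very tight interval
    lo, hi = pearson_ci(0.71, 140_000)
    print(f"95% CI: [{lo:.3f}, {hi:.3f}]")  # roughly [0.707, 0.713]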

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Predictive validity coefficients for grades K-2 were positive and significant, ranging from 0.60 to 0.65. These correlations reflect the relationship between screening measures administered 12 to 36 months prior and subsequent criterion measures. While slightly lower than coefficients from measures taken closer in time, they still meet or exceed the 0.60 threshold, indicating a positive relationship between DORA and high-stakes statewide assessments even over an extended period. Concurrent and predictive validity coefficients for grades 3-8 were approximately 0.70, demonstrating strong alignment between DORA, a diagnostic reading assessment, and the SBA, a summative ELA state accountability assessment.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
Yes

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Non-SPED vs. SPED, by grade (n, r, sig):
Grade 3: Non-SPED n=1238, r=0.770, sig=.000; SPED n=153, r=0.587, sig=.000
Grade 4: Non-SPED n=2159, r=0.755, sig=.000; SPED n=269, r=0.668, sig=.000
Grade 5: Non-SPED n=1901, r=0.760, sig=.000; SPED n=202, r=0.632, sig=.000
Grade 6: Non-SPED n=1137, r=0.720, sig=.000; SPED n=117, r=0.709, sig=.000
Grade 7: Non-SPED n=1215, r=0.716, sig=.000; SPED n=107, r=0.578, sig=.000
Grade 8: Non-SPED n=1284, r=0.788, sig=.000; SPED n=123, r=0.628, sig=.000
Manual cites other published validity studies:
No
Provide citations for additional published studies.

Bias Analysis

Grade Kindergarten, Grade 1, Grade 2, Grade 3, Grade 4, Grade 5, Grade 6, Grade 7, Grade 8
Rating Not Provided (all grades)
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
Yes
If yes,
a. Describe the method used to determine the presence or absence of bias:
Analysis Method: We used Differential Item Functioning (DIF) analysis to check for bias. This involves comparing how different groups (such as males vs. females) of comparable overall ability respond to the same DORA questions, looking for any items that one group finds consistently easier or harder than the other, which might suggest bias.
b. Describe the subgroups for which bias analyses were conducted:
Gender (Female vs. Male) Race/Ethnicity (Black or African American and Latino vs. Caucasian) Language ability (English Learners vs. non–English Learners) Educational needs (Students with Disabilities vs. General Education students) Socioeconomic status (Economically Disadvantaged vs. Not Economically Disadvantaged)
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Our analysis used a random sample of students, with response data used to estimate the difficulty level of each item for each subgroup. We used statistical software tailored for DIF analysis in educational assessments to ensure precise results. The analysis identifies whether any particular question consistently shows a different difficulty level for any group, which would indicate bias.
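The specific DIF statistic is not named above. The Mantel-Haenszel procedure is a common choice for this kind of item-by-item, matched-group comparison, so the sketch below uses it as an assumed stand-in; all inputs are hypothetical:

    import numpy as np

    def mantel_haenszel_or(correct, focal, strata):
        # Mantel-Haenszel common odds ratio for one item.
        # correct: 1 if the item was answered correctly, else 0
        # focal:   1 for the focal group (e.g., female), 0 for reference
        # strata:  matched ability bands (e.g., total-score deciles)
        # An odds ratio near 1.0 suggests no DIF; large deviations flag
        # the item for content review.
        num = den = 0.0
        for s in np.unique(strata):
            m = strata == s
            a = np.sum((focal[m] == 0) & (correct[m] == 1))  # reference, correct
            b = np.sum((focal[m] == 0) & (correct[m] == 0))  # reference, incorrect
            c = np.sum((focal[m] == 1) & (correct[m] == 1))  # focal, correct
            d = np.sum((focal[m] == 1) & (correct[m] == 0))  # focal, incorrect
            n = a + b + c + d
            if n > 0:
                num += a * d / n
                den += b * c / n
        return num / den

    # Usage: mantel_haenszel_or(correct, focal, deciles) on per-item arrays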

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.