mCLASS: Reading 3D
Text Reading & Comprehension (TRC)
Summary
mCLASS:3D - TRC is a set of screening and progress monitoring measures for grades K-6. Text Reading and Comprehension (TRC) is an individually administered assessment using leveled readers from a book set to determine a student's instructional reading level. During this assessment, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions. Assessors observe and record the student's oral reading behaviors through the administration of TRC to determine reading accuracy, reading fluency, and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and the student's instructional reading level. While the student reads from the set of leveled readers, the teacher follows along on a handheld device, recording the student's performance as the child reads. The handheld software offers a pre-loaded class list indicating required assessment tasks, provides the teacher with directions and prompts to ensure standardized, accurate administration, and automates the precise timing requirements. Upon completion of each task, the handheld automatically calculates the student's score and provides a risk evaluation. Student performance data are securely and immediately transferred to the Web-based mCLASS reporting system. The mCLASS:3D Web site offers a range of reports at the district, school, class, and individual student level for further analysis. The set of measures in the screening is designed to be administered at the beginning, middle, and end of year, with alternate forms of all measures available for progress monitoring between screening windows.
- Where to Obtain:
- Amplify Education, Inc.
- support@amplify.com
- 55 Washington Street Suite 800 Brooklyn, NY 11201-1071
- (800) 823-1969
- https://www.amplify.com/
- Initial Cost:
- $20.90 per student
- Replacement Cost:
- $20.90 per student per year
- Included in Cost:
- The basic pricing plan is an annual per-student license of $20.90. For users already using an mCLASS assessment product, adding mCLASS:3D costs $6 per student.
- Training Requirements:
- 4-8 hours of training
- Qualified Administrators:
- Examiners must receive training in assessment administration and scoring.
- Access to Technical Support:
- Amplify's Customer Care Center offers complete user-level support from 7:00 a.m. to 7:00 p.m. EST, Monday through Friday. Customers may contact a customer support representative via telephone, e-mail, or electronically through the mCLASS website. Calls to the Customer Care Center's toll-free number are answered immediately by an automated attendant and routed to customer support agents according to regional expertise. Additionally, customers have self-service access to instructions, documents, and frequently asked questions on the website. The research staff and product teams are available to answer questions about the content within the assessments. Larger implementations have a designated account manager to support ongoing successful implementation.
- Assessment Format:
- One-to-one
- Scoring Time:
- Scoring is automatic
- Scores Generated:
- Raw score
- Developmental benchmarks
- Administration Time:
- 6 minutes per student
- Scoring Method:
- Automatically (computer-scored)
- Technology Requirements:
- Accommodations:
- mCLASS allows for administration of the TRC assessment using mobile devices and allows teachers to easily record student responses with a tap of a button, as well as other observations noticed during an assessment, for a deeper interpretation of students' skills. It has an embedded script of prompts and directions to ensure standardized administration so all students receive the same opportunity to perform. The mCLASS Platform provides a comprehensive service for managing the staff organizational structure and student enrollment data, providing online reporting and analysis tools for users in different roles from administrators to classroom teachers, and supporting the mobile assessment delivery system. It supports the Now What Tools, which make assessment results actionable for teachers by translating data into practical instructional support with tools for small-group instruction, item-level analysis, and parent letters. Educators and administrators can immediately access student data using reports designed to influence instruction and inform administrative decisions. mCLASS is an assessment instrument well suited to capturing the developing reading skills of students with disabilities, with a few exceptions: a) students who are deaf; b) students who have fluency-based speech disabilities, e.g., stuttering, oral apraxia; c) students who are learning to read in a language other than English or Spanish; d) students with severe disabilities. Use of mCLASS is appropriate for all other students, including those with disabilities receiving special education supports for whom reading connected text is an IEP goal. For students receiving special education, it may be necessary to adjust goals and timelines and to provide accommodations as part of the administration. The purpose of accommodation is to facilitate assessment for children for whom a standard administration may not provide an accurate estimate of their skills in the core early literacy skill areas. Valid and acceptable accommodations are ones that are unlikely to substantially change the meaning or interpretation of a student's scores. The valid and acceptable accommodations for TRC administration are available upon request.
Descriptive Information
- Please provide a description of your tool:
- mCLASS:3D - TRC is a set of screening and progress monitoring measures for grades K-6. Text Reading and Comprehension (TRC) is an individually administered assessment using leveled readers from a book set to determine a student's instructional reading level. During this assessment, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions. Assessors observe and record the student's oral reading behaviors through the administration of TRC to determine reading accuracy, reading fluency, and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and the student's instructional reading level. While the student reads from the set of leveled readers, the teacher follows along on a handheld device, recording the student's performance as the child reads. The handheld software offers a pre-loaded class list indicating required assessment tasks, provides the teacher with directions and prompts to ensure standardized, accurate administration, and automates the precise timing requirements. Upon completion of each task, the handheld automatically calculates the student's score and provides a risk evaluation. Student performance data are securely and immediately transferred to the Web-based mCLASS reporting system. The mCLASS:3D Web site offers a range of reports at the district, school, class, and individual student level for further analysis. The set of measures in the screening is designed to be administered at the beginning, middle, and end of year, with alternate forms of all measures available for progress monitoring between screening windows.
ACADEMIC ONLY: What skills does the tool screen?
- Please describe specific domain, skills or subtests:
- Concepts of print
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
Acquisition and Cost Information
Administration
- Are norms available?
- Yes
- Are benchmarks available?
- Yes
- If yes, how many benchmarks per year?
- 3 (BOY, MOY, EOY as defined by district user)
- If yes, for which months are benchmarks available?
- approximately September, January, May
- BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
- If yes, how many students can be rated concurrently?
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- 4-8 hours of training
- Please describe the minimum qualifications an administrator must possess.
- Examiners must receive training in assessment administration and scoring.
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- No
- If No, please describe training costs:
- For first-time mCLASS users, a 2-day in-person training is available for $575 in-office or $4,800 onsite for up to 25 participants.
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- Amplify's Customer Care Center offers complete user-level support from 7:00 a.m. to 7:00 p.m. EST, Monday through Friday. Customers may contact a customer support representative via telephone, e-mail, or electronically through the mCLASS website. Calls to the Customer Care Center's toll-free number are answered immediately by an automated attendant and routed to customer support agents according to regional expertise. Additionally, customers have self-service access to instructions, documents, and frequently asked questions on the website. The research staff and product teams are available to answer questions about the content within the assessments. Larger implementations have a designated account manager to support ongoing successful implementation.
Scoring
- Do you provide basis for calculating performance level scores?
- Yes
- Does your tool include decision rules?
- If yes, please describe.
- Can you provide evidence in support of multiple decision rules?
- No
- If yes, please describe.
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- Raw scores are provided as the student's reading level, categorized as a reading level from A through Z. A student's reading level is a composite of their reading accuracy and comprehension of the text. Cut points for determining reading level are provided. A student must reach both the accuracy and comprehension cut points in order for a book level to be determined as the student's reading level; a minimal sketch of this composite rule appears below. Developmental benchmarks for each measure, grade, and time of year (beginning, middle, end) classify each student's score as Above Proficient, Proficient, Below Proficient, or Far Below Proficient.
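To make the composite rule concrete, the following minimal Python sketch applies per-level accuracy and comprehension cut points to a student's results. The threshold values, data layout, and function names are hypothetical illustrations, not published TRC parameters.

```python
# Minimal sketch of the composite scoring rule described above. The accuracy
# and comprehension thresholds are hypothetical placeholders; actual TRC cut
# points are defined per book level and are not reproduced here.
ACCURACY_CUT = 0.90       # hypothetical proportion of words read correctly
COMPREHENSION_CUT = 4     # hypothetical minimum comprehension points

def meets_level(accuracy: float, comprehension: int) -> bool:
    """A book level counts only if BOTH cut points are reached."""
    return accuracy >= ACCURACY_CUT and comprehension >= COMPREHENSION_CUT

def instructional_level(results: dict[str, tuple[float, int]]) -> str | None:
    """Return the highest book level (A-Z) at which the student met both
    cut points, or None if no level was met."""
    passed = [level for level, (acc, comp) in results.items()
              if meets_level(acc, comp)]
    return max(passed) if passed else None

# Example: both cut points are met at levels D and E but not at F.
print(instructional_level({"D": (0.96, 5), "E": (0.92, 4), "F": (0.85, 3)}))  # E
```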
- Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
- TRC is a set of screening and progress monitoring measures for grades K-6. Text Reading and Comprehension (TRC) is an individually administered assessment using leveled readers from a book set to determine a student's instructional reading level. During this assessment, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions. Assessors observe and record the student's oral reading behaviors through the administration of TRC to determine reading accuracy, reading fluency, and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and the student's instructional reading level. The instructional reading level, a composite score of reading accuracy and comprehension, is used to classify students into one of four proficiency levels. The skills assessed in TRC include those skills that must be mastered by any student learning to read in English. The materials were subject to multiple rounds of review by content development experts and school-based professionals to ensure they were culturally relevant and free from bias. Field-testing data (including qualitative feedback from educators) also indicate a lack of bias in the results. The observational administration allows for flexibility with respect to issues of linguistic diversity through professional judgment based on the student's responses and prior knowledge of his/her speech patterns. A list of approved accommodations is available upon request.
Technical Standards
Classification Accuracy & Cross-Validation Summary
| Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Classification Accuracy Fall | | | | | | | |
| Classification Accuracy Winter | | | | | | | |
| Classification Accuracy Spring | | | | | | | |
DIBELS Next
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze (a maze task). An overall composite score is calculated based on a student's scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Wallin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure because the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments; DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY, respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving expected reading outcomes at the end of the school year, so a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year) based on students' scores on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year prediction on the external criterion measure.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, and Above). TRC cut points for these four performance levels were further examined and confirmed using the contrasting-groups method (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk status (Far Below Proficient/Below Proficient) using the contrasting-groups method is illustrated here. The primary specification for a TRC at-risk cut point is a level of skill below which students have low odds (≤20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting-groups method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark, which corresponds to approximately the 20th percentile, as at-risk; Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting-groups method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC, while students Below, At, or Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop; an illustrative sketch of such an analysis follows.
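As an illustration of the kind of analysis described above, the following Python sketch fits a logistic regression and computes ROC-based classification statistics on synthetic contrasting-group data. The score scale, group sizes, and cut point are assumptions for illustration only; they do not reproduce the TRC/DIBELS Next study.

```python
# Synthetic illustration of cut-point evaluation via logistic regression and
# ROC analysis. All values below (score scale, group sizes, cut point) are
# hypothetical and chosen only to make the example run.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Simulate screening scores for two contrasting groups defined by the EOY
# criterion: at-risk (e.g., Well Below Benchmark) vs. not at-risk.
n_risk, n_not = 400, 1600
scores = np.concatenate([rng.normal(8, 3, n_risk), rng.normal(14, 3, n_not)])
at_risk = np.concatenate([np.ones(n_risk), np.zeros(n_not)])

# Logistic regression: odds of at-risk status as a function of screening score.
model = LogisticRegression().fit(scores.reshape(-1, 1), at_risk)

# ROC analysis: lower screening scores indicate higher risk, so the negated
# score serves as the risk score.
auc = roc_auc_score(at_risk, -scores)

# Sensitivity and specificity at a hypothetical cut point (score < 10 flags risk).
cut = 10
flagged = scores < cut
sensitivity = flagged[at_risk == 1].mean()
specificity = (~flagged)[at_risk == 0].mean()
p_at_cut = model.predict_proba(np.array([[cut]]))[0, 1]
print(f"AUC={auc:.3f}  sensitivity={sensitivity:.3f}  "
      f"specificity={specificity:.3f}  P(at risk | score={cut})={p_at_cut:.2f}")
```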
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year; the total sample size ranges from 77,553 to 88,924 students across 7 geographical divisions of the United States. Specifics on whether students were receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
Cross-Validation
- Has a cross-validation study been conducted?
- Yes
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze (a maze task). An overall composite score is calculated based on a student's scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Wallin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure because the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments; DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY, respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving expected reading outcomes at the end of the school year, so a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year) based on students' scores on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year prediction on the external criterion measure.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, and Above). TRC cut points for these four performance levels were further examined and confirmed using the contrasting-groups method (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk status (Far Below Proficient/Below Proficient) using the contrasting-groups method is illustrated here. The primary specification for a TRC at-risk cut point is a level of skill below which students have low odds (≤20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting-groups method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark, which corresponds to approximately the 20th percentile, as at-risk; Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting-groups method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC, while students Below, At, or Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year; the total sample size ranges from 77,553 to 88,924 students across 7 geographical divisions of the United States. The cross-validation analysis is based on over 20,000 students in grades K-6 from seven states, selected using random cluster sampling (sketched below). Specifics on whether students were receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
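The following Python sketch illustrates random cluster sampling of the kind described above, treating schools as clusters so that every student in a selected school enters the holdout sample. The column names, file name, and cluster count are hypothetical assumptions, not details from the study.

```python
# Hypothetical illustration of random cluster sampling for cross-validation:
# whole schools (clusters) are drawn at random, and every student in a
# selected school enters the holdout sample. Column names are assumptions.
import pandas as pd

def cluster_sample(students: pd.DataFrame, cluster_col: str,
                   n_clusters: int, seed: int = 0) -> pd.DataFrame:
    """Keep all rows belonging to n_clusters randomly chosen clusters."""
    clusters = pd.Series(students[cluster_col].unique())
    chosen = clusters.sample(n=n_clusters, random_state=seed)
    return students[students[cluster_col].isin(chosen)]

# Usage with a hypothetical student-level table:
# students = pd.read_csv("trc_dibels_2018_19.csv")  # student_id, school_id, ...
# holdout = cluster_sample(students, "school_id", n_clusters=50)
```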
DIBELS Next
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming, Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze — a maze task. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Walin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments. DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving subsequent expected reading outcomes at the end of the school year, thus a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year), based on students’ score on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for: (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year on the external criterion measure.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, and Above) TRC Cut points for these four performance levels were further examined and confirmed using contrasting group methodology (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk (Far Below Proficient/Below Proficient) using a contrasting groups method is illustrated here. The primary specification for a TRC cut point for at-risk is a level of skill where students scoring below that level have low odds (<=20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting group method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark (which corresponds to approximately the 20th%ile as at-risk); Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting group method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC; while students Below, At/Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
-
Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year, the total sample size ranges from 77,553 to 88,924 students from 7 geographical divisions across the United States. Specifics on whether or not students are receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
Cross-Validation
- Has a cross-validation study been conducted?
-
Yes
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming, Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze — a maze task. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills.DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Walin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments. DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving subsequent expected reading outcomes at the end of the school year, thus a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year), based on students’ score on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for: (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year on the external criterion measure.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, Above) TRC Cut points for these four performance levels were further examined and confirmed using contrasting group methodology (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk (Far Below Proficient/Below Proficient) using a contrasting groups method is illustrated here. The primary specification for a TRC cut point for at-risk is a level of skill where students scoring below that level have low odds (<=20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting group method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark (which corresponds to approximately the 20th%ile as at-risk); Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting group method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC; while students Below, At/Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
-
Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year, the total sample size ranges from 77,553 to 88,924 students from 7 geographical divisions across the United States. The cross-validation analysis is based on over 20,000 students in grades K-6 from seven states using random cluster sampling. Specifics on whether or not students are receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
DIBELS Next
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming, Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze — a maze task. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Walin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments. DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving subsequent expected reading outcomes at the end of the school year, thus a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year), based on students’ score on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for: (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year on the external criterion measure.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, and Above) TRC Cut points for these four performance levels were further examined and confirmed using contrasting group methodology (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk (Far Below Proficient/Below Proficient) using a contrasting groups method is illustrated here. The primary specification for a TRC cut point for at-risk is a level of skill where students scoring below that level have low odds (<=20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting group method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark (which corresponds to approximately the 20th%ile as at-risk); Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting group method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC; while students Below, At/Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
-
Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year, the total sample size ranges from 77,553 to 88,924 students from 7 geographical divisions across the United States. Specifics on whether or not students are receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
Cross-Validation
- Has a cross-validation study been conducted?
-
Yes
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming, Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze — a maze task. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills.DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Walin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments. DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving subsequent expected reading outcomes at the end of the school year, thus a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year), based on students’ score on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for: (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year on the external criterion measure.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, Above) TRC Cut points for these four performance levels were further examined and confirmed using contrasting group methodology (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk (Far Below Proficient/Below Proficient) using a contrasting groups method is illustrated here. The primary specification for a TRC cut point for at-risk is a level of skill where students scoring below that level have low odds (<=20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting group method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark (which corresponds to approximately the 20th%ile as at-risk); Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting group method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC; while students Below, At/Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
-
Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year, the total sample size ranges from 77,553 to 88,924 students from 7 geographical divisions across the United States. The cross-validation analysis is based on over 20,000 students in grades K-6 from seven states using random cluster sampling. Specifics on whether or not students are receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
DIBELS Next
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming, Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze — a maze task. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Walin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments. DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving subsequent expected reading outcomes at the end of the school year, thus a predictive classification model was used to predict the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year), based on students’ score on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for: (a) beginning- to end-of-year, (b) middle- to end-of-year, and (c) end- to end-of-year on the external criterion measure.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for use with performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, and Above) TRC Cut points for these four performance levels were further examined and confirmed using contrasting group methodology (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm TRC cut points for at-risk (Far Below Proficient/Below Proficient) using a contrasting groups method is illustrated here. The primary specification for a TRC cut point for at-risk is a level of skill where students scoring below that level have low odds (<=20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting group method identifies students who are at-risk or non-risk readers on TRC according to the DIBELS Next composite score interpretation (i.e., Well Below Benchmark (which corresponds to approximately the 20th%ile as at-risk); Below Benchmark, At Benchmark, and Above Benchmark as not at-risk). The contrasting group method assumes that students performing Well Below Benchmark on the DIBELS Next composite also have a greater probability of being Far Below on TRC; while students Below, At/Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
-
Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year, the total sample size ranges from 77,553 to 88,924 students from 7 geographical divisions across the United States. Specifics on whether or not students are receiving additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
Cross-Validation
- Has a cross-validation study been conducted?
-
Yes
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- DIBELS Next (Dynamic Measurement Group, 2010) is a set of measures used to assess early literacy and reading skills, including phonemic awareness, basic phonics, accurate and fluent reading of connected text, and reading comprehension for students from kindergarten through Grade 6. It includes the following measures: Letter Naming, Fluency (LNF), First Sound Fluency (FSF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), Oral Reading Fluency (DORF), and Daze — a maze task. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills.DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good, Kaminski, Dewey, Walin, Powell-Smith, & Latimer, 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments. DIBELS Next was developed independently by researchers at the University of Oregon.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- TRC and DIBELS Next assessments were administered during the 2018-2019 beginning-, middle-, and end-of-year (BOY, MOY, and EOY, respectively) benchmark administration periods. TRC benchmark goals represent the likelihood of achieving expected reading outcomes at the end of the school year, so a predictive classification model was used to estimate the odds of scoring at or above benchmark on the criterion (i.e., DIBELS Next at the end of the year) based on students’ scores on the predictor. More specifically, for each grade level, an analysis of classification accuracy is provided for (a) beginning-of-year, (b) middle-of-year, and (c) end-of-year TRC scores against the end-of-year external criterion measure.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points for Kindergarten through Grade 6 were established for TRC during a standard-setting workshop convened in Brooklyn, New York, on April 12–13, 2014. The Item Descriptor (ID) Matching method (Ferrara & Lewis, 2012), a standard-setting procedure appropriate for performance-based assessments that yield categorical results (e.g., Below Proficient, Proficient) such as TRC, was used to evaluate the TRC assessment content against the Common Core State Standards for English Language Arts (CCSS for ELA; National Governors Association Center for Best Practices & Council of Chief State School Officers, 2010) to determine performance standards and cut points for the four performance levels represented in TRC (i.e., Far Below, Below, Proficient, and Above). TRC cut points for these four performance levels were further examined and confirmed using contrasting-groups methodology (Cizek & Bunch, 2007) with composite score interpretations resulting from administration of DIBELS Next at EOY in the 2018-2019 school year. The process used to confirm the TRC cut point for at-risk status (Far Below Proficient/Below Proficient) using a contrasting-groups method is illustrated here. The primary specification for a TRC at-risk cut point is a level of skill below which students have low odds (<=20%) of making adequate reading progress unless provided with additional, intensive support. The contrasting-groups method classifies students as at-risk or not-at-risk readers according to the DIBELS Next composite score interpretation: Well Below Benchmark (corresponding to approximately the 20th percentile) is treated as at-risk, while Below Benchmark, At Benchmark, and Above Benchmark are treated as not at-risk. The method assumes that students performing Well Below Benchmark on the DIBELS Next composite have a greater probability of being Far Below on TRC, while students performing Below, At, or Above Benchmark on the DIBELS Next composite have a lower probability of being Far Below on TRC. Logistic regression and Receiver Operating Characteristic (ROC) analyses were conducted to examine the accuracy, sensitivity, specificity, and logistic prediction results for the cut points established at the TRC standard-setting workshop.
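Complementing the earlier sketch, the <=20% specification implies a cut point at the score where the fitted logistic curve for adequate progress crosses a predicted probability of 0.20. A minimal sketch, again on simulated data with hypothetical names (`score`, `adequate`):

```r
# Solve plogis(b0 + b1 * x) = 0.20 for x: below this score, predicted odds
# of adequate reading progress fall under 20%. Simulated data throughout.
set.seed(1)
score    <- runif(800, 0, 20)                          # hypothetical screener scores
adequate <- rbinom(800, 1, plogis(-2 + 0.35 * score))  # simulated outcome

fit <- glm(adequate ~ score, family = binomial)
b   <- coef(fit)
cut_point <- (qlogis(0.20) - b[1]) / b[2]
unname(cut_point)
```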
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- The sample was selected from the mCLASS data system user base throughout the 2018-2019 school year; the total sample size ranges from 77,553 to 88,924 students across seven geographic divisions of the United States. The cross-validation analysis is based on more than 20,000 students in Grades K-6 from seven states, selected by random cluster sampling. Specifics on whether students received additional intervention are unknown; however, it is expected that many of the students were receiving additional intervention.
Classification Accuracy - Fall
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Criterion measure | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 89 | 111 | 180 | 280 | 330 | 340 | 324 |
Cut Points - Corresponding performance score (numeric) on screener measure | Text Level PC | Text Level B | Text Level F | Text Level K | Text Level M | Text Level Q | Text Level T |
Classification Data - True Positive (a) | 36 | 4711 | 3591 | 2264 | 1240 | 1176 | 120 |
Classification Data - False Positive (b) | 54 | 3462 | 1982 | 1256 | 943 | 1047 | 148 |
Classification Data - False Negative (c) | 46 | 984 | 394 | 437 | 328 | 221 | 17 |
Classification Data - True Negative (d) | 689 | 16396 | 14617 | 6247 | 3224 | 2579 | 287 |
Area Under the Curve (AUC) | 0.83 | 0.90 | 0.95 | 0.91 | 0.87 | 0.87 | 0.87 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.79 | 0.89 | 0.95 | 0.91 | 0.86 | 0.86 | 0.83 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.87 | 0.90 | 0.96 | 0.92 | 0.89 | 0.88 | 0.91 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Base Rate | 0.10 | 0.22 | 0.19 | 0.26 | 0.27 | 0.28 | 0.24 |
Overall Classification Rate | 0.88 | 0.83 | 0.88 | 0.83 | 0.78 | 0.75 | 0.71 |
Sensitivity | 0.44 | 0.83 | 0.90 | 0.84 | 0.79 | 0.84 | 0.88 |
Specificity | 0.93 | 0.83 | 0.88 | 0.83 | 0.77 | 0.71 | 0.66 |
False Positive Rate | 0.07 | 0.17 | 0.12 | 0.17 | 0.23 | 0.29 | 0.34 |
False Negative Rate | 0.56 | 0.17 | 0.10 | 0.16 | 0.21 | 0.16 | 0.12 |
Positive Predictive Power | 0.40 | 0.58 | 0.64 | 0.64 | 0.57 | 0.53 | 0.45 |
Negative Predictive Power | 0.94 | 0.94 | 0.97 | 0.93 | 0.91 | 0.92 | 0.94 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Date | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 |
Sample Size | 825 | 25553 | 20584 | 10204 | 5735 | 5023 | 572 |
Geographic Representation | East North Central (IN) Middle Atlantic (NY) New England (CT) Pacific (CA) South Atlantic (DC, GA, WV) | East North Central (IN, OH) East South Central (TN) Middle Atlantic (NY) Mountain (UT, WY) New England (CT, MA) Pacific (AK, CA, WA) South Atlantic (DC, GA, WV) West South Central (AR) | East North Central (IN, OH) Middle Atlantic (NY) Mountain (UT, WY) New England (CT, MA) Pacific (AK, CA, WA) South Atlantic (DC, GA, WV) West South Central (AR) | East North Central (IN, OH) Middle Atlantic (NY) Mountain (UT) New England (CT, MA) Pacific (CA, WA) South Atlantic (DC, WV) | East North Central (IN) Middle Atlantic (NY) New England (CT, MA) Pacific (CA) South Atlantic (DC, WV) | East North Central (IN) Middle Atlantic (NY) Mountain (UT) New England (CT, MA) Pacific (CA, WA) South Atlantic (DC, WV) West North Central (MO) | East North Central (IN) Middle Atlantic (NY) New England (CT) Pacific (CA, WA) South Atlantic (DC, GA) |
Male | 49.7% | 44.9% | 46.1% | 50.5% | 51.7% | 52.7% | 47.6% |
Female | 38.2% | 42.7% | 43.3% | 46.6% | 47.4% | 46.8% | 44.9% |
Other | |||||||
Gender Unknown | 12.1% | 12.4% | 10.6% | 2.9% | 0.9% | 0.5% | 7.5% |
White, Non-Hispanic | 13.6% | 21.1% | 21.6% | 14.6% | 7.0% | 4.2% | 4.7% |
Black, Non-Hispanic | 13.8% | 14.2% | 16.6% | 15.2% | 11.9% | 11.0% | 3.3% |
Hispanic | 38.3% | 25.8% | 29.1% | 55.2% | 73.3% | 77.6% | 78.0% |
Asian/Pacific Islander | |||||||
American Indian/Alaska Native | 0.2% | 2.1% | 0.1% | 0.1% | 0.1% | 0.1% | 0.2% |
Other | 8.1% | 10.2% | 10.9% | 6.1% | 3.2% | 4.3% | 2.6% |
Race / Ethnicity Unknown | 25.9% | 26.7% | 21.7% | 8.8% | 4.4% | 2.8% | 11.2% |
Low SES | 10.8% | 16.8% | 15.7% | 5.9% | 6.6% | 5.7% | |
IEP or diagnosed disability | 10.8% | 6.6% | 8.4% | 13.6% | 19.3% | 22.2% | 22.7% |
English Language Learner | 18.3% | 12.9% | 12.4% | 19.1% | 34.6% | 33.9% | 26.9% |
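The statistics reported above follow directly from the four classification counts in each column. As a worked check, the kindergarten Fall column can be recomputed from its cell counts (a minimal sketch in R):

```r
# Reproduce the kindergarten Fall statistics from the 2x2 counts above
a <- 36; b <- 54; c <- 46; d <- 689    # TP, FP, FN, TN
n <- a + b + c + d
round(c(
  base_rate      = (a + c) / n,        # 0.10
  overall_rate   = (a + d) / n,        # 0.88
  sensitivity    = a / (a + c),        # 0.44
  specificity    = d / (b + d),        # 0.93
  false_pos_rate = b / (b + d),        # 0.07
  false_neg_rate = c / (a + c),        # 0.56
  pos_pred_power = a / (a + b),        # 0.40
  neg_pred_power = d / (c + d)         # 0.94
), 2)
```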
Classification Accuracy - Winter
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Criterion measure | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 89 | 111 | 180 | 280 | 330 | 340 | 324 |
Cut Points - Corresponding performance score (numeric) on screener measure | Text Level A | Text Level D | Text Level I | Text Level L | Text Level O | Text Level R | Text Level V |
Classification Data - True Positive (a) | 112 | 5296 | 3719 | 2275 | 1342 | 1085 | 111 |
Classification Data - False Positive (b) | 277 | 3086 | 2023 | 938 | 987 | 695 | 167 |
Classification Data - False Negative (c) | 11 | 646 | 379 | 566 | 270 | 276 | 12 |
Classification Data - True Negative (d) | 725 | 15845 | 15510 | 6670 | 3168 | 2660 | 304 |
Area Under the Curve (AUC) | 0.88 | 0.95 | 0.96 | 0.92 | 0.88 | 0.89 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.86 | 0.95 | 0.96 | 0.91 | 0.87 | 0.88 | 0.86 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.90 | 0.95 | 0.96 | 0.92 | 0.89 | 0.90 | 0.93 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Base Rate | 0.11 | 0.24 | 0.19 | 0.27 | 0.28 | 0.29 | 0.21 |
Overall Classification Rate | 0.74 | 0.85 | 0.89 | 0.86 | 0.78 | 0.79 | 0.70 |
Sensitivity | 0.91 | 0.89 | 0.91 | 0.80 | 0.83 | 0.80 | 0.90 |
Specificity | 0.72 | 0.84 | 0.88 | 0.88 | 0.76 | 0.79 | 0.65 |
False Positive Rate | 0.28 | 0.16 | 0.12 | 0.12 | 0.24 | 0.21 | 0.35 |
False Negative Rate | 0.09 | 0.11 | 0.09 | 0.20 | 0.17 | 0.20 | 0.10 |
Positive Predictive Power | 0.29 | 0.63 | 0.65 | 0.71 | 0.58 | 0.61 | 0.40 |
Negative Predictive Power | 0.99 | 0.96 | 0.98 | 0.92 | 0.92 | 0.91 | 0.96 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Date | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 |
Sample Size | 1125 | 24873 | 21631 | 10449 | 5767 | 4716 | 594 |
Geographic Representation | East North Central (IN) Middle Atlantic (NY) New England (CT) Pacific (CA) South Atlantic (DC, GA, WV) | East North Central (IN, OH) East South Central (TN) Middle Atlantic (NY) Mountain (UT, WY) New England (CT) Pacific (AK, CA, WA) South Atlantic (DC, GA, WV) West North Central (MO) West South Central (AR) | East North Central (IN, OH) Middle Atlantic (NY) Mountain (UT, WY) New England (CT, MA) Pacific (AK, CA, WA) South Atlantic (DC, GA, WV) West North Central (MO) West South Central (AR) | East North Central (IN, OH) Middle Atlantic (NY) Mountain (UT) New England (CT) Pacific (CA) South Atlantic (DC, WV) | East North Central (IN) Middle Atlantic (NY) New England (CT, MA) Pacific (CA, WA) South Atlantic (DC, WV) | East North Central (IN) Middle Atlantic (NY) Mountain (UT) New England (CT) Pacific (CA) South Atlantic (DC) West North Central (MO) | East North Central (IN) Middle Atlantic (NY) Mountain (CO) Pacific (CA) West North Central (MO) West South Central (LA) |
Male | 52.0% | 45.3% | 46.2% | 50.6% | 51.9% | 53.3% | 48.3% |
Female | 38.6% | 42.8% | 43.5% | 46.4% | 47.1% | 46.2% | 45.6% |
Other | |||||||
Gender Unknown | 9.4% | 11.9% | 10.3% | 3.0% | 1.0% | 0.5% | 6.1% |
White, Non-Hispanic | 13.0% | 19.9% | 20.3% | 13.8% | 6.5% | 4.1% | 4.2% |
Black, Non-Hispanic | 12.6% | 14.3% | 16.5% | 15.5% | 12.4% | 11.4% | 3.9% |
Hispanic | 47.2% | 27.0% | 31.0% | 56.1% | 72.8% | 76.8% | 79.8% |
Asian/Pacific Islander | |||||||
American Indian/Alaska Native | 0.2% | 2.1% | 0.1% | 0.1% | 0.1% | 0.1% | 0.2% |
Other | 7.3% | 10.4% | 10.8% | 6.0% | 3.3% | 4.2% | 2.5% |
Race / Ethnicity Unknown | 19.7% | 26.3% | 21.3% | 8.5% | 4.9% | 3.4% | 9.4% |
Low SES | 9.5% | 16.9% | 15.4% | 6.0% | 6.7% | 6.3% | |
IEP or diagnosed disability | 10.8% | 7.0% | 8.2% | 13.7% | 19.9% | 23.2% | 22.6% |
English Language Learner | 24.6% | 13.9% | 13.1% | 19.9% | 34.8% | 34.8% | 26.3% |
Classification Accuracy - Spring
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Criterion measure | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 89 | 111 | 180 | 280 | 330 | 340 | 324 |
Cut Points - Corresponding performance score (numeric) on screener measure | Text Level B | Text Level F | Text Level K | Text Level M | Text Level Q | Text Level T | Text Level W |
Classification Data - True Positive (a) | 121 | 3499 | 3179 | 2255 | 1400 | 1026 | 97 |
Classification Data - False Positive (b) | 258 | 1085 | 1562 | 1005 | 1069 | 710 | 88 |
Classification Data - False Negative (c) | 11 | 641 | 493 | 728 | 253 | 253 | 10 |
Classification Data - True Negative (d) | 868 | 12833 | 14078 | 7193 | 3444 | 2567 | 219 |
Area Under the Curve (AUC) | 0.88 | 0.95 | 0.96 | 0.92 | 0.88 | 0.89 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.86 | 0.95 | 0.96 | 0.91 | 0.87 | 0.88 | 0.86 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.90 | 0.95 | 0.96 | 0.92 | 0.89 | 0.90 | 0.93 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Base Rate | 0.10 | 0.23 | 0.19 | 0.27 | 0.27 | 0.28 | 0.26 |
Overall Classification Rate | 0.79 | 0.90 | 0.89 | 0.85 | 0.79 | 0.79 | 0.76 |
Sensitivity | 0.92 | 0.85 | 0.87 | 0.76 | 0.85 | 0.80 | 0.91 |
Specificity | 0.77 | 0.92 | 0.90 | 0.88 | 0.76 | 0.78 | 0.71 |
False Positive Rate | 0.23 | 0.08 | 0.10 | 0.12 | 0.24 | 0.22 | 0.29 |
False Negative Rate | 0.08 | 0.15 | 0.13 | 0.24 | 0.15 | 0.20 | 0.09 |
Positive Predictive Power | 0.32 | 0.76 | 0.67 | 0.69 | 0.57 | 0.59 | 0.52 |
Negative Predictive Power | 0.99 | 0.95 | 0.97 | 0.91 | 0.93 | 0.91 | 0.96 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Date | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 |
Sample Size | 1258 | 18058 | 19312 | 11181 | 6166 | 4556 | 414 |
Geographic Representation | East North Central (IN) Middle Atlantic (NY) New England (CT) Pacific (CA) South Atlantic (DC, GA, WV) | East North Central (IN, OH) East South Central (TN) Mountain (UT, WY) New England (CT) Pacific (AK, CA, WA) South Atlantic (DC, GA, WV) West North Central (MO) | East North Central (IN, OH) Middle Atlantic (NY) Mountain (UT, WY) New England (CT, MA) Pacific (CA, WA) South Atlantic (DC, GA, WV) West South Central (AR) | East North Central (IN, OH) Middle Atlantic (NY) Mountain (UT, WY) New England (CT, MA) Pacific (CA, WA) South Atlantic (DC, GA, WV) West South Central (AR) | East North Central (IN) Middle Atlantic (NY) Mountain (WY) New England (CT) Pacific (CA) South Atlantic (DC, WV) West North Central (MO) | Middle Atlantic (NY) Mountain (UT) New England (CT) Pacific (CA) South Atlantic (DC) West North Central (MO) | East North Central (IN) Middle Atlantic (NY) Mountain (CO) Pacific (CA) West North Central (MO) West South Central (LA) |
Male | 52.5% | 44.5% | 46.6% | 50.3% | 51.5% | 52.2% | 46.1% |
Female | 39.7% | 42.1% | 43.7% | 46.7% | 47.6% | 47.2% | 44.9% |
Other | |||||||
Gender Unknown | 7.9% | 13.4% | 9.7% | 2.9% | 0.9% | 0.6% | 8.9% |
White, Non-Hispanic | 12.2% | 23.5% | 21.1% | 14.2% | 6.6% | 4.2% | 5.6% |
Black, Non-Hispanic | 12.2% | 13.2% | 16.0% | 15.6% | 11.8% | 11.3% | 3.1% |
Hispanic | 51.4% | 23.9% | 31.3% | 55.8% | 73.5% | 76.2% | 74.2% |
Asian/Pacific Islander | |||||||
American Indian/Alaska Native | 0.3% | 2.5% | 0.1% | 0.1% | 0.1% | 0.1% | 0.5% |
Other | 6.6% | 9.3% | 10.8% | 6.1% | 3.2% | 4.3% | 3.1% |
Race / Ethnicity Unknown | 17.2% | 27.5% | 20.7% | 8.3% | 4.8% | 3.9% | 13.5% |
Low SES | 8.6% | 16.1% | 15.4% | 6.0% | 6.6% | 6.8% | |
IEP or diagnosed disability | 11.4% | 6.4% | 8.3% | 13.6% | 18.7% | 21.2% | 28.3% |
English Language Learner | 27.2% | 12.3% | 13.5% | 18.9% | 33.7% | 33.0% | 33.1% |
Cross-Validation - Fall
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Criterion measure | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 89 | 111 | 180 | 280 | 330 | 340 | 324 |
Cut Points - Corresponding performance score (numeric) on screener measure | Text Level PC | Text Level B | Text Level F | Text Level K | Text Level M | Text Level Q | Text Level T |
Classification Data - True Positive (a) | 11 | 2847 | 2639 | 265 | 134 | 135 | 75 |
Classification Data - False Positive (b) | 22 | 1196 | 570 | 89 | 37 | 69 | 76 |
Classification Data - False Negative (c) | 15 | 449 | 169 | 82 | 116 | 42 | 9 |
Classification Data - True Negative (d) | 296 | 4849 | 4194 | 1119 | 436 | 318 | 157 |
Area Under the Curve (AUC) | 0.86 | 0.90 | 0.97 | 0.92 | 0.85 | 0.87 | 0.88 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.80 | 0.89 | 0.97 | 0.90 | 0.82 | 0.84 | 0.83 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.92 | 0.90 | 0.97 | 0.94 | 0.88 | 0.90 | 0.93 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Base Rate | 0.08 | 0.35 | 0.37 | 0.22 | 0.35 | 0.31 | 0.26 |
Overall Classification Rate | 0.89 | 0.82 | 0.90 | 0.89 | 0.79 | 0.80 | 0.73 |
Sensitivity | 0.42 | 0.86 | 0.94 | 0.76 | 0.54 | 0.76 | 0.89 |
Specificity | 0.93 | 0.80 | 0.88 | 0.93 | 0.92 | 0.82 | 0.67 |
False Positive Rate | 0.07 | 0.20 | 0.12 | 0.07 | 0.08 | 0.18 | 0.33 |
False Negative Rate | 0.58 | 0.14 | 0.06 | 0.24 | 0.46 | 0.24 | 0.11 |
Positive Predictive Power | 0.33 | 0.70 | 0.82 | 0.75 | 0.78 | 0.66 | 0.50 |
Negative Predictive Power | 0.95 | 0.92 | 0.96 | 0.93 | 0.79 | 0.88 | 0.95 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Date | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 |
Sample Size | 344 | 9341 | 7572 | 1555 | 723 | 564 | 317 |
Geographic Representation | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IN) Mountain (CO) West South Central (TX) | East North Central (IN) Mountain (CO) West South Central (TX) | Mountain (UT) Pacific (CA) South Atlantic (NC) |
Male | 50.6% | 51.3% | 51.9% | 51.8% | 41.4% | 50.5% | 53.9% |
Female | 29.4% | 47.9% | 47.1% | 46.7% | 42.6% | 42.9% | 44.5% |
Other | |||||||
Gender Unknown | 20.1% | 0.8% | 1.0% | 1.5% | 16.0% | 6.6% | 1.6% |
White, Non-Hispanic | 28.5% | 23.1% | 24.2% | 55.4% | 47.9% | 57.6% | 5.7% |
Black, Non-Hispanic | 10.5% | 32.6% | 31.2% | 3.7% | 1.1% | 0.9% | 3.8% |
Hispanic | 11.0% | 31.1% | 31.6% | 31.0% | 30.0% | 30.7% | 86.8% |
Asian/Pacific Islander | |||||||
American Indian/Alaska Native | 0.9% | 0.4% | 0.4% | 0.6% | 0.4% | 0.7% | 0.6% |
Other | 7.3% | 4.9% | 5.4% | 7.1% | 4.4% | 3.4% | 1.3% |
Race / Ethnicity Unknown | 41.9% | 7.9% | 7.3% | 2.3% | 16.2% | 6.7% | 1.9% |
Low SES | 30.2% | 58.7% | 57.9% | 7.5% | 20.1% | 33.7% | |
IEP or diagnosed disability | 15.1% | 13.0% | 16.3% | 14.8% | 18.3% | 24.8% | 31.9% |
English Language Learner | 8.4% | 5.1% | 5.3% | 16.1% | 18.9% | 18.6% | 26.2% |
Cross-Validation - Winter
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Criterion measure | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 89 | 111 | 180 | 280 | 330 | 340 | 324 |
Cut Points - Corresponding performance score (numeric) on screener measure | Text Level A | Text Level D | Text Level I | Text Level L | Text Level O | Text Level R | Text Level V |
Classification Data - True Positive (a) | 34 | 3117 | 2575 | 214 | 137 | 122 | 62 |
Classification Data - False Positive (b) | 79 | 840 | 542 | 64 | 63 | 60 | 85 |
Classification Data - False Negative (c) | 4 | 284 | 179 | 80 | 101 | 48 | 12 |
Classification Data - True Negative (d) | 241 | 4703 | 4209 | 955 | 451 | 303 | 164 |
Area Under the Curve (AUC) | 0.88 | 0.95 | 0.96 | 0.92 | 0.88 | 0.89 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.86 | 0.95 | 0.96 | 0.91 | 0.87 | 0.88 | 0.86 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.90 | 0.95 | 0.96 | 0.92 | 0.89 | 0.90 | 0.93 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Base Rate | 0.11 | 0.38 | 0.37 | 0.22 | 0.32 | 0.32 | 0.23 |
Overall Classification Rate | 0.77 | 0.87 | 0.90 | 0.89 | 0.78 | 0.80 | 0.70 |
Sensitivity | 0.89 | 0.92 | 0.94 | 0.73 | 0.58 | 0.72 | 0.84 |
Specificity | 0.75 | 0.85 | 0.89 | 0.94 | 0.88 | 0.83 | 0.66 |
False Positive Rate | 0.25 | 0.15 | 0.11 | 0.06 | 0.12 | 0.17 | 0.34 |
False Negative Rate | 0.11 | 0.08 | 0.06 | 0.27 | 0.42 | 0.28 | 0.16 |
Positive Predictive Power | 0.30 | 0.79 | 0.83 | 0.77 | 0.69 | 0.67 | 0.42 |
Negative Predictive Power | 0.98 | 0.94 | 0.96 | 0.92 | 0.82 | 0.86 | 0.93 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Date | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 |
Sample Size | 358 | 8944 | 7505 | 1313 | 752 | 533 | 323 |
Geographic Representation | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IN) Mountain (CO) West South Central (TX) | Mountain (UT) Pacific (CA) South Atlantic (NC) |
Male | 51.4% | 51.5% | 51.6% | 52.4% | 38.6% | 47.8% | 54.2% |
Female | 29.1% | 47.6% | 47.4% | 46.3% | 41.0% | 39.4% | 44.3% |
Other | |||||||
Gender Unknown | 19.6% | 0.9% | 0.9% | 1.3% | 20.5% | 12.8% | 1.5% |
White, Non-Hispanic | 25.1% | 20.9% | 21.6% | 55.1% | 50.9% | 54.4% | 7.4% |
Black, Non-Hispanic | 10.6% | 34.0% | 32.7% | 5.2% | 1.3% | 1.1% | 3.1% |
Hispanic | 12.6% | 31.4% | 32.6% | 30.8% | 23.0% | 29.1% | 85.8% |
Asian/Pacific Islander | |||||||
American Indian/Alaska Native | 0.8% | 0.4% | 0.3% | 0.4% | 0.4% | 0.9% | 0.3% |
Other | 8.4% | 5.0% | 5.4% | 6.5% | 3.9% | 1.7% | 1.5% |
Race / Ethnicity Unknown | 42.5% | 8.3% | 7.3% | 1.9% | 20.5% | 12.8% | 1.9% |
Low SES | 29.3% | 61.5% | 61.0% | 7.2% | 15.3% | 33.6% | |
IEP or diagnosed disability | 16.2% | 13.6% | 16.9% | 16.8% | 16.1% | 27.6% | 31.9% |
English Language Learner | 7.5% | 4.6% | 5.4% | 15.5% | 14.1% | 15.9% | 24.1% |
Cross-Validation - Spring
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Criterion measure | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next | DIBELS Next |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | 89 | 111 | 180 | 280 | 330 | 340 | 324 |
Cut Points - Corresponding performance score (numeric) on screener measure | Text Level B | Text Level F | Text Level K | Text Level M | Text Level Q | Text Level T | Text Level W |
Classification Data - True Positive (a) | 35 | 1130 | 1656 | 250 | 138 | 128 | 60 |
Classification Data - False Positive (b) | 68 | 229 | 458 | 61 | 106 | 58 | 53 |
Classification Data - False Negative (c) | 4 | 203 | 199 | 71 | 37 | 42 | 12 |
Classification Data - True Negative (d) | 262 | 4264 | 4571 | 1378 | 598 | 378 | 144 |
Area Under the Curve (AUC) | 0.93 | 0.97 | 0.97 | 0.95 | 0.88 | 0.89 | 0.84 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.90 | 0.97 | 0.96 | 0.94 | 0.85 | 0.86 | 0.79 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.96 | 0.98 | 0.97 | 0.96 | 0.91 | 0.92 | 0.90 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Base Rate | 0.11 | 0.23 | 0.27 | 0.18 | 0.20 | 0.28 | 0.27 |
Overall Classification Rate | 0.80 | 0.93 | 0.90 | 0.93 | 0.84 | 0.83 | 0.76 |
Sensitivity | 0.90 | 0.85 | 0.89 | 0.78 | 0.79 | 0.75 | 0.83 |
Specificity | 0.79 | 0.95 | 0.91 | 0.96 | 0.85 | 0.87 | 0.73 |
False Positive Rate | 0.21 | 0.05 | 0.09 | 0.04 | 0.15 | 0.13 | 0.27 |
False Negative Rate | 0.10 | 0.15 | 0.11 | 0.22 | 0.21 | 0.25 | 0.17 |
Positive Predictive Power | 0.34 | 0.83 | 0.78 | 0.80 | 0.57 | 0.69 | 0.53 |
Negative Predictive Power | 0.98 | 0.95 | 0.96 | 0.95 | 0.94 | 0.90 | 0.92 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Date | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 | 2018-2019 |
Sample Size | 369 | 5826 | 6884 | 1760 | 879 | 606 | 269 |
Geographic Representation | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IL, IN) Mountain (CO) West South Central (TX) | East North Central (IN) Mountain (CO) West South Central (TX) | Mountain (UT) Pacific (CA) South Atlantic (NC) |
Male | 48.2% | 50.7% | 51.4% | 51.3% | 40.8% | 48.8% | 52.8% |
Female | 31.2% | 48.1% | 47.6% | 47.3% | 43.1% | 40.3% | 45.0% |
Other | |||||||
Gender Unknown | 20.6% | 1.2% | 1.0% | 1.4% | 16.0% | 10.9% | 2.2% |
White, Non-Hispanic | 28.5% | 35.3% | 28.6% | 55.1% | 50.3% | 54.6% | 5.6% |
Black, Non-Hispanic | 10.3% | 19.7% | 26.7% | 3.9% | 1.3% | 0.7% | 3.3% |
Hispanic | 12.2% | 27.7% | 30.6% | 28.5% | 27.8% | 29.9% | 86.2% |
Asian/Pacific Islander | |||||||
American Indian/Alaska Native | 0.8% | 0.5% | 0.5% | 0.6% | 0.5% | 0.8% | 0.4% |
Other | 7.0% | 6.1% | 5.9% | 6.5% | 4.2% | 3.1% | 1.9% |
Race / Ethnicity Unknown | 41.2% | 10.7% | 7.7% | 5.4% | 16.0% | 10.9% | 2.6% |
Low SES | 27.6% | 41.4% | 52.1% | 5.9% | 16.8% | 32.2% | |
IEP or diagnosed disability | 16.8% | 11.2% | 15.6% | 13.8% | 16.7% | 24.3% | 36.4% |
English Language Learner | 7.9% | 7.4% | 5.6% | 14.3% | 18.0% | 18.0% | 27.1% |
Reliability
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Rating | | | | | | | |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- Internal consistency and alternate-form reliability: Internal consistency reliability refers to the degree of confidence in the precision of scores from a single measurement; it is used to indicate the proportion of variation in test scores attributable to measurement error. Alternate-form reliability indicates the extent to which test results generalize to different forms: alternate forms of the test with different items should yield approximately the same scores. There are two to three books per text level in TRC, and books at the same text level are considered alternate forms. An individual student’s performance on alternate books at the same text level should yield approximately the same scores on oral reading accuracy, comprehension, and/or retell/recall, as well as overall book performance. Inter-rater reliability evidence for Grades K through 6: In observational assessments such as TRC, it is important that student performance be unrelated to, and unaffected by, the specific test administrator. Because there is a degree of subjectivity in scoring the accuracy of oral reading and comprehension, it is important to examine the degree to which TRC administrators can score student reading in a standardized and consistent manner. The sources of error associated with inter-rater reliability lie in the assessor.
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- The internal consistency reliability was computed using TRC data from the 2016-2017 school year in 9 regions and 19 states, with a total sample size of 2,513. The sample comprised 45% female and 46% male students; students were identified as 23% White, 25% Black or African-American, and 20% Hispanic-Latino, and 39% of students were eligible for free or reduced-price lunch. Data from two samples were collected to provide evidence for alternate-form reliability. The first sample contains 33 students from kindergarten to Grade 5: 8 from kindergarten, 10 from Grade 1, four from Grade 2, four from Grade 3, two from Grade 4, and five from Grade 5. The sample was 39 percent female and 61 percent male; 9 percent White, 21 percent Hispanic, 67 percent Black, and 3 percent other races. These 33 students were assessed at two schools in two Southern states during the 2013-2014 end-of-year benchmark administration period. The second sample includes 40 students in Grades 4-6 from two schools in two Southern states, assessed during the 2014-2015 middle-of-year benchmark administration period; it is composed of students in Grade 4 (n = 15), Grade 5 (n = 15), and Grade 6 (n = 10). In that sample, 39 percent of the students were female and 61 percent male; 67 percent of students were Black, 21 percent Hispanic, 9 percent White, and 3 percent of other ethnicity. Inter-rater reliability evidence was obtained from two studies. STUDY 1: Three raters assessed 33 students from two schools in two Southern states during the 2013-2014 end-of-year benchmark administration period: 8 from kindergarten, 10 from Grade 1, four from Grade 2, four from Grade 3, two from Grade 4, and five from Grade 5. The sample was 39 percent female and 61 percent male; 9 percent White, 21 percent Hispanic, 67 percent Black, and 3 percent other races. STUDY 2: Another study was conducted between the 2014-2015 beginning-of-year and middle-of-year benchmark administration periods. In total, four raters assessed 40 students from two schools in two Southern states during the 2014-2015 MOY benchmark administration period. The sample is composed of students in Grade 4 (n = 15), Grade 5 (n = 15), and Grade 6 (n = 10); 39 percent of the students were female and 61 percent male; 67 percent of students were Black, 21 percent Hispanic, 9 percent White, and 3 percent of other ethnicity.
- *Describe the analysis procedures for each reported type of reliability.
- Cronbach’s alpha is used as the indicator of internal consistency; it quantifies the degree to which the items on an assessment all measure the same underlying construct. To avoid missing responses, the students in each grade who are reading at-grade-proficient text-level books are used to compute Cronbach’s alpha. The 95% confidence interval for Cronbach’s alpha is computed using the bootstrap method: 1,000 samples are drawn with replacement from the data, alpha is calculated for each sample, and the 2.5% and 97.5% quantiles are computed. For alternate-form reliability, students were each assessed on two books at their instructional reading level and two books one level below their instructional reading level. Paired t-test comparisons are conducted to examine each component of TRC (accuracy, comprehension, retell/recall) as well as overall book performance. Raters’ scores are compared using intraclass correlations (ICC). The ICC is one of the most commonly used statistics for assessing inter-rater reliability (IRR) on ordinal, interval, or ratio variables and is suitable for studies with two or more coders (Hallgren, 2012). Cicchetti (1994) provides cutoffs for ICC values, with IRR being poor for values less than 0.40, fair for values between 0.40 and 0.59, good for values between 0.60 and 0.74, and excellent for values between 0.75 and 1.00. Cohen’s kappa was also explored for overall book performance. Fleiss (1981) suggested that kappa values greater than 0.75 indicate excellent agreement, 0.40 to 0.75 fair to good, and below 0.40 poor. The IRR estimates reported here are based on two or more independent assessors simultaneously scoring student performance during a single test administration (shadow-scoring).
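A minimal sketch of the bootstrap procedure for Cronbach’s alpha described above, on simulated item responses; the matrix `items`, the latent trait, and all dimensions are hypothetical:

```r
# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)
cronbach_alpha <- function(x) {
  k <- ncol(x)
  (k / (k - 1)) * (1 - sum(apply(x, 2, var)) / var(rowSums(x)))
}

# Simulated responses: 500 students x 6 items driven by one latent trait
set.seed(7)
theta <- rnorm(500)
items <- sapply(1:6, function(j) rbinom(500, 1, plogis(theta)))

# Bootstrap: resample students 1,000 times, take 2.5% and 97.5% quantiles
boot_alpha <- replicate(1000, {
  idx <- sample(nrow(items), replace = TRUE)
  cronbach_alpha(items[idx, , drop = FALSE])
})
quantile(boot_alpha, c(0.025, 0.975))   # 95% bootstrap confidence interval
```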
*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Additional data are available upon request from the Center.
- Manual cites other published reliability studies:
- Yes
- Provide citations for additional published studies.
- Reference: Amplify (2015). mCLASS Reading3D - Amplify Atlas Book Set Technical Manual, 2nd Edition. Brooklyn, NY: Author.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Rating | d | d | d | d | d | d | d |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- DIBELS Next measures are brief, powerful indicators of foundational early literacy skills that: are quick to administer and score; serve as universal screening (or benchmark assessment) and progress monitoring; identify students in need of intervention support; evaluate the effectiveness of interventions; and support the RtI/Multi-tiered model. DIBELS Next includes six measures: First Sound Fluency (FSF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), DIBELS Oral Reading Fluency (DORF), and Daze. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good et al., 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade appropriate reading skills, and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- The predictive validity was computed using TRC data from the middle of the year to predict DIBELS Next performance at the end of the 2016-2017 school year, across 9 regions and 18 states. The sample comprised 47% female and 45% male students; students were identified as 19% White, 22% Black or African-American, and 32% Hispanic-Latino, and 32% of students were eligible for free or reduced-price lunch. The concurrent validity was computed using TRC data from the end of the year against DIBELS Next at the end of the 2016-2017 school year, across 9 regions and 18 states. The sample comprised 47% female and 45% male students; students were identified as 18% White, 22% Black or African-American, and 30% Hispanic-Latino, and 29% of students were eligible for free or reduced-price lunch.
- *Describe the analysis procedures for each reported type of validity.
- Evidence of concurrent validity is often presented as a correlation between the assessment and an external criterion measure. Instructional reading levels determined from administration of the Atlas edition of TRC should correlate highly with other accepted procedures and measures of overall reading achievement, including accuracy and comprehension. The degree of correlation between two conceptually related, concurrently administered tests suggests the tests measure the same underlying psychological constructs or processes. The correlation of final instructional reading level on TRC with the Composite score on DIBELS Next at the end of year is computed to provide concurrent validity evidence; the 95% confidence interval of the correlation is computed using the “stats” package in R (R Development Core Team, 2017). Predictive validity provides an estimate of the extent to which student performance on TRC predicts scores on the criterion measure administered at a later point in time, defined as more than three months in this study. The correlation of final instructional reading level on TRC at the middle of year with the Composite score from the subsequent administration of DIBELS Next at the end of year is computed to provide predictive validity evidence; the 95% confidence interval of the correlation is again computed using the “stats” package in R (R Development Core Team, 2017).
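A minimal sketch of these correlation analyses using `cor.test()` from the “stats” package cited above; the score vectors are simulated stand-ins for the real TRC and DIBELS Next data:

```r
# Predictive validity sketch: MOY screener scores vs. EOY criterion scores
set.seed(3)
trc_moy    <- rnorm(2000)                          # hypothetical TRC levels
dibels_eoy <- 0.7 * trc_moy + rnorm(2000, 0, 0.7)  # hypothetical criterion

ct <- cor.test(trc_moy, dibels_eoy)  # Pearson r with 95% confidence interval
ct$estimate   # correlation coefficient
ct$conf.int   # 95% confidence interval
```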
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- Yes
- Provide citations for additional published studies.
- Reference: Amplify (2015). mCLASS Reading3D - Amplify Atlas Book Set Technical Manual, 2nd Edition. Brooklyn, NY: Author. (Contact vendor to inquire about technical manual.)
- Describe the degree to which the provided data support the validity of the tool.
- The table above summarizes the concurrent and predictive validity evidence for each grade. Across Grades K to 6, concurrent validity coefficients range from 0.71 to 0.82, demonstrating strong correlations between final instructional reading level on TRC and the DIBELS Next composite score at the end of year; the lower bounds of the 95% confidence intervals are all above 0.70. Across Grades K to 6, predictive validity coefficients are in the range of 0.59 to 0.79, and the lower bounds of the 95% confidence intervals are above 0.70 for Grades 1 to 6. The correlation with the DIBELS Next composite score is slightly lower in kindergarten than in the other grades, possibly because text levels are much less variable in the lower grades due to a floor effect in kindergarten.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- Yes
If yes, fill in data for each subgroup with disaggregated validity data.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
Bias Analysis
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|---|
Rating | Yes | Yes | Yes | Yes | Yes | Yes | No |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- Yes
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- Classification analyses previously described were disaggregated by subgroups to determine whether the assessment is functioning similarly across subgroups.
- b. Describe the subgroups for which bias analyses were conducted:
- Data were disaggregated for the following groups: White, Black, and Hispanic.
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
- AUC results are similar across subgroups.
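A minimal sketch of this disaggregation: the same rank-based AUC computation sketched earlier, run within each subgroup on simulated data (group labels, scores, and outcomes are all hypothetical):

```r
# AUC via the rank-sum identity, computed separately per subgroup
auc_rank <- function(score, outcome) {
  r  <- rank(score)
  n1 <- sum(outcome == 1); n0 <- sum(outcome == 0)
  (sum(r[outcome == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

set.seed(9)
grp     <- sample(c("White", "Black", "Hispanic"), 3000, replace = TRUE)
score   <- rnorm(3000)
outcome <- rbinom(3000, 1, plogis(1.5 * score))   # simulated risk status
sapply(split(seq_along(score), grp),
       function(i) auc_rank(score[i], outcome[i]))  # AUC per subgroup
```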
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.