mCLASS:Reading 3D

Text Reading and Comprehension

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

$20.90 per student

 

Replacement Cost:

$6.00 per student per year.

Annual license renewal fee subject to change.

 

Included in Cost:

mCLASS allows for administration of the TRC assessment on mobile devices, letting teachers record student responses with a tap of a button, along with other observations made during an assessment, for a deeper interpretation of students’ skills. It has embedded scripts for prompts and directions to ensure standardized administration, so all students have the same opportunity to perform. The mCLASS Platform provides a comprehensive service for managing staff organizational structure and student enrollment data. The platform provides online reporting and analysis tools for users in different roles, from administrators to classroom teachers, and supports the mobile assessment delivery system. It also supports the Now What Tools, which make assessment results actionable for teachers by translating data into practical instructional support, with tools for small-group instruction, item-level analysis, and parent letters. Educators and administrators can immediately access student data using reports designed to influence instruction and inform administrative decisions.

Technology Requirements:

  • Computer or Tablet
  • Internet Connection

 

Training Requirements:

  • 4-8 hours of training

 

Qualified Administrators:

  • Examiners must receive training on assessment administration and scoring.

 

Accommodations:

mCLASS is an assessment instrument that captures the developing reading skills of students with disabilities, with a few exceptions:

  • students who are deaf,
  • students who have fluency-based speech disabilities (e.g., stuttering, oral apraxia),
  • students who are learning to read in a language other than English or Spanish, and
  • students with severe disabilities.

Use of mCLASS is appropriate for all other students, including students with disabilities who receive special education supports and for whom reading connected text is an IEP goal. For students receiving special education, it may be necessary to adjust goals and timelines and to provide accommodations as part of the administration.

 

The purpose of accommodation is to facilitate assessment for children for whom a standard administration may not provide an accurate estimate of their skills in the core early literacy skill areas. Valid and acceptable accommodations are ones that are unlikely to substantially change the meaning or interpretation of a student’s scores. A list of valid and acceptable accommodations for TRC administration is available from the vendor upon request.

Where to Obtain:

Website:

https://www.amplify.com/

Address:

Amplify Education, Inc., 55 Washington Street, Suite 800, Brooklyn, NY 11201-1071

Phone:
(800) 823-1969

Email:
support@amplify.com


Access to Technical Support:

Amplify’s Customer Care Center offers complete user-level support from 7:00 a.m. to 7:00 p.m. EST, Monday through Friday. Customers may contact a customer support representative via telephone, e-mail, or electronically through the mCLASS website. Calls to the Customer Care Center’s toll-free number are answered immediately by an automated attendant and routed to customer support agents according to regional expertise. Additionally, customers have self-service access to instructions, documents, and frequently asked questions on the website. The research staff and product teams are available to answer questions about the content of the assessments. Larger implementations have a designated account manager to support ongoing successful implementation.

mCLASS:3D - Text Reading and Comprehension (TRC) is a set of screening and progress monitoring measures for grades K-6. TRC is an individually administered assessment that uses leveled readers from a book set to determine a student’s instructional reading level. During this measure, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions. Assessors observe and record the student’s oral reading behaviors throughout the administration of TRC to determine reading accuracy, reading fluency, and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and establish the student’s instructional reading level.

 

While the student reads from the set of leveled readers, the teacher follows along on a handheld device, recording the student’s performance as the child reads. The handheld software offers a pre-loaded class list indicating required assessment tasks, provides the teacher with directions and prompts to ensure standardized, accurate administration, and automates the precise timing requirements. Upon completion of each task, the handheld device automatically calculates the student’s score and provides a risk evaluation. Student performance data are securely and immediately transferred to the web-based mCLASS reporting system. The mCLASS:3D website offers a range of reports at the district, school, class, and individual student level for further analysis. The set of measures used for screening is designed to be administered at the beginning, middle, and end of year, with alternate forms of all measures available for progress monitoring between screening windows.

Assessment Format:

  • Individual

 

Administration Time:

  • 5-8 minutes per student

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

  • Calculated automatically
  • Calculated manually

 

Scores Generated:

  • Raw Score
  • Developmental Benchmarks

 

 

Reliability

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Half-filled bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble | Full bubble | Full bubble

Justify the appropriateness of each type of reliability reported:

The reliability evidence for mCLASS includes Cronbach’s alpha as an indicator of internal consistency. Cronbach’s alpha quantifies the degree to which the items on an assessment all measure the same underlying construct. It was calculated using data from students who read book levels appropriate for their current grade; to avoid missing responses, only students in each grade reading at-grade proficient text-level books were included in the computation.

The reliability evidence for mCLASS also includes inter-rater reliability data. The assessment is administered by a single rater to a single student at a time, so consistency across raters is important.

 

Describe the sample characteristics for each reliability analysis conducted:

Cronbach’s alpha was calculated using data from the mCLASS data system for the 2016-2017 school year. In total, 2,513 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central.  The sample was composed of participants from the following demographic categories: 47 percent male, 45 percent female, and 8 percent unspecified gender; 25 percent black, 23 percent white, 20 percent Hispanic, 4 percent American Indian/Alaska Native, 4 percent multiracial, and 21 percent other race or unspecified race. Thirty-two percent of the sample was eligible for free and reduced lunch, while eligibility for free and reduced lunch for 52% of the sample was unknown.

Two studies of inter-rater reliability were conducted.

Study 1: Three raters assessed 33 students from two schools in two Southern states during the 2013-2014 end-of-year benchmark administration period. Among the students, 8 were from kindergarten, 10 from Grade 1, 4 from Grade 2, 4 from Grade 3, 2 from Grade 4, and 5 from Grade 5. The sample was 39 percent female and 61 percent male; 9 percent white, 21 percent Hispanic, 67 percent black, and 3 percent other races. The raters were two Amplify consultants.

Study 2: The sample was composed of students in Grade 4 (n = 15), Grade 5 (n = 15), and Grade 6 (n = 10); 39 percent of the students were female and 61 percent male; 67 percent of students were black, 21 percent were Hispanic, 9 percent were white, and 3 percent were of other ethnicity. The raters were four Amplify consultants.

 

Describe the analysis procedures for each reported type of reliability:

Cronbach’s alpha is used as the indicator of internal consistency, quantifying the degree to which the items on an assessment all measure the same underlying construct. To avoid missing responses, students in each grade reading at-grade proficient text-level books were used to compute Cronbach’s alpha. The 95% confidence interval of Cronbach’s alpha was computed using the bootstrap method: 1,000 samples were drawn from the data with replacement, alpha was calculated for each sample, and the 2.5% and 97.5% quantiles were computed.
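
As a hedged illustration of this bootstrap procedure, the sketch below computes Cronbach’s alpha and its percentile confidence interval in R (the language cited elsewhere in this report); the `items` data frame and the helper function name are illustrative assumptions, not the vendor’s published code.

    # Minimal sketch: Cronbach's alpha with a bootstrap 95% CI.
    # `items` is assumed to be a data frame of item scores
    # (rows = students, columns = items) with no missing responses.
    cronbach_alpha <- function(items) {
      k <- ncol(items)
      item_vars <- sum(apply(items, 2, var))  # sum of item variances
      total_var <- var(rowSums(items))        # variance of total scores
      (k / (k - 1)) * (1 - item_vars / total_var)
    }

    set.seed(1)
    boot_alphas <- replicate(1000, {
      idx <- sample(nrow(items), replace = TRUE)  # resample students
      cronbach_alpha(items[idx, ])
    })
    quantile(boot_alphas, c(0.025, 0.975))  # 2.5% and 97.5% quantiles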

In presenting inter-rater reliability (IRR) evidence, raters’ scores are typically compared using intraclass correlations (ICC). ICC is one of the most commonly used statistics for assessing IRR on ordinal, interval, or ratio variables and is suitable for studies with two or more coders (Hallgren, 2012). Cicchetti (1994) provides cutoffs for ICC values, with IRR being poor for values less than 0.40, fair for values between 0.40 and 0.59, good for values between 0.60 and 0.74, and excellent for values between 0.75 and 1.00. IRR estimates reported here are based on two or more independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”). Reliability coefficients presented therefore represent the degree to which the administration and scoring procedures for the TRC components lead to consistent results, generalizing across administrators.
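
A minimal sketch of such an ICC computation in R, using the irr package’s icc() function, follows; the data layout and the two-way agreement model are assumptions, since the report does not state which ICC variant was used.

    # Minimal sketch: intraclass correlation for shadow-scoring data.
    # `scores` is assumed to be a matrix with one row per student and
    # one column per rater (e.g., two raters scoring the same administration).
    library(irr)
    icc(scores, model = "twoway", type = "agreement", unit = "single")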

 

Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval
Internal Consistency | K | 563 | 0.86 | (0.84, 0.88)
Inter-Rater (Retell/Recall) | K | 8 | 0.70 | (0.34, 0.94)
Inter-Rater (Reading Record Accuracy) | K | 8 | 0.98 | (0.87, 0.99)
Inter-Rater (Oral Comprehension) | K | 8 | 0.98 | (0.97, 0.99)
Internal Consistency | 1 | 232 | 0.931 | (0.895, 0.949)
Inter-Rater (Reading Record Accuracy) | 1 | 10 | 0.97 | (0.95, 0.99)
Inter-Rater (Retell/Recall) | 1 | 10 | 0.92 | (0.68, 0.98)
Inter-Rater (Oral Comprehension) | 1 | 10 | 0.94 | (0.76, 0.98)
Internal Consistency | 2 | 1021 | 0.88 | (0.843, 0.907)
Inter-Rater (Reading Record Accuracy) | 2 | 4 | 0.12 | (0.03, 0.22)
Inter-Rater (Oral Comprehension) | 2 | 4 | 0.72 | (0.34, 0.97)
Internal Consistency | 3 | 272 | 0.828 | (0.784, 0.860)
Inter-Rater (Reading Record Accuracy) | 3 | 4 | 0.72 | (0.24, 0.98)
Inter-Rater (Oral Comprehension) | 3 | 4 | 0.95 | (0.55, 0.99)
Internal Consistency | 4 | 218 | 0.888 | (0.795, 0.943)
Inter-Rater (Reading Record Accuracy) | 4 | 2 | 0.23 | (0.13, 0.42)
Inter-Rater (Oral Comprehension) | 4 | 2 | 0.62 | (0.42, 0.91)
Inter-Rater (Book Performance) | 4 | 15 | 0.75 | (0.30, 0.91)
Inter-Rater (Oral Comprehension) | 4 | 15 | 0.92 | (0.79, 0.98)
Inter-Rater (Reading Record Accuracy) | 4 | 15 | 0.98 | (0.96, 0.99)
Internal Consistency | 5 | 207 | 0.76 | (0.71, 0.81)
Inter-Rater (Oral Comprehension) | 5 | 5 | 0.24 | (0.19, 0.56)
Inter-Rater (Reading Record Accuracy) | 5 | 5 | 0.77 | (0.62, 0.97)
Inter-Rater (Reading Record Accuracy) | 5 | 15 | 0.89 | (0.62, 0.95)
Inter-Rater (Book Performance) | 5 | 15 | 0.89 | (0.68, 0.96)
Inter-Rater (Oral Comprehension) | 5 | 15 | 0.93 | (0.80, 0.97)

 

Validity

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Half-filled bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

Describe and justify the criterion measures used to demonstrate validity:

DIBELS Next was chosen as the outcome measure. DIBELS Next measures are brief, powerful indicators of foundational early literacy skills that: are quick to administer and score; serve as universal screeners, benchmark assessments, and progress monitoring tools; identify students in need of intervention support; evaluate the effectiveness of interventions; and support the RtI/Multi-tiered model. DIBELS Next includes six measures: First Sound Fluency (FSF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), DIBELS Oral Reading Fluency (DORF), and Daze. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good et al., 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade appropriate reading skills, and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments.

 

Describe the sample characteristics for each validity analysis conducted:

Concurrent validity was calculated using data from the mCLASS data system for the 2016-2017 school year. In total, 182,259 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central.  The sample was composed of participants from the following demographic categories: 47 percent male, 45 percent female, and 8 percent unspecified gender; 30 percent Hispanic, 22 percent black, 18 percent white, 3 percent Asian, Native Hawaiian or Pacific Islander, 3 percent multiracial, 0.3 percent American Indian/Alaska Native, and 24 percent other race or unspecified race. Twenty-nine percent of the sample was eligible for free and reduced lunch, while eligibility for free and reduced lunch for 52% of the sample was unknown.

Predictive validity was calculated using data from the mCLASS data system for the 2016-2017 school year. In total, 173,224 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central.  The sample was composed of participants from the following demographic categories: 47 percent male, 45 percent female, and 8 percent unspecified gender; 32 percent Hispanic, 22 percent black, 19 percent white, 3 percent Asian, Native Hawaiian or Pacific Islander, 3 percent multiracial, 0.4 percent American Indian/Alaska Native, and 21 percent other race or unspecified race. Thirty-two percent of the sample was eligible for free and reduced lunch, while eligibility for free and reduced lunch for 53% of the sample was unknown.        

           

Describe the analysis procedures for each reported type of validity:

Evidence of concurrent validity is often presented as a correlation between the assessment and an external criterion measure. Instructional reading levels determined from the administration of the Atlas edition of TRC should correlate highly with other accepted procedures and measures that determine overall reading achievement, including accuracy and comprehension. The degree of correlation between two conceptually related, concurrently administered tests suggests the tests measure the same underlying psychological constructs or processes. The correlation of final instructional reading level on TRC with the Composite score on DIBELS Next at the end of year is computed to provide concurrent validity evidence. The 95% confidence interval of the correlation is computed using the “stats” package in R (R Development Core Team, 2017).

Predictive validity provides an estimate of the extent to which student performance on TRC predicts scores on the criterion measure administered at a later point in time, defined as more than three months in this study. The correlation of final instructional reading level on TRC at the middle of year with the Composite score resulting from subsequent administration of DIBELS Next at the end of year was computed to provide predictive validity evidence. The 95% confidence interval of the correlation was computed using the “stats” package in R (R Development Core Team, 2017).
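
Both analyses reduce to a Pearson correlation with a confidence interval, which the “stats” package computes directly via cor.test(). A minimal sketch under assumed variable names (instructional reading levels coded numerically):

    # Minimal sketch: concurrent and predictive validity coefficients in R.
    # `trc_eoy` and `trc_moy` are assumed to hold end- and middle-of-year TRC
    # instructional reading levels (coded numerically); `dibels_eoy` holds the
    # end-of-year DIBELS Next Composite scores for the same students.
    concurrent <- cor.test(trc_eoy, dibels_eoy)  # same-window comparison
    predictive <- cor.test(trc_moy, dibels_eoy)  # MOY TRC vs. EOY criterion
    predictive$estimate  # correlation coefficient
    predictive$conf.int  # 95% confidence interval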

 

Type of Validity | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval
Concurrent | K | DIBELS Next Composite Score | 51004 | 0.707 | (0.703, 0.712)
Concurrent | 1 | DIBELS Next Composite Score | 50175 | 0.822 | (0.819, 0.825)
Concurrent | 2 | DIBELS Next Composite Score | 45193 | 0.795 | (0.791, 0.798)
Concurrent | 3 | DIBELS Next Composite Score | 19826 | 0.772 | (0.766, 0.778)
Concurrent | 4 | DIBELS Next Composite Score | 9618 | 0.767 | (0.758, 0.775)
Concurrent | 5 | DIBELS Next Composite Score | 5924 | 0.741 | (0.730, 0.753)
Concurrent | 6 | DIBELS Next Composite Score | 519 | 0.766 | (0.728, 0.799)
Predictive | K | DIBELS Next Composite Score | 44495 | 0.591 | (0.585, 0.597)
Predictive | 1 | DIBELS Next Composite Score | 46797 | 0.787 | (0.784, 0.791)
Predictive | 2 | DIBELS Next Composite Score | 43897 | 0.794 | (0.790, 0.797)
Predictive | 3 | DIBELS Next Composite Score | 19446 | 0.759 | (0.753, 0.765)
Predictive | 4 | DIBELS Next Composite Score | 10289 | 0.758 | (0.750, 0.766)
Predictive | 5 | DIBELS Next Composite Score | 7733 | 0.729 | (0.718, 0.739)
Predictive | 6 | DIBELS Next Composite Score | 567 | 0.760 | (0.723, 0.793)

 

Describe the degree to which the provided data support the validity of the tool:

The table above summarizes the concurrent and predictive validity evidence for each grade. Across Grades K to 6, concurrent validity coefficients range from 0.71 to 0.82, demonstrating strong correlations between final instructional reading level on TRC and DIBELS Next composite score at the end of year. The lower bounds of 95% confidence intervals are all above 0.70.

Across Grades K through 6, predictive validity coefficients were in the range of 0.59 to 0.79. The lower bounds of the 95% confidence intervals were above 0.70 for Grades 1 through 6. The correlation with the DIBELS Next composite score was slightly lower in Kindergarten than in the other grades, possibly because text levels at the lower grades are much less variable due to floor effects. It is also possible that predictive validity is not as strong in Kindergarten due to the nature of reading growth in Kindergarten: students’ literacy skills at this time are widely variable and change rapidly in response to instruction.

 

Disaggregated Validity Data

Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval
Concurrent | White | K | DIBELS Next Composite | 10419 | 0.71 | (0.700, 0.719)
Concurrent | Black | K | DIBELS Next Composite | 10812 | 0.701 | (0.691, 0.711)
Concurrent | Hispanic | K | DIBELS Next Composite | 12061 | 0.703 | (0.694, 0.712)
Concurrent | White | 1 | DIBELS Next Composite | 10118 | 0.79 | (0.782, 0.797)
Concurrent | Black | 1 | DIBELS Next Composite | 11336 | 0.844 | (0.839, 0.849)
Concurrent | Hispanic | 1 | DIBELS Next Composite | 12798 | 0.828 | (0.822, 0.833)
Concurrent | White | 2 | DIBELS Next Composite | 9330 | 0.757 | (0.749, 0.766)
Concurrent | Black | 2 | DIBELS Next Composite | 10362 | 0.823 | (0.817, 0.829)
Concurrent | Hispanic | 2 | DIBELS Next Composite | 12172 | 0.804 | (0.798, 0.81)
Concurrent | White | 3 | DIBELS Next Composite | 2735 | 0.765 | (0.749, 0.78)
Concurrent | Black | 3 | DIBELS Next Composite | 4185 | 0.775 | (0.763, 0.787)
Concurrent | Hispanic | 3 | DIBELS Next Composite | 8203 | 0.78 | (0.771, 0.788)
Concurrent | White | 4 | DIBELS Next Composite | 704 | 0.735 | (0.699, 0.767)
Concurrent | Black | 4 | DIBELS Next Composite | 2010 | 0.765 | (0.747, 0.783)
Concurrent | Hispanic | 4 | DIBELS Next Composite | 5909 | 0.786 | (0.776, 0.795)
Concurrent | White | 5 | DIBELS Next Composite | 358 | 0.741 | (0.690, 0.784)
Concurrent | Black | 5 | DIBELS Next Composite | 1127 | 0.738 | (0.710, 0.764)
Concurrent | Hispanic | 5 | DIBELS Next Composite | 3847 | 0.765 | (0.752, 0.778)
Concurrent | Hispanic | 6 | DIBELS Next Composite | 337 | 0.779 | (0.733, 0.818)
Predictive | White | K | DIBELS Next Composite | 9684 | 0.61 | (0.597, 0.622)
Predictive | Black | K | DIBELS Next Composite | 9491 | 0.579 | (0.565, 0.592)
Predictive | Hispanic | K | DIBELS Next Composite | 10335 | 0.554 | (0.54, 0.567)
Predictive | White | 1 | DIBELS Next Composite | 9725 | 0.769 | (0.761, 0.777)
Predictive | Black | 1 | DIBELS Next Composite | 10826 | 0.797 | (0.791, 0.804)
Predictive | Hispanic | 1 | DIBELS Next Composite | 12125 | 0.788 | (0.782, 0.795)
Predictive | White | 2 | DIBELS Next Composite | 9243 | 0.765 | (0.756, 0.773)
Predictive | Black | 2 | DIBELS Next Composite | 10303 | 0.818 | (0.812, 0.825)
Predictive | Hispanic | 2 | DIBELS Next Composite | 12407 | 0.801 | (0.795, 0.808)
Predictive | White | 3 | DIBELS Next Composite | 2846 | 0.765 | (0.749, 0.78)
Predictive | Black | 3 | DIBELS Next Composite | 4345 | 0.769 | (0.756, 0.781)
Predictive | Hispanic | 3 | DIBELS Next Composite | 8469 | 0.764 | (0.755, 0.773)
Predictive | White | 4 | DIBELS Next Composite | 847 | 0.747 | (0.715, 0.775)
Predictive | Black | 4 | DIBELS Next Composite | 2172 | 0.778 | (0.761, 0.794)
Predictive | Hispanic | 4 | DIBELS Next Composite | 6236 | 0.771 | (0.76, 0.78)
Predictive | White | 5 | DIBELS Next Composite | 548 | 0.724 | (0.681, 0.761)
Predictive | Black | 5 | DIBELS Next Composite | 1516 | 0.738 | (0.714, 0.76)
Predictive | Hispanic | 5 | DIBELS Next Composite | 4992 | 0.747 | (0.734, 0.759)
Predictive | Hispanic | 6 | DIBELS Next Composite | 381 | 0.76 | (0.714, 0.799)

 

Bias Analysis Conducted

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: No | No | No | No | No | No

Have additional analyses been conducted to establish whether the tool is or is not biased against demographic subgroups (e.g., students who vary by race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?

Bias Analysis Method: No qualifying evidence provided.

 

Subgroups Included: No qualifying evidence provided.

 

Bias Analysis Results: No qualifying evidence provided. 

Sensitivity: Reliability of the Slope

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Half-filled bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

Describe the sample used for analyses, including size and characteristics:

The sample of students included in analyses of slope of improvement were those students identified as in need of intensive intervention based on their DIBELS Next Composite scores. All students included scored below the Well Below Benchmark cut point, a research-based criterion indicating the need for intensive instructional supports and ongoing progress monitoring. The odds of students who score in the Well Below Benchmark zone meeting grade-level expectations at the next screening or benchmarking period without intensive intervention supports are 10 to 20 percent (Good, R. H., Kaminski, R., Dewey, E., Wallin, J., Powell-Smith, K., & Latimer, R. (2013). DIBELS Next technical manual. Eugene, OR: Dynamic Measurement Group, Inc.).

In total, 8,415 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 58 percent male, 43 percent female, and 8 percent unspecified gender; 37 percent black, 23 percent Asian/Pacific Islander, 18 percent Hispanic, 8 percent white, 1 percent American Indian/Alaska Native, 4 percent multiracial, and 19 percent other race or unspecified race. Thirty-six percent of the sample was eligible for free and reduced lunch, while eligibility for free and reduced lunch for 54% of the sample was unknown.

 

Describe the frequency of measurement:

Reliability of slope was calculated for students who had multiple TRC data points over the course of a school year, which amounts to an average of one data point every three weeks.

 

Describe reliability of the slope analyses conducted with a population of students in need of intensive intervention:

HLM was used to compute the reliability and validity of the slope. Reliability of the slope is the ratio of the true score variance to the total variance. The true score variance is the random slope variance in a multilevel regression with a random intercept and random slope; the total variance is the estimated total variance of each student’s individual slope of improvement. The correlation of slope of improvement with the Composite score resulting from administration of DIBELS Next at the end of year is computed to provide validity evidence. The 95% confidence interval of each correlation coefficient was computed using the “stats” package in R (R Development Core Team, 2017).
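
As a rough sketch of this ratio, the code below fits a random-intercept, random-slope model in R with lme4 (a stand-in for the HLM software implied above; the data frame `d` and its columns are illustrative assumptions) and compares the model’s slope variance with the variance of crude per-student OLS slopes:

    # Minimal sketch: reliability of slope as true variance / total variance.
    # `d` is assumed to have one row per progress-monitoring occasion, with
    # columns student, week (time), and level (TRC reading level, numeric).
    library(lme4)
    fit <- lmer(level ~ week + (week | student), data = d)

    vc <- as.data.frame(VarCorr(fit))
    true_slope_var <- subset(vc, grp == "student" & var1 == "week" &
                               is.na(var2))$vcov  # random slope variance

    # One crude estimate of total variance: variance of per-student OLS
    # slopes (each student needs at least two data points).
    ols_slopes <- sapply(split(d, d$student),
                         function(s) coef(lm(level ~ week, data = s))["week"])
    true_slope_var / var(ols_slopes)  # reliability of the slope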

 

Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval
Reliability of Slope | K | 4081 | 0.422 | (0.403, 0.440)
Reliability of Slope | 1 | 2619 | 0.576 | (0.556, 0.596)
Reliability of Slope | 2 | 1279 | 0.871 | (0.860, 0.882)
Reliability of Slope | 3 | 246 | 0.934 | (0.919, 0.947)
Reliability of Slope | 4 | 103 | 0.967 | (0.949, 0.979)
Reliability of Slope | 5 | 56 | 0.967 | (0.949, 0.980)
Reliability of Slope | 6 | 31 | 0.992 | (0.978, 0.997)

 

Sensitivity: Validity of the Slope

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Full bubble | Full bubble | Full bubble | Half-filled bubble | Full bubble | Half-filled bubble

Describe and justify the criterion measures used to demonstrate validity:

DIBELS Next was chosen as the outcome measure. DIBELS Next measures are brief, powerful indicators of foundational early literacy skills that: are quick to administer and score; serve as universal screeners, benchmark assessments, and progress monitoring tools; identify students in need of intervention support; evaluate the effectiveness of interventions; and support the RtI/Multi-tiered model. DIBELS Next includes six measures: First Sound Fluency (FSF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), DIBELS Oral Reading Fluency (DORF), and Daze. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good et al., 2013). DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade appropriate reading skills, and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments.

 

Describe the sample used for analyses, including size and characteristics:

The sample of students included in analyses of slope of improvement were those students identified as in need of intensive intervention based on their DIBELS Next Composite scores. All students included scored below the Well Below Benchmark cut point, a research-based criterion indicating the need for intensive instructional supports and ongoing progress monitoring. The odds of students who score in the Well Below Benchmark zone meeting grade-level expectations at the next screening or benchmarking period without intensive intervention supports are 10 to 20 percent (Good, R. H., Kaminski, R., Dewey, E., Wallin, J., Powell-Smith, K., & Latimer, R. (2013). DIBELS Next technical manual. Eugene, OR: Dynamic Measurement Group, Inc.).

In total, 7,741 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 58 percent male, 43 percent female, and 8 percent unspecified gender; 37 percent black, 23 percent Asian/Pacific Islander, 18 percent Hispanic, 8 percent white, 1 percent American Indian/Alaska Native, 4 percent multiracial, and 19 percent other race or unspecified race. Thirty-six percent of the sample was eligible for free and reduced lunch, while eligibility for free and reduced lunch for 54% of the sample was unknown.

 

Describe predictive validity of the slope of improvement analyses conducted with a population of students in need of intensive intervention:

HLM was used to compute the reliability and validity of the slope. Reliability of the slope is the ratio of the true score variance to the total variance. The true score variance is the random slope variance in a multilevel regression with a random intercept and random slope; the total variance is the estimated total variance of each student’s individual slope of improvement. The correlation of slope of improvement with the Composite score resulting from administration of DIBELS Next at the end of year is computed to provide validity evidence. The 95% confidence interval of each correlation coefficient was computed using the “stats” package in R (R Development Core Team, 2017).

Type of Validity | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval
Validity of Slope | K | DIBELS Next Composite | 3637 | 0.595 | (0.582, 0.608)
Validity of Slope | 1 | DIBELS Next Composite | 2303 | 0.770 | (0.757, 0.783)
Validity of Slope | 2 | DIBELS Next Composite | 1075 | 0.618 | (0.594, 0.642)
Validity of Slope | 3 | DIBELS Next Composite | 196 | 0.436 | (0.372, 0.501)
Validity of Slope | 4 | DIBELS Next Composite | 176 | 0.662 | (0.574, 0.749)
Validity of Slope | 5 | DIBELS Next Composite | 54 | 0.491 | (0.239, 0.570)
Validity of Slope | 6 | DIBELS Next Composite | 30 | 0.435 | (0.321, 0.579)

 

Describe the degree to which the provided data support the validity of the tool:

The table above summarizes the evidence for validity of slope for each grade. Across Grades K through 6, coefficients range from 0.435 to 0.770, demonstrating relationships with the external measure and sensitivity to student learning. The correlation with the DIBELS Next composite score is lower in Kindergarten than in most other grades, possibly because text levels at the lower grades are much less variable due to floor effects. It is also possible that predictive validity is not as strong in Kindergarten due to the nature of reading growth in Kindergarten: students’ literacy skills at this time are widely variable and change rapidly in response to instruction.

Alternate Forms

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Empty bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

Describe the sample for these analyses, including size and characteristics:

Methods: To demonstrate empirical evidence for the comparability of books (i.e., alternate-form reliability) for students who are at risk overall on TRC, student performance data collected at the end of the 2016-2017 school year were submitted to ANOVAs at each text complexity level (i.e., A-Z), with individual books serving as the between-subjects factor and students’ final performance on each book administered as the dependent variable.

Results: The ranges of eta-squared (η²) values resulting from the analysis of at-risk student performance on the books at each text complexity level are presented in the table below, along with the corresponding sample sizes and numbers of books. Interpreting these η² values as effect sizes, a very small, nonsignificant amount of variability at each grade is accounted for by differences in the books administered; alternate-form reliability of TRC for at-risk students is therefore demonstrated.
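
A hedged sketch of one such ANOVA in R (the data frame and column names are illustrative): fit a one-way ANOVA with book as the factor at a given text level, and take η² as the between-books sum of squares over the total.

    # Minimal sketch: eta-squared for the book factor at one text level.
    # `lev` is assumed to hold columns book (factor) and score (final
    # performance on the book) for at-risk students reading at this level.
    fit <- aov(score ~ book, data = lev)
    ss <- summary(fit)[[1]][["Sum Sq"]]
    ss[1] / sum(ss)  # eta-squared: between-books SS / total SS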

Type of Reliability | Age or Grade | n (Books) | n (Students) | η² (range)
Alternate Form | K | 9 | 7410 | 0.00-0.04
Alternate Form | 1 | 28 | 8362 | 0.01-0.04
Alternate Form | 2 | 36 | 8736 | 0.00-0.05
Alternate Form | 3 | 36 | 4594 | 0.01-0.05
Alternate Form | 4 | 45 | 4116 | 0.00-0.07
Alternate Form | 5 | 54 | 2836 | 0.00-0.05
Alternate Form | 6 | 29 | 381 | 0.00-0.08

 

Evidence that alternate forms are of equal and controlled difficulty or, if IRT based, evidence of item or ability invariance:

To demonstrate empirical evidence for the comparability of books (i.e., alternate-form reliability), student performance data collected during the 2014-2015 school year were submitted to a one-way repeated-measures ANOVA, with text complexity level (i.e., A-Z) and individual books serving as the between-subjects factors, students as the within-subjects factor, and student performance (i.e., FRU, INS, IND) on each book administered as the dependent variable.

Partial η² values were computed from the analysis of student performance on the 20 most prevalent books at each text complexity level. Interpreting these partial η² values as effect sizes, it was shown for each grade that a very small, nonsignificant amount of variability was accounted for by differences in the books administered, therefore demonstrating alternate-form reliability of TRC.

 

Number of alternate forms of equal and controlled difficulty:

There are more than 300 books available at each text complexity level in TRC, as the mCLASS home website allows teachers to add additional progress monitoring books to the system. Development of these books according to a text leveling gradient, as specified previously, ensures comparability of content among books at each specific level.

Decision Rules: Setting and Revising Goals

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Empty bubble | Empty bubble | Empty bubble | Empty bubble | Empty bubble | Empty bubble

Specification of validated decision rules for when goals should be set or revised:

We recommend using a goal-oriented rule for evaluating a student’s response to intervention that is straightforward for teachers to understand and use. Decisions about a student’s progress are based on comparisons between TRC scores plotted on a graph and the aimline, or expected rate of progress, as determined by either research-based benchmark expectations or expectations for growth. Goals should only be increased, never decreased. If a student is making progress that exceeds the expected goal, the goal should be increased and/or the intervention or instruction modified to be less intensive, as the student is showing greater than expected progress.

 

Evidentiary basis for these rules:

This recommended decision rule is based on early work with CBM (Fuchs, 1988, 1989) and precision teaching (White & Haring, 1980) and allows for a minimum of three data points to be gathered before any decision is made.

Fuchs, L. S. (1988). Effects of computer-managed instruction on teachers' implementation of systematic monitoring programs and student achievement. Journal of Educational Research, 81, 294-304.

Fuchs, L. S. (1989). Evaluating solutions: Monitoring progress and revising intervention plans. In M. Shinn (Ed.), Curriculum-based measurement: Assessing special children. New York: Guilford Press.

White, O. R., & Haring, N. G. (1980). Exceptional teaching (2nd ed.). Columbus, OH: Merrill.

Decision Rules: Changing Instruction

Grade:  K | 1 | 2 | 3 | 4 | 5
Rating: Empty bubble | Empty bubble | Empty bubble | Empty bubble | Empty bubble | Empty bubble

Specification of validated decision rules for when changes to instruction should be made:

In general, it is recommended that support be continued until a student achieves at least three points at or above the goal. If a decision is made to discontinue support, it is recommended that progress monitoring be continued weekly for at least 1 month to ensure that the student is able to maintain growth without the supplemental support. The frequency of progress monitoring will be faded gradually as the child’s progress continues to be sufficient. We suggest that educational professionals consider instructional modifications when student performance falls below the aimline for three consecutive points.                                        
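
As an illustration only (one possible coding of the rule, not vendor software), the three-consecutive-points check against the aimline might look like the following R sketch:

    # Minimal sketch: flag when three consecutive progress-monitoring
    # points fall below the aimline.
    # `scores` and `aimline` are assumed to be numeric vectors giving the
    # observed scores and the expected scores at the same occasions.
    consider_modification <- function(scores, aimline) {
      runs <- rle(scores < aimline)              # runs of below-aimline points
      any(runs$lengths[runs$values] >= 3)        # any run of 3 or more?
    }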

 

Evidentiary basis for these rules:

This recommended decision rule is based on early work with CBM (Fuchs, 1988, 1989) and precision teaching (White & Haring, 1980) and allows for a minimum of three data points to be gathered before any decision is made.

Fuchs, L. S. (1988). Effects of computer-managed instruction on teachers' implementation of systematic monitoring programs and student achievement. Journal of Educational Research, 81, 294-304.

Fuchs, L. S. (1989). Evaluating solutions: Monitoring progress and revising intervention plans. In M. Shinn (Ed.), Curriculum-based measurement: Assessing special children. New York: Guilford Press.

White, O. R., & Haring, N. G. (1980). Exceptional teaching (2nd ed.). Columbus, OH: Merrill.

Administration Format

Grade:  K | 1 | 2 | 3 | 4 | 5
Data:   Individual (all grades)

Administration Format:

  • Individual

 

Administration & Scoring Time

Grade:  K | 1 | 2 | 3 | 4 | 5
Data:   5-8 minutes (all grades)

Administration Time:

  • 5-8 minutes

Scoring Time:

  • Scoring is completed in real time during administration

 

Scoring Format

Grade:  K | 1 | 2 | 3 | 4 | 5
Data:   Computer-scored (all grades)

Scoring Format:

  • Manually-scored
  • Computer-scored*

*Teachers follow and score along, marking reading errors and responses to orally presented questions, as students read and respond in real time. The final reading accuracy and comprehension scores and the instructional reading level, based on the student’s performance as recorded by the teacher, are calculated automatically by the software.

 

ROI & EOY Benchmarks

Grade:  K | 1 | 2 | 3 | 4 | 5
Data:   ROI & EOY Benchmarks Available (all grades)

Specify the minimum acceptable rate of growth/improvement:

Growth norms are provided in a separate report (TRC Atlas National Growth Norms 2014-2015), which is available from the Center upon request. Student progress percentiles that account for students’ initial skills were used to determine how much growth, as documented by instructional reading level, to expect over the course of a school year.

 

Specify the benchmarks for minimum acceptable end-of-year performance:

Performance standards are available for three time points in the year: BOY (Beginning of Year), MOY (Middle of Year), and EOY (End of Year). End-of-year expectations appear in the EOY rows of the table below.

Grade | Time of Year | Far Below Proficient | Below Proficient | Proficient | Above Proficient
K | BOY | < PC | PC | RB | A and above
K | MOY | RB or below | A | B | C and above
K | EOY | A or below | B | C to D | E and above
1 | BOY | A or below | B | C to D | E and above
1 | MOY | C or below | D to E | F to G | H and above
1 | EOY | E or below | F to H | I | J and above
2 | BOY | E or below | F to H | I | J and above
2 | MOY | H or below | I | J to K | L and above
2 | EOY | J or below | K | L to M | N and above
3 | BOY | J or below | K | L to M | N and above
3 | MOY | K or below | L to M | N | O and above
3 | EOY | L or below | M to N | O to P | Q and above
4 | BOY | L or below | M to N | O to P | Q and above
4 | MOY | N or below | O to P | Q | R and above
4 | EOY | P or below | Q | R to S | T and above
5 | BOY | P or below | Q | R to S | T and above
5 | MOY | Q or below | R to S | T | U and above
5 | EOY | S or below | T | U to V | W and above
6 | BOY | S or below | T | U to V | W and above
6 | MOY | U or below | V | W to X | Y and above
6 | EOY | V or below | W to X | Y to Z | *