mCLASS

Reading: 3D-Text Reading & Comprehension (TRC)

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

The basic pricing plan is an annual per-student license of $20.90. For customers who already use an mCLASS assessment product, mCLASS:3D can be added for $6 per student.

 

Replacement Cost:

The cost of license renewal is subject to change annually.

 

Included in Cost:

mCLASS allows the TRC assessment to be administered on mobile devices, so teachers can record student responses with a tap of a button, along with other observations made during an assessment, for a deeper interpretation of students' skills. An embedded script with prompts and directions ensures standardized administration, so all students receive the same opportunity to perform.

 

The mCLASS Platform provides a comprehensive service for managing staff organizational structure and student enrollment data, online reporting and analysis tools for users in different roles (from administrators to classroom teachers), and support for the mobile assessment delivery system. It includes the Now What Tools, which translate assessment results into practical instructional support with tools for small-group instruction, item-level analysis, and parent letters. Educators and administrators can immediately access student data through reports designed to inform instruction and administrative decisions.

 

Technology Requirements:

  • Tablet (or other handheld computing device)
  • Internet connection

 

Training Requirements:

  • 4-8 hours of training

 

Qualified Administrators:

Examiners must receive training in administration and scoring of the assessment.

 

Accommodations:

mCLASS is an assessment instrument well suited to capturing the developing reading skills of students with disabilities, with a few exceptions: a) students who are deaf; b) students who have fluency-based speech disabilities (e.g., stuttering, oral apraxia); c) students who are learning to read in a language other than English or Spanish; and d) students with severe disabilities. mCLASS is appropriate for all other students, including students with disabilities who receive special education supports and for whom reading connected text is an IEP goal. For students receiving special education, it may be necessary to adjust goals and timelines and to provide accommodations as part of the administration.

 

The purpose of accommodation is to facilitate assessment for children for whom a standard administration may not provide an accurate estimate of their skills in the core early literacy skill areas. Valid and acceptable accommodations are ones that are unlikely to change substantially the meaning or interpretation of a student’s scores.

 

The list of valid and acceptable accommodations for TRC administration is available upon request.

Where to Obtain:

Website: https://www.amplify.com/

Address: 55 Washington Street, Suite 800, Brooklyn, NY 11201-1071

Phone number: 800-823-1969, option 1

Email address: support@amplify.com


Access to Technical Support:

Amplify's Customer Care Center offers complete user-level support from 7:00 a.m. to 7:00 p.m. EST, Monday through Friday. Customers may contact a customer support representative by telephone, e-mail, or electronically through the mCLASS website. Calls to the Customer Care Center's toll-free number are answered immediately by an automated attendant and routed to customer support agents according to regional expertise. Customers also have self-service access to instructions, documents, and frequently asked questions on the mCLASS website. The research staff and product teams are available to answer questions about the content of the assessments. Larger implementations have a designated account manager to support ongoing successful implementation.

 

mCLASS:3D - TRC is a set of screening and progress monitoring measures for grades K-6. Text Reading and Comprehension (TRC) is an individually administered assessment using leveled readers from a book set to determine a student’s instructional reading level. During this measure, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions.

 

Assessors observe and record the student’s oral reading behaviors through the administration of TRC to determine reading accuracy, reading fluency and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and the student’s instructional reading level. While the student reads from the set of leveled readers, the teacher follows along on a handheld device, recording the student’s performance as the child reads.

 

The handheld software offers a pre-loaded class list indicating required assessment tasks, provides the teacher with directions and prompts to ensure standardized, accurate administration, and automates the precise timing requirements. Upon completion of each task, the handheld automatically calculates the student’s score and provides a risk evaluation.

 

Student performance data are securely and immediately transferred to the Web-based mCLASS reporting system. The mCLASS:3D Web site offers a range of reports at the district, school, class, and individual student level for further analysis.

 

The set of measures in the screening suite is designed to be administered at the beginning, middle, and end of the year, with alternate forms of all measures available for progress monitoring between screening windows.

Assessment Format:

  • One-to-one

 

Administration Time:

  • 5 - 8 minutes per student

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

Raw scores are reported as the student's instructional reading level, from level A through Z. A student's reading level is a composite of reading accuracy and comprehension of text. Cut points for determining reading level are provided: a student must reach both the accuracy and the comprehension cut points for a book level to be designated as the student's instructional reading level. Developmental benchmarks for each measure, grade, and time of year (beginning, middle, end) classify each student's score as Above Proficient, Proficient, Below Proficient, or Far Below Proficient.
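To make the scoring logic concrete, the sketch below derives an instructional reading level and a benchmark category from accuracy and comprehension cut points. The cut-point values, comprehension scale, and benchmark level in the example are hypothetical placeholders, not the published mCLASS:3D criteria.

```python
# Minimal sketch of the scoring logic described above. The cut-point values,
# comprehension scale, and benchmark level are hypothetical placeholders.

LEVELS = [chr(c) for c in range(ord("A"), ord("Z") + 1)]  # text levels A..Z

ACCURACY_CUT = 0.90        # hypothetical: proportion of words read correctly
COMPREHENSION_CUT = 4      # hypothetical: minimum comprehension points

def instructional_level(attempts):
    """attempts: dict of level letter -> (accuracy, comprehension score).
    Returns the highest level at which both cut points are met, or None."""
    passed = [lvl for lvl, (acc, comp) in attempts.items()
              if acc >= ACCURACY_CUT and comp >= COMPREHENSION_CUT]
    return max(passed, key=LEVELS.index) if passed else None

def benchmark_category(level, proficient_level):
    """Classify a level against a (hypothetical) grade/time-of-year benchmark."""
    if level is None:
        return "Far Below Proficient"
    gap = LEVELS.index(level) - LEVELS.index(proficient_level)
    if gap > 0:
        return "Above Proficient"
    if gap == 0:
        return "Proficient"
    return "Below Proficient" if gap == -1 else "Far Below Proficient"

# Example: the student passes levels D and E but not F; the assumed
# proficient benchmark for this grade and time of year is level F.
attempts = {"D": (0.96, 5), "E": (0.93, 4), "F": (0.85, 3)}
level = instructional_level(attempts)
print(level, benchmark_category(level, proficient_level="F"))
```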

 

Scores Generated:

  • Raw score
  • Developmental benchmarks

 

 

Classification Accuracy

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Criterion 1 Fall | - | - | - | - | - | - | - |
| Criterion 1 Winter | Empty bubble | Full bubble | Full bubble | Full bubble | Half-filled bubble | Half-filled bubble | Empty bubble |
| Criterion 1 Spring | - | - | - | - | - | - | - |
| Criterion 2 Fall | - | - | - | - | - | - | - |
| Criterion 2 Winter | - | - | - | - | - | - | - |
| Criterion 2 Spring | - | - | - | - | - | - | - |

Primary Sample

 

Criterion 1: DIBELS Next Composite at End of Year

Time of Year: Fall

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | PC | A | E | J | L | P | S |
| Base rate: children requiring intensive intervention | 0.21 | 0.35 | 0.28 | 0.35 | 0.40 | 0.44 | 0.52 |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.33 | 0.55 | 0.48 | 0.48 | 0.59 | 0.53 | 0.64 |
| False Positive Rate | 0.28 | 0.11 | 0.06 | 0.06 | 0.04 | 0.06 | 0.07 |
| False Negative Rate | 0.04 | 0.10 | 0.07 | 0.13 | 0.15 | 0.15 | 0.14 |
| Sensitivity | 0.73 | 0.67 | 0.72 | 0.59 | 0.49 | 0.52 | 0.60 |
| Specificity | 0.72 | 0.89 | 0.94 | 0.94 | 0.96 | 0.94 | 0.93 |
| Positive Predictive Power | 0.23 | 0.63 | 0.74 | 0.77 | 0.80 | 0.76 | 0.78 |
| Negative Predictive Power | 0.96 | 0.90 | 0.93 | 0.87 | 0.85 | 0.85 | 0.86 |
| Overall Classification Rate | 0.72 | 0.84 | 0.90 | 0.85 | 0.84 | 0.84 | 0.84 |
| Area Under the Curve (AUC) | 0.89 | 0.94 | 0.95 | 0.91 | 0.89 | 0.89 | 0.90 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| Specificity at 90% Sensitivity | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| Specificity at 80% Sensitivity | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| Specificity at 70% Sensitivity | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
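For reference, the statistics reported in these tables follow the standard definitions for a 2x2 screening decision (sensitivity, specificity, positive and negative predictive power, overall classification rate) plus the cut-point-free AUC. The sketch below computes them on simulated data; it is illustrative only and is not the study data or the analysis code used for this chart.

```python
# Illustrative only: standard definitions of the screening statistics above,
# computed on made-up data (not the study sample or the vendor's analysis).
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
at_risk = rng.integers(0, 2, size=200)           # 1 = at risk on the criterion
score = at_risk * 2 + rng.normal(size=200)       # screener score (higher = more risk)
flagged = score > 1.0                            # screener decision at a cut point

tp = np.sum(flagged & (at_risk == 1))
fp = np.sum(flagged & (at_risk == 0))
fn = np.sum(~flagged & (at_risk == 1))
tn = np.sum(~flagged & (at_risk == 0))

sensitivity = tp / (tp + fn)                     # proportion of at-risk students flagged
specificity = tn / (tn + fp)                     # proportion of not-at-risk students cleared
ppv = tp / (tp + fp)                             # positive predictive power
npv = tn / (tn + fn)                             # negative predictive power
overall = (tp + tn) / (tp + tn + fp + fn)        # overall classification rate
auc = roc_auc_score(at_risk, score)              # discrimination across all cut points

print(sensitivity, specificity, ppv, npv, overall, auc)
```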

 

 

Criterion 1: DIBELS Next Composite at End of Year

Time of Year: Winter

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | A | D | I | | A | R | V |
| Base rate: children requiring intensive intervention | 0.07 | 0.22 | 0.20 | 0.27 | 0.27 | 0.27 | 0.33 |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.40 | 0.33 | 0.30 | 0.31 | 0.42 | 0.40 | 0.60 |
| False Positive Rate | 0.36 | 0.17 | 0.15 | 0.14 | 0.27 | 0.24 | 0.45 |
| False Negative Rate | 0.10 | 0.11 | 0.10 | 0.23 | 0.15 | 0.19 | 0.11 |
| Sensitivity | 0.90 | 0.89 | 0.90 | 0.77 | 0.85 | 0.81 | 0.89 |
| Specificity | 0.64 | 0.83 | 0.85 | 0.86 | 0.73 | 0.76 | 0.55 |
| Positive Predictive Power | 0.15 | 0.60 | 0.60 | 0.66 | 0.54 | 0.55 | 0.49 |
| Negative Predictive Power | 0.99 | 0.96 | 0.97 | 0.91 | 0.93 | 0.91 | 0.91 |
| Overall Classification Rate | 0.66 | 0.84 | 0.86 | 0.83 | 0.77 | 0.77 | 0.66 |
| Area Under the Curve (AUC) | 0.80 | 0.93 | 0.95 | 0.90 | 0.88 | 0.87 | 0.86 |
| AUC 95% Confidence Interval Lower | 0.79 | 0.93 | 0.94 | 0.89 | 0.88 | 0.86 | 0.82 |
| AUC 95% Confidence Interval Upper | 0.80 | 0.94 | 0.95 | 0.90 | 0.89 | 0.88 | 0.89 |
| Specificity at 90% Sensitivity | 0.64 | 0.83 | 0.85 | 0.61 | 0.66 | 0.56 | 0.49 |
| Specificity at 80% Sensitivity | Not Provided | Not Provided | 0.92 | 0.86 | 0.82 | 0.76 | 0.72 |
| Specificity at 70% Sensitivity | Not Provided | 0.93 | 0.96 | 0.92 | 0.87 | 0.87 | 0.84 |

 

 

Criterion 1: DIBELS Next Composite at End of Year

Time of Year: Spring

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | A | E | J | L | P | S | V |
| Base rate: children requiring intensive intervention | 0.28 | 0.26 | 0.26 | 0.27 | 0.42 | 0.39 | 0.40 |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.46 | 0.45 | 0.34 | 0.44 | 0.49 | 0.46 | 0.55 |
| False Positive Rate | 0.00 | 0.05 | 0.04 | 0.05 | 0.06 | 0.07 | 0.07 |
| False Negative Rate | 0.09 | 0.08 | 0.07 | 0.13 | 0.14 | 0.14 | 0.13 |
| Sensitivity | 0.25 | 0.73 | 0.71 | 0.61 | 0.60 | 0.61 | 0.64 |
| Specificity | 1.00 | 0.95 | 0.96 | 0.95 | 0.94 | 0.93 | 0.93 |
| Positive Predictive Power | 0.91 | 0.82 | 0.81 | 0.82 | 0.78 | 0.75 | 0.78 |
| Negative Predictive Power | 0.91 | 0.92 | 0.93 | 0.87 | 0.86 | 0.86 | 0.87 |
| Overall Classification Rate | 0.91 | 0.90 | 0.91 | 0.86 | 0.85 | 0.84 | 0.85 |
| Area Under the Curve (AUC) | 0.97 | 0.97 | 0.96 | 0.92 | 0.91 | 0.89 | 0.92 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| Specificity at 90% Sensitivity | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| Specificity at 80% Sensitivity | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |
| Specificity at 70% Sensitivity | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided |

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Disaggregated Data

Subgroup: White

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | A | D | I | L | A | R | Not Provided |
| Base rate: children requiring intensive intervention | 0.04 | 0.15 | 0.12 | 0.18 | 0.22 | 0.22 | Not Provided |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.32 | 0.22 | 0.19 | 0.20 | 0.32 | 0.28 | Not Provided |
| False Positive Rate | 2750.00 | 892.00 | 783.00 | 215.00 | 120.00 | 59.00 | Not Provided |
| False Negative Rate | 46.00 | 262.00 | 166.00 | 149.00 | 31.00 | 27.00 | Not Provided |
| Sensitivity | 0.89 | 0.82 | 0.86 | 0.71 | 0.83 | 0.78 | Not Provided |
| Specificity | 0.70 | 0.89 | 0.90 | 0.91 | 0.82 | 0.86 | Not Provided |
| Positive Predictive Power | 0.12 | 0.57 | 0.56 | 0.63 | 0.56 | 0.61 | Not Provided |
| Negative Predictive Power | 0.99 | 0.97 | 0.98 | 0.93 | 0.95 | 0.93 | Not Provided |
| Overall Classification Rate | 0.71 | 0.88 | 0.90 | 0.87 | 0.82 | 0.84 | Not Provided |
| Area Under the Curve (AUC) | 0.83 | 0.94 | 0.95 | 0.91 | 0.90 | 0.90 | Not Provided |
| AUC 95% Confidence Interval Lower | 0.81 | 0.94 | 0.95 | 0.90 | 0.87 | 0.87 | Not Provided |
| AUC 95% Confidence Interval Upper | 0.84 | 0.95 | 0.96 | 0.93 | 0.92 | 0.93 | Not Provided |
| Specificity at 90% Sensitivity | 0.70 | 0.81 | 0.85 | 0.71 | 0.64 | 0.70 | Not Provided |
| Specificity at 80% Sensitivity | Not Provided | 0.89 | 0.94 | Not Provided | 0.82 | 0.81 | Not Provided |
| Specificity at 70% Sensitivity | Not Provided | | 0.96 | 0.91 | 0.89 | 0.90 | Not Provided |

 

Subgroup: Black

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | A | D | I | L | A | R | Not Provided |
| Base rate: children requiring intensive intervention | 0.09 | 0.32 | 0.28 | 0.32 | 0.29 | 0.31 | Not Provided |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.44 | 0.44 | 0.37 | 0.30 | 0.35 | 0.34 | Not Provided |
| False Positive Rate | 3411.00 | 1636.00 | 1275.00 | 330.00 | 280.00 | 166.00 | Not Provided |
| False Negative Rate | 85.00 | 310.00 | 261.00 | 400.00 | 146.00 | 134.00 | Not Provided |
| Sensitivity | 0.90 | 0.91 | 0.91 | 0.71 | 0.77 | 0.72 | Not Provided |
| Specificity | 0.60 | 0.78 | 0.83 | 0.89 | 0.82 | 0.84 | Not Provided |
| Positive Predictive Power | 0.18 | 0.66 | 0.67 | 0.75 | 0.64 | 0.67 | Not Provided |
| Negative Predictive Power | 0.98 | 0.95 | 0.96 | 0.87 | 0.90 | 0.87 | Not Provided |
| Overall Classification Rate | 0.63 | 0.82 | 0.85 | 0.83 | 0.80 | 0.80 | Not Provided |
| Area Under the Curve (AUC) | 0.77 | 0.93 | 0.94 | 0.89 | 0.88 | 0.85 | Not Provided |
| AUC 95% Confidence Interval Lower | 0.76 | 0.92 | 0.94 | 0.88 | 0.87 | 0.83 | Not Provided |
| AUC 95% Confidence Interval Upper | 0.78 | 0.93 | 0.95 | 0.90 | 0.90 | 0.87 | Not Provided |
| Specificity at 90% Sensitivity | 0.60 | 0.78 | 0.83 | 0.66 | 0.65 | 0.52 | Not Provided |
| Specificity at 80% Sensitivity | Not Provided | 0.90 | 0.91 | 0.77 | 0.82 | 0.76 | Not Provided |
| Specificity at 70% Sensitivity | Not Provided | Not Provided | 0.95 | 0.89 | 0.88 | 0.84 | Not Provided |

 

Subgroup: Hispanic

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | A | D | I | L | A | R | V |
| Base rate: children requiring intensive intervention | 0.09 | 0.27 | 0.26 | 0.29 | 0.27 | 0.27 | 0.33 |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.48 | 0.43 | 0.40 | 0.38 | 0.47 | 0.44 | 0.61 |
| False Positive Rate | 4131.00 | 2182.00 | 2053.00 | 1166.00 | 1442.00 | 1037.00 | 119.00 |
| False Negative Rate | 90.00 | 258.00 | 226.00 | 434.00 | 184.00 | 201.00 | 13.00 |
| Sensitivity | 0.90 | 0.92 | 0.93 | 0.82 | 0.89 | 0.85 | 0.90 |
| Specificity | 0.56 | 0.75 | 0.78 | 0.81 | 0.68 | 0.72 | 0.53 |
| Positive Predictive Power | 0.17 | 0.58 | 0.59 | 0.64 | 0.51 | 0.52 | 0.49 |
| Negative Predictive Power | 0.98 | 0.96 | 0.97 | 0.92 | 0.94 | 0.93 | 0.91 |
| Overall Classification Rate | 0.59 | 0.80 | 0.82 | 0.81 | 0.74 | 0.75 | 0.65 |
| Area Under the Curve (AUC) | 0.76 | 0.92 | 0.94 | 0.89 | 0.89 | 0.88 | 0.86 |
| AUC 95% Confidence Interval Lower | 0.75 | 0.92 | 0.93 | 0.89 | 0.88 | 0.87 | 0.81 |
| AUC 95% Confidence Interval Upper | 0.77 | 0.93 | 0.94 | 0.90 | 0.90 | 0.89 | 0.90 |
| Specificity at 90% Sensitivity | 0.56 | 0.75 | 0.83 | 0.65 | 0.68 | 0.64 | 0.49 |
| Specificity at 80% Sensitivity | Not Provided | 0.89 | 0.87 | 0.81 | 0.84 | 0.77 | 0.72 |
| Specificity at 70% Sensitivity | Not Provided | Not Provided | 0.93 | 0.89 | 0.91 | 0.87 | 0.87 |

 

Cross-Validation Sample

 

Criterion 1: DIBELS Next Composite Score at End of Year

Time of Year: Middle of Year

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
|---|---|---|---|---|---|---|---|
| Cut points | RB | C | H | K | N | Q | U |
| Base rate: children requiring intensive intervention | 0.07 | 0.22 | 0.20 | 0.27 | 0.27 | 0.27 | 0.33 |
| Base rate: children considered at-risk (incl. most intensive needs) | 0.40 | 0.33 | 0.30 | 0.31 | 0.42 | 0.40 | 0.60 |
| False Positive Rate | 0.36 | 0.17 | 0.15 | 0.14 | 0.27 | 0.24 | 0.45 |
| False Negative Rate | 0.10 | 0.11 | 0.10 | 0.23 | 0.15 | 0.19 | 0.11 |
| Sensitivity | 0.90 | 0.89 | 0.90 | 0.77 | 0.85 | 0.81 | 0.89 |
| Specificity | 0.64 | 0.83 | 0.85 | 0.86 | 0.73 | 0.76 | 0.55 |
| Positive Predictive Power | 0.15 | 0.60 | 0.60 | 0.66 | 0.54 | 0.55 | 0.49 |
| Negative Predictive Power | 0.99 | 0.96 | 0.97 | 0.91 | 0.93 | 0.91 | 0.91 |
| Overall Classification Rate | 0.66 | 0.84 | 0.86 | 0.83 | 0.77 | 0.77 | 0.66 |
| Area Under the Curve (AUC) | 0.80 | 0.93 | 0.95 | 0.90 | 0.88 | 0.87 | 0.86 |
| AUC 95% Confidence Interval Lower | 0.79 | 0.93 | 0.94 | 0.89 | 0.88 | 0.86 | 0.82 |
| AUC 95% Confidence Interval Upper | 0.80 | 0.94 | 0.95 | 0.90 | 0.89 | 0.88 | 0.89 |
| Specificity at 90% Sensitivity | 0.64 | 0.83 | 0.85 | 0.61 | 0.66 | 0.56 | 0.49 |
| Specificity at 80% Sensitivity | Not Provided | Not Provided | 0.92 | 0.86 | 0.82 | 0.76 | 0.72 |
| Specificity at 70% Sensitivity | Not Provided | 0.93 | 0.96 | 0.92 | 0.87 | 0.87 | 0.84 |

 

 

Reliability

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Empty bubble |
  1. Justification for each type of reliability reported, given the type and purpose of the tool:

Internal consistency and alternate form reliability: Internal consistency reliability reflects the degree of confidence in the precision of scores from a single measurement; it indicates how much of the variation in test scores is attributable to measurement error. Alternate form reliability indicates the extent to which test results generalize to different forms: alternate forms of the test with different items should give approximately the same scores. There are two to three books per text level in TRC, and books at the same text level are considered alternate forms. An individual student's performance on alternate books at the same text level should yield approximately the same scores on oral reading accuracy, comprehension, and/or retell/recall, as well as overall book performance.

 

Inter-rater reliability evidence for grades K through 6: In observational assessments such as TRC, it is important that student performance be unrelated to or unaffected by a specific test administrator. Because there is a degree of subjectivity in scoring the accuracy of oral reading and comprehension, it is important to examine the degree to which TRC administrators can score student reading accuracy in a standardized and consistent manner. The sources of error associated with inter-rater reliability lie in the assessor.

 

  2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

The internal consistency reliability was computed using TRC data from the 2016-2017 school year, drawn from 9 regions and 19 states, with a total sample size of 2,513. The sample comprises 45% female and 46% male students; students were identified as 23% White, 25% Black or African-American, and 20% Hispanic-Latino. Thirty-nine percent of students were eligible for free or reduced-price lunch.

 

Data from two samples were collected to provide evidence for alternate form reliability. The first sample contains 33 students from kindergarten to Grade 5: 8 from kindergarten, 10 from Grade 1, four from Grade 2, four from Grade 3, two from Grade 4, and five from Grade 5. The sample was 39 percent female and 61 percent male; 9 percent white, 21 percent Hispanic, 67 percent black, and 3 percent of other races. These 33 students were assessed at two schools in two Southern states during the 2013-2014 end-of-year benchmark administration period. The second sample includes 40 students in Grades 4-6 from two schools in two Southern states, assessed during the 2014-2015 middle-of-year benchmark administration period. This sample was composed of students in Grade 4 (n = 15), Grade 5 (n = 15), and Grade 6 (n = 10); 39 percent of the students were female and 61 percent male; 67 percent were black, 21 percent Hispanic, 9 percent white, and 3 percent of other ethnicity.

 

Inter-rater reliability evidence was obtained from two studies. STUDY 1: Three raters assessed 33 students from two schools in two Southern states during the 2013–2014 end-of-year benchmark administration period: 8 from kindergarten, 10 from Grade 1, four from Grade 2, four from Grade 3, two from Grade 4, and five from Grade 5. The sample was 39 percent female and 61 percent male; 9 percent white, 21 percent Hispanic, 67 percent black, and 3 percent of other races.

 

STUDY 2: The second study was conducted between the 2014–2015 beginning-of-year and middle-of-year benchmark administration periods. In total, four raters assessed 40 students from two schools in two Southern states during the 2014–2015 MOY benchmark administration period. The sample was composed of students in Grade 4 (n = 15), Grade 5 (n = 15), and Grade 6 (n = 10); 39 percent of the students were female and 61 percent male; 67 percent were black, 21 percent Hispanic, 9 percent white, and 3 percent of other ethnicity.

 

  3. Description of the analysis procedures for each reported type of reliability:

Cronbach’s alpha is used as the indicator of internal consistency; it quantifies the degree to which the items on an assessment all measure the same underlying construct. To avoid missing responses, the students in each grade who are reading books at the grade-proficient text level are used to compute Cronbach’s alpha. The 95% confidence interval of Cronbach’s alpha is computed using the bootstrap method: 1,000 samples are drawn from the data with replacement, alpha is calculated for each sample, and the 2.5% and 97.5% quantiles are taken as the interval (see the sketch below).
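The sketch below illustrates this procedure on simulated item scores; the helper function, the number of items, and the sample are placeholders, not the TRC data or the analysis code used in the study.

```python
# Sketch of the procedure described above: Cronbach's alpha with a bootstrap
# 95% CI. The item-score matrix here is simulated; it is not the TRC data.
import numpy as np

def cronbach_alpha(items):
    """items: (n_students, n_items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars / total_var)

rng = np.random.default_rng(0)
true_skill = rng.normal(size=(500, 1))
scores = true_skill + rng.normal(scale=0.8, size=(500, 6))   # 6 correlated "items"

alpha = cronbach_alpha(scores)

# Bootstrap: resample students with replacement 1,000 times, recompute alpha,
# and take the 2.5% and 97.5% quantiles as the confidence interval.
boot = [cronbach_alpha(scores[rng.integers(0, len(scores), len(scores))])
        for _ in range(1000)]
ci = np.quantile(boot, [0.025, 0.975])
print(alpha, ci)
```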

 

For alternate form reliability, students were each assessed on two books at their instructional reading level and two books one level below their instructional reading level. Paired t-test comparisons are conducted to examine each component of TRC (accuracy, comprehension, retell/recall) as well as overall book performance.
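The paired-comparison approach can be sketched as follows; the scores on the two alternate-form books are simulated and the scale is hypothetical, so this only illustrates the test, not the study's results.

```python
# Sketch of the paired-comparison approach described above, using simulated
# scores on two alternate-form books at the same level (not the study data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
book_a = rng.normal(loc=90, scale=5, size=12)          # e.g., accuracy on form A
book_b = book_a + rng.normal(loc=0, scale=2, size=12)  # accuracy on form B

# A non-significant paired t-test is consistent with the two forms yielding
# approximately the same scores for the same students.
t_stat, p_value = stats.ttest_rel(book_a, book_b)
print(f"t({len(book_a) - 1}) = {t_stat:.2f}, p = {p_value:.3f}")
```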

 

Raters’ scores are compared using intraclass correlations (ICC). The ICC is one of the most commonly used statistics for assessing IRR on ordinal, interval, or ratio variables and is suitable for studies with two or more coders (Hallgren, 2012). Cicchetti (1994) provides cutoffs for ICC values, with IRR being poor for values less than 0.40, fair for values between 0.40 and 0.59, good for values between 0.60 and 0.74, and excellent for values between 0.75 and 1.00. Cohen’s kappa was also explored for overall book performance. Fleiss (1981) suggested that kappa values greater than 0.75 indicate excellent agreement, 0.40 to 0.75 fair to good, and below 0.40 poor.

 

IRR estimates reported here are based on two or more independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”).
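As a sketch of the inter-rater statistics named above, the example below computes a two-way random-effects ICC from an ANOVA decomposition and Cohen's kappa for a pair of raters. The ratings are simulated and the ICC form (ICC(2,1)) is assumed for illustration; this is not the shadow-scoring study data or analysis code.

```python
# Illustrative inter-rater statistics on simulated ratings (not the study data).
import numpy as np
from sklearn.metrics import cohen_kappa_score

def icc_2_1(ratings):
    """Two-way random-effects, single-rater ICC(2,1); ratings: (n_subjects, n_raters)."""
    x = np.asarray(ratings, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between-subject
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between-rater
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols
    ms_r = ss_rows / (n - 1)
    ms_c = ss_cols / (k - 1)
    ms_e = ss_err / ((n - 1) * (k - 1))
    return (ms_r - ms_e) / (ms_r + (k - 1) * ms_e + k * (ms_c - ms_e) / n)

rng = np.random.default_rng(0)
truth = rng.integers(1, 6, size=20)                      # 20 students on a 1-5 scale
ratings = np.column_stack([np.clip(truth + rng.integers(-1, 2, size=20), 1, 5)
                           for _ in range(3)])           # 3 raters, small disagreements

print("ICC(2,1):", round(icc_2_1(ratings), 2))
# Cohen's kappa compares two raters, treating scores as categories
# (as was done for overall book performance).
print("kappa:", round(cohen_kappa_score(ratings[:, 0], ratings[:, 1]), 2))
```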

 

  4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

| Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval |
|---|---|---|---|---|
| Internal Consistency | K | 563 | 0.86 | (0.84, 0.88) |
| Internal Consistency | 1 | 232 | 0.93 | (0.90, 0.95) |
| Internal Consistency | 2 | 1021 | 0.88 | (0.84, 0.91) |
| Internal Consistency | 3 | 272 | 0.83 | (0.78, 0.86) |
| Internal Consistency | 4 | 218 | 0.89 | (0.80, 0.94) |
| Internal Consistency | 5 | 207 | 0.76 | (0.71, 0.81) |
| Inter-rater | K | 8 | 0.77 | |
| Inter-rater | 1 | 10 | 0.49 | |
| Inter-rater | 2 | 4 | 0.35 | |
| Inter-rater | 3 | 4 | 0.99 | |
| Inter-rater | 4 | 2 | 0.99 | |
| Inter-rater | 5 | 5 | 0.74 | |
| Inter-rater | 4 | 15 | 0.48 | |
| Inter-rater | 5 | 15 | 0.63 | |
| Inter-rater | 6 | 10 | 0.59 | |

 

Alternate Form Reliability based on Sample 1

| Type of Reliability | Age or Grade | Accuracy | Retell/Recall | Oral Comprehension | Overall Book Performance |
|---|---|---|---|---|---|
| Alternate Form | K | t(5) = 0.04, n.s. | t(4) = 0.23, n.s. | t(2) = 1.51, n.s. | t(7) = 1.93, n.s. |
| Alternate Form | 1 | t(12) = –0.08, n.s. | t(2) = 1.73, n.s. | t(11) = –0.20, n.s. | t(12) = –0.56, n.s. |
| Alternate Form | 2 | t(5) = 1.87, n.s. | NA | t(5) = 1.75, n.s. | t(5) = 1.00, n.s. |
| Alternate Form | 3 | t(6) = –1.54, n.s. | NA | t(6) = –0.68, n.s. | t(6) = –1.00, n.s. |
| Alternate Form | 4 | t(3) = 0.29, n.s. | NA | t(3) = –2.32, n.s. | t(3) = –1.00, n.s. |
| Alternate Form | 5 | t(9) = 0.00, n.s. | NA | t(9) = –2.38, p < 0.05 | t(9) = –2.45, p < 0.05 |

Note: n.s. = not significant.

 

Alternate Form Reliability based on Sample 2

| Type of Reliability | Age or Grade | Accuracy | Retell/Recall | Oral Comprehension | Overall Book Performance |
|---|---|---|---|---|---|
| Alternate Form | 4 | t(29) = 1.20, n.s. | NA | t(29) = –0.82, n.s. | t(29) = 0.30, n.s. |
| Alternate Form | 5 | t(29) = 1.43, n.s. | NA | t(29) = 2.47, p < 0.05 | t(29) = 3.64, p < 0.01 |
| Alternate Form | 6 | t(19) = –2.08, n.s. | NA | t(19) = –0.21, n.s. | t(19) = 0.06, n.s. |

Note: n.s. = not significant.

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

| Type of Reliability | Subgroup | Age or Grade | n | Coefficient | Confidence Interval |
|---|---|---|---|---|---|
| Not Provided | | | | | |

 

 

 

 

 

 

Validity

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |
  1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

DIBELS Next measures are brief, powerful indicators of foundational early literacy skills that: are quick to administer and score; serve as universal screening (or benchmark assessment) and progress monitoring; identify students in need of intervention support; evaluate the effectiveness of interventions; and support the RtI/Multi-tiered model. DIBELS Next includes six measures: First Sound Fluency (FSF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), DIBELS Oral Reading Fluency (DORF), and Daze. An overall composite score is calculated based on a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is considered an appropriate criterion measure given the strong reliability and validity evidence demonstrated by various studies (please refer to the DIBELS Next technical manual for details: Good et al., 2013).

 

DIBELS Next was selected as the criterion measure as the Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade appropriate reading skills, and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments.

 

  2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

The predictive validity was computed using TRC data at the middle of the year to predict DIBELS Next scores at the end of the year for the 2016-2017 school year, across 9 regions and 18 states. The sample comprises 47% female and 45% male students; students were identified as 19% White, 22% Black or African-American, and 32% Hispanic-Latino. Thirty-two percent of students were eligible for free or reduced-price lunch.

 

The concurrent validity was computed by correlating TRC data at the end of the year with DIBELS Next scores at the end of the year for the 2016-2017 school year, across 9 regions and 18 states. The sample comprises 47% female and 45% male students; students were identified as 18% White, 22% Black or African-American, and 30% Hispanic-Latino. Twenty-nine percent of students were eligible for free or reduced-price lunch.

 

  3. Description of the analysis procedures for each reported type of validity:

Evidence of concurrent validity is often presented as a correlation between the assessment and an external criterion measure. Instructional reading levels determined from the administration of the Atlas edition of TRC should correlate highly with other accepted procedures and measures of overall reading achievement, including accuracy and comprehension. The degree of correlation between two conceptually related, concurrently administered tests suggests that the tests measure the same underlying psychological constructs or processes. The correlation of the final instructional reading level on TRC with the Composite score on DIBELS Next at the end of the year is computed to provide concurrent validity evidence. The 95% confidence interval of the correlation is computed using the “stats” package in the R software environment (R Development Core Team, 2017).

 

Predictive validity provides an estimate of the extent to which student performance on TRC predicts scores on the criterion measure administered at a later point in time, defined as more than three months later in this study. The correlation of the final instructional reading level on TRC at the middle of the year with the Composite score from the subsequent administration of DIBELS Next at the end of the year is computed to provide predictive validity evidence. The 95% confidence interval of the correlation is computed using the “stats” package in the R software environment (R Development Core Team, 2017).
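The correlations and confidence intervals above were computed in R; the sketch below shows an equivalent calculation in Python, using the Fisher z-transformation for the 95% confidence interval and simulated score pairs rather than the actual TRC and DIBELS Next data.

```python
# Illustrative correlation with a Fisher-z 95% CI on simulated data
# (not the TRC / DIBELS Next samples reported in the tables below).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
trc_level = rng.normal(size=1000)                             # TRC level, numerically coded
dibels = 0.8 * trc_level + rng.normal(scale=0.6, size=1000)   # DIBELS Next composite

r, _ = stats.pearsonr(trc_level, dibels)

# Fisher z-transform: z = atanh(r) is approximately normal with SE = 1/sqrt(n - 3).
n = len(trc_level)
z = np.arctanh(r)
half_width = 1.96 / np.sqrt(n - 3)
ci = np.tanh([z - half_width, z + half_width])
print(round(r, 3), np.round(ci, 3))
```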

  

 

  4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

| Type of Validity | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval |
|---|---|---|---|---|---|
| Concurrent Validity | K | DIBELS Next Composite Score | 51004 | 0.71 | 0.70, 0.71 |
| Concurrent Validity | 1 | DIBELS Next Composite Score | 50175 | 0.82 | 0.82, 0.83 |
| Concurrent Validity | 2 | DIBELS Next Composite Score | 45193 | 0.80 | 0.79, 0.80 |
| Concurrent Validity | 3 | DIBELS Next Composite Score | 19826 | 0.77 | 0.77, 0.78 |
| Concurrent Validity | 4 | DIBELS Next Composite Score | 9618 | 0.77 | 0.76, 0.78 |
| Concurrent Validity | 5 | DIBELS Next Composite Score | 5924 | 0.74 | 0.73, 0.75 |
| Concurrent Validity | 6 | DIBELS Next Composite Score | 519 | 0.77 | 0.73, 0.80 |
| Predictive Validity | K | DIBELS Next Composite Score | 44495 | 0.59 | 0.59, 0.60 |
| Predictive Validity | 1 | DIBELS Next Composite Score | 46797 | 0.79 | 0.78, 0.79 |
| Predictive Validity | 2 | DIBELS Next Composite Score | 43897 | 0.79 | 0.79, 0.80 |
| Predictive Validity | 3 | DIBELS Next Composite Score | 19446 | 0.76 | 0.75, 0.77 |
| Predictive Validity | 4 | DIBELS Next Composite Score | 10289 | 0.76 | 0.75, 0.77 |
| Predictive Validity | 5 | DIBELS Next Composite Score | 7733 | 0.73 | 0.72, 0.74 |
| Predictive Validity | 6 | DIBELS Next Composite Score | 567 | 0.76 | 0.72, 0.79 |

 

  5. Results for other forms of validity (e.g., factor analysis) not conducive to the table format:

Not Provided

 

  6. Describe the degree to which the provided data support the validity of the tool:

The table above summarizes the concurrent and predictive validity evidence for each grade. Across Grades K to 6, concurrent validity coefficients range from 0.71 to 0.82, demonstrating strong correlations between the final instructional reading level on TRC and the DIBELS Next Composite score at the end of the year; the lower bounds of the 95% confidence intervals are all at or above 0.70. Across Grades K to 6, predictive validity coefficients range from 0.59 to 0.79, and the lower bounds of the 95% confidence intervals are above 0.70 for Grades 1 to 6. The correlation with the DIBELS Next Composite score is slightly lower in kindergarten than in the other grades, possibly because text levels are much less variable in the lower grades due to a floor effect in kindergarten.

 

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

| Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval |
|---|---|---|---|---|---|---|
| Concurrent | White | K | DIBELS Next Composite Score | 10419 | 0.71 | 0.70, 0.72 |
| Concurrent | Black | K | DIBELS Next Composite Score | 10812 | 0.70 | 0.69, 0.71 |
| Concurrent | Hispanic | K | DIBELS Next Composite Score | 12061 | 0.70 | 0.69, 0.71 |
| Concurrent | White | 1 | DIBELS Next Composite Score | 10118 | 0.79 | 0.78, 0.80 |
| Concurrent | Black | 1 | DIBELS Next Composite Score | 11336 | 0.84 | 0.84, 0.85 |
| Concurrent | Hispanic | 1 | DIBELS Next Composite Score | 12798 | 0.83 | 0.82, 0.83 |
| Concurrent | White | 2 | DIBELS Next Composite Score | 9330 | 0.76 | 0.75, 0.77 |
| Concurrent | Black | 2 | DIBELS Next Composite Score | 10362 | 0.82 | 0.82, 0.83 |
| Concurrent | Hispanic | 2 | DIBELS Next Composite Score | 12172 | 0.80 | 0.80, 0.81 |
| Concurrent | White | 3 | DIBELS Next Composite Score | 2735 | 0.77 | 0.75, 0.78 |
| Concurrent | Black | 3 | DIBELS Next Composite Score | 4185 | 0.78 | 0.76, 0.79 |
| Concurrent | Hispanic | 3 | DIBELS Next Composite Score | 8203 | 0.78 | 0.77, 0.79 |
| Concurrent | White | 4 | DIBELS Next Composite Score | 704 | 0.74 | 0.70, 0.77 |
| Concurrent | Black | 4 | DIBELS Next Composite Score | 2010 | 0.77 | 0.75, 0.78 |
| Concurrent | Hispanic | 4 | DIBELS Next Composite Score | 5909 | 0.79 | 0.78, 0.80 |
| Concurrent | White | 5 | DIBELS Next Composite Score | 358 | 0.74 | 0.69, 0.78 |
| Concurrent | Black | 5 | DIBELS Next Composite Score | 1127 | 0.74 | 0.71, 0.76 |
| Concurrent | Hispanic | 5 | DIBELS Next Composite Score | 3847 | 0.77 | 0.75, 0.78 |
| Concurrent | White | 6 | DIBELS Next Composite Score | 6 | 0.98 | 0.82, 1 |
| Concurrent | Black | 6 | DIBELS Next Composite Score | 19 | 0.78 | 0.51, 0.91 |
| Concurrent | Hispanic | 6 | DIBELS Next Composite Score | 337 | 0.78 | 0.73, 0.82 |
| Predictive | White | K | DIBELS Next Composite Score | 9684 | 0.61 | 0.60, 0.62 |
| Predictive | Black | K | DIBELS Next Composite Score | 9491 | 0.58 | 0.57, 0.59 |
| Predictive | Hispanic | K | DIBELS Next Composite Score | 10335 | 0.55 | 0.54, 0.57 |
| Predictive | White | 1 | DIBELS Next Composite Score | 9725 | 0.77 | 0.76, 0.78 |
| Predictive | Black | 1 | DIBELS Next Composite Score | 10826 | 0.80 | 0.79, 0.80 |
| Predictive | Hispanic | 1 | DIBELS Next Composite Score | 12125 | 0.79 | 0.78, 0.80 |
| Predictive | White | 2 | DIBELS Next Composite Score | 9243 | 0.77 | 0.76, 0.77 |
| Predictive | Black | 2 | DIBELS Next Composite Score | 10303 | 0.82 | 0.81, 0.83 |
| Predictive | Hispanic | 2 | DIBELS Next Composite Score | 12407 | 0.80 | 0.80, 0.81 |
| Predictive | White | 3 | DIBELS Next Composite Score | 2846 | 0.77 | 0.75, 0.78 |
| Predictive | Black | 3 | DIBELS Next Composite Score | 4345 | 0.77 | 0.76, 0.78 |
| Predictive | Hispanic | 3 | DIBELS Next Composite Score | 8469 | 0.76 | 0.76, 0.77 |
| Predictive | White | 4 | DIBELS Next Composite Score | 847 | 0.75 | 0.72, 0.78 |
| Predictive | Black | 4 | DIBELS Next Composite Score | 2172 | 0.78 | 0.76, 0.79 |
| Predictive | Hispanic | 4 | DIBELS Next Composite Score | 6236 | 0.77 | 0.76, 0.78 |
| Predictive | White | 5 | DIBELS Next Composite Score | 548 | 0.72 | 0.68, 0.76 |
| Predictive | Black | 5 | DIBELS Next Composite Score | 1516 | 0.74 | 0.71, 0.76 |
| Predictive | Hispanic | 5 | DIBELS Next Composite Score | 4992 | 0.75 | 0.73, 0.76 |
| Predictive | White | 6 | DIBELS Next Composite Score | 6 | 0.97 | 0.71, 1 |
| Predictive | Black | 6 | DIBELS Next Composite Score | 21 | 0.86 | 0.67, 0.94 |
| Predictive | Hispanic | 6 | DIBELS Next Composite Score | 381 | 0.76 | 0.71, 0.80 |

 

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format:

Not Provided

Sample Representativeness

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

Primary Classification Accuracy Sample

Not Provided

 

 

Cross Validation Sample

| Characteristic | Value |
|---|---|
| Representation | Not Provided |
| Date | 2016-2017 school year |
| Size | 173,224 |
| Male | 47.00% |
| Female | 45.00% |
| Unknown | 8.00% |
| Free or reduced-price lunch | 32.00% |
| White, Non-Hispanic | 19.00% |
| Black, Non-Hispanic | 22.00% |
| Hispanic | 32.00% |
| American Indian/Alaska Native | 0.40% |
| Asian/Pacific Islander | 3.00% |
| Other | 3.00% |
| Unknown | 21.00% |
| Disability classification | Not Provided |
| First language | Not Provided |
| Language proficiency status | Not Provided |

 

Bias Analysis Conducted

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Rating | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
  1. Description of the method used to determine the presence or absence of bias:

The classification analyses previously described were disaggregated by subgroup to determine whether the assessment functions similarly across subgroups.
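As a sketch of this disaggregation, the example below computes the AUC separately for each subgroup and prints the results for comparison. The subgroup labels, scores, and outcomes are simulated for illustration only; they are not the study data.

```python
# Illustrative only: AUC computed per subgroup on simulated data.
import numpy as np
import pandas as pd
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "subgroup": rng.choice(["White", "Black", "Hispanic"], size=600),
    "at_risk": rng.integers(0, 2, size=600),      # 1 = at risk on the criterion
})
df["score"] = df["at_risk"] * 1.5 + rng.normal(size=600)   # screener score

# Similar AUCs across subgroups suggest the screener discriminates risk
# comparably for each group.
for name, group in df.groupby("subgroup"):
    print(name, round(roc_auc_score(group["at_risk"], group["score"]), 3))
```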

 

  2. Description of the subgroups for which bias analyses were conducted:

Data were disaggregated for the following groups: White, Black, and Hispanic.

 

  3. Description of the results of the bias analyses conducted, including data and interpretative statements:

AUC results are similar across subgroups.  

 

Administration Format

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Data | Group, Individual | Group, Individual | Group, Individual | Group, Individual | Group, Individual | Group, Individual | Group, Individual |

Administration & Scoring Time

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Data | 5-8 minutes | 5-8 minutes | 5-8 minutes | 5-8 minutes | 5-8 minutes | 5-8 minutes | 5-8 minutes |

Scoring Format

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Data | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic |

Types of Decision Rules

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Data | None | None | None | None | None | None | None |

Evidence Available for Multiple Decision Rules

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| Data | No | No | No | No | No | No | No |