i-Ready Diagnostics

Reading

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

$6.00/student/year for i-Ready Diagnostic for reading. The annual license fee includes online student access to the assessment, plus staff access to the management and reporting suite, downloadable lesson plans, and user resources including the i-Ready Central support website; account set-up and secure hosting; all program maintenance, updates, and enhancements during the active license term; and unlimited user access to U.S.-based service and support via toll-free phone and email during business hours. Professional development is required and available at an additional cost ($2,000/session, up to six hours).

 

Replacement Cost:

License renewal fees subject to change annually.

 

Included in Cost:

i‑Ready Diagnostic is a fully web-based, vendor-hosted, Software-as-a-Service application. The per-student or site-based license fee includes account set-up and management; unlimited access to i-Ready’s assessment, management, and reporting functionality; plus unlimited access to U.S.-based customer service/technical support and all program maintenance, updates, and enhancements for as long as the license remains active. The license fee also includes hosting, data storage, and data security.

 

Via the i-Ready teacher and administrator dashboards and i-Ready Central support website, educators may access comprehensive user guides and downloadable lesson plans, as well as implementation tips, best practices, video tutorials, and more to supplement onsite, fee-based professional development. These resources are self-paced and available 24/7.

 

Technology Requirements:

  • Computer or Tablet
  • Internet connection

 

Training Requirements:

  • 4-8 hours of training

 

Qualified Administrators:

  • Professionals
  • Paraprofessionals

 

Accommodations:

Curriculum Associates engaged an independent consultant to evaluate i‑Ready Diagnostic’s accessibility. Overall, the report found that i-Ready “materials included significant functionality that indirectly supports… students with disabilities.” All items in i-Ready Diagnostic are designed to be accessible for most students. In a majority of cases, students who require accommodations (e.g., large print, extra time) will not require additional help during administration. The intentional integration of accessible design features should aid most students who typically require testing accommodations.

To address the elements of Universal Design as they apply to large-scale assessment (http://www.cehd.umn.edu/nceo/onlinepubs/Synthesis44.html), Curriculum Associates considered several issues related to accommodations when developing i-Ready Diagnostic. Most may be grouped into the following general categories that i‑Ready addresses:

  • Timing—Students may need extra time to complete the task. The Diagnostic assessment may be stopped and started as needed to allow students needing extra time to finish. The Diagnostic is untimed and can be administered in multiple test sessions. In fact, to ensure accurate results, a time limit is not recommended for any student, though administration must be completed within a period of no longer than 22 days.
  • Flexible Scheduling—Students may need multiple days to complete the assessment. i-Ready recommends that all students be given multiple days, as necessary, to complete the test (as noted above, administration must be completed within a period of no longer than 22 days).
  • Accommodated Presentation of Material—All i-Ready Diagnostic items are presented in a large, easily legible format specifically chosen for its readability. i‑Ready currently offers the ability to change the screen size; with the coming HTML5 items slated for a future release, users will be able to adjust the font size. There is only one item on the screen at a time. As appropriate to the skill(s) being assessed, some grade levels K–2 reading items also offer optional audio support.
  • Setting—Students may need to complete the task in a quiet room to minimize distraction. This can easily be done, as i-Ready Diagnostic is available on any computer with internet access that meets the technical requirements. Furthermore, all students are encouraged to use quality headphones in order to hear the audio portion of the items. Headphones also help to cancel out peripheral noise, which can be distracting to students.
  • Response Accommodation—Students should be able to control a mouse. They only need to be able to move a cursor with the mouse and be able to point, click, and drag. We are moving toward iPad® compatibility (see updates at www.i-Ready.com/support), with a beta expected in 2017-2018. This would mean touchscreen, which is potentially easier for those with motor impairments.

Where to Obtain:

Website: www.curriculumassociates.com              

Address: 153 Rangeway Road, N. Billerica MA 01862

Phone number: 800-225-0248              

Email address: info@cainc.com


Access to Technical Support:

Dedicated account manager plus unlimited access to in-house technical support during business hours

Offering a continuum of scale scores from kindergarten through high school, i‑Ready Diagnostic is a web-based adaptive screening assessment for reading which has been aligned with state and Common Core standards. i-Ready meets the expected rigor in each of the covered domains—Phonological Awareness, Phonics, High-Frequency Words, Vocabulary, Comprehension of Informational Text, and Comprehension of Literature—providing data and reports for each domain. Screening is administered up to three times per academic year, with 12-18 weeks of instruction between assessments. Each screening takes approximately 30-60 minutes—which may be broken into multiple sittings—and may be conducted with all students or with specific groups of students who have been identified as at risk of academic failure. i-Ready’s sophisticated adaptive algorithm automatically selects from thousands of technology-enhanced and multiple-choice items to get to the core of each student's strengths and challenges, regardless of the grade level at which he or she is performing.

 

The system automatically analyzes, scores, and reports student responses and results. Available as soon as a student completes the assessment, i‑Ready’s intuitive reports provide comprehensive information (including developmental analyses) about student performance, group students who struggle with the same concepts, make instructional recommendations to target skill deficiencies, and monitor progress and growth as students follow their individualized instructional paths. Reports include suggested next steps for instruction and PDF Tools for Instruction lesson plans for the teacher to use during individual, small-group, or whole-class instruction. In addition, should educators also purchase the optional i‑Ready Instruction, the system automatically prescribes online lessons that address each student’s identified academic needs.

 

 

Assessment Format:

  • Direct: Computerized

 

Administration Time:

  • 30–60 minutes per student

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

i-Ready Diagnostic scale scores are linear transformations of logit values. Logits, also known as “log-odds units,” are measurement units for logarithmic probability models such as the Rasch model. Logits are used to express both student ability and item difficulty. Within the Rasch model, if the student's ability matches the item difficulty, then the student has a .50 chance of answering the item correctly. For i-Ready Diagnostic, student ability and item logit values generally range from around -6 to 6.
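The relationship described above can be sketched in a few lines of Python. This is only an illustration of the standard Rasch formula and of a generic linear rescaling; the `slope` and `intercept` arguments are hypothetical placeholders, since the actual transformation constants are not given here.

```python
import math


def rasch_probability(ability_logit: float, difficulty_logit: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(ability_logit - difficulty_logit)))


def to_scale_score(logit: float, slope: float, intercept: float) -> float:
    """Linear transformation from logits to scale scores.

    slope and intercept are hypothetical; the real constants are not
    stated in this document.
    """
    return slope * logit + intercept


# Ability equal to item difficulty gives a 0.50 chance of success.
p_matched = rasch_probability(1.2, 1.2)

# Ability one logit above the item difficulty gives roughly 0.73.
p_above = rasch_probability(2.2, 1.2)
```

Because the transformation is linear, differences between students are preserved in direction and proportion when moving from logits to scale scores.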

 

Scores Generated:

  • Percentile score
  • IRT-based score
  • Developmental benchmarks
  • Lexile score
  • On-grade achievement level placements

 

 

Classification Accuracy

Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Criterion 1 Fall | Half-filled bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble
Criterion 1 Winter | dash | dash | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble
Criterion 1 Spring | Half-filled bubble | dash | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble
Criterion 2 Fall | dash | dash | dash | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble
Criterion 2 Winter | dash | dash | dash | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble
Criterion 2 Spring | dash | dash | dash | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

Primary Sample

 

Criterion 1: K-2: DIBELS NEXT; 3-8: Smarter Balanced

Time of Year: Fall

 

Metric | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Cut points | 328 | 370 | 421 | 463 | 486 | 509 | 528 | 542 | 555
Base rate in the sample for children requiring intensive intervention | 0.17 | 0.18 | 0.22 | 0.18 | 0.18 | 0.19 | 0.21 | 0.17 | 0.21
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.17 | 0.18 | 0.22 | 0.18 | 0.18 | 0.19 | 0.21 | 0.17 | 0.21
False Positive Rate | 0.28 | 0.18 | 0.08 | 0.11 | 0.10 | 0.11 | 0.10 | 0.13 | 0.11
False Negative Rate | 0.37 | 0.28 | 0.25 | 0.21 | 0.21 | 0.19 | 0.18 | 0.13 | 0.19
Sensitivity | 0.63 | 0.72 | 0.75 | 0.79 | 0.79 | 0.81 | 0.82 | 0.87 | 0.81
Specificity | 0.72 | 0.82 | 0.92 | 0.89 | 0.90 | 0.89 | 0.90 | 0.87 | 0.89
Positive Predictive Power | 0.32 | 0.46 | 0.72 | 0.61 | 0.63 | 0.64 | 0.67 | 0.58 | 0.66
Negative Predictive Power | 0.90 | 0.93 | 0.93 | 0.95 | 0.95 | 0.95 | 0.95 | 0.97 | 0.94
Overall Classification Rate | 0.71 | 0.80 | 0.88 | 0.87 | 0.88 | 0.88 | 0.88 | 0.87 | 0.87
Area Under the Curve (AUC) | 0.75 | 0.87 | 0.93 | 0.93 | 0.94 | 0.94 | 0.94 | 0.94 | 0.94
AUC 95% Confidence Interval Lower Bound | 0.72 | 0.85 | 0.92 | 0.93 | 0.93 | 0.94 | 0.94 | 0.93 | 0.93
AUC 95% Confidence Interval Upper Bound | 0.78 | 0.89 | 0.94 | 0.94 | 0.94 | 0.95 | 0.95 | 0.94 | 0.95
At 90% Sensitivity, specificity equals | 0.32 | 0.56 | 0.79 | 0.76 | 0.78 | 0.82 | 0.80 | 0.80 | 0.81
At 80% Sensitivity, specificity equals | 0.55 | 0.75 | 0.91 | 0.91 | 0.92 | 0.92 | 0.91 | 0.91 | 0.91
At 70% Sensitivity, specificity equals | 0.68 | 0.86 | 0.95 | 0.96 | 0.97 | 0.96 | 0.97 | 0.95 | 0.96
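The predictive-power and overall-classification figures in a classification accuracy table of this kind follow from the base rate, sensitivity, and specificity. The sketch below is a generic illustration of that arithmetic, not the vendor's procedure; the function name is hypothetical. It uses the Grade 2 fall values reported above (base rate 0.22, sensitivity 0.75, specificity 0.92), and because those inputs are already rounded, the results agree with the reported 0.72, 0.93, and 0.88 only to within rounding.

```python
def predictive_values(base_rate: float, sensitivity: float, specificity: float) -> dict:
    """Derive predictive power and overall accuracy from rates.

    All quantities are expressed as proportions of the whole sample.
    """
    true_pos = sensitivity * base_rate
    false_neg = (1.0 - sensitivity) * base_rate
    true_neg = specificity * (1.0 - base_rate)
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    return {
        "ppv": true_pos / (true_pos + false_pos),      # positive predictive power
        "npv": true_neg / (true_neg + false_neg),      # negative predictive power
        "overall": true_pos + true_neg,                # overall classification rate
    }


# Grade 2, fall, primary sample (rounded inputs taken from the table).
grade2_fall = predictive_values(base_rate=0.22, sensitivity=0.75, specificity=0.92)
```

The same arithmetic explains why positive predictive power drops sharply when the base rate is very low, even if sensitivity and specificity stay moderate.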

 

Criterion 1: K-2: DIBELS NEXT; 3-8: Smarter Balanced

Time of Year: Winter

 

Metric | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Cut points | 347 | 397 | 444 | 480 | 500 | 520 | 539 | 552 | 562
Base rate in the sample for children requiring intensive intervention | Not Provided | Not Provided | 0.13 | 0.16 | 0.17 | 0.18 | 0.17 | 0.17 | 0.19
Base rate in the sample for children considered at-risk, including those with the most intensive needs | Not Provided | Not Provided | 0.13 | 0.16 | 0.17 | 0.18 | 0.17 | 0.17 | 0.19
False Positive Rate | Not Provided | Not Provided | 0.13 | 0.11 | 0.10 | 0.11 | 0.10 | 0.12 | 0.11
False Negative Rate | Not Provided | Not Provided | 0.15 | 0.18 | 0.17 | 0.17 | 0.18 | 0.18 | 0.18
Sensitivity | Not Provided | Not Provided | 0.85 | 0.82 | 0.83 | 0.83 | 0.82 | 0.82 | 0.82
Specificity | Not Provided | Not Provided | 0.87 | 0.89 | 0.90 | 0.89 | 0.90 | 0.88 | 0.89
Positive Predictive Power | Not Provided | Not Provided | 0.49 | 0.59 | 0.63 | 0.63 | 0.63 | 0.59 | 0.63
Negative Predictive Power | Not Provided | Not Provided | 0.98 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 0.95
Overall Classification Rate | Not Provided | Not Provided | 0.87 | 0.88 | 0.89 | 0.88 | 0.88 | 0.87 | 0.88
Area Under the Curve (AUC) | Not Provided | Not Provided | 0.93 | 0.93 | 0.94 | 0.93 | 0.94 | 0.93 | 0.93
AUC 95% Confidence Interval Lower Bound | Not Provided | Not Provided | 0.92 | 0.93 | 0.94 | 0.93 | 0.93 | 0.92 | 0.93
AUC 95% Confidence Interval Upper Bound | Not Provided | Not Provided | 0.95 | 0.94 | 0.95 | 0.94 | 0.94 | 0.93 | 0.94
At 90% Sensitivity, specificity equals | Not Provided | Not Provided | 0.93 | 0.79 | 0.82 | 0.80 | 0.80 | 0.76 | 0.79
At 80% Sensitivity, specificity equals | Not Provided | Not Provided | 0.89 | 0.91 | 0.93 | 0.90 | 0.91 | 0.88 | 0.90
At 70% Sensitivity, specificity equals | Not Provided | Not Provided | 0.95 | 0.95 | 0.97 | 0.95 | 0.96 | 0.94 | 0.95

 

Criterion 1: K-2: DIBELS NEXT; 3-8: Smarter Balanced

Time of Year: Spring

 

Metric | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Cut points | 367 | 416 | 464 | 491 | 505 | 526 | 543 | 553 | 567
Base rate in the sample for children requiring intensive intervention | 0.03 | Not Provided | 0.07 | 0.16 | 0.17 | 0.18 | 0.18 | 0.20 | 0.25
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.03 | Not Provided | 0.07 | 0.16 | 0.17 | 0.18 | 0.18 | 0.20 | 0.25
False Positive Rate | 0.29 | Not Provided | 0.18 | 0.11 | 0.10 | 0.11 | 0.11 | 0.18 | 0.23
False Negative Rate | 0.31 | Not Provided | 0.11 | 0.16 | 0.14 | 0.17 | 0.16 | 0.10 | 0.08
Sensitivity | 0.69 | Not Provided | 0.89 | 0.84 | 0.86 | 0.83 | 0.84 | 0.90 | 0.92
Specificity | 0.71 | Not Provided | 0.82 | 0.89 | 0.90 | 0.89 | 0.89 | 0.82 | 0.77
Positive Predictive Power | 0.06 | Not Provided | 0.26 | 0.61 | 0.63 | 0.63 | 0.63 | 0.56 | 0.57
Negative Predictive Power | 0.99 | Not Provided | 0.99 | 0.97 | 0.97 | 0.96 | 0.96 | 0.97 | 0.97
Overall Classification Rate | 0.71 | Not Provided | 0.83 | 0.88 | 0.89 | 0.88 | 0.88 | 0.83 | 0.81
Area Under the Curve (AUC) | 0.80 | Not Provided | 0.92 | 0.94 | 0.95 | 0.94 | 0.94 | 0.93 | 0.93
AUC 95% Confidence Interval Lower Bound | 0.77 | Not Provided | 0.90 | 0.94 | 0.94 | 0.93 | 0.94 | 0.92 | 0.92
AUC 95% Confidence Interval Upper Bound | 0.83 | Not Provided | 0.93 | 0.95 | 0.95 | 0.94 | 0.95 | 0.93 | 0.94
At 90% Sensitivity, specificity equals | 0.47 | Not Provided | 0.75 | 0.82 | 0.84 | 0.79 | 0.81 | 0.77 | 0.77
At 80% Sensitivity, specificity equals | 0.62 | Not Provided | 0.89 | 0.94 | 0.94 | 0.91 | 0.92 | 0.88 | 0.89
At 70% Sensitivity, specificity equals | 0.94 | Not Provided | 0.93 | 0.97 | 0.97 | 0.95 | 0.96 | 0.93 | 0.94
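The rows labeled "At 90% Sensitivity, specificity equals" (and likewise for 80% and 70%) report the specificity obtained when the cut score is moved until sensitivity reaches the stated level. The source does not describe how these values were computed, so the following is only a sketch of one common approach, on toy data, under the assumption that students scoring below the cut are flagged as at risk. The function name and the data are hypothetical.

```python
def specificity_at_sensitivity(scores, at_risk, target_sensitivity):
    """Return (cut, specificity) for the best cut meeting the sensitivity target.

    A student is flagged when score < cut (an assumption for this sketch).
    Among all cuts whose sensitivity is at least the target, the one with
    the highest specificity is returned.
    """
    positives = [s for s, risk in zip(scores, at_risk) if risk]
    negatives = [s for s, risk in zip(scores, at_risk) if not risk]
    best = None
    # Candidate cuts: every observed score, plus one above the maximum.
    for cut in sorted(set(scores)) + [max(scores) + 1]:
        sensitivity = sum(s < cut for s in positives) / len(positives)
        specificity = sum(s >= cut for s in negatives) / len(negatives)
        if sensitivity >= target_sensitivity and (best is None or specificity > best[1]):
            best = (cut, specificity)
    return best


# Toy data: ten scale scores and whether each student was truly at risk.
toy_scores = [300, 310, 320, 330, 340, 350, 360, 370, 380, 390]
toy_at_risk = [True, True, True, False, True, False, False, False, False, False]

at_75 = specificity_at_sensitivity(toy_scores, toy_at_risk, 0.75)
at_100 = specificity_at_sensitivity(toy_scores, toy_at_risk, 1.0)
```

In the toy data, demanding higher sensitivity forces a higher cut and lowers specificity, which is the trade-off these three rows summarize.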

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Cross-Validation Sample

 

Criterion: K-2: DIBELS NEXT; 3-8: New York State Testing Program

Time of Year: Fall

 

Metric | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Cut points | 328 | 370 | 421 | 463 | 486 | 509 | 528 | 542 | 555
Base rate in the sample for children requiring intensive intervention | 0.24 | 0.17 | 0.18 | 0.23 | 0.23 | 0.19 | 0.22 | 0.24 | 0.27
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.24 | 0.17 | 0.18 | 0.23 | 0.23 | 0.19 | 0.22 | 0.24 | 0.27
False Positive Rate | 0.30 | 0.15 | 0.06 | 0.07 | 0.13 | 0.20 | 0.16 | 0.11 | 0.10
False Negative Rate | 0.39 | 0.30 | 0.29 | 0.30 | 0.22 | 0.06 | 0.17 | 0.22 | 0.29
Sensitivity | 0.61 | 0.70 | 0.71 | 0.70 | 0.78 | 0.94 | 0.83 | 0.78 | 0.71
Specificity | 0.70 | 0.85 | 0.94 | 0.93 | 0.87 | 0.80 | 0.84 | 0.89 | 0.90
Positive Predictive Power | 0.39 | 0.49 | 0.73 | 0.75 | 0.64 | 0.53 | 0.59 | 0.69 | 0.72
Negative Predictive Power | 0.85 | 0.93 | 0.94 | 0.91 | 0.93 | 0.98 | 0.95 | 0.93 | 0.89
Overall Classification Rate | 0.68 | 0.83 | 0.90 | 0.88 | 0.85 | 0.82 | 0.84 | 0.86 | 0.85
Area Under the Curve (AUC) | 0.73 | 0.88 | 0.94 | 0.94 | 0.91 | 0.93 | 0.91 | 0.92 | 0.92
AUC 95% Confidence Interval Lower Bound | 0.71 | 0.87 | 0.94 | 0.93 | 0.90 | 0.91 | 0.90 | 0.91 | 0.90
AUC 95% Confidence Interval Upper Bound | 0.74 | 0.89 | 0.95 | 0.95 | 0.93 | 0.94 | 0.93 | 0.94 | 0.93
At 90% Sensitivity, specificity equals | 0.31 | 0.61 | 0.81 | 0.78 | 0.70 | 0.76 | 0.72 | 0.76 | 0.71
At 80% Sensitivity, specificity equals | 0.48 | 0.78 | 0.94 | 0.90 | 0.84 | 0.88 | 0.86 | 0.90 | 0.86
At 70% Sensitivity, specificity equals | 0.62 | 0.88 | 0.97 | 0.98 | 0.93 | 0.93 | 0.93 | 0.95 | 0.93

 

Criterion: K-2: DIBELS NEXT; 3-8: New York State Testing Program

Time of Year: Winter

 

Metric | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Cut points | 347 | 397 | 444 | 480 | 500 | 520 | 539 | 550 | 562
Base rate in the sample for children requiring intensive intervention | Not provided | Not provided | 0.09 | 0.21 | 0.22 | 0.22 | 0.22 | 0.22 | 0.25
Base rate in the sample for children considered at-risk, including those with the most intensive needs | Not provided | Not provided | 0.09 | 0.21 | 0.22 | 0.22 | 0.22 | 0.22 | 0.25
False Positive Rate | Not provided | Not provided | 0.10 | 0.08 | 0.14 | 0.19 | 0.17 | 0.12 | 0.11
False Negative Rate | Not provided | Not provided | 0.16 | 0.25 | 0.22 | 0.13 | 0.18 | 0.24 | 0.30
Sensitivity | Not provided | Not provided | 0.85 | 0.75 | 0.78 | 0.87 | 0.82 | 0.76 | 0.70
Specificity | Not provided | Not provided | 0.90 | 0.92 | 0.86 | 0.91 | 0.83 | 0.88 | 0.89
Positive Predictive Power | Not provided | Not provided | 0.47 | 0.72 | 0.60 | 0.56 | 0.57 | 0.64 | 0.58
Negative Predictive Power | Not provided | Not provided | 0.98 | 0.93 | 0.93 | 0.96 | 0.94 | 0.93 | 0.90
Overall Classification Rate | Not provided | Not provided | 0.90 | 0.89 | 0.84 | 0.82 | 0.83 | 0.85 | 0.84
Area Under the Curve (AUC) | Not provided | Not provided | 0.95 | 0.94 | 0.91 | 0.91 | 0.90 | 0.91 | 0.91
AUC 95% Confidence Interval Lower Bound | Not provided | Not provided | 0.94 | 0.93 | 0.90 | 0.89 | 0.88 | 0.88 | 0.87
AUC 95% Confidence Interval Upper Bound | Not provided | Not provided | 0.95 | 0.95 | 0.92 | 0.92 | 0.91 | 0.93 | 0.92
At 90% Sensitivity, specificity equals | Not provided | Not provided | 0.84 | 0.79 | 0.69 | 0.73 | 0.67 | 0.72 | 0.68
At 80% Sensitivity, specificity equals | Not provided | Not provided | 0.93 | 0.92 | 0.84 | 0.83 | 0.83 | 0.90 | 0.83
At 70% Sensitivity, specificity equals | Not provided | Not provided | 0.97 | 0.96 | 0.93 | 0.91 | 0.91 | 0.93 | 0.91

 

Criterion: K-2: DIBELS NEXT; 3-8: New York State Testing Program

Time of Year: Spring

 

Metric | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
Cut points | 367 | 416 | 464 | 491 | 505 | 526 | 543 | 553 | 567
Base rate in the sample for children requiring intensive intervention | 0.24 | Not provided | 0.18 | 0.21 | 0.22 | 0.20 | 0.19 | 0.24 | 0.24
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.24 | Not provided | 0.18 | 0.21 | 0.22 | 0.20 | 0.19 | 0.24 | 0.24
False Positive Rate | 0.30 | Not provided | 0.06 | 0.08 | 0.13 | 0.18 | 0.17 | 0.12 | 0.12
False Negative Rate | 0.39 | Not provided | 0.29 | 0.25 | 0.20 | 0.17 | 0.14 | 0.23 | 0.30
Sensitivity | 0.61 | Not provided | 0.71 | 0.75 | 0.80 | 0.83 | 0.86 | 0.77 | 0.70
Specificity | 0.70 | Not provided | 0.94 | 0.92 | 0.87 | 0.82 | 0.83 | 0.88 | 0.88
Positive Predictive Power | 0.39 | Not provided | 0.73 | 0.71 | 0.63 | 0.55 | 0.54 | 0.67 | 0.64
Negative Predictive Power | 0.85 | Not provided | 0.94 | 0.93 | 0.94 | 0.95 | 0.96 | 0.92 | 0.90
Overall Classification Rate | 0.68 | Not provided | 0.90 | 0.88 | 0.85 | 0.83 | 0.83 | 0.86 | 0.83
Area Under the Curve (AUC) | 0.73 | Not provided | 0.94 | 0.94 | 0.92 | 0.90 | 0.90 | 0.91 | 0.90
AUC 95% Confidence Interval Lower Bound | 0.71 | Not provided | 0.94 | 0.93 | 0.91 | 0.89 | 0.89 | 0.89 | 0.88
AUC 95% Confidence Interval Upper Bound | 0.74 | Not provided | 0.95 | 0.95 | 0.93 | 0.92 | 0.91 | 0.82 | 0.92
At 90% Sensitivity, specificity equals | 0.31 | Not provided | 0.81 | 0.79 | 0.72 | 0.71 | 0.68 | 0.74 | 0.64
At 80% Sensitivity, specificity equals | 0.48 | Not provided | 0.94 | 0.92 | 0.87 | 0.84 | 0.82 | 0.85 | 0.84
At 70% Sensitivity, specificity equals | 0.62 | Not provided | 0.97 | 0.96 | 0.93 | 0.91 | 0.90 | 0.91 | 0.91

 

Reliability

Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

1. Justification for each type of reliability reported, given the type and purpose of the tool:

The i-Ready Diagnostic provides two types of reliability estimates:

  • IRT-based reliability measures such as the marginal reliability estimate and standard error of measurement.
  • Test-retest reliability coefficients.

 

Marginal Reliability:

Given that the i-Ready Diagnostic is a computer-adaptive assessment that does not have a fixed form, some traditional reliability estimates such as Cronbach’s alpha are not an appropriate index for quantifying consistency or inconsistency in student performance. The IRT analogue to classical reliability is called marginal reliability, and operates on the variance of the theta scores and the average of the expected error variance. The marginal reliability uses the classical definition of reliability as proportion of variance in the total observed score due to true score under an IRT model (the i-Ready Diagnostic uses a Rasch model to be specific).

 

Standard Error of Measurement (SEM):

In an IRT model, SEMs are affected by factors such as how well the data fit the underlying model, student response consistency, student location on the ability continuum, match of items to student ability, and test length.  Given the adaptive nature of i-Ready and the wide difficulty range in the item bank, standard errors are expected to be low and very close to the theoretical minimum for the test of the given length.

 

The theoretical minimum would be reached if each interim estimate of student ability is assessed by an item with difficulty matching perfectly to the student’s ability estimated from previous items. Theoretical minimums are restricted by the number of items served in the assessment—the more items that are served up, the lower the SEM could potentially be. For ELA, the minimum SEM for overall scores is 8.9.
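The idea of a theoretical minimum can be made concrete with the standard Rasch result that an item contributes information p(1 - p), which peaks at 0.25 when the item difficulty equals the student's ability. The sketch below works in logits only; converting to scale-score units would require the scale's linear slope, which is not given here, so no attempt is made to reproduce the 8.9 figure. The function name is hypothetical.

```python
import math


def min_sem_logits(n_items: int) -> float:
    """Smallest possible SEM (in logits) for a Rasch test of n_items.

    Assumes every item is perfectly targeted, so each contributes the
    maximum information of 0.25; SEM is 1 / sqrt(total information).
    """
    return 1.0 / math.sqrt(0.25 * n_items)


# Longer tests lower the floor: quadrupling the length halves the SEM.
sem_25_items = min_sem_logits(25)    # 0.4 logits
sem_100_items = min_sem_logits(100)  # 0.2 logits
```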

 

Beyond the mean SEM by subject and grade, the graphical representations of the conditional standard errors of measurement (CSEM) provide additional evidence of the precision with which i-Ready measures student ability across the operational score scale. The figures included on pages 25–27 better contextualize the table of reliability analyses. In the context of model-based reliability analyses for computer-adaptive tests, such as i-Ready, CSEM plots permit test users to judge the relative precision of the estimate.

 

For additional context, these figures mark the scale scores associated with the 1st and 99th percentile ranks to show the score range that contains most (98%) of students.

 

Test-retest Reliability:

The i-Ready Diagnostic is often used as an interim assessment, and students can take the assessment multiple times a year. Therefore, the test-retest reliability estimate is appropriate to provide stability estimates for the same students who took two Diagnostic tests.

 

 

2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

Data for the marginal reliability and SEM estimates came from the August and September 2016 administrations of the i-Ready Diagnostic (reported in the 2016 i-Ready Diagnostic technical report). All students tested within the timeframe were included. Sample sizes by grade are presented in the table below (under question 4).

 

Evidence of test-retest stability was assessed based on a subsample of students who, during the 2016–2017 school year, took i-Ready Diagnostic twice within the recommended 12–18-week testing window. The average testing interval is 106 days (15 weeks). Sample sizes by grade are presented in the table below (under question 4).

 

 

3. Description of the analysis procedures for each reported type of reliability:

This marginal reliability uses the classical definition of reliability as proportion of variance in the total observed score due to true score. The true score variance is computed as the observed score variance minus the error variance.

 

Similar to a classical reliability coefficient, the marginal reliability estimate increases as the standard error decreases; it approaches 1 when the standard error approaches 0.

 

The observed score variance, the error variance, and SEM (the square root of the error variance) are obtained through WINSTEPS calibrations. One separate calibration was conducted for each grade.
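The definition above (true score variance as observed variance minus error variance, divided by observed variance) can be sketched as follows. This is a generic illustration, not the WINSTEPS computation; the function name and the numbers in the example are hypothetical.

```python
def marginal_reliability(observed_variance: float, sems: list) -> float:
    """Marginal reliability from observed score variance and per-student SEMs.

    Error variance is taken as the mean of the squared SEMs; true score
    variance is the observed variance minus that error variance.
    """
    error_variance = sum(sem ** 2 for sem in sems) / len(sems)
    true_variance = observed_variance - error_variance
    return true_variance / observed_variance


# Hypothetical example: score SD of 50 (variance 2500) and SEMs of 10.
example = marginal_reliability(2500.0, [10.0, 10.0, 10.0])  # 0.96

# With no measurement error the estimate reaches 1.
perfect = marginal_reliability(2500.0, [0.0])
```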

 

For test-retest reliability, Pearson correlation coefficients were calculated between scores on the two Diagnostic tests. In lower grades, where growth and variability are expected to be higher, test-retest correlations are expected to be relatively lower.
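A Pearson correlation and an approximate 95% confidence interval can be computed as sketched below. The source does not say how its test-retest confidence intervals were obtained; the Fisher z-transformation used here is an assumption, chosen because it is a common method. Function names are hypothetical. Applied to the kindergarten test-retest values reported in the reliability table (r = 0.701, n = 120,194), it yields an interval that rounds to 0.698 and 0.704.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r: float, n: int, z_crit: float = 1.96):
    """Approximate confidence interval for r via the Fisher z-transformation."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Kindergarten test-retest coefficient and sample size from the table.
low, high = fisher_ci(0.701, 120194)
```

With samples this large the interval is very narrow, which is why the reported bounds differ from the coefficient only in the third decimal place.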

 

 

4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval
Marginal | K | 184,261 | 0.91 | n/a*
Marginal | 1 | 287,593 | 0.95 | n/a*
Marginal | 2 | 323,280 | 0.96 | n/a*
Marginal | 3 | 343,103 | 0.97 | n/a*
Marginal | 4 | 337,854 | 0.97 | n/a*
Marginal | 5 | 341,292 | 0.97 | n/a*
Marginal | 6 | 249,454 | 0.97 | n/a*
Marginal | 7 | 224,530 | 0.97 | n/a*
Marginal | 8 | 222,503 | 0.97 | n/a*
Test-retest | K | 120,194 | 0.701 | 0.698, 0.704
Test-retest | 1 | 166,187 | 0.826 | 0.824, 0.827
Test-retest | 2 | 181,997 | 0.852 | 0.850, 0.853
Test-retest | 3 | 209,427 | 0.854 | 0.853, 0.855
Test-retest | 4 | 204,577 | 0.861 | 0.860, 0.862
Test-retest | 5 | 202,922 | 0.862 | 0.861, 0.863
Test-retest | 6 | 144,272 | 0.860 | 0.859, 0.861
Test-retest | 7 | 126,128 | 0.855 | 0.853, 0.856
Test-retest | 8 | 119,647 | 0.853 | 0.851, 0.855
SEM | K | 184,261 | 9.30 | n/a*
SEM | 1 | 287,593 | 9.33 | n/a*
SEM | 2 | 323,280 | 10.38 | n/a*
SEM | 3 | 343,103 | 10.11 | n/a*
SEM | 4 | 337,854 | 10.14 | n/a*
SEM | 5 | 341,292 | 10.35 | n/a*
SEM | 6 | 249,454 | 10.51 | n/a*
SEM | 7 | 224,530 | 10.61 | n/a*
SEM | 8 | 222,503 | 10.71 | n/a*

* n/a: Confidence intervals are not applicable to marginal reliability estimates or SEMs due to how they are calculated for our computer-adaptive assessment. CSEM demonstrating relative measurement precision across the i-Ready score scale are available from NCII upon request.

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

Type of Reliability | Subgroup | Age or Grade | n | Coefficient | Confidence Interval
Split-half | Asian | 1 | 531 | 0.80 | n/a*
Split-half | African American | 1 | 2,665 | 0.75 | n/a*
Split-half | Hispanic | 1 | 2,246 | 0.77 | n/a*
Split-half | Asian | 2 | 549 | 0.86 | n/a*
Split-half | African American | 2 | 2,990 | 0.81 | n/a*
Split-half | Hispanic | 2 | 2,289 | 0.79 | n/a*
Split-half | Asian | 3 | 468 | 0.83 | n/a*
Split-half | African American | 3 | 2,881 | 0.80 | n/a*
Split-half | Hispanic | 3 | 2,269 | 0.80 | n/a*
Split-half | Asian | 4 | 439 | 0.80 | n/a*
Split-half | African American | 4 | 1,977 | 0.77 | n/a*
Split-half | Hispanic | 4 | 1,577 | 0.76 | n/a*
Split-half | Asian | 5 | 370 | 0.79 | n/a*
Split-half | African American | 5 | 1,612 | 0.78 | n/a*
Split-half | Hispanic | 5 | 1,249 | 0.79 | n/a*
Split-half | Asian | 6 | 247 | 0.83 | n/a*
Split-half | African American | 6 | 515 | 0.78 | n/a*
Split-half | Hispanic | 6 | 639 | 0.74 | n/a*
Split-half | African American | 7 | 254 | 0.76 | n/a*
Split-half | Hispanic | 7 | 278 | 0.81 | n/a*
Split-half | African American | 8 | 234 | 0.88 | n/a*
Split-half | Hispanic | 8 | 198 | 0.83 | n/a*

* n/a: Confidence intervals are not applicable to split-half reliability estimates due to how they are calculated for our computer-adaptive assessment.  Although some modeling approaches exist that yield confidence intervals for adaptive tests, the psychometric field does not currently have an agreed-upon approach and instead favors the reporting of reliability point estimates for adaptive assessments (as is done here).  If specific reliability techniques are favored for this application, Curriculum Associates is happy to provide these on request.

Validity

Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
Rating | Half-filled bubble | Half-filled bubble | Half-filled bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

The internal structure of the i-Ready Diagnostic assessments is supported by the construct maps and the ordering of the skills addressed at different stages on the map. We recognize that coverage of skills and difficulty of items will overlap a fair amount across grades, as much material is reviewed from year to year. However, what should be apparent from the estimated item difficulties is that, generally, items measuring skills targeting lower levels of the map should be easier, and items measuring skills targeting higher levels of the map should be more difficult.

 

2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Active items in the item pool for the 2016–2017 school year are included in the analysis for internal validity. The number of items per grade is listed in the table below.

 

3. Description of the analysis procedures for each reported type of validity:

Distributions of indicator difficulties by grade level provide further evidence of internal structure. The difficulty of an indicator corresponds to a 67% probability of passing on the Indicator Characteristic Curve aggregated across all items aligned to the indicator. The table below shows the average and standard deviation of indicator difficulties.
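One way to read the 67% definition is sketched below. It assumes, as an interpretation rather than a documented procedure, that the Indicator Characteristic Curve is the average of the Rasch curves of the items aligned to the indicator, and it works in logits rather than scale scores. The function names are hypothetical. The indicator difficulty is then the ability at which that averaged curve reaches 0.67, found here by bisection.

```python
import math


def indicator_curve(theta: float, item_difficulties: list) -> float:
    """Average Rasch probability of success across an indicator's items."""
    probs = [1.0 / (1.0 + math.exp(-(theta - b))) for b in item_difficulties]
    return sum(probs) / len(probs)


def indicator_difficulty(item_difficulties: list, target: float = 0.67,
                         lo: float = -10.0, hi: float = 10.0) -> float:
    """Ability (in logits) at which the averaged curve equals the target.

    Uses bisection, which works because the curve increases with ability.
    """
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if indicator_curve(mid, item_difficulties) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# For a single item of difficulty 0, the answer is ln(0.67 / 0.33),
# about 0.71 logits above the item's difficulty.
single_item = indicator_difficulty([0.0])
```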

 

4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Validity | Age or Grade | Indicator Difficulty (Mean) | Indicator Difficulty (SD) | Number of Items
Internal | K | 383.48 | 29.65 | 439
Internal | 1 | 440.77 | 37.41 | 430
Internal | 2 | 502.63 | 40.37 | 316
Internal | 3 | 524.97 | 33.99 | 302
Internal | 4 | 562.71 | 21.72 | 225
Internal | 5 | 583.54 | 19.13 | 224
Internal | 6 | 601.60 | 17.77 | 244
Internal | 7 | 616.77 | 19.70 | 253
Internal | 8 | 627.24 | 14.34 | 253

 

 

 

Type of Validity | Age or Grade | Test or Criterion | n | corr | 95% CI Lower Bound | 95% CI Upper Bound
Concurrent/Predictive | K | Lexile* | 840 | 0.88 | 0.86 | 0.89
Concurrent/Predictive | 1 | Lexile* | 840 | 0.88 | 0.86 | 0.89
Concurrent/Predictive | 2 | Lexile* | 840 | 0.88 | 0.86 | 0.89
Predictive | 3 | PARCC | 5609 | 0.79 | 0.78 | 0.80
Predictive | 4 | PARCC | 5881 | 0.82 | 0.81 | 0.82
Predictive | 5 | PARCC | 5530 | 0.80 | 0.79 | 0.81
Predictive | 6 | PARCC | 4022 | 0.79 | 0.78 | 0.80
Predictive | 7 | PARCC | 3925 | 0.79 | 0.78 | 0.80
Predictive | 8 | PARCC | 3721 | 0.78 | 0.77 | 0.80
Concurrent | 3 | NC | 7603 | 0.83 | 0.82 | 0.83
Concurrent | 4 | NC | 7415 | 0.83 | 0.82 | 0.84
Concurrent | 5 | NC | 7505 | 0.82 | 0.81 | 0.83
Concurrent | 6 | NC | 5205 | 0.82 | 0.81 | 0.83
Concurrent | 7 | NC | 5685 | 0.81 | 0.80 | 0.82
Concurrent | 8 | NC | 5282 | 0.79 | 0.78 | 0.80
Concurrent | 3 | MS | 3260 | 0.81 | 0.80 | 0.82
Concurrent | 4 | MS | 3717 | 0.76 | 0.74 | 0.77
Concurrent | 5 | MS | 3380 | 0.79 | 0.77 | 0.80
Concurrent | 6 | MS | 3305 | 0.81 | 0.80 | 0.82
Concurrent | 7 | MS | 2291 | 0.81 | 0.80 | 0.82
Concurrent | 8 | MS | 2106 | 0.80 | 0.78 | 0.81
Concurrent | 3 | OH | 3025 | 0.76 | 0.74 | 0.77
Concurrent | 4 | OH | 2696 | 0.78 | 0.76 | 0.79
Concurrent | 5 | OH | 2693 | 0.78 | 0.76 | 0.79
Concurrent | 6 | OH | 1865 | 0.78 | 0.76 | 0.79
Concurrent | 7 | OH | 1607 | 0.77 | 0.75 | 0.79
Concurrent | 8 | OH | 1488 | 0.71 | 0.68 | 0.73

* For the purposes of the Lexile study referenced above, grade-banded results are featured, rather than grade-specific results.  The i-Ready Diagnostic reading scale scores are created on a vertical scale which makes the scale scores comparable across grades.  Thus, for efficiency purposes, the linking sample for the Lexile study includes only students from every other grade (i.e., grades 1, 3, 5, and 7), but results are generalized across grades in various grade bands (e.g., K-2).  Additional information on the Lexile study, which was conducted in concert with MetaMetrics, is available upon request. 

 

5. Results for other forms of validity (e.g. factor analysis) not conducive to the table format:

None provided  

 

6. Describe the degree to which the provided data support the validity of the tool:

The internal structure of the i-Ready Diagnostic assessments is supported by the construct maps and the ordering of the skills addressed at different stages on the map. Skills representing the lower levels on the construct map are those generally associated with items targeted at lower grade levels, and skills representing the higher levels on the map are ones generally associated with items targeted at higher grade levels.

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

| Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval |
| --- | --- | --- | --- | --- | --- | --- |
| None | | | | | | |

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format:

None provided

Sample Representativeness

Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
Data (all grades):
  • National with Cross-Validation

    Primary Classification Accuracy Sample

    Representation: National (East North Central, South Atlantic, Mountain, West North Central, Pacific).  The analyses featured here include six states total, with three states yielding data for the K-2 analysis and three different states yielding data for the 3-8 analysis.  Data for the K-2 analyses come from the 2016-17 school year, which is the most recent school year available at the time of the analysis.  Note also that the data for the 3-8 analysis is from the Smarter Balanced Assessment Consortium (SBAC) program and has been sampled and stratified to reflect the representation of all SBAC states during the 2015-16 school year (the most recent data available at the time of the analysis), and builds upon our 2014-15 SBAC study that used a slightly different sample of states.  The same SBAC stratified sample used for this analysis was also used for the i-Ready prediction model, which is a model used by all i-Ready schools in the 14 SBAC states to predict proficiency on SBAC using the i-Ready assessment.

    Date: Spring 2017 for K-2; Spring 2016 for 3-8

    Size: 38,568 for Grade 3-8.  5,010 for Grade K-2.

     

     

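    | | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Male | 52% | 52% | 53% | 49% | 49% | 48% | 48% | 48% | 49% |
    | Female | 48% | 48% | 47% | 51% | 51% | 52% | 52% | 52% | 51% |
    | Unknown (gender) | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
    | SES | Unknown | Unknown | Unknown | 27% | 27% | 20% | 10% | 20% | 22% |
    | White | 14% | 12% | 13% | 40% | 40% | 38% | 28% | 30% | 29% |
    | Black or African American | 5% | 4% | 5% | 6% | 6% | 6% | 4% | 4% | 4% |
    | Hispanic | 14% | 7% | 7% | 27% | 28% | 27% | 25% | 25% | 25% |
    | American Indian or Alaskan | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
    | Asian | 0% | 0% | 0% | 14% | 16% | 15% | 11% | 15% | 14% |
    | Native Hawaiian or P. Islander | 0% | 0% | 0% | 1% | 1% | 1% | 1% | 1% | 1% |
    | Other | 0% | 1% | 1% | 4% | 4% | 4% | 2% | 2% | 2% |
    | Unknown (race/ethnicity) | 80% | 78% | 75% | 7% | 6% | 10% | 28% | 24% | 25% |
    | Disability classification | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
    | First language | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
    | ELL | 15% | 20% | 26% | 25% | 19% | 15% | 10% | 8% | 8% |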

     

    Cross Validation Sample

    Representation: National (East North Central, South Atlantic, Mountain, West North Central, Pacific).  The analyses featured here include two states, with one state yielding data for the K-2 analysis and a different state yielding data for the 3-8 analysis.  Data for the K-2 analyses come from the 2016-17 school year, which is the most recent school year available at the time of the analysis.  For the K-2 analysis, data came from one of the states used in the classification analysis featured above, but from a different district within the state that was specifically selected to determine the degree to which the results from the classification analyses were generalizable when the cut scores were applied to a different sample.  Note also that the data for the 3-8 analysis is from the New York State Testing Program and has been sampled and stratified to reflect the representation of all New York State districts during the 2015-16 school year (the most recent data available at the time of the analysis), and builds upon our 2014-15 New York State study that used a slightly different sample of districts.  The same New York State stratified sample used for this analysis was also used for the i-Ready prediction model, which is a model used by all i-Ready schools in New York State to predict proficiency on the New York State Testing Program using the i-Ready assessment.

    Date: Spring 2017 for K-2; Spring 2016 for 3-8

    Size: 12,974 for Grade 3-8.  8,140 for Grade K-2.

     

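    | | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Male | 33% | 45% | 44% | 49% | 51% | 49% | 53% | 44% | 53% |
    | Female | 28% | 43% | 46% | 51% | 49% | 51% | 47% | 55% | 47% |
    | Unknown (gender) | 39% | 12% | 10% | 0% | 0% | 0% | 0% | 0% | 0% |
    | FRPL | Unknown | Unknown | Unknown | 11% | 9% | 19% | 20% | 22% | 61% |
    | White | 33% | 45% | 33% | 5% | 4% | 3% | 3% | 1% | 0% |
    | Black or African American | 8% | 13% | 10% | 19% | 21% | 20% | 19% | 19% | 18% |
    | Hispanic | 17% | 20% | 14% | 19% | 17% | 18% | 17% | 17% | 17% |
    | American Indian or Alaskan | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 1% | 0% |
    | Asian | 0% | 1% | 0% | 12% | 11% | 12% | 10% | 12% | 11% |
    | Native Hawaiian or P. Islander | 0% | 0% | 0% | 44% | 46% | 47% | 51% | 50% | 54% |
    | Other | 2% | 2% | 2% | 0% | 0% | 1% | 0% | 0% | 0% |
    | Unknown (race/ethnicity) | 56% | 39% | 54% | 0% | 0% | 0% | 0% | 1% | 0% |
    | Disability classification | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
    | First language | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
    | ELL | Unknown | Unknown | Unknown | 28% | 30% | 32% | 35% | 46% | 45% |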

     

    Bias Analysis Conducted

    Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
    Rating (all grades): Yes
    1. Description of the method used to determine the presence or absence of bias:

    Differential Item Functioning (DIF) was investigated using WINSTEPS® by comparing the item difficulty measure for two demographic categories in a pairwise comparison through a combined calibration analysis. The essence of this methodology is to investigate the interaction of the person-groups with each item, while fixing all other item and person measures to those from the combined calibration. The method used to detect DIF is based on the Mantel-Haenszel (MH) procedure and the work of Linacre & Wright (1989) and Linacre (2012). Typically, the group representing test takers in a specific demographic group is referred to as the focal group. The group made up of test takers from outside this group is referred to as the reference group. For example, for gender, Female is the focal group, and Male is the reference group.
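As a rough illustration of the Mantel-Haenszel idea named above, the sketch below computes the textbook MH common odds ratio for one item from 2×2 counts in each ability stratum. It is not the WINSTEPS implementation: the function name, the input layout, and the example counts are all hypothetical. The conversion `delta = -2.35 * ln(alpha)` is the standard ETS delta scale; since 1.0/2.35 ≈ 0.43 and 1.5/2.35 ≈ 0.64, the cut points reported later in this section appear to be the ETS delta cut points expressed in logits.

```python
import math

def mantel_haenszel_dif(strata):
    """Return (alpha, log_odds, delta) for one item.

    strata: list of (ref_right, ref_wrong, focal_right, focal_wrong)
    counts, one tuple per ability stratum (hypothetical input layout).
    """
    num = 0.0
    den = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        n = ref_right + ref_wrong + focal_right + focal_wrong
        if n == 0:
            continue  # skip empty strata
        num += ref_right * focal_wrong / n
        den += ref_wrong * focal_right / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio undefined for these counts")
    alpha = num / den            # MH common odds ratio (reference vs. focal)
    log_odds = math.log(alpha)   # DIF size in logits
    delta = -2.35 * log_odds     # same quantity on the ETS delta scale
    return alpha, log_odds, delta

# Hypothetical counts for two ability strata (not i-Ready data).
example = [(30, 10, 20, 20), (45, 5, 40, 10)]
alpha, log_odds, delta = mantel_haenszel_dif(example)
print(f"alpha={alpha:.2f}, logits={log_odds:.2f}, delta={delta:.2f}")
```

With this convention, an odds ratio above 1 means the reference group has higher odds of answering correctly at matched ability, which gives a negative delta, that is, DIF against the focal group.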

     

    2. Description of the subgroups for which bias analyses were conducted:

    The latest large-scale DIF analysis included a random sample (10%) of students from the 2015–2016 i-Ready operational data. Given the large size of the 2015–2016 i-Ready student population, it is practical to carry out the calibration analysis with a random sample. The following demographic categories were compared: Female vs. Male; African American and Hispanic vs. Caucasian; English Learner vs. non–English Learner; Special Ed vs. General Ed; Economically Disadvantaged vs. Not Economically Disadvantaged. In each pairwise comparison, estimates of item difficulty for each category in the comparison were calculated. The table below presents the total number (in thousands) and percentage of students included in the DIF analysis.

    | DIF Group | DIF Variable | N Count (thousands) | Percent |
    | --- | --- | --- | --- |
    | Gender | Male | 258.4 | 52.0 |
    | | Female* | 238.8 | 48.0 |
    | Ethnicity | Caucasian | 129.2 | 36.6 |
    | | African American and Hispanic* | 224.2 | 63.4 |
    | EL | non–English Learner | 250.8 | 81.2 |
    | | English Learner* | 58.2 | 18.8 |
    | Special Ed | General Ed | 165.8 | 85.7 |
    | | Special Ed* | 27.6 | 14.3 |
    | Economic Status | Not Economically Disadvantaged | 177.8 | 69.0 |
    | | Economically Disadvantaged* | 80.0 | 31.1 |
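The Percent column above is each group's share of the students in its pairwise comparison. A minimal sketch of that arithmetic follows, using the Gender counts from the table; the function name is hypothetical. Because the published counts are rounded to the nearest hundred students, shares recomputed this way may differ from the published figures in the last decimal place.

```python
def group_shares(counts):
    """Percent of students in each group of one pairwise comparison."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# Gender counts (in thousands) taken from the table above.
gender = {"Male": 258.4, "Female": 238.8}
print(group_shares(gender))
```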

     

     

    3. Description of the results of the bias analyses conducted, including data and interpretative statements:

    Active items in the current item pool for the 2016–2017 school year are included in the DIF analysis. The total number of ELA items is 3,649. WINSTEPS (Version 3.92) was used to conduct the calibration for DIF analysis by grade. To help interpret the results, the Educational Testing Service (ETS) criteria using the delta method were used to categorize DIF (Zwick, Thayer, & Lewis, 1999) and are presented below:

     

    ETS DIF Category: Definition
  • A (negligible): |DIF| < 0.43
  • B (moderate): |DIF| ≥ 0.43 and |DIF| < 0.64
  • C (large): |DIF| ≥ 0.64

    B- or C- suggests DIF against the focal group.

    B+ or C+ suggests DIF against the reference group.
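The ETS categories above can be applied mechanically to a signed DIF value. The sketch below is one way to do so; the function name is hypothetical, and it assumes that a negative DIF value corresponds to the "-" label (DIF against the focal group), a sign convention this section does not state explicitly.

```python
def ets_dif_category(dif):
    """Map a signed DIF value to an ETS category label."""
    size = abs(dif)
    if size < 0.43:
        return "A"  # negligible; no direction attached
    level = "B" if size < 0.64 else "C"
    # Assumption: negative values indicate DIF against the focal group.
    direction = "-" if dif < 0 else "+"
    return level + direction

for value in (0.10, -0.50, 0.64, -0.90):
    print(value, ets_dif_category(value))
```

Values exactly at a cut point fall into the higher category, matching the "≥" in the definitions.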

     

     

    The number and percentage of items exhibiting DIF for each of the demographic categories are reported by grade in the table below. The majority of ELA items show negligible DIF (more than 90 percent in most comparisons), and very few items show large DIF (level C): each C category accounts for less than 3 percent of items in every comparison except ELL in grades 7 and 8.

     

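    | Grade | ETS DIF Category | Gender N | Gender Percent | Ethnicity N | Ethnicity Percent | ELL N | ELL Percent | Special Education N | Special Education Percent | Economically Disadvantaged N | Economically Disadvantaged Percent |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | 0 (K) | A | 1,315 | 97.4 | 1,227 | 96.1 | 1,106 | 96.9 | 408 | 96.0 | 1,160 | 98.3 |
    | | B+ | 9 | 0.7 | 12 | 0.9 | 10 | 0.9 | 5 | 1.2 | 5 | 0.4 |
    | | B- | 11 | 0.8 | 31 | 2.4 | 19 | 1.7 | 10 | 2.4 | 13 | 1.1 |
    | | C+ | 4 | 0.3 | 2 | 0.2 | 2 | 0.2 | 1 | 0.2 | 0 | 0.0 |
    | | C- | 11 | 0.8 | 5 | 0.4 | 4 | 0.4 | 1 | 0.2 | 2 | 0.2 |
    | | Total | 1,350 | 100.0 | 1,277 | 100.0 | 1,141 | 100.0 | 425 | 100.0 | 1,180 | 100.0 |
    | 1 | A | 1,741 | 96.5 | 1,686 | 95.8 | 1,435 | 95.1 | 967 | 94.7 | 1,562 | 97.4 |
    | | B+ | 15 | 0.8 | 35 | 2.0 | 22 | 1.5 | 23 | 2.3 | 13 | 0.8 |
    | | B- | 40 | 2.2 | 27 | 1.5 | 29 | 1.9 | 20 | 2.0 | 18 | 1.1 |
    | | C+ | 4 | 0.2 | 7 | 0.4 | 16 | 1.1 | 6 | 0.6 | 4 | 0.2 |
    | | C- | 5 | 0.3 | 5 | 0.3 | 7 | 0.5 | 5 | 0.5 | 6 | 0.4 |
    | | Total | 1,805 | 100.0 | 1,760 | 100.0 | 1,509 | 100.0 | 1,021 | 100.0 | 1,603 | 100.0 |
    | 2 | A | 1,886 | 95.3 | 1,766 | 95.2 | 1,668 | 93.1 | 1,094 | 93.0 | 1,868 | 96.4 |
    | | B+ | 35 | 1.8 | 49 | 2.6 | 44 | 2.5 | 35 | 3.0 | 28 | 1.4 |
    | | B- | 48 | 2.4 | 30 | 1.6 | 46 | 2.6 | 26 | 2.2 | 26 | 1.3 |
    | | C+ | 5 | 0.3 | 7 | 0.4 | 21 | 1.2 | 16 | 1.4 | 11 | 0.6 |
    | | C- | 4 | 0.2 | 4 | 0.2 | 12 | 0.7 | 5 | 0.4 | 5 | 0.3 |
    | | Total | 1,978 | 100.0 | 1,856 | 100.0 | 1,791 | 100.0 | 1,176 | 100.0 | 1,938 | 100.0 |
    | 3 | A | 2,337 | 94.7 | 2,047 | 95.1 | 1,718 | 91.2 | 1,251 | 89.7 | 2,122 | 95.4 |
    | | B+ | 44 | 1.8 | 52 | 2.4 | 54 | 2.9 | 54 | 3.9 | 38 | 1.7 |
    | | B- | 63 | 2.6 | 38 | 1.8 | 69 | 3.7 | 50 | 3.6 | 39 | 1.8 |
    | | C+ | 14 | 0.6 | 9 | 0.4 | 15 | 0.8 | 22 | 1.6 | 17 | 0.8 |
    | | C- | 9 | 0.4 | 6 | 0.3 | 28 | 1.5 | 18 | 1.3 | 9 | 0.4 |
    | | Total | 2,467 | 100.0 | 2,152 | 100.0 | 1,884 | 100.0 | 1,395 | 100.0 | 2,225 | 100.0 |
    | 4 | A | 2,386 | 95.3 | 2,000 | 96.3 | 1,863 | 89.7 | 1,552 | 91.8 | 2,208 | 96.4 |
    | | B+ | 58 | 2.3 | 39 | 1.9 | 63 | 3.0 | 36 | 2.1 | 30 | 1.3 |
    | | B- | 29 | 1.2 | 25 | 1.2 | 80 | 3.8 | 54 | 3.2 | 25 | 1.1 |
    | | C+ | 20 | 0.8 | 10 | 0.5 | 26 | 1.3 | 26 | 1.5 | 14 | 0.6 |
    | | C- | 11 | 0.4 | 2 | 0.1 | 46 | 2.2 | 23 | 1.4 | 14 | 0.6 |
    | | Total | 2,504 | 100.0 | 2,076 | 100.0 | 2,078 | 100.0 | 1,691 | 100.0 | 2,291 | 100.0 |
    | 5 | A | 2,280 | 95.0 | 2,130 | 96.1 | 1,907 | 89.3 | 1,551 | 90.8 | 2,246 | 97.0 |
    | | B+ | 41 | 1.7 | 43 | 1.9 | 79 | 3.7 | 50 | 2.9 | 29 | 1.3 |
    | | B- | 51 | 2.1 | 29 | 1.3 | 77 | 3.6 | 71 | 4.2 | 27 | 1.2 |
    | | C+ | 18 | 0.8 | 12 | 0.5 | 30 | 1.4 | 18 | 1.1 | 9 | 0.4 |
    | | C- | 9 | 0.4 | 2 | 0.1 | 42 | 2.0 | 18 | 1.1 | 4 | 0.2 |
    | | Total | 2,399 | 100.0 | 2,216 | 100.0 | 2,135 | 100.0 | 1,708 | 100.0 | 2,315 | 100.0 |
    | 6 | A | 2,135 | 92.6 | 1,921 | 94.1 | 1,561 | 86.2 | 1,520 | 90.1 | 2,120 | 95.5 |
    | | B+ | 54 | 2.3 | 62 | 3.0 | 80 | 4.4 | 64 | 3.8 | 39 | 1.8 |
    | | B- | 81 | 3.5 | 43 | 2.1 | 96 | 5.3 | 69 | 4.1 | 41 | 1.8 |
    | | C+ | 25 | 1.1 | 10 | 0.5 | 39 | 2.2 | 14 | 0.8 | 7 | 0.3 |
    | | C- | 10 | 0.4 | 5 | 0.2 | 34 | 1.9 | 20 | 1.2 | 14 | 0.6 |
    | | Total | 2,305 | 100.0 | 2,041 | 100.0 | 1,810 | 100.0 | 1,687 | 100.0 | 2,221 | 100.0 |
    | 7 | A | 2,307 | 91.8 | 1,970 | 92.5 | 1,476 | 82.1 | 1,582 | 87.6 | 2,227 | 94.7 |
    | | B+ | 76 | 3.0 | 66 | 3.1 | 109 | 6.1 | 83 | 4.6 | 54 | 2.3 |
    | | B- | 90 | 3.6 | 63 | 3.0 | 100 | 5.6 | 93 | 5.1 | 48 | 2.0 |
    | | C+ | 26 | 1.0 | 26 | 1.2 | 58 | 3.2 | 22 | 1.2 | 8 | 0.3 |
    | | C- | 15 | 0.6 | 5 | 0.2 | 54 | 3.0 | 26 | 1.4 | 14 | 0.6 |
    | | Total | 2,514 | 100.0 | 2,130 | 100.0 | 1,797 | 100.0 | 1,806 | 100.0 | 2,351 | 100.0 |
    | 8 | A | 2,280 | 89.1 | 1,930 | 93.2 | 1,412 | 78.5 | 1,599 | 87.0 | 2,209 | 94.4 |
    | | B+ | 95 | 3.7 | 50 | 2.4 | 120 | 6.7 | 76 | 4.1 | 54 | 2.3 |
    | | B- | 127 | 5.0 | 51 | 2.5 | 114 | 6.3 | 95 | 5.2 | 53 | 2.3 |
    | | C+ | 33 | 1.3 | 24 | 1.2 | 81 | 4.5 | 27 | 1.5 | 12 | 0.5 |
    | | C- | 25 | 1.0 | 15 | 0.7 | 71 | 3.9 | 40 | 2.2 | 11 | 0.5 |
    | | Total | 2,560 | 100.0 | 2,070 | 100.0 | 1,798 | 100.0 | 1,837 | 100.0 | 2,339 | 100.0 |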
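Each Percent value in the DIF results table is a category's item count divided by the Total row for that grade and comparison. The sketch below reproduces that arithmetic for the kindergarten Gender comparison, using the counts reported above; the function and variable names are hypothetical.

```python
def category_percentages(counts):
    """Percent of items in each ETS DIF category for one grade and comparison."""
    total = sum(counts.values())
    return {cat: round(100 * n / total, 1) for cat, n in counts.items()}

# Kindergarten (grade 0), Gender comparison, from the table above.
kindergarten_gender = {"A": 1315, "B+": 9, "B-": 11, "C+": 4, "C-": 11}
percentages = category_percentages(kindergarten_gender)
print(percentages)

# Combined share of items with large DIF (C+ and C- together).
large_dif = kindergarten_gender["C+"] + kindergarten_gender["C-"]
print(round(100 * large_dif / sum(kindergarten_gender.values()), 1))
```

Summing the C+ and C- counts before dividing, as in the last line, gives the overall share of large-DIF items for a comparison without accumulating rounding error from the two published percentages.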

     

    Administration Format

    Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
    Data (all grades):
  • Individual
  • Group

    Administration & Scoring Time

    Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
    Data (all grades):
  • 30-60 minutes

    Scoring Format

    Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
    Data (all grades):
  • Automatic

    Types of Decision Rules

    Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
    Data (all grades):
  • None

    Evidence Available for Multiple Decision Rules

    Grade: K, 1, 2, 3, 4, 5, 6, 7, 8
    Data (all grades):
  • No