i-Ready Diagnostics

Reading


Initial Cost:

$6.00/student/year for i-Ready Diagnostic for reading. The annual license fee includes online student access to the assessment, plus staff access to the management and reporting suite, downloadable lesson plans, and user resources including the i-Ready Central support website; account set-up and secure hosting; all program maintenance, updates, and enhancements during the active license term; and unlimited user access to U.S.-based service and support via toll-free phone and email during business hours. Professional development is required and is available at an additional cost ($2,000 per session of up to six hours).

 

Replacement Cost:

License renewal fees subject to change annually.

 

Included in Cost:

i‑Ready Diagnostic is a fully web-based, vendor-hosted, Software-as-a-Service application. The per-student or site-based license fee includes account set-up and management; unlimited access to i-Ready’s assessment, management, and reporting functionality; plus unlimited access to U.S.-based customer service/technical support and all program maintenance, updates, and enhancements for as long as the license remains active. The license fee also includes hosting, data storage, and data security.

 

Via the i-Ready teacher and administrator dashboards and i-Ready Central support website, educators may access comprehensive user guides and downloadable lesson plans, as well as implementation tips, best practices, video tutorials, and more to supplement onsite, fee-based professional development. These resources are self-paced and available 24/7.

 

Technology Requirements:

  • Computer or Tablet
  • Internet connection

 

Training Requirements:

  • 4-8 hours of training

 

Qualified Administrators:

  • Professionals
  • Paraprofessionals

 

Accommodations:

Curriculum Associates engaged an independent consultant to evaluate i‑Ready Diagnostic’s accessibility. Overall, the report found that i-Ready “materials included significant functionality that indirectly supports… students with disabilities.” All items in i-Ready Diagnostic are designed to be accessible for most students. In a majority of cases, students who require accommodations (e.g., large print, extra time) will not require additional help during administration. The intentional integration of accessible design features should aid most students who typically require testing accommodations.

To address the elements of Universal Design as they apply to large-scale assessment (http://www.cehd.umn.edu/nceo/onlinepubs/Synthesis44.html), Curriculum Associates considered several accommodation-related issues in developing i-Ready Diagnostic. Most may be grouped into the following general categories, each of which i‑Ready addresses:

  • Timing—Students may need extra time to complete the task. The Diagnostic is untimed and may be stopped and started as needed, so students needing extra time can finish across multiple test sessions. In fact, to ensure accurate results, a time limit is not recommended for any student; however, administration must be completed within a period of no longer than 22 days.
  • Flexible Scheduling—Students may need multiple days to complete the assessment. i-Ready recommends that all students be given multiple days, as necessary, to complete the test (as noted above, administration must be completed within a period of no longer than 22 days).
  • Accommodated Presentation of Material—All i-Ready Diagnostic items are presented in a large, easily legible format chosen for its readability. i‑Ready currently offers the ability to change the screen size; with the HTML5 items slated for a future release, users will also be able to adjust the font size. Only one item appears on the screen at a time. As appropriate to the skill(s) being assessed, some grades K–2 reading items also offer optional audio support.
  • Setting—Students may need to complete the task in a quiet room to minimize distraction. This can easily be done, as i-Ready Diagnostic is available on any computer with internet access that meets the technical requirements. Furthermore, all students are encouraged to use quality headphones in order to hear the audio portion of the items. Headphones also help to cancel out peripheral noise, which can be distracting to students.
  • Response Accommodation—Students need only basic mouse control: moving the cursor and being able to point, click, and drag. We are moving toward iPad® compatibility (see updates at www.i-Ready.com/support), with a beta expected in 2017–2018. Touchscreen input is potentially easier for students with motor impairments.

Where to Obtain:

Website: www.curriculumassociates.com              

Address: 153 Rangeway Road, N. Billerica MA 01862

Phone number: 800-225-0248              

Email address: info@cainc.com


Access to Technical Support:

Dedicated account manager plus unlimited access to in-house technical support during business hours

Offering a continuum of scale scores from kindergarten through high school, i‑Ready Diagnostic is a web-based adaptive screening assessment for reading that is aligned with state and Common Core standards. i-Ready meets the expected rigor in each of the covered domains—Phonological Awareness, Phonics, High-Frequency Words, Vocabulary, Comprehension of Informational Text, and Comprehension of Literature—providing data and reports for each domain. Screening is administered up to three times per academic year, with 12–18 weeks of instruction between assessments. Each screening takes approximately 30–60 minutes—which may be broken into multiple sittings—and may be conducted with all students or with specific groups of students identified as at risk of academic failure. i-Ready's adaptive algorithm automatically selects from thousands of technology-enhanced and multiple-choice items to pinpoint each student's strengths and challenges, regardless of the grade level at which he or she is performing.
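i-Ready's item-selection algorithm is proprietary, but the adaptive principle described here—serving items near the student's current ability estimate—can be sketched generically under a Rasch model. The function names and item bank below are illustrative assumptions, not i-Ready's actual code:

```python
import math

def rasch_p(theta, b):
    """Rasch model: probability of a correct response for a student of
    ability theta on an item of difficulty b (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def next_item(theta, item_bank):
    """Generic adaptive selection: serve the item whose difficulty is
    closest to the interim ability estimate, where the probability of
    success is near 0.5 and the item is most informative."""
    return min(item_bank, key=lambda b: abs(b - theta))

# Hypothetical bank of item difficulties spanning several grade levels.
bank = [-2.0, -1.0, -0.3, 0.4, 1.1, 2.5]
print(next_item(0.5, bank))         # 0.4 -- the difficulty nearest the estimate
print(round(rasch_p(0.5, 0.4), 2))  # 0.52 -- near-even odds on that item
```

In a real adaptive test the ability estimate is re-fit after each response and the loop repeats until a stopping rule (item count or SEM threshold) is met.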

 

The system automatically analyzes, scores, and reports student responses and results. Available as soon as a student completes the assessment, i‑Ready’s intuitive reports provide comprehensive information (including developmental analyses) about student performance, group students who struggle with the same concepts, make instructional recommendations to target skill deficiencies, and monitor progress and growth as students follow their individualized instructional paths. Reports include suggested next steps for instruction and PDF Tools for Instruction lesson plans for the teacher to use during individual, small-group, or whole-class instruction. In addition, should educators also purchase the optional i‑Ready Instruction, the system automatically prescribes online lessons that address each student’s identified academic needs.

 

 

Assessment Format:

  • Direct: Computerized

 

Administration Time:

  • 30–60 minutes per student

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

i-Ready Diagnostic scale scores are linear transformations of logit values. Logits, also known as "log-odds units," are measurement units for logarithmic probability models such as the Rasch model. Logits are used to express both student ability and item difficulty on the same scale. Within the Rasch model, if a student's ability matches an item's difficulty, the student has a 0.50 probability of answering the item correctly. For i-Ready Diagnostic, student ability and item logit values generally range from about -6 to 6.
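A minimal sketch of this relationship; the slope and intercept of the linear transformation are hypothetical placeholders, since i-Ready's actual transformation constants are not given here:

```python
import math

# Hypothetical transformation constants, for illustration only.
SLOPE, INTERCEPT = 20.0, 500.0

def logit_to_scale(theta):
    """Linear transformation of a logit ability estimate to a scale score."""
    return SLOPE * theta + INTERCEPT

def p_correct(theta, b):
    """Rasch model: probability of a correct response given ability theta
    and item difficulty b, both in logits."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

print(logit_to_scale(0.0))   # 500.0 -- midpoint of the hypothetical scale
print(p_correct(1.2, 1.2))   # 0.5 -- ability equal to item difficulty
```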

 

Scores Generated:

  • Percentile score
  • IRT-based score
  • Developmental benchmarks
  • Lexile score
  • On-grade achievement level placements

 

 

Classification Accuracy

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Criterion 1 Fall | Half-filled bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |
| Criterion 1 Winter | — | — | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |
| Criterion 1 Spring | Half-filled bubble | — | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |
| Criterion 2 Fall | — | — | — | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |
| Criterion 2 Winter | — | — | — | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |
| Criterion 2 Spring | — | — | — | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

Primary Sample

 

Criterion 1: K-2: DIBELS NEXT; 3-8: Smarter Balanced

Time of Year: Fall

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|---|---|
| Cut points | 328 | 370 | 421 | 463 | 486 | 509 | 528 | 542 | 555 |
| Base rate: requiring intensive intervention | 0.17 | 0.18 | 0.22 | 0.18 | 0.18 | 0.19 | 0.21 | 0.17 | 0.21 |
| Base rate: at risk (incl. most intensive needs) | 0.17 | 0.18 | 0.22 | 0.18 | 0.18 | 0.19 | 0.21 | 0.17 | 0.21 |
| False Positive Rate | 0.28 | 0.18 | 0.08 | 0.11 | 0.10 | 0.11 | 0.10 | 0.13 | 0.11 |
| False Negative Rate | 0.37 | 0.28 | 0.25 | 0.21 | 0.21 | 0.19 | 0.18 | 0.13 | 0.19 |
| Sensitivity | 0.63 | 0.72 | 0.75 | 0.79 | 0.79 | 0.81 | 0.82 | 0.87 | 0.81 |
| Specificity | 0.72 | 0.82 | 0.92 | 0.89 | 0.90 | 0.89 | 0.90 | 0.87 | 0.89 |
| Positive Predictive Power | 0.32 | 0.46 | 0.72 | 0.61 | 0.63 | 0.64 | 0.67 | 0.58 | 0.66 |
| Negative Predictive Power | 0.90 | 0.93 | 0.93 | 0.95 | 0.95 | 0.95 | 0.95 | 0.97 | 0.94 |
| Overall Classification Rate | 0.71 | 0.80 | 0.88 | 0.87 | 0.88 | 0.88 | 0.88 | 0.87 | 0.87 |
| Area Under the Curve (AUC) | 0.75 | 0.87 | 0.93 | 0.93 | 0.94 | 0.94 | 0.94 | 0.94 | 0.94 |
| AUC 95% CI Lower Bound | 0.72 | 0.85 | 0.92 | 0.93 | 0.93 | 0.94 | 0.94 | 0.93 | 0.93 |
| AUC 95% CI Upper Bound | 0.78 | 0.89 | 0.94 | 0.94 | 0.94 | 0.95 | 0.95 | 0.94 | 0.95 |
| Specificity at 90% Sensitivity | 0.32 | 0.56 | 0.79 | 0.76 | 0.78 | 0.82 | 0.80 | 0.80 | 0.81 |
| Specificity at 80% Sensitivity | 0.55 | 0.75 | 0.91 | 0.91 | 0.92 | 0.92 | 0.91 | 0.91 | 0.91 |
| Specificity at 70% Sensitivity | 0.68 | 0.86 | 0.95 | 0.96 | 0.97 | 0.96 | 0.97 | 0.95 | 0.96 |
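Every statistic in these tables derives from a 2×2 cross-tabulation of screener decisions (at risk vs. not at risk) against the criterion outcome. A sketch with hypothetical counts, not taken from any grade above:

```python
def classification_stats(tp, fp, fn, tn):
    """Screening accuracy statistics from a 2x2 confusion matrix, where
    tp = flagged by the screener AND at risk on the criterion, etc."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),                 # of criterion at-risk, share flagged
        "specificity": tn / (tn + fp),                 # of criterion not-at-risk, share cleared
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "positive_predictive_power": tp / (tp + fp),   # of flagged, share truly at risk
        "negative_predictive_power": tn / (tn + fn),
        "overall_classification_rate": (tp + tn) / total,
        "base_rate": (tp + fn) / total,                # criterion at-risk share of sample
    }

stats = classification_stats(tp=79, fp=90, fn=21, tn=810)
print(round(stats["sensitivity"], 2))   # 0.79
print(round(stats["specificity"], 2))   # 0.9
```

AUC is the one statistic not computable from a single cut point; it summarizes sensitivity/specificity trade-offs across all possible cut points, which is also what the "specificity at 90/80/70% sensitivity" rows sample.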

 

Criterion 1: K-2: DIBELS NEXT; 3-8: Smarter Balanced

Time of Year: Winter

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|---|---|
| Cut points | 347 | 397 | 444 | 480 | 500 | 520 | 539 | 552 | 562 |
| Base rate: requiring intensive intervention | Not provided | Not provided | 0.13 | 0.16 | 0.17 | 0.18 | 0.17 | 0.17 | 0.19 |
| Base rate: at risk (incl. most intensive needs) | Not provided | Not provided | 0.13 | 0.16 | 0.17 | 0.18 | 0.17 | 0.17 | 0.19 |
| False Positive Rate | Not provided | Not provided | 0.13 | 0.11 | 0.10 | 0.11 | 0.10 | 0.12 | 0.11 |
| False Negative Rate | Not provided | Not provided | 0.15 | 0.18 | 0.17 | 0.17 | 0.18 | 0.18 | 0.18 |
| Sensitivity | Not provided | Not provided | 0.85 | 0.82 | 0.83 | 0.83 | 0.82 | 0.82 | 0.82 |
| Specificity | Not provided | Not provided | 0.87 | 0.89 | 0.90 | 0.89 | 0.90 | 0.88 | 0.89 |
| Positive Predictive Power | Not provided | Not provided | 0.49 | 0.59 | 0.63 | 0.63 | 0.63 | 0.59 | 0.63 |
| Negative Predictive Power | Not provided | Not provided | 0.98 | 0.96 | 0.96 | 0.96 | 0.96 | 0.96 | 0.95 |
| Overall Classification Rate | Not provided | Not provided | 0.87 | 0.88 | 0.89 | 0.88 | 0.88 | 0.87 | 0.88 |
| Area Under the Curve (AUC) | Not provided | Not provided | 0.93 | 0.93 | 0.94 | 0.93 | 0.94 | 0.93 | 0.93 |
| AUC 95% CI Lower Bound | Not provided | Not provided | 0.92 | 0.93 | 0.94 | 0.93 | 0.93 | 0.92 | 0.93 |
| AUC 95% CI Upper Bound | Not provided | Not provided | 0.95 | 0.94 | 0.95 | 0.94 | 0.94 | 0.93 | 0.94 |
| Specificity at 90% Sensitivity | Not provided | Not provided | 0.93 | 0.79 | 0.82 | 0.80 | 0.80 | 0.76 | 0.79 |
| Specificity at 80% Sensitivity | Not provided | Not provided | 0.89 | 0.91 | 0.93 | 0.90 | 0.91 | 0.88 | 0.90 |
| Specificity at 70% Sensitivity | Not provided | Not provided | 0.95 | 0.95 | 0.97 | 0.95 | 0.96 | 0.94 | 0.95 |

 

Criterion 1: K-2: DIBELS NEXT; 3-8: Smarter Balanced

Time of Year: Spring

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|---|---|
| Cut points | 367 | 416 | 464 | 491 | 505 | 526 | 543 | 553 | 567 |
| Base rate: requiring intensive intervention | 0.03 | Not provided | 0.07 | 0.16 | 0.17 | 0.18 | 0.18 | 0.20 | 0.25 |
| Base rate: at risk (incl. most intensive needs) | 0.03 | Not provided | 0.07 | 0.16 | 0.17 | 0.18 | 0.18 | 0.20 | 0.25 |
| False Positive Rate | 0.29 | Not provided | 0.18 | 0.11 | 0.10 | 0.11 | 0.11 | 0.18 | 0.23 |
| False Negative Rate | 0.31 | Not provided | 0.11 | 0.16 | 0.14 | 0.17 | 0.16 | 0.10 | 0.08 |
| Sensitivity | 0.69 | Not provided | 0.89 | 0.84 | 0.86 | 0.83 | 0.84 | 0.90 | 0.92 |
| Specificity | 0.71 | Not provided | 0.82 | 0.89 | 0.90 | 0.89 | 0.89 | 0.82 | 0.77 |
| Positive Predictive Power | 0.06 | Not provided | 0.26 | 0.61 | 0.63 | 0.63 | 0.63 | 0.56 | 0.57 |
| Negative Predictive Power | 0.99 | Not provided | 0.99 | 0.97 | 0.97 | 0.96 | 0.96 | 0.97 | 0.97 |
| Overall Classification Rate | 0.71 | Not provided | 0.83 | 0.88 | 0.89 | 0.88 | 0.88 | 0.83 | 0.81 |
| Area Under the Curve (AUC) | 0.80 | Not provided | 0.92 | 0.94 | 0.95 | 0.94 | 0.94 | 0.93 | 0.93 |
| AUC 95% CI Lower Bound | 0.77 | Not provided | 0.90 | 0.94 | 0.94 | 0.93 | 0.94 | 0.92 | 0.92 |
| AUC 95% CI Upper Bound | 0.83 | Not provided | 0.93 | 0.95 | 0.95 | 0.94 | 0.95 | 0.93 | 0.94 |
| Specificity at 90% Sensitivity | 0.47 | Not provided | 0.75 | 0.82 | 0.84 | 0.79 | 0.81 | 0.77 | 0.77 |
| Specificity at 80% Sensitivity | 0.62 | Not provided | 0.89 | 0.94 | 0.94 | 0.91 | 0.92 | 0.88 | 0.89 |
| Specificity at 70% Sensitivity | 0.94 | Not provided | 0.93 | 0.97 | 0.97 | 0.95 | 0.96 | 0.93 | 0.94 |

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Cross-Validation Sample

 

Criterion: K-2: DIBELS NEXT; 3-8: New York State Testing Program

Time of Year: Fall

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|---|---|
| Cut points | 328 | 370 | 421 | 463 | 486 | 509 | 528 | 542 | 555 |
| Base rate: requiring intensive intervention | 0.24 | 0.17 | 0.18 | 0.23 | 0.23 | 0.19 | 0.22 | 0.24 | 0.27 |
| Base rate: at risk (incl. most intensive needs) | 0.24 | 0.17 | 0.18 | 0.23 | 0.23 | 0.19 | 0.22 | 0.24 | 0.27 |
| False Positive Rate | 0.30 | 0.15 | 0.06 | 0.07 | 0.13 | 0.20 | 0.16 | 0.11 | 0.10 |
| False Negative Rate | 0.39 | 0.30 | 0.29 | 0.30 | 0.22 | 0.06 | 0.17 | 0.22 | 0.29 |
| Sensitivity | 0.61 | 0.70 | 0.71 | 0.70 | 0.78 | 0.94 | 0.83 | 0.78 | 0.71 |
| Specificity | 0.70 | 0.85 | 0.94 | 0.93 | 0.87 | 0.80 | 0.84 | 0.89 | 0.90 |
| Positive Predictive Power | 0.39 | 0.49 | 0.73 | 0.75 | 0.64 | 0.53 | 0.59 | 0.69 | 0.72 |
| Negative Predictive Power | 0.85 | 0.93 | 0.94 | 0.91 | 0.93 | 0.98 | 0.95 | 0.93 | 0.89 |
| Overall Classification Rate | 0.68 | 0.83 | 0.90 | 0.88 | 0.85 | 0.82 | 0.84 | 0.86 | 0.85 |
| Area Under the Curve (AUC) | 0.73 | 0.88 | 0.94 | 0.94 | 0.91 | 0.93 | 0.91 | 0.92 | 0.92 |
| AUC 95% CI Lower Bound | 0.71 | 0.87 | 0.94 | 0.93 | 0.90 | 0.91 | 0.90 | 0.91 | 0.90 |
| AUC 95% CI Upper Bound | 0.74 | 0.89 | 0.95 | 0.95 | 0.93 | 0.94 | 0.93 | 0.94 | 0.93 |
| Specificity at 90% Sensitivity | 0.31 | 0.61 | 0.81 | 0.78 | 0.70 | 0.76 | 0.72 | 0.76 | 0.71 |
| Specificity at 80% Sensitivity | 0.48 | 0.78 | 0.94 | 0.90 | 0.84 | 0.88 | 0.86 | 0.90 | 0.86 |
| Specificity at 70% Sensitivity | 0.62 | 0.88 | 0.97 | 0.98 | 0.93 | 0.93 | 0.93 | 0.95 | 0.93 |

 

Criterion: K-2: DIBELS NEXT; 3-8: New York State Testing Program

Time of Year: Winter

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|---|---|
| Cut points | 347 | 397 | 444 | 480 | 500 | 520 | 539 | 550 | 562 |
| Base rate: requiring intensive intervention | Not provided | Not provided | 0.09 | 0.21 | 0.22 | 0.22 | 0.22 | 0.22 | 0.25 |
| Base rate: at risk (incl. most intensive needs) | Not provided | Not provided | 0.09 | 0.21 | 0.22 | 0.22 | 0.22 | 0.22 | 0.25 |
| False Positive Rate | Not provided | Not provided | 0.10 | 0.08 | 0.14 | 0.19 | 0.17 | 0.12 | 0.11 |
| False Negative Rate | Not provided | Not provided | 0.16 | 0.25 | 0.22 | 0.13 | 0.18 | 0.24 | 0.30 |
| Sensitivity | Not provided | Not provided | 0.85 | 0.75 | 0.78 | 0.87 | 0.82 | 0.76 | 0.70 |
| Specificity | Not provided | Not provided | 0.90 | 0.92 | 0.86 | 0.91 | 0.83 | 0.88 | 0.89 |
| Positive Predictive Power | Not provided | Not provided | 0.47 | 0.72 | 0.60 | 0.56 | 0.57 | 0.64 | 0.58 |
| Negative Predictive Power | Not provided | Not provided | 0.98 | 0.93 | 0.93 | 0.96 | 0.94 | 0.93 | 0.90 |
| Overall Classification Rate | Not provided | Not provided | 0.90 | 0.89 | 0.84 | 0.82 | 0.83 | 0.85 | 0.84 |
| Area Under the Curve (AUC) | Not provided | Not provided | 0.95 | 0.94 | 0.91 | 0.91 | 0.90 | 0.91 | 0.91 |
| AUC 95% CI Lower Bound | Not provided | Not provided | 0.94 | 0.93 | 0.90 | 0.89 | 0.88 | 0.88 | 0.87 |
| AUC 95% CI Upper Bound | Not provided | Not provided | 0.95 | 0.95 | 0.92 | 0.92 | 0.91 | 0.93 | 0.92 |
| Specificity at 90% Sensitivity | Not provided | Not provided | 0.84 | 0.79 | 0.69 | 0.73 | 0.67 | 0.72 | 0.68 |
| Specificity at 80% Sensitivity | Not provided | Not provided | 0.93 | 0.92 | 0.84 | 0.83 | 0.83 | 0.90 | 0.83 |
| Specificity at 70% Sensitivity | Not provided | Not provided | 0.97 | 0.96 | 0.93 | 0.91 | 0.91 | 0.93 | 0.91 |

 

Criterion: K-2: DIBELS NEXT; 3-8: New York State Testing Program

Time of Year: Spring

 

| Statistic | Grade K | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
|---|---|---|---|---|---|---|---|---|---|
| Cut points | 367 | 416 | 464 | 491 | 505 | 526 | 543 | 553 | 567 |
| Base rate: requiring intensive intervention | 0.24 | Not provided | 0.18 | 0.21 | 0.22 | 0.20 | 0.19 | 0.24 | 0.24 |
| Base rate: at risk (incl. most intensive needs) | 0.24 | Not provided | 0.18 | 0.21 | 0.22 | 0.20 | 0.19 | 0.24 | 0.24 |
| False Positive Rate | 0.30 | Not provided | 0.06 | 0.08 | 0.13 | 0.18 | 0.17 | 0.12 | 0.12 |
| False Negative Rate | 0.39 | Not provided | 0.29 | 0.25 | 0.20 | 0.17 | 0.14 | 0.23 | 0.30 |
| Sensitivity | 0.61 | Not provided | 0.71 | 0.75 | 0.80 | 0.83 | 0.86 | 0.77 | 0.70 |
| Specificity | 0.70 | Not provided | 0.94 | 0.92 | 0.87 | 0.82 | 0.83 | 0.88 | 0.88 |
| Positive Predictive Power | 0.39 | Not provided | 0.73 | 0.71 | 0.63 | 0.55 | 0.54 | 0.67 | 0.64 |
| Negative Predictive Power | 0.85 | Not provided | 0.94 | 0.93 | 0.94 | 0.95 | 0.96 | 0.92 | 0.90 |
| Overall Classification Rate | 0.68 | Not provided | 0.90 | 0.88 | 0.85 | 0.83 | 0.83 | 0.86 | 0.83 |
| Area Under the Curve (AUC) | 0.73 | Not provided | 0.94 | 0.94 | 0.92 | 0.90 | 0.90 | 0.91 | 0.90 |
| AUC 95% CI Lower Bound | 0.71 | Not provided | 0.94 | 0.93 | 0.91 | 0.89 | 0.89 | 0.89 | 0.88 |
| AUC 95% CI Upper Bound | 0.74 | Not provided | 0.95 | 0.95 | 0.93 | 0.92 | 0.91 | 0.82 | 0.92 |
| Specificity at 90% Sensitivity | 0.31 | Not provided | 0.81 | 0.79 | 0.72 | 0.71 | 0.68 | 0.74 | 0.64 |
| Specificity at 80% Sensitivity | 0.48 | Not provided | 0.94 | 0.92 | 0.87 | 0.84 | 0.82 | 0.85 | 0.84 |
| Specificity at 70% Sensitivity | 0.62 | Not provided | 0.97 | 0.96 | 0.93 | 0.91 | 0.90 | 0.91 | 0.91 |

 

Reliability

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

1. Justification for each type of reliability reported, given the type and purpose of the tool:

The i-Ready Diagnostic provides two types of reliability estimates:

  • IRT-based reliability measures, such as the marginal reliability estimate and the standard error of measurement (SEM)
  • Test-retest reliability coefficients

 

Marginal Reliability:

Because i-Ready Diagnostic is a computer-adaptive assessment without a fixed form, some traditional reliability estimates, such as Cronbach's alpha, are not appropriate indices for quantifying consistency or inconsistency in student performance. The IRT analogue to classical reliability, called marginal reliability, operates on the variance of the theta scores and the average of the expected error variance. Marginal reliability uses the classical definition of reliability as the proportion of variance in the total observed score that is due to true score, under an IRT model (specifically, i-Ready Diagnostic uses a Rasch model).

 

Standard Error of Measurement (SEM):

In an IRT model, SEMs are affected by factors such as how well the data fit the underlying model, student response consistency, student location on the ability continuum, match of items to student ability, and test length.  Given the adaptive nature of i-Ready and the wide difficulty range in the item bank, standard errors are expected to be low and very close to the theoretical minimum for the test of the given length.

 

The theoretical minimum would be reached if each interim estimate of student ability were assessed by an item whose difficulty matched that estimate perfectly. Theoretical minimums are restricted by the number of items served in the assessment—the more items served, the lower the SEM can potentially be. For ELA, the minimum SEM for overall scores is 8.9.

 

In addition to the mean SEM by subject and grade, graphical representations of the conditional standard errors of measurement (CSEM) provide additional evidence of the precision with which i-Ready measures student ability across the operational score scale. The figures included on pages 25–27 contextualize the table of reliability analyses. For model-based reliability analyses of computer-adaptive tests such as i-Ready, CSEM plots permit test users to judge the relative precision of the ability estimates.

 

For additional context, these figures mark the scale scores associated with the 1st and 99th percentile ranks to indicate the score range within which most (98%) students fall.

 

Test-retest Reliability:

i-Ready Diagnostic is often used as an interim assessment, and students can take it multiple times a year. Test-retest reliability is therefore an appropriate index of score stability for students who took two Diagnostic tests.

 

 

2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

Data for the marginal reliability and SEM estimates come from the August and September 2016 administrations of i-Ready Diagnostic (reported in the 2016 i-Ready Diagnostic technical report). All students tested within that timeframe were included. Sample sizes by grade are presented in the table below (under question 4).

 

Evidence of test-retest stability was assessed on a subsample of students who, during the 2016–2017 school year, took i-Ready Diagnostic twice within the recommended 12–18-week testing window. The average testing interval was 106 days (15 weeks). Sample sizes by grade are presented in the table below (under question 4).

 

 

3. Description of the analysis procedures for each reported type of reliability:

Marginal reliability uses the classical definition of reliability as the proportion of variance in the total observed score that is due to true score. The true score variance is computed as the observed score variance minus the error variance.

 

Similar to a classical reliability coefficient, the marginal reliability estimate increases as the standard error decreases; it approaches 1 when the standard error approaches 0.
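The computation just described reduces to a few lines; a minimal sketch, with hypothetical ability estimates and per-student SEMs (the real analysis uses WINSTEPS calibrations over hundreds of thousands of students):

```python
def marginal_reliability(theta_scores, sems):
    """Marginal reliability: proportion of observed score variance that is
    true-score variance, taking error variance as the mean squared SEM."""
    n = len(theta_scores)
    mean = sum(theta_scores) / n
    var_obs = sum((t - mean) ** 2 for t in theta_scores) / n
    var_err = sum(s ** 2 for s in sems) / n
    return (var_obs - var_err) / var_obs  # true-score variance / observed variance

# Hypothetical ability estimates and SEMs in scale-score units.
thetas = [420, 455, 470, 505, 530, 560]
sems = [10.4, 10.1, 9.9, 10.2, 10.6, 10.3]
print(round(marginal_reliability(thetas, sems), 2))  # 0.95
```

As the sketch shows, shrinking the SEMs toward 0 drives the estimate toward 1, matching the behavior described above.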

 

The observed score variance, the error variance, and SEM (the square root of the error variance) are obtained through WINSTEPS calibrations. One separate calibration was conducted for each grade.

 

For test-retest reliability, Pearson correlation coefficients were calculated between the scores on the two Diagnostic tests. In lower grades, where growth and variability are expected to be higher, test-retest correlations are expected to be relatively lower.

 

 

4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

| Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval |
|---|---|---|---|---|
| Marginal | K | 184,261 | 0.91 | n/a* |
| Marginal | 1 | 287,593 | 0.95 | n/a* |
| Marginal | 2 | 323,280 | 0.96 | n/a* |
| Marginal | 3 | 343,103 | 0.97 | n/a* |
| Marginal | 4 | 337,854 | 0.97 | n/a* |
| Marginal | 5 | 341,292 | 0.97 | n/a* |
| Marginal | 6 | 249,454 | 0.97 | n/a* |
| Marginal | 7 | 224,530 | 0.97 | n/a* |
| Marginal | 8 | 222,503 | 0.97 | n/a* |
| Test-retest | K | 120,194 | 0.701 | 0.698, 0.704 |
| Test-retest | 1 | 166,187 | 0.826 | 0.824, 0.827 |
| Test-retest | 2 | 181,997 | 0.852 | 0.850, 0.853 |
| Test-retest | 3 | 209,427 | 0.854 | 0.853, 0.855 |
| Test-retest | 4 | 204,577 | 0.861 | 0.860, 0.862 |
| Test-retest | 5 | 202,922 | 0.862 | 0.861, 0.863 |
| Test-retest | 6 | 144,272 | 0.860 | 0.859, 0.861 |
| Test-retest | 7 | 126,128 | 0.855 | 0.853, 0.856 |
| Test-retest | 8 | 119,647 | 0.853 | 0.851, 0.855 |
| SEM | K | 184,261 | 9.30 | n/a* |
| SEM | 1 | 287,593 | 9.33 | n/a* |
| SEM | 2 | 323,280 | 10.38 | n/a* |
| SEM | 3 | 343,103 | 10.11 | n/a* |
| SEM | 4 | 337,854 | 10.14 | n/a* |
| SEM | 5 | 341,292 | 10.35 | n/a* |
| SEM | 6 | 249,454 | 10.51 | n/a* |
| SEM | 7 | 224,530 | 10.61 | n/a* |
| SEM | 8 | 222,503 | 10.71 | n/a* |

* n/a: Confidence intervals are not applicable to marginal reliability estimates or SEMs due to how they are calculated for our computer-adaptive assessment. CSEM demonstrating relative measurement precision across the i-Ready score scale are available from NCII upon request.

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

| Type of Reliability | Subgroup | Age or Grade | n | Coefficient | Confidence Interval |
|---|---|---|---|---|---|
| Split-half | Asian | 1 | 531 | 0.80 | n/a* |
| Split-half | African American | 1 | 2,665 | 0.75 | n/a* |
| Split-half | Hispanic | 1 | 2,246 | 0.77 | n/a* |
| Split-half | Asian | 2 | 549 | 0.86 | n/a* |
| Split-half | African American | 2 | 2,990 | 0.81 | n/a* |
| Split-half | Hispanic | 2 | 2,289 | 0.79 | n/a* |
| Split-half | Asian | 3 | 468 | 0.83 | n/a* |
| Split-half | African American | 3 | 2,881 | 0.80 | n/a* |
| Split-half | Hispanic | 3 | 2,269 | 0.80 | n/a* |
| Split-half | Asian | 4 | 439 | 0.80 | n/a* |
| Split-half | African American | 4 | 1,977 | 0.77 | n/a* |
| Split-half | Hispanic | 4 | 1,577 | 0.76 | n/a* |
| Split-half | Asian | 5 | 370 | 0.79 | n/a* |
| Split-half | African American | 5 | 1,612 | 0.78 | n/a* |
| Split-half | Hispanic | 5 | 1,249 | 0.79 | n/a* |
| Split-half | Asian | 6 | 247 | 0.83 | n/a* |
| Split-half | African American | 6 | 515 | 0.78 | n/a* |
| Split-half | Hispanic | 6 | 639 | 0.74 | n/a* |
| Split-half | African American | 7 | 254 | 0.76 | n/a* |
| Split-half | Hispanic | 7 | 278 | 0.81 | n/a* |
| Split-half | African American | 8 | 234 | 0.88 | n/a* |
| Split-half | Hispanic | 8 | 198 | 0.83 | n/a* |

* n/a: Confidence intervals are not applicable to split-half reliability estimates due to how they are calculated for our computer-adaptive assessment.  Although some modeling approaches exist that yield confidence intervals for adaptive tests, the psychometric field does not currently have an agreed-upon approach and instead favors the reporting of reliability point estimates for adaptive assessments (as is done here).  If specific reliability techniques are favored for this application, Curriculum Associates is happy to provide these on request.

Validity

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Rating | Half-filled bubble | Half-filled bubble | Half-filled bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

The internal structure of the i-Ready Diagnostic assessments is supported by the construct maps and the ordering of the skills addressed at different stages on the map. We recognize that coverage of skills and difficulty of items will overlap a fair amount across grades, as much material is reviewed from year to year. However, what should be apparent from the estimated item difficulties is that, generally, items measuring skills targeting lower levels of the map should be easier, and items measuring skills targeting higher levels of the map should be more difficult.

 

2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Active items in the current item pool for the 2016–2017 school year are included in the analysis of internal validity. The number of items per grade is listed in the table below.

 

3. Description of the analysis procedures for each reported type of validity:

Distributions of indicator difficulties by grade level provide further evidence of internal structure. The difficulty of an indicator corresponds to a 67% probability of passing on the Indicator Characteristic Curve aggregated across all items aligned to the indicator. The table below shows the average and standard deviation of indicator difficulties.
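Under a Rasch model, the single-item analogue of this 67% criterion can be inverted in closed form: the ability at which an item is passed with probability p sits ln(p/(1-p)) logits above the item's difficulty. A sketch of that single-item case only; the actual Indicator Characteristic Curves aggregate across all items aligned to an indicator:

```python
import math

def ability_at(p, b):
    """Ability (in logits) at which a Rasch item of difficulty b is
    answered correctly with probability p."""
    return b + math.log(p / (1.0 - p))

# For p = 0.67 the required ability is about 0.71 logits above the item
# difficulty; at p = 0.50 it equals the difficulty exactly.
print(round(ability_at(0.67, 0.0), 2))  # 0.71
print(ability_at(0.50, 1.3))            # 1.3
```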

 

4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

| Type of Validity | Age or Grade | Indicator Difficulty (Mean) | Indicator Difficulty (SD) | Number of Items |
|---|---|---|---|---|
| Internal | K | 383.48 | 29.65 | 439 |
| Internal | 1 | 440.77 | 37.41 | 430 |
| Internal | 2 | 502.63 | 40.37 | 316 |
| Internal | 3 | 524.97 | 33.99 | 302 |
| Internal | 4 | 562.71 | 21.72 | 225 |
| Internal | 5 | 583.54 | 19.13 | 224 |
| Internal | 6 | 601.60 | 17.77 | 244 |
| Internal | 7 | 616.77 | 19.70 | 253 |
| Internal | 8 | 627.24 | 14.34 | 253 |

 

 

 

| Type of Validity | Age or Grade | Test or Criterion | n | corr | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| Concurrent/Predictive | K | Lexile* | 840 | 0.88 | 0.86 | 0.89 |
| Concurrent/Predictive | 1 | Lexile* | 840 | 0.88 | 0.86 | 0.89 |
| Concurrent/Predictive | 2 | Lexile* | 840 | 0.88 | 0.86 | 0.89 |
| Predictive | 3 | PARCC | 5609 | 0.79 | 0.78 | 0.80 |
| Predictive | 4 | PARCC | 5881 | 0.82 | 0.81 | 0.82 |
| Predictive | 5 | PARCC | 5530 | 0.80 | 0.79 | 0.81 |
| Predictive | 6 | PARCC | 4022 | 0.79 | 0.78 | 0.80 |
| Predictive | 7 | PARCC | 3925 | 0.79 | 0.78 | 0.80 |
| Predictive | 8 | PARCC | 3721 | 0.78 | 0.77 | 0.80 |
| Concurrent | 3 | NC | 7603 | 0.83 | 0.82 | 0.83 |
| Concurrent | 4 | NC | 7415 | 0.83 | 0.82 | 0.84 |
| Concurrent | 5 | NC | 7505 | 0.82 | 0.81 | 0.83 |
| Concurrent | 6 | NC | 5205 | 0.82 | 0.81 | 0.83 |
| Concurrent | 7 | NC | 5685 | 0.81 | 0.80 | 0.82 |
| Concurrent | 8 | NC | 5282 | 0.79 | 0.78 | 0.80 |
| Concurrent | 3 | MS | 3260 | 0.81 | 0.80 | 0.82 |
| Concurrent | 4 | MS | 3717 | 0.76 | 0.74 | 0.77 |
| Concurrent | 5 | MS | 3380 | 0.79 | 0.77 | 0.80 |
| Concurrent | 6 | MS | 3305 | 0.81 | 0.80 | 0.82 |
| Concurrent | 7 | MS | 2291 | 0.81 | 0.80 | 0.82 |
| Concurrent | 8 | MS | 2106 | 0.80 | 0.78 | 0.81 |
| Concurrent | 3 | OH | 3025 | 0.76 | 0.74 | 0.77 |
| Concurrent | 4 | OH | 2696 | 0.78 | 0.76 | 0.79 |
| Concurrent | 5 | OH | 2693 | 0.78 | 0.76 | 0.79 |
| Concurrent | 6 | OH | 1865 | 0.78 | 0.76 | 0.79 |
| Concurrent | 7 | OH | 1607 | 0.77 | 0.75 | 0.79 |
| Concurrent | 8 | OH | 1488 | 0.71 | 0.68 | 0.73 |

* For the purposes of the Lexile study referenced above, grade-banded results are featured, rather than grade-specific results.  The i-Ready Diagnostic reading scale scores are created on a vertical scale which makes the scale scores comparable across grades.  Thus, for efficiency purposes, the linking sample for the Lexile study includes only students from every other grade (i.e., grades 1, 3, 5, and 7), but results are generalized across grades in various grade bands (e.g., K-2).  Additional information on the Lexile study, which was conducted in concert with MetaMetrics, is available upon request. 

 

5. Results for other forms of validity (e.g., factor analysis) not conducive to the table format:

None provided  

 

6. Describe the degree to which the provided data support the validity of the tool:

The internal structure of the i-Ready Diagnostic assessments is supported by the construct maps and the ordering of the skills addressed at different stages on the map. Skills representing the lower levels on the construct map are those generally associated with items targeted at lower grade levels, and skills representing the higher levels on the map are ones generally associated with items targeted at higher grade levels.

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

| Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval |
|---|---|---|---|---|---|---|
| None | — | — | — | — | — | — |

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format:

None provided

Sample Representativeness

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Rating | Full bubble | Full bubble | Full bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble |

Primary Classification Accuracy Sample

Representation: National (East North Central, South Atlantic, Mountain, West North Central, Pacific). The analyses featured here include six states total, with three states yielding data for the K-2 analysis and three different states yielding data for the 3-8 analysis. Data for the K-2 analyses come from the 2016-17 school year, the most recent school year available at the time of the analysis. Note also that the data for the 3-8 analysis are from the Smarter Balanced Assessment Consortium (SBAC) program and have been sampled and stratified to reflect the representation of all SBAC states during the 2015-16 school year (the most recent data available at the time of the analysis), building upon our 2014-15 SBAC study that used a slightly different sample of states. The same SBAC stratified sample used for this analysis was also used for the i-Ready prediction model, which all i-Ready schools in the 14 SBAC states use to predict proficiency on SBAC from the i-Ready assessment.

Date: Spring 2017 for K-2; Spring 2016 for 3-8

Size: 38,568 for Grades 3-8; 5,010 for Grades K-2.

 

 

| Subgroup | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Male | 52% | 52% | 53% | 49% | 49% | 48% | 48% | 48% | 49% |
| Female | 48% | 48% | 47% | 51% | 51% | 52% | 52% | 52% | 51% |
| Unknown | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| SES | Unknown | Unknown | Unknown | 27% | 27% | 20% | 10% | 20% | 22% |
| White | 14% | 12% | 13% | 40% | 40% | 38% | 28% | 30% | 29% |
| Black or African American | 5% | 4% | 5% | 6% | 6% | 6% | 4% | 4% | 4% |
| Hispanic | 14% | 7% | 7% | 27% | 28% | 27% | 25% | 25% | 25% |
| American Indian or Alaskan | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 0% |
| Asian | 0% | 0% | 0% | 14% | 16% | 15% | 11% | 15% | 14% |
| Native Hawaiian or P. Islander | 0% | 0% | 0% | 1% | 1% | 1% | 1% | 1% | 1% |
| Other | 0% | 1% | 1% | 4% | 4% | 4% | 2% | 2% | 2% |
| Unknown | 80% | 78% | 75% | 7% | 6% | 10% | 28% | 24% | 25% |
| Disability classification | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| First language | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| ELL | 15% | 20% | 26% | 25% | 19% | 15% | 10% | 8% | 8% |

 

Cross Validation Sample

Representation: National (East North Central, South Atlantic, Mountain, West North Central, Pacific). The analyses featured here include two states, with one state yielding data for the K-2 analysis and a different state yielding data for the 3-8 analysis. Data for the K-2 analyses come from the 2016-17 school year, the most recent school year available at the time of the analysis. For the K-2 analysis, data came from one of the states used in the classification analysis featured above, but from a different district within the state, specifically selected to determine how well the results from the classification analyses generalize when the cut scores are applied to a different sample. Note also that the data for the 3-8 analysis are from the New York State Testing Program and have been sampled and stratified to reflect the representation of all New York State districts during the 2015-16 school year (the most recent data available at the time of the analysis), building upon our 2014-15 New York State study that used a slightly different sample of districts. The same New York State stratified sample used for this analysis was also used for the i-Ready prediction model, which all i-Ready schools in New York State use to predict proficiency on the New York State Testing Program from the i-Ready assessment.

Date: Spring 2017 for K-2; Spring 2016 for 3-8

Size: 12,974 for Grades 3-8; 8,140 for Grades K-2.

 

| Subgroup | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Male | 33% | 45% | 44% | 49% | 51% | 49% | 53% | 44% | 53% |
| Female | 28% | 43% | 46% | 51% | 49% | 51% | 47% | 55% | 47% |
| Unknown | 39% | 12% | 10% | 0% | 0% | 0% | 0% | 0% | 0% |
| FRPL | Unknown | Unknown | Unknown | 11% | 9% | 19% | 20% | 22% | 61% |
| White | 33% | 45% | 33% | 5% | 4% | 3% | 3% | 1% | 0% |
| Black or African American | 8% | 13% | 10% | 19% | 21% | 20% | 19% | 19% | 18% |
| Hispanic | 17% | 20% | 14% | 19% | 17% | 18% | 17% | 17% | 17% |
| American Indian or Alaskan | 0% | 0% | 0% | 0% | 0% | 0% | 0% | 1% | 0% |
| Asian | 0% | 1% | 0% | 12% | 11% | 12% | 10% | 12% | 11% |
| Native Hawaiian or P. Islander | 0% | 0% | 0% | 44% | 46% | 47% | 51% | 50% | 54% |
| Other | 2% | 2% | 2% | 0% | 0% | 1% | 0% | 0% | 0% |
| Unknown | 56% | 39% | 54% | 0% | 0% | 0% | 0% | 1% | 0% |
| Disability classification | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| First language | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown | Unknown |
| ELL | Unknown | Unknown | Unknown | 28% | 30% | 32% | 35% | 46% | 45% |

 

Bias Analysis Conducted

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Rating | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
1. Description of the method used to determine the presence or absence of bias:

Differential item functioning (DIF) was investigated using WINSTEPS® by comparing the item difficulty measures for two demographic categories in a pairwise comparison through a combined calibration analysis. The essence of this methodology is to investigate the interaction of the person-groups with each item, while fixing all other item and person measures to those from the combined calibration. The method used to detect DIF is based on the Mantel-Haenszel (MH) procedure and the work of Linacre & Wright (1989) and Linacre (2012). Typically, the group representing test takers in a specific demographic group is referred to as the focal group, and the group made up of test takers from outside this group is referred to as the reference group. For example, for gender, Female is the focal group and Male is the reference group.
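As an illustration of the Mantel-Haenszel logic described above, a common odds ratio can be pooled across ability strata and converted to the ETS delta metric. This is a minimal sketch, not the WINSTEPS implementation used operationally; the function name and the 2×2-table-per-stratum input layout are assumptions for this example:

```python
import math

def mantel_haenszel_delta(strata):
    """ETS delta (MH D-DIF) from per-stratum 2x2 tables.

    Each stratum is (ref_right, ref_wrong, focal_right, focal_wrong),
    one tuple per ability level. Illustrative sketch only; the
    operational analysis described here was run in WINSTEPS.
    """
    num = den = 0.0
    for a, b, c, d in strata:
        total = a + b + c + d
        if total == 0:
            continue                # skip empty strata
        num += a * d / total        # reference right x focal wrong
        den += b * c / total        # reference wrong x focal right
    alpha = num / den               # MH common odds ratio
    # Negative delta indicates the item favors the reference group,
    # i.e., potential DIF against the focal group.
    return -2.35 * math.log(alpha)
```

For two strata in which the reference group answers correctly more often, e.g. `[(40, 10, 30, 20), (30, 20, 20, 30)]`, the pooled odds ratio exceeds 1 and the delta comes out negative.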

 

2. Description of the subgroups for which bias analyses were conducted:

The latest large-scale DIF analysis included a random sample (10%) of students from the 2015–2016 i-Ready operational data. Given the large size of the 2015–2016 i-Ready student population, it is practical to carry out the calibration analysis with a random sample. The following demographic categories were compared: Female vs. Male; African American and Hispanic vs. Caucasian; English Learner vs. non–English Learner; Special Ed vs. General Ed; Economically Disadvantaged vs. Not Economically Disadvantaged. In each pairwise comparison, estimates of item difficulty for each category in the comparison were calculated. The table below presents the total number (in thousands) and percentage of students included in the DIF analysis.

| DIF Group | DIF Variable | N Count (thousands) | Percent |
|---|---|---|---|
| Gender | Male | 258.4 | 52.0 |
|  | Female* | 238.8 | 48.0 |
| Ethnicity | Caucasian | 129.2 | 36.6 |
|  | African American and Hispanic* | 224.2 | 63.4 |
| EL | non–English Learner | 250.8 | 81.2 |
|  | English Learner* | 58.2 | 18.8 |
| Special Ed | General Ed | 165.8 | 85.7 |
|  | Special Ed* | 27.6 | 14.3 |
| Economic Status | Not Economically Disadvantaged | 177.8 | 69.0 |
|  | Economically Disadvantaged* | 80.0 | 31.1 |

An asterisk denotes the focal group in each comparison.

 

 

3. Description of the results of the bias analyses conducted, including data and interpretative statements:

Active items in the current item pool for the 2016–2017 school year are included in the DIF analysis; the pool contains 3,649 ELA items in total. WINSTEPS (Version 3.92) was used to conduct the calibration for the DIF analysis by grade. To help interpret the results, the Educational Testing Service (ETS) criteria based on the delta method (Zwick, Thayer, & Lewis, 1999) were used to categorize DIF, as presented below:

 

| ETS DIF Category | Definition |
|---|---|
| A (negligible) | \|DIF\| < 0.43 |
| B (moderate) | 0.43 ≤ \|DIF\| < 0.64 |
| C (large) | \|DIF\| ≥ 0.64 |

B- or C- suggests DIF against the focal group; B+ or C+ suggests DIF against the reference group.
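The threshold rules above translate directly into a small classification function. This is a minimal sketch using the quoted cutoffs; the sign convention (a positive DIF contrast disadvantages the reference group, a negative one the focal group) is an assumption for illustration:

```python
def ets_dif_category(dif):
    """Map a signed DIF contrast (in logits) to an ETS DIF category.

    Sign convention assumed for illustration: positive = DIF against
    the reference group, negative = DIF against the focal group.
    """
    size = abs(dif)
    if size < 0.43:
        return "A"                      # negligible, no sign reported
    level = "B" if size < 0.64 else "C" # moderate vs. large
    return level + ("+" if dif > 0 else "-")
```

For example, contrasts of 0.20, 0.50, and -0.70 logits would be classified A, B+, and C- respectively.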

 

 

The number and percentage of items exhibiting DIF for each of the demographic categories are reported in the table below. The majority of ELA items show negligible DIF (mostly more than 90 percent), and very few items (less than 3 percent) show large DIF (level C) by grade.

 

| Grade | ETS DIF Category | Gender N | Gender % | Ethnicity N | Ethnicity % | ELL N | ELL % | Special Education N | Special Education % | Econ. Disadvantaged N | Econ. Disadvantaged % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A | 1,315 | 97.4 | 1,227 | 96.1 | 1,106 | 96.9 | 408 | 96.0 | 1,160 | 98.3 |
| 0 | B+ | 9 | 0.7 | 12 | 0.9 | 10 | 0.9 | 5 | 1.2 | 5 | 0.4 |
| 0 | B- | 11 | 0.8 | 31 | 2.4 | 19 | 1.7 | 10 | 2.4 | 13 | 1.1 |
| 0 | C+ | 4 | 0.3 | 2 | 0.2 | 2 | 0.2 | 1 | 0.2 | 0 | 0.0 |
| 0 | C- | 11 | 0.8 | 5 | 0.4 | 4 | 0.4 | 1 | 0.2 | 2 | 0.2 |
| 0 | Total | 1,350 | 100.0 | 1,277 | 100.0 | 1,141 | 100.0 | 425 | 100.0 | 1,180 | 100.0 |
| 1 | A | 1,741 | 96.5 | 1,686 | 95.8 | 1,435 | 95.1 | 967 | 94.7 | 1,562 | 97.4 |
| 1 | B+ | 15 | 0.8 | 35 | 2.0 | 22 | 1.5 | 23 | 2.3 | 13 | 0.8 |
| 1 | B- | 40 | 2.2 | 27 | 1.5 | 29 | 1.9 | 20 | 2.0 | 18 | 1.1 |
| 1 | C+ | 4 | 0.2 | 7 | 0.4 | 16 | 1.1 | 6 | 0.6 | 4 | 0.2 |
| 1 | C- | 5 | 0.3 | 5 | 0.3 | 7 | 0.5 | 5 | 0.5 | 6 | 0.4 |
| 1 | Total | 1,805 | 100.0 | 1,760 | 100.0 | 1,509 | 100.0 | 1,021 | 100.0 | 1,603 | 100.0 |
| 2 | A | 1,886 | 95.3 | 1,766 | 95.2 | 1,668 | 93.1 | 1,094 | 93.0 | 1,868 | 96.4 |
| 2 | B+ | 35 | 1.8 | 49 | 2.6 | 44 | 2.5 | 35 | 3.0 | 28 | 1.4 |
| 2 | B- | 48 | 2.4 | 30 | 1.6 | 46 | 2.6 | 26 | 2.2 | 26 | 1.3 |
| 2 | C+ | 5 | 0.3 | 7 | 0.4 | 21 | 1.2 | 16 | 1.4 | 11 | 0.6 |
| 2 | C- | 4 | 0.2 | 4 | 0.2 | 12 | 0.7 | 5 | 0.4 | 5 | 0.3 |
| 2 | Total | 1,978 | 100.0 | 1,856 | 100.0 | 1,791 | 100.0 | 1,176 | 100.0 | 1,938 | 100.0 |
| 3 | A | 2,337 | 94.7 | 2,047 | 95.1 | 1,718 | 91.2 | 1,251 | 89.7 | 2,122 | 95.4 |
| 3 | B+ | 44 | 1.8 | 52 | 2.4 | 54 | 2.9 | 54 | 3.9 | 38 | 1.7 |
| 3 | B- | 63 | 2.6 | 38 | 1.8 | 69 | 3.7 | 50 | 3.6 | 39 | 1.8 |
| 3 | C+ | 14 | 0.6 | 9 | 0.4 | 15 | 0.8 | 22 | 1.6 | 17 | 0.8 |
| 3 | C- | 9 | 0.4 | 6 | 0.3 | 28 | 1.5 | 18 | 1.3 | 9 | 0.4 |
| 3 | Total | 2,467 | 100.0 | 2,152 | 100.0 | 1,884 | 100.0 | 1,395 | 100.0 | 2,225 | 100.0 |
| 4 | A | 2,386 | 95.3 | 2,000 | 96.3 | 1,863 | 89.7 | 1,552 | 91.8 | 2,208 | 96.4 |
| 4 | B+ | 58 | 2.3 | 39 | 1.9 | 63 | 3.0 | 36 | 2.1 | 30 | 1.3 |
| 4 | B- | 29 | 1.2 | 25 | 1.2 | 80 | 3.8 | 54 | 3.2 | 25 | 1.1 |
| 4 | C+ | 20 | 0.8 | 10 | 0.5 | 26 | 1.3 | 26 | 1.5 | 14 | 0.6 |
| 4 | C- | 11 | 0.4 | 2 | 0.1 | 46 | 2.2 | 23 | 1.4 | 14 | 0.6 |
| 4 | Total | 2,504 | 100.0 | 2,076 | 100.0 | 2,078 | 100.0 | 1,691 | 100.0 | 2,291 | 100.0 |
| 5 | A | 2,280 | 95.0 | 2,130 | 96.1 | 1,907 | 89.3 | 1,551 | 90.8 | 2,246 | 97.0 |
| 5 | B+ | 41 | 1.7 | 43 | 1.9 | 79 | 3.7 | 50 | 2.9 | 29 | 1.3 |
| 5 | B- | 51 | 2.1 | 29 | 1.3 | 77 | 3.6 | 71 | 4.2 | 27 | 1.2 |
| 5 | C+ | 18 | 0.8 | 12 | 0.5 | 30 | 1.4 | 18 | 1.1 | 9 | 0.4 |
| 5 | C- | 9 | 0.4 | 2 | 0.1 | 42 | 2.0 | 18 | 1.1 | 4 | 0.2 |
| 5 | Total | 2,399 | 100.0 | 2,216 | 100.0 | 2,135 | 100.0 | 1,708 | 100.0 | 2,315 | 100.0 |
| 6 | A | 2,135 | 92.6 | 1,921 | 94.1 | 1,561 | 86.2 | 1,520 | 90.1 | 2,120 | 95.5 |
| 6 | B+ | 54 | 2.3 | 62 | 3.0 | 80 | 4.4 | 64 | 3.8 | 39 | 1.8 |
| 6 | B- | 81 | 3.5 | 43 | 2.1 | 96 | 5.3 | 69 | 4.1 | 41 | 1.8 |
| 6 | C+ | 25 | 1.1 | 10 | 0.5 | 39 | 2.2 | 14 | 0.8 | 7 | 0.3 |
| 6 | C- | 10 | 0.4 | 5 | 0.2 | 34 | 1.9 | 20 | 1.2 | 14 | 0.6 |
| 6 | Total | 2,305 | 100.0 | 2,041 | 100.0 | 1,810 | 100.0 | 1,687 | 100.0 | 2,221 | 100.0 |
| 7 | A | 2,307 | 91.8 | 1,970 | 92.5 | 1,476 | 82.1 | 1,582 | 87.6 | 2,227 | 94.7 |
| 7 | B+ | 76 | 3.0 | 66 | 3.1 | 109 | 6.1 | 83 | 4.6 | 54 | 2.3 |
| 7 | B- | 90 | 3.6 | 63 | 3.0 | 100 | 5.6 | 93 | 5.1 | 48 | 2.0 |
| 7 | C+ | 26 | 1.0 | 26 | 1.2 | 58 | 3.2 | 22 | 1.2 | 8 | 0.3 |
| 7 | C- | 15 | 0.6 | 5 | 0.2 | 54 | 3.0 | 26 | 1.4 | 14 | 0.6 |
| 7 | Total | 2,514 | 100.0 | 2,130 | 100.0 | 1,797 | 100.0 | 1,806 | 100.0 | 2,351 | 100.0 |
| 8 | A | 2,280 | 89.1 | 1,930 | 93.2 | 1,412 | 78.5 | 1,599 | 87.0 | 2,209 | 94.4 |
| 8 | B+ | 95 | 3.7 | 50 | 2.4 | 120 | 6.7 | 76 | 4.1 | 54 | 2.3 |
| 8 | B- | 127 | 5.0 | 51 | 2.5 | 114 | 6.3 | 95 | 5.2 | 53 | 2.3 |
| 8 | C+ | 33 | 1.3 | 24 | 1.2 | 81 | 4.5 | 27 | 1.5 | 12 | 0.5 |
| 8 | C- | 25 | 1.0 | 15 | 0.7 | 71 | 3.9 | 40 | 2.2 | 11 | 0.5 |
| 8 | Total | 2,560 | 100.0 | 2,070 | 100.0 | 1,798 | 100.0 | 1,837 | 100.0 | 2,339 | 100.0 |

 

Administration Format

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Data | Individual, Group | Individual, Group | Individual, Group | Individual, Group | Individual, Group | Individual, Group | Individual, Group | Individual, Group | Individual, Group |

Administration & Scoring Time

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Data | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes |

Scoring Format

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Data | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic |

Types of Decision Rules

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Data | None | None | None | None | None | None | None | None | None |

Evidence Available for Multiple Decision Rules

| Grade | K | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|---|---|---|
| Data | No | No | No | No | No | No | No | No | No |