i-Ready Diagnostics

Mathematics

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

$6.00/student/year for i-Ready Diagnostic for math. The annual license fee includes online student access to the assessment; staff access to the management and reporting suite, downloadable lesson plans, and user resources including the i-Ready Central support website; account set-up and secure hosting; all program maintenance, updates, and enhancements during the active license term; and unlimited user access to U.S.-based service and support via toll-free phone and email during business hours. Professional development is required and available at an additional cost ($2,000/session, up to six hours).

 

Replacement Cost:

License renewal fees subject to change annually.

 

Included in Cost:

i‑Ready Diagnostic is a fully web-based, vendor-hosted, Software-as-a-Service application. The per-student or site-based license fee includes account set-up and management; unlimited access to i-Ready’s assessment, management, and reporting functionality; plus unlimited access to U.S.-based customer service/technical support and all program maintenance, updates, and enhancements for as long as the license remains active. The license fee also includes hosting, data storage, and data security.

 

Via the i-Ready teacher and administrator dashboards and i-Ready Central support website, educators may access comprehensive user guides and downloadable lesson plans, as well as implementation tips, best practices, video tutorials, and more to supplement onsite, fee-based professional development. These resources are self-paced and available 24/7.

 

 

Technology Requirements:

  • Computer or tablet
  • Internet access

 

Training Requirements:

  • 4-8 hours of training

 

Qualified Administrators:

  • Professionals
  • Paraprofessionals

 

Accommodations:

Curriculum Associates engaged an independent consultant to evaluate i‑Ready Diagnostic’s accessibility. Overall, the report found that i-Ready “materials included significant functionality that indirectly supports… students with disabilities.” All items in i-Ready Diagnostic are designed to be accessible for most students. In a majority of cases, students who require accommodations (e.g., large print, extra time) will not require additional help during administration. The intentional integration of accessible design features should aid most students who typically require testing accommodations.

To address the elements of Universal Design as they apply to large-scale assessment (http://www.cehd.umn.edu/nceo/onlinepubs/Synthesis44.html), in developing i-Ready Diagnostic Curriculum Associates considered several issues related to accommodations. Most may be grouped into the following general categories that i‑Ready addresses:

  • Timing—Students may need extra time to complete the task. The Diagnostic assessment may be stopped and started as needed to allow students needing extra time to finish. The Diagnostic is untimed and can be administered in multiple test sessions. In fact, to ensure accurate results, a time limit is not recommended for any student, though administration must be completed within a period of no longer than 22 days.
  • Flexible Scheduling—Students may need multiple days to complete the assessment. i-Ready recommends that all students be given multiple days, as necessary, to complete the test (as noted above, administration must be completed within a period of no longer than 22 days).
  • Accommodated Presentation of Material—All i-Ready Diagnostic items are presented in a large, easily legible format chosen specifically for its readability. i‑Ready currently offers the ability to change the screen size; when the HTML5 items slated for a future release become available, users will also be able to adjust the font size. Only one item appears on the screen at a time. Most items for grade levels K–5 mathematics have optional audio support.
  • Setting—Students may need to complete the task in a quiet room to minimize distraction. This can easily be done, as i-Ready Diagnostic is available on any computer with internet access that meets the technical requirements. Furthermore, all students are encouraged to use quality headphones in order to hear the audio portion of the items. Headphones also help to cancel out peripheral noise, which can be distracting to students.
  • Response Accommodation—Students need only basic mouse control: moving the cursor and pointing, clicking, and dragging. We are moving toward iPad® compatibility (see updates at www.i-Ready.com/support), with a beta expected in 2017-2018. Touchscreen input is potentially easier for students with motor impairments.

Where to Obtain:

Website: www.curriculumassociates.com              

Address: 153 Rangeway Road, N. Billerica MA 01862

Phone number: 800-225-0248              

Email address: info@cainc.com


Access to Technical Support:

Dedicated account manager plus unlimited access to in-house technical support during business hours.  

Offering a continuum of scale scores from kindergarten through high school, i‑Ready Diagnostic is a web-based adaptive screening assessment for mathematics aligned with state and Common Core standards. i-Ready meets the expected rigor in each of the covered domains—Number and Operations/The Number System, Algebra and Algebraic Thinking, Geometry, and Measurement and Data—providing data and reports for each domain. Screening is administered up to three times per academic year, with 12-18 weeks of instruction between assessments. Each screening takes approximately 30-60 minutes—which may be broken into multiple sittings—and may be conducted with all students or with specific groups of students who have been identified as at risk of academic failure. i-Ready’s adaptive algorithm automatically selects from thousands of technology-enhanced and multiple-choice items to get to the core of each student's strengths and challenges, regardless of the grade level at which he or she is performing.

 

The system automatically analyzes, scores, and reports student responses and results. Available as soon as a student completes the assessment, i‑Ready’s intuitive reports provide comprehensive information (including developmental analyses) about student performance, group students who struggle with the same concepts, make instructional recommendations to target skill deficiencies, and monitor progress and growth as students follow their individualized instructional paths. Reports include suggested next steps for instruction and PDF Tools for Instruction lesson plans for the teacher to use during individual, small-group, or whole-class instruction. In addition, should educators also purchase the optional i‑Ready Instruction, the system automatically prescribes online lessons that address each student’s identified academic needs.

 

 

Assessment Format:

  • Direct: Computerized

 

Administration Time:

  • 30-60 minutes per student

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

i-Ready Diagnostic scale scores are linear transformations of logit values. Logits, also known as “log-odds units,” are measurement units for logarithmic probability models such as the Rasch model. Logits are used to express both student ability and item difficulty. Within the Rasch model, if a student’s ability matches an item’s difficulty, the student has a .50 chance of answering the item correctly. For i-Ready Diagnostic, student ability and item logit values generally range from around -6 to 6.
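The relationship described above can be sketched in a few lines. The Rasch probability follows directly from the model; the slope and intercept of the logit-to-scale transformation are hypothetical placeholders, since the actual i-Ready constants are not given in this document.

```python
import math

def rasch_probability(ability: float, difficulty: float) -> float:
    """P(correct) under the Rasch model, with ability and difficulty in logits."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def logit_to_scale(logit: float, slope: float = 20.0, intercept: float = 450.0) -> float:
    """Linear transformation from logits to a reporting scale.
    The slope and intercept here are hypothetical, for illustration only."""
    return slope * logit + intercept

# When ability equals item difficulty, the chance of a correct answer is .50:
print(rasch_probability(1.2, 1.2))  # 0.5
# Abilities spanning roughly -6 to 6 logits map linearly onto the reporting scale:
print(logit_to_scale(-6.0), logit_to_scale(6.0))
```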

 

Scores Generated:

  • Percentile score
  • IRT-based score            
  • Developmental benchmarks
  • Quantile score, on-grade achievement level placements

 

 

Classification Accuracy

Grade: 3, 4, 5, 6, 7, 8
Criterion 1 (Fall, Winter, Spring): Full bubble for all grades
Criterion 2 (Fall, Winter, Spring): Full bubble for all grades

Criterion 1: Smarter Balanced

Time of Year: Fall

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Cut points: 407, 426, 444, 458, 465, 474
Base rate in the sample for children requiring intensive intervention: 0.19, 0.19, 0.18, 0.21, 0.17, 0.20
Base rate in the sample for children considered at-risk, including those with the most intensive needs: 0.19, 0.19, 0.18, 0.21, 0.17, 0.20
False Positive Rate: 0.12, 0.10, 0.11, 0.10, 0.12, 0.10
False Negative Rate: 0.26, 0.23, 0.21, 0.20, 0.15, 0.19
Sensitivity: 0.74, 0.77, 0.79, 0.80, 0.85, 0.81
Specificity: 0.88, 0.90, 0.89, 0.90, 0.88, 0.90
Positive Predictive Power: 0.60, 0.65, 0.60, 0.68, 0.59, 0.66
Negative Predictive Power: 0.94, 0.94, 0.95, 0.94, 0.97, 0.95
Overall Classification Rate: 0.86, 0.88, 0.87, 0.88, 0.88, 0.88
Area Under the Curve (AUC): 0.92, 0.93, 0.93, 0.94, 0.95, 0.94
AUC 95% Confidence Interval Lower Bound: 0.92, 0.93, 0.92, 0.94, 0.94, 0.93
AUC 95% Confidence Interval Upper Bound: 0.93, 0.94, 0.93, 0.95, 0.95, 0.95
At 90% Sensitivity, specificity equals: 0.72, 0.78, 0.74, 0.81, 0.81, 0.79
At 80% Sensitivity, specificity equals: 0.88, 0.92, 0.88, 0.93, 0.94, 0.91
At 70% Sensitivity, specificity equals: 0.94, 0.96, 0.95, 0.98, 0.98, 0.96
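As a sanity check, the predictive-power and overall-classification figures above can be rederived from the base rate, sensitivity, and specificity alone. The sketch below uses the Grade 3 Fall column; small discrepancies against the reported values reflect rounding of the inputs.

```python
def classification_metrics(base_rate, sensitivity, specificity):
    """Derive predictive power and overall classification rate from the
    base rate, sensitivity, and specificity (all as proportions)."""
    tp = base_rate * sensitivity            # true positives
    fn = base_rate * (1 - sensitivity)      # false negatives
    tn = (1 - base_rate) * specificity      # true negatives
    fp = (1 - base_rate) * (1 - specificity)  # false positives
    return {
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "overall": tp + tn,
    }

# Grade 3, Fall (Smarter Balanced): base rate 0.19, sensitivity 0.74, specificity 0.88
m = classification_metrics(0.19, 0.74, 0.88)
print(m)  # PPV ~0.59, NPV ~0.94, overall ~0.85 (reported: 0.60, 0.94, 0.86)
```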

 

Criterion 1: Smarter Balanced

Time of Year: Winter

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Cut points: 420, 439, 453, 465, 472, 481
Base rate in the sample for children requiring intensive intervention: 0.19, 0.17, 0.19, 0.17, 0.16, 0.17
Base rate in the sample for children considered at-risk, including those with the most intensive needs: 0.19, 0.17, 0.19, 0.17, 0.16, 0.17
False Positive Rate: 0.10, 0.10, 0.11, 0.10, 0.11, 0.10
False Negative Rate: 0.23, 0.17, 0.23, 0.16, 0.15, 0.22
Sensitivity: 0.77, 0.83, 0.77, 0.84, 0.85, 0.78
Specificity: 0.90, 0.90, 0.89, 0.90, 0.89, 0.90
Positive Predictive Power: 0.63, 0.62, 0.62, 0.64, 0.60, 0.61
Negative Predictive Power: 0.94, 0.96, 0.95, 0.96, 0.97, 0.95
Overall Classification Rate: 0.87, 0.88, 0.87, 0.89, 0.88, 0.88
Area Under the Curve (AUC): 0.93, 0.94, 0.93, 0.95, 0.94, 0.93
AUC 95% Confidence Interval Lower Bound: 0.92, 0.93, 0.92, 0.94, 0.94, 0.92
AUC 95% Confidence Interval Upper Bound: 0.94, 0.94, 0.94, 0.95, 0.95, 0.94
At 90% Sensitivity, specificity equals: 0.76, 0.80, 0.76, 0.82, 0.82, 0.76
At 80% Sensitivity, specificity equals: 0.90, 0.90, 0.91, 0.94, 0.93, 0.89
At 70% Sensitivity, specificity equals: 0.95, 0.96, 0.96, 0.97, 0.96, 0.96

 

Criterion 1: Smarter Balanced

Time of Year: Spring

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Cut points: 430, 446, 459, 470, 474, 482
Base rate in the sample for children requiring intensive intervention: 0.19, 0.17, 0.20, 0.20, 0.20, 0.24
Base rate in the sample for children considered at-risk, including those with the most intensive needs: 0.19, 0.17, 0.20, 0.20, 0.20, 0.24
False Positive Rate: 0.10, 0.10, 0.10, 0.09, 0.12, 0.11
False Negative Rate: 0.19, 0.15, 0.23, 0.16, 0.18, 0.21
Sensitivity: 0.81, 0.85, 0.77, 0.84, 0.82, 0.79
Specificity: 0.90, 0.90, 0.90, 0.91, 0.88, 0.89
Positive Predictive Power: 0.64, 0.62, 0.67, 0.69, 0.63, 0.70
Negative Predictive Power: 0.95, 0.97, 0.94, 0.96, 0.95, 0.93
Overall Classification Rate: 0.88, 0.89, 0.88, 0.89, 0.87, 0.87
Area Under the Curve (AUC): 0.94, 0.95, 0.94, 0.95, 0.94, 0.93
AUC 95% Confidence Interval Lower Bound: 0.93, 0.94, 0.94, 0.95, 0.93, 0.93
AUC 95% Confidence Interval Upper Bound: 0.94, 0.95, 0.95, 0.96, 0.94, 0.94
At 90% Sensitivity, specificity equals: 0.81, 0.84, 0.78, 0.85, 0.79, 0.78
At 80% Sensitivity, specificity equals: 0.91, 0.94, 0.94, 0.95, 0.93, 0.90
At 70% Sensitivity, specificity equals: 0.96, 0.98, 0.97, 0.97, 0.97, 0.96

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Cross-Validation Sample

Criterion: New York State Testing Program

Time of Year:  Fall

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Cut points: 407, 426, 444, 458, 465, 474
Base rate in the sample for children requiring intensive intervention: 0.29, 0.27, 0.22, 0.23, 0.23, 0.31
Base rate in the sample for children considered at-risk, including those with the most intensive needs: 0.29, 0.27, 0.22, 0.23, 0.23, 0.31
False Positive Rate: 0.16, 0.14, 0.19, 0.15, 0.19, 0.12
False Negative Rate: 0.25, 0.20, 0.15, 0.19, 0.12, 0.30
Sensitivity: 0.75, 0.80, 0.85, 0.81, 0.88, 0.70
Specificity: 0.84, 0.86, 0.81, 0.85, 0.81, 0.88
Positive Predictive Power: 0.67, 0.68, 0.56, 0.61, 0.58, 0.73
Negative Predictive Power: 0.89, 0.92, 0.95, 0.93, 0.96, 0.87
Overall Classification Rate: 0.82, 0.84, 0.82, 0.84, 0.82, 0.83
Area Under the Curve (AUC): 0.92, 0.93, 0.93, 0.94, 0.95, 0.94
AUC 95% Confidence Interval Lower Bound: 0.92, 0.93, 0.92, 0.94, 0.94, 0.93
AUC 95% Confidence Interval Upper Bound: 0.93, 0.94, 0.93, 0.95, 0.95, 0.95
At 90% Sensitivity, specificity equals: 0.65, 0.73, 0.71, 0.69, 0.74, 0.70
At 80% Sensitivity, specificity equals: 0.83, 0.86, 0.84, 0.86, 0.87, 0.84
At 70% Sensitivity, specificity equals: 0.91, 0.91, 0.93, 0.92, 0.93, 0.91

 

Criterion: New York State Testing Program

Time of Year:  Winter

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Cut points: 420, 439, 453, 465, 472, 481
Base rate in the sample for children requiring intensive intervention: 0.26, 0.22, 0.20, 0.22, 0.28, 0.32
Base rate in the sample for children considered at-risk, including those with the most intensive needs: 0.26, 0.22, 0.20, 0.22, 0.28, 0.32
False Positive Rate: 0.16, 0.17, 0.19, 0.17, 0.20, 0.11
False Negative Rate: 0.18, 0.15, 0.11, 0.14, 0.15, 0.29
Sensitivity: 0.82, 0.85, 0.89, 0.86, 0.85, 0.71
Specificity: 0.84, 0.83, 0.81, 0.83, 0.80, 0.89
Positive Predictive Power: 0.65, 0.58, 0.55, 0.59, 0.62, 0.75
Negative Predictive Power: 0.93, 0.95, 0.97, 0.95, 0.93, 0.87
Overall Classification Rate: 0.84, 0.83, 0.83, 0.84, 0.81, 0.83
Area Under the Curve (AUC): 0.92, 0.92, 0.92, 0.91, 0.91, 0.89
AUC 95% Confidence Interval Lower Bound: 0.90, 0.91, 0.91, 0.90, 0.89, 0.87
AUC 95% Confidence Interval Upper Bound: 0.93, 0.93, 0.93, 0.93, 0.92, 0.91
At 90% Sensitivity, specificity equals: 0.72, 0.76, 0.77, 0.73, 0.72, 0.65
At 80% Sensitivity, specificity equals: 0.87, 0.87, 0.87, 0.86, 0.84, 0.84
At 70% Sensitivity, specificity equals: 0.94, 0.93, 0.92, 0.92, 0.90, 0.89

 

Criterion: New York State Testing Program

Time of Year:  Spring

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Cut points: 430, 446, 459, 470, 474, 482
Base rate in the sample for children requiring intensive intervention: 0.22, 0.20, 0.19, 0.20, 0.24, 0.25
Base rate in the sample for children considered at-risk, including those with the most intensive needs: 0.22, 0.20, 0.19, 0.20, 0.24, 0.25
False Positive Rate: 0.18, 0.18, 0.19, 0.16, 0.18, 0.13
False Negative Rate: 0.14, 0.09, 0.09, 0.11, 0.12, 0.23
Sensitivity: 0.86, 0.91, 0.94, 0.89, 0.88, 0.77
Specificity: 0.82, 0.82, 0.81, 0.84, 0.82, 0.87
Positive Predictive Power: 0.57, 0.55, 0.54, 0.58, 0.59, 0.66
Negative Predictive Power: 0.95, 0.97, 0.98, 0.97, 0.96, 0.92
Overall Classification Rate: 0.93, 0.84, 0.83, 0.85, 0.83, 0.84
Area Under the Curve (AUC): 0.92, 0.93, 0.94, 0.93, 0.92, 0.90
AUC 95% Confidence Interval Lower Bound: 0.91, 0.92, 0.93, 0.92, 0.91, 0.88
AUC 95% Confidence Interval Upper Bound: 0.94, 0.94, 0.95, 0.94, 0.94, 0.92
At 90% Sensitivity, specificity equals: 0.76, 0.77, 0.79, 0.77, 0.75, 0.71
At 80% Sensitivity, specificity equals: 0.88, 0.91, 0.91, 0.88, 0.88, 0.84
At 70% Sensitivity, specificity equals: 0.95, 0.97, 0.96, 0.95, 0.94, 0.90

 

Reliability

Grade: 3, 4, 5, 6, 7, 8
Rating: Full bubble for all grades
  1. Justification for each type of reliability reported, given the type and purpose of the tool:

The i-Ready Diagnostic provides two types of reliability estimates:

  • IRT-based reliability measures such as the marginal reliability estimate and standard error of measurement.
  • Test-retest reliability coefficients.

 

Marginal Reliability:

Because i-Ready Diagnostic is a computer-adaptive assessment without a fixed form, some traditional reliability estimates, such as Cronbach’s alpha, are not appropriate indices for quantifying consistency or inconsistency in student performance. The IRT analogue to classical reliability, called marginal reliability, operates on the variance of the theta scores and the average of the expected error variance. Marginal reliability uses the classical definition of reliability as the proportion of variance in the total observed score due to true score, under an IRT model (specifically, i-Ready Diagnostic uses a Rasch model).

 

Standard Error of Measurement (SEM):

In an IRT model, SEMs are affected by factors such as how well the data fit the underlying model, student response consistency, student location on the ability continuum, match of items to student ability, and test length. Given the adaptive nature of i-Ready and the wide difficulty range in the item bank, standard errors are expected to be low and very close to the theoretical minimum for the test of the given length.

The theoretical minimum would be reached if each interim estimate of student ability were assessed by an item whose difficulty exactly matched the ability estimated from the previous items. Theoretical minimums are restricted by the number of items served in the assessment—the more items served, the lower the SEM can be. For mathematics, the minimum SEM for overall scores is 6.0.

 

In addition to providing the mean SEM by subject and grade, graphical representations of the conditional standard errors of measurement (CSEM) provide additional evidence of the precision with which i-Ready measures student ability across the operational score scale. The figures included on pages 23–24 better contextualize the table of reliability analyses. In the context of model-based reliability analyses for computer-adaptive tests such as i-Ready, CSEM plots permit test users to judge the relative precision of the estimate.

 

For additional context, these figures mark the scale scores associated with the 1st and 99th percentile ranks to indicate the score range within which most (98%) students fall.

 

Test-retest Reliability:

The i-Ready Diagnostic is often used as an interim assessment, and students can take the assessment multiple times a year. Test-retest reliability is therefore an appropriate way to estimate score stability for the same students across two Diagnostic administrations.

 

  2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

Data for obtaining the marginal reliability and SEM came from the August and September 2016 administrations of the i-Ready Diagnostic (reported in the 2016 i-Ready Diagnostic technical report). All students tested within that timeframe were included. Sample sizes by grade are presented in the table below (under question 4).

 

Evidence of test-retest stability was assessed based on a subsample of students who, during the 2016–2017 school year, took the i-Ready Diagnostic twice within the recommended 12–18-week testing window. The average testing interval was 106 days (about 15 weeks). Sample sizes by grade are presented in the table below (under question 4).

 

 

  3. Description of the analysis procedures for each reported type of reliability:

Marginal reliability uses the classical definition of reliability as the proportion of variance in the total observed score due to true score. The true score variance is computed as the observed score variance minus the error variance.

 

Similar to a classical reliability coefficient, the marginal reliability estimate increases as the standard error decreases; it approaches 1 when the standard error approaches 0.

 

The observed score variance, the error variance, and SEM (the square root of the error variance) are obtained through WINSTEPS calibrations. One separate calibration was conducted for each grade.
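The computation described above reduces to a one-line ratio. The sketch below uses an illustrative observed variance (chosen so the result lands near the reported kindergarten coefficient), since the actual observed variances are not listed in this document.

```python
def marginal_reliability(observed_var: float, mean_error_var: float) -> float:
    """Marginal reliability: true-score variance (observed minus error)
    as a proportion of observed score variance."""
    return (observed_var - mean_error_var) / observed_var

# Illustrative numbers only: with a grade-level SEM of 6.48 scale-score points
# (error variance 6.48**2, about 42) and a hypothetical observed variance of 525,
# the estimate is close to the reported kindergarten value of 0.92.
print(round(marginal_reliability(525.0, 6.48**2), 2))  # 0.92
```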

 

For test-retest reliability, Pearson correlation coefficients were calculated between scores on the two Diagnostic tests. In lower grades, where growth and variability are expected to be greater, test-retest correlations are expected to be relatively lower.
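The test-retest coefficient is an ordinary Pearson correlation between the two administrations' scale scores. A minimal sketch, using hypothetical scores for five students:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scale scores for five students on two Diagnostic administrations:
first = [410, 425, 433, 447, 460]
second = [418, 431, 441, 452, 471]
print(round(pearson_r(first, second), 2))  # 0.99
```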

 

  4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval
Marginal | K | 191,221 | 0.92 | n/a*
Marginal | 1 | 298,476 | 0.93 | n/a*
Marginal | 2 | 334,238 | 0.94 | n/a*
Marginal | 3 | 376,087 | 0.95 | n/a*
Marginal | 4 | 366,044 | 0.96 | n/a*
Marginal | 5 | 366,142 | 0.96 | n/a*
Marginal | 6 | 276,255 | 0.96 | n/a*
Marginal | 7 | 254,216 | 0.97 | n/a*
Marginal | 8 | 238,758 | 0.97 | n/a*
Test-retest | K | 112,893 | 0.71 | 0.71, 0.71
Test-retest | 1 | 161,070 | 0.77 | 0.77, 0.77
Test-retest | 2 | 184,872 | 0.81 | 0.81, 0.81
Test-retest | 3 | 213,324 | 0.83 | 0.82, 0.83
Test-retest | 4 | 214,833 | 0.85 | 0.85, 0.85
Test-retest | 5 | 212,796 | 0.87 | 0.86, 0.87
Test-retest | 6 | 160,344 | 0.87 | 0.87, 0.88
Test-retest | 7 | 141,754 | 0.87 | 0.87, 0.87
Test-retest | 8 | 130,054 | 0.87 | 0.87, 0.87
SEM | K | 191,221 | 6.48 | n/a*
SEM | 1 | 298,476 | 6.45 | n/a*
SEM | 2 | 334,238 | 6.43 | n/a*
SEM | 3 | 376,087 | 6.43 | n/a*
SEM | 4 | 366,044 | 6.43 | n/a*
SEM | 5 | 366,142 | 6.43 | n/a*
SEM | 6 | 276,255 | 6.43 | n/a*
SEM | 7 | 254,216 | 6.43 | n/a*
SEM | 8 | 238,758 | 6.44 | n/a*

* n/a: Confidence intervals are not applicable to marginal reliability estimates or SEMs because of how they are calculated for our computer-adaptive assessment. CSEM plots demonstrating relative measurement precision across the i-Ready score scale are available from NCII upon request.

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

Type of Reliability | Subgroup | Age or Grade | n | Coefficient | Confidence Interval
None | — | — | — | — | —

 

Validity

Grade: 3, 4, 5, 6, 7, 8
Rating: Full bubble for all grades

1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

The internal structure of the i-Ready Diagnostic assessments is supported by the construct maps and the ordering of the skills addressed at different stages on the map. We recognize that coverage of skills and difficulty of items will overlap a fair amount across grades, as much material is reviewed from year to year. However, what should be apparent from the estimated item difficulties is that, generally, items measuring skills targeting lower levels of the map should be easier, and items measuring skills targeting higher levels of the map should be more difficult.

 

2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Active items in the current item pool for the 2016–2017 school year are included in the analysis of internal validity. The number of items per grade is listed in the table below.

 

3. Description of the analysis procedures for each reported type of validity:

Distributions of indicator difficulties by grade level provide further evidence of internal structure. The difficulty of an indicator corresponds to a 67% probability of passing on the indicator characteristic curve aggregated across all items aligned to the indicator. The table below shows the average and standard deviation of indicator difficulties.
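Under the Rasch model used elsewhere in this document, a 67% passing probability pins the indicator difficulty a fixed distance above the 50% point, since p = 0.67 corresponds to an ability-minus-difficulty gap of log(0.67/0.33) logits. A small sketch:

```python
import math

def ability_at_probability(difficulty: float, p: float) -> float:
    """Rasch ability (same units as difficulty) at which the probability
    of passing equals p."""
    return difficulty + math.log(p / (1 - p))

# The 67% point sits about 0.71 logits above an item's 50% point:
print(round(ability_at_probability(0.0, 0.67), 2))  # 0.71
```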

 

4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Validity | Age or Grade | Indicator Difficulty (Mean) | Indicator Difficulty (SD) | Number of Items
Internal | K | 371.97 | 21.38 | 241
Internal | 1 | 398.10 | 19.15 | 245
Internal | 2 | 431.24 | 22.00 | 324
Internal | 3 | 463.80 | 20.28 | 306
Internal | 4 | 483.96 | 24.28 | 354
Internal | 5 | 508.11 | 19.15 | 270
Internal | 6 | 521.85 | 23.29 | 374
Internal | 7 | 546.38 | 15.59 | 261
Internal | 8 | 548.85 | 19.72 | 223

 

 

 

Type of Validity | Age or Grade | Test or Criterion | n | corr | 95% CI Lower Bound | 95% CI Upper Bound
Concurrent/Predictive | K | Quantile* | 1,074 | 0.59 | 0.55 | 0.63
Concurrent/Predictive | 1 | Quantile* | 1,074 | 0.59 | 0.55 | 0.63
Concurrent/Predictive | 2 | Quantile* | 1,074 | 0.59 | 0.55 | 0.63
Predictive | 3 | PARCC | 5,969 | 0.78 | 0.77 | 0.79
Predictive | 4 | PARCC | 6,067 | 0.80 | 0.79 | 0.81
Predictive | 5 | PARCC | 5,899 | 0.81 | 0.80 | 0.82
Predictive | 6 | PARCC | 4,096 | 0.79 | 0.78 | 0.80
Predictive | 7 | PARCC | 3,913 | 0.80 | 0.79 | 0.81
Predictive | 8 | PARCC | 3,146 | 0.79 | 0.77 | 0.80
Concurrent | 3 | NC | 7,662 | 0.82 | 0.81 | 0.83
Concurrent | 4 | NC | 7,686 | 0.82 | 0.81 | 0.83
Concurrent | 5 | NC | 7,208 | 0.82 | 0.81 | 0.83
Concurrent | 6 | NC | 4,829 | 0.83 | 0.82 | 0.84
Concurrent | 7 | NC | 5,578 | 0.82 | 0.81 | 0.83
Concurrent | 8 | NC | 5,086 | 0.81 | 0.80 | 0.82
Concurrent | 3 | MS | 3,483 | 0.84 | 0.83 | 0.85
Concurrent | 4 | MS | 3,750 | 0.86 | 0.85 | 0.86
Concurrent | 5 | MS | 3,481 | 0.84 | 0.83 | 0.85
Concurrent | 6 | MS | 3,570 | 0.85 | 0.84 | 0.86
Concurrent | 7 | MS | 3,104 | 0.84 | 0.83 | 0.85
Concurrent | 8 | MS | 2,942 | 0.85 | 0.84 | 0.86
Concurrent | 3 | OH | 2,429 | 0.81 | 0.79 | 0.82
Concurrent | 4 | OH | 2,151 | 0.82 | 0.80 | 0.83
Concurrent | 5 | OH | 2,183 | 0.84 | 0.84 | 0.85
Concurrent | 6 | OH | 1,241 | 0.85 | 0.83 | 0.86
Concurrent | 7 | OH | 1,114 | 0.82 | 0.80 | 0.84
Concurrent | 8 | OH | 935 | 0.80 | 0.77 | 0.82

* For the purposes of the Quantile study referenced above, grade-banded results are featured, rather than grade-specific results. The i-Ready Diagnostic scale scores are created on a vertical scale which makes the scale scores comparable across grades. Thus, for efficiency purposes, the linking sample for the Quantile study includes only students from every other grade (i.e., grades 1, 3, 5, and 7), but results are generalized across grades in various grade bands (e.g., K-2). Additional information on the Quantile study, which was conducted in concert with MetaMetrics, is available upon request. 

 

Results for other forms of validity (e.g. factor analysis) not conducive to the table format:

Not provided

 

5. Describe the degree to which the provided data support the validity of the tool:

The internal structure of the i-Ready Diagnostic assessments is supported by the construct maps and the ordering of the skills addressed at different stages on the map. Skills representing the lower levels on the construct map are those generally associated with items targeted at lower grade levels, and skills representing the higher levels on the map are ones generally associated with items targeted at higher grade levels.

 

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval
None | — | — | — | — | — | —

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format:

Not provided

Sample Representativeness

Grade: 3, 4, 5, 6, 7, 8
Rating: Empty bubble for all grades

Primary Classification Accuracy Sample

Representation: National (West North Central, Pacific). The data for the 3-8 analysis are from three states in the Smarter Balanced Assessment Consortium (SBAC) program and have been sampled and stratified to reflect the representation of all SBAC states during the 2015-16 school year (the most recent data available at the time of the analysis); the analysis builds upon our 2014-15 SBAC study, which used a slightly different sample of states. The same SBAC stratified sample used for this analysis was also used for the i-Ready prediction model, which is used by all i-Ready schools in the 14 SBAC states to predict proficiency on SBAC using the i-Ready assessment.

Date: Spring 2016

Size: 38,745

 

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Male: 49%, 49%, 48%, 48%, 48%, 49%
Female: 51%, 51%, 52%, 52%, 52%, 51%
Unknown Gender: 0%, 0%, 0%, 0%, 0%, 0%
Free or reduced-price lunch: 27%, 27%, 20%, 10%, 20%, 22%
White: 40%, 40%, 38%, 28%, 30%, 29%
Black or African American: 6%, 6%, 6%, 4%, 4%, 4%
Hispanic: 27%, 28%, 27%, 25%, 25%, 25%
American Indian or Alaskan: 0%, 0%, 0%, 0%, 0%, 0%
Asian: 14%, 16%, 15%, 11%, 15%, 14%
Native Hawaiian or P. Islander: 1%, 1%, 1%, 1%, 1%, 1%
Other: 4%, 4%, 4%, 2%, 2%, 2%
Unknown Ethnicity: 7%, 6%, 10%, 28%, 24%, 25%
Disability Classification: Unknown for all grades
First language: Unknown for all grades
ELL: 25%, 19%, 15%, 10%, 8%, 8%

 

 

Cross Validation Sample

Representation: National (Middle Atlantic). The data for the 3-8 analysis are from the New York State Testing Program and have been sampled and stratified to reflect the representation of all New York State districts during the 2015-16 school year (the most recent data available at the time of the analysis); the analysis builds upon our 2014-15 New York State study, which used a slightly different sample of districts. The same New York State stratified sample used for this analysis was also used for the i-Ready prediction model, which is used by all i-Ready schools in New York State to predict proficiency on the New York State Testing Program using the i-Ready assessment.

Date: Spring 2016

Size: 12,424

 

 

Values below are listed for Grades 3, 4, 5, 6, 7, and 8, in order.

Male: 49%, 51%, 49%, 53%, 44%, 53%
Female: 51%, 49%, 51%, 47%, 55%, 47%
Unknown Gender: 0%, 0%, 0%, 0%, 0%, 0%
Free or reduced-price lunch: 11%, 9%, 19%, 20%, 22%, 61%
American Indian or Alaskan: 0%, 0%, 0%, 0%, 1%, 0%
Asian: 12%, 11%, 12%, 10%, 12%, 11%
Black or African American: 19%, 21%, 20%, 19%, 19%, 18%
Hispanic: 19%, 17%, 18%, 17%, 17%, 17%
Native Hawaiian or P. Islander: 44%, 46%, 47%, 51%, 50%, 54%
White: 5%, 4%, 3%, 3%, 1%, 0%
Other: 0%, 0%, 1%, 0%, 0%, 0%
Unknown Ethnicity: 0%, 0%, 0%, 0%, 1%, 0%
Disability Classification: Unknown for all grades
First language: Unknown for all grades
ELL: 28%, 30%, 32%, 35%, 46%, 45%

 

Bias Analysis Conducted

Grade: 3, 4, 5, 6, 7, 8
Rating: Yes for all grades
  1. Description of the method used to determine the presence or absence of bias:

Differential Item Functioning (DIF) was investigated using WINSTEPS® by comparing the item difficulty measure for two demographic categories in a pairwise comparison through a combined calibration analysis. The essence of this methodology is to investigate the interaction of the person-groups with each item, while fixing all other item and person measures to those from the combined calibration. The method used to detect DIF is based on the Mantel-Haenszel procedure (MH) and the work of Linacre & Wright (1989) and Linacre (2012). Typically, the group representing test takers in a specific demographic group is referred to as the focal group; the group made up of test takers from outside this group is referred to as the reference group. For example, for gender, Female is the focal group and Male is the reference group.
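A generic sketch of the Mantel-Haenszel idea behind this procedure (not WINSTEPS's exact implementation): a common odds ratio is pooled across ability strata and re-expressed on the ETS delta scale, where the sign indicates which group the item favors.

```python
import math

def mantel_haenszel_delta(strata):
    """Mantel-Haenszel common odds ratio across ability strata, expressed
    on the ETS delta scale. Each stratum is a 2x2 table:
    (ref_correct, ref_incorrect, focal_correct, focal_incorrect)."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    alpha = num / den
    # Convention assumed here: negative delta indicates DIF against the focal group.
    return -2.35 * math.log(alpha)

# Hypothetical item with no DIF: within each stratum both groups answer
# correctly at the same rate, so the pooled odds ratio is 1 and delta is 0.
strata = [(30, 20, 30, 20), (40, 10, 40, 10)]
print(mantel_haenszel_delta(strata))
```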

 

  2. Description of the subgroups for which bias analyses were conducted:

The latest large-scale DIF analysis included a random sample (10%) of students from the 2015–2016 i-Ready operational data. Given the large size of the 2015–2016 i-Ready student population, it is practical to carry out the calibration analysis with a random sample. The following demographic categories were compared: Female vs. Male; African American and Hispanic vs. Caucasian; English Learner vs. non–English Learner; Special Ed vs. General Ed; Economically Disadvantaged vs. Not Economically Disadvantaged. In each pairwise comparison, estimates of item difficulty for each category in the comparison were calculated. The table below presents the total number (in thousands) and percentage of students included in the DIF analysis.

| DIF Group | DIF Variable | N Count (thousands) | Percent |
|---|---|---|---|
| Gender | Male | 267.2 | 52 |
| Gender | Female* | 247 | 48 |
| Ethnicity | Caucasian | 126.4 | 34.1 |
| Ethnicity | African American and Hispanic* | 244.1 | 65.9 |
| EL | non–English Learner | 262.7 | 80.8 |
| EL | English Learner* | 62.4 | 19.2 |
| Special Ed | General Ed | 181 | 85.1 |
| Special Ed | Special Ed* | 31.6 | 14.9 |
| Economic Status | Not Economically Disadvantaged | 192.1 | 67.1 |
| Economic Status | Economically Disadvantaged* | 94.1 | 32.9 |

\* indicates the focal group.

 

  3. Description of the results of the bias analyses conducted, including data and interpretative statements:

Active items in the current item pool for the 2016–2017 school year were included in the DIF analysis: 3,103 Mathematics items in total. WINSTEPS (Version 3.92) was used to conduct the calibration for the DIF analysis by grade. To help interpret the results, the Educational Testing Service (ETS) criteria based on the delta method (Zwick, Thayer, & Lewis, 1999) were used to categorize DIF, as presented below:

| ETS DIF Category | Definition |
|---|---|
| A (negligible) | \|DIF\| < 0.43 |
| B (moderate) | 0.43 ≤ \|DIF\| < 0.64 |
| C (large) | \|DIF\| ≥ 0.64 |

B- or C- suggests DIF against the focal group; B+ or C+ suggests DIF against the reference group.
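The ETS cutoffs are simple to apply mechanically. A minimal sketch (the function name is our own; we assume the usual convention that a negative D-DIF disadvantages the focal group, matching the table notes):

```python
def ets_dif_category(d_dif):
    """Map an MH D-DIF value onto the ETS A/B/C scheme.

    Sign suffix: '-' marks DIF against the focal group,
    '+' marks DIF against the reference group.
    """
    magnitude = abs(d_dif)
    if magnitude < 0.43:
        return "A"  # negligible
    level = "B" if magnitude < 0.64 else "C"  # moderate vs. large
    return level + ("-" if d_dif < 0 else "+")
```

For example, a D-DIF of -0.5 falls in category B- (moderate DIF against the focal group).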

 

 

The number and percentage of items exhibiting DIF for each of the demographic categories are reported in the table below. Note that not all students have individual demographic information, so the total number of items for the two mutually exclusive groups in a category does not necessarily equal the total number of items. The majority of Mathematics items show negligible DIF (mostly more than 90 percent), and very few items (less than 6 percent) show large DIF (level C) at any grade.

| Grade | ETS DIF Category | Gender N | Gender % | Ethnicity N | Ethnicity % | ELL N | ELL % | Special Ed N | Special Ed % | Econ. Disadv. N | Econ. Disadv. % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | A | 602 | 98.4 | 579 | 94.6 | 560 | 98.6 | 424 | 97.5 | 582 | 98.6 |
| 0 | B+ | 4 | 0.7 | 16 | 2.6 | 4 | 0.7 | 2 | 0.5 | 2 | 0.3 |
| 0 | B- | 4 | 0.7 | 9 | 1.5 | 3 | 0.5 | 9 | 2.1 | 5 | 0.8 |
| 0 | C+ | 0 | 0.0 | 7 | 1.1 | 1 | 0.2 | 0 | 0.0 | 0 | 0.0 |
| 0 | C- | 2 | 0.3 | 1 | 0.2 | 568 | 100.0 | 0 | 0.0 | 1 | 0.2 |
| 0 | Total | 612 | 100.0 | 612 | 100.0 | 817 | 96.5 | 435 | 100.0 | 590 | 100.0 |
| 1 | A | 895 | 97.8 | 841 | 94.0 | 19 | 2.2 | 733 | 98.1 | 861 | 98.6 |
| 1 | B+ | 10 | 1.1 | 30 | 3.4 | 4 | 0.5 | 4 | 0.5 | 1 | 0.1 |
| 1 | B- | 5 | 0.5 | 9 | 1.0 | 7 | 0.8 | 10 | 1.3 | 5 | 0.6 |
| 1 | C+ | 4 | 0.4 | 12 | 1.3 | 0 | 0.0 | 0 | 0.0 | 5 | 0.6 |
| 1 | C- | 1 | 0.1 | 3 | 0.3 | 0 | 0.0 | 0 | 0.0 | 1 | 0.1 |
| 1 | Total | 915 | 100.0 | 895 | 100.0 | 847 | 100.0 | 747 | 100.0 | 873 | 100.0 |
| 2 | A | 1,160 | 97.3 | 1,062 | 93.9 | 1,095 | 96.8 | 1,000 | 97.9 | 1,134 | 99.0 |
| 2 | B+ | 24 | 2.0 | 42 | 3.7 | 20 | 1.8 | 10 | 1.0 | 5 | 0.4 |
| 2 | B- | 4 | 0.3 | 16 | 1.4 | 9 | 0.8 | 8 | 0.8 | 5 | 0.4 |
| 2 | C+ | 4 | 0.3 | 10 | 0.9 | 7 | 0.6 | 1 | 0.1 | 1 | 0.1 |
| 2 | C- | 0 | 0.0 | 1 | 0.1 | 0 | 0.0 | 2 | 0.2 | 0 | 0.0 |
| 2 | Total | 1,192 | 100.0 | 1,131 | 100.0 | 1,131 | 100.0 | 1,021 | 100.0 | 1,145 | 100.0 |
| 3 | A | 1,576 | 96.2 | 1,434 | 91.7 | 1,396 | 94.3 | 1,297 | 95.9 | 1,509 | 97.0 |
| 3 | B+ | 29 | 1.8 | 51 | 3.3 | 45 | 3.0 | 21 | 1.6 | 8 | 0.5 |
| 3 | B- | 20 | 1.2 | 46 | 2.9 | 25 | 1.7 | 26 | 1.9 | 24 | 1.5 |
| 3 | C+ | 8 | 0.5 | 13 | 0.8 | 9 | 0.6 | 5 | 0.4 | 1 | 0.1 |
| 3 | C- | 5 | 0.3 | 19 | 1.2 | 6 | 0.4 | 4 | 0.3 | 14 | 0.9 |
| 3 | Total | 1,638 | 100.0 | 1,563 | 100.0 | 1,481 | 100.0 | 1,353 | 100.0 | 1,556 | 100.0 |
| 4 | A | 1,812 | 95.1 | 1,610 | 90.6 | 1,588 | 91.0 | 1,467 | 95.0 | 1,759 | 96.3 |
| 4 | B+ | 44 | 2.3 | 66 | 3.7 | 52 | 3.0 | 26 | 1.7 | 18 | 1.0 |
| 4 | B- | 37 | 1.9 | 69 | 3.9 | 66 | 3.8 | 37 | 2.4 | 36 | 2.0 |
| 4 | C+ | 9 | 0.5 | 20 | 1.1 | 20 | 1.1 | 5 | 0.3 | 4 | 0.2 |
| 4 | C- | 3 | 0.2 | 12 | 0.7 | 20 | 1.1 | 9 | 0.6 | 10 | 0.5 |
| 4 | Total | 1,905 | 100.0 | 1,777 | 100.0 | 1,746 | 100.0 | 1,544 | 100.0 | 1,827 | 100.0 |
| 5 | A | 2,113 | 93.7 | 1,779 | 89.4 | 1,677 | 89.6 | 1,488 | 92.6 | 2,039 | 94.5 |
| 5 | B+ | 62 | 2.7 | 79 | 4.0 | 63 | 3.4 | 42 | 2.6 | 41 | 1.9 |
| 5 | B- | 51 | 2.3 | 88 | 4.4 | 86 | 4.6 | 58 | 3.6 | 50 | 2.3 |
| 5 | C+ | 18 | 0.8 | 28 | 1.4 | 18 | 1.0 | 10 | 0.6 | 14 | 0.6 |
| 5 | C- | 11 | 0.5 | 17 | 0.9 | 28 | 1.5 | 9 | 0.6 | 13 | 0.6 |
| 5 | Total | 2,255 | 100.0 | 1,991 | 100.0 | 1,872 | 100.0 | 1,607 | 100.0 | 2,157 | 100.0 |
| 6 | A | 2,169 | 91.3 | 1,717 | 89.6 | 1,483 | 86.3 | 1,420 | 88.8 | 2,081 | 93.5 |
| 6 | B+ | 73 | 3.1 | 95 | 5.0 | 70 | 4.1 | 53 | 3.3 | 47 | 2.1 |
| 6 | B- | 84 | 3.5 | 70 | 3.7 | 118 | 6.9 | 76 | 4.8 | 58 | 2.6 |
| 6 | C+ | 28 | 1.2 | 20 | 1.0 | 23 | 1.3 | 20 | 1.3 | 13 | 0.6 |
| 6 | C- | 21 | 0.9 | 15 | 0.8 | 25 | 1.5 | 30 | 1.9 | 27 | 1.2 |
| 6 | Total | 2,375 | 100.0 | 1,917 | 100.0 | 1,719 | 100.0 | 1,599 | 100.0 | 2,226 | 100.0 |
| 7 | A | 2,296 | 92.5 | 1,796 | 85.2 | 1,474 | 84.5 | 1,359 | 88.3 | 2,158 | 93.5 |
| 7 | B+ | 77 | 3.1 | 126 | 6.0 | 77 | 4.4 | 63 | 4.1 | 48 | 2.1 |
| 7 | B- | 76 | 3.1 | 123 | 5.8 | 114 | 6.5 | 75 | 4.9 | 67 | 2.9 |
| 7 | C+ | 20 | 0.8 | 20 | 0.9 | 29 | 1.7 | 12 | 0.8 | 19 | 0.8 |
| 7 | C- | 12 | 0.5 | 43 | 2.0 | 51 | 2.9 | 30 | 1.9 | 16 | 0.7 |
| 7 | Total | 2,481 | 100.0 | 2,108 | 100.0 | 1,745 | 100.0 | 1,539 | 100.0 | 2,308 | 100.0 |
| 8 | A | 2,289 | 92.1 | 1,804 | 86.6 | 1,348 | 81.7 | 1,326 | 88.1 | 2,182 | 93.5 |
| 8 | B+ | 108 | 4.3 | 102 | 4.9 | 86 | 5.2 | 59 | 3.9 | 52 | 2.2 |
| 8 | B- | 54 | 2.2 | 101 | 4.8 | 114 | 6.9 | 76 | 5.0 | 46 | 2.0 |
| 8 | C+ | 20 | 0.8 | 26 | 1.2 | 44 | 2.7 | 13 | 0.9 | 30 | 1.3 |
| 8 | C- | 14 | 0.6 | 51 | 2.4 | 57 | 3.5 | 31 | 2.1 | 24 | 1.0 |
| 8 | Total | 2,485 | 100.0 | 2,084 | 100.0 | 1,649 | 100.0 | 1,505 | 100.0 | 2,334 | 100.0 |
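Each percentage in the table is simply the category count divided by the per-column total. A quick sketch reproducing the Grade 3 Gender column:

```python
# DIF category counts for the Grade 3 Gender column of the table above.
counts = {"A": 1576, "B+": 29, "B-": 20, "C+": 8, "C-": 5}

total = sum(counts.values())  # the Total row for this column
percents = {cat: round(100 * n / total, 1) for cat, n in counts.items()}
```

The computed total (1,638) and rounded percentages match the tabled values for that column.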

 

 

Administration Format

| Grade | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|
| Format | Individual; Group | Individual; Group | Individual; Group | Individual; Group | Individual; Group | Individual; Group |

Administration & Scoring Time

| Grade | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|
| Time | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes | 30-60 minutes |

Scoring Format

| Grade | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|
| Format | Automatic | Automatic | Automatic | Automatic | Automatic | Automatic |

Types of Decision Rules

| Grade | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|
| Rules | None | None | None | None | None | None |

Evidence Available for Multiple Decision Rules

| Grade | 3 | 4 | 5 | 6 | 7 | 8 |
|---|---|---|---|---|---|---|
| Evidence | No | No | No | No | No | No |