Istation’s Indicators of Progress (ISIP)

Advanced Reading

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

$5.95 per student

 

Replacement Cost:

$5.95 per student per year.


Annual license renewal fee subject to change.

 

Included in Cost:

ISIP AR is purchased as a yearly subscription. ISIP AR assessment packages include online assessment, data hosting, reporting, teacher resources, an online training center, and user manuals. In-person training conducted by a professional development specialist is available at additional cost ($2800 per specialist per day). Computers and/or tablets are needed to implement this assessment, as well as internet access. ISIP AR can be used on many different technology platforms, including desktops, laptops, and tablets.

 

Technology Requirements:

  • Computer or tablet
  • Internet connection

 

Training Requirements:

  • 1-4 hours of training

 

Qualified Administrators:

  • Paraprofessionals
  • Professionals

 

Accommodations:

Appropriate accommodations are provided during ISIP assessments for students who are receiving support services, including those who have an Individual Education Plan or a 504 Plan, or who qualify as English language learners. These accommodations support students’ access to the content of the assessment by reducing or eliminating the effects of the disability or limitation but do not change the content of the assessment. ISIP assessments provide people with disabilities access that is comparable to access for people without disabilities, with the exception of people who are totally blind or totally deaf. Administrators with manager accounts can assign accommodations to students in the Istation report and Management Portal.

 

ISIP supports or is compatible with the following types of accommodations:

  • Scribe
  • Touch screen overlay
  • ZoomText software
  • Extended time (Untimed Assessment feature)
  • Adjustable volume and/or headphones for students with hearing difficulties

Where to Obtain:

Website: www.istation.com

Address: 8150 North Central Expressway, Suite 2000, Dallas, TX, 75206

Phone number: (866) 883-READ

Email: info@istation.com

 

Access to Technical Support:

By email and phone (M-F 7am-6:30pm, CST).

 

ISIP Advanced Reading (ISIP AR) is a web-based computer adaptive assessment intended for students in Grade 4 through Grade 8. It can be administered simultaneously to an entire classroom in approximately 30 minutes, and no additional scoring time is required. Teachers can be trained on ISIP AR through either a webinar or an in-person training session. Training takes between 1 and 4 hours. All training materials are online and are created by Istation. Reports are available for both individual students and groups of students, showing single-administration results and comparisons of results over time. All reports include student scaled scores and tier levels based on student percentiles.

Assessment Format:

  • Performance measure
  • Direct: Computerized
  • One-to-one

 

Administration Time:

  • 30 minutes per student
  • 30 minutes per group

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

  • Calculated automatically

 

Scores Generated:

  • Raw score
  • Percentile score
  • IRT-based score
  • Age equivalents
  • Lexile score
  • Composite scores
  • Subscale/subtest scores

 

Classification Accuracy

Grade | 4 | 5 | 6 | 7 | 8
Criterion 1 Fall | Half-filled bubble | Empty bubble | Half-filled bubble | Empty bubble | Empty bubble
Criterion 1 Winter | Half-filled bubble | Empty bubble | Half-filled bubble | Empty bubble | Empty bubble
Criterion 1 Spring | Half-filled bubble | Empty bubble | Half-filled bubble | Empty bubble | Empty bubble
Criterion 2 Fall | dash | dash | dash | dash | dash
Criterion 2 Winter | dash | dash | dash | dash | dash
Criterion 2 Spring | dash | dash | dash | dash | dash

Primary Sample

 

Criterion 1, Fall

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | MAP | MAP | MAP
Cut points: Percentile rank on criterion measure | 20th | 20th | 20th | 20th | 20th
Cut points: Performance score (numeric) on criterion measure | 188 | 195 | 201 | 204 | 207
Cut points: Corresponding performance score (numeric) on screener measure | 1689 | 1783 | 1858 | 1989 | 2059
Base rate in the sample for children requiring intensive intervention | 0.11 | 0.09 | 0.11 | 0.42 | 0.55
False Positive Rate | 0.04 | 0.03 | 0.03 | 0.21 | 0.33
False Negative Rate | 0.54 | 0.58 | 0.64 | 0.27 | 0.19
Sensitivity | 0.46 | 0.42 | 0.36 | 0.73 | 0.81
Specificity | 0.96 | 0.97 | 0.97 | 0.79 | 0.67
Positive Predictive Power | 0.72 | 0.73 | 0.81 | 0.69 | 0.66
Negative Predictive Power | 0.89 | 0.90 | 0.82 | 0.82 | 0.82
Overall Classification Rate | 0.87 | 0.88 | 0.82 | 0.76 | 0.73
Area Under the Curve (AUC) | 0.90 | 0.92 | 0.90 | 0.83 | 0.82
AUC 95% Confidence Interval Lower Bound | 0.88 | 0.90 | 0.89 | 0.81 | 0.79
AUC 95% Confidence Interval Upper Bound | 0.91 | 0.93 | 0.92 | 0.85 | 0.85
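As a reading aid for the rows above, the following minimal sketch shows how these statistics are conventionally derived from a 2x2 cross-tabulation of screener decisions against the criterion measure. The class and function names are illustrative, and the counts in the example are hypothetical; they are not taken from the ISIP AR samples, and the source does not describe its exact computation.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Screener decisions cross-tabulated against the criterion measure."""
    true_pos: int   # flagged by the screener, at risk on the criterion
    false_pos: int  # flagged by the screener, not at risk on the criterion
    false_neg: int  # not flagged, at risk on the criterion
    true_neg: int   # not flagged, not at risk on the criterion


def classification_metrics(c: ConfusionCounts) -> dict[str, float]:
    """Return the conventional classification accuracy statistics."""
    total = c.true_pos + c.false_pos + c.false_neg + c.true_neg
    at_risk = c.true_pos + c.false_neg
    not_at_risk = c.false_pos + c.true_neg
    return {
        "base_rate": at_risk / total,
        "false_positive_rate": c.false_pos / not_at_risk,
        "false_negative_rate": c.false_neg / at_risk,
        "sensitivity": c.true_pos / at_risk,
        "specificity": c.true_neg / not_at_risk,
        "positive_predictive_power": c.true_pos / (c.true_pos + c.false_pos),
        "negative_predictive_power": c.true_neg / (c.true_neg + c.false_neg),
        "overall_classification_rate": (c.true_pos + c.true_neg) / total,
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
example = classification_metrics(ConfusionCounts(40, 30, 10, 120))
```

Under these definitions, sensitivity and the false negative rate sum to 1, as do specificity and the false positive rate, which is the pattern visible in the reported rows (up to rounding).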

 

Criterion 1, Winter

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | MAP | MAP | MAP
Cut points: Percentile rank on criterion measure | 20th | 20th | 20th | 20th | 20th
Cut points: Performance score (numeric) on criterion measure | 194 | 200 | 204 | 207 | 209
Cut points: Corresponding performance score (numeric) on screener measure | 1738 | 1812 | 1880 | 2010 | 2082
Base rate in the sample for children requiring intensive intervention | 0.10 | 0.08 | 0.19 | 0.48 | 0.55
False Positive Rate | 0.02 | 0.02 | 0.05 | 0.24 | 0.31
False Negative Rate | 0.60 | 0.62 | 0.57 | 0.27 | 0.22
Sensitivity | 0.40 | 0.38 | 0.43 | 0.73 | 0.78
Specificity | 0.98 | 0.98 | 0.95 | 0.76 | 0.69
Positive Predictive Power | 0.80 | 0.80 | 0.83 | 0.75 | 0.72
Negative Predictive Power | 0.87 | 0.88 | 0.73 | 0.75 | 0.76
Overall Classification Rate | 0.87 | 0.87 | 0.75 | 0.75 | 0.74
Area Under the Curve (AUC) | 0.93 | 0.93 | 0.86 | 0.82 | 0.83
AUC 95% Confidence Interval Lower Bound | 0.91 | 0.92 | 0.84 | 0.80 | 0.80
AUC 95% Confidence Interval Upper Bound | 0.94 | 0.94 | 0.88 | 0.85 | 0.85

 

Criterion 1, Spring

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | MAP | MAP | MAP
Cut points: Percentile rank on criterion measure | 20th | 20th | 20th | 20th | 20th
Cut points: Performance score (numeric) on criterion measure | 196 | 202 | 206 | 208 | 209
Cut points: Corresponding performance score (numeric) on screener measure | 1776 | 1936 | 1897 | 2031 | 2105
Base rate in the sample for children requiring intensive intervention | 0.10 | 0.09 | 0.19 | 0.52 | 0.50
False Positive Rate | 0.03 | 0.03 | 0.05 | 0.25 | 0.27
False Negative Rate | 0.56 | 0.61 | 0.59 | 0.26 | 0.27
Sensitivity | 0.44 | 0.39 | 0.41 | 0.74 | 0.73
Specificity | 0.97 | 0.97 | 0.95 | 0.75 | 0.73
Positive Predictive Power | 0.77 | 0.70 | 0.83 | 0.78 | 0.74
Negative Predictive Power | 0.89 | 0.89 | 0.72 | 0.71 | 0.73
Overall Classification Rate | 0.87 | 0.88 | 0.74 | 0.74 | 0.73
Area Under the Curve (AUC) | 0.92 | 0.89 | 0.86 | 0.81 | 0.79
AUC 95% Confidence Interval Lower Bound | 0.91 | 0.87 | 0.83 | 0.78 | 0.75
AUC 95% Confidence Interval Upper Bound | 0.93 | 0.91 | 0.88 | 0.84 | 0.84
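The screener cut points reported for Criterion 1 in fall, winter, and spring can be applied as a simple lookup. The sketch below copies those values as reported; the function name is hypothetical, and treating a score strictly below the cut point as "flagged" is an assumption, since the source does not say whether a score equal to the cut point is flagged.

```python
# Screener cut scores copied from the Criterion 1 tables (fall, winter, spring),
# keyed by season and then by grade.
CUT_SCORES = {
    "fall":   {4: 1689, 5: 1783, 6: 1858, 7: 1989, 8: 2059},
    "winter": {4: 1738, 5: 1812, 6: 1880, 7: 2010, 8: 2082},
    "spring": {4: 1776, 5: 1936, 6: 1897, 7: 2031, 8: 2105},
}


def needs_intensive_intervention(score: int, grade: int, season: str) -> bool:
    """Flag a student whose ISIP AR score falls below the reported cut point.

    Assumption: "below" is strict; a score equal to the cut point is not flagged.
    """
    return score < CUT_SCORES[season.lower()][grade]


# Example: a Grade 4 student scoring 1650 in the fall is below the cut of 1689.
flagged = needs_intensive_intervention(1650, grade=4, season="fall")
```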

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Disaggregated Data

Criterion 1, Fall

Subgroup: LEP

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | Not Provided | Not Provided | Not Provided
Cut points: Percentile rank on criterion measure | 20th | 20th | Not Provided | Not Provided | Not Provided
Cut points: Performance score (numeric) on criterion measure | 188 | 195 | Not Provided | Not Provided | Not Provided
Cut points: Corresponding performance score (numeric) on screener measure | 1689 | 1783 | Not Provided | Not Provided | Not Provided
Base rate in the sample for children requiring intensive intervention | 0.47 | 0.51 | Not Provided | Not Provided | Not Provided
False Positive Rate | 0.37 | 0.42 | Not Provided | Not Provided | Not Provided
False Negative Rate | 0.15 | 0.05 | Not Provided | Not Provided | Not Provided
Sensitivity | 0.85 | 0.95 | Not Provided | Not Provided | Not Provided
Specificity | 0.63 | 0.58 | Not Provided | Not Provided | Not Provided
Positive Predictive Power | 0.38 | 0.34 | Not Provided | Not Provided | Not Provided
Negative Predictive Power | 0.94 | 0.98 | Not Provided | Not Provided | Not Provided
Overall Classification Rate | 0.68 | 0.65 | Not Provided | Not Provided | Not Provided
Area Under the Curve (AUC) | 0.84 | 0.88 | Not Provided | Not Provided | Not Provided
AUC 95% Confidence Interval Lower Bound | 0.81 | 0.85 | Not Provided | Not Provided | Not Provided
AUC 95% Confidence Interval Upper Bound | 0.88 | 0.91 | Not Provided | Not Provided | Not Provided

 

Criterion 1, Winter

Subgroup: LEP

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | Not Provided | Not Provided | Not Provided
Cut points: Percentile rank on criterion measure | 20th | 20th | Not Provided | Not Provided | Not Provided
Cut points: Performance score (numeric) on criterion measure | 194 | 200 | Not Provided | Not Provided | Not Provided
Cut points: Corresponding performance score (numeric) on screener measure | 1738 | 1812 | Not Provided | Not Provided | Not Provided
Base rate in the sample for children requiring intensive intervention | 0.42 | 0.47 | Not Provided | Not Provided | Not Provided
False Positive Rate | 0.32 | 0.38 | Not Provided | Not Provided | Not Provided
False Negative Rate | 0.08 | 0.08 | Not Provided | Not Provided | Not Provided
Sensitivity | 0.93 | 0.92 | Not Provided | Not Provided | Not Provided
Specificity | 0.68 | 0.62 | Not Provided | Not Provided | Not Provided
Positive Predictive Power | 0.36 | 0.33 | Not Provided | Not Provided | Not Provided
Negative Predictive Power | 0.98 | 0.97 | Not Provided | Not Provided | Not Provided
Overall Classification Rate | 0.72 | 0.67 | Not Provided | Not Provided | Not Provided
Area Under the Curve (AUC) | 0.90 | 0.88 | Not Provided | Not Provided | Not Provided
AUC 95% Confidence Interval Lower Bound | 0.87 | 0.85 | Not Provided | Not Provided | Not Provided
AUC 95% Confidence Interval Upper Bound | 0.93 | 0.91 | Not Provided | Not Provided | Not Provided

 

Criterion 1, Spring

Subgroup: LEP

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | Not Provided | Not Provided | Not Provided
Cut points: Percentile rank on criterion measure | 20th | 20th | Not Provided | Not Provided | Not Provided
Cut points: Performance score (numeric) on criterion measure | 196 | 202 | Not Provided | Not Provided | Not Provided
Cut points: Corresponding performance score (numeric) on screener measure | 1776 | 1936 | Not Provided | Not Provided | Not Provided
Base rate in the sample for children requiring intensive intervention | 0.40 | 0.42 | Not Provided | Not Provided | Not Provided
False Positive Rate | 0.30 | 0.34 | Not Provided | Not Provided | Not Provided
False Negative Rate | 0.06 | 0.12 | Not Provided | Not Provided | Not Provided
Sensitivity | 0.94 | 0.88 | Not Provided | Not Provided | Not Provided
Specificity | 0.70 | 0.66 | Not Provided | Not Provided | Not Provided
Positive Predictive Power | 0.39 | 0.32 | Not Provided | Not Provided | Not Provided
Negative Predictive Power | 0.98 | 0.97 | Not Provided | Not Provided | Not Provided
Overall Classification Rate | 0.74 | 0.69 | Not Provided | Not Provided | Not Provided
Area Under the Curve (AUC) | 0.92 | 0.86 | Not Provided | Not Provided | Not Provided
AUC 95% Confidence Interval Lower Bound | 0.90 | 0.82 | Not Provided | Not Provided | Not Provided
AUC 95% Confidence Interval Upper Bound | 0.94 | 0.89 | Not Provided | Not Provided | Not Provided

 

Cross-Validation Sample

Criterion 1, Fall

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | MAP | MAP | MAP
Cut points: Percentile rank on criterion measure | 20th | 20th | 20th | 20th | 20th
Cut points: Performance score (numeric) on criterion measure | 188 | 195 | 201 | 204 | 207
Cut points: Corresponding performance score (numeric) on screener measure | 1689 | 1783 | 1858 | 1989 | 2059
Base rate in the sample for children requiring intensive intervention | 0.11 | 0.09 | 0.11 | 0.42 | 0.55
False Positive Rate | 0.04 | 0.03 | 0.03 | 0.21 | 0.33
False Negative Rate | 0.54 | 0.58 | 0.64 | 0.27 | 0.19
Sensitivity | 0.46 | 0.42 | 0.36 | 0.73 | 0.81
Specificity | 0.96 | 0.97 | 0.97 | 0.79 | 0.67
Positive Predictive Power | 0.72 | 0.73 | 0.81 | 0.69 | 0.66
Negative Predictive Power | 0.89 | 0.90 | 0.82 | 0.82 | 0.82
Overall Classification Rate | 0.87 | 0.88 | 0.82 | 0.76 | 0.73
Area Under the Curve (AUC) | 0.90 | 0.92 | 0.90 | 0.83 | 0.82
AUC 95% Confidence Interval Lower Bound | 0.88 | 0.90 | 0.89 | 0.81 | 0.79
AUC 95% Confidence Interval Upper Bound | 0.91 | 0.93 | 0.92 | 0.85 | 0.85

 

Criterion 1, Winter

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | MAP | MAP | MAP
Cut points: Percentile rank on criterion measure | 20th | 20th | 20th | 20th | 20th
Cut points: Performance score (numeric) on criterion measure | 194 | 200 | 204 | 207 | 209
Cut points: Corresponding performance score (numeric) on screener measure | 1738 | 1812 | 1880 | 2010 | 2082
Base rate in the sample for children requiring intensive intervention | 0.10 | 0.08 | 0.19 | 0.48 | 0.55
False Positive Rate | 0.02 | 0.02 | 0.05 | 0.24 | 0.31
False Negative Rate | 0.60 | 0.62 | 0.57 | 0.27 | 0.22
Sensitivity | 0.40 | 0.38 | 0.43 | 0.73 | 0.78
Specificity | 0.98 | 0.98 | 0.95 | 0.76 | 0.69
Positive Predictive Power | 0.80 | 0.80 | 0.83 | 0.75 | 0.72
Negative Predictive Power | 0.87 | 0.88 | 0.73 | 0.75 | 0.76
Overall Classification Rate | 0.87 | 0.87 | 0.75 | 0.75 | 0.74
Area Under the Curve (AUC) | 0.93 | 0.93 | 0.86 | 0.82 | 0.83
AUC 95% Confidence Interval Lower Bound | 0.91 | 0.92 | 0.84 | 0.80 | 0.80
AUC 95% Confidence Interval Upper Bound | 0.94 | 0.94 | 0.88 | 0.85 | 0.85

 

Criterion 1, Spring

Grade | 4 | 5 | 6 | 7 | 8
Criterion | MAP | MAP | MAP | MAP | MAP
Cut points: Percentile rank on criterion measure | 20th | 20th | 20th | 20th | 20th
Cut points: Performance score (numeric) on criterion measure | 196 | 202 | 206 | 208 | 209
Cut points: Corresponding performance score (numeric) on screener measure | 1776 | 1936 | 1897 | 2031 | 2105
Base rate in the sample for children requiring intensive intervention | 0.10 | 0.09 | 0.19 | 0.52 | 0.50
False Positive Rate | 0.03 | 0.03 | 0.05 | 0.25 | 0.27
False Negative Rate | 0.56 | 0.61 | 0.59 | 0.26 | 0.27
Sensitivity | 0.44 | 0.39 | 0.41 | 0.74 | 0.73
Specificity | 0.97 | 0.97 | 0.95 | 0.75 | 0.73
Positive Predictive Power | 0.77 | 0.70 | 0.83 | 0.78 | 0.74
Negative Predictive Power | 0.89 | 0.89 | 0.72 | 0.71 | 0.73
Overall Classification Rate | 0.87 | 0.88 | 0.74 | 0.74 | 0.73
Area Under the Curve (AUC) | 0.92 | 0.89 | 0.86 | 0.81 | 0.79
AUC 95% Confidence Interval Lower Bound | 0.91 | 0.87 | 0.83 | 0.78 | 0.75
AUC 95% Confidence Interval Upper Bound | 0.93 | 0.91 | 0.88 | 0.84 | 0.84

 

Reliability

Grade | 4 | 5 | 6 | 7 | 8
Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

1. Justification for each type of reliability reported, given the type and purpose of the tool: Cronbach’s (1951) coefficient alpha is typically used as an indicator of reliability across test items within a testing instance. However, Cronbach’s alpha is not appropriate for any IRT-based measure because alpha assumes that all students in the testing instance respond to a common set of items. By design, students taking a CAT-based assessment, such as ISIP Advanced Reading, receive a custom set of items based on their initial ability estimates and response patterns. Thus, students do not respond to a common set of items.

The IRT analogue to classical internal consistency is marginal reliability (Bock & Mislevy, 1982), and it is therefore the index applied to ISIP Advanced Reading. Marginal reliability is a method of combining the variability in estimating abilities at different points on the ability scale into a single index. Like Cronbach’s alpha, marginal reliability is a unitless measure bounded by 0 and 1, and it can be used alongside Cronbach’s alpha to directly compare the internal consistency of classical test data with that of IRT-based test data. ISIP Advanced Reading has a stopping criterion based on minimizing the standard error of the ability estimate. As such, the lower limit of the marginal reliability of the data for any testing instance of ISIP Advanced Reading will always be approximately 0.90.
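A minimal sketch of one common formulation of marginal reliability follows: the variance of the ability estimates minus the mean squared standard error, divided by the variance of the ability estimates. The source does not give the exact formula Istation used, so this formulation, the function name, and the example values are all assumptions for illustration.

```python
from statistics import fmean, pvariance


def marginal_reliability(thetas: list[float], standard_errors: list[float]) -> float:
    """Marginal reliability under one common formulation (an assumption here):
    (variance of ability estimates - mean error variance) / variance of ability estimates.
    """
    ability_var = pvariance(thetas)                       # spread of ability estimates
    mean_error_var = fmean([se ** 2 for se in standard_errors])  # average squared SE
    return (ability_var - mean_error_var) / ability_var


# Hypothetical ability estimates with variance 1.0 and a uniform standard error of 0.3.
thetas = [-1.5, -0.5, 0.0, 0.5, 1.5]
ses = [0.3] * len(thetas)
reliability = marginal_reliability(thetas, ses)  # about 0.91
```

Under this formulation, a stopping rule that caps every student's standard error also puts a floor on the index. For example, with an ability variance of 1.0, capping the standard error at about 0.316 keeps the result at or above 0.90. The actual ISIP AR stopping threshold is not stated in the source.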

 

2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted: The sample was derived from the total population of students using the ISIP AR assessment throughout the 2014-2015 school year. Sample sizes range from 83,621 to 226,558 students across the United States.

 

3. Description of the analysis procedures for each reported type of reliability: Istation derived the IRT-based reliability by carrying the Classical Test Theory notion of reliability over to Item Response Theory.

 

4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

Type of Reliability | Age or Grade | n | Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
IRT-based reliability | 4 | 215,904 | 0.93 | 0.92 | 0.94
IRT-based reliability | 5 | 203,788 | 0.94 | 0.93 | 0.95
IRT-based reliability | 6 | 107,728 | 0.94 | 0.92 | 0.95
IRT-based reliability | 7 | 92,450 | 0.94 | 0.92 | 0.95
IRT-based reliability | 8 | 83,621 | 0.93 | 0.91 | 0.94

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

Type of Reliability | Subgroup | Age or Grade | n | Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
None

If your manual cites other published studies on reliability, provide these citations: Mathes, P. (2016). Istation’s Indicators of Progress (ISIP) Advanced Reading: Technical Report. Retrieved from https://www.istation.com/Content/downloads/studies/ar_technical_report.pdf

Validity

Grade | 4 | 5 | 6 | 7 | 8
Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble

1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

Predictive validity:

The Kansas Assessment Program (KAP) was developed by the Center for Educational Testing and Evaluation (CETE), a part of the University of Kansas’ Achievement and Assessment Institute. The content of all KAP tests and tools is derived from Kansas’ approved content standards for English language arts, science, mathematics, and social studies. KAP field tests its test questions to ensure appropriate fairness and difficulty.

The Georgia Milestones Assessment System (Georgia Milestones) is a comprehensive summative assessment program that spans from 3rd grade through high school. Georgia Milestones measures how well students have learned the knowledge and skills outlined in the state-adopted content standards in English language arts (ELA), mathematics, science, and social studies.

The Colorado Measures of Academic Success (CMAS) is Colorado’s standards-based assessment. The English Language Arts/Literacy (ELA) is a mandatory state assessment administered at the end of each school year between the months of March and May.

The State of Texas Assessments of Academic Readiness (STAAR) is the testing program for students in Texas public schools. STAAR Reading is the assessment used to determine whether students are successful in meeting the reading standards of their current grade and able to make academic progress from year to year.

Concurrent validity:

Gray Oral Reading Test-4 (GORT-4) is a standardized assessment that helps identify school-age children who are below their peers in oral reading proficiency, accuracy, fluency, and comprehension. It diagnoses specific reading strengths and weaknesses and documents student reading growth as a result of special intervention. It is one of the most widely used measures of oral reading fluency and comprehension in the United States.

The Woodcock–Johnson Tests of Achievement (WJ-III) is a standardized achievement battery first developed in 1977 by Richard Woodcock and Mary E. Bonner Johnson. It is a comprehensive instrument that may be administered to children from age two to the oldest adults (with norms utilizing individuals in their 90s).

Wechsler Individual Achievement Test-II (WIAT-II; Wechsler, 2005) is a standardized test. It assesses the academic achievement of children, adolescents, college students, and adults, aged 4 through 85. The test enables the assessment of a broad range of academic skills or only a particular area of need. The WIAT-II is a revision of the original WIAT (The Psychological Corporation) and includes additional measures. There are four basic scales: reading, math, writing, and oral language.

 

2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Predictive validity:

KAP-ELA: The sample is derived from urban school districts in the state of Kansas. Sample size ranges from n=1,031 to 1,365. Georgia Milestones: The sample is derived from urban school districts in the state of Georgia. Sample size ranges from n=185 to 365. CMAS: The sample is derived from urban school districts in the state of Colorado. Sample size ranges from n=37 to 3,877. STAAR: The sample is derived from urban school districts in the northeast area of the state of Texas. Sample size ranges from n=2,647 to 3,877. The samples include students with varied backgrounds and knowledge across all performance levels.

Concurrent validity:

The GORT-4, WJ-III, and WIAT-II: The sample is derived from two large Texas independent school districts. Sample size ranges from n=86 to 138. The samples include students with varied backgrounds and knowledge across all performance levels.

 

3. Description of the analysis procedures for each reported type of validity:

Predictive validity:

KAP-ELA: The sample is derived from urban school districts in the state of Kansas. Sample size ranges from n=1,031 to 1,365. The data were collected in the 2016-17 school year. Georgia Milestones: The sample is derived from urban school districts in the state of Georgia. Sample size ranges from n=185 to 365. The data were collected in the 2015-16 school year. CMAS: The sample is derived from urban school districts in the state of Colorado. Sample size ranges from n=37 to 3,877. The data were collected in the 2016-17 school year. STAAR: The sample is derived from urban school districts in the northeast area of the state of Texas. Sample size ranges from n=2,647 to 3,877. The data were collected in the 2015-16 school year.

Concurrent validity:

The GORT-4, WJ-III, and WIAT-II: The sample is derived from two large Texas independent school districts. Sample size ranges from n=86 to 138. Data collection occurred in the 2010-11 school year.

 

4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Validity | Age or Grade | Test or Criterion | n | Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
Predictive Validity | 4 | KAP-ELA | 1,265 | 0.74 | 0.71 | 0.76
Predictive Validity | 5 | KAP-ELA | 1,170 | 0.77 | 0.75 | 0.79
Predictive Validity | 6 | KAP-ELA | 1,031 | 0.75 | 0.72 | 0.78
Predictive Validity | 4 | Georgia Milestones | 365 | 0.80 | 0.76 | 0.83
Predictive Validity | 5 | Georgia Milestones | 185 | 0.70 | 0.62 | 0.77
Predictive Validity | 6 | Georgia Milestones | 348 | 0.78 | 0.74 | 0.82
Predictive Validity | 4 | CMAS | 3,877 | 0.82 | 0.81 | 0.83
Predictive Validity | 5 | CMAS | 3,646 | 0.82 | 0.81 | 0.83
Predictive Validity | 6 | CMAS | 172 | 0.78 | 0.71 | 0.83
Predictive Validity | 7 | CMAS | 37 | 0.78 | 0.61 | 0.88
Predictive Validity | 8 | CMAS | 43 | 0.76 | 0.60 | 0.86
Predictive Validity | 4 | STAAR | 3,783 | 0.74 | 0.73 | 0.75
Predictive Validity | 5 | STAAR | 3,877 | 0.72 | 0.70 | 0.73
Predictive Validity | 6 | STAAR | 3,519 | 0.73 | 0.71 | 0.75
Predictive Validity | 7 | STAAR | 2,973 | 0.71 | 0.69 | 0.73
Predictive Validity | 8 | STAAR | 2,647 | 0.72 | 0.70 | 0.74
Concurrent Validity | 4 | GORT-4 | 115 | 0.71 | 0.70 | 0.72
Concurrent Validity | 5 | WJ-III | 123 | 0.76 | 0.75 | 0.77
Concurrent Validity | 6 | WIAT-II | 138 | 0.75 | 0.74 | 0.76
Concurrent Validity | 7 | GORT-4 | 106 | 0.66 | 0.65 | 0.67
Concurrent Validity | 8 | WIAT-II | 86 | 0.70 | 0.68 | 0.71
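The source does not state how the confidence intervals for the validity coefficients were computed. One standard approach for a correlation coefficient is the Fisher z-transformation, sketched below as an assumption; the function name is illustrative. The interval width depends heavily on n, which is why small samples such as the Grade 7 CMAS group (n=37) have much wider intervals than the large STAAR samples.

```python
import math
from statistics import NormalDist


def correlation_ci(r: float, n: int, level: float = 0.95) -> tuple[float, float]:
    """Confidence interval for a correlation via the Fisher z-transformation."""
    z = math.atanh(r)                       # transform r to the z scale
    se = 1.0 / math.sqrt(n - 3)             # standard error on the z scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


# Grade 4 KAP-ELA row: r = 0.74, n = 1,265.
low, high = correlation_ci(0.74, 1265)
print(round(low, 2), round(high, 2))  # 0.71 0.76
```

Under this assumption, the sketch reproduces the reported bounds for the Grade 4 KAP-ELA row (0.71 to 0.76) and the Grade 7 CMAS row (0.61 to 0.88). It is not a statement of the method actually used for every row.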

 

5. Results for other forms of validity (e.g. factor analysis) not conducive to the table format: Not Provided

6. Describe the degree to which the provided data support the validity of the tool:

Predictive validity: State tests are used as the criterion measures for our predictive validity.

The Kansas Assessment Program (KAP) was developed by the Center for Educational Testing and Evaluation (CETE), a part of the University of Kansas’ Achievement and Assessment Institute. It is a state test for students in Kansas. The content of all KAP tests and tools is derived from Kansas’ approved content standards for English language arts, science, mathematics, and social studies. KAP field tests its test questions to ensure appropriate fairness and difficulty. In this study, Kansas Assessment Program: English Language Arts (KAP-ELA) is used as a separate criterion measure to provide further student performance information for the sample district.

The Georgia Milestones Assessment System (Georgia Milestones) is a comprehensive summative assessment program that spans from 3rd grade through high school. It is a state test for students in Georgia. Georgia Milestones measures how well students have learned the knowledge and skills outlined in the state-adopted content standards in English language arts (ELA), mathematics, science, and social studies.

The Colorado Measures of Academic Success (CMAS) is Colorado’s standards-based assessment. It is a state test for students in Colorado. The English Language Arts/Literacy (ELA) is a mandatory state assessment administered at the end of each school year between the months of March and May. CMAS was used as a criterion assessment, or as a benchmark, to support the inferences made from ISIP AR for Grades 4 through 8.

The State of Texas Assessments of Academic Readiness (STAAR) is the current state-sponsored testing program in Texas. STAAR Reading is a mandatory state assessment administered at the end of each school year between the months of March and May to students in Grades 3–8. STAAR was also used as a criterion assessment, or as a benchmark, to support the inferences made from ISIP AR for Grades 4 through 8. The results of this study suggest very strong relationships between ISIP AR and STAAR Reading.

Concurrent validity: Standardized tests are used as the criterion measures for our concurrent validity.

Gray Oral Reading Test-4 (GORT-4) is a standardized assessment that helps identify school-age children who are below their peers in oral reading proficiency, accuracy, fluency, and comprehension. It diagnoses specific reading strengths and weaknesses and documents student reading growth as a result of special intervention. It is one of the most widely used measures of oral reading fluency and comprehension in the United States.

The Woodcock–Johnson Tests of Achievement (WJ-III) is an achievement battery first developed in 1977 by Richard Woodcock and Mary E. Bonner Johnson. It is a comprehensive instrument that may be administered to children from age two to the oldest adults (with norms utilizing individuals in their 90s).

Wechsler Individual Achievement Test-II (WIAT-II; Wechsler, 2005) is a standardized test. It assesses the academic achievement of children, adolescents, college students, and adults, aged 4 through 85. The test enables the assessment of a broad range of academic skills or only a particular area of need. The WIAT-II is a revision of the original WIAT (The Psychological Corporation), with additional measures. There are four basic scales: reading, math, writing, and oral language.

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | 95% Confidence Interval: Lower Bound | 95% Confidence Interval: Upper Bound
Predictive Validity | ELL | 4 | CMAS | 1,523 | 0.75 | 0.73 | 0.77
Predictive Validity | ELL | 5 | CMAS | 1,581 | 0.74 | 0.72 | 0.76
Predictive Validity | ELL | 6 | CMAS | 42 | 0.57 | 0.32 | 0.74
Predictive Validity | ELL | 7 | CMAS | 15 | 0.66 | 0.22 | 0.88
Predictive Validity | ELL | 8 | CMAS | 15 | 0.75 | 0.39 | 0.91

 

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format: Not Provided  

 

If your manual cites other published validity studies, provide these citations: Mathes, P. (2016). Istation’s Indicators of Progress (ISIP) Advanced Reading: Technical Report. Retrieved from https://www.istation.com/Content/downloads/studies/ar_technical_report.pdf

Sample Representativeness

Grade | 4 | 5 | 6 | 7 | 8
Data | Local with Cross-Validation | Local with Cross-Validation | Local with Cross-Validation | Local with Cross-Validation | Local with Cross-Validation
  • Primary Classification Accuracy Sample

    Criterion 1, Fall

    Grade | 4 | 5 | 6 | 7 | 8
    Criterion | MAP | MAP | MAP | MAP | MAP
    National/Local Representation | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas
    Date | October, 2015 | October, 2015 | October, 2015 | October, 2015 | October, 2015
    Sample Size | 3,053 | 2,886 | 920 | 1,454 | 1,047
    Male | 53% | 52% | 51% | 52% | 51%
    Female | 47% | 48% | 49% | 48% | 49%
    Gender Unknown | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Free or Reduced-price Lunch Eligible | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    White, Non-Hispanic | 20% | 19% | 21% | 20% | 19%
    Black, Non-Hispanic | 18% | 19% | 17% | 17% | 17%
    Hispanic | 52% | 52% | 52% | 52% | 50%
    American Indian/Alaska Native | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Other | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Race/Ethnicity Unknown | 10% | 10% | 10% | 11% | 14%
    Disability Classification | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    First Language | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Language Proficiency Status | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided

     

    Criterion 1, Winter

    Grade | 4 | 5 | 6 | 7 | 8
    Criterion | MAP | MAP | MAP | MAP | MAP
    National/Local Representation | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas
    Date | January, 2016 | January, 2016 | January, 2016 | January, 2016 | January, 2016
    Sample Size | 3,245 | 3,103 | 1,623 | 1,421 | 974
    Male | 53% | 52% | 51% | 52% | 51%
    Female | 47% | 48% | 49% | 48% | 49%
    Gender Unknown | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Free or Reduced-price Lunch Eligible | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    White, Non-Hispanic | 21% | 20% | 21% | 20% | 19%
    Black, Non-Hispanic | 18% | 19% | 17% | 17% | 17%
    Hispanic | 52% | 52% | 52% | 52% | 50%
    American Indian/Alaska Native | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Other | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Race/Ethnicity Unknown | 9% | 9% | 10% | 11% | 14%
    Disability Classification | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    First Language | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Language Proficiency Status | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided

     

    Criterion 1, Spring

    Grade | 4 | 5 | 6 | 7 | 8
    Criterion | MAP | MAP | MAP | MAP | MAP
    National/Local Representation | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas
    Date | June 2016 | June 2016 | June 2016 | June 2016 | June 2016
    Sample Size | 3,390 | 3,101 | 1,083 | 736 | 449
    Male | 52% | 52% | 51% | 52% | 51%
    Female | 47% | 48% | 49% | 48% | 49%
    Gender Unknown | 1% | 0% | 0% | 0% | 0%
    Free or Reduced-price Lunch Eligible | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    White, Non-Hispanic | 21% | 20% | 21% | 20% | 19%
    Black, Non-Hispanic | 18% | 19% | 17% | 17% | 17%
    Hispanic | 52% | 52% | 52% | 52% | 50%
    American Indian/Alaska Native | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Other | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Race/Ethnicity Unknown | 9% | 9% | 10% | 11% | 14%
    Disability Classification | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    First Language | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Language Proficiency Status | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided

     

    Cross Validation Sample

    Criterion 1, Fall

    Grade | 4 | 5 | 6 | 7 | 8
    Criterion | MAP | MAP | MAP | MAP | MAP
    National/Local Representation | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas
    Date | October 2015 | October 2015 | October 2015 | October 2015 | October 2015
    Sample Size | 3,053 | 2,886 | 920 | 1,454 | 1,047
    Male | 53% | 52% | 51% | 52% | 51%
    Female | 47% | 48% | 49% | 48% | 49%
    Gender Unknown | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Free or Reduced-price Lunch Eligible | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    White, Non-Hispanic | 20% | 19% | 21% | 20% | 19%
    Black, Non-Hispanic | 18% | 19% | 17% | 17% | 17%
    Hispanic | 52% | 52% | 52% | 52% | 50%
    American Indian/Alaska Native | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Other | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Race/Ethnicity Unknown | 10% | 10% | 10% | 11% | 14%
    Disability Classification | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    First Language | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Language Proficiency Status | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided

     

    Criterion 1, Winter

    Grade | 4 | 5 | 6 | 7 | 8
    Criterion | MAP | MAP | MAP | MAP | MAP
    National/Local Representation | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas
    Date | January 2016 | January 2016 | January 2016 | January 2016 | January 2016
    Sample Size | 3,245 | 3,103 | 1,623 | 1,421 | 974
    Male | 53% | 52% | 51% | 52% | 51%
    Female | 47% | 48% | 49% | 48% | 49%
    Gender Unknown | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Free or Reduced-price Lunch Eligible | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    White, Non-Hispanic | 21% | 20% | 21% | 20% | 19%
    Black, Non-Hispanic | 18% | 19% | 17% | 17% | 17%
    Hispanic | 52% | 52% | 52% | 52% | 50%
    American Indian/Alaska Native | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Other | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Race/Ethnicity Unknown | 9% | 9% | 10% | 11% | 14%
    Disability Classification | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    First Language | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Language Proficiency Status | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided

     

    Criterion 1, Spring

    Grade | 4 | 5 | 6 | 7 | 8
    Criterion | MAP | MAP | MAP | MAP | MAP
    National/Local Representation | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas | Urban school districts in the state of Texas
    Date | June 2016 | June 2016 | June 2016 | June 2016 | June 2016
    Sample Size | 3,390 | 3,101 | 1,083 | 736 | 449
    Male | 52% | 52% | 51% | 52% | 51%
    Female | 47% | 48% | 49% | 48% | 49%
    Gender Unknown | 1% | 0% | 0% | 0% | 0%
    Free or Reduced-price Lunch Eligible | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    White, Non-Hispanic | 21% | 20% | 21% | 20% | 19%
    Black, Non-Hispanic | 18% | 19% | 17% | 17% | 17%
    Hispanic | 52% | 52% | 52% | 52% | 50%
    American Indian/Alaska Native | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Other | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Race/Ethnicity Unknown | 9% | 9% | 10% | 11% | 14%
    Disability Classification | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    First Language | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
    Language Proficiency Status | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided

     

    Bias Analysis Conducted

    Grade | 4 | 5 | 6 | 7 | 8
    Rating | Yes | Yes | Yes | Yes | Yes

    1. Description of the method used to determine the presence or absence of bias: Differential Item Functioning (DIF) analysis was conducted separately for each grade level (4-8) using the logistic regression DIF detection method in the difR package for R.

     

    2. Description of the subgroups for which bias analyses were conducted: Four DIF factors were investigated: socioeconomic status, gender, race/ethnicity, and special education status.

     

    3. Description of the results of the bias analyses conducted, including data and interpretative statements: Under the Zumbo & Thomas (ZT) DIF criterion, 97% of items were classified as A items (negligible or non-significant DIF effect), 2% as B items (slight to moderate DIF effect), and only 1% as C items (moderate to large DIF effect) across grade levels.
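
    The logistic regression DIF procedure and the A/B/C labels described above can be illustrated with a minimal sketch. This is not the vendor's analysis and not difR's code: it is a pure-Python illustration on simulated data, and every function name in it is hypothetical. It assumes the common formulation in which an item response is regressed on a matching variable, and a model that adds group and group-by-matching terms is compared with the matching-only model. It also assumes the commonly cited Zumbo & Thomas cut points of 0.13 and 0.26 on the change in Nagelkerke R-squared. How a value exactly on a cut point is labeled is an assumption here, and the source does not say whether uniform and non-uniform DIF were tested jointly as this sketch does.

```python
import math
import random

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_loglik(X, y, iters=15):
    """Fit a logistic regression by Newton-Raphson; return the log-likelihood."""
    k = len(X[0])
    beta = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for xi, yi in zip(X, y):
            eta = sum(b * v for b, v in zip(beta, xi))
            p = 1.0 / (1.0 + math.exp(-eta))
            for a in range(k):
                grad[a] += (yi - p) * xi[a]
                for c in range(k):
                    hess[a][c] += p * (1.0 - p) * xi[a] * xi[c]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = sum(b * v for b, v in zip(beta, xi))
        ll += yi * eta - math.log1p(math.exp(eta))
    return ll

def nagelkerke(ll_model, ll_null, n):
    """Nagelkerke R-squared: Cox-Snell R-squared divided by its maximum."""
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    return cox_snell / (1.0 - math.exp(2.0 * ll_null / n))

def dif_test(score, group, y):
    """Return (likelihood-ratio statistic, change in Nagelkerke R-squared)."""
    n = len(y)
    ll_null = fit_loglik([[1.0] for _ in y], y)
    ll_base = fit_loglik([[1.0, s] for s in score], y)
    ll_full = fit_loglik([[1.0, s, g, s * g] for s, g in zip(score, group)], y)
    lr = 2.0 * (ll_full - ll_base)
    delta = nagelkerke(ll_full, ll_null, n) - nagelkerke(ll_base, ll_null, n)
    return lr, delta

def zt_class(delta_r2):
    """Label an effect size with the assumed ZT cut points (0.13, 0.26)."""
    if delta_r2 < 0.13:
        return "A"
    if delta_r2 < 0.26:
        return "B"
    return "C"

def simulate(n, group_shift, seed):
    """Simulate one item; group_shift is the DIF size in logits (made-up data)."""
    rng = random.Random(seed)
    score, group, y = [], [], []
    for _ in range(n):
        s = rng.gauss(0.0, 1.0)          # stand-in for the matching test score
        g = float(rng.randint(0, 1))     # 0 = reference group, 1 = focal group
        p = 1.0 / (1.0 + math.exp(-(0.2 + s + group_shift * g)))
        score.append(s)
        group.append(g)
        y.append(1.0 if rng.random() < p else 0.0)
    return score, group, y

fair = dif_test(*simulate(800, 0.0, seed=1))     # item with no simulated DIF
biased = dif_test(*simulate(800, 1.5, seed=2))   # item with simulated uniform DIF
print("no DIF:  ", fair, zt_class(fair[1]))
print("with DIF:", biased, zt_class(biased[1]))
```

    In an operational analysis the matching variable would typically be the observed test score rather than a simulated ability value, and the likelihood-ratio statistic would be compared against a chi-square reference distribution before the effect-size label is read.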

     

    Administration Format

    Grade | 4 | 5 | 6 | 7 | 8
    Data | Individual, Group | Individual, Group | Individual, Group | Individual, Group | Individual, Group

    Administration & Scoring Time

    Grade | 4 | 5 | 6 | 7 | 8
    Data | 30 minutes | 30 minutes | 30 minutes | 30 minutes | 30 minutes

    Scoring Format

    Grade | 4 | 5 | 6 | 7 | 8
    Data | Automatic | Automatic | Automatic | Automatic | Automatic

    Types of Decision Rules

    Grade | 4 | 5 | 6 | 7 | 8
    Data | None | None | None | None | None

    Evidence Available for Multiple Decision Rules

    Grade | 4 | 5 | 6 | 7 | 8
    Data | No | No | No | No | No