Istation Indicators of Progress (ISIP)

Advanced Reading

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

$5.95 per student per year.

 

Replacement Cost:

$5.95 per student per year.

Annual license renewal fee subject to change.

 

Included in Cost:

The ISIP AR assessment package includes online assessment, data hosting, reporting, teacher resources, an online training center, and user manuals.

Technology Requirements:

  • Computer or tablet
  • Internet connection

 

Training Requirements:

  • 1 – 4 hours of training

 

Qualified Administrators:

  • Paraprofessionals
  • Professionals

 

Accommodations:

No information provided; contact vendor for details.

Where to Obtain:

Website:

www.istation.com

Address:

Istation, 8150 North Central Expressway, Suite 2000, Dallas, TX, 75206

Phone:
866-883-READ (7323)

Email:
info@istation.com


Access to Technical Support:

By email and phone (M-F 7am-6:30pm, CST).

ISIP Advanced Reading (ISIP AR) is a computer adaptive assessment of reading ability that automatically adjusts the difficulty of the items it delivers, limiting the frustration or boredom often associated with traditional assessments. ISIP AR includes comprehensive reporting for teachers and parents, as well as downloadable teacher-directed lessons and resources for differentiated instruction. ISIP AR is intended for students in grades 4-8 and can be administered simultaneously to an entire classroom in approximately 30 minutes.

Assessment Format:

  • Individual
  • Group
  • Computer-administered

 

Administration Time:

  • 30 minutes per student
  • 30 minutes per group

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

  • Calculated automatically

 

Scores Generated:

  • Percentile Score
  • Raw Score
  • IRT-Based Score
  • Lexile Score
  • Composite Scores

 

Reliability

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

Justify the appropriateness of each type of reliability reported:

Cronbach’s (1951) coefficient alpha is typically used as an indicator of reliability across test items within a testing instance. However, Cronbach’s alpha is not appropriate for an adaptive IRT-based measure because alpha assumes that all students in the testing instance respond to a common set of items. By design, a CAT-based assessment such as ISIP Advanced Reading delivers a custom set of items to each student, based on the student’s initial ability estimate and response pattern. Thus, students do not respond to a common set of items.

The IRT analogue to classical internal consistency is marginal reliability (Bock & Mislevy, 1982), which is therefore applied to ISIP Advanced Reading. Marginal reliability combines the variability in estimating abilities at different points on the ability scale into a single index. Like Cronbach’s alpha, marginal reliability is a unitless measure bounded by 0 and 1, so the two can be used to directly compare the internal consistency of classical test data with that of IRT-based test data. ISIP Advanced Reading has a stopping criterion based on minimizing the standard error of the ability estimate. As such, the lower limit of the marginal reliability of the data for any testing instance of ISIP Advanced Reading will always be approximately 0.90.
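As a rough illustration of how a marginal reliability index can be computed, the sketch below uses one common formulation: the ability variance minus the mean squared standard error, divided by the ability variance. The function name and the choice to default to the sample variance of the ability estimates are assumptions for illustration; the source does not give Istation's exact formula or its standard-error threshold.

```python
import numpy as np

def marginal_reliability(theta_hat, se, ability_variance=None):
    """Marginal reliability: (ability variance - mean error variance) / ability variance.

    theta_hat: ability estimates, one per student.
    se: standard error of each ability estimate.
    ability_variance: variance of the ability scale; if omitted, the sample
        variance of theta_hat is used (an assumption made for this sketch).
    """
    theta_hat = np.asarray(theta_hat, dtype=float)
    se = np.asarray(se, dtype=float)
    if ability_variance is None:
        ability_variance = theta_hat.var(ddof=1)
    mean_error_variance = np.mean(se ** 2)
    return (ability_variance - mean_error_variance) / ability_variance

# Hypothetical example: unit-variance scale, every test stopped at SE = 0.3.
example = marginal_reliability([-1.0, 0.0, 1.0], [0.3, 0.3, 0.3], ability_variance=1.0)
```

Under these assumptions, a stopping rule that caps every student's standard error also caps the mean error variance, which is why a CAT can guarantee a floor on marginal reliability. On a unit-variance scale, for instance, a cap near 0.316 would correspond to a floor of about 0.90.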

 

Describe the sample characteristics for each reliability analysis conducted:

The sample was derived from the total population of students using the ISIP assessment during the 2014-2015 school year. Sample sizes range from 83,621 to 226,558 students across the United States.

 

Describe the analysis procedures for each reported type of reliability:

Istation derived the IRT-based reliability by carrying the Classical Test Theory notion of reliability over to Item Response Theory.

| Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval |
| --- | --- | --- | --- | --- |
| IRT-based reliability | Grade 4 | 215,904 | 0.93 | 0.929-0.931 |
| IRT-based reliability | Grade 5 | 203,788 | 0.94 | 0.939-0.941 |
| IRT-based reliability | Grade 6 | 107,728 | 0.94 | 0.939-0.941 |
| IRT-based reliability | Grade 7 | 92,450 | 0.94 | 0.939-0.941 |
| IRT-based reliability | Grade 8 | 83,621 | 0.93 | 0.929-0.931 |

 

Validity

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | Half-filled bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble | Half-filled bubble |

Describe and justify the criterion measures used to demonstrate validity:

The State of Texas Assessments of Academic Readiness (STAAR) is the testing program for students in Texas public schools. STAAR Reading is the assessment used to determine whether students are successful in meeting the reading standards of their current grade and able to make academic progress from year to year. ISIP AR was developed to measure the skills that are most predictive of students’ reading success. Since STAAR Reading is a measure of reading ability and determines students’ grade level success, it is important to understand the predictive validity of ISIP AR, used as a screener, when compared to STAAR Reading.

 

Describe the sample characteristics for each validity analysis conducted:

The sample was derived from urban school districts in the northeast area of the state of Texas. Sample sizes range from 2,647 to 3,877 students.

 

Describe the analysis procedures for each reported type of validity:

The predictive validity study was conducted to determine how well ISIP measures predict students' performance on other reading tests. The data were collected from one district in the State of Texas in the 2012-2013 school year. Each student had both ISIP reading ability scores and STAAR scores. Pearson product-moment correlation analysis, multiple linear regression, and multiple logistic regression models were applied to each grade’s data using SPSS software.

 

| Type of Validity | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval |
| --- | --- | --- | --- | --- | --- |
| Predictive Validity | Grade 4 | State of Texas Assessments of Academic Readiness (STAAR) | 3,783 | 0.74 | 0.725-0.754 |
| Predictive Validity | Grade 5 | STAAR | 3,877 | 0.72 | 0.704-0.735 |
| Predictive Validity | Grade 6 | STAAR | 3,519 | 0.73 | 0.714-0.745 |
| Predictive Validity | Grade 7 | STAAR | 2,973 | 0.71 | 0.692-0.727 |
| Predictive Validity | Grade 8 | STAAR | 2,647 | 0.72 | 0.701-0.738 |
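The source does not state how the confidence intervals for the correlations above were obtained. One standard approach is the Fisher z-transformation, sketched below as an assumption; the function name `fisher_ci` is hypothetical. For the Grade 4 values (r = 0.74, n = 3,783) it gives an interval of roughly 0.725 to 0.754.

```python
import math

def fisher_ci(r: float, n: int, z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate confidence interval for a Pearson correlation.

    Uses the Fisher z-transformation: atanh(r) is treated as approximately
    normal with standard error 1 / sqrt(n - 3). z_crit defaults to the
    two-sided 95% normal critical value.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# Grade 4 row of the predictive validity table.
low, high = fisher_ci(0.74, 3783)
```

With samples of several thousand students, the interval is narrow, so the grade-to-grade differences in the coefficients (0.71 to 0.74) are small relative to the size of the correlations themselves.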

 

Describe the degree to which the provided data support the validity of the tool:

The results of this study suggest strong relationships between ISIP AR and STAAR Reading, with correlations of 0.71 to 0.74 across grades 4-8. The findings also add to the evidence that ISIP Reading measures are predictive of STAAR Reading across grades. The ISIP tests can be used to predict how a student will score on STAAR.

Bias Analysis Conducted

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | Yes | Yes | Yes | Yes | Yes |

Have additional analyses been conducted to establish whether the tool is or is not biased against demographic subgroups (e.g., students who vary by race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?

Bias Analysis Method:

Differential Item Functioning (DIF) analysis was conducted by grade level (4-8) using logistic regression DIF detection analysis with the difR package in R software.

 

Subgroups Included:

socioeconomic status, gender, race/ethnicity, and special education

 

Bias Analysis Results:

Using Zumbo & Thomas’ (ZT) DIF criterion, 97% of items were classified as A items (negligible or non-significant DIF effect), 2% as B items (slight to moderate DIF effect), and only 1% as C items (moderate to large DIF effect).
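The A/B/C labels can be illustrated with a small classifier. In logistic regression DIF, the effect size is usually the change in Nagelkerke R-squared between the model without and the model with the group terms, and the commonly cited Zumbo-Thomas cutoffs are 0.13 and 0.26. Those cutoffs, the treatment of values exactly on a boundary, and the function name are assumptions here; the source does not state them.

```python
def classify_zt(delta_r2: float) -> str:
    """Classify a DIF effect size with the commonly cited Zumbo-Thomas cutoffs.

    delta_r2: change in Nagelkerke R-squared when group (and group-by-score)
        terms are added to the logistic regression for an item.
    Returns "A" (negligible), "B" (moderate), or "C" (large).
    """
    if delta_r2 < 0:
        raise ValueError("delta_r2 must be non-negative")
    if delta_r2 <= 0.13:   # assumed cutoff, not stated in the source
        return "A"
    if delta_r2 <= 0.26:   # assumed cutoff, not stated in the source
        return "B"
    return "C"

# Hypothetical effect sizes for three items.
labels = [classify_zt(x) for x in (0.01, 0.20, 0.40)]
```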

Sensitivity: Reliability of the Slope

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

Describe the sample used for analyses, including size and characteristics:

Only ISIP AR Tier 3 students (those performing seriously below grade level and in need of intensive intervention) were included. Sample sizes ranged from 2,280 to 4,242 students across the United States.

 

Describe the frequency of measurement:

The data were collected monthly from August 2016 (Period 0) to May 2017 (Period 9). Each student had all 10 data points.

 

Describe reliability of the slope analyses conducted with a population of students in need of intensive intervention:

A structural equation modeling (SEM) framework was applied to estimate the reliability of the slope, using a growth model with two parallel growth processes. Specifically, two linear growth models were estimated simultaneously. The two parallel growth processes were established by splitting the available time segments into two groups: one group (Periods 0, 2, 4, 6, and 8) formed one linear growth process, and the other group (Periods 1, 3, 5, 7, and 9) formed the second. For each linear growth process, the individual slopes of growth were estimated as factor scores of the latent slope factor. The correlation between the individual slopes from the two parallel growth processes was then computed as an estimate of the reliability of the growth slope. Mplus software was used. Because each process represented only half of the available time points, the Spearman-Brown formula was used to correct the correlation coefficient.
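The split-half logic described above can be approximated in a few lines. The sketch below is not the SEM analysis run in Mplus: it swaps in ordinary least-squares slopes per student for the latent-slope factor scores, then applies the Spearman-Brown correction. The function names and the input layout (one row per student, one column per period) are assumptions.

```python
import numpy as np

def spearman_brown(r_half: float) -> float:
    """Step a half-length correlation up to full length: 2r / (1 + r)."""
    return 2.0 * r_half / (1.0 + r_half)

def split_half_slope_reliability(scores: np.ndarray) -> float:
    """Estimate slope reliability from an (n_students, n_periods) score array.

    Columns are Periods 0..n_periods-1. Even and odd periods form the two
    halves; each student's slope in each half is an OLS slope (a stand-in
    for the SEM factor scores described in the text).
    """
    periods = np.arange(scores.shape[1])
    even, odd = periods[::2], periods[1::2]
    # np.polyfit fits every column of y at once, so pass periods x students.
    slope_even = np.polyfit(even, scores[:, even].T, 1)[0]
    slope_odd = np.polyfit(odd, scores[:, odd].T, 1)[0]
    r_half = float(np.corrcoef(slope_even, slope_odd)[0, 1])
    return spearman_brown(r_half)

# Hypothetical noise-free data: three students with different linear growth.
true_slopes = np.array([1.0, 2.0, 3.0])
perfect_scores = true_slopes[:, None] * np.arange(10)[None, :]
perfect_reliability = split_half_slope_reliability(perfect_scores)
```

With noise-free linear growth the two half-slopes agree exactly and the estimate is 1; measurement noise pulls the half-slope correlation, and therefore the corrected coefficient, below 1.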

 

| Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval |
| --- | --- | --- | --- | --- |
| Split-half Growth Model Reliability | Grade 4 | 4,242 | 0.93 | 0.926-0.934 |
| Split-half Growth Model Reliability | Grade 5 | 3,467 | 0.92 | 0.915-0.925 |
| Split-half Growth Model Reliability | Grade 6 | 3,872 | 0.87 | 0.862-0.877 |
| Split-half Growth Model Reliability | Grade 7 | 2,658 | 0.85 | 0.839-0.860 |
| Split-half Growth Model Reliability | Grade 8 | 2,280 | 0.79 | 0.774-0.805 |

 

Sensitivity: Validity of the Slope

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | dash | dash | dash | dash | dash |

Describe and justify the criterion measures used to demonstrate validity:

No qualifying evidence provided.

 

Describe the sample used for analyses, including size and characteristics:

No qualifying evidence provided.

 

Describe predictive validity of the slope of improvement analyses conducted with a population of students in need of intensive intervention:

No qualifying evidence provided.

 

Describe the degree to which the provided data support the validity of the tool:

No qualifying evidence provided.

Alternate Forms

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | Full bubble | Full bubble | Full bubble | Full bubble | Full bubble |

Describe the sample for these analyses, including size and characteristics:

No qualifying evidence provided.

 

Evidence that alternate forms are of equal and controlled difficulty or, if IRT based, evidence of item or ability invariance:

All items were calibrated with a unidimensional two-parameter logistic (2PL) IRT model. Item difficulties ranged from -3.0 to 3.0 and item discriminations ranged from 0.2 to 2.5; items that did not meet these criteria were removed. Because the ISIP assessment is computer adaptive, test forms are built at the item level with each student response. The CAT system assigns an initial ability estimate to a student based on grade in order to deliver the first item. After each response, the system selects the item that best fits the student’s updated ability estimate, using both item discrimination and item difficulty under the unidimensional 2PL model. This process continues until one of the stopping criteria is met: the maximum number of items per subtest is reached, the standard error of the student's ability estimate drops below a preset threshold, or 4 consecutive items have each reduced the standard error by less than a preset amount. The assessment then stops, and a student ability score is reported for each subtest along with an overall reading composite score.
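The adaptive loop described above can be sketched as follows. This is an illustration under stated assumptions, not Istation's implementation: "best fit" is taken to mean maximum Fisher information, ability is estimated by expected a posteriori (EAP) scoring with a standard normal prior, and the thresholds (`max_items`, `se_target`, `min_gain`) are hypothetical placeholders for the preset values the source does not disclose.

```python
import numpy as np

GRID = np.linspace(-4.0, 4.0, 161)  # ability grid for numerical integration

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of 2PL items at ability theta."""
    p = p_correct(theta, a, b)
    return a ** 2 * p * (1.0 - p)

def eap_estimate(responses, a, b):
    """Posterior mean and SD of ability under an assumed N(0, 1) prior."""
    log_post = -0.5 * GRID ** 2
    for u, a_i, b_i in zip(responses, a, b):
        p = p_correct(GRID, a_i, b_i)
        log_post = log_post + np.log(p if u else 1.0 - p)
    post = np.exp(log_post - log_post.max())
    post /= post.sum()
    theta = float(np.sum(GRID * post))
    se = float(np.sqrt(np.sum((GRID - theta) ** 2 * post)))
    return theta, se

def run_cat(answer_fn, a, b, start_theta,
            max_items=30, se_target=0.3, min_gain=0.01, patience=4):
    """Deliver items until one of the three stopping rules fires."""
    remaining = list(range(len(a)))
    used, responses = [], []
    theta, se, small_gains = start_theta, np.inf, 0
    while remaining and len(used) < max_items:           # rule 1: item cap
        info = item_information(theta, a[remaining], b[remaining])
        item = remaining.pop(int(np.argmax(info)))       # most informative item
        used.append(item)
        responses.append(bool(answer_fn(item)))
        prev_se = se
        theta, se = eap_estimate(responses, a[used], b[used])
        small_gains = small_gains + 1 if prev_se - se < min_gain else 0
        if se < se_target or small_gains >= patience:    # rules 2 and 3
            break
    return theta, se, used

# Hypothetical bank and a deterministic student who answers every item
# easier than difficulty 1.0 correctly.
b_bank = np.linspace(-3.0, 3.0, 61)
a_bank = np.full_like(b_bank, 1.5)
theta, se, used = run_cat(lambda i: b_bank[i] < 1.0, a_bank, b_bank, start_theta=0.0)
```

Because every student sees a different item sequence, there are no fixed alternate forms to compare; comparability rests instead on all items sharing one calibrated scale.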

 

Number of alternate forms of equal and controlled difficulty:

Not applicable.

Decision Rules: Setting and Revising Goals

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | dash | dash | dash | dash | dash |

Specification of validated decision rules for when goals should be set or revised:

No qualifying evidence provided.

 

Evidentiary basis for these rules:

No qualifying evidence provided.

Decision Rules: Changing Instruction

| Grade | 4 | 5 | 6 | 7 | 8 |
| --- | --- | --- | --- | --- | --- |
| Rating | dash | dash | dash | dash | dash |

Specification of validated decision rules for when changes to instruction should be made:

No qualifying evidence provided.

 

Evidentiary basis for these rules:

No qualifying evidence provided.

Administration Format

| Grade | Administration Format | Administration & Scoring Time | Scoring Format | ROI & EOY Benchmarks |
| --- | --- | --- | --- | --- |
| 4 | Individual, Group, Computer-administered | 30 minutes | Computer-scored | ROI & EOY Benchmarks Available |
| 5 | Individual, Group, Computer-administered | 30 minutes | Computer-scored | ROI & EOY Benchmarks Available |
| 6 | Individual, Group, Computer-administered | 30 minutes | Computer-scored | ROI & EOY Benchmarks Available |
| 7 | Individual, Group, Computer-administered | 30 minutes | Computer-scored | ROI & EOY Benchmarks Available |
| 8 | Individual, Group, Computer-administered | 30 minutes | Computer-scored | ROI & EOY Benchmarks Available |
Specify the minimum acceptable rate of growth/improvement:

National norms for ISIP Advanced Reading enable teachers, parents, and students to see how a student’s scores compare with those of a nationally representative sample of children in the same grade. Norming samples are obtained as part of Istation's ongoing research in assessing reading ability. The samples were drawn from enrolled ISIP users during the 2014-2015 school year. Considerable attention was given to ensuring that the sample was nationally representative of students in 4th through 8th grade with respect to age, race/ethnicity, gender, socioeconomic status, special education services, and English language proficiency.

Norming establishes the Instructional Tier Goals used to determine Instructional Tiers for each month. Consistent with other reading assessments, Istation has defined a three-tier normative grouping based on indices associated with the 20th and 40th percentiles. Students with an index at or above the 40th percentile for their grade are placed into Tier 1. Students with an index at or above the 20th percentile but below the 40th percentile are placed into Tier 2. Students with an index below the 20th percentile are placed into Tier 3. These tiers are used to guide educators in determining the level of instruction for each student.
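The tier rule can be written as a small function. This is a minimal sketch that takes a student's grade-level percentile as input; the function name is hypothetical, and Tier 2 is taken to be the remaining band between the 20th and 40th percentiles, which is inferred from the three-tier description and the two stated cut points.

```python
def instructional_tier(percentile: float) -> int:
    """Map a grade-level percentile (0-100) to an instructional tier.

    Tier 1: at or above the 40th percentile.
    Tier 2: at or above the 20th but below the 40th (inferred band).
    Tier 3: below the 20th percentile.
    """
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    if percentile >= 40:
        return 1
    if percentile >= 20:
        return 2
    return 3

# Hypothetical percentiles for four students.
tiers = [instructional_tier(p) for p in (75, 40, 25, 10)]
```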

     

Specify the benchmarks for minimum acceptable end-of-year performance:

Istation establishes Instructional Tier Goals to determine Instructional Tiers for each month of the year. The monthly goals for May or June (determined by each customer) are used as the end-of-year performance goal.