MAP Growth K-2

Reading

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

MAP Growth K–2 Reading annual per-student subscription fees range from $7.00 to $9.50. A bundled assessment suite of Mathematics and Reading tests starts at $13.50 per student. Discounts are available based on volume and other factors.

 

Replacement Cost:

Subscription renewal fees are subject to change annually.

 

Included in Cost:

Annual subscription fees include the following:

  • Full Assessment Suite: MAP Growth K–2 assessments can be administered up to four times per calendar year. The abbreviated Screening Assessment may be administered once a year for placement purposes. The license also includes access to ten Skills Checklist tests that provide information about specific skills and concepts (e.g., phonological awareness and matching letters to sounds).
  • Robust Reporting: All results from MAP Growth K–2 assessments (including RIT scale scores, proficiency projections, and status and growth norms) are available in a variety of views and formats through MAP Growth’s comprehensive suite of reports.
  • Learning Continuum: Dynamic reporting of learning statements, specifically aligned to the applicable state standards, provides information about what each student is ready to learn.
  • System of Support: A full system of support is provided to enable the success of MAP Growth partners, including technical support; implementation support through the first test administration; and ongoing, dedicated account management for the duration of the partnership.
  • NWEA Professional Learning Online: Access to this online learning portal offers on-demand tutorials, webinars, courses, and videos to supplement professional learning plans and help educators use MAP Growth to improve teaching and learning.

 

NWEA offers a portfolio of flexible, customizable professional learning and training options to meet the needs of partners. Please contact NWEA via https://www.nwea.org/sales-information/ for specific details on pricing.

 

Technology Requirements:

  • Computer or tablet
  • Internet connection

 

Training Requirements:

  • 1–4 hours of training

 

Qualified Administrators:

Examiners should meet the same qualifications as a teaching paraprofessional and should complete all necessary training related to administering the assessment.

 

Accommodations:

MAP Growth assessments incorporate universal design principles for greater accessibility: all content areas are created with universal design and accessibility standards in mind from the start. For example, alternative text descriptions (alt-tags) for images provide access for those using screen readers by describing pictures, charts, graphs, and other visuals that a user may not be able to see. Laying this foundation promotes accessibility for students using various accommodations.

 

Following national standards, such as the Web Content Accessibility Guidelines (WCAG) 2.0 and Accessible Rich Internet Applications (ARIA), helps to guide the creation of MAP Growth assessments.

With support from the WGBH National Center for Accessible Media (NCAM), NWEA has created detailed and thorough guidelines for describing many variations of images, charts, and graphics targeted specifically to mathematics and reading. The guidelines review concepts such as item integrity, fairness, and the unique challenges image description writers face in the context of assessment. These guidelines result in consistent, user-friendly, and valid image descriptions that support the use of screen readers.

 

MAP Growth K–2 assessments include built-in human audio support and other interactive features for early learners. The purpose of providing human voice audio is to address specific content areas (e.g., phonemic awareness, phonics, and listening skills). Therefore, the audio is strategically placed within the item to maintain content validity.

 

This assessment does not include many of the accessibility features available in MAP Growth for grades two and above, because introducing new assistive technology at the K–2 level calls into question what is actually being measured: the student's ability to use the new technology or the content of the item.

 

Tools are available to all students taking the assessment. They are embedded into the user interface for each item at the appropriate test level; they are not specific to a particular population and can be used by any student whenever needed during the testing experience.

Where to Obtain:

Website: www.nwea.org

Address: 121 NW Everett Street, Portland, OR 97209

Phone number: (503) 624-1951

Please contact NWEA via https://www.nwea.org/sales-information/ for service and support questions.

Access to Technical Support:

Toll-free telephone support, online support, website knowledge base, and live chat support are available.

MAP Growth K–2 assessments are used across the country for multiple purposes, including as universal screening tools in response to intervention (RTI) programs.

 

MAP Growth K–2 can serve as universal screeners for identifying students at risk of poor academic outcomes in reading. MAP Growth K–2 assessments give educators insight into the instructional needs of all students, whether they are performing at, above, or below grade level.

 

MAP Growth K–2 assessments contain appropriate items for students who are still acquiring the skills needed to read independently.

 

MAP Growth K–2 assessments are computer adaptive tests with a cross-grade vertical scale that assess achievement according to standards-aligned content. Scores from repeated administrations measure growth over time. MAP Growth K–2 tests can be administered four times per calendar year.

 

MAP Growth and MAP Growth K–2 are scaled across grades. The Rasch model, an item response theory (IRT) model commonly employed in K–12 assessment programs, was used to create the scales for MAP Growth and MAP Growth K–2 assessments. These scales have been named RIT scales (for Rasch Unit).
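For reference, the Rasch model's item response function gives the probability of a correct response as a function of the difference between a student's ability (θ) and an item's difficulty (b), both on a common logit metric; the RIT scale is typically described as a linear rescaling of that metric. This is the standard form of the model, shown here for orientation rather than as NWEA's exact operational specification.

```latex
% Rasch model item response function: probability that student i answers item j
% correctly, given ability \theta_i and item difficulty b_j on a common logit scale.
P(X_{ij} = 1 \mid \theta_i, b_j) = \frac{\exp(\theta_i - b_j)}{1 + \exp(\theta_i - b_j)}
```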

 

 

Assessment Format:

  • Direct: Computerized

 

Administration Time:

  • 45 minutes per student, per subject

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

MAP Growth K–2 scores are not based on raw scores, because the tests are adaptive; the difficulty of the items a student answers is used to derive the scale score. During the assessment, a Bayesian scoring algorithm is used to inform item selection. Bayesian scoring for item selection prevents artificially dramatic fluctuations in student achievement at the beginning of the test, which can occur with other scoring algorithms. Although Bayesian scoring works well as a procedure for selecting items during test administration, Bayesian scores are not appropriate for calculating final student achievement scores, because Bayesian scoring uses information other than the student’s responses to questions (such as past performance) to calculate the achievement estimate. Since only the student’s performance on the current test should determine the current score, a maximum-likelihood algorithm is used to calculate the student’s actual score at the completion of the test.
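To make the final, responses-only scoring step concrete, the sketch below estimates an ability score by maximum likelihood under a Rasch model from a single student's item responses. It is a minimal illustration, not NWEA's operational algorithm; the item difficulties, response string, and function names are hypothetical, and the Bayesian item-selection step is omitted.

```python
import numpy as np

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def ml_score(responses, difficulties, grid=None):
    """Maximum-likelihood ability estimate based only on the student's own responses.

    responses    : 0/1 scores for the items actually administered
    difficulties : Rasch difficulty (b) of each administered item, on the logit scale
    """
    if grid is None:
        grid = np.linspace(-6, 6, 2401)                       # candidate theta values
    responses = np.asarray(responses, dtype=float)
    difficulties = np.asarray(difficulties, dtype=float)
    p = rasch_prob(grid[:, None], difficulties[None, :])      # shape: (thetas, items)
    loglik = (responses * np.log(p) + (1 - responses) * np.log(1 - p)).sum(axis=1)
    return grid[np.argmax(loglik)]                            # theta maximizing the likelihood

# Hypothetical 10-item administration: difficulties spread from easy to hard,
# and the student answers the seven easiest items correctly.
b = np.linspace(-2.0, 2.0, 10)
x = np.array([1, 1, 1, 1, 1, 1, 1, 0, 0, 0])
print(round(ml_score(x, b), 2))   # logit-scale estimate; a reporting scale would be a linear rescaling
```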

 

Scores Generated:

  • Percentile score
  • IRT-based score
  • Developmental benchmarks
  • Developmental cut points
  • Composite scores
  • Subscale/subtest scores

 

 

Classification Accuracy

Grade | K | 1
Criterion 1 Fall | Half-filled bubble | Half-filled bubble
Criterion 1 Winter | Half-filled bubble | Half-filled bubble
Criterion 1 Spring | Half-filled bubble | Half-filled bubble
Criterion 2 Fall | Dash | Dash
Criterion 2 Winter | Dash | Dash
Criterion 2 Spring | Dash | Dash

Primary Sample

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) ELA

Time of Year: Fall

 

Metric | Grade K | Grade 1
Cut points | MAP Growth < 138, PARCC < 700 | MAP Growth < 154, PARCC < 700
Base rate in the sample for children requiring intensive intervention | 0.13 | 0.13
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20
False Positive Rate | 0.15 | 0.13
False Negative Rate | 0.50 | 0.37
Sensitivity | 0.50 | 0.63
Specificity | 0.85 | 0.87
Positive Predictive Power | 0.33 | 0.40
Negative Predictive Power | 0.92 | 0.94
Overall Classification Rate | 0.80 | 0.84
Area Under the Curve (AUC) | 0.81 | 0.86
AUC 95% Confidence Interval Lower | 0.79 | 0.85
AUC 95% Confidence Interval Upper | 0.82 | 0.87
At 90% Sensitivity, specificity equals | 0.37 | 0.55
At 80% Sensitivity, specificity equals | 0.62 | 0.75
At 70% Sensitivity, specificity equals | 0.79 | 0.85
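As a reading aid for the tables in this section, the sketch below shows how the reported 2×2 metrics follow from a screener cut score and a criterion cut score. Only the cut points (MAP Growth < 138, PARCC < 700 for Grade K in the fall) come from the table above; the simulated score distributions, their relationship, and the function name are illustrative assumptions, not the study data.

```python
import numpy as np

def classification_metrics(screener, criterion, screener_cut, criterion_cut):
    """2x2 classification metrics for a screener judged against a later criterion.

    A student is flagged by the screener when screener < screener_cut and is a
    true at-risk case when criterion < criterion_cut.
    """
    screener = np.asarray(screener, dtype=float)
    criterion = np.asarray(criterion, dtype=float)
    flagged = screener < screener_cut
    at_risk = criterion < criterion_cut

    tp = np.sum(flagged & at_risk)      # correctly flagged
    fp = np.sum(flagged & ~at_risk)     # flagged but not at risk
    fn = np.sum(~flagged & at_risk)     # missed
    tn = np.sum(~flagged & ~at_risk)    # correctly not flagged

    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false positive rate": fp / (fp + tn),
        "false negative rate": fn / (fn + tp),
        "positive predictive power": tp / (tp + fp),
        "negative predictive power": tn / (tn + fn),
        "overall classification rate": (tp + tn) / len(screener),
    }

# Hypothetical Grade K fall data: MAP Growth-like scores and correlated PARCC-like scores.
rng = np.random.default_rng(0)
map_scores = rng.normal(150, 12, 1000)
parcc_scores = 700 + 3.0 * (map_scores - 145) + rng.normal(0, 30, 1000)
print(classification_metrics(map_scores, parcc_scores, screener_cut=138, criterion_cut=700))
```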

 

Criterion 1: PARCC ELA

Time of Year: Winter

 

Metric | Grade K | Grade 1
Cut points | MAP Growth < 145, PARCC < 700 | MAP Growth < 163, PARCC < 700
Base rate in the sample for children requiring intensive intervention | 0.13 | 0.13
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20
False Positive Rate | 0.14 | 0.13
False Negative Rate | 0.45 | 0.35
Sensitivity | 0.55 | 0.65
Specificity | 0.86 | 0.87
Positive Predictive Power | 0.38 | 0.42
Negative Predictive Power | 0.93 | 0.95
Overall Classification Rate | 0.82 | 0.84
Area Under the Curve (AUC) | 0.82 | 0.87
AUC 95% Confidence Interval Lower | 0.80 | 0.86
AUC 95% Confidence Interval Upper | 0.84 | 0.88
At 90% Sensitivity, specificity equals | 0.44 | 0.57
At 80% Sensitivity, specificity equals | 0.66 | 0.78
At 70% Sensitivity, specificity equals | 0.80 | 0.87

 

Criterion 1: PARCC ELA

Time of Year: Spring

 

Metric | Grade K | Grade 1
Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 171, PARCC < 700
Base rate in the sample for children requiring intensive intervention | 0.12 | 0.12
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20
False Positive Rate | 0.15 | 0.12
False Negative Rate | 0.43 | 0.33
Sensitivity | 0.57 | 0.67
Specificity | 0.86 | 0.88
Positive Predictive Power | 0.36 | 0.43
Negative Predictive Power | 0.93 | 0.95
Overall Classification Rate | 0.82 | 0.85
Area Under the Curve (AUC) | 0.82 | 0.88
AUC 95% Confidence Interval Lower | 0.80 | 0.87
AUC 95% Confidence Interval Upper | 0.84 | 0.89
At 90% Sensitivity, specificity equals | 0.45 | 0.60
At 80% Sensitivity, specificity equals | 0.67 | 0.81
At 70% Sensitivity, specificity equals | 0.79 | 0.88

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

 

Disaggregated Data

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: Asian or Pacific Islander

 

Grade K

 Grade 1

Cut points

MAP Growth < 138, PARCC < 700

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.04

0.05

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.13

0.08

False Negative Rate

0.47

0.27

Sensitivity

0.53

0.73

Specificity

0.87

0.92

Positive Predictive Power

0.15

0.31

Negative Predictive Power

0.98

0.99

Overall Classification Rate

0.86

0.91

Area Under the Curve (AUC)

0.83

0.92

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.46

0.77

At 80% Sensitivity, specificity equals

0.78

0.90

At 70% Sensitivity, specificity equals

0.87

1.00

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: Asian or Pacific Islander

 

Grade K

 Grade 1

Cut points

MAP Growth < 145, PARCC < 700

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.05

0.05

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.14

0.11

False Negative Rate

0.53

0.31

Sensitivity

0.47

0.69

Specificity

0.86

0.89

Positive Predictive Power

0.14

0.24

Negative Predictive Power

0.97

0.98

Overall Classification Rate

0.84

0.88

Area Under the Curve (AUC)

0.79

0.90

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.32

0.67

At 80% Sensitivity, specificity equals

0.61

0.91

At 70% Sensitivity, specificity equals

0.84

0.96

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: Asian or Pacific Islander

 

Grade K

 Grade 1

Cut points

MAP Growth < 153, PARCC < 700

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.05

0.05

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.12

0.09

False Negative Rate

0.60

0.46

Sensitivity

0.40

0.54

Specificity

0.88

0.91

Positive Predictive Power

0.15

0.23

Negative Predictive Power

0.97

0.97

Overall Classification Rate

0.86

0.89

Area Under the Curve (AUC)

0.78

0.87

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.35

0.61

At 80% Sensitivity, specificity equals

0.60

0.78

At 70% Sensitivity, specificity equals

0.80

0.89

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: Black

 

Grade K

 Grade 1

Cut points

MAP Growth < 138, PARCC < 700

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.25

0.22

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.25

0.19

False Negative Rate

0.49

0.48

Sensitivity

0.51

0.52

Specificity

0.75

0.82

Positive Predictive Power

0.41

0.44

Negative Predictive Power

0.82

0.86

Overall Classification Rate

0.69

0.75

Area Under the Curve (AUC)

0.74

0.80

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.29

0.45

At 80% Sensitivity, specificity equals

0.46

0.56

At 70% Sensitivity, specificity equals

0.64

0.75

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: Black

 

Grade K

 Grade 1

Cut points

MAP Growth < 145, PARCC < 700

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.26

0.23

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.17

0.17

False Negative Rate

0.38

0.43

Sensitivity

0.62

0.57

Specificity

0.83

0.83

Positive Predictive Power

0.57

0.51

Negative Predictive Power

0.86

0.87

Overall Classification Rate

0.78

0.77

Area Under the Curve (AUC)

0.79

0.82

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.43

0.49

At 80% Sensitivity, specificity equals

0.67

0.67

At 70% Sensitivity, specificity equals

0.79

0.78

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: Black

 

Grade K

 Grade 1

Cut points

MAP Growth < 153, PARCC < 700

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.22

0.22

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.17

0.17

False Negative Rate

0.45

0.32

Sensitivity

0.55

0.68

Specificity

0.83

0.83

Positive Predictive Power

0.47

0.53

Negative Predictive Power

0.87

0.90

Overall Classification Rate

0.77

0.80

Area Under the Curve (AUC)

0.78

0.84

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.35

0.53

At 80% Sensitivity, specificity equals

0.62

0.73

At 70% Sensitivity, specificity equals

0.76

0.82

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: Hispanic

 

Grade K

 Grade 1

Cut points

MAP Growth < 138, PARCC < 700

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.21

0.23

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.27

0.23

False Negative Rate

0.42

0.27

Sensitivity

0.58

0.73

Specificity

0.74

0.77

Positive Predictive Power

0.36

0.49

Negative Predictive Power

0.87

0.90

Overall Classification Rate

0.70

0.76

Area Under the Curve (AUC)

0.74

0.82

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.29

0.48

At 80% Sensitivity, specificity equals

0.48

0.69

At 70% Sensitivity, specificity equals

0.60

0.82

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: Hispanic

 

Grade K

 Grade 1

Cut points

MAP Growth < 145, PARCC < 700

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.22

0.24

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.23

0.23

False Negative Rate

0.36

0.26

Sensitivity

0.64

0.74

Specificity

0.77

0.77

Positive Predictive Power

0.44

0.49

Negative Predictive Power

0.88

0.91

Overall Classification Rate

0.74

0.76

Area Under the Curve (AUC)

0.80

0.83

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.40

0.50

At 80% Sensitivity, specificity equals

0.57

0.70

At 70% Sensitivity, specificity equals

0.73

0.82

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: Hispanic

 

Grade K

 Grade 1

Cut points

MAP Growth < 153, PARCC < 700

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.22

0.23

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.25

0.22

False Negative Rate

0.33

0.25

Sensitivity

0.67

0.75

Specificity

0.75

0.78

Positive Predictive Power

0.43

0.51

Negative Predictive Power

0.89

0.91

Overall Classification Rate

0.74

0.78

Area Under the Curve (AUC)

0.80

0.85

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.39

0.51

At 80% Sensitivity, specificity equals

0.61

0.72

At 70% Sensitivity, specificity equals

0.76

0.88

 

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: White

 

Grade K

 Grade 1

Cut points

MAP Growth < 138, PARCC < 700

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.09

0.09

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.10

0.10

False Negative Rate

0.60

0.46

Sensitivity

0.40

0.54

Specificity

0.90

0.90

Positive Predictive Power

0.29

0.33

Negative Predictive Power

0.94

0.96

Overall Classification Rate

0.86

0.87

Area Under the Curve (AUC)

0.82

0.86

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.41

0.52

At 80% Sensitivity, specificity equals

0.66

0.74

At 70% Sensitivity, specificity equals

0.83

0.89

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: White

 

Grade K

 Grade 1

Cut points

MAP Growth < 145, PARCC < 700

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.09

0.09

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.10

0.10

False Negative Rate

0.56

0.43

Sensitivity

0.44

0.57

Specificity

0.91

0.90

Positive Predictive Power

0.32

0.36

Negative Predictive Power

0.94

0.96

Overall Classification Rate

0.86

0.88

Area Under the Curve (AUC)

0.81

0.87

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.45

0.59

At 80% Sensitivity, specificity equals

0.60

0.79

At 70% Sensitivity, specificity equals

0.77

0.87

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: White

 

Grade K

 Grade 1

Cut points

MAP Growth < 153, PARCC < 700

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.09

0.08

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.11

0.09

False Negative Rate

0.53

0.40

Sensitivity

0.47

0.60

Specificity

0.89

0.91

Positive Predictive Power

0.29

0.37

Negative Predictive Power

0.95

0.96

Overall Classification Rate

0.86

0.89

Area Under the Curve (AUC)

0.81

0.88

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.42

0.63

At 80% Sensitivity, specificity equals

0.67

0.77

At 70% Sensitivity, specificity equals

0.77

0.89

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: Multi-Ethnic

 

Grade 1

Cut points

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.08

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

False Positive Rate

0.14

False Negative Rate

0.33

Sensitivity

0.67

Specificity

0.86

Positive Predictive Power

0.30

Negative Predictive Power

0.97

Overall Classification Rate

0.84

Area Under the Curve (AUC)

0.83

AUC 95% Confidence Interval Lower

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

At 90% Sensitivity, specificity equals

0.45

At 80% Sensitivity, specificity equals

0.67

At 70% Sensitivity, specificity equals

0.79

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: Multi-Ethnic

 

Grade 1

Cut points

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.08

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

False Positive Rate

0.12

False Negative Rate

0.39

Sensitivity

0.62

Specificity

0.88

Positive Predictive Power

0.30

Negative Predictive Power

0.96

Overall Classification Rate

0.86

Area Under the Curve (AUC)

0.86

AUC 95% Confidence Interval Lower

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

At 90% Sensitivity, specificity equals

0.45

At 80% Sensitivity, specificity equals

0.62

At 70% Sensitivity, specificity equals

0.81

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: Multi-Ethnic

 

Grade 1

Cut points

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.09

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

False Positive Rate

0.13

False Negative Rate

0.41

Sensitivity

0.59

Specificity

0.87

Positive Predictive Power

0.32

Negative Predictive Power

0.95

Overall Classification Rate

0.85

Area Under the Curve (AUC)

0.88

AUC 95% Confidence Interval Lower

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

At 90% Sensitivity, specificity equals

0.54

At 80% Sensitivity, specificity equals

0.82

At 70% Sensitivity, specificity equals

0.92

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: Female

 

Grade K

Grade 1

Cut points

MAP Growth < 138, PARCC < 700

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.11

0.11

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.12

0.11

False Negative Rate

0.47

0.31

Sensitivity

0.53

0.69

Specificity

0.88

0.89

Positive Predictive Power

0.34

0.42

Negative Predictive Power

0.94

0.96

Overall Classification Rate

0.84

0.87

Area Under the Curve (AUC)

0.83

0.89

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.45

0.64

At 80% Sensitivity, specificity equals

0.67

0.81

At 70% Sensitivity, specificity equals

0.83

0.91

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: Female

 

Grade K

Grade 1

Cut points

MAP Growth < 145, PARCC < 700

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.11

0.11

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.12

0.12

False Negative Rate

0.38

0.29

Sensitivity

0.62

0.71

Specificity

0.88

0.89

Positive Predictive Power

0.39

0.42

Negative Predictive Power

0.95

0.96

Overall Classification Rate

0.85

0.87

Area Under the Curve (AUC)

0.85

0.90

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.57

0.65

At 80% Sensitivity, specificity equals

0.73

0.84

At 70% Sensitivity, specificity equals

0.84

0.90

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: Female

 

Grade K

Grade 1

Cut points

MAP Growth < 153, PARCC < 700

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.10

0.10

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.13

0.11

False Negative Rate

0.34

0.28

Sensitivity

0.66

0.72

Specificity

0.87

0.89

Positive Predictive Power

0.36

0.42

Negative Predictive Power

0.96

0.97

Overall Classification Rate

0.85

0.87

Area Under the Curve (AUC)

0.86

0.91

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.55

0.68

At 80% Sensitivity, specificity equals

0.77

0.86

At 70% Sensitivity, specificity equals

0.86

0.93

 

Criterion 1: PARCC ELA

Time of Year: Fall

Subgroup: Male

 

Grade K

Grade 1

Cut points

MAP Growth < 138, PARCC < 700

MAP Growth < 154, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.15

0.14

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.18

0.15

False Negative Rate

0.52

0.41

Sensitivity

0.48

0.59

Specificity

0.82

0.85

Positive Predictive Power

0.32

0.39

Negative Predictive Power

0.90

0.93

Overall Classification Rate

0.77

0.81

Area Under the Curve (AUC)

0.78

0.83

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.31

0.49

At 80% Sensitivity, specificity equals

0.54

0.68

At 70% Sensitivity, specificity equals

0.75

0.79

 

Criterion 1: PARCC ELA

Time of Year: Winter

Subgroup: Male

 

Grade K

Grade 1

Cut points

MAP Growth < 145, PARCC < 700

MAP Growth < 163, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.15

0.15

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.15

0.15

False Negative Rate

0.50

0.39

Sensitivity

0.50

0.61

Specificity

0.85

0.85

Positive Predictive Power

0.37

0.42

Negative Predictive Power

0.90

0.93

Overall Classification Rate

0.79

0.82

Area Under the Curve (AUC)

0.79

0.85

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.35

0.51

At 80% Sensitivity, specificity equals

0.59

0.73

At 70% Sensitivity, specificity equals

0.74

0.84

 

Criterion 1: PARCC ELA

Time of Year: Spring

Subgroup: Male

 

Grade K

Grade 1

Cut points

MAP Growth < 153, PARCC < 700

MAP Growth < 171, PARCC < 700

Base rate in the sample for children requiring intensive intervention

0.14

0.14

Base rate in the sample for children considered at-risk, including those with the most intensive needs

0.20

0.20

False Positive Rate

0.16

0.14

False Negative Rate

0.50

0.36

Sensitivity

0.50

0.64

Specificity

0.84

0.87

Positive Predictive Power

0.35

0.44

Negative Predictive Power

0.91

0.94

Overall Classification Rate

0.80

0.83

Area Under the Curve (AUC)

0.79

0.86

AUC 95% Confidence Interval Lower

Not Provided

Not Provided

AUC 95% Confidence Interval Upper

Not Provided

Not Provided

At 90% Sensitivity, specificity equals

0.38

0.54

At 80% Sensitivity, specificity equals

0.59

0.75

At 70% Sensitivity, specificity equals

0.74

0.84

 

Reliability

Grade | K | 1
Rating | Full bubble | Full bubble
  1. Justification for each type of reliability reported, given the type and purpose of the tool:

When MAP Growth K–2 is used as an academic screener, the internal consistency reliability of student test scores (i.e., student RIT scores on MAP Growth) is key. However, estimating the internal consistency of an adaptive test such as MAP Growth K–2 is challenging because traditional methods depend on all test takers taking a common test consisting of the same items; applying these methods to adaptive tests is statistically cumbersome and inaccurate. Fortunately, an equally valid alternative is available in the marginal reliability coefficient,[1][2] which incorporates measurement error as a function of the test score. In effect, it is the result of combining measurement error estimated at different points on the achievement scale into a single index. Note that this method of calculating reliability yields results that are nearly identical to coefficient alpha when both methods are applied to the same fixed-form test.

MAP Growth K–2 affords the means to screen students on multiple occasions (e.g., Fall, Winter, Spring) during the school year. Thus, test-retest reliability is also key, and we estimate it via the Pearson correlation between the RIT scores of students taking MAP Growth K–2 in two terms within the school year (Fall and Winter, Fall and Spring, and Winter and Spring). Given that MAP Growth K–2 is an adaptive test without any fixed forms, this approach may be more accurately described as a mix of test-retest reliability and a type of parallel-forms reliability. That is, RIT scores are obtained for students taking MAP Growth K–2 twice, several months apart. The second test (or retest) is not the same test; rather, it is comparable to the first in content and structure, differing only in the difficulty level of its items. Thus, both the temporally related and the parallel-forms aspects of reliability are captured as the consistency of equivalent measures taken across time. Green, Bock, Humphreys, Linn, and Reckase[3] suggested the term “stratified, randomly parallel form reliability” to characterize this form of reliability.

 

 

  2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

Representation: New England, Middle Atlantic, East North Central, South Atlantic, Mountain. The sample for the study contained student records from a total of five states (Colorado, Illinois, New Jersey, New Mexico, and Rhode Island) and one federal district (District of Columbia), and thus had representation from all four U.S. Census regions.

Date: PARCC data was based on students in Grade 3 who took the PARCC assessment during Spring 2016; the Spring 2016 PARCC administration spanned from March 2016 through June 2016. MAP Growth K–2 data was obtained for this same sample of Grade 3 students. Specifically, for these students who were in Grade 3 in Spring 2016, their MAP Growth K–2 scores from previous years were obtained: their data from Fall 2012, Winter 2013, and Spring 2013 served as their Grade K scores; their data from Fall 2013, Winter 2014, and Spring 2014 served as their Grade 1 scores; and their data from Fall 2014, Winter 2015, and Spring 2015 served as their Grade 2 scores. Thus, the data used were based on a sample of Grade 3 students taking the PARCC assessment in Spring 2016 and their MAP Growth K–2 scores from previous years, when they were in Grades K, 1, and 2.

Size: Table 4 summarizes the total number of students by state, region, and division.

Male: 50.94%
Female: 49.04%
Unknown: 0.02%
Other SES Indicators: Not Provided
Free or reduced-price lunch: Not Provided
White, Non-Hispanic: 40.88%
Black, Non-Hispanic: 6.14%
Hispanic: 23.40%
American Indian/Alaska Native: 3.00%
Asian/Pacific Islander: 9.02%
Multi-Ethnic: 3.33%
Not Specified or Other: 14.96%
Disability classification: Not Provided
First language: Not Provided
Language proficiency status: Not Provided

 

Table 4: Number of Students

Number of Students Per State
CO: 2,967
DC: 167
IL: 12,140
NJ: 643
NM: 208
RI: 209
Total: 16,334

Number of Students Per Region
Midwest: 12,140
Northeast: 852
South: 167
West: 3,175
Total: 16,334

Number of Students Per Division
East North Central: 12,140
Middle Atlantic: 643
Mountain: 3,175
New England: 209
South Atlantic: 167
Total: 16,334

 

 

  3. Description of the analysis procedures for each reported type of reliability:

Marginal Reliability. The approach taken for estimating marginal reliability on MAP Growth K–2 was suggested by Wright in 1999[4]. For a sample of N students, marginal reliability ($\bar{\rho}$) is estimated by

$$\bar{\rho} = \frac{\hat{\sigma}^{2}_{\hat{\theta}} - \frac{1}{N}\sum_{i=1}^{N}\mathrm{CSEM}^{2}(\hat{\theta}_{i})}{\hat{\sigma}^{2}_{\hat{\theta}}}$$

where $\theta$ is the IRT achievement level (on a standardized or scaled score metric), $\hat{\theta}$ is an estimate of $\theta$, $\hat{\sigma}^{2}_{\hat{\theta}}$ is the observed variance of $\hat{\theta}$ across the sample of N students, $\mathrm{CSEM}^{2}(\hat{\theta})$ is the squared conditional (on $\hat{\theta}$) standard error of measurement (CSEM), and $\frac{1}{N}\sum_{i=1}^{N}\mathrm{CSEM}^{2}(\hat{\theta}_{i})$ is the average squared CSEM across the sample of N students.

A bootstrapping approach is used to calculate a 95% confidence interval for marginal reliability. For an initial dataset of the achievement levels and CSEMs for N students, a bootstrap 95% confidence interval for marginal reliability is obtained as follows:

  1. Draw a random sample of size N with replacement from the initial dataset.
  2. Calculate marginal reliability based on the random sample drawn in Step 1.
  3. Repeat Steps 1 and 2 a total of 1,000 times.
  4. Determine the 2.5th and 97.5th percentile points from the resulting 1,000 estimates of marginal reliability. The values of these two percentiles are the bootstrap 95% confidence interval.
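A minimal sketch of this computation is given below, assuming only a vector of score estimates and their conditional SEMs. The simulated inputs are illustrative and the function names are hypothetical, not NWEA's code.

```python
import numpy as np

def marginal_reliability(theta_hat, csem):
    """Marginal reliability: (variance of estimates - mean squared CSEM) / variance of estimates."""
    theta_hat = np.asarray(theta_hat, dtype=float)
    csem = np.asarray(csem, dtype=float)
    var_hat = np.var(theta_hat, ddof=1)
    return (var_hat - np.mean(csem ** 2)) / var_hat

def bootstrap_95ci(theta_hat, csem, n_boot=1000, seed=0):
    """Percentile bootstrap 95% CI for marginal reliability, following Steps 1-4 above."""
    rng = np.random.default_rng(seed)
    theta_hat = np.asarray(theta_hat, dtype=float)
    csem = np.asarray(csem, dtype=float)
    n = len(theta_hat)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)                                # Step 1: resample N students
        stats.append(marginal_reliability(theta_hat[idx], csem[idx]))   # Step 2
    return np.percentile(stats, [2.5, 97.5])                            # Step 4

# Hypothetical RIT-like score estimates and conditional SEMs for 3,000 students
rng = np.random.default_rng(1)
scores = rng.normal(150, 12, 3000)
csems = rng.uniform(2.5, 4.0, 3000)
print(round(marginal_reliability(scores, csems), 3), bootstrap_95ci(scores, csems))
```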

Test-Retest Reliability. Test-retest reliability of MAP Growth K–2 was estimated as the Pearson correlation of RIT scores for students in the study dataset who took MAP Growth K–2 twice within a school year. Because the test-retest reliability coefficient is fundamentally a Pearson correlation, its confidence interval (CI) was obtained using the standard CI for a Pearson correlation (i.e., via the Fisher z-transformation).
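The corresponding computation is sketched below: a Pearson correlation between two score vectors with the standard Fisher z confidence interval. The simulated fall and winter scores are illustrative only.

```python
import numpy as np

def fisher_ci_95(r, n):
    """Standard 95% CI for a Pearson correlation via the Fisher z-transformation."""
    z = np.arctanh(r)                      # Fisher z
    se = 1.0 / np.sqrt(n - 3)
    lo, hi = z - 1.96 * se, z + 1.96 * se
    return np.tanh(lo), np.tanh(hi)        # back-transform to the correlation metric

# Hypothetical fall and winter RIT-like scores for the same 2,800 students
rng = np.random.default_rng(2)
fall = rng.normal(140, 12, 2800)
winter = 147 + 0.8 * (fall - 140) + rng.normal(0, 7, 2800)

r = np.corrcoef(fall, winter)[0, 1]        # test-retest coefficient
print(round(r, 2), fisher_ci_95(r, len(fall)))
```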

 

 

  4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

Type of Reliability | Grade | N | Coefficient | Confidence Interval
Marginal (Fall) | K | 3,244 | 0.93 | 0.92, 0.93
Marginal (Winter) | K | 3,475 | 0.94 | 0.93, 0.94
Marginal (Spring) | K | 3,828 | 0.95 | 0.94, 0.95
Marginal (Fall) | 1 | 4,765 | 0.95 | 0.95, 0.95
Marginal (Winter) | 1 | 4,682 | 0.96 | 0.95, 0.96
Marginal (Spring) | 1 | 5,124 | 0.95 | 0.95, 0.96
Test-Retest (Fall/Winter) | K | 2,886 | 0.77 | 0.76, 0.79
Test-Retest (Fall/Spring) | K | 2,998 | 0.73 | 0.71, 0.74
Test-Retest (Winter/Spring) | K | 3,306 | 0.81 | 0.79, 0.82
Test-Retest (Fall/Winter) | 1 | 4,421 | 0.86 | 0.85, 0.87
Test-Retest (Fall/Spring) | 1 | 4,684 | 0.81 | 0.80, 0.82
Test-Retest (Winter/Spring) | 1 | 4,631 | 0.87 | 0.86, 0.87

 

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

Type of Reliability

Subgroup

Grade

N

Coefficient

Confidence Interval

Marginal (Fall)

Ethnicity: Asian or Pacific Islander

K

368

0.95

0.94, 0.96

Marginal (Winter)

Ethnicity: Asian or Pacific Islander

K

394

0.95

0.94, 0.96

Marginal (Spring)

Ethnicity: Asian or Pacific Islander

K

400

0.96

0.95, 0.96

Marginal (Fall)

Ethnicity: Black

K

256

0.90

0.87, 0.92

Marginal (Winter)

Ethnicity: Black

K

251

0.91

0.88, 0.93

Marginal (Spring)

Ethnicity: Black

K

225

0.93

0.91, 0.94

Marginal (Fall)

Ethnicity: Hispanic

K

822

0.91

0.89, 0.92

Marginal (Winter)

Ethnicity: Hispanic

K

863

0.92

0.91, 0.93

Marginal (Spring)

Ethnicity: Hispanic

K

901

0.93

0.93, 0.94

Marginal (Fall)

Ethnicity: White

K

1,605

0.92

0.91, 0.93

Marginal (Winter)

Ethnicity: White

K

1,744

0.93

0.92, 0.93

Marginal (Spring)

Ethnicity: White

K

2,062

0.94

0.94, 0.94

Marginal (Fall)

Gender: Female

K

1,585

0.92

0.91, 0.93

Marginal (Winter)

Gender: Female

K

1,712

0.93

0.93, 0.94

Marginal (Spring)

Gender: Female

K

1,864

0.95

0.94, 0.95

Marginal (Fall)

Gender: Male

K

1,659

0.93

0.92, 0.93

Marginal (Winter)

Gender: Male

K

1,763

0.94

0.93, 0.94

Marginal (Spring)

Gender: Male

K

1,964

0.94

0.94, 0.95

Marginal (Fall)

Ethnicity: Asian or Pacific Islander

1

543

0.96

0.96, 0.97

Marginal (Winter)

Ethnicity: Asian or Pacific Islander

1

544

0.96

0.96, 0.96

Marginal (Spring)

Ethnicity: Asian or Pacific Islander

1

569

0.95

0.94, 0.95

Marginal (Fall)

Ethnicity: Black

1

290

0.93

0.92, 0.94

Marginal (Winter)

Ethnicity: Black

1

281

0.95

0.94, 0.95

Marginal (Spring)

Ethnicity: Black

1

309

0.95

0.94, 0.95

Marginal (Fall)

Ethnicity: Hispanic

1

1,044

0.95

0.94, 0.95

Marginal (Winter)

Ethnicity: Hispanic

1

1,039

0.95

0.95, 0.96

Marginal (Spring)

Ethnicity: Hispanic

1

1,090

0.95

0.95, 0.96

Marginal (Fall)

Ethnicity: Multi-Ethnic

1

178

0.95

0.94, 0.96

Marginal (Winter)

Ethnicity: Multi-Ethnic

1

167

0.96

0.95, 0.96

Marginal (Spring)

Ethnicity: Multi-Ethnic

1

184

0.95

0.94, 0.96

Marginal (Fall)

Ethnicity: White

1

2,534

0.95

0.94, 0.95

Marginal (Winter)

Ethnicity: White

1

2,484

0.95

0.95, 0.95

Marginal (Spring)

Ethnicity: White

1

2,783

0.95

0.95, 0.95

Marginal (Fall)

Gender: Female

1

2,335

0.95

0.95, 0.95

Marginal (Winter)

Gender: Female

1

2,302

0.96

0.95, 0.96

Marginal (Spring)

Gender: Female

1

2,513

0.95

0.95, 0.96

Marginal (Fall)

Gender: Male

1

2,430

0.95

0.95, 0.96

Marginal (Winter)

Gender: Male

1

2,379

0.96

0.95, 0.96

Marginal (Spring)

Gender: Male

1

2,610

0.95

0.95, 0.96

Test-Retest (Fall/Winter)

Ethnicity: Asian or Pacific Islander

K

356

0.83

0.79, 0.86

Test-Retest (Fall/Spring)

Ethnicity: Asian or Pacific Islander

K

348

0.80

0.76, 0.84

Test-Retest (Winter/Spring)

Ethnicity: Asian or Pacific Islander

K

377

0.85

0.82, 0.88

Test-Retest (Fall/Winter)

Ethnicity: Black

K

227

0.63

0.55, 0.70

Test-Retest (Fall/Spring)

Ethnicity: Black

K

189

0.58

0.47, 0.67

Test-Retest (Winter/Spring)

Ethnicity: Black

K

205

0.76

0.70, 0.82

Test-Retest (Fall/Winter)

Ethnicity: Hispanic

K

770

0.74

0.70, 0.77

Test-Retest (Fall/Spring)

Ethnicity: Hispanic

K

758

0.69

0.65, 0.72

Test-Retest (Winter/Spring)

Ethnicity: Hispanic

K

826

0.79

0.76, 0.81

Test-Retest (Fall/Winter)

Ethnicity: White

K

1,363

0.75

0.73, 0.77

Test-Retest (Fall/Spring)

Ethnicity: White

K

1,533

0.69

0.67, 0.72

Test-Retest (Winter/Spring)

Ethnicity: White

K

1,691

0.78

0.76, 0.79

Test-Retest (Fall/Winter)

Gender: Female

K

1,401

0.77

0.75, 0.79

Test-Retest (Fall/Spring)

Gender: Female

K

1,441

0.73

0.71, 0.76

Test-Retest (Winter/Spring)

Gender: Female

K

1,616

0.82

0.80, 0.83

Test-Retest (Fall/Winter)

Gender: Male

K

1,485

0.77

0.75, 0.79

Test-Retest (Fall/Spring)

Gender: Male

K

1,557

0.72

0.69, 0.74

Test-Retest (Winter/Spring)

Gender: Male

K

1,690

0.79

0.78, 0.81

Test-Retest (Fall/Winter)

Ethnicity: Asian or Pacific Islander

1

523

0.88

0.86, 0.90

Test-Retest (Fall/Spring)

Ethnicity: Asian or Pacific Islander

1

538

0.84

0.81, 0.86

Test-Retest (Winter/Spring)

Ethnicity: Asian or Pacific Islander

1

543

0.88

0.85, 0.89

Test-Retest (Fall/Winter)

Ethnicity: Black

1

263

0.85

0.81, 0.88

Test-Retest (Fall/Spring)

Ethnicity: Black

1

280

0.75

0.69, 0.80

Test-Retest (Winter/Spring)

Ethnicity: Black

1

278

0.83

0.79, 0.87

Test-Retest (Fall/Winter)

Ethnicity: Hispanic

1

995

0.84

0.82, 0.86

Test-Retest (Fall/Spring)

Ethnicity: Hispanic

1

1,025

0.79

0.76, 0.81

Test-Retest (Winter/Spring)

Ethnicity: Hispanic

1

1,028

0.85

0.83, 0.87

Test-Retest (Fall/Winter)

Ethnicity: Multi-Ethnic

1

161

0.82

0.76, 0.86

Test-Retest (Fall/Spring)

Ethnicity: Multi-Ethnic

1

172

0.82

0.77, 0.87

Test-Retest (Winter/Spring)

Ethnicity: Multi-Ethnic

1

166

0.87

0.83, 0.90

Test-Retest (Fall/Winter)

Ethnicity: White

1

2,318

0.84

0.83, 0.85

Test-Retest (Fall/Spring)

Ethnicity: White

1

2,503

0.78

0.77, 0.80

Test-Retest (Winter/Spring)

Ethnicity: White

1

2,454

0.85

0.84, 0.86

Test-Retest (Fall/Winter)

Gender: Female

1

2,170

0.87

0.86, 0.88

Test-Retest (Fall/Spring)

Gender: Female

1

2,287

0.83

0.82, 0.85

Test-Retest (Winter/Spring)

Gender: Female

1

2,275

0.88

0.87, 0.89

Test-Retest (Fall/Winter)

Gender: Male

1

2,251

0.85

0.83, 0.86

Test-Retest (Fall/Spring)

Gender: Male

1

2,397

0.79

0.77, 0.80

Test-Retest (Winter/Spring)

Gender: Male

1

2,355

0.86

0.85, 0.87



[1] Samejima, F. (1977). A use of the information function in tailored testing. Applied Psychological Measurement, 1(3), 233–247.

[2] Samejima, F. (1994). Estimation of reliability coefficients using the test information function and its modifications. Applied Psychological Measurement, 18(3), 229–244.

[3] Green, B. F., Bock, R. D., Humphreys, L. G., Linn, R. L., & Reckase, M. D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21(4), 347–360.

[4] Wright, B. D. (1999). “Rasch Measurement Models.” In G.N. Masters and J.P. Keeves (Eds.), Advances in Measurement in Educational Research and Assessment (pp. 85-97). Oxford, UK: Elsevier Science Ltd.

 

Validity

Grade | K | 1
Rating | Full bubble | Full bubble
  1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

In general terms, the better a test measures what it purports to measure and can support its intended uses and decision making, the stronger its validity is said to be. Within this broad statement resides a wide range of information that can be used as validity evidence. This information ranges, for example, from the adequacy and coverage of a test’s content, to its ability to yield scores that are predictive of a status in some area, to its ability to draw accurate inferences about a test taker’s status with respect to a construct, to its ability to allow generalizations from test performance within a domain to like performance in the same domain.

Much of the validity evidence for MAP Growth K–2 comes from the relationships of MAP Growth K–2 test scores with state content-aligned accountability test scores from Grade 3. These are predictive relationships between students’ performance on MAP Growth K–2 tests and their performance, in a later Spring testing term, on state accountability tests.

Several important points should be noted about the relationship between performance on MAP Growth K–2 tests and performance on state accountability tests. First, these two forms of tests (i.e., interim vs. summative) are designed to serve two related but different purposes. MAP Growth K–2 tests are designed to provide estimates of achievement status with low measurement error. They are also designed to provide reasonable estimates of students’ strengths and weaknesses within the identified goal structure.

State accountability tests are commonly designed to determine student proficiency within the state performance standard structure, with the most important decision being the classification of the student as proficient or not proficient. This primary purpose of most state tests, in conjunction with adopted content and curriculum standards and structures, can influence the relationship of student performance between the two tests.

For example, one of the most common factors influencing these relationships is the use of constructed response items in state tests. In general, the greater the number of constructed response items, the weaker the relationship will appear. Another difference is in test design. Since most state accountability tests are fixed form, it is reasonable for the test to be constructed so that maximum test information is established around the proficiency cut point. This is where a state wants to be the most confident about the classification decision that the test will inform. To the extent that this strategy is reflected in the state’s operational test, the relationship in performance between MAP tests and state tests will be attenuated due to a more truncated range of scores on the state test.

The requirement that state test content be connected to single grade-level content standards differs from the MAP test content structure, which spans grade levels. This difference is another factor that weakens the observed score relationships between tests. Finally, when focus is placed on the relationship between performance on MAP tests and the assigned proficiency category from the state test, information from the state test will have been collapsed into three to five categories. The correlations between RIT scores and these category assignments will always be substantially lower than if the correlations were based on RIT scores and scale scores.

Predictive validity evidence is expressed as the degree of relationship to performance on another test measuring achievement in the same domain (e.g., mathematics, reading) at some later point in time. This form of validity can be expressed as a Pearson correlation coefficient between the total domain-area RIT score and the total scale score of another established test. It answers the question, “How well do the scores from this test that reference this (RIT) scale in this subject area (e.g., reading) predict the scores obtained from an established test that references some other scale in the same subject area at a later point in time?” Both tests are administered to the same students several weeks apart, typically 12 to 36 weeks in the evidence reported here. Strong predictive validity is indicated when the correlations are in the low 0.80s. Correlations with non-NWEA tests that include more performance items requiring subjective scoring tend to be lower than correlations with non-NWEA tests consisting exclusively of multiple-choice items.

The criterion measure used for this series of analyses was the scaled score on the PARCC ELA assessment, taken by students in the sample during the Spring 2016 school term.

In addition to concurrent and predictive validity, validity evidence for MAP Growth K–2 also comes from the degree and stability of the relationship of RIT scores across multiple and extended periods of time, such as across school years. This type of evidence supports the construct validity of MAP Growth K–2 and the ability underlying the RIT scale. For Grade K, this construct validity evidence was based on MAP Growth K–2 scores from Grades K and 1. For Grade 1, this construct validity evidence was based on MAP Growth K–2 scores from Grade 1 and MAP Growth scores from Grade 2.

 

 

  2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Representation: New England, Middle Atlantic, East North Central, South Atlantic, Mountain. The sample for the study contained student records from a total of five states (Colorado, Illinois, New Jersey, New Mexico, and Rhode Island) and one federal district (District of Columbia), and thus had representation from all four U.S. Census regions.

Date:

Size: Table 5 summarizes the total number of students by state, region, and division.

Male: 50.80%
Female: 49.18%
Unknown: 0.02%
Other SES Indicators: Not Provided
Free or reduced-price lunch: Not Provided
White, Non-Hispanic: 40.86%
Black, Non-Hispanic: 6.21%
Hispanic: 22.92%
American Indian/Alaska Native: 2.33%
Asian/Pacific Islander: 9.05%
Multi-Ethnic: 3.39%
Not Specified or Other: 15.24%
Disability classification: Not Provided
First language: Not Provided
Language proficiency status: Not Provided

 

Table 5: Number of Students

Number of Students Per State
CO: 2,967
DC: 167
IL: 12,140
NJ: 643
NM: 208
RI: 209
Total: 16,334

Number of Students Per Region
Midwest: 12,140
Northeast: 852
South: 167
West: 3,175
Total: 16,334

Number of Students Per Division
East North Central: 12,140
Middle Atlantic: 643
Mountain: 3,175
New England: 209
South Atlantic: 167
Total: 16,334

 

 

  3. Description of the analysis procedures for each reported type of validity:

Predictive validity was estimated as the Pearson correlation coefficient between student RIT scores from a given term and the same students’ total scale score on the PARCC test administered in Spring 2016. The 95% confidence interval for concurrent and predictive validity coefficients was based on the standard 95% confidence interval for a Pearson correlation, using the Fisher z-transformation.

 

  4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Validity | Grade | Test or Criterion | N | Coefficient | Confidence Interval
Construct (Fall) | K | MAP Growth K-2: Spring 2013 | 2,998 | 0.73 | 0.71, 0.74
Construct (Fall) | K | MAP Growth K-2: Fall 2013 | 2,849 | 0.73 | 0.71, 0.75
Construct (Fall) | K | MAP Growth K-2: Winter 2014 | 2,749 | 0.69 | 0.67, 0.71
Construct (Fall) | K | MAP Growth K-2: Spring 2014 | 2,958 | 0.66 | 0.64, 0.68
Predictive (Fall) | K | PARCC ELA | 3,244 | 0.57 | 0.55, 0.60
Construct (Winter) | K | MAP Growth K-2: Spring 2013 | 3,306 | 0.81 | 0.79, 0.82
Construct (Winter) | K | MAP Growth K-2: Fall 2013 | 3,090 | 0.80 | 0.78, 0.81
Construct (Winter) | K | MAP Growth K-2: Winter 2014 | 3,053 | 0.76 | 0.74, 0.77
Construct (Winter) | K | MAP Growth K-2: Spring 2014 | 3,167 | 0.72 | 0.71, 0.74
Predictive (Winter) | K | PARCC ELA | 3,475 | 0.61 | 0.59, 0.63
Construct (Spring) | K | MAP Growth K-2: Fall 2013 | 3,426 | 0.83 | 0.82, 0.84
Construct (Spring) | K | MAP Growth K-2: Winter 2014 | 3,270 | 0.79 | 0.78, 0.80
Construct (Spring) | K | MAP Growth K-2: Spring 2014 | 3,586 | 0.76 | 0.75, 0.77
Predictive (Spring) | K | PARCC ELA | 3,828 | 0.63 | 0.61, 0.65
Construct (Fall) | 1 | MAP Growth K-2: Spring 2014 | 4,684 | 0.81 | 0.80, 0.82
Construct (Fall) | 1 | MAP Growth: Fall 2014 | 4,205 | 0.75 | 0.73, 0.76
Construct (Fall) | 1 | MAP Growth: Winter 2015 | 3,202 | 0.74 | 0.73, 0.76
Construct (Fall) | 1 | MAP Growth: Spring 2015 | 4,361 | 0.72 | 0.71, 0.74
Predictive (Fall) | 1 | PARCC ELA | 4,765 | 0.69 | 0.67, 0.70
Construct (Winter) | 1 | MAP Growth K-2: Spring 2014 | 4,631 | 0.87 | 0.86, 0.87
Construct (Winter) | 1 | MAP Growth: Fall 2014 | 4,181 | 0.78 | 0.77, 0.79
Construct (Winter) | 1 | MAP Growth: Winter 2015 | 3,237 | 0.79 | 0.77, 0.80
Construct (Winter) | 1 | MAP Growth: Spring 2015 | 4,261 | 0.76 | 0.75, 0.77
Predictive (Winter) | 1 | PARCC ELA | 4,682 | 0.71 | 0.69, 0.72
Construct (Spring) | 1 | MAP Growth: Fall 2014 | 4,462 | 0.80 | 0.79, 0.81
Construct (Spring) | 1 | MAP Growth: Winter 2015 | 3,359 | 0.80 | 0.79, 0.81
Construct (Spring) | 1 | MAP Growth: Spring 2015 | 4,658 | 0.78 | 0.77, 0.79
Predictive (Spring) | 1 | PARCC ELA | 5,124 | 0.73 | 0.71, 0.74

 

  5. Results for other forms of validity (e.g., factor analysis) not conducive to the table format:

Not Provided

 

  6. Describe the degree to which the provided data support the validity of the tool:

Predictive validity coefficients, for each grade and each time of year, were consistently large, demonstrating a strong relationship between the MAP Growth K–2 and PARCC assessments, across the grades and times of year reported.

 

 

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

 

Type of Validity

Subgroup

Grade

Test or Criterion

N

Coefficient

Confidence Interval

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2013

348

0.80

0.76, 0.84

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Fall 2013

351

0.81

0.77, 0.84

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Winter 2014

345

0.77

0.72, 0.81

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2014

350

0.73

0.68, 0.78

Predictive (Fall)

Ethnicity: Asian or Pacific Islander

K

PARCC ELA

368

0.58

0.50, 0.64

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2013

377

0.85

0.82, 0.88

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Fall 2013

378

0.85

0.81, 0.87

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Winter 2014

372

0.80

0.76, 0.83

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2014

377

0.77

0.72, 0.81

Predictive (Winter)

Ethnicity: Asian or Pacific Islander

K

PARCC ELA

394

0.60

0.53, 0.66

Construct (Spring)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Fall 2013

389

0.86

0.83, 0.88

Construct (Spring)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Winter 2014

381

0.82

0.78, 0.85

Construct (Spring)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2014

389

0.80

0.76, 0.83

Predictive (Spring)

Ethnicity: Asian or Pacific Islander

K

PARCC ELA

400

0.66

0.60, 0.71

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Spring 2013

189

0.58

0.47, 0.67

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Fall 2013

185

0.66

0.57, 0.74

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Winter 2014

181

0.60

0.50, 0.69

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Spring 2014

185

0.55

0.44, 0.64

Predictive (Fall)

Ethnicity: Black

K

PARCC ELA

256

0.53

0.43, 0.61

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Spring 2013

205

0.76

0.70, 0.82

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Fall 2013

196

0.74

0.67, 0.80

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Winter 2014

197

0.68

0.59, 0.75

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Spring 2014

198

0.66

0.57, 0.73

Predictive (Winter)

Ethnicity: Black

K

PARCC ELA

251

0.53

0.44, 0.62

Construct (Spring)

Ethnicity: Black

K

MAP Growth K-2: Fall 2013

207

0.73

0.66, 0.79

Construct (Spring)

Ethnicity: Black

K

MAP Growth K-2: Winter 2014

205

0.70

0.62, 0.76

Construct (Spring)

Ethnicity: Black

K

MAP Growth K-2: Spring 2014

210

0.68

0.59, 0.74

Predictive (Spring)

Ethnicity: Black

K

PARCC ELA

225

0.53

0.42, 0.61

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2013

758

0.69

0.65, 0.72

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Fall 2013

741

0.68

0.63, 0.71

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Winter 2014

726

0.63

0.58, 0.67

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2014

753

0.60

0.55, 0.65

Predictive (Fall)

Ethnicity: Hispanic

K

PARCC ELA

822

0.52

0.47, 0.57

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2013

826

0.79

0.76, 0.81

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Fall 2013

778

0.77

0.74, 0.80

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Winter 2014

758

0.74

0.71, 0.77

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2014

782

0.70

0.66, 0.73

Predictive (Winter)

Ethnicity: Hispanic

K

PARCC ELA

863

0.61

0.56, 0.65

Construct (Spring)

Ethnicity: Hispanic

K

MAP Growth K-2: Fall 2013

818

0.80

0.78, 0.83

Construct (Spring)

Ethnicity: Hispanic

K

MAP Growth K-2: Winter 2014

800

0.78

0.75, 0.80

Construct (Spring)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2014

831

0.75

0.71, 0.77

Predictive (Spring)

Ethnicity: Hispanic

K

PARCC ELA

901

0.59

0.55, 0.63

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Spring 2013

1,533

0.69

0.67, 0.72

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Fall 2013

1,414

0.70

0.67, 0.72

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Winter 2014

1,351

0.66

0.63, 0.69

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Spring 2014

1,513

0.64

0.61, 0.67

Predictive (Fall)

Ethnicity: White

K

PARCC ELA

1,605

0.54

0.50, 0.57

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Spring 2013

1,691

0.78

0.76, 0.79

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Fall 2013

1,554

0.77

0.75, 0.79

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Winter 2014

1,547

0.73

0.70, 0.75

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Spring 2014

1,625

0.69

0.66, 0.71

Predictive (Winter)

Ethnicity: White

K

PARCC ELA

1,744

0.56

0.52, 0.59

Construct (Spring)

Ethnicity: White

K

MAP Growth K-2: Fall 2013

1,813

0.81

0.80, 0.83

Construct (Spring)

Ethnicity: White

K

MAP Growth K-2: Winter 2014

1,698

0.77

0.75, 0.79

Construct (Spring)

Ethnicity: White

K

MAP Growth K-2: Spring 2014

1,947

0.73

0.71, 0.75

Predictive (Spring)

Ethnicity: White

K

PARCC ELA

2,062

0.59

0.56, 0.62

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Spring 2013

1,441

0.73

0.71, 0.76

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Fall 2013

1,368

0.75

0.73, 0.78

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Winter 2014

1,318

0.70

0.68, 0.73

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Spring 2014

1,422

0.67

0.64, 0.70

Predictive (Fall)

Gender: Female

K

PARCC ELA

1,585

0.56

0.53, 0.60

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Spring 2013

1,616

0.82

0.80, 0.83

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Fall 2013

1,501

0.81

0.79, 0.82

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Winter 2014

1,484

0.77

0.75, 0.79

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Spring 2014

1,539

0.74

0.72, 0.76

Predictive (Winter)

Gender: Female

K

PARCC ELA

1,712

0.61

0.58, 0.64

Construct (Spring)

Gender: Female

K

MAP Growth K-2: Fall 2013

1,648

0.85

0.84, 0.86

Construct (Spring)

Gender: Female

K

MAP Growth K-2: Winter 2014

1,576

0.81

0.79, 0.82

Construct (Spring)

Gender: Female

K

MAP Growth K-2: Spring 2014

1,733

0.78

0.76, 0.80

Predictive (Spring)

Gender: Female

K

PARCC ELA

1,864

0.65

0.62, 0.67

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Spring 2013

1,557

0.72

0.69, 0.74

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Fall 2013

1,481

0.71

0.68, 0.73

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Winter 2014

1,431

0.67

0.64, 0.70

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Spring 2014

1,536

0.66

0.63, 0.68

Predictive (Fall)

Gender: Male

K

PARCC ELA

1,659

0.58

0.55, 0.62

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Spring 2013

1,690

0.79

0.78, 0.81

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Fall 2013

1,589

0.79

0.77, 0.80

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Winter 2014

1,569

0.74

0.72, 0.76

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Spring 2014

1,628

0.70

0.68, 0.73

Predictive (Winter)

Gender: Male

K

PARCC ELA

1,763

0.61

0.58, 0.64

Construct (Spring)

Gender: Male

K

MAP Growth K-2: Fall 2013

1,778

0.81

0.79, 0.82

Construct (Spring)

Gender: Male

K

MAP Growth K-2: Winter 2014

1,694

0.77

0.75, 0.79

Construct (Spring)

Gender: Male

K

MAP Growth K-2: Spring 2014

1,853

0.74

0.72, 0.76

Predictive (Spring)

Gender: Male

K

PARCC ELA

1,964

0.61

0.59, 0.64

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth K-2: Spring 2014

538

0.84

0.81, 0.86

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Fall 2014

524

0.81

0.78, 0.84

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Winter 2015

447

0.77

0.73, 0.81

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Spring 2015

532

0.77

0.73, 0.80

Predictive (Fall)

Ethnicity: Asian or Pacific Islander

1

PARCC ELA

543

0.71

0.67, 0.75

Construct (Winter)

Ethnicity: Asian or Pacific Islander

1

MAP Growth K-2: Spring 2014

543

0.88

0.85, 0.89

Construct (Winter)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Fall 2014

528

0.81

0.78, 0.84

Construct (Winter)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Winter 2015

455

0.79

0.76, 0.83

Predictive (Winter)

Ethnicity: Asian or Pacific Islander

1

PARCC ELA

533

0.79

0.76, 0.82

Predictive (Winter)

Ethnicity: Asian or Pacific Islander

1

PARCC ELA

544

0.70

0.66, 0.74

Construct (Spring)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Fall 2014

552

0.83

0.81, 0.86

Construct (Spring)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Winter 2015

475

0.81

0.78, 0.84

Construct (Spring)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Spring 2015

558

0.80

0.77, 0.83

Predictive (Spring)

Ethnicity: Asian or Pacific Islander

1

PARCC ELA

569

0.72

0.68, 0.76

Construct (Fall)

Ethnicity: Black

1

MAP Growth K-2: Spring 2014

280

0.75

0.69, 0.80

Construct (Fall)

Ethnicity: Black

1

MAP Growth: Fall 2014

262

0.71

0.65, 0.77

Construct (Fall)

Ethnicity: Black

1

MAP Growth: Spring 2015

265

0.63

0.55, 0.70

Predictive (Fall)

Ethnicity: Black

1

PARCC ELA

290

0.63

0.55, 0.69

Construct (Winter)

Ethnicity: Black

1

MAP Growth K-2: Spring 2014

278

0.83

0.79, 0.87

Construct (Winter)

Ethnicity: Black

1

MAP Growth: Fall 2014

254

0.76

0.70, 0.81

Predictive (Winter)

Ethnicity: Black

1

PARCC ELA

257

0.73

0.67, 0.78

Predictive (Winter)

Ethnicity: Black

1

PARCC ELA

281

0.68

0.61, 0.74

Construct (Spring)

Ethnicity: Black

1

MAP Growth: Fall 2014

278

0.77

0.71, 0.81

Construct (Spring)

Ethnicity: Black

1

MAP Growth: Spring 2015

279

0.79

0.74, 0.83

Predictive (Spring)

Ethnicity: Black

1

PARCC ELA

309

0.69

0.63, 0.75

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth K-2: Spring 2014

1,025

0.79

0.76, 0.81

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth: Fall 2014

889

0.72

0.68, 0.75

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth: Winter 2015

622

0.71

0.67, 0.75

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth: Spring 2015

925

0.69

0.65, 0.72

Predictive (Fall)

Ethnicity: Hispanic

1

PARCC ELA

1,044

0.65

0.61, 0.68

Construct (Winter)

Ethnicity: Hispanic

1

MAP Growth K-2: Spring 2014

1,028

0.85

0.83, 0.87

Construct (Winter)

Ethnicity: Hispanic

1

MAP Growth: Fall 2014

894

0.77

0.74, 0.80

Construct (Winter)

Ethnicity: Hispanic

1

MAP Growth: Winter 2015

627

0.76

0.72, 0.79

Predictive (Winter)

Ethnicity: Hispanic

1

PARCC ELA

916

0.74

0.70, 0.76

Predictive (Winter)

Ethnicity: Hispanic

1

PARCC ELA

1,039

0.68

0.64, 0.71

Construct (Spring)

Ethnicity: Hispanic

1

MAP Growth: Fall 2014

919

0.78

0.76, 0.81

Construct (Spring)

Ethnicity: Hispanic

1

MAP Growth: Winter 2015

640

0.76

0.73, 0.79

Construct (Spring)

Ethnicity: Hispanic

1

MAP Growth: Spring 2015

959

0.75

0.73, 0.78

Predictive (Spring)

Ethnicity: Hispanic

1

PARCC ELA

1,090

0.70

0.66, 0.73

Construct (Fall)

Ethnicity: Multi-Ethnic

1

MAP Growth K-2: Spring 2014

172

0.82

0.77, 0.87

Construct (Fall)

Ethnicity: Multi-Ethnic

1

MAP Growth: Fall 2014

157

0.75

0.67, 0.81

Construct (Fall)

Ethnicity: Multi-Ethnic

1

MAP Growth: Spring 2015

160

0.69

0.60, 0.76

Predictive (Fall)

Ethnicity: Multi-Ethnic

1

PARCC ELA

178

0.64

0.54, 0.72

Construct (Winter)

Ethnicity: Multi-Ethnic

1

MAP Growth K-2: Spring 2014

166

0.87

0.83, 0.90

Predictive (Winter)

Ethnicity: Multi-Ethnic

1

PARCC ELA

151

0.81

0.74, 0.86

Predictive (Winter)

Ethnicity: Multi-Ethnic

1

PARCC ELA

167

0.68

0.59, 0.76

Construct (Spring)

Ethnicity: Multi-Ethnic

1

MAP Growth: Fall 2014

160

0.87

0.82, 0.90

Construct (Spring)

Ethnicity: Multi-Ethnic

1

MAP Growth: Spring 2015

162

0.83

0.78, 0.87

Predictive (Spring)

Ethnicity: Multi-Ethnic

1

PARCC ELA

184

0.74

0.67, 0.80

Construct (Fall)

Ethnicity: White

1

MAP Growth K-2: Spring 2014

2,503

0.78

0.77, 0.80

Construct (Fall)

Ethnicity: White

1

MAP Growth: Fall 2014

2,286

0.72

0.70, 0.74

Construct (Fall)

Ethnicity: White

1

MAP Growth: Winter 2015

1,833

0.71

0.69, 0.73

Construct (Fall)

Ethnicity: White

1

MAP Growth: Spring 2015

2,394

0.70

0.68, 0.72

Predictive (Fall)

Ethnicity: White

1

PARCC ELA

2,534

0.65

0.62, 0.67

Construct (Winter)

Ethnicity: White

1

MAP Growth K-2: Spring 2014

2,454

0.85

0.84, 0.86

Construct (Winter)

Ethnicity: White

1

MAP Growth: Fall 2014

2,273

0.76

0.74, 0.78

Construct (Winter)

Ethnicity: White

1

MAP Growth: Winter 2015

1,866

0.76

0.74, 0.78

Predictive (Winter)

Ethnicity: White

1

PARCC ELA

2,325

0.73

0.71, 0.75

Predictive (Winter)

Ethnicity: White

1

PARCC ELA

2,484

0.67

0.65, 0.69

Construct (Spring)

Ethnicity: White

1

MAP Growth: Fall 2014

2,459

0.78

0.76, 0.79

Construct (Spring)

Ethnicity: White

1

MAP Growth: Winter 2015

1,930

0.78

0.76, 0.79

Construct (Spring)

Ethnicity: White

1

MAP Growth: Spring 2015

2,603

0.75

0.73, 0.77

Predictive (Spring)

Ethnicity: White

1

PARCC ELA

2,783

0.69

0.67, 0.71

Construct (Fall)

Gender: Female

1

MAP Growth K-2: Spring 2014

2,287

0.83

0.82, 0.85

Construct (Fall)

Gender: Female

1

MAP Growth: Fall 2014

2,071

0.77

0.75, 0.78

Construct (Fall)

Gender: Female

1

MAP Growth: Winter 2015

1,586

0.77

0.74, 0.79

Construct (Fall)

Gender: Female

1

MAP Growth: Spring 2015

2,134

0.73

0.71, 0.75

Predictive (Fall)

Gender: Female

1

PARCC ELA

2,335

0.71

0.69, 0.73

Construct (Winter)

Gender: Female

1

MAP Growth K-2: Spring 2014

2,275

0.88

0.87, 0.89

Construct (Winter)

Gender: Female

1

MAP Growth: Fall 2014

2,062

0.79

0.77, 0.81

Construct (Winter)

Gender: Female

1

MAP Growth: Winter 2015

1,600

0.80

0.78, 0.82

Construct (Winter)

Gender: Female

1

MAP Growth: Spring 2015

2,096

0.77

0.76, 0.79

Predictive (Winter)

Gender: Female

1

PARCC ELA

2,302

0.72

0.70, 0.74

Construct (Spring)

Gender: Female

1

MAP Growth: Fall 2014

2,198

0.81

0.80, 0.83

Construct (Spring)

Gender: Female

1

MAP Growth: Winter 2015

1,658

0.82

0.80, 0.83

Construct (Spring)

Gender: Female

1

MAP Growth: Spring 2015

2,286

0.80

0.78, 0.81

Predictive (Spring)

Gender: Female

1

PARCC ELA

2,513

0.74

0.72, 0.76

Construct (Fall)

Gender: Male

1

MAP Growth K-2: Spring 2014

2,397

0.79

0.77, 0.80

Construct (Fall)

Gender: Male

1

MAP Growth: Fall 2014

2,134

0.73

0.71, 0.75

Construct (Fall)

Gender: Male

1

MAP Growth: Winter 2015

1,616

0.72

0.70, 0.74

Construct (Fall)

Gender: Male

1

MAP Growth: Spring 2015

2,227

0.71

0.69, 0.73

Predictive (Fall)

Gender: Male

1

PARCC ELA

2,430

0.66

0.64, 0.68

Construct (Winter)

Gender: Male

1

MAP Growth K-2: Spring 2014

2,355

0.86

0.85, 0.87

Construct (Winter)

Gender: Male

1

MAP Growth: Fall 2014

2,118

0.77

0.75, 0.79

Construct (Winter)

Gender: Male

1

MAP Growth: Winter 2015

1,637

0.77

0.75, 0.79

Construct (Winter)

Gender: Male

1

MAP Growth: Spring 2015

2,164

0.75

0.73, 0.77

Predictive (Winter)

Gender: Male

1

PARCC ELA

2,379

0.69

0.67, 0.71

Construct (Spring)

Gender: Male

1

MAP Growth: Fall 2014

2,263

0.79

0.77, 0.80

Construct (Spring)

Gender: Male

1

MAP Growth: Winter 2015

1,701

0.78

0.76, 0.80

Construct (Spring)

Gender: Male

1

MAP Growth: Spring 2015

2,371

0.77

0.75, 0.78

Predictive (Spring)

Gender: Male

1

PARCC ELA

2,610

0.71

0.69, 0.73

 

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format:

Not Provided

 

Sample Representativeness

| Grade | K | 1 |
|---|---|---|
| Rating | Full bubble | Full bubble |

Primary Classification Accuracy Sample

 

Representation

The sample represented five U.S. Census divisions: New England, Middle Atlantic, East North Central, South Atlantic, and Mountain. It contained student records from five states (Colorado, Illinois, New Jersey, New Mexico, and Rhode Island) and one federal district (District of Columbia), and thus had representation from all four U.S. Census regions.

Date

Partnership for Assessment of Readiness for College and Careers (PARCC) data were based on students in Grade 3 who took the PARCC assessment during Spring 2016. The Spring 2016 PARCC administration spanned March 2016 through June 2016.

MAP Growth K–2 data were obtained for this sample of Grade 3 students taking PARCC in Spring 2016. Specifically, for these students, MAP Growth K–2 scores from previous years were retrieved: their data from Fall 2012, Winter 2013, and Spring 2013 served as their Grade K scores; their data from Fall 2013, Winter 2014, and Spring 2014 served as their Grade 1 scores; and their data from Fall 2014, Winter 2015, and Spring 2015 served as their Grade 2 scores.

Thus, the data used were based on a sample of third-grade students taking the PARCC assessment in Spring 2016 and their MAP Growth K–2 scores from previous years, when they were in Grades K, 1, and 2.

Size

Table 2 summarizes the total number of students by grade, state, region, and division.

Male: 50.80%
Female: 49.18%
Unknown: 0.02%

Other SES Indicators: Not Provided
Free or reduced-price lunch: Not Provided

White, Non-Hispanic: 40.86%
Black, Non-Hispanic: 6.21%
Hispanic: 22.92%
American Indian/Alaska Native: 2.33%
Asian/Pacific Islander: 9.05%
Multi-Ethnic: 3.39%
Not Specified or Other: 15.24%

Disability classification: Not Provided
First language: Not Provided
Language proficiency status: Not Provided

 

Table 2: Number of Students

| State, Region, or Division | N |
|---|---|
| Number of Students Per State | |
| CO | 2,967 |
| DC | 167 |
| IL | 12,140 |
| NJ | 643 |
| NM | 208 |
| RI | 209 |
| Total | 16,334 |
| Number of Students Per Region | |
| Midwest | 12,140 |
| Northeast | 852 |
| South | 167 |
| West | 3,175 |
| Total | 16,334 |
| Number of Students Per Division | |
| East North Central | 12,140 |
| Middle Atlantic | 643 |
| Mountain | 3,175 |
| New England | 209 |
| South Atlantic | 167 |
| Total | 16,334 |

 

Bias Analysis Conducted

| Grade | K | 1 |
|---|---|---|
| Rating | Yes | Yes |

  1. Description of the method used to determine the presence or absence of bias:

Once tests have been administered and results collected, analyses to detect differential item functioning (DIF) may be conducted. The method NWEA uses to detect DIF is based on the work of Linacre and Wright[1], as implemented by Linacre[2]. When executed as part of a Winsteps analysis, this method entails:

  a. Carrying out a joint Rasch analysis of all person-group classifications that anchors all student abilities and item difficulties to a common (theta) scale.
  b. Carrying out a calibration analysis for the Reference group, keeping the student ability estimates and scale structure anchored, to produce Reference group item difficulty estimates.
  c. Carrying out a calibration analysis for the Focal group, keeping the student ability estimates and scale structure anchored, to produce Focal group item difficulty estimates.
  d. Computing pair-wise item difficulty differences (Focal group difficulty minus Reference group difficulty). The calibration analyses in steps b and c are computed for each item as though all items, except the item currently targeted, are anchored at the joint calibration run (step a).

Ideally, analyzing items for DIF would be incorporated within the item calibration process. This can prove to be a useful initial screen to identify items that should be subjected to heightened surveillance for DIF. However, the number of responses to an item by members of demographic groups of interest may well be insufficient to yield stable calibration estimates at the group level. This can introduce statistical artifacts as well as Type I errors into DIF analyses. To avoid this, data for analyses are taken from responses to operational tests.
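At its core, step d above is an anchored difference of item difficulty estimates with an approximate significance check. The following minimal sketch illustrates that computation with invented difficulty and standard-error values (not actual Winsteps output):

```python
import math

# Hypothetical anchored item-difficulty estimates (in logits) and standard
# errors for one item, as they might be read from separate reference- and
# focal-group calibration runs; the values are invented for illustration.
ref = {"difficulty": -0.12, "se": 0.03}   # Reference group
foc = {"difficulty": 0.41, "se": 0.06}    # Focal group

# Step d: pair-wise DIF contrast = Focal group difficulty minus Reference group difficulty.
dif_contrast = foc["difficulty"] - ref["difficulty"]

# An approximate significance test combines the two standard errors;
# |t| greater than roughly 1.96 corresponds to p < .05 for large samples.
se_diff = math.sqrt(ref["se"] ** 2 + foc["se"] ** 2)
t_stat = dif_contrast / se_diff

print(f"DIF contrast = {dif_contrast:+.2f} logits, t = {t_stat:.2f}")
```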

 

 

  2. Description of the subgroups for which bias analyses were conducted:

Each test record included the student’s recorded ethnic group membership (Native American, Asian, African American, Hispanic, and European/Anglo American).  

 

  3. Description of the results of the bias analyses conducted, including data and interpretative statements:

The DIF analysis for MAP Growth K–2 includes reading test events that were administered during the Spring and Fall terms of 2010 in six states and were retrieved from the NWEA Growth Research Database (GRD). The six states included Colorado, Illinois, Michigan, New Mexico, South Carolina, and Washington. Each assessment record included the student’s ethnic group membership (Native American, Asian/Pacific Islander, African American, Hispanic, European/Anglo American, and Multi-Racial), the student’s gender, and responses to sixty items administered across two tests per content area. The number of students and the number of test items are provided in Table 6.

Table 6: Numbers of Students and Test Items Included in the DIF Analysis.

| Content Area | Items | Students | Ethnic Group | % of Students |
|---|---|---|---|---|
| Reading | 1,550 | 250,734 | Native American | 2.5 |
| | | | Asian/Pacific Islander | 4.1 |
| | | | African American | 22 |
| | | | Hispanic | 15.3 |
| | | | European/Anglo American | 53.6 |
| | | | Multi-Racial | 2.4 |
 

Winsteps (version 3.72.0) was used to carry out the analysis. The numbers of items exhibiting DIF for each ethnic focal group are reported in Table 7. Similarly, the numbers of reading items exhibiting gender-specific DIF are reported in Table 8. The item counts in these tables are based on a minimum of 500 student responses for each group and a significance criterion of p < .05, ensuring that each comparison had adequate power to detect DIF and that the reported DIF was statistically significant. The Educational Testing Service (ETS) delta method of categorizing DIF is included in the tables (ETSClass). The delta method differentiates among graduated levels of DIF (negligible DIF: difference < 0.43 logits; moderate DIF: difference ≥ 0.43 and < 0.64 logits; severe DIF: difference ≥ 0.64 logits). For categories B and C, there is a further breakdown using “+” (indicating DIF is against the reference group) and “-” (indicating DIF is against the focal group).
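The ETS delta categorization described above amounts to a simple classification rule. The sketch below uses the logit thresholds from the text; the sign convention (a positive focal-minus-reference contrast counted as DIF against the focal group) is an assumption for illustration and may differ from the convention used in the actual analysis:

```python
def ets_class(dif_contrast: float) -> str:
    """Assign an ETS-style DIF category from a focal-minus-reference
    difficulty difference in logits (thresholds as described above)."""
    magnitude = abs(dif_contrast)
    if magnitude < 0.43:
        return "A"                               # negligible DIF
    level = "B" if magnitude < 0.64 else "C"     # moderate vs. severe DIF
    # Assumed convention: a positive contrast means the item is harder for the
    # focal group (DIF against the focal group, "-"); negative means DIF
    # against the reference group ("+").
    return level + ("-" if dif_contrast > 0 else "+")

# Examples
for d in (0.10, 0.50, -0.55, 0.70, -0.90):
    print(f"{d:+.2f} logits -> {ets_class(d)}")
```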


 

Table 7: Differential Item Functioning for MAP Reading Items (N = 1,550)

Reference is European/Anglo Americans

| Focal Group | ETSClass* | N Items** | % of Items |
|---|---|---|---|
| Native American | A | 21 | 61.8% |
| | B- | 7 | 20.6% |
| | B+ | 3 | 8.8% |
| | C- | 3 | 8.8% |
| | C+ | 0 | 0.0% |
| Asian/Pacific Islander | A | 77 | 75.5% |
| | B- | 7 | 6.9% |
| | B+ | 10 | 9.8% |
| | C- | 4 | 3.9% |
| | C+ | 4 | 3.9% |
| African American | A | 491 | 83.4% |
| | B- | 48 | 8.1% |
| | B+ | 35 | 5.9% |
| | C- | 8 | 1.4% |
| | C+ | 7 | 1.2% |
| Hispanic | A | 319 | 86.4% |
| | B- | 39 | 10.6% |
| | B+ | 5 | 1.4% |
| | C- | 6 | 1.6% |
| | C+ | 0 | 0.0% |
| Multi-Racial | A | 7 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |

Reference is Base Calibration (All Students)

| Focal Group | ETSClass* | N Items | % of Items |
|---|---|---|---|
| Native American | A | 37 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |
| Asian/Pacific Islander | A | 109 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |
| African American | A | 613 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |
| Hispanic | A | 412 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |
| Multi-Racial | A | 9 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |
| Anglo American | A | 608 | 100.0% |
| | B- | 0 | 0.0% |
| | B+ | 0 | 0.0% |
| | C- | 0 | 0.0% |
| | C+ | 0 | 0.0% |

*A=|DIF|<.43 logits; B=.43 logits ≤|DIF|<.64 logits; C=|DIF|≥.64 logits; B- and C- = DIF is against the Focal group; B+ and C+ = DIF is against the Reference group

**The number of items with 500 or more responses from the group where p < .05

 

 

Table 8: Differential Item Functioning Related to Gender for MAP Growth by Content Area

| Content Area | ETSClass* | N Items** | % of Items |
|---|---|---|---|
| Reading (N = 1,550) | A | 555 | 95.9% |
| | B- | 8 | 1.4% |
| | B+ | 7 | 1.2% |
| | C- | 4 | 0.7% |
| | C+ | 5 | 0.9% |

*A=|DIF|<.43 logits; B=.43 logits ≤|DIF|<.64 logits; C=|DIF|≥.64 logits; B- and C- = DIF is against the Focal group (female); B+ and C+ = DIF is against the Reference group (male)

**The number of items with 500 or more responses from the group where p < .05

 

Table 7 indicates negligible, if any, DIF between the base calibration and each of the focal groups for reading. There is noticeable DIF at both the B and C ETSClass levels when the reference group is Anglo Americans and the focal group is Native Americans; however, given the relatively few items in common between these two groups and the limited representation of Native American students in the sample, this finding is of limited concern. DIF between the remaining focal groups and the reference group (Anglo Americans) is fairly evenly distributed at the same ETSClass levels against both the focal and reference groups. Table 8 indicates minimal DIF related to gender, with 95.9% of the items at level A (negligible).

Actions taken. All items found to exhibit moderate DIF are subjected to an additional review by NWEA Content Specialists to identify the source(s) of the differential functioning. For each item, these specialists make a judgment to 1) remove the item from the item bank, 2) revise the item and resubmit it for field testing, or 3) retain the item as is. Items exhibiting severe DIF are removed from the bank. These procedures are consistent with, and extend, periodic Item Quality Reviews, which remove problem items or flag them for revision and re-field-testing.
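As an illustration only (not NWEA's actual review workflow, which rests on Content Specialist judgment), the triage implied by these procedures can be summarized as:

```python
def item_action(ets_class: str) -> str:
    """Map an ETS DIF category to the follow-up described above
    (hypothetical helper; the real decision is made by reviewers)."""
    if ets_class.startswith("C"):
        return "remove from item bank"               # severe DIF
    if ets_class.startswith("B"):
        return "route to Content Specialist review"  # moderate DIF: remove, revise, or retain
    return "retain as is"                            # negligible DIF

print(item_action("C-"), item_action("B+"), item_action("A"), sep=" | ")
```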



[1] Linacre, J. M., & Wright, B. D. (1989). Mantel-Haenszel DIF and PROX are equivalent. Rasch Measurement Transactions, 3(2), 52–53.

[2] Linacre, J. M. (2012). Winsteps-Ministep: Rasch Model Computer Programs, Version 3.75.0, www.winsteps.com.

 

Administration Format

| Grade | K | 1 |
|---|---|---|
| Data | Individual | Individual |

Administration & Scoring Time

| Grade | K | 1 |
|---|---|---|
| Data | 45 minutes | 45 minutes |

Scoring Format

| Grade | K | 1 |
|---|---|---|
| Data | Automatic | Automatic |

Types of Decision Rules

| Grade | K | 1 |
|---|---|---|
| Data | None | None |

Evidence Available for Multiple Decision Rules

| Grade | K | 1 |
|---|---|---|
| Data | No | No |