MAP Growth K-2

Mathematics

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

MAP Growth K–2 Mathematics annual per-student subscription fees range from $7.00 to $9.50. A bundled assessment suite of Mathematics and Reading tests starts at $13.50 per student. Discounts are available based on volume and other factors.

 

Replacement Cost:

Subscription renewal fees are subject to change annually.

 

Included in Cost:

Annual subscription fees include the following:

  • Full Assessment Suite: MAP Growth K–2 assessments can be administered up to four times per calendar year. The abbreviated Screening Assessment may be administered once a year for placement purposes. The license also includes access to twenty-eight Skills Checklist tests that provide information about specific skills and concepts (e.g., number sense and computation).
  • Robust Reporting: All results from MAP Growth K–2 assessments (including RIT scale scores, proficiency projections, and status and growth norms) are available in a variety of views and formats through MAP Growth’s comprehensive suite of reports.
  • Learning Continuum: Dynamic reporting of learning statements, specifically aligned to the applicable state standards, provides information about what each student is ready to learn.
  • System of Support: A full system of support is provided to enable the success of MAP Growth partners, including technical support; implementation support through the first test administration; and ongoing, dedicated account management for the duration of the partnership.
  • NWEA Professional Learning Online: Access to this online learning portal offers on-demand tutorials, webinars, courses, and videos to supplement professional learning plans and help educators use MAP Growth to improve teaching and learning.

 

NWEA offers a portfolio of flexible, customizable professional learning and training options to meet the needs of partners. Please contact NWEA via https://www.nwea.org/sales-information/ for specific details on pricing.

 

Technology Requirements:

  • Computer or tablet
  • Internet connection

 

Training Requirements:

  • 1–4 hours of training

 

Qualified Administrators:

Examiners should meet the same qualifications as a teaching paraprofessional; examiners should complete all necessary training related to administering an assessment.

 

Accommodations:

MAP Growth assessments incorporate universal design principles for greater accessibility. This means that all content areas are created considering universal design and accessibility standards from the start. For example, alternative text descriptions (alt-tags) for images are an important feature on a website to provide access to those using screen readers. Alt-tags provide descriptions of pictures, charts, graphs, etc., to those who may not be able to see the information. Laying this foundation promotes accessibility for students using various accommodations.

 

Following national standards, such as the Web Content Accessibility Guidelines (WCAG) 2.0 and Accessible Rich Internet Applications (ARIA), helps to guide the creation of MAP Growth assessments.

With support from the WGBH National Center for Accessible Media (NCAM), NWEA has created detailed and thorough guidelines for describing many variations of images, charts, and graphics targeted specifically to mathematics and reading. The guidelines review concepts such as item integrity, fairness, and the unique challenges image description writers face in the context of assessment. These guidelines result in consistent, user-friendly, and valid image descriptions that support the use of screen readers.

 

MAP Growth K–2 assessments include built-in human audio support and other interactive features for early learners. The purpose of providing human voice audio is to address specific content areas within mathematics. Therefore, the audio is strategically placed within the item to maintain content validity.

 

This assessment does not include many of the accessibility features in MAP Growth for grades two and above because introducing new assistive technology at the K–2 level makes it unclear what is being tested: the student’s use of the new technology or the content of the item.

 

Tools are made available for all students on the assessment. These tools are embedded into the user interface for each item and are at the appropriate test level. Tools are not specific to a certain population but will be available to all users whenever necessary so that students can use these tools during their testing experience.

Where to Obtain:

Website: www.nwea.org

Address: 121 NW Everett Street, Portland, OR 97209

Phone number: (503) 624-1951

Please contact NWEA via https://www.nwea.org/sales-information/ for service and support questions.

Access to Technical Support:

Toll-free telephone support, online support, website knowledge base, and live chat support are available. 

MAP Growth K–2 assessments are used across the country for multiple purposes, including as universal screening tools in response to intervention (RTI) programs.

 

MAP Growth K–2 assessments can serve as universal screeners for identifying students at risk of poor academic outcomes in mathematics. They give educators insight into the instructional needs of all students, whether they are performing at, above, or below grade level.

 

MAP Growth K–2 assessments contain appropriate items for students who are still acquiring the skills needed to read independently.

 

MAP Growth K–2 assessments are computer adaptive tests with a cross-grade vertical scale that assess achievement according to standards-aligned content. Scores from repeated administrations measure growth over time. MAP Growth K–2 tests can be administered four times per calendar year.

 

MAP Growth and MAP Growth K–2 are scaled across grades. The Rasch model, an item response theory (IRT) model commonly employed in K–12 assessment programs, was used to create the scales for MAP Growth and MAP Growth K–2 assessments. These scales have been named RIT scales (for Rasch Unit).

 

 

Assessment Format:

  • Direct: Computerized

 

Administration Time:

  • 45 minutes per student, per subject

 

Scoring Time:

  • Scoring is automatic

 

Scoring Method:

MAP Growth K–2 scores are not based on raw scores because the tests are adaptive. Instead, the difficulty of the items a student answers is used to derive the student’s scale score. During the assessment, a Bayesian scoring algorithm is used to inform item selection. Bayesian scoring for item selection prevents artificially dramatic fluctuations in the student achievement estimate at the beginning of the test, which can occur with other scoring algorithms. Although Bayesian scoring works well as a procedure for selecting items during test administration, Bayesian scores are not appropriate for the calculation of final student achievement scores. This is because Bayesian scoring uses information other than the student’s responses to questions (such as past performance) to calculate the achievement estimate. Since only the student’s performance today should be used to give the student’s current score, a maximum-likelihood algorithm is used to calculate a student’s actual score at the completion of the test.
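The two-stage scoring described above can be illustrated with a minimal sketch. It assumes a dichotomous Rasch model on the logit scale, a normal prior for the Bayesian (expected a posteriori) estimate, and hypothetical item difficulties and responses. The function names, prior, and data are illustrative assumptions, not NWEA's operational algorithm, which may differ.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses, difficulties, prior_mean=0.0, prior_sd=1.0):
    """Bayesian (EAP) ability estimate on a fixed grid.

    The prior (an assumption here; it could encode past performance)
    pulls the estimate toward prior_mean, which damps early swings.
    """
    grid = [-6.0 + 0.01 * i for i in range(1201)]
    num = den = 0.0
    for theta in grid:
        weight = math.exp(-0.5 * ((theta - prior_mean) / prior_sd) ** 2)
        for x, b in zip(responses, difficulties):
            p = rasch_p(theta, b)
            weight *= p if x == 1 else 1.0 - p
        num += theta * weight
        den += weight
    return num / den

def mle_estimate(responses, difficulties, iterations=50):
    """Maximum-likelihood ability estimate via Newton-Raphson.

    Uses only the responses and item difficulties, with no prior.
    """
    raw = sum(responses)
    if raw == 0 or raw == len(responses):
        raise ValueError("MLE is undefined for all-correct or all-incorrect patterns")
    theta = 0.0
    for _ in range(iterations):
        ps = [rasch_p(theta, b) for b in difficulties]
        gradient = sum(x - p for x, p in zip(responses, ps))
        information = sum(p * (1.0 - p) for p in ps)
        step = gradient / information
        theta += step
        if abs(step) < 1e-8:
            break
    return theta

# Hypothetical item difficulties (logits) and scored responses (1 = correct).
difficulties = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
responses = [1, 1, 1, 0, 1, 0]
interim = eap_estimate(responses, difficulties)  # would guide item selection
final = mle_estimate(responses, difficulties)    # would be reported as the score
print(f"interim (Bayesian) = {interim:.2f}, final (ML) = {final:.2f}")
```

In this sketch the Bayesian estimate sits closer to the prior mean than the maximum-likelihood estimate does, which is the stabilizing behavior the paragraph attributes to Bayesian scoring during item selection.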

 

Scores Generated:

  • Percentile score       
  • IRT-based score       
  • Developmental benchmarks
  • Developmental cut points
  • Composite scores
  • Subscale/subtest scores

 

 

 

Classification Accuracy

| Grade | K | 1 |
| --- | --- | --- |
| Criterion 1 Fall | Half-filled bubbled | Full bubbled |
| Criterion 1 Winter | Half-filled bubbled | Full bubbled |
| Criterion 1 Spring | Full bubbled | Full bubbled |
| Criterion 2 Fall | dash | dash |
| Criterion 2 Winter | dash | dash |
| Criterion 2 Spring | dash | dash |

Primary Sample

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 136, PARCC < 700 | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.07 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.14 | 0.15 |
| False Negative Rate | 0.40 | 0.24 |
| Sensitivity | 0.60 | 0.76 |
| Specificity | 0.86 | 0.85 |
| Positive Predictive Power | 0.25 | 0.29 |
| Negative Predictive Power | 0.97 | 0.98 |
| Overall Classification Rate | 0.84 | 0.85 |
| Area Under the Curve (AUC) | 0.85 | 0.90 |
| AUC 95% Confidence Interval Lower | 0.82 | 0.89 |
| AUC 95% Confidence Interval Upper | 0.87 | 0.91 |
| At 90% Sensitivity, specificity equals | 0.50 | 0.68 |
| At 80% Sensitivity, specificity equals | 0.71 | 0.85 |
| At 70% Sensitivity, specificity equals | 0.84 | 0.93 |
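The predictive power and overall classification figures reported above follow from sensitivity, specificity, and the base rate by Bayes' rule. The sketch below, with function and variable names chosen here purely for illustration, recomputes them from the rounded Fall Grade K values (sensitivity 0.60, specificity 0.86, intensive-intervention base rate 0.07). Because the inputs are rounded to two decimals, the results can differ from the published values in the last digit.

```python
def predictive_values(sensitivity, specificity, base_rate):
    """Return (positive predictive power, negative predictive power,
    overall classification rate) implied by Bayes' rule."""
    true_pos = sensitivity * base_rate
    false_neg = (1.0 - sensitivity) * base_rate
    true_neg = specificity * (1.0 - base_rate)
    false_pos = (1.0 - specificity) * (1.0 - base_rate)
    ppp = true_pos / (true_pos + false_pos)
    npp = true_neg / (true_neg + false_neg)
    overall = true_pos + true_neg
    return ppp, npp, overall

# Rounded Fall, Grade K inputs taken from the table above.
ppp, npp, overall = predictive_values(0.60, 0.86, 0.07)
print(f"PPP={ppp:.2f} NPP={npp:.2f} overall={overall:.2f}")
```

The calculation also shows why positive predictive power stays low even with reasonable sensitivity and specificity: with a base rate near 0.07, students who are not at risk greatly outnumber those who are, so false positives make up most of the flagged group.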

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.08 | 0.08 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.15 | 0.15 |
| False Negative Rate | 0.33 | 0.18 |
| Sensitivity | 0.67 | 0.82 |
| Specificity | 0.85 | 0.86 |
| Positive Predictive Power | 0.28 | 0.32 |
| Negative Predictive Power | 0.97 | 0.98 |
| Overall Classification Rate | 0.84 | 0.85 |
| Area Under the Curve (AUC) | 0.87 | 0.91 |
| AUC 95% Confidence Interval Lower | 0.85 | 0.90 |
| AUC 95% Confidence Interval Upper | 0.89 | 0.93 |
| At 90% Sensitivity, specificity equals | 0.60 | 0.72 |
| At 80% Sensitivity, specificity equals | 0.75 | 0.89 |
| At 70% Sensitivity, specificity equals | 0.89 | 0.94 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.07 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.15 | 0.15 |
| False Negative Rate | 0.28 | 0.16 |
| Sensitivity | 0.73 | 0.84 |
| Specificity | 0.85 | 0.85 |
| Positive Predictive Power | 0.27 | 0.30 |
| Negative Predictive Power | 0.98 | 0.99 |
| Overall Classification Rate | 0.84 | 0.85 |
| Area Under the Curve (AUC) | 0.88 | 0.92 |
| AUC 95% Confidence Interval Lower | 0.86 | 0.91 |
| AUC 95% Confidence Interval Upper | 0.89 | 0.93 |
| At 90% Sensitivity, specificity equals | 0.62 | 0.73 |
| At 80% Sensitivity, specificity equals | 0.79 | 0.89 |
| At 70% Sensitivity, specificity equals | 0.90 | 0.94 |
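The AUC values and the "At 90% Sensitivity, specificity equals" style rows reported above can be derived from paired screener scores and criterion outcomes. The sketch below shows one way to do so, assuming students scoring below a cut are flagged (matching the "MAP Growth < cut" convention in the cut point rows). The scores, outcomes, and function names are hypothetical and are not drawn from the study data.

```python
def sens_spec(scores, at_risk, cut):
    """Sensitivity and specificity when students scoring below `cut` are flagged."""
    tp = sum(1 for s, r in zip(scores, at_risk) if r and s < cut)
    fn = sum(1 for s, r in zip(scores, at_risk) if r and s >= cut)
    tn = sum(1 for s, r in zip(scores, at_risk) if not r and s >= cut)
    fp = sum(1 for s, r in zip(scores, at_risk) if not r and s < cut)
    return tp / (tp + fn), tn / (tn + fp)

def specificity_at(scores, at_risk, target_sensitivity):
    """Best specificity over all cut scores whose sensitivity meets the target."""
    cuts = sorted(set(scores)) + [max(scores) + 1]
    results = [sens_spec(scores, at_risk, c) for c in cuts]
    return max(spec for sens, spec in results if sens >= target_sensitivity)

def auc(scores, at_risk):
    """AUC as the chance that an at-risk student scores below a
    not-at-risk student, counting ties as one half."""
    pos = [s for s, r in zip(scores, at_risk) if r]
    neg = [s for s, r in zip(scores, at_risk) if not r]
    wins = sum(1.0 if p < n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical screening scores and criterion outcomes (True = at risk).
scores = [128, 131, 133, 135, 137, 138, 140, 142, 145, 147, 150, 153]
at_risk = [True, True, False, True, False, True,
           False, False, False, False, False, False]

print(auc(scores, at_risk))
print(specificity_at(scores, at_risk, 0.90))
print(specificity_at(scores, at_risk, 0.70))
```

Lowering the required sensitivity allows a lower cut score, which flags fewer students who are not at risk and so raises specificity; this is the trade-off the three fixed-sensitivity rows summarize.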

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Disaggregated Data

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Ethnicity: Black

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 136, PARCC < 700 | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.17 | 0.15 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.22 | 0.18 |
| False Negative Rate | 0.44 | 0.29 |
| Sensitivity | 0.56 | 0.71 |
| Specificity | 0.78 | 0.82 |
| Positive Predictive Power | 0.34 | 0.41 |
| Negative Predictive Power | 0.90 | 0.94 |
| Overall Classification Rate | 0.74 | 0.81 |
| Area Under the Curve (AUC) | 0.77 | 0.86 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.34 | 0.52 |
| At 80% Sensitivity, specificity equals | 0.53 | 0.74 |
| At 70% Sensitivity, specificity equals | 0.63 | 0.83 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Ethnicity: Black

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.18 | 0.16 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.18 | 0.20 |
| False Negative Rate | 0.33 | 0.25 |
| Sensitivity | 0.67 | 0.75 |
| Specificity | 0.82 | 0.80 |
| Positive Predictive Power | 0.44 | 0.41 |
| Negative Predictive Power | 0.92 | 0.95 |
| Overall Classification Rate | 0.80 | 0.79 |
| Area Under the Curve (AUC) | 0.82 | 0.84 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.46 | 0.52 |
| At 80% Sensitivity, specificity equals | 0.68 | 0.75 |
| At 70% Sensitivity, specificity equals | 0.78 | 0.83 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Ethnicity: Black

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.17 | 0.14 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.25 | 0.24 |
| False Negative Rate | 0.25 | 0.23 |
| Sensitivity | 0.75 | 0.77 |
| Specificity | 0.75 | 0.76 |
| Positive Predictive Power | 0.38 | 0.35 |
| Negative Predictive Power | 0.94 | 0.95 |
| Overall Classification Rate | 0.75 | 0.76 |
| Area Under the Curve (AUC) | 0.82 | 0.82 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.48 | 0.52 |
| At 80% Sensitivity, specificity equals | 0.71 | 0.64 |
| At 70% Sensitivity, specificity equals | 0.80 | 0.80 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Ethnicity: Hispanic

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 136, PARCC < 700 | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.12 | 0.15 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.23 | 0.27 |
| False Negative Rate | 0.28 | 0.19 |
| Sensitivity | 0.72 | 0.82 |
| Specificity | 0.77 | 0.73 |
| Positive Predictive Power | 0.31 | 0.34 |
| Negative Predictive Power | 0.95 | 0.96 |
| Overall Classification Rate | 0.77 | 0.74 |
| Area Under the Curve (AUC) | 0.83 | 0.87 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.48 | 0.57 |
| At 80% Sensitivity, specificity equals | 0.68 | 0.78 |
| At 70% Sensitivity, specificity equals | 0.80 | 0.87 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Ethnicity: Hispanic

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.14 | 0.15 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.26 | 0.28 |
| False Negative Rate | 0.25 | 0.10 |
| Sensitivity | 0.75 | 0.90 |
| Specificity | 0.74 | 0.72 |
| Positive Predictive Power | 0.31 | 0.36 |
| Negative Predictive Power | 0.95 | 0.98 |
| Overall Classification Rate | 0.74 | 0.75 |
| Area Under the Curve (AUC) | 0.85 | 0.90 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.58 | 0.66 |
| At 80% Sensitivity, specificity equals | 0.72 | 0.84 |
| At 70% Sensitivity, specificity equals | 0.78 | 0.93 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Ethnicity: Hispanic

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.13 | 0.15 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.26 | 0.29 |
| False Negative Rate | 0.24 | 0.08 |
| Sensitivity | 0.76 | 0.92 |
| Specificity | 0.74 | 0.71 |
| Positive Predictive Power | 0.30 | 0.35 |
| Negative Predictive Power | 0.95 | 0.98 |
| Overall Classification Rate | 0.74 | 0.74 |
| Area Under the Curve (AUC) | 0.83 | 0.89 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.47 | 0.62 |
| At 80% Sensitivity, specificity equals | 0.70 | 0.82 |
| At 70% Sensitivity, specificity equals | 0.80 | 0.92 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Ethnicity: White

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 136, PARCC < 700 | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.04 | 0.04 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.09 | 0.10 |
| False Negative Rate | 0.61 | 0.32 |
| Sensitivity | 0.39 | 0.68 |
| Specificity | 0.91 | 0.90 |
| Positive Predictive Power | 0.15 | 0.22 |
| Negative Predictive Power | 0.97 | 0.99 |
| Overall Classification Rate | 0.89 | 0.89 |
| Area Under the Curve (AUC) | 0.83 | 0.90 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.42 | 0.68 |
| At 80% Sensitivity, specificity equals | 0.69 | 0.86 |
| At 70% Sensitivity, specificity equals | 0.83 | 0.92 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Ethnicity: White

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.05 | 0.04 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.10 | 0.09 |
| False Negative Rate | 0.49 | 0.29 |
| Sensitivity | 0.51 | 0.71 |
| Specificity | 0.90 | 0.91 |
| Positive Predictive Power | 0.21 | 0.26 |
| Negative Predictive Power | 0.97 | 0.99 |
| Overall Classification Rate | 0.88 | 0.90 |
| Area Under the Curve (AUC) | 0.87 | 0.91 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.52 | 0.74 |
| At 80% Sensitivity, specificity equals | 0.81 | 0.87 |
| At 70% Sensitivity, specificity equals | 0.89 | 0.94 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Ethnicity: White

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.04 | 0.04 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.10 | 0.09 |
| False Negative Rate | 0.39 | 0.26 |
| Sensitivity | 0.62 | 0.75 |
| Specificity | 0.90 | 0.91 |
| Positive Predictive Power | 0.20 | 0.25 |
| Negative Predictive Power | 0.98 | 0.99 |
| Overall Classification Rate | 0.89 | 0.91 |
| Area Under the Curve (AUC) | 0.87 | 0.93 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.61 | 0.78 |
| At 80% Sensitivity, specificity equals | 0.84 | 0.92 |
| At 70% Sensitivity, specificity equals | 0.89 | 0.97 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Ethnicity: Multi-Ethnic

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | Not Provided | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | Not Provided | 0.04 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | Not Provided | 0.20 |
| False Positive Rate | Not Provided | 0.12 |
| False Negative Rate | Not Provided | 0.14 |
| Sensitivity | Not Provided | 0.86 |
| Specificity | Not Provided | 0.88 |
| Positive Predictive Power | Not Provided | 0.00 |
| Negative Predictive Power | Not Provided | 0.23 |
| Overall Classification Rate | Not Provided | 0.99 |
| Area Under the Curve (AUC) | Not Provided | 0.88 |
| AUC 95% Confidence Interval Lower | Not Provided | 0.90 |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | Not Provided | 0.71 |
| At 80% Sensitivity, specificity equals | Not Provided | 0.86 |
| At 70% Sensitivity, specificity equals | Not Provided | 0.89 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Ethnicity: Multi-Ethnic

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | Not Provided | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | Not Provided | 0.04 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | Not Provided | 0.20 |
| False Positive Rate | Not Provided | 0.12 |
| False Negative Rate | Not Provided | 0.14 |
| Sensitivity | Not Provided | 0.86 |
| Specificity | Not Provided | 0.88 |
| Positive Predictive Power | Not Provided | 0.23 |
| Negative Predictive Power | Not Provided | 0.99 |
| Overall Classification Rate | Not Provided | 0.88 |
| Area Under the Curve (AUC) | Not Provided | 0.90 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | Not Provided | 0.71 |
| At 80% Sensitivity, specificity equals | Not Provided | 0.86 |
| At 70% Sensitivity, specificity equals | Not Provided | 0.89 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Ethnicity: Multi-Ethnic

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.05 | 0.04 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.14 | 0.13 |
| False Negative Rate | 0.11 | 0.14 |
| Sensitivity | 0.89 | 0.86 |
| Specificity | 0.86 | 0.87 |
| Positive Predictive Power | 0.24 | 0.21 |
| Negative Predictive Power | 0.99 | 0.99 |
| Overall Classification Rate | 0.86 | 0.87 |
| Area Under the Curve (AUC) | 0.94 | 0.92 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.89 | 0.86 |
| At 80% Sensitivity, specificity equals | 0.89 | 0.86 |
| At 70% Sensitivity, specificity equals | 0.97 | 0.86 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Ethnicity: Asian or Pacific Islander

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | Not Provided | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | Not Provided | 0.03 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | Not Provided | 0.20 |
| False Positive Rate | Not Provided | 0.11 |
| False Negative Rate | Not Provided | 0.21 |
| Sensitivity | Not Provided | 0.79 |
| Specificity | Not Provided | 0.89 |
| Positive Predictive Power | Not Provided | 0.15 |
| Negative Predictive Power | Not Provided | 0.99 |
| Overall Classification Rate | Not Provided | 0.89 |
| Area Under the Curve (AUC) | Not Provided | 0.94 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | Not Provided | 0.76 |
| At 80% Sensitivity, specificity equals | Not Provided | 1.00 |
| At 70% Sensitivity, specificity equals | Not Provided | 1.00 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Ethnicity: Asian or Pacific Islander

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.03 | 0.03 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.13 | 0.11 |
| False Negative Rate | 0.09 | 0.20 |
| Sensitivity | 0.91 | 0.80 |
| Specificity | 0.87 | 0.89 |
| Positive Predictive Power | 0.18 | 0.17 |
| Negative Predictive Power | 1.00 | 0.99 |
| Overall Classification Rate | 0.87 | 0.89 |
| Area Under the Curve (AUC) | 0.94 | 0.94 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.90 | 0.80 |
| At 80% Sensitivity, specificity equals | 0.91 | 0.88 |
| At 70% Sensitivity, specificity equals | 0.93 | 1.00 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Ethnicity: Asian or Pacific Islander

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.03 | 0.03 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.10 | 0.10 |
| False Negative Rate | 0.17 | 0.20 |
| Sensitivity | 0.83 | 0.80 |
| Specificity | 0.90 | 0.90 |
| Positive Predictive Power | 0.20 | 0.17 |
| Negative Predictive Power | 1.00 | 0.99 |
| Overall Classification Rate | 0.90 | 0.90 |
| Area Under the Curve (AUC) | 0.93 | 0.93 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.83 | 0.80 |
| At 80% Sensitivity, specificity equals | 1.00 | 0.87 |
| At 70% Sensitivity, specificity equals | 1.00 | 0.93 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Gender: Female

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 136, PARCC < 700 | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.07 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.13 | 0.14 |
| False Negative Rate | 0.35 | 0.18 |
| Sensitivity | 0.65 | 0.82 |
| Specificity | 0.87 | 0.86 |
| Positive Predictive Power | 0.27 | 0.31 |
| Negative Predictive Power | 0.97 | 0.98 |
| Overall Classification Rate | 0.86 | 0.86 |
| Area Under the Curve (AUC) | 0.87 | 0.92 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.57 | 0.75 |
| At 80% Sensitivity, specificity equals | 0.78 | 0.89 |
| At 70% Sensitivity, specificity equals | 0.87 | 0.96 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Gender: Female

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.08 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.14 | 0.14 |
| False Negative Rate | 0.24 | 0.13 |
| Sensitivity | 0.76 | 0.87 |
| Specificity | 0.86 | 0.86 |
| Positive Predictive Power | 0.31 | 0.32 |
| Negative Predictive Power | 0.98 | 0.99 |
| Overall Classification Rate | 0.85 | 0.86 |
| Area Under the Curve (AUC) | 0.90 | 0.93 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.67 | 0.80 |
| At 80% Sensitivity, specificity equals | 0.86 | 0.92 |
| At 70% Sensitivity, specificity equals | 0.94 | 0.96 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Gender: Female

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.07 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.14 | 0.16 |
| False Negative Rate | 0.26 | 0.13 |
| Sensitivity | 0.75 | 0.87 |
| Specificity | 0.86 | 0.84 |
| Positive Predictive Power | 0.28 | 0.27 |
| Negative Predictive Power | 0.98 | 0.99 |
| Overall Classification Rate | 0.85 | 0.84 |
| Area Under the Curve (AUC) | 0.90 | 0.93 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.66 | 0.77 |
| At 80% Sensitivity, specificity equals | 0.85 | 0.91 |
| At 70% Sensitivity, specificity equals | 0.94 | 0.96 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Fall

Gender: Male

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 136, PARCC < 700 | MAP Growth < 154, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.08 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.15 | 0.16 |
| False Negative Rate | 0.43 | 0.28 |
| Sensitivity | 0.57 | 0.72 |
| Specificity | 0.85 | 0.84 |
| Positive Predictive Power | 0.24 | 0.27 |
| Negative Predictive Power | 0.96 | 0.97 |
| Overall Classification Rate | 0.83 | 0.84 |
| Area Under the Curve (AUC) | 0.83 | 0.88 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.43 | 0.60 |
| At 80% Sensitivity, specificity equals | 0.64 | 0.81 |
| At 70% Sensitivity, specificity equals | 0.82 | 0.90 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Winter

Gender: Male

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 144, PARCC < 700 | MAP Growth < 166, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.08 | 0.08 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.16 | 0.15 |
| False Negative Rate | 0.41 | 0.23 |
| Sensitivity | 0.59 | 0.77 |
| Specificity | 0.84 | 0.85 |
| Positive Predictive Power | 0.25 | 0.31 |
| Negative Predictive Power | 0.96 | 0.98 |
| Overall Classification Rate | 0.82 | 0.85 |
| Area Under the Curve (AUC) | 0.85 | 0.90 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.53 | 0.66 |
| At 80% Sensitivity, specificity equals | 0.66 | 0.87 |
| At 70% Sensitivity, specificity equals | 0.84 | 0.92 |

 

Criterion 1: Partnership for Assessment of Readiness for College and Careers (PARCC) Math

Time of Year: Spring

Gender: Male

 

| Statistic | Grade K | Grade 1 |
| --- | --- | --- |
| Cut points | MAP Growth < 153, PARCC < 700 | MAP Growth < 176, PARCC < 700 |
| Base rate in the sample for children requiring intensive intervention | 0.08 | 0.07 |
| Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 |
| False Positive Rate | 0.16 | 0.13 |
| False Negative Rate | 0.29 | 0.19 |
| Sensitivity | 0.71 | 0.81 |
| Specificity | 0.84 | 0.87 |
| Positive Predictive Power | 0.27 | 0.34 |
| Negative Predictive Power | 0.97 | 0.98 |
| Overall Classification Rate | 0.83 | 0.87 |
| Area Under the Curve (AUC) | 0.86 | 0.91 |
| AUC 95% Confidence Interval Lower | Not Provided | Not Provided |
| AUC 95% Confidence Interval Upper | Not Provided | Not Provided |
| At 90% Sensitivity, specificity equals | 0.56 | 0.72 |
| At 80% Sensitivity, specificity equals | 0.74 | 0.88 |
| At 70% Sensitivity, specificity equals | 0.86 | 0.93 |

 

Reliability

| Grade | K | 1 |
| --- | --- | --- |
| Rating | Full bubbled | Full bubbled |

  1. Justification for each type of reliability reported, given the type and purpose of the tool:

When MAP Growth K–2 is used as an academic screener, the internal consistency reliability of student test scores (i.e., student RIT scores on MAP Growth) is key. However, estimating the internal consistency of an adaptive test, such as MAP Growth K–2, is challenging because traditional methods depend on all test takers taking a common test consisting of the same items. Application of these methods to adaptive tests is statistically cumbersome and inaccurate. Fortunately, an equally valid alternative is available in the marginal reliability coefficient[1] [2], which incorporates measurement error as a function of the test score. In effect, it combines measurement error estimated at different points on the achievement scale into a single index. Note that this method of calculating reliability yields results that are nearly identical to coefficient alpha when both methods are applied to the same fixed-form test.
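As a rough illustration of how a marginal reliability coefficient combines score-level measurement error into a single index, the sketch below uses one common formulation: observed score variance minus the mean squared standard error of measurement (SEM), divided by observed score variance. The function name and the RIT scores and SEMs are hypothetical, and the exact estimator used in the cited work may differ.

```python
from statistics import mean, pvariance

def marginal_reliability(scores, sems):
    """(observed score variance - mean squared SEM) / observed score variance."""
    score_var = pvariance(scores)
    error_var = mean([sem ** 2 for sem in sems])
    return (score_var - error_var) / score_var

# Hypothetical RIT scores and their conditional standard errors of measurement.
rit = [138, 145, 151, 156, 160, 163, 169, 176]
sem = [3.4, 3.2, 3.1, 3.0, 3.0, 3.1, 3.2, 3.5]
print(round(marginal_reliability(rit, sem), 3))
```

Because each student's SEM enters the average, the index reflects measurement precision across the whole achievement scale rather than at a single score point.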

MAP Growth K–2 affords the means to screen students on multiple occasions (e.g., Fall, Winter, Spring) during the school year. Thus, test-retest reliability is also key, and we estimate it via the Pearson correlation between the MAP Growth K–2 RIT scores of students taking MAP Growth K–2 in two terms within the school year (Fall and Winter, Fall and Spring, and Winter and Spring). Given that MAP Growth K–2 is an adaptive test without any fixed forms, this approach to test-retest reliability may be more accurately described as a mix of test-retest reliability and a type of parallel forms reliability. That is, MAP Growth K–2 RIT scores are obtained for students taking MAP Growth K–2 twice, spread across several months. The second test (or retest) is not the same test. Rather, the second test is comparable to the first in its content and structure, differing only in the difficulty level of its items. Thus, both temporally related and parallel forms of reliability are defined as the consistency of equivalent measures taken across time. Green, Bock, Humphreys, Linn, and Reckase[3] suggested the term “stratified, randomly parallel form reliability” to characterize this form of reliability.

 

  2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

Representation

New England, Middle Atlantic, East North Central, South Atlantic, Mountain. The sample for the study contained student records from a total of five states (Colorado, Illinois, New Jersey, New Mexico, and Rhode Island) and one federal district (District of Columbia), and thus had representation from all four U.S. Census regions.

Date

PARCC data was based on students in Grade 3 who took the PARCC assessment during Spring 2016. The Spring 2016 PARCC administration spanned from March 2016 through June 2016. MAP Growth K–2 data was obtained for this sample of Grade 3 students taking PARCC in Spring 2016. Specifically, for these students who were in Grade 3 in Spring 2016, their MAP Growth K–2 scores from previous years were obtained. Their MAP Growth K–2 data from Fall 2012, Winter 2013, and Spring 2013 served as their MAP Growth K–2 scores for Grade K, and their MAP Growth K–2 data from Fall 2013, Winter 2014, and Spring 2014 served as their MAP Growth K–2 scores for Grade 1. Thus, the data used was based on a sample of Grade 3 students taking the PARCC assessment in Spring 2016 and their MAP Growth K–2 scores from previous years, when they were in Grades K and 1.

Size

Table 4 summarizes the total number of students, as functions of grade, state, region, and division.

Male: 50.94%
Female: 49.04%
Unknown: 0.02%
Other SES Indicators: Not Provided
Free or reduced-price lunch: Not Provided
White, Non-Hispanic: 40.88%
Black, Non-Hispanic: 6.14%
Hispanic: 23.40%
American Indian/Alaska Native: 2.27%
Asian/Pacific Islander: 9.02%
Multi-Ethnic: 3.33%
Not Specified or Other: 14.96%
Disability classification: Not Provided
First language: Not Provided
Language proficiency status: Not Provided

 

Table 4: Number of Students

State, Region, or Division | N

Number of Students Per State
CO | 3,228
DC | 171
IL | 12,165
NJ | 644
NM | 208
RI | 209
Total | 16,625

Number of Students Per Region
Midwest | 12,165
Northeast | 853
South | 171
West | 3,436
Total | 16,625

Number of Students Per Division
East North Central | 12,165
Middle Atlantic | 644
Mountain | 3,436
New England | 209
South Atlantic | 171
Total | 16,625

 

 

  3. Description of the analysis procedures for each reported type of reliability:

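Marginal Reliability. The approach taken for estimating marginal reliability on MAP Growth K–2 was suggested by Wright in 1999[4]. For a sample of N students, marginal reliability ($\bar{\rho}$) is estimated by

$$\bar{\rho} = \frac{\sigma^2_{\hat{\theta}} - \frac{1}{N}\sum_{i=1}^{N}\mathrm{CSEM}_i^2}{\sigma^2_{\hat{\theta}}}$$

where $\theta$ is the IRT achievement level (on a standardized or scaled score metric), $\hat{\theta}$ is an estimate of $\theta$, $\sigma^2_{\hat{\theta}}$ is the observed variance of $\hat{\theta}$ across the sample of N students, $\mathrm{CSEM}_i^2$ is the squared conditional (on $\theta$) standard error of measurement (CSEM) for student $i$, and $\frac{1}{N}\sum_{i=1}^{N}\mathrm{CSEM}_i^2$ is the average squared CSEM across the sample of N students.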

A bootstrapping approach is used to calculate a 95% confidence interval for marginal reliability. For an initial dataset of the achievement levels and CSEMs for N students, a bootstrap 95% confidence interval for marginal reliability is obtained as follows:

  1. Draw a random sample of size N with replacement from the initial dataset.
  2. Calculate marginal reliability based on the random sample drawn in Step 1.
  3. Repeat Steps 1 and 2 a total of 1,000 times.
  4. Determine the 2.5 and 97.5 percentile points from the resulting 1,000 estimates of marginal reliability. The values of these two percentiles are the lower and upper bounds of the bootstrap 95% confidence interval.
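The estimate and the bootstrap steps above can be sketched as follows. This is a minimal illustration, not the code used in the study: it takes marginal reliability to be the observed variance of the score estimates minus the average squared CSEM, divided by the observed variance, and it assumes the population (divide-by-N) variance and a simple index-based percentile rule. The function names and the synthetic data are hypothetical.

```python
import random


def variance(values):
    """Population variance (divide by N); an assumption of this sketch."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def marginal_reliability(scores, csems):
    """(observed score variance - mean squared CSEM) / observed score variance."""
    observed_var = variance(scores)
    mean_sq_csem = sum(c * c for c in csems) / len(csems)
    return (observed_var - mean_sq_csem) / observed_var


def bootstrap_ci(scores, csems, n_boot=1000, seed=0):
    """Percentile bootstrap 95% interval, resampling students with replacement."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # Step 1: resample N students
        estimates.append(                            # Step 2: re-estimate
            marginal_reliability([scores[i] for i in idx], [csems[i] for i in idx])
        )
    estimates.sort()                                 # Step 4: 2.5 and 97.5 percentiles
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot) - 1]


# Synthetic, made-up data: 500 students with scores and CSEMs on an arbitrary scale.
data_rng = random.Random(1)
csems = [data_rng.uniform(2.5, 3.5) for _ in range(500)]
scores = [data_rng.gauss(150, 12) + data_rng.gauss(0, c) for c in csems]

estimate = marginal_reliability(scores, csems)
lower, upper = bootstrap_ci(scores, csems)
print(f"marginal reliability = {estimate:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```

Resampling each student's score together with that student's CSEM keeps the two values paired within every bootstrap sample.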

Test-Retest Reliability. Test-retest reliability of MAP Growth K–2 was estimated as the Pearson correlation of student RIT scores on MAP Growth K–2 for students in the study dataset who took MAP Growth K–2 twice within a school year. Because the test-retest reliability coefficient is a Pearson correlation, its confidence interval (CI) was obtained using the standard CI for a Pearson correlation (i.e., via Fisher’s z-transformation).
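As an illustration of the interval described above, the following sketch computes a Pearson correlation and its Fisher z-based 95% confidence interval. The function names are hypothetical, and the critical value 1.959964 is the usual two-sided 95% normal quantile. The inputs in the usage line are the rounded Grade K Fall/Winter values reported in this section, so the result may differ slightly from the published interval, which was presumably computed from the unrounded coefficient.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """95% CI for a correlation r from n pairs, via Fisher's z-transformation."""
    z = math.atanh(r)            # transform r to the z scale
    se = 1 / math.sqrt(n - 3)    # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Rounded Grade K Fall/Winter test-retest values: r = 0.81, N = 2,632.
lower, upper = fisher_ci(0.81, 2632)
print(f"95% CI: ({lower:.2f}, {upper:.2f})")
```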

 

 

  4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability).

 

Type of Reliability | Grade | N | Coefficient | Confidence Interval
Marginal (Fall) | K | 2,963 | 0.93 | 0.93, 0.93
Marginal (Winter) | K | 3,312 | 0.94 | 0.94, 0.94
Marginal (Spring) | K | 4,351 | 0.95 | 0.95, 0.95
Marginal (Fall) | 1 | 5,050 | 0.96 | 0.96, 0.96
Marginal (Winter) | 1 | 4,956 | 0.96 | 0.95, 0.96
Marginal (Spring) | 1 | 5,455 | 0.95 | 0.95, 0.96
Test-Retest (Fall/Winter) | K | 2,632 | 0.81 | 0.79, 0.82
Test-Retest (Fall/Spring) | K | 2,720 | 0.74 | 0.73, 0.76
Test-Retest (Winter/Spring) | K | 3,153 | 0.82 | 0.81, 0.83
Test-Retest (Fall/Winter) | 1 | 4,631 | 0.87 | 0.87, 0.88
Test-Retest (Fall/Spring) | 1 | 4,927 | 0.82 | 0.81, 0.83
Test-Retest (Winter/Spring) | 1 | 4,912 | 0.88 | 0.87, 0.88

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

Type of Reliability

Subgroup

Grade

N

Coefficient

Confidence Interval

Marginal (Fall)

Ethnicity: Asian or Pacific Islander

K

357

0.94

0.94, 0.95

Marginal (Winter)

Ethnicity: Asian or Pacific Islander

K

382

0.94

0.94, 0.95

Marginal (Spring)

Ethnicity: Asian or Pacific Islander

K

424

0.95

0.94, 0.96

Marginal (Fall)

Ethnicity: Black

K

244

0.90

0.87, 0.92

Marginal (Winter)

Ethnicity: Black

K

240

0.91

0.89, 0.92

Marginal (Spring)

Ethnicity: Black

K

344

0.94

0.93, 0.95

Marginal (Fall)

Ethnicity: Hispanic

K

792

0.90

0.89, 0.91

Marginal (Winter)

Ethnicity: Hispanic

K

861

0.92

0.92, 0.93

Marginal (Spring)

Ethnicity: Hispanic

K

1,063

0.94

0.94, 0.95

Marginal (Spring)

Ethnicity: Multi-Ethnic

K

187

0.95

0.94, 0.96

Marginal (Fall)

Ethnicity: White

K

1,390

0.93

0.92, 0.93

Marginal (Winter)

Ethnicity: White

K

1,614

0.94

0.93, 0.94

Marginal (Spring)

Ethnicity: White

K

2,232

0.94

0.94, 0.95

Marginal (Fall)

Gender: Female

K

1,437

0.92

0.92, 0.93

Marginal (Winter)

Gender: Female

K

1,623

0.93

0.93, 0.94

Marginal (Spring)

Gender: Female

K

2,101

0.94

0.94, 0.95

Marginal (Fall)

Gender: Male

K

1,526

0.94

0.93, 0.94

Marginal (Winter)

Gender: Male

K

1,689

0.94

0.94, 0.95

Marginal (Spring)

Gender: Male

K

2,250

0.96

0.95, 0.96

Marginal (Fall)

Ethnicity: Asian or Pacific Islander

1

559

0.97

0.96, 0.97

Marginal (Winter)

Ethnicity: Asian or Pacific Islander

1

547

0.96

0.95, 0.96

Marginal (Spring)

Ethnicity: Asian or Pacific Islander

1

572

0.95

0.95, 0.96

Marginal (Fall)

Ethnicity: Black

1

289

0.94

0.93, 0.95

Marginal (Winter)

Ethnicity: Black

1

281

0.94

0.93, 0.95

Marginal (Spring)

Ethnicity: Black

1

309

0.95

0.94, 0.96

Marginal (Fall)

Ethnicity: Hispanic

1

1,194

0.95

0.95, 0.96

Marginal (Winter)

Ethnicity: Hispanic

1

1,196

0.95

0.95, 0.96

Marginal (Spring)

Ethnicity: Hispanic

1

1,251

0.95

0.95, 0.95

Marginal (Fall)

Ethnicity: Multi-Ethnic

1

179

0.95

0.94, 0.96

Marginal (Winter)

Ethnicity: Multi-Ethnic

1

173

0.95

0.94, 0.96

Marginal (Spring)

Ethnicity: Multi-Ethnic

1

189

0.95

0.94, 0.96

Marginal (Fall)

Ethnicity: White

1

2,651

0.95

0.95, 0.96

Marginal (Winter)

Ethnicity: White

1

2,592

0.95

0.94, 0.95

Marginal (Spring)

Ethnicity: White

1

2,939

0.94

0.94, 0.95

Marginal (Fall)

Gender: Female

1

2,455

0.96

0.95, 0.96

Marginal (Winter)

Gender: Female

1

2,436

0.95

0.95, 0.95

Marginal (Spring)

Gender: Female

1

2,666

0.95

0.95, 0.95

Marginal (Fall)

Gender: Male

1

2,595

0.96

0.96, 0.97

Marginal (Winter)

Gender: Male

1

2,519

0.96

0.96, 0.96

Marginal (Spring)

Gender: Male

1

2,788

0.96

0.96, 0.96

Test-Retest (Fall/Winter)

Ethnicity: Asian or Pacific Islander

K

341

0.83

0.79, 0.86

Test-Retest (Fall/Spring)

Ethnicity: Asian or Pacific Islander

K

338

0.81

0.77, 0.84

Test-Retest (Winter/Spring)

Ethnicity: Asian or Pacific Islander

K

367

0.84

0.81, 0.87

Test-Retest (Fall/Winter)

Ethnicity: Black

K

216

0.64

0.56, 0.71

Test-Retest (Fall/Spring)

Ethnicity: Black

K

183

0.55

0.44, 0.65

Test-Retest (Winter/Spring)

Ethnicity: Black

K

198

0.72

0.65, 0.78

Test-Retest (Fall/Winter)

Ethnicity: Hispanic

K

743

0.78

0.75, 0.81

Test-Retest (Fall/Spring)

Ethnicity: Hispanic

K

729

0.68

0.64, 0.72

Test-Retest (Winter/Spring)

Ethnicity: Hispanic

K

821

0.79

0.76, 0.81

Test-Retest (Fall/Winter)

Ethnicity: White

K

1,175

0.80

0.78, 0.82

Test-Retest (Fall/Spring)

Ethnicity: White

K

1,314

0.73

0.70, 0.75

Test-Retest (Winter/Spring)

Ethnicity: White

K

1,566

0.82

0.80, 0.83

Test-Retest (Fall/Winter)

Gender: Female

K

1,261

0.81

0.79, 0.83

Test-Retest (Fall/Spring)

Gender: Female

K

1,293

0.75

0.72, 0.77

Test-Retest (Winter/Spring)

Gender: Female

K

1,532

0.83

0.82, 0.85

Test-Retest (Fall/Winter)

Gender: Male

K

1,371

0.81

0.79, 0.83

Test-Retest (Fall/Spring)

Gender: Male

K

1,427

0.74

0.72, 0.76

Test-Retest (Winter/Spring)

Gender: Male

K

1,621

0.82

0.80, 0.83

Test-Retest (Fall/Winter)

Ethnicity: Asian or Pacific Islander

1

525

0.91

0.89, 0.92

Test-Retest (Fall/Spring)

Ethnicity: Asian or Pacific Islander

1

540

0.85

0.82, 0.87

Test-Retest (Winter/Spring)

Ethnicity: Asian or Pacific Islander

1

546

0.89

0.87, 0.91

Test-Retest (Fall/Winter)

Ethnicity: Black

1

261

0.84

0.80, 0.87

Test-Retest (Fall/Spring)

Ethnicity: Black

1

278

0.76

0.71, 0.81

Test-Retest (Winter/Spring)

Ethnicity: Black

1

277

0.82

0.78, 0.86

Test-Retest (Fall/Winter)

Ethnicity: Hispanic

1

1,138

0.84

0.82, 0.86

Test-Retest (Fall/Spring)

Ethnicity: Hispanic

1

1,172

0.79

0.77, 0.81

Test-Retest (Winter/Spring)

Ethnicity: Hispanic

1

1,180

0.85

0.84, 0.87

Test-Retest (Fall/Winter)

Ethnicity: Multi-Ethnic

1

165

0.87

0.83, 0.90

Test-Retest (Fall/Spring)

Ethnicity: Multi-Ethnic

1

173

0.79

0.73, 0.84

Test-Retest (Winter/Spring)

Ethnicity: Multi-Ethnic

1

172

0.87

0.83, 0.90

Test-Retest (Fall/Winter)

Ethnicity: White

1

2,385

0.85

0.84, 0.86

Test-Retest (Fall/Spring)

Ethnicity: White

1

2,596

0.79

0.78, 0.80

Test-Retest (Winter/Spring)

Ethnicity: White

1

2,573

0.85

0.84, 0.86

Test-Retest (Fall/Winter)

Gender: Female

1

2,268

0.87

0.86, 0.88

Test-Retest (Fall/Spring)

Gender: Female

1

2,394

0.83

0.82, 0.85

Test-Retest (Winter/Spring)

Gender: Female

1

2,414

0.88

0.87, 0.89

Test-Retest (Fall/Winter)

Gender: Male

1

2,363

0.88

0.87, 0.88

Test-Retest (Fall/Spring)

Gender: Male

1

2,533

0.81

0.80, 0.83

Test-Retest (Winter/Spring)

Gender: Male

1

2,497

0.87

0.86, 0.88



[1] Samejima, F. (1977). A use of the information function in tailored testing. Applied Psychological Measurement, 1(3), 233–247.

[2] Samejima, F. (1994). Estimation of reliability coefficients using the test information function and its modifications. Applied Psychological Measurement, 18(3), 229–244.

[3] Green, B. F., Bock, R. D., Humphreys, L. G., Linn, R. L., & Reckase, M. D. (1984). Technical guidelines for assessing computerized adaptive tests. Journal of Educational Measurement, 21(4), 347–360.

[4] Wright, B. D. (1999). Rasch measurement models. In G. N. Masters & J. P. Keeves (Eds.), Advances in Measurement in Educational Research and Assessment (pp. 85–97). Oxford, UK: Elsevier Science Ltd.

 

Validity

Grade | K | 1
Rating | Full bubble | Full bubble

1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

In general terms, the better a test measures what it purports to measure and can support its intended uses and decision making, the stronger its validity is said to be. Within this broad statement resides a wide range of information that can be used as validity evidence. This information ranges, for example, from the adequacy and coverage of a test’s content, to its ability to yield scores that are predictive of a status in some area, to its ability to draw accurate inferences about a test taker’s status with respect to a construct, to its ability to allow generalizations from test performance within a domain to like performance in the same domain.

Much of the validity evidence for MAP Growth K–2 comes from the relationships of MAP Growth K–2 test scores with state content-aligned accountability test scores from Grade 3. These are predictive relationships between students’ performance on MAP Growth K–2 tests and their performance, in a later Spring testing term, on state accountability tests.

Several important points should be noted when comparing performance on MAP Growth K–2 tests with performance on state accountability tests. First, these two forms of tests (i.e., interim vs. summative) are designed to serve two related but different purposes. MAP Growth K–2 tests are designed to provide estimates of achievement status with low measurement error. They are also designed to provide reasonable estimates of students’ strengths and weaknesses within the identified goal structure.

State accountability tests are commonly designed to determine student proficiency within the state performance standard structure, with the most important decision being the classification of the student as proficient or not proficient. This primary purpose of most state tests in conjunction with adopted content and curriculum standards and structures can influence the relationship of student performance between the two tests.

For example, one of the most common factors influencing these relationships is the use of constructed response items in state tests. In general, the greater the number of constructed response items, the weaker the relationship will appear. Another difference is in test design. Since most state accountability tests are fixed form, it is reasonable for the test to be constructed so that maximum test information is established around the proficiency cut point. This is where a state wants to be the most confident about the classification decision that the test will inform. To the extent that this strategy is reflected in the state’s operational test, the relationship in performance between MAP Growth K–2 tests and state tests will be attenuated due to a more truncated range of scores on the state test.

The requirement that state test content be connected to single grade-level content standards differs from the MAP Growth K–2 test content structure, which spans grade levels. This difference is another factor that weakens the observed score relationships between tests. Finally, when focus is placed on the relationship between performance on MAP Growth K–2 tests and the assigned proficiency category from the state test, information from the state test will have been collapsed into three to five categories. The correlations between RIT scores and these category assignments will always be substantially lower than correlations between RIT scores and scale scores.
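The effect of collapsing scale scores into a few categories can be illustrated with a small simulation. This is a sketch on synthetic data only: the error variances, the four-level split, and the cut points are hypothetical choices made for illustration and do not reflect MAP Growth or PARCC data. The size of the drop depends on the number of categories and where the cuts fall.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(42)
screener, scale_score = [], []
for _ in range(20000):
    ability = rng.gauss(0, 1)                           # shared underlying ability
    screener.append(ability + rng.gauss(0, 0.5))        # stand-in for a RIT score
    scale_score.append(ability + rng.gauss(0, 0.5))     # stand-in for a state scale score

# Collapse the continuous scale score into four performance levels (0 to 3)
# using hypothetical cut points.
cuts = [-0.8, 0.0, 0.8]
levels = [sum(score > cut for cut in cuts) for score in scale_score]

r_scale = pearson_r(screener, scale_score)
r_levels = pearson_r(screener, levels)
print(f"correlation with scale scores:       {r_scale:.3f}")
print(f"correlation with performance levels: {r_levels:.3f}")
```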

Predictive validity evidence is expressed as the degree of relationship to performance on another test measuring achievement in the same domain (e.g., mathematics, reading) at some later point in time. This form of validity can also be expressed as a Pearson correlation coefficient between the total domain area RIT score and the total scale score of another established test. It answers the question, “How well do the scores from this test that reference this (RIT) scale in this subject area (e.g., reading) predict the scores obtained from an established test that references some other scale in the same subject area at a later point in time?” Both tests are administered to the same students several weeks apart, typically 12 to 36 weeks in the evidence reported here. Strong predictive validity is indicated when the correlations are in the low 0.80s. Correlations tend to be lower when the non-NWEA test includes more performance items that require subjective scoring than when it consists exclusively of multiple-choice items.

The criterion measure used for this series of analyses was the scaled score on the PARCC mathematics assessment, taken by students in the sample during the Spring 2016 school term.

In addition to concurrent and predictive validity, validity evidence for MAP Growth K–2 also comes from the degree and stability of the relationship of RIT scores across multiple and extended periods of time, such as across school years. This type of evidence supports the construct validity of MAP Growth K–2 and the ability underlying the RIT scale. For Grade K, this construct validity evidence was based on MAP Growth K–2 scores from Grades K and 1. For Grade 1, this construct validity evidence was based on MAP Growth K–2 scores from Grade 1 and MAP Growth scores from Grade 2.

 

 

2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Representation

New England, Middle Atlantic, East North Central, South Atlantic, Mountain. The sample for the study contained student records from a total of five states (Colorado, Illinois, New Jersey, New Mexico, and Rhode Island) and one federal district (District of Columbia), and thus had representation from all four U.S. Census regions.

Date

Not Provided

Size

Table 5 summarizes the total number of students, as functions of grade, state, region, and division.

Male: 50.94%
Female: 49.04%
Unknown: 0.02%
Other SES Indicators: Not Provided
Free or reduced-price lunch: Not Provided
White, Non-Hispanic: 40.88%
Black, Non-Hispanic: 6.14%
Hispanic: 23.40%
American Indian/Alaska Native: 2.27%
Asian/Pacific Islander: 9.02%
Multi-Ethnic: 3.33%
Not Specified or Other: 14.96%
Disability classification: Not Provided
First language: Not Provided
Language proficiency status: Not Provided

 

 

3. Description of the analysis procedures for each reported type of validity:

Predictive validity was estimated as the Pearson correlation coefficient between student RIT scores from a given term and the same students’ total scale score on the PARCC test administered in Spring 2016. The 95% confidence interval for concurrent and predictive validity coefficients was based on the standard 95% confidence interval for a Pearson correlation, using the Fisher z-transformation.

 

4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Validity | Grade | Test or Criterion | n | Coefficient | Confidence Interval
Construct (Fall) | K | MAP Growth K-2: Spring 2013 | 2,720 | 0.74 | 0.73, 0.76
Construct (Fall) | K | MAP Growth K-2: Fall 2013 | 2,585 | 0.77 | 0.75, 0.78
Construct (Fall) | K | MAP Growth K-2: Winter 2014 | 2,439 | 0.76 | 0.74, 0.78
Construct (Fall) | K | MAP Growth K-2: Spring 2014 | 2,689 | 0.74 | 0.72, 0.75
Predictive (Fall) | K | PARCC Math | 2,963 | 0.67 | 0.65, 0.69
Construct (Winter) | K | MAP Growth K-2: Spring 2013 | 3,153 | 0.82 | 0.81, 0.83
Construct (Winter) | K | MAP Growth K-2: Fall 2013 | 2,957 | 0.82 | 0.81, 0.83
Construct (Winter) | K | MAP Growth K-2: Winter 2014 | 2,885 | 0.80 | 0.79, 0.81
Construct (Winter) | K | MAP Growth K-2: Spring 2014 | 3,074 | 0.76 | 0.75, 0.78
Predictive (Winter) | K | PARCC Math | 3,312 | 0.69 | 0.68, 0.71
Construct (Spring) | K | MAP Growth K-2: Fall 2013 | 3,321 | 0.84 | 0.83, 0.85
Construct (Spring) | K | MAP Growth K-2: Winter 2014 | 3,196 | 0.82 | 0.81, 0.83
Construct (Spring) | K | MAP Growth K-2: Spring 2014 | 3,559 | 0.79 | 0.77, 0.80
Predictive (Spring) | K | PARCC Math | 4,351 | 0.72 | 0.70, 0.73
Construct (Fall) | 1 | MAP Growth K-2: Spring 2014 | 4,927 | 0.82 | 0.81, 0.83
Construct (Fall) | 1 | MAP Growth: Fall 2014 | 4,411 | 0.78 | 0.77, 0.79
Construct (Fall) | 1 | MAP Growth: Winter 2015 | 3,342 | 0.78 | 0.76, 0.79
Construct (Fall) | 1 | MAP Growth: Spring 2015 | 4,619 | 0.76 | 0.75, 0.77
Predictive (Fall) | 1 | PARCC Math | 5,050 | 0.75 | 0.73, 0.76
Construct (Winter) | 1 | MAP Growth K-2: Spring 2014 | 4,912 | 0.88 | 0.87, 0.88
Construct (Winter) | 1 | MAP Growth: Fall 2014 | 4,349 | 0.80 | 0.79, 0.81
Construct (Winter) | 1 | MAP Growth: Winter 2015 | 3,400 | 0.81 | 0.80, 0.82
Construct (Winter) | 1 | MAP Growth: Spring 2015 | 4,511 | 0.79 | 0.78, 0.80
Predictive (Winter) | 1 | PARCC Math | 4,956 | 0.77 | 0.76, 0.78
Construct (Spring) | 1 | MAP Growth: Fall 2014 | 4,652 | 0.82 | 0.81, 0.83
Construct (Spring) | 1 | MAP Growth: Winter 2015 | 3,526 | 0.82 | 0.81, 0.83
Construct (Spring) | 1 | MAP Growth: Spring 2015 | 4,962 | 0.81 | 0.80, 0.82
Predictive (Spring) | 1 | PARCC Math | 5,455 | 0.78 | 0.77, 0.79

 

5. Results for other forms of validity (e.g., factor analysis) not conducive to the table format:

Not Provided  

 

6. Describe the degree to which the provided data support the validity of the tool:

Predictive validity coefficients, for each grade and each time of year, were consistently large, demonstrating a strong relationship between the MAP Growth K–2 and PARCC assessments, across the grades and times of year reported.

 

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

Type of Validity

Subgroup

Grade

Test or Criterion

N

Coefficient

Confidence Interval

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2013

338

0.81

0.77, 0.84

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Fall 2013

339

0.83

0.79, 0.86

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Winter 2014

332

0.81

0.76, 0.84

Construct (Fall)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2014

339

0.77

0.72, 0.81

Predictive (Fall)

Ethnicity: Asian or Pacific Islander

K

PARCC Math

357

0.68

0.62, 0.73

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2013

367

0.84

0.81, 0.87

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Fall 2013

364

0.84

0.81, 0.87

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Winter 2014

357

0.80

0.76, 0.84

Construct (Winter)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2014

364

0.77

0.73, 0.81

Predictive (Winter)

Ethnicity: Asian or Pacific Islander

K

PARCC Math

382

0.68

0.62, 0.73

Construct (Spring)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Fall 2013

378

0.87

0.84, 0.89

Construct (Spring)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Winter 2014

371

0.84

0.81, 0.87

Construct (Spring)

Ethnicity: Asian or Pacific Islander

K

MAP Growth K-2: Spring 2014

380

0.80

0.76, 0.83

Predictive (Spring)

Ethnicity: Asian or Pacific Islander

K

PARCC Math

424

0.73

0.68, 0.77

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Spring 2013

183

0.55

0.44, 0.65

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Fall 2013

177

0.69

0.60, 0.76

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Winter 2014

174

0.64

0.55, 0.72

Construct (Fall)

Ethnicity: Black

K

MAP Growth K-2: Spring 2014

177

0.65

0.55, 0.72

Predictive (Fall)

Ethnicity: Black

K

PARCC Math

244

0.56

0.46, 0.64

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Spring 2013

198

0.72

0.65, 0.78

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Fall 2013

188

0.75

0.67, 0.80

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Winter 2014

190

0.70

0.62, 0.77

Construct (Winter)

Ethnicity: Black

K

MAP Growth K-2: Spring 2014

191

0.67

0.58, 0.74

Predictive (Winter)

Ethnicity: Black

K

PARCC Math

240

0.57

0.48, 0.65

Construct (Spring)

Ethnicity: Black

K

MAP Growth K-2: Fall 2013

198

0.74

0.67, 0.80

Construct (Spring)

Ethnicity: Black

K

MAP Growth K-2: Winter 2014

201

0.74

0.67, 0.80

Construct (Spring)

Ethnicity: Black

K

MAP Growth K-2: Spring 2014

206

0.74

0.67, 0.80

Predictive (Spring)

Ethnicity: Black

K

PARCC Math

344

0.64

0.57, 0.70

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2013

729

0.68

0.64, 0.72

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Fall 2013

716

0.72

0.68, 0.76

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Winter 2014

698

0.72

0.69, 0.76

Construct (Fall)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2014

726

0.68

0.64, 0.72

Predictive (Fall)

Ethnicity: Hispanic

K

PARCC Math

792

0.62

0.58, 0.66

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2013

821

0.79

0.76, 0.81

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Fall 2013

799

0.78

0.75, 0.81

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Winter 2014

784

0.77

0.74, 0.80

Construct (Winter)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2014

807

0.74

0.70, 0.77

Predictive (Winter)

Ethnicity: Hispanic

K

PARCC Math

861

0.67

0.63, 0.70

Construct (Spring)

Ethnicity: Hispanic

K

MAP Growth K-2: Fall 2013

907

0.80

0.77, 0.82

Construct (Spring)

Ethnicity: Hispanic

K

MAP Growth K-2: Winter 2014

894

0.78

0.75, 0.80

Construct (Spring)

Ethnicity: Hispanic

K

MAP Growth K-2: Spring 2014

924

0.75

0.72, 0.78

Predictive (Spring)

Ethnicity: Hispanic

K

PARCC Math

1,063

0.65

0.61, 0.68

Predictive (Spring)

Ethnicity: Multi-Ethnic

K

PARCC Math

187

0.70

0.62, 0.77

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Spring 2013

1,314

0.73

0.70, 0.75

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Fall 2013

1,207

0.74

0.71, 0.76

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Winter 2014

1,103

0.73

0.71, 0.76

Construct (Fall)

Ethnicity: White

K

MAP Growth K-2: Spring 2014

1,302

0.71

0.68, 0.73

Predictive (Fall)

Ethnicity: White

K

PARCC Math

1,390

0.63

0.60, 0.66

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Spring 2013

1,566

0.82

0.80, 0.83

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Fall 2013

1,430

0.80

0.78, 0.82

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Winter 2014

1,383

0.79

0.77, 0.81

Construct (Winter)

Ethnicity: White

K

MAP Growth K-2: Spring 2014

1,529

0.73

0.71, 0.75

Predictive (Winter)

Ethnicity: White

K

PARCC Math

1,614

0.67

0.64, 0.70

Construct (Spring)

Ethnicity: White

K

MAP Growth K-2: Fall 2013

1,648

0.84

0.82, 0.85

Construct (Spring)

Ethnicity: White

K

MAP Growth K-2: Winter 2014

1,549

0.82

0.80, 0.83

Construct (Spring)

Ethnicity: White

K

MAP Growth K-2: Spring 2014

1,843

0.76

0.74, 0.78

Predictive (Spring)

Ethnicity: White

K

PARCC Math

2,232

0.68

0.66, 0.71

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Spring 2013

1,293

0.75

0.72, 0.77

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Fall 2013

1,226

0.78

0.76, 0.80

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Winter 2014

1,165

0.77

0.74, 0.79

Construct (Fall)

Gender: Female

K

MAP Growth K-2: Spring 2014

1,279

0.74

0.72, 0.77

Predictive (Fall)

Gender: Female

K

PARCC Math

1,437

0.68

0.65, 0.70

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Spring 2013

1,532

0.83

0.82, 0.85

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Fall 2013

1,424

0.84

0.83, 0.86

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Winter 2014

1,401

0.82

0.80, 0.84

Construct (Winter)

Gender: Female

K

MAP Growth K-2: Spring 2014

1,485

0.78

0.76, 0.80

Predictive (Winter)

Gender: Female

K

PARCC Math

1,623

0.71

0.69, 0.73

Construct (Spring)

Gender: Female

K

MAP Growth K-2: Fall 2013

1,581

0.86

0.85, 0.87

Construct (Spring)

Gender: Female

K

MAP Growth K-2: Winter 2014

1,537

0.83

0.81, 0.84

Construct (Spring)

Gender: Female

K

MAP Growth K-2: Spring 2014

1,708

0.80

0.78, 0.81

Predictive (Spring)

Gender: Female

K

PARCC Math

2,101

0.72

0.70, 0.74

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Spring 2013

1,427

0.74

0.72, 0.76

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Fall 2013

1,359

0.76

0.74, 0.78

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Winter 2014

1,274

0.76

0.73, 0.78

Construct (Fall)

Gender: Male

K

MAP Growth K-2: Spring 2014

1,410

0.73

0.71, 0.76

Predictive (Fall)

Gender: Male

K

PARCC Math

1,526

0.66

0.63, 0.69

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Spring 2013

1,621

0.82

0.80, 0.83

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Fall 2013

1,533

0.80

0.78, 0.82

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Winter 2014

1,484

0.79

0.77, 0.81

Construct (Winter)

Gender: Male

K

MAP Growth K-2: Spring 2014

1,589

0.75

0.72, 0.77

Predictive (Winter)

Gender: Male

K

PARCC Math

1,689

0.68

0.65, 0.71

Construct (Spring)

Gender: Male

K

MAP Growth K-2: Fall 2013

1,740

0.83

0.81, 0.84

Construct (Spring)

Gender: Male

K

MAP Growth K-2: Winter 2014

1,659

0.82

0.81, 0.84

Construct (Spring)

Gender: Male

K

MAP Growth K-2: Spring 2014

1,851

0.78

0.76, 0.80

Predictive (Spring)

Gender: Male

K

PARCC Math

2,250

0.71

0.69, 0.73

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth K-2: Spring 2014

540

0.85

0.82, 0.87

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Fall 2014

540

0.83

0.80, 0.85

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Winter 2015

446

0.81

0.78, 0.84

Construct (Fall)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Spring 2015

546

0.81

0.78, 0.83

Predictive (Fall)

Ethnicity: Asian or Pacific Islander

1

PARCC Math

559

0.76

0.73, 0.80

Construct (Winter)

Ethnicity: Asian or Pacific Islander

1

MAP Growth K-2: Spring 2014

546

0.89

0.87, 0.91

Construct (Winter)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Fall 2014

534

0.83

0.80, 0.85

Construct (Winter)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Winter 2015

455

0.84

0.81, 0.87

Predictive (Winter)

Ethnicity: Asian or Pacific Islander

1

PARCC Math

536

0.81

0.78, 0.83

Predictive (Winter)

Ethnicity: Asian or Pacific Islander

1

PARCC Math

547

0.76

0.72, 0.79

Construct (Spring)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Fall 2014

557

0.85

0.82, 0.87

Construct (Spring)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Winter 2015

474

0.85

0.82, 0.87

Construct (Spring)

Ethnicity: Asian or Pacific Islander

1

MAP Growth: Spring 2015

560

0.82

0.79, 0.85

Predictive (Spring)

Ethnicity: Asian or Pacific Islander

1

PARCC Math

572

0.77

0.73, 0.80

Construct (Fall)

Ethnicity: Black

1

MAP Growth K-2: Spring 2014

278

0.76

0.71, 0.81

Construct (Fall)

Ethnicity: Black

1

MAP Growth: Fall 2014

261

0.75

0.69, 0.80

Construct (Fall)

Ethnicity: Black

1

MAP Growth: Spring 2015

259

0.68

0.61, 0.74

Predictive (Fall)

Ethnicity: Black

1

PARCC Math

289

0.70

0.63, 0.75

Construct (Winter)

Ethnicity: Black

1

MAP Growth K-2: Spring 2014

277

0.82

0.78, 0.86

Construct (Winter)

Ethnicity: Black

1

MAP Growth: Fall 2014

253

0.80

0.74, 0.84

Predictive (Winter)

Ethnicity: Black

1

PARCC Math

254

0.76

0.70, 0.81

Predictive (Winter)

Ethnicity: Black

1

PARCC Math

281

0.69

0.63, 0.75

Construct (Spring)

Ethnicity: Black

1

MAP Growth: Fall 2014

278

0.80

0.75, 0.84

Construct (Spring)

Ethnicity: Black

1

MAP Growth: Spring 2015

276

0.78

0.73, 0.83

Predictive (Spring)

Ethnicity: Black

1

PARCC Math

309

0.67

0.61, 0.73

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth K-2: Spring 2014

1,172

0.79

0.77, 0.81

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth: Fall 2014

1,015

0.74

0.71, 0.76

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth: Winter 2015

694

0.70

0.66, 0.73

Construct (Fall)

Ethnicity: Hispanic

1

MAP Growth: Spring 2015

1,073

0.68

0.64, 0.71

Predictive (Fall)

Ethnicity: Hispanic

1

PARCC Math

1,194

0.70

0.67, 0.73

Construct (Winter)

Ethnicity: Hispanic

1

MAP Growth K-2: Spring 2014

1,180

0.85

0.84, 0.87

Construct (Winter)

Ethnicity: Hispanic

1

MAP Growth: Fall 2014

1,027

0.77

0.74, 0.79

Construct (Winter)

Ethnicity: Hispanic

1

MAP Growth: Winter 2015

704

0.72

0.68, 0.75

Predictive (Winter)

Ethnicity: Hispanic

1

PARCC Math

1,069

0.71

0.67, 0.73

Predictive (Winter)

Ethnicity: Hispanic

1

PARCC Math

1,196

0.74

0.72, 0.77

Construct (Spring)

Ethnicity: Hispanic

1

MAP Growth: Fall 2014

1,052

0.80

0.78, 0.82

Construct (Spring)

Ethnicity: Hispanic

1

MAP Growth: Winter 2015

712

0.77

0.74, 0.80

Construct (Spring)

Ethnicity: Hispanic

1

MAP Growth: Spring 2015

1,117

0.75

0.73, 0.78

Predictive (Spring)

Ethnicity: Hispanic

1

PARCC Math

1,251

0.76

0.74, 0.78

Construct (Fall)

Ethnicity: Multi-Ethnic

1

MAP Growth K-2: Spring 2014

173

0.79

0.73, 0.84

Construct (Fall)

Ethnicity: Multi-Ethnic

1

MAP Growth: Fall 2014

152

0.76

0.68, 0.82

Construct (Fall)

Ethnicity: Multi-Ethnic

1

MAP Growth: Spring 2015

161

0.75

0.68, 0.81

Predictive (Fall)

Ethnicity: Multi-Ethnic

1

PARCC Math

179

0.68

0.59, 0.75

Construct (Winter)

Ethnicity: Multi-Ethnic

1

MAP Growth K-2: Spring 2014

172

0.87

0.83, 0.90

Predictive (Winter)

Ethnicity: Multi-Ethnic

1

PARCC Math

154

0.80

0.73, 0.85

Predictive (Winter)

Ethnicity: Multi-Ethnic

1

PARCC Math

173

0.76

0.68, 0.81

Construct (Spring)

Ethnicity: Multi-Ethnic

1

MAP Growth: Fall 2014

156

0.83

0.78, 0.88

Construct (Spring)

Ethnicity: Multi-Ethnic

1

MAP Growth: Spring 2015

167

0.79

0.73, 0.84

Predictive (Spring)

Ethnicity: Multi-Ethnic

1

PARCC Math

189

0.75

0.68, 0.81

Construct (Fall)

Ethnicity: White

1

MAP Growth K-2: Spring 2014

2,596

0.79

0.78, 0.80

Construct (Fall)

Ethnicity: White

1

MAP Growth: Fall 2014

2,357

0.75

0.73, 0.76

Construct (Fall)

Ethnicity: White

1

MAP Growth: Winter 2015

1,905

0.74

0.72, 0.76

Construct (Fall)

Ethnicity: White

1

MAP Growth: Spring 2015

2,497

0.73

0.71, 0.75

Predictive (Fall)

Ethnicity: White

1

PARCC Math

2,651

0.71

0.69, 0.73

Construct (Winter)

Ethnicity: White

1

MAP Growth K-2: Spring 2014

2,573

0.85

0.84, 0.86

Construct (Winter)

Ethnicity: White

1

MAP Growth: Fall 2014

2,311

0.78

0.76, 0.79

Construct (Winter)

Ethnicity: White

1

MAP Growth: Winter 2015

1,953

0.79

0.77, 0.80

Predictive (Winter)

Ethnicity: White

1

PARCC Math

2,422

0.78

0.76, 0.79

Predictive (Winter)

Ethnicity: White

1

PARCC Math

2,592

0.74

0.72, 0.76

Construct (Spring)

Ethnicity: White

1

MAP Growth: Fall 2014

2,515

0.79

0.77, 0.80

Construct (Spring)

Ethnicity: White

1

MAP Growth: Winter 2015

2,023

0.80

0.78, 0.81

Construct (Spring)

Ethnicity: White

1

MAP Growth: Spring 2015

2,744

0.80

0.78, 0.81

Predictive (Spring)

Ethnicity: White

1

PARCC Math

2,939

0.76

0.74, 0.77

Construct (Fall)

Gender: Female

1

MAP Growth K-2: Spring 2014

2,394

0.83

0.82, 0.85

Construct (Fall)

Gender: Female

1

MAP Growth: Fall 2014

2,134

0.79

0.78, 0.81

Construct (Fall)

Gender: Female

1

MAP Growth: Winter 2015

1,641

0.79

0.77, 0.80

Construct (Fall)

Gender: Female

1

MAP Growth: Spring 2015

2,238

0.76

0.74, 0.78

Predictive (Fall)

Gender: Female

1

PARCC Math

2,455

0.76

0.74, 0.78

Construct (Winter)

Gender: Female

1

MAP Growth K-2: Spring 2014

2,414

0.88

0.87, 0.89

Construct (Winter)

Gender: Female

1

MAP Growth: Fall 2014

2,128

0.80

0.78, 0.81

Construct (Winter)

Gender: Female

1

MAP Growth: Winter 2015

1,673

0.80

0.78, 0.82

Construct (Winter)

Gender: Female

1

MAP Growth: Spring 2015

2,206

0.78

0.76, 0.80

Predictive (Winter)

Gender: Female

1

PARCC Math

2,436

0.77

0.76, 0.79

Construct (Spring)

Gender: Female

1

MAP Growth: Fall 2014

2,265

0.83

0.81, 0.84

Construct (Spring)

Gender: Female

1

MAP Growth: Winter 2015

1,731

0.82

0.81, 0.84

Construct (Spring)

Gender: Female

1

MAP Growth: Spring 2015

2,416

0.81

0.79, 0.82

Predictive (Spring)

Gender: Female

1

PARCC Math

2,666

0.79

0.78, 0.81

Construct (Fall)

Gender: Male

1

MAP Growth K-2: Spring 2014

2,533

0.81

0.80, 0.83

Construct (Fall)

Gender: Male

1

MAP Growth: Fall 2014

2,277

0.77

0.76, 0.79

Construct (Fall)

Gender: Male

1

MAP Growth: Winter 2015

1,701

0.77

0.75, 0.79

Construct (Fall)

Gender: Male

1

MAP Growth: Spring 2015

2,381

0.76

0.74, 0.77

Predictive (Fall)

Gender: Male

1

PARCC Math

2,595

0.74

0.72, 0.75

Construct (Winter)

Gender: Male

1

MAP Growth K-2: Spring 2014

2,497

0.87

0.86, 0.88

Construct (Winter)

Gender: Male

1

MAP Growth: Fall 2014

2,220

0.81

0.79, 0.82

Construct (Winter)

Gender: Male

1

MAP Growth: Winter 2015

1,727

0.81

0.80, 0.83

Construct (Winter)

Gender: Male

1

MAP Growth: Spring 2015

2,304

0.80

0.78, 0.81

Predictive (Winter)

Gender: Male

1

PARCC Math

2,519

0.77

0.76, 0.79

Construct (Spring)

Gender: Male

1

MAP Growth: Fall 2014

2,386

0.82

0.80, 0.83

Construct (Spring)

Gender: Male

1

MAP Growth: Winter 2015

1,795

0.82

0.80, 0.84

Construct (Spring)

Gender: Male

1

MAP Growth: Spring 2015

2,545

0.82

0.80, 0.83

Predictive (Spring)

Gender: Male

1

PARCC Math

2,788

0.78

0.76, 0.79
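
The source does not state how the confidence intervals in the table above were computed. As a hedged illustration only, the sketch below assumes they are 95% intervals obtained with the Fisher z transformation; the function name `fisher_ci` is hypothetical. Under that assumption it reproduces the interval reported for one row (Grade 1, Hispanic, Construct (Fall), n = 1,172, coefficient 0.79). Because the tabulated coefficients are rounded to two decimals, other rows may differ by 0.01 in a bound.

```python
import math

def fisher_ci(r: float, n: int, z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate two-sided CI for a correlation via Fisher's z (assumed method)."""
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = math.atanh(r)                 # Fisher z transform of the coefficient
    se = 1.0 / math.sqrt(n - 3)       # standard error on the z scale
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

lo, hi = fisher_ci(0.79, 1172)
print(f"{lo:.2f}, {hi:.2f}")  # 0.77, 0.81
```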

 

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format:

Not Provided

Sample Representativeness

Grade: K, 1
Rating: Half-filled bubble (K), Half-filled bubble (1)

Primary Classification Accuracy Sample

Representation

Middle Atlantic, East North Central, South Atlantic, Mountain

Date

Partnership for Assessment of Readiness for College and Careers (PARCC) data were based on students in Grade 3 who took the PARCC assessment during Spring 2016. The Spring 2016 PARCC administration ran from March 2016 through June 2016. MAP Growth K–2 scores from previous years were then obtained for these same students. Their MAP Growth K–2 data from Fall 2012, Winter 2013, and Spring 2013 served as their scores for Grade K, and their data from Fall 2013, Winter 2014, and Spring 2014 served as their scores for Grade 1. Thus, the data used were based on a sample of Grade 3 students taking the PARCC assessment in Spring 2016 and their MAP Growth K–2 scores from previous years, when they were in Grades K, 1, and 2.
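
The back-mapping from the Spring 2016 Grade 3 cohort to earlier testing terms is simple date arithmetic. The sketch below illustrates it under the assumption that every student advanced exactly one grade per school year (no retention or acceleration); `GRADE_ORDER` and `map_terms` are hypothetical names used only for illustration.

```python
GRADE_ORDER = ["K", "1", "2", "3"]

def map_terms(grade: str, outcome_grade: str = "3", outcome_spring_year: int = 2016) -> list[str]:
    """Terms in which the cohort sat in `grade`, assuming one grade per school year."""
    years_back = GRADE_ORDER.index(outcome_grade) - GRADE_ORDER.index(grade)
    spring_year = outcome_spring_year - years_back
    # A school year runs from fall of one calendar year to spring of the next.
    return [f"Fall {spring_year - 1}", f"Winter {spring_year}", f"Spring {spring_year}"]

print(map_terms("K"))  # ['Fall 2012', 'Winter 2013', 'Spring 2013']
print(map_terms("1"))  # ['Fall 2013', 'Winter 2014', 'Spring 2014']
```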

Size

The following table summarizes the total number of students as a function of grade, state, region, and division.

Male: 50.94%
Female: 49.04%
Unknown: 0.02%

Other SES Indicators: Not Provided
Free or reduced-price lunch: Not Provided

White, Non-Hispanic: 40.88%
Black, Non-Hispanic: 6.14%
Hispanic: 23.40%
American Indian/Alaska Native: 2.27%
Asian/Pacific Islander: 9.02%
Multi-Ethnic: 3.33%
Not Specified or Other: 14.96%

Disability classification: Not Provided
First language: Not Provided
Language proficiency status: Not Provided

 

Table 2: Number of Students

State, Region, or Division     N

Number of Students Per State
CO                             3,228
DC                             171
IL                             12,165
NJ                             644
NM                             208
RI                             209
Total                          16,625

Number of Students Per Region
Midwest                        12,165
Northeast                      853
South                          171
West                           3,436
Total                          16,625

Number of Students Per Division
East North Central             12,165
Middle Atlantic                644
Mountain                       3,436
New England                    209
South Atlantic                 171
Total                          16,625

 

Bias Analysis Conducted

Grade: K, 1
Rating: Yes (K), Yes (1)
  1. Description of the method used to determine the presence or absence of bias:

Once tests have been administered and results collected, analysis to detect differential item functioning (DIF) may be conducted. The method NWEA uses to detect DIF is based on the work of Linacre and Wright[1], implemented by Linacre[2]. When executed as part of a Winsteps[3] analysis, this method entails:

  1. Carrying out a joint Rasch analysis of all person-group classifications that anchors all student abilities and item difficulties to a common (theta) scale.
  2. Carrying out a calibration analysis for the Reference group keeping the student ability estimates and scale structure anchored to produce Reference group item difficulty estimates.
  3. Carrying out a calibration analysis for the Focal group keeping the student ability estimates and scale structure anchored to produce Focal group item difficulty estimates.
  4. Computing pair-wise item difficulty differences (Focal group difficulty minus Reference group difficulty). The calibration analyses in steps 2 and 3 are computed for each item, as though all items, except the item currently targeted, are anchored at the joint calibration run (step 1).
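
The steps above can be sketched for a single item as follows. This is a minimal illustration, not the Winsteps implementation: it assumes person abilities from the joint run are already available and held fixed, treats the item as dichotomous under the Rasch model, and solves for each group's item difficulty with Newton-Raphson. The names `anchored_difficulty` and `dif_contrast` and the sample data are hypothetical.

```python
import math

def anchored_difficulty(thetas, responses, tol=1e-8, max_iter=100):
    """Rasch difficulty of one item for one group, with abilities held fixed."""
    score = sum(responses)
    if score == 0 or score == len(responses):
        raise ValueError("extreme score: no finite difficulty estimate")
    b = 0.0
    for _ in range(max_iter):
        probs = [1.0 / (1.0 + math.exp(-(t - b))) for t in thetas]
        residual = score - sum(probs)              # observed minus expected correct
        info = sum(p * (1.0 - p) for p in probs)   # test information at b
        step = max(-1.0, min(1.0, residual / info))  # clamp to keep updates stable
        b -= step                                  # more correct than expected -> easier item
        if abs(step) < tol:
            break
    return b

def dif_contrast(ref, focal):
    """Focal-group difficulty minus Reference-group difficulty, in logits."""
    return anchored_difficulty(*focal) - anchored_difficulty(*ref)

# Hypothetical anchored abilities and 0/1 responses to one item.
ref = ([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5], [0, 0, 1, 1, 1, 1])
focal = ([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5], [0, 0, 0, 1, 1, 1])
print(round(dif_contrast(ref, focal), 3))
```

A positive contrast here means the item was estimated to be harder for the Focal group than for the Reference group at the same ability levels.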

Ideally, analyzing items for DIF would be incorporated within the item calibration process. This can prove to be a useful initial screen to identify items that should be subjected to heightened surveillance for DIF. However, the number of responses to an item by members of demographic groups of interest may well be insufficient to yield stable calibration estimates at the group level. This can introduce statistical artifacts as well as Type I errors into DIF analyses. To avoid this, data for analyses are taken from responses to operational tests.

 

 

  2. Description of the subgroups for which bias analyses were conducted:

Each test record included the student’s recorded ethnic group membership (Native American, Asian, African American, Hispanic, and European/Anglo American).

 

  3. Description of the results of the bias analyses conducted, including data and interpretative statements:

The DIF analysis for MAP Growth K–2 includes mathematics test events that were administered during the Spring and Fall terms of 2010 in six states and were retrieved from the NWEA Growth Research Database (GRD). The six states included Colorado, Illinois, Michigan, New Mexico, South Carolina, and Washington. Each assessment record included the student’s ethnic group membership (Native American, Asian/Pacific Islander, African American, Hispanic, European/Anglo American, and Multi-Racial), the student’s gender, and responses to sixty items administered across two tests per content area. The number of students and the number of test items for each content area are provided in Table 6.

Table 6: Numbers of Students and Test Items Included in the DIF Analysis

Content area   Items   Students   Ethnic Group               % of Students
Mathematics    1,361   242,214    Native American            2.6
                                  Asian/Pacific Islander     4.1
                                  African American           21.2
                                  Hispanic                   16.1
                                  European/Anglo American    53.6
                                  Multi-Racial               2.4

 

Winsteps (version 3.72.0) was used to carry out the analysis. The numbers of items exhibiting DIF for each ethnic focal group are reported in Table 7. Similarly, the numbers of items exhibiting gender-specific DIF are reported in Table 8 for mathematics. The item counts in these tables are based on a minimum of 500 student responses for each group and p < .05, so that each group in the comparison had adequate power to detect DIF and the result was statistically significant. The Educational Testing Service (ETS) delta method of categorizing DIF is included in the tables (ETSClass). The delta method differentiates items by graduated levels of DIF (A, negligible DIF: difference < .43 logits; B, moderate DIF: difference ≥ .43 logits and < .64 logits; C, severe DIF: difference ≥ .64 logits). For categories B and C, there is a further breakdown using “+” (indicating DIF is against the reference group) and “-” (indicating DIF is against the focal group).
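
The classification thresholds described above can be expressed as a small function. This is a sketch under one stated assumption: the DIF contrast is the Focal group difficulty minus the Reference group difficulty, so a positive value means the item is harder for the Focal group and is labeled as DIF against the Focal group ("-"). The source gives the thresholds but not this sign mapping, and `ets_class` is a hypothetical name.

```python
def ets_class(dif_logits: float) -> str:
    """Map a DIF contrast in logits to an ETS class label (A, B-, B+, C-, C+)."""
    size = abs(dif_logits)
    if size < 0.43:
        return "A"                        # negligible DIF
    letter = "B" if size < 0.64 else "C"  # moderate vs. severe DIF
    # Assumed sign convention: positive contrast = harder for the Focal group.
    sign = "-" if dif_logits > 0 else "+"
    return letter + sign

for contrast in (0.10, 0.50, -0.50, 0.70, -0.70):
    print(contrast, ets_class(contrast))
```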

Table 7: Differential Item Functioning for MAP Mathematics Items (N = 1,361)

Reference is European/Anglo Americans

Focal Group               ETSClass*   N Items**   % of Items
Native American           A           47          75.8%
                          B-          7           11.3%
                          B+          1           1.6%
                          C-          5           8.1%
                          C+          2           3.2%
Asian/Pacific Islander    A           78          69.6%
                          B-          14          12.5%
                          B+          13          11.6%
                          C-          3           2.7%
                          C+          4           3.6%
African American          A           420         70.4%
                          B-          61          10.2%
                          B+          73          12.2%
                          C-          33          5.5%
                          C+          10          1.7%
Hispanic                  A           357         73.6%
                          B-          55          11.3%
                          B+          49          10.1%
                          C-          20          4.1%
                          C+          4           0.8%
Multi-Racial              A           5           100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%

Reference is Base Calibration (All Students)

Focal Group               ETSClass*   N Items     % of Items
Native American           A           66          100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%
Asian/Pacific Islander    A           123         100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%
African American          A           628         100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%
Hispanic                  A           506         100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%
Multi-Racial              A           13          100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%
Anglo American            A           608         100.0%
                          B-          0           0.0%
                          B+          0           0.0%
                          C-          0           0.0%
                          C+          0           0.0%

*A=|DIF|<.43 logits; B=.43 logits ≤|DIF|<.64 logits; C=|DIF|≥.64 logits; B- and C- = DIF is against the Focal group; B+ and C+ = DIF is against the Reference group

**The number of items with 500 or more responses from the group where p < .05

 

 

 

Table 8: Differential Item Functioning Related to Gender for MAP Growth by Content Area

ETSClass*   N Items**   % of Items

Mathematics (N = 1,361)
A           643         95.5%
B-          14          2.1%
B+          11          1.6%
C-          4           0.6%
C+          1           0.1%

*A=|DIF|<.43 logits; B=.43 logits ≤|DIF|<.64 logits; C=|DIF|≥.64 logits; B- and C- = DIF is against the Focal group (female); B+ and C+ = DIF is against the Reference group (male)

**The number of items with 500 or more responses from the group where p < .05

 

Table 7 indicates negligible, if any, DIF between the base calibration and each of the focal groups. There is noticeable DIF at both the B and C ETSClass levels when the reference group is Anglo Americans and the focal group is Native Americans. Given the relatively few items in common between these two groups, together with the limited representation of Native American students in the sample, this is of minor concern. DIF between the remaining focal groups and the reference group (Anglo Americans) is fairly consistent at the same ETSClass levels against both focal and reference groups. Table 8 indicates minimal DIF related to gender, with 95.5% of the items at level A (negligible).

Actions taken. All items exhibiting moderate DIF are subjected to an extra review by NWEA Content Specialists to identify the source(s) of differential functioning. For each item, these specialists make a judgment to 1) remove the item from the item bank, 2) revise the item and re-submit it for field-testing, or 3) retain the item as is. Items exhibiting severe DIF are removed from the bank. These procedures are consistent with, and act to extend, the periodic Item Quality Reviews, which remove problem items or flag them for revision and re-field-testing.



[1] Linacre, J. M. & Wright, B. D. (1989). Mantel-Haenszel DIF and PROX are Equivalent. Rasch Measurement Transactions, 3(2), 52-53.

[2] Linacre, J. M. (2012). Winsteps-Ministep: Rasch Model Computer Programs, Version 3.75.0, www.winsteps.com.

[3] Ibid.

 

Administration Format

Grade: K, 1
Data: Individual (K), Individual (1)

Administration & Scoring Time

Grade: K, 1
Data: 45 minutes (K), 45 minutes (1)

Scoring Format

Grade: K, 1
Data: Automatic (K), Automatic (1)

Types of Decision Rules

Grade: K, 1
Data: None (K), None (1)

Evidence Available for Multiple Decision Rules

Grade: K, 1
Data: No (K), No (1)