MAP® Reading Fluency™
Reading

Summary

MAP Reading Fluency is designed specifically for early learners and focuses on foundational reading skills, oral reading fluency, and literal comprehension. The assessment is an easy-to-administer universal screener for students in grades PK–5 that is typically used three times per school year for benchmarking. It can also be used for students in grades 6–8 who require additional reading support. In addition to benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. MAP Reading Fluency delivers an interactive, developmentally appropriate assessment experience that identifies students at risk for reading difficulties, and easy-to-use reports provide actionable data at the individual, class, and district levels.

Unlike traditional measures of oral reading fluency, MAP Reading Fluency includes advanced speech scoring technology that automatically records and consistently scores students’ oral reading, saving hours of instructional time. Students complete the computer adaptive assessment independently in about twenty minutes using a headset with a mounted microphone.

MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The Phonological Awareness and Phonics/Word Recognition domains consist of skill-specific, timed measures of early literacy skills, presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension.

For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface; hand-scoring by educators is permitted. Passages have been equated and scaled so that words correct per minute (WCPM) scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.

Where to Obtain:
NWEA®
proposals@nwea.org
121 NW Everett Street, Portland, OR 97209
(503) 624-1951
www.nwea.org
Initial Cost:
Contact vendor for pricing details.
Replacement Cost:
Contact vendor for pricing details.
Included in Cost:
MAP Reading Fluency assessments require an annual per-student subscription fee. Please visit https://www.nwea.org/contact-us/sales-information/ or call (866) 654-3246 to request a quote. Annual subscription fees include a suite of assessments, scoring and reporting, all assessment software including maintenance and upgrades, support services, and unlimited staff access to NWEA Professional Learning Online. MAP Reading Fluency assessments can be administered three times per school year for benchmarking in grades PK–5; the assessment can also be used for students in grades 6–8 who require reading support. In addition to the benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. The benchmark assessment is also available in Spanish for grades PK–3. Reports provide students’ reading level, their performance compared to grade-level expectations, and suggestions for instructional next steps tailored to each student. A full system of support is provided to enable the success of our partners, including technical support; implementation support through the first test administration; and ongoing, dedicated account management for the duration of the partnership. Subscriptions also include unlimited staff access to the NWEA Professional Learning Online portal, which offers on-demand tutorials, webinars, courses, and videos to supplement professional learning plans and help educators use MAP Reading Fluency to improve teaching and learning. NWEA offers a portfolio of flexible, customizable professional learning and training options, for an additional cost, to meet the needs of our partners. MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. Individual districts can purchase headsets directly through a third party if needed.
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. The Voluntary Product Accessibility Template (VPAT) for MAP Reading Fluency is available from NWEA upon request.
Training Requirements:
1-4 hours of training
Qualified Administrators:
MAP Reading Fluency is group-administered and machine-scored. Proctors should complete online or in-person training offered by NWEA to ensure they are familiar with the test experience and interface and know how to initiate and oversee the testing process.
Access to Technical Support:
Users can obtain support through our Partner Support team via toll-free telephone number, email, and chat; our online Help Center; and a dedicated Account Manager.
Assessment Format:
  • Direct: Computerized
Scoring Time:
  • Scoring is automatic
Scores Generated:
  • Raw score
  • Percentile score
  • IRT-based score
  • Equated
  • Lexile score
  • Composite scores
  • Subscale/subtest scores
Administration Time:
  • 20 minutes per student
Scoring Method:
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection
  • Other technology: MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. System requirements are regularly updated at https://teach.mapnwea.org/impl/QRM2_System_Requirements_QuickRef.pdf.
Accommodations:
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. The Voluntary Product Accessibility Template (VPAT) for MAP Reading Fluency is available from NWEA upon request.

Descriptive Information

Please provide a description of your tool:
MAP Reading Fluency is designed specifically for early learners and focuses on foundational reading skills, oral reading fluency, and literal comprehension. The assessment is an easy-to-administer universal screener for students in grades PK–5 that is typically used three times per school year for benchmarking. It can also be used for students in grades 6–8 who require additional reading support. In addition to benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. MAP Reading Fluency delivers an interactive, developmentally appropriate assessment experience that identifies students at risk for reading difficulties, and easy-to-use reports provide actionable data at the individual, class, and district levels.

Unlike traditional measures of oral reading fluency, MAP Reading Fluency includes advanced speech scoring technology that automatically records and consistently scores students’ oral reading, saving hours of instructional time. Students complete the computer adaptive assessment independently in about twenty minutes using a headset with a mounted microphone.

MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The Phonological Awareness and Phonics/Word Recognition domains consist of skill-specific, timed measures of early literacy skills, presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension.

For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface; hand-scoring by educators is permitted. Passages have been equated and scaled so that words correct per minute (WCPM) scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.
The tool is intended for use with the following grade(s).
selected Preschool / Pre-kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
not selected Sixth grade
not selected Seventh grade
not selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
selected 10 years old
not selected 11 years old
not selected 12 years old
not selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading
Phonological processing:
selected RAN
not selected Memory
selected Awareness
selected Letter sound correspondence
selected Phonics
not selected Structural analysis

Word ID
selected Accuracy
selected Speed

Nonword
not selected Accuracy
selected Speed

Spelling
selected Accuracy
selected Speed

Passage
selected Accuracy
selected Speed

Reading comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
selected Sentence verification
not selected Other (please describe):


Listening comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
selected Sentence verification
selected Vocabulary
not selected Expressive
selected Receptive

Mathematics
Global Indicator of Math Competence
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Early Numeracy
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Concepts
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Computation
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematic Application
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Fractions/Decimals
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Algebra
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Geometry
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

not selected Other (please describe):

Please describe specific domain, skills or subtests:
BEHAVIOR ONLY: Which category of behaviors does your tool target?


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:
Email Address
proposals@nwea.org
Address
121 NW Everett Street, Portland, OR 97209
Phone Number
(503) 624-1951
Website
www.nwea.org
Initial cost for implementing program:
Cost
Unit of cost
Replacement cost per unit for subsequent use:
Cost
Unit of cost
Duration of license
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
MAP Reading Fluency assessments require an annual per-student subscription fee. Please visit https://www.nwea.org/contact-us/sales-information/ or call (866) 654-3246 to request a quote. Annual subscription fees include a suite of assessments, scoring and reporting, all assessment software including maintenance and upgrades, support services, and unlimited staff access to NWEA Professional Learning Online. MAP Reading Fluency assessments can be administered three times per school year for benchmarking in grades PK–5; the assessment can also be used for students in grades 6–8 who require reading support. In addition to the benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. The benchmark assessment is also available in Spanish for grades PK–3. Reports provide students’ reading level, their performance compared to grade-level expectations, and suggestions for instructional next steps tailored to each student. A full system of support is provided to enable the success of our partners, including technical support; implementation support through the first test administration; and ongoing, dedicated account management for the duration of the partnership. Subscriptions also include unlimited staff access to the NWEA Professional Learning Online portal, which offers on-demand tutorials, webinars, courses, and videos to supplement professional learning plans and help educators use MAP Reading Fluency to improve teaching and learning. NWEA offers a portfolio of flexible, customizable professional learning and training options, for an additional cost, to meet the needs of our partners. MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. Individual districts can purchase headsets directly through a third party if needed.
Provide information about special accommodations for students with disabilities.
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. The Voluntary Product Accessibility Template (VPAT) for MAP Reading Fluency is available from NWEA upon request.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected General education teacher
not selected Special education teacher
not selected Parent
not selected Child
not selected External observer
not selected Other
If other, please specify:

What is the administration setting?
not selected Direct observation
not selected Rating scale
not selected Checklist
not selected Performance measure
not selected Questionnaire
selected Direct: Computerized
not selected One-to-one
not selected Other
If other, please specify:

Does the tool require technology?
Yes

If yes, what technology is required to implement your tool? (Select all that apply)
selected Computer or tablet
selected Internet connection
selected Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:
MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. System requirements are regularly updated at https://teach.mapnwea.org/impl/QRM2_System_Requirements_QuickRef.pdf.

What is the administration context?
selected Individual
selected Small group   If small group, n=30
not selected Large group   If large group, n=
selected Computer-administered
selected Other
If other, please specify:
Educators can test a whole class, a group of students, or one student at a time. Tests can be administered in-person or remotely.

What is the administration time?
Time in minutes
20
per (student/group/other unit)
student

Additional scoring time:
Time in minutes
0
per (student/group/other unit)
student

ACADEMIC ONLY: What are the discontinue rules?
selected No discontinue rules provided
not selected Basals
not selected Ceilings
not selected Other
If other, please specify:


Are norms available?
Yes
Are benchmarks available?
Yes
If yes, how many benchmarks per year?
3
If yes, for which months are benchmarks available?
Fall, Winter, Spring
BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
1-4 hours of training
Please describe the minimum qualifications an administrator must possess.
MAP Reading Fluency is group-administered and machine-scored. Proctors should complete online or in-person training offered by NWEA to ensure they are familiar with the test experience and interface and know how to initiate and oversee the testing process.
not selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
Yes
If No, please describe training costs:
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
Users can obtain support through our Partner Support team via toll-free telephone number, email, and chat; our online Help Center; and a dedicated Account Manager.

Scoring

How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
Yes
What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
selected Percentile score
not selected Grade equivalents
selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
not selected Developmental benchmarks
not selected Developmental cut points
selected Equated
not selected Probability
selected Lexile score
not selected Error analysis
selected Composite scores
selected Subscale/subtest scores
not selected Other
If other, please specify:

Does your tool include decision rules?
Yes
If yes, please describe.
Foundational Skills Benchmarks: Students taking the Foundational Skills test are placed into a performance level based on the developmental level of the tasks they can complete; full details are provided in the description of our scoring structure below. Risk Benchmarks: Risk benchmarks were set according to the results of the provided classification accuracy study. The risk cut was set at the 20th percentile of spring MAP Growth Reading RIT scores. A multivariate logistic model was used to obtain estimated probabilities of students scoring in the At Risk category on the spring MAP Growth Reading tests, and students not meeting certain thresholds on these estimated probabilities were deemed At Risk. Full details of the study appear below.
Can you provide evidence in support of multiple decision rules?
Yes
If yes, please describe.
See the description of our scoring structure below for a description of the rationale for the Foundational Skills performance levels.
Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
MAP Reading Fluency assesses a wide range of specific early reading skills, selecting those in and around each student’s zone of proximal development (ZPD) — concepts the student is ready to develop — using a stage-adaptive methodology. Performance-level reporting classifies student performance as exceeding (blue), meeting (green), approaching (yellow), or below (red) grade-level expectations for a given grade and season (Fall, Winter, Spring). As the year progresses, expectation levels rise, and students must demonstrate growth to keep pace with the threshold performance for their grade.

Raw Score Conversion to Performance Levels: Foundational Skills Measures. Foundational skills measures are presented within the Foundational Skills test form, or upon failure to advance to oral reading based on sentence reading criteria. The Foundational Skills test includes measures in the Phonological Awareness, Phonics/Word Recognition, and Language Comprehension domains; the Print Concepts domain is also included for students newer to books and text. Phonological Awareness and Phonics/Word Recognition are assessed with a series of discrete, timed measures, each focusing on a single skill. These measures are presented adaptively based on student responses (i.e., number correct and percent correct), so each student moves through the two progressions according to demonstrated ability. Performance levels are assigned at the level of the entire progression by comparing the observed ZPD to grade-level expectations. The ZPD level is derived from the series of related measures administered from each skill progression; it is highlighted in an onscreen representation of the progression in the Student Report and stated in a narrative in the report’s top summary section.

Raw Score Conversion to Performance Levels: Oral Reading Measures. Students who advance to oral reading are assigned a performance level based on scaled words correct per minute (SWCPM) for each grade and administration, drawn from published national norms (Hasbrouck & Tindal, 2017). Students meet expectations if they read at or above the minimum SWCPM for a given grade and seasonal administration. If a student struggles to understand a grade-level passage, they receive an easier (lower Lexile measure) passage; if they understand the grade-level passage well, they are presented with a more difficult (higher Lexile measure) passage. Passage equating and scaling allow fluency performance to be compared across a range of passage difficulty. A student’s best attempt determines the assigned performance level.

Item Pool. All MAP Reading Fluency items are designed for maximum developmental appropriateness, using engaging character-based audio and colorful graphics. A variety of selected-response formats are used, plus automatic speech scoring. Oral reading passages were developed specifically for oral reading fluency assessment, including basic understanding of what was read. Passages range in difficulty from 180L to 1000L on the Lexile scale to support adaptivity above and below grade level through grade 5. The student reads directly into a headset microphone for picture books and oral reading fluency passages. For selected-response tasks, students see and hear demonstrations by the narrating character, including audio and animation or video, before engaging with the scored items. Selected-response item types include multiple choice (including choose-two and hot-spot formats), click-and-pop simple object-moving formats, and simple constructed response (e.g., building a word from letters).

MAP Reading Fluency includes more than 2,000 items across the following areas:
  • Picture Books for oral reading, speech-scored: 11
  • Oral Reading Passage sets, each with one speech-scored passage and six selected-response comprehension questions: over 170 (over 1,190 items total)
  • Phonological Awareness items, across eight selected-response measures: over 320
  • Phonics and Word Recognition items, across 10 selected-response measures: over 350
  • Language Comprehension items, across two selected-response measures: approximately 71
  • Print Concepts storybooks, each with six selected-response questions: 6 (36 total)
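
The seasonal performance-level logic described above can be illustrated with a short sketch. This is a minimal sketch under stated assumptions: the cut scores below are invented placeholders rather than the operational norm-derived values (which come from the Hasbrouck & Tindal, 2017 norms), and the function and table names are hypothetical, not NWEA’s.

    # Minimal sketch of seasonal performance-level assignment from scaled WCPM.
    # All cut scores are illustrative placeholders, NOT operational values.

    # (grade, season) -> (approaching_min, meeting_min, exceeding_min) in SWCPM
    CUTS = {
        (2, "Fall"):   (35, 55, 90),    # placeholder values
        (2, "Winter"): (45, 65, 100),   # placeholder values
        (2, "Spring"): (60, 80, 115),   # placeholder values
    }

    def performance_level(best_swcpm: float, grade: int, season: str) -> str:
        """Map a student's best scaled-WCPM attempt to a color-coded level."""
        approaching, meeting, exceeding = CUTS[(grade, season)]
        if best_swcpm >= exceeding:
            return "blue (exceeding)"
        if best_swcpm >= meeting:
            return "green (meeting)"
        if best_swcpm >= approaching:
            return "yellow (approaching)"
        return "red (below)"

    print(performance_level(72, 2, "Winter"))   # -> green (meeting)

Note that the placeholder cut scores rise from Fall to Spring, mirroring the point above that expectation levels increase across the year.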
Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Research on early literacy development drove the design of MAP Reading Fluency. Framed by Gough and Tunmer’s Simple View of Reading (1986), the assessment parses decoding and language comprehension factors as separately assessable components until the point at which they come together in students’ reading of connected text. Once students are beginning to develop reading fluency, MAP Reading Fluency assesses oral reading directly using speech scoring technology and direct checks for understanding.

For students not yet reading passages, decoding factors assessed in MAP Reading Fluency include phonological awareness, phonics and word recognition, and, for those at beginning levels of book exposure, print concepts. Phonemic awareness is among the strongest predictors of decoding fluency in English, and phonological skills that precede phoneme-level skills in a developmental continuum are valuable in earlier screening (Anthony & Francis, 2005). For both phonological awareness and phonics and word recognition, MAP Reading Fluency locates the student on a progression of skills and points to tightly aligned instructional steps for that stage, including research-to-practice student instructional materials from the Florida Center for Reading Research. While academic standards typically frame only decoding skills as foundational, research is increasingly clear that a student’s foundation in language comprehension strongly contributes to future reading comprehension, with growing predictive power as decoding fluency consolidates (Foorman et al., 2015). In assessing foundational skills, MAP Reading Fluency includes both vocabulary and sentence listening comprehension.

For students able to read connected text, MAP Reading Fluency assesses oral reading in a group-administered assessment that capitalizes on automatic speech scoring, returning hours of instructional time that a teacher might otherwise spend on one-on-one assessment. While a simple direct measure of words correct per minute (WCPM) is a strong indicator of reading development (Fuchs et al., 2001), research clearly supports a more robust understanding of reading fluency that includes assessment of accuracy, rate, and understanding in the context of variable levels of text difficulty (Valencia et al., 2010). It is particularly important that students are asked to show understanding of what they read aloud, both to convey to students the purpose of reading and to activate factors that aid in the prediction of reading comprehension. As Valencia and Buly (2004) note, students struggling with reading align to more than one profile relevant to instructional next steps. When this is disregarded and all struggling readers are routed to the same generic interventions, screening time and resources are squandered, instructional effectiveness is compromised, and students are left to struggle. Instead, MAP Reading Fluency is designed for individualization: oral reading fluency reporting generates individual Reader Profiles with Next Steps, tailoring these research-based messages to the individual’s particular performance across accuracy, rate, comprehension, and text level.

Test Formats: MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The Phonological Awareness and Phonics/Word Recognition domains consist of skill-specific, timed measures of early literacy skills, presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension. For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface; hand-scoring by educators is permitted. Passages have been equated and scaled so that WCPM scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.

Item Bias and Sensitivity: We are committed to developing engaging, authentic, rigorous, and culturally diverse assessments that effectively measure the full range of the standards. Therefore, it is vital that we address a wide variety of texts in a balanced, respectful way that does not upset, distract, or exclude any student populations. Item and passage writers employ careful consideration and sound judgment while crafting items, considering each item from a variety of angles regarding bias and sensitivity, in accordance with the NWEA Sensitivity, Fairness, and Accessibility Guidelines. To meet our high expectation of fairness to all students, every item and passage is thoroughly examined at multiple points in the development process, undergoing specific bias and sensitivity reviews. Sensitivity in this context means an awareness of the different things that can distract a student during assessment; fairness relates to giving each student an equal opportunity to answer the item correctly based solely on their knowledge of the item content. Any sensitivity or fairness issues found in items or passages are eliminated through revision or rejection of the item during development. Each item or passage is evaluated against a set of criteria and is flagged if it requires prior knowledge other than the skill/concept being assessed; requires construct-irrelevant or specialized knowledge; has cultural, linguistic, socioeconomic, religious, geographic, color-blind, or gender bias; favors students who have no visual impairments; favors students who have no disabilities; inappropriately employs idiomatic English; offensively stereotypes a group of people; mentions body/weight issues; contains inappropriate or sensitive topics; distracts, upsets, or confuses in any way; or has other bias issues.

Our Psychometric Solutions team performs differential item functioning (DIF) analyses to examine the percentage of items in the item pools that exhibit substantial DIF, i.e., C-class DIF (Zwick, 2012). All items revealed as exhibiting C-class DIF are subjected to an extra review by NWEA Content Specialists to identify the source(s) of the differential functioning. For each such item, these specialists make a judgment to remove the item from the item bank, revise the item and resubmit it for field-testing, or retain the item as is. These procedures are consistent with periodic item quality reviews that remove items or flag them for revision; revised items are field-tested again.

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Classification Accuracy Fall: Partially convincing evidence for all grades
Classification Accuracy Winter: Partially convincing evidence for all grades
Classification Accuracy Spring: Convincing evidence for Kindergarten; partially convincing evidence for Grades 1–3
Legend
  • Full Bubble: Convincing evidence
  • Half Bubble: Partially convincing evidence
  • Empty Bubble: Unconvincing evidence
  • Null Bubble: Data unavailable
  • d: Disaggregated data available

MAP® Growth™ Reading

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
The criterion measure was MAP Growth Reading. Although NWEA develops and maintains both MAP Growth and MAP Reading Fluency, they are different assessments built from separate blueprints with no task types or specific subtests in common. MAP Growth Reading assessments measure what students know and inform educators and parents about what they are ready to learn next. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
For our classification analyses, we used screener data from Fall 2018, Winter 2019, and Spring 2019, and criterion data from Spring 2019. The predictive classification analyses involved Fall-to-Spring and Winter-to-Spring predictions. We treated Spring-to-Spring analyses as concurrent evidence.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
The cut scores for at-risk status on the criterion measure were set at the 20th percentile for each grade and term. A multivariate logistic model using MAP Reading Fluency’s item response theory (IRT)-based Foundational Skills domain scores (Phonological Awareness, Phonics/Word Recognition, and Language Comprehension) and raw scores on the Silent Sentence Reading measure was used to estimate each student’s probability of being at risk. The contrast was between high- and low-risk students. Cut points were set not on the individual domain scores and Silent Sentence Reading scores, but on the estimated at-risk probabilities (a common practice in medical research). Cut points were selected to have roughly equal sensitivities and specificities, each equal to or greater than 0.70. GRADES 2-3: By second grade, students routed to the Foundational Skills track tend to be struggling readers, so the base rate in the tested sample can be much higher than the at-risk rate in the general population. To make the tested sample’s risk incidence more similar to the national norms, grade 2 and grade 3 students were selected using stratified random sampling so that 20% of the selected students were at risk on the criterion measure and 80% were not. This method is similar to using poststratification weights to make sample demographics similar to national demographics. Such sampling has only minor effects on model sensitivity and specificity, given that these quantities are independent of the base rate (Krzanowski & Hand, 2009).
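
As a rough illustration of this procedure, the sketch below fits a logistic model to simulated domain and Silent Sentence Reading scores, then searches the ROC curve for a probability cut point where sensitivity and specificity are roughly equal and both at least 0.70. All data, coefficients, and variable names are simulated stand-ins under our own assumptions; this is not NWEA’s analysis code.

    # Illustrative sketch: estimate at-risk probabilities with a logistic
    # model, then pick a probability cut point with roughly equal
    # sensitivity and specificity, both >= 0.70. Simulated data throughout.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(0)
    n = 5000
    X = rng.normal(size=(n, 4))     # 3 Foundational Skills domains + SSR raw
    latent = X @ np.array([1.0, 0.8, 0.5, 0.7]) + rng.normal(size=n)
    y = (latent < -1.2).astype(int)             # 1 = at risk on criterion

    model = LogisticRegression().fit(X, y)
    p_risk = model.predict_proba(X)[:, 1]       # estimated at-risk probabilities

    fpr, tpr, cuts = roc_curve(y, p_risk)
    sens, spec = tpr, 1.0 - fpr
    penalty = np.where((sens >= 0.70) & (spec >= 0.70), 0.0, np.inf)
    best = int(np.argmin(np.abs(sens - spec) + penalty))
    print(f"cut point = {cuts[best]:.3f}, "
          f"sensitivity = {sens[best]:.2f}, specificity = {spec[best]:.2f}")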
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
No
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Our sample comprised students from our regular test-taking population who had taken MAP Reading Fluency Foundational Skills measures and a MAP Growth Reading test during one or more terms in the 2018-2019 school year. Some of these students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students.

Cross-Validation

Has a cross-validation study been conducted?
No
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Classification Accuracy - Fall

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 143 159 173 183
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 413 1388 969 193
Classification Data - False Positive (b) 989 2219 1282 287
Classification Data - False Negative (c) 151 372 312 64
Classification Data - True Negative (d) 2697 8280 3842 741
Area Under the Curve (AUC) 0.82 0.87 0.83 0.82
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.80 0.86 0.82 0.79
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.83 0.87 0.84 0.84
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.13 0.14 0.20 0.20
Overall Classification Rate 0.73 0.79 0.75 0.73
Sensitivity 0.73 0.79 0.76 0.75
Specificity 0.73 0.79 0.75 0.72
False Positive Rate 0.27 0.21 0.25 0.28
False Negative Rate 0.27 0.21 0.24 0.25
Positive Predictive Power 0.29 0.38 0.43 0.40
Negative Predictive Power 0.95 0.96 0.92 0.92
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2018-2019 2018-2019 2018-2019 2018-2019
Sample Size 4250 12259 6405 1285
Geographic Representation
Kindergarten: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (MA, ME)
Pacific (AK, CA, WA)
South Atlantic (FL, GA, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 1: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, WA)
South Atlantic (DE, FL, GA, MD, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 2: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, WA)
South Atlantic (DE, FL, GA, NC, SC, VA)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 3: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, MT, NV)
New England (MA, ME, VT)
Pacific (AK, CA, WA)
South Atlantic (FL, GA, NC, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Male 51.3% 49.5% 50.6% 53.3%
Female 48.5% 50.4% 49.4% 46.7%
Other        
Gender Unknown 0.1% 0.0% 0.0%  
White, Non-Hispanic 51.1% 52.0% 47.3% 38.7%
Black, Non-Hispanic 16.2% 15.9% 19.0% 21.7%
Hispanic 17.4% 12.9% 14.0% 17.0%
Asian/Pacific Islander     1.4% 0.7%
American Indian/Alaska Native 1.7% 4.6% 6.0% 11.8%
Other     2.7% 2.7%
Race / Ethnicity Unknown 11.1% 12.6% 9.7% 7.3%
Low SES 15.3% 12.2% 11.8% 9.0%
IEP or diagnosed disability        
English Language Learner        
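
For readers checking the arithmetic, the summary statistics in the table above follow directly from the 2x2 classification counts. The short sketch below recomputes the Fall kindergarten column from its reported cell counts (a = 413, b = 989, c = 151, d = 2697) and matches the published values after rounding.

    # Recompute the Fall kindergarten statistics from the 2x2 counts above.
    a, b, c, d = 413, 989, 151, 2697            # TP, FP, FN, TN
    n = a + b + c + d                           # 4250, the reported sample size

    stats = {
        "Base Rate":                   (a + c) / n,   # 0.13
        "Overall Classification Rate": (a + d) / n,   # 0.73
        "Sensitivity":                 a / (a + c),   # 0.73
        "Specificity":                 d / (b + d),   # 0.73
        "False Positive Rate":         b / (b + d),   # 0.27
        "False Negative Rate":         c / (a + c),   # 0.27
        "Positive Predictive Power":   a / (a + b),   # 0.29
        "Negative Predictive Power":   d / (c + d),   # 0.95
    }
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")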

Classification Accuracy - Winter

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 143 159 173 183
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 25 1825 646 115
Classification Data - False Positive (b) 104 2154 870 167
Classification Data - False Negative (c) 10 491 217 43
Classification Data - True Negative (d) 940 7920 2583 465
Area Under the Curve (AUC) 0.87 0.87 0.82 0.79
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.79 0.86 0.81 0.76
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.94 0.88 0.84 0.83
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.03 0.19 0.20 0.20
Overall Classification Rate 0.89 0.79 0.75 0.73
Sensitivity 0.71 0.79 0.75 0.73
Specificity 0.90 0.79 0.75 0.74
False Positive Rate 0.10 0.21 0.25 0.26
False Negative Rate 0.29 0.21 0.25 0.27
Positive Predictive Power 0.19 0.46 0.43 0.41
Negative Predictive Power 0.99 0.94 0.92 0.92
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2018-2019 2018-2019 2018-2019 2018-2019
Sample Size 1079 12390 4316 790
Geographic Representation
Kindergarten: East North Central (IL, IN, MI, OH, WI)
East South Central (MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, MT, NV)
New England (MA, ME)
Pacific (CA, WA)
South Atlantic (FL, GA, MD, NC, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 1: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 2: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC, VA)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 3: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR)
South Atlantic (DC, FL, GA, MD, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Male 49.5% 50.7% 52.6% 56.6%
Female 50.5% 49.3% 47.3% 43.3%
Other        
Gender Unknown   0.0% 0.1% 0.1%
White, Non-Hispanic 57.2% 49.5% 45.3% 39.7%
Black, Non-Hispanic 14.5% 15.9% 21.4% 20.8%
Hispanic 11.6% 15.5% 15.0% 19.7%
Asian/Pacific Islander     1.7% 1.1%
American Indian/Alaska Native 2.2% 4.1% 5.1% 7.8%
Other     3.5% 3.4%
Race / Ethnicity Unknown 13.2% 12.9% 8.0% 7.3%
Low SES 17.9% 16.9% 13.4% 15.1%
IEP or diagnosed disability        
English Language Learner        

Classification Accuracy - Spring

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 143 159 173 183
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 1677 2654 568 122
Classification Data - False Positive (b) 2408 2178 806 162
Classification Data - False Negative (c) 393 722 209 42
Classification Data - True Negative (d) 10308 8032 2302 493
Area Under the Curve (AUC) 0.89 0.87 0.82 0.81
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.88 0.86 0.80 0.78
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.90 0.87 0.83 0.85
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.14 0.25 0.20 0.20
Overall Classification Rate 0.81 0.79 0.74 0.75
Sensitivity 0.81 0.79 0.73 0.74
Specificity 0.81 0.79 0.74 0.75
False Positive Rate 0.19 0.21 0.26 0.25
False Negative Rate 0.19 0.21 0.27 0.26
Positive Predictive Power 0.41 0.55 0.41 0.43
Negative Predictive Power 0.96 0.92 0.92 0.92
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2018-2019 2018-2019 2018-2019 2018-2019
Sample Size 14786 13586 3885 819
Geographic Representation
Kindergarten: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, FL, GA, MD, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 1: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 2: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC, VA)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 3: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, GA, MD, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Male 50.4% 50.8% 51.3% 53.7%
Female 49.6% 49.1% 48.6% 46.3%
Other        
Gender Unknown 0.1% 0.1% 0.1%  
White, Non-Hispanic 47.7% 45.2% 43.7% 37.9%
Black, Non-Hispanic 15.9% 18.6% 22.2% 21.2%
Hispanic 17.2% 15.9% 15.7% 20.5%
Asian/Pacific Islander     1.7% 0.9%
American Indian/Alaska Native 3.0% 4.5% 4.5% 6.7%
Other     3.5% 3.7%
Race / Ethnicity Unknown 14.0% 13.8% 8.6% 9.2%
Low SES 17.7% 16.5% 15.3% 14.2%
IEP or diagnosed disability        
English Language Learner        

Reliability

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Rating: Convincing evidence for all grades
Legend
  • Full Bubble: Convincing evidence
  • Half Bubble: Partially convincing evidence
  • Empty Bubble: Unconvincing evidence
  • Null Bubble: Data unavailable
  • d: Disaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
We submitted marginal reliabilities for each of the Foundational Skills domain scores: Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The measures underlying these domain scores come from a multi-stage testing environment and are mostly speeded, conditions that make popular internal consistency methods such as coefficient alpha less appropriate. Given extreme within-grade range restriction in scores, marginal reliabilities for the Silent Sentence Reading measure were spuriously low, so we provided test-retest reliabilities in place of internal consistency as the required second type of reliability for that measure. Silent Sentence Reading is a speeded measure, so the resulting coefficients are likely affected by the speeded format.
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Approximately 112,000 of our MAP Reading Fluency test takers in grades K–3 provided Foundational Skills data for the marginal reliability analyses. Data were collected in the Fall, Winter, and Spring terms of the 2018-2019 school year. The sample was approximately 39% White, 20% Hispanic, 18% African American, 4% Asian, and 2% Native American. Forty-seven U.S. states and territories were represented. Males were slightly overrepresented (52% vs. 48%). All students who take a MAP Reading Fluency benchmark test take the Silent Sentence Reading measure. Therefore, our sample for the internal consistency analyses was larger than that for marginal reliability. Data for these analyses included scores from over 120,000 students. The demographic composition of the sample was similar to that of the marginal reliability sample except that proportions of male and female students were nearly equal. The test-retest students completed a second MAP Reading Fluency test session. Their Local Education Agencies received a financial incentive to participate. Slightly over 2,000 students participated during the Fall 2018 term, and nearly 3,000 students participated in the Winter 2019 term. Fewer than 100 students’ data were available for the Spring 2019 term.
*Describe the analysis procedures for each reported type of reliability.
Rasch item difficulties were estimated in Spring 2019 on sets of linear forms in order to avoid distortions in item difficulty estimation that can arise in computer adaptive testing and multi-stage testing data. Operational data from Fall 2018, Winter 2019, and Spring 2019 were scored with these item difficulty estimates. Marginal reliabilities (MR) were estimated separately for each domain, grade, and semester combination as

    MR = (σ²(θ_T) − μ(σ²(θ_e))) / σ²(θ_T),

where σ²(θ_T) denotes the total variance of a set of domain scores and μ(σ²(θ_e)) denotes the mean of the error variance of these domain scores. The 95 percent confidence interval for each coefficient was obtained by repeating the estimation over 1,000 bootstrap samples. The marginal reliability estimates for the Foundational Skills domain scores also reflect calibration sampling error, i.e., the reduction in reliability due to using item parameter estimates gathered from a different sample. Confidence intervals for the test-retest correlations were obtained via the Fisher z-transformation (Fisher, 1921). A maximum test-retest interval of 14 days was allowed.
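
The marginal reliability computation and bootstrap interval can be sketched compactly. The scores and conditional standard errors below are simulated stand-ins for one domain-by-grade-by-term cell, not NWEA data; this sketch also omits the calibration sampling error component mentioned above.

    # Marginal reliability: MR = (var(theta) - mean(se^2)) / var(theta),
    # with a 95% CI from 1,000 bootstrap resamples. Simulated inputs.
    import numpy as np

    rng = np.random.default_rng(42)
    theta = rng.normal(0.0, 1.0, size=2000)     # domain scores (logits)
    se = rng.uniform(0.3, 0.5, size=2000)       # conditional SEMs

    def marginal_reliability(theta, se):
        return (np.var(theta) - np.mean(se ** 2)) / np.var(theta)

    mr = marginal_reliability(theta, se)
    boot = []
    for _ in range(1000):
        idx = rng.integers(0, len(theta), size=len(theta))
        boot.append(marginal_reliability(theta[idx], se[idx]))
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"MR = {mr:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")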

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Validity

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Rating: Unconvincing evidence for all grades
Legend
  • Full Bubble: Convincing evidence
  • Half Bubble: Partially convincing evidence
  • Empty Bubble: Unconvincing evidence
  • Null Bubble: Data unavailable
  • d: Disaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
The criterion measure was MAP Growth Reading. For the predictive evidence, criterion scores were from Spring 2019. Although NWEA develops and maintains both MAP Growth and MAP Reading Fluency, they are different assessments built from separate blueprints with no task types or specific subtests in common. MAP Growth Reading assessments measure what students know and inform educators and parents about what they are ready to learn next. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
A total of 320,352 test records from the 2018-2019 school year were available for the study. The greatest number of records were from Spring 2019. Data were from 45 states covering all nine census geographic regions. Region coverage ranged from 3,254 students in the East South Central division to 46,620 students in the South Atlantic division. The gender and racial/ethnic composition of the sample was similar to that in the marginal reliability study, which was approximately 39% White, 20% Hispanic, 18% African American, 4% Asian, and 2% Native American. Males were slightly overrepresented (52% vs. 48%).
*Describe the analysis procedures for each reported type of validity.
Both concurrent and predictive evidence are supplied for the Silent Sentence Reading measure and the Foundational Skills domain scores: Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. MAP Growth Reading RIT scores were the criterion measures. Within-semester Pearson correlations were run by grade for each term. Confidence intervals were constructed using the Fisher z-transformation (Fisher, 1921).
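
For reference, the Fisher z-transformed interval used for these correlations can be computed as below. The r and n values are hypothetical, chosen only to show the mechanics.

    # 95% CI for a Pearson correlation via the Fisher z-transformation
    # (Fisher, 1921). The inputs are hypothetical.
    import math

    def fisher_ci(r: float, n: int, z_crit: float = 1.96):
        z = math.atanh(r)               # z = 0.5 * ln((1 + r) / (1 - r))
        se = 1.0 / math.sqrt(n - 3)
        return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

    lo, hi = fisher_ci(r=0.62, n=4000)
    print(f"r = 0.62, 95% CI [{lo:.3f}, {hi:.3f}]")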

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Each of the Foundational Skills domains and the Silent Sentence Reading measure covers a different aspect of reading, whereas MAP Growth Reading is a general reading achievement test. One would not expect the correlations of the Foundational Skills and Silent Sentence Reading scores with MAP Growth Reading to be as large as those between two general reading achievement tests or between two specific tests of a single Foundational Skills domain (e.g., two Phonological Awareness tests). That said, of the lower limits for the 48 concurrent correlations, nine were greater than or equal to 0.60; 20 were greater than or equal to 0.50 but less than 0.60; 11 were greater than or equal to 0.40 but less than 0.50; and three were greater than or equal to 0.30 but less than 0.40. Correlations less than 0.30 involved either Silent Sentence Reading in kindergarten or Language Comprehension in third grade. Of the lower limits for the respective 32 predictive correlations, three were greater than or equal to 0.60; 15 were greater than or equal to 0.50 but less than 0.60; eight were greater than or equal to 0.40 but less than 0.50; and three were greater than or equal to 0.30 but less than 0.40. The other three correlations involved Language Comprehension scores in grade 1. It is likely that semester-to-semester growth lowered the predictive correlations relative to the concurrent evidence. Correlations for kindergarten Silent Sentence Reading are included for the sake of transparency: performance on this measure tends to be volatile in kindergarten, and kindergartners are not yet expected to be able to read sentences.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
Provide citations for additional published studies.

Bias Analysis

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Rating: Provided for all grades
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
Yes
If yes,
a. Describe the method used to determine the presence or absence of bias:
Differential item functioning (DIF) refers to the extent to which students of equal ability do not have an equal probability of answering particular items correctly. NWEA used Spring 2019 Foundational Skills data for a round of item response theory-based DIF analyses, conducted with Winsteps version 4.0 (Linacre, n.d.). Winsteps fixes student ability estimates and then allows item difficulty estimates to be freely estimated for each group in question. The results are categorized based on the Educational Testing Service (ETS) method of classifying DIF (Zwick, 2012). This method differentiates items exhibiting negligible DIF (Category A) from those exhibiting moderate DIF (Category B) and severe DIF (Category C). Categories B and C are further labeled “+” (DIF favors the focal group) or “-” (DIF favors the reference group). Typically, only items that fall in the “DIF C-class” require further investigation. Statistical tests for DIF are not corrected for multiple comparisons. Additionally, DIF statistics require an overall ability estimate that is largely unaffected by DIF in order to provide evidence of the presence of DIF. Analyses were conducted across each grade level.
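
The ETS A/B/C scheme referenced above can be illustrated with the logit-scale thresholds commonly used in Winsteps-style Rasch DIF work (roughly 0.43 and 0.64 logits, the logit equivalents of the ETS delta-scale rules). These thresholds and the sign convention below are assumptions for illustration, not NWEA’s exact implementation.

    # Hedged sketch of ETS-style DIF classification (Zwick, 2012) applied to
    # a Rasch DIF contrast. Thresholds (0.43 / 0.64 logits) and the sign
    # convention (positive taken to favor the focal group, matching the
    # "+"/"-" labels above) are our assumptions for illustration.
    def ets_dif_class(dif_logits: float, p_value: float) -> str:
        size = abs(dif_logits)
        sign = "+" if dif_logits > 0 else "-"
        if size >= 0.64 and p_value < 0.05:
            return "C" + sign            # severe DIF: flag for content review
        if size >= 0.43 and p_value < 0.05:
            return "B" + sign            # moderate DIF
        return "A"                       # negligible DIF

    print(ets_dif_class(0.71, 0.003))    # -> C+
    print(ets_dif_class(0.30, 0.200))    # -> A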
b. Describe the subgroups for which bias analyses were conducted:
DIF analyses were conducted by ethnic group (White, Native American, Asian, African American, Hispanic) and gender (male, female). White serves as reference group in the DIF analysis based on ethnic group, and male serves as reference group in the DIF analysis based on gender.
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Fewer than two percent of the DIF comparisons resulted in C-Class DIF, which is less than would be expected by chance. Five of these comparisons involved five items from a now-retired measure. Content experts reviewed all other items showing C-Class DIF.

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.