MAP® Reading Fluency™
Reading

Summary

MAP Reading Fluency is designed specifically for early learners and focuses on foundational reading skills, oral reading fluency, and literal comprehension. The assessment is an easy-to-administer universal screener for students in grades Pre-K–5 that is typically used three times per school year for benchmarking. It can also be used for students in grades 6–8 who require additional reading support. In addition to the benchmark assessment, MAP Reading Fluency includes progress monitoring for Foundational Skills and Oral Reading. MAP Reading Fluency delivers an interactive, developmentally appropriate assessment experience that identifies students at risk for reading difficulties. Easy-to-use reports provide actionable data at the individual, class, and district level.
Unlike traditional measures of oral reading fluency, MAP Reading Fluency includes advanced speech scoring technology that automatically records and consistently scores students’ oral reading, saving hours of instructional time. Students complete the computer adaptive assessment independently in about 20 minutes using a headset with a mounted microphone.
MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. Phonological Awareness and Phonics & Word Recognition domains consist of skill-specific, timed measures of early literacy skills. These measures are presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension. For students who are able to read connected text, the Oral Reading test measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency.
Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface. Hand-scoring by educators is permitted. Passages have been equated and scaled so that words correct per minute (WCPM) scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.

Where to Obtain:
NWEA®
proposals@nwea.org
121 NW Everett Street, Portland, OR 97209
(503) 624-1951
www.nwea.org
Initial Cost:
Contact vendor for pricing details.
Replacement Cost:
Contact vendor for pricing details.
Included in Cost:
MAP Reading Fluency assessments require an annual per-student subscription fee. Please visit https://www.nwea.org/contact-us/sales-information or call (866) 654-3246 to request a quote. Annual subscription fees include a suite of assessments, scoring and reporting, and all assessment software including maintenance and upgrades. We provide a full system of support to enable the success of our partners, including technical support; implementation support; and dedicated account management for the length of the partnership.
MAP Reading Fluency assessments can be administered three times per school year for benchmarking in grades Pre-K–5. With speech-recognition technology, group administration, and automatic scoring, MAP Reading Fluency provides a clear view of early literacy skills and learning needs for an entire class in about 20 minutes. Districts can monitor oral reading fluency, literal comprehension, and foundational reading skills from season to season and year to year. The assessment quickly screens students at risk of reading difficulty, including characteristics of dyslexia, and provides precise, reliable insights to support early readers while maximizing valuable instructional time. The benchmark assessment is also available in Spanish for grades Pre-K–3.
We require new partners to purchase two professional learning sessions: MAP Reading Fluency Basics and Essential Reports. The workshops cover topics educators need to get started with MAP Reading Fluency and information about accessing, interpreting, and applying MAP Reading Fluency data to inform instruction. MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. Districts can purchase headsets directly through a third party if needed.
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. We strive to maximize the validity of our assessments for the greatest number of students. At the heart of our efforts is our dedication to providing assessments that are adaptable to a combination of unique learning needs, easily perceived, and clear to each student. We actively conduct research and have taken critical steps in contributing to the field of accessibility and universal design. MAP Reading Fluency includes and supports universal features, designated features, and accommodations, including noise buffer, amplification, color contrast, magnification device, separate setting, and scribe.
Training Requirements:
1-4 hours of training
Qualified Administrators:
MAP Reading Fluency is group-administered and machine-scored. Proctors should complete online or in-person training offered by NWEA to ensure they are familiar with the test experience and interface and know how to initiate and oversee the testing process.
Access to Technical Support:
Users can obtain support through our Product Support team via toll-free telephone number, email, and chat; our online Help Center; and a dedicated Account Manager.
Assessment Format:
  • Direct: Computerized
Scoring Time:
  • Scoring is automatic (0 minutes per student)
Scores Generated:
  • Raw score
  • Percentile score
  • IRT-based score
  • Equated
  • Lexile score
  • Composite scores
  • Subscale/subtest scores
Administration Time:
  • 20 minutes per student
Scoring Method:
  • Automatically (computer-scored)
  • Other: Educators have access to every student recording, can review the student reading sample at any point, and can choose to hand-score the recorded readings.
Technology Requirements:
  • Computer or tablet
  • Internet connection
  • Other technology: MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. System requirements are regularly updated at https://teach.mapnwea.org/impl/QRM2_System_Requirements_QuickRef.pdf.
Accommodations:
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. We strive to maximize the validity of our assessments for the greatest number of students. At the heart of our efforts is our dedication to providing assessments that are adaptable to a combination of unique learning needs, easily perceived, and clear to each student. We actively conduct research and have taken critical steps in contributing to the field of accessibility and universal design. MAP Reading Fluency includes and supports universal features, designated features, and accommodations, including noise buffer, amplification, color contrast, magnification device, separate setting, and scribe.

Descriptive Information

Please provide a description of your tool:
MAP Reading Fluency is designed specifically for early learners and focuses on foundational reading skills, oral reading fluency, and literal comprehension. The assessment is an easy-to-administer universal screener for students in grades Pre-K–5 that is typically used three times per school year for benchmarking. It can also be used for students in grades 6–8 who require additional reading support. In addition to the benchmark assessment, MAP Reading Fluency includes progress monitoring for Foundational Skills and Oral Reading. MAP Reading Fluency delivers an interactive, developmentally appropriate assessment experience that identifies students at risk for reading difficulties. Easy-to-use reports provide actionable data at the individual, class, and district level.
Unlike traditional measures of oral reading fluency, MAP Reading Fluency includes advanced speech scoring technology that automatically records and consistently scores students’ oral reading, saving hours of instructional time. Students complete the computer adaptive assessment independently in about 20 minutes using a headset with a mounted microphone.
MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. Phonological Awareness and Phonics & Word Recognition domains consist of skill-specific, timed measures of early literacy skills. These measures are presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension. For students who are able to read connected text, the Oral Reading test measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency.
Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface. Hand-scoring by educators is permitted. Passages have been equated and scaled so that words correct per minute (WCPM) scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.
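The words correct per minute metric mentioned above has a standard definition: words read correctly, normalized to a one-minute rate. A minimal sketch of that calculation follows; the function and variable names are illustrative, and the passage equating/scaling that produces MAP Reading Fluency's scaled WCPM is not modeled here.

```python
def words_correct_per_minute(words_correct: int, seconds_elapsed: float) -> float:
    """Standard WCPM: words read correctly divided by elapsed time in minutes.

    Note: MAP Reading Fluency additionally equates and scales passages so
    scores are comparable across passage difficulty; that step is omitted.
    """
    if seconds_elapsed <= 0:
        raise ValueError("elapsed time must be positive")
    return words_correct / (seconds_elapsed / 60.0)

# Example: 47 words read correctly in 36 seconds
print(round(words_correct_per_minute(47, 36), 1))  # → 78.3
```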
The tool is intended for use with the following grade(s).
selected Preschool / Pre-kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
selected Sixth grade
selected Seventh grade
selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
selected 10 years old
selected 11 years old
selected 12 years old
selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading
Phonological processing:
selected RAN
not selected Memory
selected Awareness
selected Letter sound correspondence
selected Phonics
not selected Structural analysis

Word ID
selected Accuracy
selected Speed

Nonword
not selected Accuracy
selected Speed

Spelling
selected Accuracy
selected Speed

Passage
selected Accuracy
selected Speed

Reading comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
selected Sentence verification
not selected Other (please describe):


Listening comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
selected Sentence verification
selected Vocabulary
not selected Expressive
selected Receptive

Mathematics
Global Indicator of Math Competence
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Early Numeracy
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Concepts
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Computation
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Application
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Fractions/Decimals
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Algebra
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Geometry
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

not selected Other (please describe):

Please describe specific domain, skills or subtests:
BEHAVIOR ONLY: Which category of behaviors does your tool target?


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:
Email Address
proposals@nwea.org
Address
121 NW Everett Street, Portland, OR 97209
Phone Number
(503) 624-1951
Website
www.nwea.org
Initial cost for implementing program:
Cost
Unit of cost
Student
Replacement cost per unit for subsequent use:
Cost
Unit of cost
Student
Duration of license
Year
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
MAP Reading Fluency assessments require an annual per-student subscription fee. Please visit https://www.nwea.org/contact-us/sales-information or call (866) 654-3246 to request a quote. Annual subscription fees include a suite of assessments, scoring and reporting, and all assessment software including maintenance and upgrades. We provide a full system of support to enable the success of our partners, including technical support; implementation support; and dedicated account management for the length of the partnership.
MAP Reading Fluency assessments can be administered three times per school year for benchmarking in grades Pre-K–5. With speech-recognition technology, group administration, and automatic scoring, MAP Reading Fluency provides a clear view of early literacy skills and learning needs for an entire class in about 20 minutes. Districts can monitor oral reading fluency, literal comprehension, and foundational reading skills from season to season and year to year. The assessment quickly screens students at risk of reading difficulty, including characteristics of dyslexia, and provides precise, reliable insights to support early readers while maximizing valuable instructional time. The benchmark assessment is also available in Spanish for grades Pre-K–3.
We require new partners to purchase two professional learning sessions: MAP Reading Fluency Basics and Essential Reports. The workshops cover topics educators need to get started with MAP Reading Fluency and information about accessing, interpreting, and applying MAP Reading Fluency data to inform instruction. MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. Districts can purchase headsets directly through a third party if needed.
Provide information about special accommodations for students with disabilities.
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. We strive to maximize the validity of our assessments for the greatest number of students. At the heart of our efforts is our dedication to providing assessments that are adaptable to a combination of unique learning needs, easily perceived, and clear to each student. We actively conduct research and have taken critical steps in contributing to the field of accessibility and universal design. MAP Reading Fluency includes and supports universal features, designated features, and accommodations, including noise buffer, amplification, color contrast, magnification device, separate setting, and scribe.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected General education teacher
not selected Special education teacher
not selected Parent
not selected Child
not selected External observer
not selected Other
If other, please specify:

What is the administration setting?
not selected Direct observation
not selected Rating scale
not selected Checklist
not selected Performance measure
not selected Questionnaire
selected Direct: Computerized
not selected One-to-one
not selected Other
If other, please specify:

Does the tool require technology?
Yes

If yes, what technology is required to implement your tool? (Select all that apply)
selected Computer or tablet
selected Internet connection
selected Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:
MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. System requirements are regularly updated at https://teach.mapnwea.org/impl/QRM2_System_Requirements_QuickRef.pdf.

What is the administration context?
selected Individual
selected Small group   If small group, n=30
not selected Large group   If large group, n=
selected Computer-administered
selected Other
If other, please specify:
Educators can test a whole class, a group of students, or one student at a time. Tests can be administered in-person or remotely.

What is the administration time?
Time in minutes
20
per (student/group/other unit)
student

Additional scoring time:
Time in minutes
0
per (student/group/other unit)
student

ACADEMIC ONLY: What are the discontinue rules?
selected No discontinue rules provided
not selected Basals
not selected Ceilings
not selected Other
If other, please specify:


Are norms available?
Yes
Are benchmarks available?
Yes
If yes, how many benchmarks per year?
3
If yes, for which months are benchmarks available?
Fall, Winter, Spring
BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
1-4 hours of training
Please describe the minimum qualifications an administrator must possess.
MAP Reading Fluency is group-administered and machine-scored. Proctors should complete online or in-person training offered by NWEA to ensure they are familiar with the test experience and interface and know how to initiate and oversee the testing process.
not selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
Yes
If No, please describe training costs:
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
Users can obtain support through our Product Support team via toll-free telephone number, email, and chat; our online Help Center; and a dedicated Account Manager.

Scoring

How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
selected Other
If other, please specify:
Educators have access to every student recording, can review the student reading sample at any point, and can choose to hand-score the recorded readings.

Do you provide basis for calculating performance level scores?
Yes
What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
selected Percentile score
not selected Grade equivalents
selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
not selected Developmental benchmarks
not selected Developmental cut points
selected Equated
not selected Probability
selected Lexile score
not selected Error analysis
selected Composite scores
selected Subscale/subtest scores
not selected Other
If other, please specify:

Does your tool include decision rules?
Yes
If yes, please describe.
MAP Reading Fluency benchmark tests provide a Flagged status when student performance shows risk factors for possible reading difficulties. The At-Risk threshold is based on predetermined cut scores. For Foundational Skills, we developed a multivariate predictive model using each of the Foundational Skills domain scores: Phonological Awareness, Phonics & Word Recognition (including Sentence Reading Fluency), and Language Comprehension. NWEA set the At-Risk threshold at the 10th percentile of MAP Growth™ Reading by grade for the spring term. Model cut points were set to achieve both sensitivity and specificity greater than 0.70, and were set on predicted probabilities rather than on scores from individual measures. For Oral Reading, students’ scaled words correct per minute (SWCPM) is characterized relative to seasonal norms (Hasbrouck and Tindal, 2017); performance below the 25th percentile is flagged as the universal screener outcome.
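The decision rule described above (weighted domain scores combined into a predicted probability, with the flag set when that probability crosses a cut point) can be sketched as follows. The weights, intercept, and cut point here are invented for illustration; NWEA's actual model parameters are proprietary and vary by grade and season.

```python
import math

# Illustrative parameters only; the real model's weights, intercept, and
# cut points are proprietary and differ by grade and season.
EXAMPLE_WEIGHTS = {
    "phonological_awareness": -0.020,
    "phonics_word_recognition": -0.035,
    "language_comprehension": -0.015,
}
EXAMPLE_INTERCEPT = 6.0
EXAMPLE_CUT_POINT = 0.5  # in practice, tuned so sensitivity and specificity > 0.70

def risk_probability(domain_scores: dict) -> float:
    """Logistic combination of domain scores into a predicted probability of
    falling below the 10th percentile on the end-of-year reading outcome."""
    z = EXAMPLE_INTERCEPT + sum(
        EXAMPLE_WEIGHTS[d] * s for d, s in domain_scores.items()
    )
    return 1.0 / (1.0 + math.exp(-z))

def is_flagged(domain_scores: dict) -> bool:
    """Flag on the probability scale, not on any single measure's score."""
    return risk_probability(domain_scores) >= EXAMPLE_CUT_POINT

student = {
    "phonological_awareness": 110,
    "phonics_word_recognition": 95,
    "language_comprehension": 105,
}
print(is_flagged(student))  # → False (predicted risk ≈ 0.25 for these scores)
```

The design point this illustrates: because the cut sits on the combined predicted probability, a weak score in one domain can be offset by strength in another, which is why cut points are not set on individual measures.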
Can you provide evidence in support of multiple decision rules?
Yes
If yes, please describe.
Please refer to the description of our scoring structure below for the rationale behind the Foundational Skills reporting.
Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
MAP Reading Fluency presents educators with different types of data that are important for informing multiple decisions within their school.
Screener Outcomes: The presence of a screener outcome flag indicates that possible future reading difficulty is currently predicted, in the absence of intervention. The flags signal which students may need additional resource allocation and intervention. A predictive model is used to determine which students are flagged; flagged students tend to perform below the 10th percentile on a general reading outcome at the end of the year without increased intervention. The predictive model draws from the student’s performance in each of the following: Phonological Awareness; Phonics & Word Recognition, including Sentence Reading Fluency; and Language Comprehension (comprising Listening Comprehension and Picture Vocabulary). Each model weights and combines scores from these multiple measures to form a best prediction of risk. The predictive weighting of the measures, or domains, varies by grade and season.
Domain Scores and User Norms: Each Foundational Skills domain score quantifies the student’s overall performance in one full domain. There are three Foundational Skills domains: Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. Domain scores enable educators to compare one student with another; percentiles, based on user norms, are an effective way to gauge how a student is performing among their peers. Domain scores also enable educators to compare a student’s performance against their own past performance: when a student’s overall phonological awareness improves, for example, their Phonological Awareness domain score increases. Change in the three-digit domain score shows student growth over time, and individual progress monitoring tracks shorter-term growth on a domain score.
The domain score, and the percentile associated with it from user norms, supports norm-referenced decisions about student performance and growth in a full domain. The domain score is determined by item difficulty parameters for every item the student was presented in the whole domain.
Zone of Proximal Development and Performance Levels: MAP Reading Fluency assesses a range of specific early reading skills, selecting those in and around each student’s ZPD using a stage-adaptive methodology. Performance-level reporting classifies students as Exceeding, Meeting, Approaching, or Below grade-level expectation for a given season (Fall, Winter, Spring). As the year progresses, expectation levels rise, and students must demonstrate growth to keep pace with threshold performance for their grade. Performance levels are shown throughout MAP Reading Fluency using a four-color coding scheme.
Raw Score Conversion to Performance Levels (Foundational Skills Measures): Foundational skill measures in MAP Reading Fluency are presented within the Foundational Skills test form, or upon failure to advance to Oral Reading based on sentence reading criteria. The Foundational Skills test includes measures in the Phonological Awareness, Phonics & Word Recognition, and Language Comprehension domains; the Print Concepts domain is also included for students newer to books and text. Phonological Awareness and Phonics & Word Recognition are assessed with a series of discrete, timed measures, each focusing on a single skill. These measures are presented adaptively based on student responses (i.e., number correct and percent correct), and each student moves through the two progressions based on their demonstrated ability. Performance levels are assigned at the level of the entire progression by comparing the observed ZPD to grade-level expectations; ZPD levels are derived from the series of related measures administered from each skill progression.
The ZPD level is highlighted in an onscreen representation of the progression, which is presented in the Student Report, and is stated in a narrative in the top summary section of the report.
Raw Score Conversion to Performance Levels (Oral Reading Measures): Students who advance to oral reading are assigned a performance level based on scaled words correct per minute (SWCPM) for each grade and administration, drawn from published national norms (Hasbrouck and Tindal, 2017). Blue, green, yellow, and red color coding is applied in score reports based on the quartile in which a student’s score falls. Students meet expectations if they read the minimum SWCPM for a given grade and seasonal administration. If students struggle to understand a grade-level passage, they receive an easier (lower Lexile measure) passage; if they understand the grade-level passage well, they are presented with a more difficult (higher Lexile measure) passage. Passage equating and scaling allows fluency performance to be compared across a range of passage difficulty. A student’s best attempt determines their assigned performance level. Educators have access to every student recording; they can review the student reading sample at any point and may also choose to hand-score the recorded readings.
Item Pool: All MAP Reading Fluency items are designed for maximum developmental appropriateness, using engaging character-based audio and colorful graphics. A variety of selected-response formats are used, plus automatic speech scoring. Oral reading passages were developed for the purpose of oral reading fluency assessment, including basic understanding of what was read. Passages range in difficulty from 180L to 1000L on the Lexile scale to support adaptivity above and below grade level through grade 5. The student reads directly into a headset microphone for picture books and oral reading fluency passages.
For selected-response tasks, students see and hear demonstrations by the narrating character, including audio and animation or video, before engaging with the scored items. Selected-response item types include multiple choice (including choose-two and hot spot formats), click-and-pop simple object-moving formats, and simple constructed response (e.g., building a word from letters). MAP Reading Fluency includes more than 2,385 items across the following areas:
  • Picture Books for oral reading, speech-scored: 10
  • Oral Reading Passage sets, each with one speech-scored passage and six selected-response comprehension questions: over 170 (over 1,190 items total)
  • Phonological Awareness items, across eight selected-response measures: over 413
  • Phonics & Word Recognition items, across nine selected-response measures: over 605
  • Language Comprehension items, across two selected-response measures: approximately 131
  • Print Concepts storybooks, each with six selected-response questions: 6 (36 items total)
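The quartile-band performance levels for oral reading described above can be sketched as a simple lookup against seasonal norm thresholds. The norm values below are placeholders, not the actual Hasbrouck and Tindal (2017) table, and the mapping of colors to quartiles (red lowest, blue highest) is my assumption based on the flagging rule for scores below the 25th percentile.

```python
# Placeholder 25th/50th/75th-percentile WCPM thresholds for one grade and
# season; the real values come from Hasbrouck & Tindal (2017) norms.
EXAMPLE_NORMS = {"p25": 59, "p50": 84, "p75": 109}

def performance_level(swcpm: float, norms: dict) -> str:
    """Map a scaled-WCPM score onto four color-coded quartile bands.
    Scores below the 25th percentile are also flagged as at risk."""
    if swcpm < norms["p25"]:
        return "red (below 25th percentile; flagged)"
    if swcpm < norms["p50"]:
        return "yellow (25th-49th percentile)"
    if swcpm < norms["p75"]:
        return "green (50th-74th percentile)"
    return "blue (75th percentile and above)"

print(performance_level(72, EXAMPLE_NORMS))  # prints the yellow-band label
```

Because passages are equated and scaled, the same threshold table can be applied regardless of which passage a student happened to read.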
Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Research on early literacy development drove the design of MAP Reading Fluency. Framed by Gough and Tunmer’s Simple View of Reading (1986), the assessment parses decoding and language comprehension factors as separately assessable components until a point at which these come together in each student’s reading of connected text. Once students are beginning to develop reading fluency, MAP Reading Fluency assesses oral reading directly using speech scoring technology and direct checks for understanding. For students not yet reading passages, decoding factors assessed in MAP Reading Fluency include phonological awareness, phonics and word recognition, and, for those at beginning levels of book exposure, print concepts. Phonemic awareness is among the strongest predictors of decoding fluency in English, and phonological skills that precede phoneme-level skills in a developmental continuum are valuable in earlier screening (Anthony & Francis, 2005). For both Phonological Awareness and Phonics & Word Recognition, MAP Reading Fluency locates the student on a progression of skills and points to tightly aligned instructional steps for that stage, including research-to-practice student instructional materials. While academic standards typically frame only decoding skills as foundational, research is increasingly clear that a student’s foundation in language comprehension strongly contributes to future reading comprehension, with growing predictive power as decoding fluency consolidates (Foorman et al., 2015). In assessing foundational skills, MAP Reading Fluency includes both vocabulary and sentence listening comprehension. For students able to read connected text, MAP Reading Fluency assesses oral reading in a group-administered assessment that capitalizes on automatic speech scoring. This returns hours of instructional time that a teacher might otherwise spend on one-on-one assessment.
While a simple direct measure of words correct per minute (WCPM) is a strong indicator of reading development (Fuchs et al., 2001), research clearly supports a more robust understanding of reading fluency as including assessment of accuracy, rate, and understanding in the context of variable levels of text difficulty (Valencia et al., 2010). It is particularly important that students are asked to show understanding of what they read aloud, both to convey to students the purpose of reading and to activate factors that aid in prediction of reading comprehension. As Valencia & Buly (2004) note, students struggling with reading align to more than one profile relevant to instructional next steps. When this is disregarded and all struggling readers are routed to the same generic interventions, screening time and resources are squandered, instructional effectiveness is compromised, and students are left to struggle. Instead, MAP Reading Fluency is designed for individualization. Oral reading fluency reporting generates individual Reader Profiles with Next Steps, tailoring these research-based messages to the individual’s particular performance across accuracy, rate, comprehension, and text level.
Test Formats: MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. Phonological Awareness and Phonics & Word Recognition domains consist of skill-specific, timed measures of early literacy skills. These measures are presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension.
For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured using passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface. Hand-scoring by educators is permitted. Passages have been equated and scaled so that WCPM scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.

Item Bias and Sensitivity

We are committed to developing engaging, authentic, rigorous, and culturally diverse assessments that effectively measure the full range of the standards. Therefore, it is vital that we address a wide variety of texts in a balanced, respectful way that does not upset, distract, or exclude any student population. Item and passage writers employ careful consideration and sound judgment while crafting items, considering each item from a variety of angles regarding bias and sensitivity, in accordance with the NWEA Sensitivity, Fairness, and Accessibility Guidelines. To meet our high expectation of fairness to all students, every item and passage is thoroughly examined at multiple points in the development process, undergoing specific bias and sensitivity reviews. Sensitivity in this context means an awareness of the different things that can distract a student during assessment. Fairness in this context relates to giving each student an equal opportunity to answer the item correctly based solely on their knowledge of the item content.
Any sensitivity and fairness issues found in items or passages are eliminated in revision or rejection of the item during development. Each item or passage is evaluated against a set of criteria and is flagged if it requires prior knowledge other than the skill/concept being assessed; requires construct-irrelevant or specialized knowledge; has cultural bias; has linguistic bias; has socioeconomic bias; has religious bias; has geographic bias; has color-blind bias; has gender bias; favors students who have no visual impairments; favors students who have no disabilities; inappropriately employs idiomatic English; offensively stereotypes a group of people; mentions body/weight issues; contains inappropriate or sensitive topics; distracts, upsets, or confuses in any way; or has other bias issues. Our Psychometric Solutions team performs differential item functioning (DIF) analyses to examine the percentages of items that exhibit substantial DIF in the item pools, i.e., C-class DIF (Zwick, 2012).
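The Mantel-Haenszel procedure that typically underlies such DIF classifications can be sketched briefly. The counts below are invented for illustration only, and the statistical significance condition that the ETS C classification also requires is omitted for brevity:

```python
import numpy as np

# Hypothetical 2x2xK item-response counts, stratified by total-score level.
# A_k, B_k = reference-group correct/incorrect; C_k, D_k = focal-group
# correct/incorrect within stratum k. These numbers are made up.
A = np.array([40, 55, 70, 85])
B = np.array([60, 45, 30, 15])
C = np.array([30, 45, 62, 80])
D = np.array([70, 55, 38, 20])
N = A + B + C + D

# Mantel-Haenszel common odds ratio and the ETS delta metric
alpha_mh = np.sum(A * D / N) / np.sum(B * C / N)
delta_mh = -2.35 * np.log(alpha_mh)

# ETS rules of thumb (significance test omitted): |delta| < 1.0 -> A
# (negligible DIF), 1.0 <= |delta| < 1.5 -> B (moderate), >= 1.5 -> C (large)
if abs(delta_mh) >= 1.5:
    category = "C"
elif abs(delta_mh) >= 1.0:
    category = "B"
else:
    category = "A"
print(f"MH delta = {delta_mh:.2f}, DIF category {category}")
```

With these invented counts the delta is small, so the item would be classed as negligible (A-class) DIF rather than C-class.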

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3
Classification Accuracy Fall | Partially convincing evidence | Convincing evidence | Partially convincing evidence | Convincing evidence
Classification Accuracy Winter | Convincing evidence | Convincing evidence | Convincing evidence | Convincing evidence
Classification Accuracy Spring | Convincing evidence | Convincing evidence | Convincing evidence | Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available

MAP® Growth™ Reading

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
The criterion measure was MAP Growth Reading. Although NWEA develops and maintains both MAP Reading Fluency and MAP Growth, they are different assessments built from separate blueprints and item pools, with no task types or specific subtests in common. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The screener outcome and categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction. MAP Growth Reading assessments measure student performance and growth in a subject. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
The cut scores for at-risk on the criterion measure were set at the 10th percentile for each grade and term. A multivariate model using MAP Reading Fluency’s item response theory (IRT)-based Foundational Skills domain scores and raw scores on the Sentence Reading Fluency measure was used to estimate the probability of students being at-risk. The Foundational Skills domain scores comprise Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. The grouping was between high- and low-risk students. Cut points were set not on the individual domain scores and Sentence Reading Fluency scores, but rather on the estimated at-risk probabilities. Cut points were selected to have roughly equal sensitivities and specificities equal to or greater than 0.70. GRADES 2–3: By second grade, students still taking the Foundational Skills test tend to be struggling readers. Therefore, the base rate in the tested sample can become much higher than what the at-risk designation in the general population would suggest. To ensure that the analysis sample better represents the general population and composition of students in Grades 2–3, a resampling procedure was implemented. Specifically, post-stratification sampling was applied to align the base rate of the analysis sample with that of the national population. To ensure that the tested sample’s risk incidence more accurately reflects the national norms, Grades 2–3 students were sampled using post-stratification weights so that 10% of the selected students were at-risk on the criterion measure, while the remaining 90% were not. This sampling procedure is commonly used to improve the precision of estimates by reducing known discrepancies between the analysis sample and the larger population (Lohr, 2021; Little, 1993). Notably, this approach had only negligible effects on model sensitivity and specificity, as these metrics are independent of the base rate (Krzanowski & Hand, 2009).
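The cut-point selection rule described above can be sketched as follows. The probabilities and labels are simulated for illustration, not NWEA scores, and the grid search is one plausible way to operationalize "roughly equal sensitivities and specificities of at least 0.70":

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 1 = at-risk on the criterion (bottom decile), plus a
# model-estimated at-risk probability for each student.
y = rng.binomial(1, 0.10, size=5000)
p = np.where(y == 1,
             rng.normal(0.60, 0.18, 5000),
             rng.normal(0.28, 0.18, 5000)).clip(0, 1)

def sens_spec(labels, probs, cut):
    pred = probs >= cut
    sens = pred[labels == 1].mean()      # true-positive rate
    spec = (~pred)[labels == 0].mean()   # true-negative rate
    return sens, spec

# Sweep candidate cut points on the estimated probabilities, keep those
# with sensitivity and specificity >= 0.70, then pick the cut where the
# two are most nearly equal.
cuts = np.linspace(0.05, 0.95, 181)
feasible = [(c, *sens_spec(y, p, c)) for c in cuts]
feasible = [t for t in feasible if t[1] >= 0.70 and t[2] >= 0.70]
cut, sens, spec = min(feasible, key=lambda t: abs(t[1] - t[2]))
print(f"cut = {cut:.2f}, sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Setting the threshold on estimated probabilities rather than on the component scores lets a single cut point combine evidence from all domain scores at once.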
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
No
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Our sample comprised students from our regular test-taking population who were administered MAP Reading Fluency Foundational Skills measures and a MAP Growth Reading test during one or more terms in the 2020–2021 and 2021–2022 school years. Some of these students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students.

Cross-Validation

Has a cross-validation study been conducted?
Yes
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
The criterion measure was MAP Growth Reading. Although NWEA develops and maintains both MAP Reading Fluency and MAP Growth, they are different assessments built from separate blueprints and item pools, with no task types or specific subtests in common. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The screener outcome and categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction. MAP Growth Reading assessments measure student performance and growth in a subject. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
The cut scores for at-risk on the criterion measure were set at the 10th percentile for each grade and term. A multivariate model using MAP Reading Fluency’s item response theory (IRT)-based Foundational Skills domain scores and raw scores on the Sentence Reading Fluency measure was used to estimate the probability of students being at-risk. The Foundational Skills domain scores comprise Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. The grouping was between high- and low-risk students. Cut points were set not on the individual domain scores and Sentence Reading Fluency scores, but rather on the estimated at-risk probabilities. Cut points were selected to have roughly equal sensitivities and specificities equal to or greater than 0.70. Within each stratification, we split the data into two equal folds using randomization, which ensures that the proportions of the at-risk and non-risk classes on the criterion measure were approximately preserved in the resulting halves. Similar geographical representation and demographic distribution were also observed across both folds. To validate the results, one data fold was used for the main classification study and the other for the cross-validation study. In the classification sample, students’ Foundational Skills domain scores and raw scores on the Sentence Reading Fluency measure were used to predict the at-risk probabilities of their Spring MAP Growth Reading outcome. The estimated coefficients from the multivariate model and the selected probability cut points were applied to the validation sample to predict the students’ Spring at-risk status on the criterion measure. The classification rules and cut points applied were identical in the classification study and the cross-validation study. GRADES 2–3: By second grade, students still taking the Foundational Skills test tend to be struggling readers. Therefore, the base rate in the tested sample can become much higher than what the at-risk designation in the general population would suggest. To ensure that the analysis sample better represents the general population and composition of students in Grades 2–3, a resampling procedure was implemented. Specifically, post-stratification sampling was applied to align the base rate of the analysis sample with that of the national population. To ensure that the tested sample’s risk incidence more accurately reflects the national norms, Grades 2–3 students were sampled using post-stratification weights so that 10% of the selected students were at-risk on the criterion measure, while the remaining 90% were not. This sampling procedure is commonly used to improve the precision of estimates by reducing known discrepancies between the analysis sample and the larger population (Lohr, 2021; Little, 1993). Notably, this approach had only negligible effects on model sensitivity and specificity, as these metrics are independent of the base rate (Krzanowski & Hand, 2009).
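A minimal sketch of the stratified two-fold split, assuming a hypothetical sample with the 10% at-risk base rate described above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical labels: 500 at-risk (1) and 4,500 not-at-risk (0) students.
labels = rng.permutation(np.r_[np.ones(500, int), np.zeros(4500, int)])

# Stratified two-fold split: shuffle within each class, then send half of
# each class to each fold, so both folds preserve the at-risk proportion.
def stratified_halves(y, rng):
    fold = np.empty(len(y), int)
    for cls in (0, 1):
        idx = rng.permutation(np.flatnonzero(y == cls))
        fold[idx[: len(idx) // 2]] = 0
        fold[idx[len(idx) // 2 :]] = 1
    return fold

fold = stratified_halves(labels, rng)
for f in (0, 1):
    print(f"fold {f}: n = {np.sum(fold == f)}, "
          f"base rate = {labels[fold == f].mean():.3f}")
```

One fold then plays the role of the classification sample (model fitting and cut-point selection) and the other the validation sample, with the fitted rule applied unchanged.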
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
No
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Our sample comprised students from our regular test-taking population who were administered MAP Reading Fluency Foundational Skills measures and a MAP Growth Reading test during one or more terms in the 2021–2022 school year. Some of these students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students.

Classification Accuracy - Fall

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 10 10 10 10
Cut Points - Performance score on criterion measure 121 140 153 165
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 676 1863 332 302
Classification Data - False Positive (b) 2445 2818 797 672
Classification Data - False Negative (c) 229 436 100 71
Classification Data - True Negative (d) 7326 12455 3050 2662
Area Under the Curve (AUC) 0.82 0.89 0.87 0.89
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.80 0.88 0.85 0.87
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.83 0.90 0.88 0.90
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.08 0.13 0.10 0.10
Overall Classification Rate 0.75 0.81 0.79 0.80
Sensitivity 0.75 0.81 0.77 0.81
Specificity 0.75 0.82 0.79 0.80
False Positive Rate 0.25 0.18 0.21 0.20
False Negative Rate 0.25 0.19 0.23 0.19
Positive Predictive Power 0.22 0.40 0.29 0.31
Negative Predictive Power 0.97 0.97 0.97 0.97
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2020-2022 2020-2022 2020-2022 2020-2022
Sample Size 10676 17572 4279 3707
Geographic Representation
Kindergarten:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA)
Grade 1:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA, OK, TX)
Grade 2:
East North Central (IL, MI, OH, WI)
East South Central (AL, MS)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, TX)
Grade 3:
East North Central (IL)
East South Central (AL, KY, MS, TN)
Middle Atlantic (PA)
Mountain (CO, MT, NM, NV)
New England (CT, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, GA, MD)
West North Central (IA, MN, SD)
West South Central (AR, OK)
Male 49.9% 50.6% 50.8% 51.8%
Female 50.1% 49.4% 49.2% 48.2%
Other        
Gender Unknown        
White, Non-Hispanic 41.3% 42.9% 32.3% 43.2%
Black, Non-Hispanic 19.2% 15.8% 24.9% 17.9%
Hispanic 28.0% 26.6% 27.0% 24.7%
Asian/Pacific Islander 3.3% 2.7% 2.5% 1.6%
American Indian/Alaska Native 1.2% 1.4% 1.6% 1.9%
Other 3.9% 4.8% 4.6% 3.4%
Race / Ethnicity Unknown 3.2% 5.7% 7.1% 7.2%
Low SES 0.4% 1.3% 1.0% 3.2%
IEP or diagnosed disability 0.6% 1.0% 1.2% 2.6%
English Language Learner 1.9% 2.4% 2.8% 5.3%
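The summary statistics in the tables above follow directly from the 2×2 classification counts. As a check, the Fall Kindergarten column can be reproduced from its table entries:

```python
# Recomputing the Fall Kindergarten statistics from the 2x2 classification
# table (TP = 676, FP = 2445, FN = 229, TN = 7326).
a, b, c, d = 676, 2445, 229, 7326   # TP, FP, FN, TN
n = a + b + c + d

base_rate   = (a + c) / n           # proportion at-risk on the criterion
overall     = (a + d) / n           # overall classification rate
sensitivity = a / (a + c)
specificity = d / (b + d)
ppv         = a / (a + b)           # positive predictive power
npv         = d / (c + d)           # negative predictive power

print(round(base_rate, 2), round(overall, 2), round(sensitivity, 2),
      round(specificity, 2), round(ppv, 2), round(npv, 2))
# -> 0.08 0.75 0.75 0.75 0.22 0.97
```

The low positive predictive power alongside high negative predictive power is typical when the base rate is near 10%: most flagged students are false positives, but very few at-risk students are missed.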

Classification Accuracy - Winter

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 10 10 10 10
Cut Points - Performance score on criterion measure 131 149 162 173
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 1018 1801 232 183
Classification Data - False Positive (b) 2410 2148 527 372
Classification Data - False Negative (c) 252 430 55 33
Classification Data - True Negative (d) 10293 10472 2095 1539
Area Under the Curve (AUC) 0.88 0.90 0.90 0.90
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.87 0.89 0.88 0.88
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.89 0.91 0.91 0.92
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.09 0.15 0.10 0.10
Overall Classification Rate 0.81 0.83 0.80 0.81
Sensitivity 0.80 0.81 0.81 0.85
Specificity 0.81 0.83 0.80 0.81
False Positive Rate 0.19 0.17 0.20 0.19
False Negative Rate 0.20 0.19 0.19 0.15
Positive Predictive Power 0.30 0.46 0.31 0.33
Negative Predictive Power 0.98 0.96 0.97 0.98
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2020-2022 2020-2022 2020-2022 2020-2022
Sample Size 13973 14851 2909 2127
Geographic Representation
Kindergarten:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR)
Grade 1:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ, NY)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, TX)
Grade 2:
East North Central (IL, MI, OH, WI)
East South Central (AL)
Middle Atlantic (NJ)
Mountain (MT, NM)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (KS, MN, ND, SD)
West South Central (AR, TX)
Grade 3:
East North Central (IL)
East South Central (AL, KY, MS, TN)
Middle Atlantic (PA)
Mountain (CO, MT, NV)
New England (CT, NH)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, GA, MD)
West North Central (IA, MN, SD)
West South Central (AR)
Male 50.6% 50.5% 49.3% 52.6%
Female 49.4% 49.5% 50.7% 47.4%
Other        
Gender Unknown        
White, Non-Hispanic 45.0% 43.7% 34.8% 44.5%
Black, Non-Hispanic 16.6% 16.7% 28.1% 18.5%
Hispanic 25.9% 25.9% 24.3% 24.4%
Asian/Pacific Islander 2.7% 2.4% 1.8% 1.0%
American Indian/Alaska Native 1.0% 1.2% 1.6% 1.3%
Other 3.8% 4.4% 3.8% 3.2%
Race / Ethnicity Unknown 5.0% 5.7% 5.6% 7.0%
Low SES 1.4% 1.9% 1.0% 1.9%
IEP or diagnosed disability 0.8% 1.1% 1.2% 2.9%
English Language Learner 1.9% 2.3% 3.4% 5.8%

Classification Accuracy - Spring

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 10 10 10 10
Cut Points - Performance score on criterion measure 138 153 166 176
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 1270 2232 215 112
Classification Data - False Positive (b) 2358 1890 465 253
Classification Data - False Negative (c) 249 537 35 26
Classification Data - True Negative (d) 10449 9615 1807 1019
Area Under the Curve (AUC) 0.90 0.90 0.91 0.88
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.90 0.90 0.89 0.86
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.91 0.91 0.92 0.91
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.11 0.19 0.10 0.10
Overall Classification Rate 0.82 0.83 0.80 0.80
Sensitivity 0.84 0.81 0.86 0.81
Specificity 0.82 0.84 0.80 0.80
False Positive Rate 0.18 0.16 0.20 0.20
False Negative Rate 0.16 0.19 0.14 0.19
Positive Predictive Power 0.35 0.54 0.32 0.31
Negative Predictive Power 0.98 0.95 0.98 0.98
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2020-2022 2020-2022 2020-2022 2020-2022
Sample Size 14326 14274 2522 1410
Geographic Representation
Kindergarten:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ, PA)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA)
Grade 1:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA, TX)
Grade 2:
East North Central (IL, MI, OH, WI)
East South Central (AL)
Middle Atlantic (NJ)
Mountain (MT, NM)
New England (CT, MA, ME, NH, RI, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, TX)
Grade 3:
East North Central (IL, IN)
East South Central (AL, KY, MS, TN)
Middle Atlantic (NY)
Mountain (CO, MT, NV)
New England (NH)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, GA, MD)
West North Central (IA, MN, SD)
West South Central (AR, OK)
Male 50.2% 50.8% 51.9% 50.9%
Female 49.8% 49.2% 48.1% 49.1%
Other        
Gender Unknown   0.0%    
White, Non-Hispanic 44.3% 41.2% 32.4% 40.0%
Black, Non-Hispanic 17.7% 18.3% 28.8% 20.4%
Hispanic 22.9% 25.4% 22.8% 24.9%
Asian/Pacific Islander 3.5% 2.4% 1.8% 1.0%
American Indian/Alaska Native 1.5% 1.9% 1.6% 1.1%
Other 4.8% 4.8% 5.1% 2.7%
Race / Ethnicity Unknown 5.3% 6.1% 7.6% 10.0%
Low SES 2.9% 3.0% 1.0% 3.6%
IEP or diagnosed disability 1.1% 1.6% 1.4% 3.4%
English Language Learner 2.5% 3.0% 2.7% 6.7%

Cross-Validation - Fall

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 10 10 10 10
Cut Points - Performance score on criterion measure 121 140 153 165
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 691 1811 329 295
Classification Data - False Positive (b) 2418 2784 762 700
Classification Data - False Negative (c) 265 428 86 67
Classification Data - True Negative (d) 7302 12550 2985 2570
Area Under the Curve (AUC) 0.80 0.89 0.87 0.88
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.79 0.88 0.86 0.86
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.82 0.90 0.89 0.90
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.09 0.13 0.10 0.10
Overall Classification Rate 0.75 0.82 0.80 0.79
Sensitivity 0.72 0.81 0.79 0.81
Specificity 0.75 0.82 0.80 0.79
False Positive Rate 0.25 0.18 0.20 0.21
False Negative Rate 0.28 0.19 0.21 0.19
Positive Predictive Power 0.22 0.39 0.30 0.30
Negative Predictive Power 0.96 0.97 0.97 0.97
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2020-2022 2020-2022 2020-2022 2020-2022
Sample Size 10676 17573 4162 3632
Geographic Representation
Kindergarten:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, NH, VT)
Pacific (AK, CA, HI, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA, OK)
Grade 1:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA, TX)
Grade 2:
East North Central (IL, MI, OH, WI)
East South Central (AL, MS)
Middle Atlantic (NJ)
Mountain (MT, NM)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, OK, TX)
Grade 3:
East North Central (IL)
East South Central (AL, KY, MS, TN)
Middle Atlantic (PA)
Mountain (CO, MT, NV)
New England (CT, NH, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, GA, MD)
West North Central (IA, MN, SD)
West South Central (AR, LA, OK)
Male 50.4% 50.8% 50.8% 51.3%
Female 49.6% 49.2% 49.2% 48.7%
Other        
Gender Unknown   0.0%    
White, Non-Hispanic 41.3% 43.1% 33.7% 42.3%
Black, Non-Hispanic 18.6% 15.7% 24.5% 18.3%
Hispanic 28.9% 27.1% 26.0% 25.7%
Asian/Pacific Islander 3.8% 2.9% 2.4% 1.3%
American Indian/Alaska Native 1.0% 1.4% 1.4% 2.2%
Other 3.5% 4.3% 4.8% 3.3%
Race / Ethnicity Unknown 2.8% 5.6% 7.2% 6.8%
Low SES 0.4% 1.5% 1.0% 3.3%
IEP or diagnosed disability 0.5% 1.1% 1.5% 2.4%
English Language Learner 2.4% 2.3% 3.1% 5.5%

Cross-Validation - Winter

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 10 10 10 10
Cut Points - Performance score on criterion measure 131 149 162 173
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 956 1801 227 181
Classification Data - False Positive (b) 2540 2043 537 443
Classification Data - False Negative (c) 284 402 63 40
Classification Data - True Negative (d) 10193 10605 2091 1549
Area Under the Curve (AUC) 0.86 0.91 0.87 0.88
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.85 0.90 0.85 0.86
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.87 0.91 0.89 0.90
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.09 0.15 0.10 0.10
Overall Classification Rate 0.80 0.84 0.79 0.78
Sensitivity 0.77 0.82 0.78 0.82
Specificity 0.80 0.84 0.80 0.78
False Positive Rate 0.20 0.16 0.20 0.22
False Negative Rate 0.23 0.18 0.22 0.18
Positive Predictive Power 0.27 0.47 0.30 0.29
Negative Predictive Power 0.97 0.96 0.97 0.97
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2020-2022 2020-2022 2020-2022 2020-2022
Sample Size 13973 14851 2918 2213
Geographic Representation
Kindergarten:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR)
Grade 1:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA, TX)
Grade 2:
East North Central (IL, MI, OH, WI)
East South Central (AL)
Middle Atlantic (NJ, PA)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, OK, TX)
Grade 3:
East North Central (IL)
East South Central (AL, KY, MS, TN)
Middle Atlantic (NY, PA)
Mountain (CO, MT, NV)
New England (CT, NH)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, GA, MD)
West North Central (MN, SD)
West South Central (AR)
Male 50.1% 50.1% 49.9% 52.2%
Female 49.9% 49.9% 50.1% 47.8%
Other        
Gender Unknown        
White, Non-Hispanic 43.4% 43.0% 34.9% 41.6%
Black, Non-Hispanic 17.0% 17.3% 27.0% 18.7%
Hispanic 27.5% 25.2% 24.6% 25.6%
Asian/Pacific Islander 2.8% 2.2% 1.9% 1.2%
American Indian/Alaska Native 1.1% 1.3% 1.9% 1.0%
Other 3.7% 4.6% 3.3% 3.8%
Race / Ethnicity Unknown 4.6% 6.2% 6.4% 8.1%
Low SES 1.3% 2.1% 1.1% 1.7%
IEP or diagnosed disability 0.7% 1.2% 1.5% 3.1%
English Language Learner 2.1% 2.4% 3.3% 6.1%

Cross-Validation - Spring

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 10 10 10 10
Cut Points - Performance score on criterion measure 138 153 166 176
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 1192 2278 214 119
Classification Data - False Positive (b) 2378 2011 475 244
Classification Data - False Negative (c) 275 551 47 16
Classification Data - True Negative (d) 10481 9434 1895 973
Area Under the Curve (AUC) 0.89 0.90 0.89 0.91
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.88 0.89 0.87 0.88
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.90 0.90 0.91 0.93
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.10 0.20 0.10 0.10
Overall Classification Rate 0.81 0.82 0.80 0.81
Sensitivity 0.81 0.81 0.82 0.88
Specificity 0.82 0.82 0.80 0.80
False Positive Rate 0.18 0.18 0.20 0.20
False Negative Rate 0.19 0.19 0.18 0.12
Positive Predictive Power 0.33 0.53 0.31 0.33
Negative Predictive Power 0.97 0.94 0.98 0.98
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2020-2022 2020-2022 2020-2022 2020-2022
Sample Size 14326 14274 2631 1352
Geographic Representation
Kindergarten:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ, PA)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA)
Grade 1:
East North Central (IL, IN, MI, OH, WI)
East South Central (AL, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, LA, OK, TX)
Grade 2:
East North Central (IL, MI, OH, WI)
East South Central (AL)
Middle Atlantic (NJ)
Mountain (MT, NM, NV)
New England (CT, MA, ME, NH, RI, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, MD, NC)
West North Central (IA, KS, MN, ND, SD)
West South Central (AR, OK, TX)
Grade 3:
East North Central (IL, IN)
East South Central (AL, KY, MS, TN)
Middle Atlantic (NY)
Mountain (CO, MT, NM, NV)
New England (NH)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, GA, MD)
West North Central (IA, MN, SD)
West South Central (AR, LA, OK)
Male 50.1% 50.3% 50.1% 52.8%
Female 49.9% 49.7% 49.9% 47.2%
Other        
Gender Unknown 0.0%      
White, Non-Hispanic 43.9% 40.3% 32.3% 40.0%
Black, Non-Hispanic 17.7% 19.1% 29.8% 19.6%
Hispanic 23.6% 25.4% 23.0% 23.8%
Asian/Pacific Islander 3.0% 2.4% 2.4% 1.3%
American Indian/Alaska Native 1.6% 2.0% 2.1% 1.6%
Other 4.8% 4.8% 3.5% 2.2%
Race / Ethnicity Unknown 5.5% 6.0% 7.0% 11.5%
Low SES 2.7% 3.3% 1.5% 3.5%
IEP or diagnosed disability 1.1% 1.6% 1.1% 3.2%
English Language Learner 2.6% 2.8% 2.5% 6.2%

Reliability

Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3
Rating | Convincing evidence | Convincing evidence | Convincing evidence | Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
We submitted marginal reliabilities for each of the Foundational Skills domain scores: Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. The Foundational Skills domain scores come from a multi-stage testing environment and are based mostly on speeded measures. These conditions make popular internal consistency methods such as coefficient alpha less appropriate. Given extreme within-grade range restriction in scores, marginal reliabilities for the Sentence Reading Fluency measure were spuriously low. We therefore provided split-half reliabilities for Sentence Reading Fluency internal consistency as the required second type of reliability. Sentence Reading Fluency is a speeded measure, so the resulting coefficients are likely affected.
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Approximately 241,000 of our MAP Reading Fluency test takers in grades K–3 provided Foundational Skills data for the marginal reliability analyses. Data were collected in the Fall, Winter, and Spring terms of the 2021–2022 school year. The sample was approximately 37% White, 28% Hispanic, 21% African American, 2% Asian, and 2% Native American. Forty-seven U.S. states and territories were represented. Males were slightly overrepresented (52% vs. 48%). All students who take a MAP Reading Fluency benchmark test take the Sentence Reading Fluency measure; therefore, our sample for the alpha and split-half internal consistency analyses was larger than that for marginal reliability. Data for these analyses included scores from over 514,000 students, collected in the Fall, Winter, and Spring terms of the 2021–2022 school year. The sample was approximately 40% White, 26% Hispanic, 18% African American, 4% Asian, and 1% Native American. Forty-seven U.S. states and territories were represented. Males were slightly overrepresented (51% vs. 49%).
*Describe the analysis procedures for each reported type of reliability.
Rasch item difficulties were estimated in Spring 2022 on sets of linear forms in order to avoid distortions in item difficulty estimation that can arise with computer adaptive testing and multi-stage testing data. Operational data from Fall 2021, Winter 2022, and Spring 2022 were scored with these item difficulty estimates. Marginal reliabilities (MR) were estimated separately for each domain, grade, and semester combination as MR = (σ²(θ_T) − μ(σ²(θ_e))) / σ²(θ_T), where σ²(θ_T) denotes the total variance of a set of domain scores and μ(σ²(θ_e)) denotes the mean of the error variance of these domain scores. The 95% confidence interval for each coefficient was obtained by repeating the estimation over 1,000 bootstrap samples. The marginal reliability estimates for the Foundational Skills domain scores also reflect calibration sampling error, i.e., the reduction in reliability due to using item parameter estimates gathered from a different sample. The split-half and alpha reliability estimates were calculated for each term and grade level, and confidence intervals for the alpha and split-half correlations were obtained via the Fisher z-transformation (Fisher, 1921).
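The marginal reliability formula and percentile-bootstrap interval described above can be sketched in a few lines of Python (a simplified illustration, not the operational code; all names are invented):

```python
import random

def marginal_reliability(thetas, ses):
    """MR = (var(theta) - mean(se^2)) / var(theta)."""
    n = len(thetas)
    m = sum(thetas) / n
    var_t = sum((t - m) ** 2 for t in thetas) / n   # total score variance
    mean_err = sum(se ** 2 for se in ses) / n       # mean error variance
    return (var_t - mean_err) / var_t

def bootstrap_ci(thetas, ses, reps=1000, seed=0):
    """Percentile-bootstrap 95% confidence interval for MR."""
    rng = random.Random(seed)
    n = len(thetas)
    stats = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        stats.append(marginal_reliability([thetas[i] for i in idx],
                                          [ses[i] for i in idx]))
    stats.sort()
    return stats[int(0.025 * reps)], stats[int(0.975 * reps)]
```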

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Validity

Grade Kindergarten
Grade 1
Grade 2
Grade 3
Rating Partially convincing evidence Partially convincing evidence Partially convincing evidence Partially convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
The criterion measure was MAP Growth Reading. Although NWEA develops and maintains both MAP Growth and MAP Reading Fluency, they are different assessments built from separate blueprints with no task types or specific subtests in common. MAP Growth Reading assessments measure what students know and inform educators and parents about what they are ready to learn next. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
A total of 235,721 test records from 119,488 unique students from the 2020–2021 and 2021–2022 school years were used for the study. These students came from 44 states covering all nine census geographic regions. Region coverage ranged from 2,992 students in the Middle Atlantic division to 28,478 students in the East North Central division.
*Describe the analysis procedures for each reported type of validity.
Concurrent and predictive validity evidence is supplied for the Sentence Reading Fluency measure and the Foundational Skills domain scores: Phonological Awareness, Phonics & Word Recognition, and Language Comprehension. MAP Growth Reading RIT scores were the criterion measures. The concurrent validity evidence includes the Pearson correlation coefficients obtained by correlating same-term scores between the criterion measure (MAP Growth Reading) and each of the MAP Reading Fluency tests of interest, for each grade. The predictive validity evidence includes the Pearson correlation coefficients obtained by correlating MAP Growth Reading RIT scores in the Spring with scores in Fall and Winter from each of the MAP Reading Fluency tests of interest, for each grade. For both concurrent and predictive validity evidence, confidence intervals were constructed using the Fisher z-transformation (Fisher, 1921) of the Pearson correlation coefficients.
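The Fisher z-transformation step for the correlation confidence intervals can be illustrated with a short sketch (assumes the standard normal approximation with the conventional 1.96 critical value; function name is invented):

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    """95% CI for a Pearson r via Fisher's z (normal approximation)."""
    z = math.atanh(r)            # Fisher z-transform of the correlation
    se = 1 / math.sqrt(n - 3)    # approximate standard error of z
    # Build the interval on the z scale, then back-transform to r.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```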

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Each of the Foundational Skills domains and Sentence Reading Fluency cover different aspects of reading, whereas MAP Growth Reading is a general reading achievement test. One would not expect the correlations of the Foundational Skills and Sentence Reading Fluency scores with MAP Growth Reading to be as large as those between two general reading achievement tests or between two specific tests of a single Foundational Skills domain, e.g., two Phonological Awareness tests. That said, of the lower limits for the 48 concurrent correlations, 24 were greater than or equal to 0.60, 12 were greater than or equal to 0.50 but less than 0.60, three were greater than or equal to 0.40 but less than 0.50, and four were greater than or equal to 0.30 but less than 0.40. Correlations less than 0.30 involved either Sentence Reading Fluency in kindergarten or Fall and Winter of first grade. Of the lower limits for the respective 32 predictive correlations, 13 were greater than or equal to 0.60, nine were greater than or equal to 0.50 but less than 0.60, five were greater than or equal to 0.40 but less than 0.50, and one was greater than or equal to 0.30 but less than 0.40. The other four correlations involved Sentence Reading Fluency in kindergarten and first grade. It is likely that semester-to-semester growth lowered the predictive correlations relative to the concurrent evidence. Correlations for kindergarten Sentence Reading Fluency are included for the sake of transparency; performance on this measure tends to be volatile for kindergartners, who are not yet expected to read sentences.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
Provide citations for additional published studies.

Bias Analysis

Grade Kindergarten
Grade 1
Grade 2
Grade 3
Rating Yes Yes Yes Yes
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
Yes
If yes,
a. Describe the method used to determine the presence or absence of bias:
Differential item functioning (DIF) refers to the extent to which students of equal ability do not have equal probability of answering particular items correctly. NWEA used Spring 2022 Foundational Skills data for a round of item response theory-based DIF analyses, conducted in R (version 4.2.2; R Development Core Team, 2021). The procedure fixes student ability estimates and then allows item difficulty estimates to be freely estimated for each group in question. The results are categorized based on the Educational Testing Service (ETS) method of classifying DIF (Zwick, 2012). This method allows items exhibiting negligible DIF (Category A) to be differentiated from those exhibiting moderate DIF (Category B) and severe DIF (Category C). Categories B and C are further labeled “+” (DIF in favor of the focal group) or “-” (DIF in favor of the reference group). Typically, only items that fall in the C-class require further investigation. Statistical tests for DIF are not corrected for multiple comparisons. Additionally, DIF statistics require an overall ability estimate that is largely unaffected by DIF in order to provide evidence of the presence of DIF.
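The ETS A/B/C scheme is conventionally defined in terms of the Mantel-Haenszel D-DIF statistic. The sketch below applies only the magnitude thresholds (|D| < 1.0 → A, 1.0 to 1.5 → B, ≥ 1.5 → C) and omits the statistical-significance checks that the operational rule also requires; the sign convention is an assumption for illustration:

```python
def ets_dif_category(d_dif):
    """Classify an MH D-DIF value into the ETS A/B/C bands (magnitude only)."""
    mag = abs(d_dif)
    if mag < 1.0:
        return "A"                   # negligible DIF
    cat = "B" if mag < 1.5 else "C"  # moderate vs. severe DIF
    # Sign convention assumed here: positive D-DIF favors the focal group.
    return cat + ("+" if d_dif > 0 else "-")
```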
b. Describe the subgroups for which bias analyses were conducted:
DIF analyses were conducted by ethnic group (White, Native American, Asian, African American, Hispanic) and gender (male, female). White serves as reference group in the DIF analysis based on ethnic group, and male serves as reference group in the DIF analysis based on gender. The analyses were conducted separately by grade.
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
DIF analyses are performed every test cycle to flag, review, and revise all items in the item pools that exhibit substantial DIF, i.e., C-class DIF (Zwick, 2012). All items flagged as exhibiting C-class DIF are subjected to an extra review by NWEA Content Specialists to identify the source(s) of differential functioning. This cycle of review and revision continually improves item quality and removes or revises items flagged for bias; in the Spring 2022 analysis, no items showed C-class DIF for gender or race/ethnicity.

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.