MAP® Reading Fluency™
Reading

Summary

MAP Reading Fluency is designed specifically for early learners and focuses on foundational reading skills, oral reading fluency, and literal comprehension. The assessment is an easy-to-administer universal screener for students in grades PK–5 that is typically used three times per school year for benchmarking. It can also be used for students in grades 6–8 who require additional reading support. In addition to benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. MAP Reading Fluency delivers an interactive, developmentally appropriate assessment experience that identifies students at risk for reading difficulties, and easy-to-use reports provide actionable data at the individual, class, and district levels.

Unlike traditional measures of oral reading fluency, MAP Reading Fluency includes advanced speech scoring technology that automatically records and consistently scores students’ oral reading, saving hours of instructional time. Students complete the computer adaptive assessment independently in about twenty minutes using a headset with a mounted microphone.

MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The Phonological Awareness and Phonics/Word Recognition domains consist of skill-specific, timed measures of early literacy skills, presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension.

For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface; hand-scoring by educators is permitted. Passages have been equated and scaled so that words correct per minute (WCPM) scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.

Where to Obtain:
NWEA®
proposals@nwea.org
121 NW Everett Street, Portland, OR 97209
(503) 624-1951
www.nwea.org
Initial Cost:
Contact vendor for pricing details.
Replacement Cost:
Contact vendor for pricing details.
Included in Cost:
MAP Reading Fluency assessments require an annual per-student subscription fee. Please visit https://www.nwea.org/contact-us/sales-information/ or call (866) 654-3246 to request a quote. Annual subscription fees include a suite of assessments, scoring and reporting, all assessment software including maintenance and upgrades, support services, and unlimited staff access to NWEA Professional Learning Online. MAP Reading Fluency assessments can be administered three times per school year for benchmarking in grades PK–5; the assessment can also be used for students in grades 6–8 who require reading support. In addition to the benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. The benchmark assessment is also available in Spanish for grades PK–3. Reports provide students’ reading level, their performance compared to grade-level expectations, and suggestions for instructional next steps tailored to each student. A full system of support is provided to enable the success of our partners, including technical support; implementation support through the first test administration; and ongoing, dedicated account management for the duration of the partnership. Subscriptions also include unlimited staff access to the NWEA Professional Learning Online portal, which offers on-demand tutorials, webinars, courses, and videos to supplement professional learning plans and help educators use MAP Reading Fluency to improve teaching and learning. NWEA offers a portfolio of flexible, customizable professional learning and training options, for an additional cost, to meet the needs of our partners. MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. Individual districts can purchase headsets directly through a third party if needed.
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. The Voluntary Product Accessibility Template (VPAT) for MAP Reading Fluency is available from NWEA upon request.
Training Requirements:
1-4 hours of training
Qualified Administrators:
MAP Reading Fluency is group-administered and machine-scored. Proctors should complete online or in-person training offered by NWEA to ensure they are familiar with the test experience and interface and know how to initiate and oversee the testing process.
Access to Technical Support:
Users can obtain support through our Partner Support team via toll-free telephone number, email, and chat; our online Help Center; and a dedicated Account Manager.
Assessment Format:
  • Direct: Computerized
Scoring Time:
  • Scoring is automatic
Scores Generated:
  • Raw score
  • Percentile score
  • IRT-based score
  • Equated
  • Lexile score
  • Composite scores
  • Subscale/subtest scores
Administration Time:
  • 20 minutes per student
Scoring Method:
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection
  • Other technology: MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. System requirements are regularly updated at https://teach.mapnwea.org/impl/QRM2_System_Requirements_QuickRef.pdf.
Accommodations:
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. The Voluntary Product Accessibility Template (VPAT) for MAP Reading Fluency is available from NWEA upon request.

Descriptive Information

Please provide a description of your tool:
MAP Reading Fluency is designed specifically for early learners and focuses on foundational reading skills, oral reading fluency, and literal comprehension. The assessment is an easy-to-administer universal screener for students in grades PK–5 that is typically used three times per school year for benchmarking. It can also be used for students in grades 6–8 who require additional reading support. In addition to benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. MAP Reading Fluency delivers an interactive, developmentally appropriate assessment experience that identifies students at risk for reading difficulties, and easy-to-use reports provide actionable data at the individual, class, and district levels.

Unlike traditional measures of oral reading fluency, MAP Reading Fluency includes advanced speech scoring technology that automatically records and consistently scores students’ oral reading, saving hours of instructional time. Students complete the computer adaptive assessment independently in about twenty minutes using a headset with a mounted microphone.

MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The Phonological Awareness and Phonics/Word Recognition domains consist of skill-specific, timed measures of early literacy skills, presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension.

For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface; hand-scoring by educators is permitted. Passages have been equated and scaled so that words correct per minute (WCPM) scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.
The tool is intended for use with the following grade(s).
selected Preschool / Pre-kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
not selected Sixth grade
not selected Seventh grade
not selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
selected 10 years old
not selected 11 years old
not selected 12 years old
not selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading
Phonological processing:
selected RAN
not selected Memory
selected Awareness
selected Letter sound correspondence
selected Phonics
not selected Structural analysis

Word ID
selected Accuracy
selected Speed

Nonword
not selected Accuracy
selected Speed

Spelling
selected Accuracy
selected Speed

Passage
selected Accuracy
selected Speed

Reading comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
selected Sentence verification
not selected Other (please describe):


Listening comprehension:
selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
selected Sentence verification
selected Vocabulary
not selected Expressive
selected Receptive

Mathematics
Global Indicator of Math Competence
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Early Numeracy
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Concepts
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Computation
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematic Application
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Fractions/Decimals
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Algebra
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Geometry
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

not selected Other (please describe):

Please describe specific domain, skills or subtests:
BEHAVIOR ONLY: Which category of behaviors does your tool target?


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:
Email Address
proposals@nwea.org
Address
121 NW Everett Street, Portland, OR 97209
Phone Number
(503) 624-1951
Website
www.nwea.org
Initial cost for implementing program:
Cost
Unit of cost
Replacement cost per unit for subsequent use:
Cost
Unit of cost
Duration of license
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
MAP Reading Fluency assessments require an annual per-student subscription fee. Please visit https://www.nwea.org/contact-us/sales-information/ or call (866) 654-3246 to request a quote. Annual subscription fees include a suite of assessments, scoring and reporting, all assessment software including maintenance and upgrades, support services, and unlimited staff access to NWEA Professional Learning Online. MAP Reading Fluency assessments can be administered three times per school year for benchmarking in grades PK–5; the assessment can also be used for students in grades 6–8 who require reading support. In addition to the benchmark assessment, MAP Reading Fluency includes calibrated passages for progress monitoring students’ oral reading fluency. The benchmark assessment is also available in Spanish for grades PK–3. Reports provide students’ reading level, their performance compared to grade-level expectations, and suggestions for instructional next steps tailored to each student. A full system of support is provided to enable the success of our partners, including technical support; implementation support through the first test administration; and ongoing, dedicated account management for the duration of the partnership. Subscriptions also include unlimited staff access to the NWEA Professional Learning Online portal, which offers on-demand tutorials, webinars, courses, and videos to supplement professional learning plans and help educators use MAP Reading Fluency to improve teaching and learning. NWEA offers a portfolio of flexible, customizable professional learning and training options, for an additional cost, to meet the needs of our partners. MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. Individual districts can purchase headsets directly through a third party if needed.
Provide information about special accommodations for students with disabilities.
Our philosophy underscores elements of universal design and individualization for students with diverse needs, including students with disabilities. The Voluntary Product Accessibility Template (VPAT) for MAP Reading Fluency is available from NWEA upon request.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected General education teacher
not selected Special education teacher
not selected Parent
not selected Child
not selected External observer
not selected Other
If other, please specify:

What is the administration setting?
not selected Direct observation
not selected Rating scale
not selected Checklist
not selected Performance measure
not selected Questionnaire
selected Direct: Computerized
not selected One-to-one
not selected Other
If other, please specify:

Does the tool require technology?
Yes

If yes, what technology is required to implement your tool? (Select all that apply)
selected Computer or tablet
selected Internet connection
selected Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:
MAP Reading Fluency requires each student to use an over-ear headset with a boom-style microphone. System requirements are regularly updated at https://teach.mapnwea.org/impl/QRM2_System_Requirements_QuickRef.pdf.

What is the administration context?
selected Individual
selected Small group   If small group, n=30
not selected Large group   If large group, n=
selected Computer-administered
selected Other
If other, please specify:
Educators can test a whole class, a group of students, or one student at a time. Tests can be administered in-person or remotely.

What is the administration time?
Time in minutes
20
per (student/group/other unit)
student

Additional scoring time:
Time in minutes
0
per (student/group/other unit)
student

ACADEMIC ONLY: What are the discontinue rules?
selected No discontinue rules provided
not selected Basals
not selected Ceilings
not selected Other
If other, please specify:


Are norms available?
Yes
Are benchmarks available?
Yes
If yes, how many benchmarks per year?
3
If yes, for which months are benchmarks available?
Fall, Winter, Spring
BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
1-4 hours of training
Please describe the minimum qualifications an administrator must possess.
MAP Reading Fluency is group-administered and machine-scored. Proctors should complete online or in-person training offered by NWEA to ensure they are familiar with the test experience and interface and know how to initiate and oversee the testing process.
not selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
Yes
If No, please describe training costs:
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
Users can obtain support through our Partner Support team via toll-free telephone number, email, and chat; our online Help Center; and a dedicated Account Manager.

Scoring

How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
Yes
What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
selected Percentile score
not selected Grade equivalents
selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
not selected Developmental benchmarks
not selected Developmental cut points
selected Equated
not selected Probability
selected Lexile score
not selected Error analysis
selected Composite scores
selected Subscale/subtest scores
not selected Other
If other, please specify:

Does your tool include decision rules?
Yes
If yes, please describe.
Foundational Skills Benchmarks: Students taking the Foundational Skills test are placed into a performance level based on the developmental level of the tasks they can complete; full details are provided in the description of our scoring structure below. Risk Benchmarks: Risk benchmarks were set according to the results of the provided classification accuracy study. The risk cut was set at the 20th percentile of spring MAP Growth Reading RIT scores. A multivariate logistic model was used to obtain estimated probabilities of students scoring in the At Risk category on the spring MAP Growth Reading tests, and students not meeting certain thresholds on these estimated probabilities were deemed At Risk. Full details of the study appear below.
Can you provide evidence in support of multiple decision rules?
Yes
If yes, please describe.
See the description of our scoring structure below for a description of the rationale for the Foundational Skills performance levels.
Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
MAP Reading Fluency assesses a wide range of specific early reading skills, selecting those in and around each student’s zone of proximal development (ZPD) — concepts the student is ready to develop — using a stage-adaptive methodology. Performance-level reporting classifies student performance as exceeding (blue), meeting (green), approaching (yellow), or below (red) grade-level expectations for a given grade and season (Fall, Winter, Spring). As the year progresses, expectation levels rise, and students must demonstrate growth to keep pace with the threshold performance for their grade.

Raw Score Conversion to Performance Levels: Foundational Skills Measures. Foundational skills measures are presented within the Foundational Skills test form, or upon failure to advance to oral reading based on sentence reading criteria. The Foundational Skills test includes measures in the Phonological Awareness, Phonics/Word Recognition, and Language Comprehension domains; the Print Concepts domain is also included for students newer to books and text. Phonological Awareness and Phonics/Word Recognition are assessed with a series of discrete, timed measures, each focusing on a single skill. These measures are presented adaptively based on student responses (i.e., number correct and percent correct), so each student moves through the two progressions according to demonstrated ability. Performance levels are assigned at the level of the entire progression by comparing the observed ZPD to grade-level expectations. The ZPD level is derived from the series of related measures administered from each skill progression; it is highlighted in an onscreen representation of the progression in the Student Report and stated in a narrative in the report’s top summary section.

Raw Score Conversion to Performance Levels: Oral Reading Measures. Students who advance to oral reading are assigned a performance level based on scaled words correct per minute (SWCPM) for each grade and administration, drawn from published national norms (Hasbrouck & Tindal, 2017). Students meet expectations if they read at or above the minimum SWCPM for a given grade and seasonal administration. If a student struggles to understand a grade-level passage, they receive an easier (lower Lexile measure) passage; if they understand the grade-level passage well, they are presented with a more difficult (higher Lexile measure) passage. Passage equating and scaling allow fluency performance to be compared across a range of passage difficulty. A student’s best attempt determines the assigned performance level.

Item Pool. All MAP Reading Fluency items are designed for maximum developmental appropriateness, using engaging character-based audio and colorful graphics. A variety of selected-response formats are used, plus automatic speech scoring. Oral reading passages were developed specifically for oral reading fluency assessment, including basic understanding of what was read. Passages range in difficulty from 180L to 1000L on the Lexile scale to support adaptivity above and below grade level through grade 5. The student reads directly into a headset microphone for picture books and oral reading fluency passages. For selected-response tasks, students see and hear demonstrations by the narrating character, including audio and animation or video, before engaging with the scored items. Selected-response item types include multiple choice (including choose-two and hot-spot formats), click-and-pop simple object-moving formats, and simple constructed response (e.g., building a word from letters).

MAP Reading Fluency includes more than 2,000 items across the following areas:
  • Picture Books for oral reading, speech-scored: 11
  • Oral Reading Passage sets, each with one speech-scored passage and six selected-response comprehension questions: over 170 (over 1,190 items total)
  • Phonological Awareness items, across eight selected-response measures: over 320
  • Phonics and Word Recognition items, across 10 selected-response measures: over 350
  • Language Comprehension items, across two selected-response measures: approximately 71
  • Print Concepts storybooks, each with six selected-response questions: 6 (36 total)
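
The seasonal performance-level logic described above can be illustrated with a short sketch. This is a minimal sketch under stated assumptions: the cut scores below are invented placeholders rather than the operational norm-derived values (which come from the Hasbrouck & Tindal, 2017 norms), and the function and table names are hypothetical, not NWEA’s.

    # Minimal sketch of seasonal performance-level assignment from scaled WCPM.
    # All cut scores are illustrative placeholders, NOT operational values.

    # (grade, season) -> (approaching_min, meeting_min, exceeding_min) in SWCPM
    CUTS = {
        (2, "Fall"):   (35, 55, 90),    # placeholder values
        (2, "Winter"): (45, 65, 100),   # placeholder values
        (2, "Spring"): (60, 80, 115),   # placeholder values
    }

    def performance_level(best_swcpm: float, grade: int, season: str) -> str:
        """Map a student's best scaled-WCPM attempt to a color-coded level."""
        approaching, meeting, exceeding = CUTS[(grade, season)]
        if best_swcpm >= exceeding:
            return "blue (exceeding)"
        if best_swcpm >= meeting:
            return "green (meeting)"
        if best_swcpm >= approaching:
            return "yellow (approaching)"
        return "red (below)"

    print(performance_level(72, 2, "Winter"))   # -> green (meeting)

Note that the placeholder cut scores rise from Fall to Spring, mirroring the point above that expectation levels increase across the year.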
Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Research on early literacy development drove the design of MAP Reading Fluency. Framed by Gough and Tunmer’s Simple View of Reading (1986), the assessment parses decoding and language comprehension factors as separately assessable components until the point at which they come together in students’ reading of connected text. Once students are beginning to develop reading fluency, MAP Reading Fluency assesses oral reading directly using speech scoring technology and direct checks for understanding.

For students not yet reading passages, decoding factors assessed in MAP Reading Fluency include phonological awareness, phonics and word recognition, and, for those at beginning levels of book exposure, print concepts. Phonemic awareness is among the strongest predictors of decoding fluency in English, and phonological skills that precede phoneme-level skills in a developmental continuum are valuable in earlier screening (Anthony & Francis, 2005). For both phonological awareness and phonics and word recognition, MAP Reading Fluency locates the student on a progression of skills and points to tightly aligned instructional steps for that stage, including research-to-practice student instructional materials from the Florida Center for Reading Research. While academic standards typically frame only decoding skills as foundational, research is increasingly clear that a student’s foundation in language comprehension strongly contributes to future reading comprehension, with growing predictive power as decoding fluency consolidates (Foorman et al., 2015). In assessing foundational skills, MAP Reading Fluency includes both vocabulary and sentence listening comprehension.

For students able to read connected text, MAP Reading Fluency assesses oral reading in a group-administered assessment that capitalizes on automatic speech scoring, returning hours of instructional time that a teacher might otherwise spend on one-on-one assessment. While a simple direct measure of words correct per minute (WCPM) is a strong indicator of reading development (Fuchs et al., 2001), research clearly supports a more robust understanding of reading fluency that includes assessment of accuracy, rate, and understanding in the context of variable levels of text difficulty (Valencia et al., 2010). It is particularly important that students are asked to show understanding of what they read aloud, both to convey to students the purpose of reading and to activate factors that aid in the prediction of reading comprehension. As Valencia and Buly (2004) note, students struggling with reading align to more than one profile relevant to instructional next steps. When this is disregarded and all struggling readers are routed to the same generic interventions, screening time and resources are squandered, instructional effectiveness is compromised, and students are left to struggle. Instead, MAP Reading Fluency is designed for individualization: oral reading fluency reporting generates individual Reader Profiles with Next Steps, tailoring these research-based messages to the individual’s particular performance across accuracy, rate, comprehension, and text level.

Test Formats: MAP Reading Fluency includes two primary test forms, plus additional configuration options. For students at the emergent and early stages of reading development, the Foundational Skills test format measures early literacy skills and produces discrete, scaled domain scores for Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The Phonological Awareness and Phonics/Word Recognition domains consist of skill-specific, timed measures of early literacy skills, presented adaptively according to a developmental progression. Language Comprehension includes measures of receptive vocabulary and listening comprehension. For students who are able to read connected text, the Oral Reading test format measures oral reading fluency, literal comprehension of text read aloud, and sentence reading fluency. Oral reading fluency is measured via passages, which students are expected to read in their entirety. Recordings are scored automatically using speech recognition technology and are available for playback in the teacher interface; hand-scoring by educators is permitted. Passages have been equated and scaled so that WCPM scores can be directly compared across a wide pool of passages and a range of passage difficulty. Literal comprehension is assessed with selected-response questions that follow each passage. Sentence reading fluency is also measured with selected-response items, presented in a timed format.

Item Bias and Sensitivity: We are committed to developing engaging, authentic, rigorous, and culturally diverse assessments that effectively measure the full range of the standards. Therefore, it is vital that we address a wide variety of texts in a balanced, respectful way that does not upset, distract, or exclude any student populations. Item and passage writers employ careful consideration and sound judgment while crafting items, considering each item from a variety of angles regarding bias and sensitivity, in accordance with the NWEA Sensitivity, Fairness, and Accessibility Guidelines. To meet our high expectation of fairness to all students, every item and passage is thoroughly examined at multiple points in the development process, undergoing specific bias and sensitivity reviews. Sensitivity in this context means an awareness of the different things that can distract a student during assessment; fairness relates to giving each student an equal opportunity to answer the item correctly based solely on their knowledge of the item content. Any sensitivity or fairness issues found in items or passages are eliminated through revision or rejection of the item during development. Each item or passage is evaluated against a set of criteria and is flagged if it requires prior knowledge other than the skill/concept being assessed; requires construct-irrelevant or specialized knowledge; has cultural, linguistic, socioeconomic, religious, geographic, color-blind, or gender bias; favors students who have no visual impairments; favors students who have no disabilities; inappropriately employs idiomatic English; offensively stereotypes a group of people; mentions body/weight issues; contains inappropriate or sensitive topics; distracts, upsets, or confuses in any way; or has other bias issues.

Our Psychometric Solutions team performs differential item functioning (DIF) analyses to examine the percentage of items in the item pools that exhibit substantial DIF, i.e., C-class DIF (Zwick, 2012). All items revealed as exhibiting C-class DIF are subjected to an extra review by NWEA Content Specialists to identify the source(s) of the differential functioning. For each such item, these specialists make a judgment to remove the item from the item bank, revise the item and resubmit it for field-testing, or retain the item as is. These procedures are consistent with periodic item quality reviews that remove items or flag them for revision; revised items are field-tested again.

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Classification Accuracy Fall: Partially convincing evidence for all grades
Classification Accuracy Winter: Partially convincing evidence for all grades
Classification Accuracy Spring: Convincing evidence for Kindergarten; partially convincing evidence for Grades 1–3
Legend
  • Full Bubble: Convincing evidence
  • Half Bubble: Partially convincing evidence
  • Empty Bubble: Unconvincing evidence
  • Null Bubble: Data unavailable
  • d: Disaggregated data available

MAP® Growth™ Reading

Classification Accuracy

Select time of year
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
The criterion measure was MAP Growth Reading. Although NWEA develops and maintains both MAP Growth and MAP Reading Fluency, they are different assessments built from separate blueprints with no task types or specific subtests in common. MAP Growth Reading assessments measure what students know and inform educators and parents about what they are ready to learn next. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction.
Do the classification accuracy analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
For our classification analyses, we used screener data from Fall 2018, Winter 2019, and Spring 2019, and criterion data from Spring 2019. The predictive classification analyses involved Fall-to-Spring and Winter-to-Spring predictions. We treated Spring-to-Spring analyses as concurrent evidence.
Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
The cut scores for at-risk status on the criterion measure were set at the 20th percentile for each grade and term. A multivariate logistic model using MAP Reading Fluency’s item response theory (IRT)-based Foundational Skills domain scores (Phonological Awareness, Phonics/Word Recognition, and Language Comprehension) and raw scores on the Silent Sentence Reading measure was used to estimate each student’s probability of being at risk. The contrast was between high- and low-risk students. Cut points were set not on the individual domain scores and Silent Sentence Reading scores, but on the estimated at-risk probabilities (a common practice in medical research). Cut points were selected to have roughly equal sensitivities and specificities, each equal to or greater than 0.70. GRADES 2-3: By second grade, students routed to the Foundational Skills track tend to be struggling readers, so the base rate in the tested sample can be much higher than the at-risk rate in the general population. To make the tested sample’s risk incidence more similar to the national norms, grade 2 and grade 3 students were selected using stratified random sampling so that 20% of the selected students were at risk on the criterion measure and 80% were not. This method is similar to using poststratification weights to make sample demographics similar to national demographics. Such sampling has only minor effects on model sensitivity and specificity, given that these quantities are independent of the base rate (Krzanowski & Hand, 2009).
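
As a rough illustration of this procedure, the sketch below fits a logistic model to simulated domain and Silent Sentence Reading scores, then searches the ROC curve for a probability cut point where sensitivity and specificity are roughly equal and both at least 0.70. All data, coefficients, and variable names are simulated stand-ins under our own assumptions; this is not NWEA’s analysis code.

    # Illustrative sketch: estimate at-risk probabilities with a logistic
    # model, then pick a probability cut point with roughly equal
    # sensitivity and specificity, both >= 0.70. Simulated data throughout.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_curve

    rng = np.random.default_rng(0)
    n = 5000
    X = rng.normal(size=(n, 4))     # 3 Foundational Skills domains + SSR raw
    latent = X @ np.array([1.0, 0.8, 0.5, 0.7]) + rng.normal(size=n)
    y = (latent < -1.2).astype(int)             # 1 = at risk on criterion

    model = LogisticRegression().fit(X, y)
    p_risk = model.predict_proba(X)[:, 1]       # estimated at-risk probabilities

    fpr, tpr, cuts = roc_curve(y, p_risk)
    sens, spec = tpr, 1.0 - fpr
    penalty = np.where((sens >= 0.70) & (spec >= 0.70), 0.0, np.inf)
    best = int(np.argmin(np.abs(sens - spec) + penalty))
    print(f"cut point = {cuts[best]:.3f}, "
          f"sensitivity = {sens[best]:.2f}, specificity = {spec[best]:.2f}")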
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
No
If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Our sample comprised students from our regular test-taking population who had taken MAP Reading Fluency Foundational Skills measures and a MAP Growth Reading test during one or more terms in the 2018-2019 school year. Some of these students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students.

Cross-Validation

Has a cross-validation study been conducted?
No
If yes,
Select time of year.
Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
Do the cross-validation analyses examine concurrent and/or predictive classification?

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Classification Accuracy - Fall

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 143 159 173 183
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 413 1388 969 193
Classification Data - False Positive (b) 989 2219 1282 287
Classification Data - False Negative (c) 151 372 312 64
Classification Data - True Negative (d) 2697 8280 3842 741
Area Under the Curve (AUC) 0.82 0.87 0.83 0.82
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.80 0.86 0.82 0.79
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.83 0.87 0.84 0.84
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.13 0.14 0.20 0.20
Overall Classification Rate 0.73 0.79 0.75 0.73
Sensitivity 0.73 0.79 0.76 0.75
Specificity 0.73 0.79 0.75 0.72
False Positive Rate 0.27 0.21 0.25 0.28
False Negative Rate 0.27 0.21 0.24 0.25
Positive Predictive Power 0.29 0.38 0.43 0.40
Negative Predictive Power 0.95 0.96 0.92 0.92
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2018-2019 2018-2019 2018-2019 2018-2019
Sample Size 4250 12259 6405 1285
Geographic Representation
Kindergarten: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (MA, ME)
Pacific (AK, CA, WA)
South Atlantic (FL, GA, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 1: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, WA)
South Atlantic (DE, FL, GA, MD, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 2: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, WA)
South Atlantic (DE, FL, GA, NC, SC, VA)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 3: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, MT, NV)
New England (MA, ME, VT)
Pacific (AK, CA, WA)
South Atlantic (FL, GA, NC, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Male 51.3% 49.5% 50.6% 53.3%
Female 48.5% 50.4% 49.4% 46.7%
Other        
Gender Unknown 0.1% 0.0% 0.0%  
White, Non-Hispanic 51.1% 52.0% 47.3% 38.7%
Black, Non-Hispanic 16.2% 15.9% 19.0% 21.7%
Hispanic 17.4% 12.9% 14.0% 17.0%
Asian/Pacific Islander     1.4% 0.7%
American Indian/Alaska Native 1.7% 4.6% 6.0% 11.8%
Other     2.7% 2.7%
Race / Ethnicity Unknown 11.1% 12.6% 9.7% 7.3%
Low SES 15.3% 12.2% 11.8% 9.0%
IEP or diagnosed disability        
English Language Learner        
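
For readers checking the arithmetic, the summary statistics in the table above follow directly from the 2x2 classification counts. The short sketch below recomputes the Fall kindergarten column from its reported cell counts (a = 413, b = 989, c = 151, d = 2697) and matches the published values after rounding.

    # Recompute the Fall kindergarten statistics from the 2x2 counts above.
    a, b, c, d = 413, 989, 151, 2697            # TP, FP, FN, TN
    n = a + b + c + d                           # 4250, the reported sample size

    stats = {
        "Base Rate":                   (a + c) / n,   # 0.13
        "Overall Classification Rate": (a + d) / n,   # 0.73
        "Sensitivity":                 a / (a + c),   # 0.73
        "Specificity":                 d / (b + d),   # 0.73
        "False Positive Rate":         b / (b + d),   # 0.27
        "False Negative Rate":         c / (a + c),   # 0.27
        "Positive Predictive Power":   a / (a + b),   # 0.29
        "Negative Predictive Power":   d / (c + d),   # 0.95
    }
    for name, value in stats.items():
        print(f"{name}: {value:.2f}")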

Classification Accuracy - Winter

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 143 159 173 183
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 25 1825 646 115
Classification Data - False Positive (b) 104 2154 870 167
Classification Data - False Negative (c) 10 491 217 43
Classification Data - True Negative (d) 940 7920 2583 465
Area Under the Curve (AUC) 0.87 0.87 0.82 0.79
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.79 0.86 0.81 0.76
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.94 0.88 0.84 0.83
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.03 0.19 0.20 0.20
Overall Classification Rate 0.89 0.79 0.75 0.73
Sensitivity 0.71 0.79 0.75 0.73
Specificity 0.90 0.79 0.75 0.74
False Positive Rate 0.10 0.21 0.25 0.26
False Negative Rate 0.29 0.21 0.25 0.27
Positive Predictive Power 0.19 0.46 0.43 0.41
Negative Predictive Power 0.99 0.94 0.92 0.92
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2018-2019 2018-2019 2018-2019 2018-2019
Sample Size 1079 12390 4316 790
Geographic Representation
Kindergarten: East North Central (IL, IN, MI, OH, WI)
East South Central (MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, MT, NV)
New England (MA, ME)
Pacific (CA, WA)
South Atlantic (FL, GA, MD, NC, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 1: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 2: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC, VA)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 3: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS)
Middle Atlantic (NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR)
South Atlantic (DC, FL, GA, MD, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Male 49.5% 50.7% 52.6% 56.6%
Female 50.5% 49.3% 47.3% 43.3%
Other        
Gender Unknown   0.0% 0.1% 0.1%
White, Non-Hispanic 57.2% 49.5% 45.3% 39.7%
Black, Non-Hispanic 14.5% 15.9% 21.4% 20.8%
Hispanic 11.6% 15.5% 15.0% 19.7%
Asian/Pacific Islander     1.7% 1.1%
American Indian/Alaska Native 2.2% 4.1% 5.1% 7.8%
Other     3.5% 3.4%
Race / Ethnicity Unknown 13.2% 12.9% 8.0% 7.3%
Low SES 17.9% 16.9% 13.4% 15.1%
IEP or diagnosed disability        
English Language Learner        

Classification Accuracy - Spring

Evidence Kindergarten Grade 1 Grade 2 Grade 3
Criterion measure MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading MAP® Growth™ Reading
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 143 159 173 183
Cut Points - Corresponding performance score (numeric) on screener measure N/A N/A N/A N/A
Classification Data - True Positive (a) 1677 2654 568 122
Classification Data - False Positive (b) 2408 2178 806 162
Classification Data - False Negative (c) 393 722 209 42
Classification Data - True Negative (d) 10308 8032 2302 493
Area Under the Curve (AUC) 0.89 0.87 0.82 0.81
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.88 0.86 0.80 0.78
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.90 0.87 0.83 0.85
Statistics Kindergarten Grade 1 Grade 2 Grade 3
Base Rate 0.14 0.25 0.20 0.20
Overall Classification Rate 0.81 0.79 0.74 0.75
Sensitivity 0.81 0.79 0.73 0.74
Specificity 0.81 0.79 0.74 0.75
False Positive Rate 0.19 0.21 0.26 0.25
False Negative Rate 0.19 0.21 0.27 0.26
Positive Predictive Power 0.41 0.55 0.41 0.43
Negative Predictive Power 0.96 0.92 0.92 0.92
Sample Kindergarten Grade 1 Grade 2 Grade 3
Date 2018-2019 2018-2019 2018-2019 2018-2019
Sample Size 14786 13586 3885 819
Geographic Representation
Kindergarten: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, NH, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, FL, GA, MD, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 1: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC)
West North Central (IA, KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 2: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV, WY)
New England (CT, MA, ME, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, DE, FL, GA, MD, NC, SC, VA)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Grade 3: East North Central (IL, IN, MI, OH, WI)
East South Central (KY, MS, TN)
Middle Atlantic (NJ, NY, PA)
Mountain (AZ, CO, MT, NV)
New England (CT, MA, ME, VT)
Pacific (AK, CA, HI, OR, WA)
South Atlantic (DC, FL, GA, MD, SC)
West North Central (KS, MN, MO, ND, NE, SD)
West South Central (AR, OK, TX)
Male 50.4% 50.8% 51.3% 53.7%
Female 49.6% 49.1% 48.6% 46.3%
Other        
Gender Unknown 0.1% 0.1% 0.1%  
White, Non-Hispanic 47.7% 45.2% 43.7% 37.9%
Black, Non-Hispanic 15.9% 18.6% 22.2% 21.2%
Hispanic 17.2% 15.9% 15.7% 20.5%
Asian/Pacific Islander     1.7% 0.9%
American Indian/Alaska Native 3.0% 4.5% 4.5% 6.7%
Other     3.5% 3.7%
Race / Ethnicity Unknown 14.0% 13.8% 8.6% 9.2%
Low SES 17.7% 16.5% 15.3% 14.2%
IEP or diagnosed disability        
English Language Learner        

Reliability

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Rating: Convincing evidence for all grades
Legend
  • Full Bubble: Convincing evidence
  • Half Bubble: Partially convincing evidence
  • Empty Bubble: Unconvincing evidence
  • Null Bubble: Data unavailable
  • d: Disaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
We submitted marginal reliabilities for each of the Foundational Skills domain scores: Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. The measures underlying these domain scores come from a multi-stage testing environment and are mostly speeded, conditions that make popular internal consistency methods such as coefficient alpha less appropriate. Given extreme within-grade range restriction in scores, marginal reliabilities for the Silent Sentence Reading measure were spuriously low, so we provided test-retest reliabilities in place of internal consistency as the required second type of reliability for that measure. Silent Sentence Reading is a speeded measure, so the resulting coefficients are likely affected by the speeded format.
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Approximately 112,000 of our MAP Reading Fluency test takers in grades K–3 provided Foundational Skills data for the marginal reliability analyses. Data were collected in the Fall, Winter, and Spring terms of the 2018-2019 school year. The sample was approximately 39% White, 20% Hispanic, 18% African American, 4% Asian, and 2% Native American. Forty-seven U.S. states and territories were represented. Males were slightly overrepresented (52% vs. 48%). All students who take a MAP Reading Fluency benchmark test take the Silent Sentence Reading measure. Therefore, our sample for the internal consistency analyses was larger than that for marginal reliability. Data for these analyses included scores from over 120,000 students. The demographic composition of the sample was similar to that of the marginal reliability sample except that proportions of male and female students were nearly equal. The test-retest students completed a second MAP Reading Fluency test session. Their Local Education Agencies received a financial incentive to participate. Slightly over 2,000 students participated during the Fall 2018 term, and nearly 3,000 students participated in the Winter 2019 term. Fewer than 100 students’ data were available for the Spring 2019 term.
*Describe the analysis procedures for each reported type of reliability.
Rasch item difficulties were estimated in Spring 2019 on sets of linear forms in order to avoid distortions in item difficulty estimation that can arise in computer adaptive testing and multi-stage testing data. Operational data from Fall 2018, Winter 2019, and Spring 2019 were scored with these item difficulty estimates. Marginal reliabilities (MR) were estimated separately for each domain, grade, and semester combination as

    MR = (σ²(θ_T) − μ(σ²(θ_e))) / σ²(θ_T),

where σ²(θ_T) denotes the total variance of a set of domain scores and μ(σ²(θ_e)) denotes the mean of the error variance of these domain scores. The 95 percent confidence interval for each coefficient was obtained by repeating the estimation over 1,000 bootstrap samples. The marginal reliability estimates for the Foundational Skills domain scores also reflect calibration sampling error, i.e., the reduction in reliability due to using item parameter estimates gathered from a different sample. Confidence intervals for the test-retest correlations were obtained via the Fisher z-transformation (Fisher, 1921). A maximum test-retest interval of 14 days was allowed.
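
The marginal reliability computation and bootstrap interval can be sketched compactly. The scores and conditional standard errors below are simulated stand-ins for one domain-by-grade-by-term cell, not NWEA data; this sketch also omits the calibration sampling error component mentioned above.

    # Marginal reliability: MR = (var(theta) - mean(se^2)) / var(theta),
    # with a 95% CI from 1,000 bootstrap resamples. Simulated inputs.
    import numpy as np

    rng = np.random.default_rng(42)
    theta = rng.normal(0.0, 1.0, size=2000)     # domain scores (logits)
    se = rng.uniform(0.3, 0.5, size=2000)       # conditional SEMs

    def marginal_reliability(theta, se):
        return (np.var(theta) - np.mean(se ** 2)) / np.var(theta)

    mr = marginal_reliability(theta, se)
    boot = []
    for _ in range(1000):
        idx = rng.integers(0, len(theta), size=len(theta))
        boot.append(marginal_reliability(theta[idx], se[idx]))
    lo, hi = np.percentile(boot, [2.5, 97.5])
    print(f"MR = {mr:.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")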

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Validity

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Rating: Unconvincing evidence for all grades
Legend
  • Full Bubble: Convincing evidence
  • Half Bubble: Partially convincing evidence
  • Empty Bubble: Unconvincing evidence
  • Null Bubble: Data unavailable
  • d: Disaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
The criterion measure was MAP Growth Reading. For the predictive evidence, criterion scores were from Spring 2019. Although NWEA develops and maintains both MAP Growth and MAP Reading Fluency, they are different assessments built from separate blueprints with no task types or specific subtests in common. MAP Growth Reading assessments measure what students know and inform educators and parents about what they are ready to learn next. The computer adaptive tests are nationally normed and aligned to state academic standards. They show educators the strengths and weaknesses of each student. With its emphasis on overall achievement, normative ranking, and longitudinal growth, MAP Growth supports identification of students at risk, selection of appropriate learning targets and goals, and tracking the efficacy of instruction. MAP Reading Fluency tests are intended to identify students in need of supplemental and intensive support in reading. The categorical reporting of performance levels allows for simple analyses, such as the proportion of students on track to meet reading expectations. With its sub-score reporting, MAP Reading Fluency supports instructional planning for early literacy for whole-group, small-group, and individual instruction.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
A total of 320,352 test records from the 2018-2019 school year were available for the study. The greatest number of records were from Spring 2019. Data were from 45 states covering all nine census geographic regions. Region coverage ranged from 3,254 students in the East South Central division to 46,620 students in the South Atlantic division. The gender and racial/ethnic composition of the sample was similar to that in the marginal reliability study, which was approximately 39% White, 20% Hispanic, 18% African American, 4% Asian, and 2% Native American. Males were slightly overrepresented (52% vs. 48%).
*Describe the analysis procedures for each reported type of validity.
Both concurrent and predictive evidence are supplied for the Silent Sentence Reading measure and the Foundational Skills domain scores: Phonological Awareness, Phonics/Word Recognition, and Language Comprehension. MAP Growth Reading RIT scores were the criterion measures. Within-semester Pearson correlations were run by grade for each term. Confidence intervals were constructed using the Fisher z-transformation (Fisher, 1921).
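
For reference, the Fisher z-transformed interval used for these correlations can be computed as below. The r and n values are hypothetical, chosen only to show the mechanics.

    # 95% CI for a Pearson correlation via the Fisher z-transformation
    # (Fisher, 1921). The inputs are hypothetical.
    import math

    def fisher_ci(r: float, n: int, z_crit: float = 1.96):
        z = math.atanh(r)               # z = 0.5 * ln((1 + r) / (1 - r))
        se = 1.0 / math.sqrt(n - 3)
        return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

    lo, hi = fisher_ci(r=0.62, n=4000)
    print(f"r = 0.62, 95% CI [{lo:.3f}, {hi:.3f}]")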

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Each of the Foundational Skills domains and the Silent Sentence Reading measure covers a different aspect of reading, whereas MAP Growth Reading is a general reading achievement test. One would not expect the correlations of the Foundational Skills and Silent Sentence Reading scores with MAP Growth Reading to be as large as those between two general reading achievement tests or between two specific tests of a single Foundational Skills domain (e.g., two Phonological Awareness tests). That said, of the lower limits for the 48 concurrent correlations, nine were greater than or equal to 0.60; 20 were greater than or equal to 0.50 but less than 0.60; 11 were greater than or equal to 0.40 but less than 0.50; and three were greater than or equal to 0.30 but less than 0.40. Correlations less than 0.30 involved either Silent Sentence Reading in kindergarten or Language Comprehension in third grade. Of the lower limits for the respective 32 predictive correlations, three were greater than or equal to 0.60; 15 were greater than or equal to 0.50 but less than 0.60; eight were greater than or equal to 0.40 but less than 0.50; and three were greater than or equal to 0.30 but less than 0.40. The other three correlations involved Language Comprehension scores in grade 1. It is likely that semester-to-semester growth lowered the predictive correlations relative to the concurrent evidence. Correlations for kindergarten Silent Sentence Reading are included for the sake of transparency: performance on this measure tends to be volatile in kindergarten, and kindergartners are not yet expected to be able to read sentences.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
Provide citations for additional published studies.

Bias Analysis

Grades covered: Kindergarten, Grade 1, Grade 2, Grade 3
Rating: Provided for all grades
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
Yes
If yes,
a. Describe the method used to determine the presence or absence of bias:
Differential item functioning (DIF) refers to the extent to which students of equal ability do not have an equal probability of answering particular items correctly. NWEA used Spring 2019 Foundational Skills data for a round of item response theory-based DIF analyses, conducted with Winsteps version 4.0 (Linacre, n.d.). Winsteps fixes student ability estimates and then allows item difficulty estimates to be freely estimated for each group in question. The results are categorized based on the Educational Testing Service (ETS) method of classifying DIF (Zwick, 2012). This method differentiates items exhibiting negligible DIF (Category A) from those exhibiting moderate DIF (Category B) and severe DIF (Category C). Categories B and C are further labeled “+” (DIF favors the focal group) or “-” (DIF favors the reference group). Typically, only items that fall in the “DIF C-class” require further investigation. Statistical tests for DIF are not corrected for multiple comparisons. Additionally, DIF statistics require an overall ability estimate that is largely unaffected by DIF in order to provide evidence of the presence of DIF. Analyses were conducted across each grade level.
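
The ETS A/B/C scheme referenced above can be illustrated with the logit-scale thresholds commonly used in Winsteps-style Rasch DIF work (roughly 0.43 and 0.64 logits, the logit equivalents of the ETS delta-scale rules). These thresholds and the sign convention below are assumptions for illustration, not NWEA’s exact implementation.

    # Hedged sketch of ETS-style DIF classification (Zwick, 2012) applied to
    # a Rasch DIF contrast. Thresholds (0.43 / 0.64 logits) and the sign
    # convention (positive taken to favor the focal group, matching the
    # "+"/"-" labels above) are our assumptions for illustration.
    def ets_dif_class(dif_logits: float, p_value: float) -> str:
        size = abs(dif_logits)
        sign = "+" if dif_logits > 0 else "-"
        if size >= 0.64 and p_value < 0.05:
            return "C" + sign            # severe DIF: flag for content review
        if size >= 0.43 and p_value < 0.05:
            return "B" + sign            # moderate DIF
        return "A"                       # negligible DIF

    print(ets_dif_class(0.71, 0.003))    # -> C+
    print(ets_dif_class(0.30, 0.200))    # -> A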
b. Describe the subgroups for which bias analyses were conducted:
DIF analyses were conducted by ethnic group (White, Native American, Asian, African American, Hispanic) and gender (male, female). White serves as reference group in the DIF analysis based on ethnic group, and male serves as reference group in the DIF analysis based on gender.
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Fewer than two percent of the DIF comparisons resulted in C-Class DIF, which is less than would be expected by chance. Five of these comparisons involved five items from a now-retired measure. Content experts reviewed all other items showing C-Class DIF.

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.