Acadience Reading (aka DIBELS Next)
Maze

Summary

Descriptive Information

Acadience Reading Maze (previously published under the DIBELS Next® mark) is a measure of reading comprehension. Maze assesses a student’s ability to construct meaning from text using word recognition skills, background information and prior knowledge, familiarity with linguistic properties such as syntax and morphology, and cause-and-effect reasoning skills. Maze is one measure within a broader reading assessment known as Acadience Reading.

Acquisition & Cost

Where to Obtain:: Developer: Dynamic Measurement Group, Publisher: Voyager Sopris Learning and Amplify Education, Inc; info@dibels.org; 859 Willamette Street, Suite 320, Eugene, OR 97410; 541-431-6931 or toll free 888-943-1240; http://dibels.org

Initial Cost:: Free

Replacement Cost:: Free

Included in Cost:: Additional information on where the tool can be obtained:

Dynamic Measurement Group (free download version in black and white)
Website: http://dibels.org
Address: 859 Willamette Street, Suite 320, Eugene, OR 97410
Phone number: 541-431-6931 or toll free 888-943-1240
Email address: info@dibels.org

Voyager Sopris Learning (published print version in color)
Website: http://voyagersopris.com
Address: 17855 Dallas Parkway, Suite 400, Dallas, TX 75287-6816
Telephone number: (800) 547-6747

Amplify (published mobile device version)
Website: www.amplify.com
Address: 55 Washington Street, Suite 800, Brooklyn, NY 11201
Telephone number: (800) 823-1969

Additional cost information:

Amplify
Initial cost for implementing program: $14.90
Unit of cost: student
Replacement cost for subsequent use: $14.90
Unit of cost: student
License duration: Year

Voyager Sopris Learning
Initial cost for implementing program: $3.64 – $3.72
Unit of cost: student
Replacement cost for subsequent use: $3.62 – $3.74
Unit of cost: student
License duration: Year

See cost information noted above and in the footnotes, as well as costs related to training in Section B, Item 5. Computer and internet access are required for the mobile device version from Amplify.

Training & Technical Support

Training Requirements:: 1-4 hrs

Qualified Administrators:: No minimum qualifications specified.

Access to Technical Support:: Dynamic Measurement Group provides customer support for all Acadience Reading assessments, as well as support for the data management and reporting system, Acadience Data Management. Staff are available by phone and email on weekdays from 7am to 5pm Pacific Time, for no additional cost. The majority of customer support requests are resolved in less than one business day.

Administration

Assessment Format:

Direct observation
Performance measure
One-to-one

Scoring Time:

Scoring is automatic OR
1 minute per student form

Scores Generated:

Raw score
Percentile score
Developmental benchmarks

Administration Time:

5 minutes per group

Scoring Method:

Manually (by hand)
Automatically (computer-scored)

Technology Requirements:

Accommodations:: Acadience Reading is appropriate for most students for whom an instructional goal is to learn to read in English. For English language learners who are learning to read in English, Acadience Reading is appropriate for assessing and monitoring progress in acquisition of early reading skills. For all Acadience Reading measures (including Maze), students are never penalized for articulation or dialect differences that are part of their typical speech. In addition, Acadience Reading Maze includes a set of approved accommodations that assessors may use when appropriate (see the Acadience Reading Assessment Manual, pp. 20-21).

There are a few groups of students for whom Acadience Reading is not appropriate:
(a) students who are learning to read in a language other than English;
(b) students who are deaf;
(c) students who have fluency-based speech disabilities such as stuttering (if the stuttering affects the student's response fluency within a one-minute timed assessment) and oral apraxia; and
(d) students with severe disabilities for whom learning to read connected text is not an IEP goal.

Assessment accommodations are used for those students for whom the standard administration conditions would not produce accurate results. Approved accommodations are those accommodations which are unlikely to change how the assessment functions. When approved accommodations are used, the scores can be reported and interpreted as official Acadience Reading scores (see Table 2.1 in the Acadience Reading Assessment Manual, p. 20, for a list of approved accommodations). Approved accommodations should be used only for students for whom the accommodations are necessary to provide an accurate assessment of student skills.

Unapproved accommodations are accommodations that are likely to change how the assessment functions (such as modifying the timing rules). Scores from measures administered with unapproved accommodations should not be treated or reported as official Acadience Reading scores and cannot be compared to other Acadience Reading scores or benchmark goals, but they can be used to measure individual growth for a student. An unapproved accommodation may be used when (a) a student cannot be tested accurately using the standardized rules or approved accommodations, but the school would still like to measure progress for that student; or (b) a student’s Individualized Education Plan (IEP) requires testing with an unapproved accommodation. For more information about accommodations, see pages 20 to 21 of the Acadience Reading Assessment Manual.

Descriptive Information

Please provide a description of your tool:: Acadience Reading Maze (previously published under the DIBELS Next® mark) is a measure of reading comprehension. Maze assesses a student’s ability to construct meaning from text using word recognition skills, background information and prior knowledge, familiarity with linguistic properties such as syntax and morphology, and cause-and-effect reasoning skills. Maze is one measure within a broader reading assessment known as Acadience Reading.

The tool is intended for use with the following grade(s).

Preschool / Pre-kindergarten
not selected

Kindergarten
not selected

First grade
not selected

Second grade
selected

Third grade
selected

Fourth grade
selected

Fifth grade
selected

Sixth grade
not selected

Seventh grade
not selected

Eighth grade
not selected

Ninth grade
not selected

Tenth grade
not selected

Eleventh grade
not selected

Twelfth grade

The tool is intended for use with the following age(s).

0-4 years old
not selected

5 years old
not selected

6 years old
not selected

7 years old
not selected

8 years old
not selected

9 years old
not selected

10 years old
not selected

11 years old
not selected

12 years old
not selected

13 years old
not selected

14 years old
not selected

15 years old
not selected

16 years old
not selected

17 years old
not selected

18 years old

The tool is intended for use with the following student populations.

Students in general education
not selected

Students with disabilities
not selected

English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading

Phonological processing:

RAN

Memory

Awareness

Letter sound correspondence
not selected

Phonics

Structural analysis

Word ID

Accuracy

Speed

Nonword

Accuracy

Speed

Spelling

Accuracy

Speed

Passage

Accuracy

Speed

Reading comprehension:

Multiple choice questions
not selected

Cloze

Constructed Response
not selected

Retell

Maze

Sentence verification
not selected

Other (please describe):

Listening comprehension:

Multiple choice questions
not selected

Cloze

Constructed Response
not selected

Retell

Maze

Sentence verification
not selected

Vocabulary
not selected

Expressive
not selected

Receptive

Mathematics

Global Indicator of Math Competence

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Early Numeracy

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Mathematics Concepts

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Mathematics Computation

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Mathematics Application

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Fractions/Decimals

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Algebra

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Geometry

Accuracy

Speed

Multiple Choice
not selected

Constructed Response

Other (please describe):

Please describe specific domain, skills or subtests:

BEHAVIOR ONLY: Which category of behaviors does your tool target?: Internalizing
Externalizing
Internalizing and Externalizing

BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:

Email Address: info@dibels.org
Address: 859 Willamette Street, Suite 320, Eugene, OR 97410
Phone Number: 541-431-6931 or toll free 888-943-1240
Website: http://dibels.org

Initial cost for implementing program:

Cost: $0.00
Unit of cost: Any

Replacement cost per unit for subsequent use:

Cost: $0.00
Unit of cost: Any
Duration of license: Unlimited

Additional cost information:

Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.: Additional information on where the tool can be obtained:

Dynamic Measurement Group (free download version in black and white)
Website: http://dibels.org
Address: 859 Willamette Street, Suite 320, Eugene, OR 97410
Phone number: 541-431-6931 or toll free 888-943-1240
Email address: info@dibels.org

Voyager Sopris Learning (published print version in color)
Website: http://voyagersopris.com
Address: 17855 Dallas Parkway, Suite 400, Dallas, TX 75287-6816
Telephone number: (800) 547-6747

Amplify (published mobile device version)
Website: www.amplify.com
Address: 55 Washington Street, Suite 800, Brooklyn, NY 11201
Telephone number: (800) 823-1969

Additional cost information:

Amplify
Initial cost for implementing program: $14.90
Unit of cost: student
Replacement cost for subsequent use: $14.90
Unit of cost: student
License duration: Year

Voyager Sopris Learning
Initial cost for implementing program: $3.64 – $3.72
Unit of cost: student
Replacement cost for subsequent use: $3.62 – $3.74
Unit of cost: student
License duration: Year

See cost information noted above and in the footnotes, as well as costs related to training in Section B, Item 5. Computer and internet access are required for the mobile device version from Amplify.

Provide information about special accommodations for students with disabilities.: Acadience Reading is appropriate for most students for whom an instructional goal is to learn to read in English. For English language learners who are learning to read in English, Acadience Reading is appropriate for assessing and monitoring progress in acquisition of early reading skills. For all Acadience Reading measures (including Maze), students are never penalized for articulation or dialect differences that are part of their typical speech. In addition, Acadience Reading Maze includes a set of approved accommodations that assessors may use when appropriate (see the Acadience Reading Assessment Manual, pp. 20-21).

There are a few groups of students for whom Acadience Reading is not appropriate:
(a) students who are learning to read in a language other than English;
(b) students who are deaf;
(c) students who have fluency-based speech disabilities such as stuttering (if the stuttering affects the student's response fluency within a one-minute timed assessment) and oral apraxia; and
(d) students with severe disabilities for whom learning to read connected text is not an IEP goal.

Assessment accommodations are used for those students for whom the standard administration conditions would not produce accurate results. Approved accommodations are those accommodations which are unlikely to change how the assessment functions. When approved accommodations are used, the scores can be reported and interpreted as official Acadience Reading scores (see Table 2.1 in the Acadience Reading Assessment Manual, p. 20, for a list of approved accommodations). Approved accommodations should be used only for students for whom the accommodations are necessary to provide an accurate assessment of student skills.

Unapproved accommodations are accommodations that are likely to change how the assessment functions (such as modifying the timing rules). Scores from measures administered with unapproved accommodations should not be treated or reported as official Acadience Reading scores and cannot be compared to other Acadience Reading scores or benchmark goals, but they can be used to measure individual growth for a student. An unapproved accommodation may be used when (a) a student cannot be tested accurately using the standardized rules or approved accommodations, but the school would still like to measure progress for that student; or (b) a student’s Individualized Education Plan (IEP) requires testing with an unapproved accommodation. For more information about accommodations, see pages 20 to 21 of the Acadience Reading Assessment Manual.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?

General education teacher
not selected

Special education teacher
not selected

Parent

Child

External observer
not selected

Other

If other, please specify:

What is the administration setting?

Direct observation
not selected

Rating scale
not selected

Checklist

Performance measure
not selected

Questionnaire
not selected

Direct: Computerized
selected

One-to-one
not selected

Other

If other, please specify:

Does the tool require technology?

If yes, what technology is required to implement your tool? (Select all that apply)

Computer or tablet
not selected

Internet connection
not selected

Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?

Individual
not selected

Small group If small group, n=
selected

Large group If large group, n=
not selected

Computer-administered
not selected

Other

If other, please specify:

What is the administration time?

Time in minutes

per (student/group/other unit)

group

Additional scoring time:

Time in minutes

per (student/group/other unit)

student form

ACADEMIC ONLY: What are the discontinue rules?

No discontinue rules provided
not selected

Basals

Ceilings

Other

If other, please specify:

Are norms available?: Yes

Are benchmarks available?: Yes
If yes, how many benchmarks per year?: Three
If yes, for which months are benchmarks available?: Beginning of year (months 1 - 3 of school year), middle of year (months 4 - 6 of the school year), and end of year (months 7 - 9 of the school year).
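
As a small illustration of the three benchmark windows listed above, the sketch below maps a month of the school year to its window. The function name and the convention of numbering school months from 1 to 9 are assumptions made for illustration; they are not part of the published materials.

```python
def benchmark_window(school_month: int) -> str:
    """Return the benchmark window for a school-year month (1 = first month of school)."""
    if not 1 <= school_month <= 9:
        raise ValueError("school_month must be between 1 and 9")
    if school_month <= 3:
        return "beginning of year"
    if school_month <= 6:
        return "middle of year"
    return "end of year"


windows = [benchmark_window(m) for m in range(1, 10)]
```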

BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?: Yes

Describe the time required for administrator training, if applicable:: 1-4 hrs

Please describe the minimum qualifications an administrator must possess.: No minimum qualifications

Are training manuals and materials available?: Yes

Are training manuals/materials field-tested?: Yes

Are training manuals/materials included in cost of tools?: No
If No, please describe training costs:: The Acadience Reading Assessment Manual is available for free download along with the test materials. In addition, Dynamic Measurement Group offers a variety of training options to meet different needs and at different price points. Training options include online training, live online webinars, onsite training (hiring a trainer to come out to the school or district), and our Super Institute, which takes place each summer. Dynamic Measurement Group staff can work with schools, LEAs, regional agencies, and SEAs to develop customized training plans to meet their unique needs. We also have an Acadience Reading Mentor program, where a single attendee or small group of attendees can become Acadience Reading Mentors and receive access to our official training materials, which they can use to train others in their school or district. For an individual teacher subscription to the online Acadience Reading Essential Workshop, the cost is $129. Please note: Other training options may cost more or less depending on the circumstances and the number of attendees.

Can users obtain ongoing professional and technical support?: Yes
If Yes, please describe how users can obtain support:: Dynamic Measurement Group provides customer support for all Acadience Reading assessments, as well as support for the data management and reporting system, Acadience Data Management. Staff are available by phone and email on weekdays from 7am to 5pm Pacific Time, for no additional cost. The majority of customer support requests are resolved in less than one business day.

Scoring

How are scores calculated?

Manually (by hand)
selected

Automatically (computer-scored)
not selected

Other

If other, please specify:

Do you provide basis for calculating performance level scores?: Yes

What is the basis for calculating performance level and percentile scores?

Age norms

Grade norms
not selected

Classwide norms
not selected

Schoolwide norms
not selected

Stanines

Normal curve equivalents

What types of performance level scores are available?

Raw score

Standard score
selected

Percentile score
not selected

Grade equivalents
not selected

IRT-based score
not selected

Age equivalents
not selected

Stanines

Normal curve equivalents
selected

Developmental benchmarks
not selected

Developmental cut points
not selected

Equated

Probability
not selected

Lexile score
not selected

Error analysis
not selected

Composite scores
not selected

Subscale/subtest scores
not selected

Other

If other, please specify:

Does your tool include decision rules?: Yes
If yes, please describe.: The fundamental logic for developing the Acadience Reading benchmark goals and cut points for risk was to begin with an external outcome goal and work backward in a step-by-step system. Student skills at or above benchmark at the beginning of the year put odds in favor of the student achieving the middle-of-year benchmark goal. In turn, students with skills at or above benchmark in the middle of the year have odds in favor of achieving the end-of-year benchmark goal.

We first obtained an external criterion measure (the GRADE Total Test Raw Score) at the end of the year with a level of performance that would represent adequate reading skills. The scores at the 40th and 20th percentiles on the GRADE compared to the GRADE normative sample were used as an approximation for adequate reading skills, and a cut point for risk, respectively. Next, we specified the benchmark goal and cut point for risk on the end-of-year Acadience Reading Composite with respect to the end-of-year external criterion. Then, using the Acadience Reading Composite end-of-year goal as an internal criterion, we established the benchmark goals and cut points for risk on the middle-of-year Acadience Reading Composite. Finally, we established the benchmark goals and cut points for risk on the beginning-of-year Acadience Reading Composite using the middle-of-year Acadience Reading Composite as an internal criterion.

The primary design specification for benchmark goals was to establish a level of skill where students scoring at or above benchmark have favorable odds (80%-90%) of achieving subsequent reading outcomes. The primary specification for a cut point for risk is a level of skill where students scoring below that level have low odds (10%-20%) of achieving subsequent reading outcomes. A secondary specification was based on an examination of marginal percents. We aimed to keep the marginal percent of students in each score level consistent from predictor to criterion.

Another consideration was based on logistic regression predicting the odds of scoring at or above benchmark on the criterion based on the score on the predictor. We aimed to keep the predicted odds for students obtaining the exact benchmark goal at 60% or higher of achieving subsequent goals, and the predicted odds of achieving subsequent goals at 40% or less for students obtaining the exact score corresponding to the cut point for risk. Additional issues considered include the pattern of student performance in the scatterplot, the ROC curve analysis to evaluate the AUC, sensitivity and specificity, and the overall pattern of benchmark goals and cut points for risk across grades. The same standard setting methodology used for the Acadience Reading Composite was also used for each individual Acadience Reading component measure.

For details, please see the following documents:
• DIBELS Next Benchmark Goals and Composite Score, p. 1-5.
• DIBELS Next Technical Manual, p. 49-54, and p. 63-83 for Benchmark Goal Detail pages.
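
The primary design specifications above (favorable odds of 80%-90% at or above benchmark, low odds of 10%-20% below the cut point for risk) can be checked empirically for any candidate pair of cut scores. The following is a minimal sketch of such a check, not the developers' procedure: the function name, the level labels, and the toy scores are all hypothetical, and a real analysis would use a much larger sample.

```python
from typing import Dict, List, Optional, Sequence


def outcome_rates(
    scores: Sequence[float],
    met_goal: Sequence[bool],
    benchmark: float,
    cut_point: float,
) -> Dict[str, Optional[float]]:
    """Proportion of students in each score level who met the later criterion goal."""
    levels: Dict[str, List[bool]] = {
        "at_or_above_benchmark": [],
        "between_cut_point_and_benchmark": [],
        "below_cut_point": [],
    }
    for score, met in zip(scores, met_goal):
        if score >= benchmark:
            levels["at_or_above_benchmark"].append(met)
        elif score >= cut_point:
            levels["between_cut_point_and_benchmark"].append(met)
        else:
            levels["below_cut_point"].append(met)
    # None marks a level with no students, where no rate can be computed.
    return {name: (sum(v) / len(v) if v else None) for name, v in levels.items()}


# Toy data (hypothetical): predictor scores and whether each student met the criterion.
toy_scores = [5, 8, 9, 12, 14, 15, 18, 20, 22, 25]
toy_met = [False, False, False, False, True, False, True, True, True, True]
rates = outcome_rates(toy_scores, toy_met, benchmark=18, cut_point=10)
```

With real data, the rate for the top level would be compared against the 80%-90% target and the rate for the bottom level against the 10%-20% target; candidate cut scores that miss those targets would be adjusted.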

Can you provide evidence in support of multiple decision rules?: Yes
If yes, please describe.: Research evidence supporting the use of Acadience Reading measures for benchmark assessment three times per year (diagnostic screening) is found in the following documents:
• DIBELS Next Technical Manual, p. 49-54, and p. 63-83 for Benchmark Goal Detail pages.
• DIBELS Next Technical Adequacy Brief, Table 5, p. 12-13.
• Using Curriculum-Based Measures to Predict Reading Test Scores on the Michigan Educational Assessment Program (Technical Report)
• DIBELS Next and the SBAC ELA (Contemporary School Psych 2018)
• Using DIBELS Next to Predict Performance on Statewide ELA Assessments: A Tale of Two Tests (NASP 2018)

Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.

Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.: The Acadience Reading measures were developed to provide teachers with information they need to make decisions about instruction. The authors advocate a data-based decision-making model referred to as the Outcomes-Driven Model, because the data are used to make decisions to improve student outcomes by matching the amount and type of instructional support with the needs of the individual students. These steps of the model repeat each trimester as a student progresses through the grades. At the beginning of the trimester, the first step is to identify students who may need additional support. At the end of the trimester, the final step is to review outcomes, which also facilitates identifying students who need additional support for the next trimester. In this manner, educators can ensure that students who are on track to become proficient readers continue to make adequate progress, and that those students who are not on track receive the support they need to become proficient readers.

Step 1: Identify need for support early. This process occurs during benchmark assessment, and is also referred to as universal screening. The purpose is to identify those students who may need additional instructional support to achieve benchmark goals. The benchmark assessment also provides information regarding the performance of all students in the school with respect to benchmark goals. All students within a school or grade are tested three times per year on grade-level material. The testing occurs at the beginning, middle, and end of the school year.

Step 2: Validate need for support. The purpose of this step is to be reasonably confident that the student needs or does not need additional instructional support. Before making individual student decisions, it is important to consider additional information beyond the initial data obtained during benchmark testing. Teachers can always use additional assessment information and knowledge about a student to validate a score before making decisions about instructional support. If there is a discrepancy in the student’s performance relative to other information available about the student, or if there is a question about the accuracy of a score, the score can be validated by retesting the student using alternate forms of the Acadience Reading measures or additional diagnostic assessments as necessary.

Step 3: Plan and implement support. In general, for students who are meeting the benchmark goals, a good, research-based core classroom curriculum should meet their instructional needs, and they will continue to receive benchmark assessment three times per year to ensure they remain on track. Students who are identified as needing support are likely to require additional instruction or intervention in the skill areas where they are having difficulties.

Step 4: Evaluate and modify support as needed. Students who are receiving additional support should be progress monitored more frequently to ensure that the instructional support being provided is helping them get back on track. Students should be monitored on the measures that test the skill areas where they are having difficulties and receiving additional instructional support. Monitoring may occur once per month, once every two weeks, or as often as once per week. In general, students who need the most intensive instruction are progress monitored most frequently.

Step 5: Review outcomes. By looking at the benchmark assessment data for all students, schools can ensure that their instructional supports (both core curriculum and additional interventions) are working for all students. If a school identifies areas of instructional support that are not working as desired, the school can use the data to help make decisions on how to improve.

The use of Acadience Reading measures within the Outcomes-Driven Model is consistent with the most recent reauthorization of the Individuals with Disabilities Education Improvement Act (IDEA), which allows the use of a Response to Intervention (RtI) approach to identify children with learning disabilities. In an RtI approach to identification, early intervention is provided to students who are at risk for the development of learning difficulties. Data are gathered to determine which students are responsive to the intervention provided and which students need more intensive support (Fuchs & Fuchs, 2006). The Outcomes-Driven Model is based on foundational work with a problem-solving model (see Deno, 1989; Shinn, 1995; Tilly, 2008) and the initial application of the problem-solving model to early literacy skills (Kaminski & Good, 1998). The general questions addressed by a problem-solving model include: What is the problem? Why is it happening? What should be done about it? Did it work? (Tilly, 2008). The Outcomes-Driven Model was developed to address these questions, but within a prevention-oriented framework designed to preempt early reading difficulty and ensure step-by-step progress toward outcomes that will result in established, adequate reading achievement.
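
The screening and monitoring steps described above can be sketched as a simple decision function. This is an illustration only: the class and function names are hypothetical, the numeric cut scores are placeholders rather than published goals, and the specific mapping of score level to monitoring interval is an assumption (the text states only that monitoring ranges from monthly to weekly and that students with the most intensive needs are monitored most often). A real decision would also include the validation step, which relies on teacher judgment and retesting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SupportPlan:
    needs_additional_support: bool
    # Weeks between progress-monitoring probes; None means benchmark assessment only.
    monitoring_interval_weeks: Optional[int]


def plan_support(score: float, benchmark: float, cut_point: float) -> SupportPlan:
    """Map a benchmark score to a provisional support plan (before validation)."""
    if score >= benchmark:
        # Core curriculum is expected to suffice; reassess at the next benchmark window.
        return SupportPlan(needs_additional_support=False, monitoring_interval_weeks=None)
    if score >= cut_point:
        # Assumed interval for students below benchmark but above the cut point for risk.
        return SupportPlan(needs_additional_support=True, monitoring_interval_weeks=2)
    # Assumed interval for students below the cut point for risk.
    return SupportPlan(needs_additional_support=True, monitoring_interval_weeks=1)


# Placeholder cut scores, chosen only to exercise each branch.
on_track = plan_support(20, benchmark=18, cut_point=10)
some_risk = plan_support(12, benchmark=18, cut_point=10)
at_risk = plan_support(5, benchmark=18, cut_point=10)
```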

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grade	Grade 3	Grade 4	Grade 5	Grade 6
Classification Accuracy Fall
Classification Accuracy Winter
Classification Accuracy Spring

Legend

Convincing evidence

Partially convincing evidence

Unconvincing evidence

Data unavailable

d: Disaggregated data available

Group Reading Assessment and Diagnostic Evaluation (GRADE)

Classification Accuracy

Select time of year

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.: The Group Reading Assessment and Diagnostic Evaluation (GRADE) is an untimed, group-administered, norm-referenced reading achievement test appropriate for children in preschool through grade 12. The GRADE is comprised of 16 subtests within five components. Not all 16 subtests are used at each testing level. Various subtest scores are combined to form the Total Test composite score, and the subtests that contribute to the GRADE Total Test score vary by grade level. In kindergarten, the GRADE Total Test score is comprised of measures that assess phonics and phonemic and phonological awareness. In first and second grade, GRADE Total Test includes word meaning, passage (or sentence) reading, and comprehension measures. In third grade, GRADE Total Test is comprised of measures assessing word reading, vocabulary, and comprehension. In fourth, fifth, and sixth grade, GRADE Total Test includes scores from measures of vocabulary and comprehension. Each of the criterion measures was administered at the end of the academic year (April through June). This means that the fall and winter administrations of Acadience Reading were separated from the criterion measure by at least three months. Furthermore, the GRADE and CST criterion measures are distinct from Acadience Reading. Each of these criterion measures was developed separately, based upon different samples of students, and is published by an organization separate from Acadience Reading.

Do the classification accuracy analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.

Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).: Classification analyses used a logistic regression predicting each student’s classification on the criterion from the Acadience Reading Composite Score. The target outcome (or “positive”) was whether the student was classified as having intensive need. The fitted logistic regression model was then used to calculate the reported classification statistics. The 20th percentile on the Group Reading Assessment and Diagnostic Evaluation (GRADE) was chosen as the cut point for identifying the students most in need of intensive evaluation.
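As an illustration of the classification statistics listed in the tables on this page (Base Rate through Negative Predictive Power), the following minimal Python sketch derives them from the four cells labeled a through d. It does not reproduce the logistic regression itself. The function name and the example counts are hypothetical and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class ClassificationStats:
    base_rate: float
    overall_classification_rate: float
    sensitivity: float
    specificity: float
    false_positive_rate: float
    false_negative_rate: float
    positive_predictive_power: float
    negative_predictive_power: float


def classification_stats(a: int, b: int, c: int, d: int) -> ClassificationStats:
    """a = true positives, b = false positives,
    c = false negatives, d = true negatives."""
    n = a + b + c + d
    return ClassificationStats(
        base_rate=(a + c) / n,                    # share truly at risk
        overall_classification_rate=(a + d) / n,  # share classified correctly
        sensitivity=a / (a + c),
        specificity=d / (b + d),
        false_positive_rate=b / (b + d),
        false_negative_rate=c / (a + c),
        positive_predictive_power=a / (a + b),
        negative_predictive_power=d / (c + d),
    )


# Hypothetical counts for illustration only.
stats = classification_stats(a=30, b=20, c=10, d=140)
print(stats)
```

Sensitivity and the false negative rate sum to 1, as do specificity and the false positive rate, which is a quick consistency check on any reported table.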

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?: No
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Cross-Validation

Has a cross-validation study been conducted?: No

California Standards Test (CST): ELA, Reading Cluster

Classification Accuracy

Select time of year

Fall

Winter

Spring

Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.: The California Standards Test (CST) is a statewide achievement test produced for California public schools and was designed to assess the California content standards for English/language arts (ELA), mathematics, history–social science, and science in grades two through eleven. The Reading cluster of the ELA portion of the CST was chosen as the criterion. According to a technical report from ETS (2011), the CST items were developed and designed to conform to principles of item writing defined by ETS (ETS, 2002). In addition, the items selected underwent an extensive item review process designed to provide the best standards-based tests possible. Each criterion measure was administered at the end of the academic year (April through June), so the fall and winter administrations of Acadience Reading were separated from the criterion measure by at least three months. Furthermore, the GRADE and CST criterion measures are distinct from Acadience Reading: each was developed separately, based on different samples of students, and is published by an organization separate from Acadience Reading.

Do the classification accuracy analyses examine concurrent and/or predictive classification?

Concurrent
Predictive

Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.

Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).: Classification analyses used a logistic regression predicting each student’s classification on the criterion (e.g., below 350 on the CST) from the Acadience Reading Maze Adjusted Score. The target outcome (or “positive”) was whether the student was classified as having intensive need. The fitted logistic regression model was then used to calculate the reported classification statistics. The California Standards Test (CST) sets a minimum score of 350 for proficiency in reading, and scores below 350 indicate a lack of proficiency. For grade 3 students, however, the standard cut point produced a base rate of children requiring intensive intervention that was too high, so the cut point was lowered to 315.

Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?: No
If yes, please describe the intervention, what children received the intervention, and how they were chosen.

Cross-Validation

Has a cross-validation study been conducted?: No

Classification Accuracy - Fall

Evidence	Grade 3	Grade 4	Grade 5	Grade 6
Criterion measure	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)	California Standards Test (CST): ELA, Reading Cluster
Cut Points - Percentile rank on criterion measure	20	20	20	20
Cut Points - Performance score on criterion measure	78	41	39	<= 350
Cut Points - Corresponding performance score (numeric) on screener measure	5	10	12	14
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC)	0.86	0.85	0.84	0.83
AUC Estimate’s 95% Confidence Interval: Lower Bound	0.82	0.77	0.78	0.79
AUC Estimate’s 95% Confidence Interval: Upper Bound	0.90	0.94	0.90	0.86
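
For readers unfamiliar with the Area Under the Curve (AUC) values reported in these tables, the sketch below shows one standard way to compute an AUC: the rank-based (Mann-Whitney) estimate. It assumes that lower screener scores indicate greater risk. The function name and the scores are hypothetical, and the source does not state which AUC estimator was actually used.

```python
def auc(scores_at_risk, scores_not_at_risk):
    """Probability that a randomly chosen at-risk student has a lower
    screener score than a randomly chosen not-at-risk student.
    Ties count as one half."""
    wins = 0.0
    for x in scores_at_risk:
        for y in scores_not_at_risk:
            if x < y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(scores_at_risk) * len(scores_not_at_risk))


# Hypothetical screener scores, for illustration only.
at_risk = [3, 5, 8, 12]
not_at_risk = [9, 14, 15, 20, 22]
print(auc(at_risk, not_at_risk))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.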

Statistics	Grade 3	Grade 4	Grade 5	Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power

Sample	Grade 3	Grade 4	Grade 5	Grade 6
Date
Sample Size
Geographic Representation
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
Asian/Pacific Islander
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Classification Accuracy - Winter

Evidence	Grade 3	Grade 4	Grade 5	Grade 6
Criterion measure	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)
Cut Points - Percentile rank on criterion measure	20	20	20	20
Cut Points - Performance score on criterion measure	78	41	39	53
Cut Points - Corresponding performance score (numeric) on screener measure	7	12	13	14
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC)	0.88	0.84	0.84	0.80
AUC Estimate’s 95% Confidence Interval: Lower Bound	0.82	0.76	0.77	0.69
AUC Estimate’s 95% Confidence Interval: Upper Bound	0.93	0.92	0.92	0.92

Statistics	Grade 3	Grade 4	Grade 5	Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power

Sample	Grade 3	Grade 4	Grade 5	Grade 6
Date
Sample Size
Geographic Representation
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
Asian/Pacific Islander
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Classification Accuracy - Spring

Evidence	Grade 3	Grade 4	Grade 5	Grade 6
Criterion measure	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)	Group Reading Assessment and Diagnostic Evaluation (GRADE)
Cut Points - Percentile rank on criterion measure	20	20	20	20
Cut Points - Performance score on criterion measure	78	41	39	53
Cut Points - Corresponding performance score (numeric) on screener measure	14	20	18	15
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC)	0.84	0.87	0.80	0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound	0.76	0.80	0.73	0.75
AUC Estimate’s 95% Confidence Interval: Upper Bound	0.92	0.94	0.88	0.92

Statistics	Grade 3	Grade 4	Grade 5	Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power

Sample	Grade 3	Grade 4	Grade 5	Grade 6
Date	2011	2011	2011	2011
Sample Size
Geographic Representation
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
Asian/Pacific Islander
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Reliability

Grade	Grade 3	Grade 4	Grade 5	Grade 6
Rating

Legend

Convincing evidence

Partially convincing evidence

Unconvincing evidence

Data unavailable

d: Disaggregated data available

*Offer a justification for each type of reliability reported, given the type and purpose of the tool.: Reliability refers to the relative stability with which a test measures the same skills across minor differences in conditions. Three types of reliability are reported in the table below: alternate form reliability, coefficient alpha, and inter-rater reliability. Alternate form reliability is the correlation between different measures of the same early literacy skills. The coefficient reported is the average correlation among three forms of the measure. Coefficient alpha is a measure of reliability that is widely used in education research and represents the proportion of true score variance to total variance. Alpha incorporates information about the average inter-test correlation as well as the number of tests. Inter-rater reliability indicates the extent to which results generalize across assessors. The inter-rater reliability estimates reported here are based on two independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”). The two raters’ scores were then correlated.

*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.: The data used for assessing reliability came from students in third through sixth grade. The total sample size is 674 students from 13 schools within five school districts. The sample was drawn from two census regions (Pacific and North Central Midwest).

*Describe the analysis procedures for each reported type of reliability.: Alternate form reliability is reported as the average correlation among three alternate forms of the same test. Coefficient alpha treats each of the three tests as separate indicators and is calculated using the alternate form reliability, where the number of tests is equal to three.
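
One common formula consistent with this description is the standardized coefficient alpha, which depends only on the average inter-form correlation and the number of forms. The sketch below assumes that formula. The source does not state the exact computation used, and the correlation value shown is hypothetical.

```python
def standardized_alpha(mean_r: float, k: int = 3) -> float:
    """Standardized coefficient alpha from the average correlation
    among k forms: alpha = k * r / (1 + (k - 1) * r)."""
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Hypothetical average alternate form correlation, for illustration only.
alpha = standardized_alpha(mean_r=0.80, k=3)
print(round(alpha, 3))
```

Under this formula, alpha for three forms is always at least as large as the average single-form correlation whenever that correlation is positive.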

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of reliability analysis not compatible with above table format:

Manual cites other published reliability studies:: Yes

Provide citations for additional published studies.: Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). DIBELS Next technical adequacy brief. Eugene: Dynamic Measurement Group.

Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?: No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of reliability analysis not compatible with above table format:

Manual cites other published reliability studies:: No

Provide citations for additional published studies.

Validity

Grade	Grade 3	Grade 4	Grade 5	Grade 6
Rating

Legend

Convincing evidence

Partially convincing evidence

Unconvincing evidence

Data unavailable

d: Disaggregated data available

*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.: The California Standards Test (CST) is a statewide achievement test produced for California public schools and was designed to assess the California content standards for English/language arts (ELA), mathematics, history–social science, and science in grades two through eleven. The Reading cluster of the ELA portion of the CST was chosen as the criterion. According to a technical report from ETS (2011), the CST items were developed and designed to conform to principles of item writing defined by ETS (ETS, 2002). In addition, the items selected underwent an extensive item review process designed to provide the best standards-based tests possible. The AzMERIT is a statewide achievement test produced for Arizona. Arizona partnered with the American Institutes for Research (AIR) to develop this Arizona-specific assessment linked to, and designed to assess, the Arizona College and Career Ready Standards (ACCRS) for English Language Arts (ELA) and Math. According to the test developer, scores from the AzMERIT can be used to evaluate whether students have achieved the ACCRS by the end of the school year, as well as how effectively Arizona districts and schools teach students the ACCRS (AIR, 2017). A rigorous test development process was used to ensure item alignment with the ACCRS. Test blueprints specified the range and depth with which each standard was assessed. Items underwent extensive review by content experts, internal review by AIR experts, and review by parents and community members. Field-testing was extensive, and research examined the relation of the test to other measures (e.g., NAEP, SBAC, ACT). The AzMERIT English Language Arts (ELA) score is derived from performance in three areas: Reading for Information, Reading for Literature, and Writing and Language.
There are four performance levels for the AzMERIT ELA score: 1-Minimally Proficient, 2-Partially Proficient, 3-Proficient, and 4-Highly Proficient. Scores in the level 3 (Proficient) and 4 (Highly Proficient) range meet grade-level achievement standards. The minimal score to be at level 3 on the ELA portion of the AzMERIT is 2509 for Grade 3 and 2523 for Grade 4.

*Describe the sample(s), including size and characteristics, for each validity analysis conducted.: The data used for assessing validity came from students in third through sixth grade. The total sample size is 4,249 students from 14 schools. The sample was drawn from the Pacific census region.

*Describe the analysis procedures for each reported type of validity.: Predictive validity is the correlation between the Acadience Reading Maze Adjusted Score at the beginning of the year and the CST and AzMERIT scores at the end of the school year. This coefficient represents the extent to which Acadience Reading Maze can predict later reading outcomes. Concurrent validity is the correlation between the Acadience Reading Maze Adjusted Score and the CST and AzMERIT scores when both are collected at the end of the year. This coefficient represents the extent to which Acadience Reading Maze is related to important reading outcomes.
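
A validity coefficient of this kind, together with the 95% confidence bounds requested in the table below, can be sketched as follows. The example assumes a Pearson correlation and a Fisher z confidence interval, neither of which the source specifies. The function names and the scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% confidence interval for r via Fisher's z transform."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical fall Maze Adjusted Scores and end-of-year criterion scores.
maze_fall = [10, 14, 18, 22, 30]
criterion_spring = [300, 330, 340, 380, 400]

r = pearson_r(maze_fall, criterion_spring)
lower, upper = fisher_ci(r, n=len(maze_fall))
print(r, lower, upper)
```

For the predictive coefficient the screener scores come from the beginning of the year. For the concurrent coefficient both score lists come from the end of the year.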

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of validity analysis not compatible with above table format:

Manual cites other published reliability studies:: No

Provide citations for additional published studies.: Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). DIBELS Next Technical Adequacy Brief. Eugene: Dynamic Measurement Group.

Describe the degree to which the provided data support the validity of the tool.: Both the concurrent and predictive correlations are high. These strong correlations suggest that Acadience Reading Maze assesses skills relevant to broad reading outcomes.

Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?: No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of	Subgroup	Informant	Age / Grade	Test or Criterion	n	Median Coefficient	95% Confidence Interval Lower Bound	95% Confidence Interval Upper Bound

Results from other forms of validity analysis not compatible with above table format:

Manual cites other published reliability studies:: No

Provide citations for additional published studies.

Bias Analysis

Grade	Grade 3	Grade 4	Grade 5	Grade 6
Rating	Provided	Provided	Provided	Provided

Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.: Yes

If yes,
a. Describe the method used to determine the presence or absence of bias:: Bias was conceptualized as a difference in classification accuracy between groups. This was assessed using a Cleary model with the dichotomous outcome of status on the criterion, where the Maze Adjusted Score, subgroup, and the interaction between the two were used as predictors. If adding the subgroup and interaction terms did not significantly improve model fit, this was taken as evidence that Maze is not biased. Model fit was assessed using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and the likelihood ratio test (LRT). The effect size for bias was assessed using the difference in AUC for the ROC curves for the different groups. These models were tested for each grade, at each time of year.
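
The model comparison described above can be sketched in Python as follows. The sketch assumes the reduced model has two parameters (intercept and Maze Adjusted Score) and the full model has four (adding subgroup and the interaction), so the likelihood ratio test has two degrees of freedom. The function name, log-likelihoods, and sample size are hypothetical, and the logistic regressions themselves are not fitted here.

```python
import math


def compare_models(ll_reduced, ll_full, n, k_reduced=2, k_full=4):
    """Compare a model without bias terms (reduced) to one with subgroup
    and interaction terms (full) using AIC, BIC, and the LRT."""
    aic_reduced = 2 * k_reduced - 2 * ll_reduced
    aic_full = 2 * k_full - 2 * ll_full
    bic_reduced = k_reduced * math.log(n) - 2 * ll_reduced
    bic_full = k_full * math.log(n) - 2 * ll_full

    lrt_stat = 2 * (ll_full - ll_reduced)
    if k_full - k_reduced != 2:
        raise ValueError("this sketch only handles a 2-df likelihood ratio test")
    # For a chi-square distribution with 2 df, P(X > x) = exp(-x / 2).
    lrt_p_value = math.exp(-lrt_stat / 2)

    return {
        "aic_prefers": "reduced" if aic_reduced <= aic_full else "full",
        "bic_prefers": "reduced" if bic_reduced <= bic_full else "full",
        "lrt_stat": lrt_stat,
        "lrt_p_value": lrt_p_value,
    }


# Hypothetical log-likelihoods and sample size, for illustration only.
result = compare_models(ll_reduced=-210.0, ll_full=-208.5, n=400)
print(result)
```

Lower AIC and BIC values indicate the preferred model. A result that prefers the reduced model, together with a non-significant LRT, is read here as a lack of evidence for bias.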

b. Describe the subgroups for which bias analyses were conducted:: Bias was assessed between genders and between white and non-white students.

c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.: Of the 9 models examining bias across ethnicities, the AIC and LRT favored a model without bias eight times, while the BIC favored a model without bias all nine times. Of the 21 models examining bias across genders, the AIC favored a model without bias 17 times, while the BIC favored a model without bias 20 times. Likewise, the likelihood ratio test favored a model with bias only three times out of 21 models. The results show that the rate of preferring a model with bias is near the global Type I error rate of .05, suggesting a lack of bias on the Maze measure.
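
One way to gauge how surprising these counts would be by chance alone is a binomial tail probability. The sketch below assumes the tests are independent and that each has a .05 chance of falsely favoring the bias model. Both are simplifying assumptions not made in the source, and the function name is hypothetical.

```python
from math import comb


def prob_at_least(k: int, n: int, alpha: float = 0.05) -> float:
    """Probability of k or more false positives in n independent tests,
    each with Type I error rate alpha."""
    return sum(
        comb(n, i) * alpha**i * (1 - alpha) ** (n - i) for i in range(k, n + 1)
    )


# Counts reported above for the likelihood ratio test.
p_gender = prob_at_least(3, 21)    # 3 of 21 gender models favored bias
p_ethnicity = prob_at_least(1, 9)  # 1 of 9 ethnicity models favored bias
print(p_gender, p_ethnicity)
```

Under these assumptions the two probabilities are about 0.085 and 0.37, so neither count is unusual at the .05 level.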

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.
