Acadience Reading (aka DIBELS Next)
Maze

Summary

Acadience Reading Maze (previously published under the DIBELS Next® mark) is a measure of reading comprehension. Maze assesses a student’s ability to construct meaning from text using word recognition skills, background information and prior knowledge, familiarity with linguistic properties such as syntax and morphology, and cause-and-effect reasoning skills. Maze is one of several measures that make up a broader reading assessment known as Acadience Reading.

Where to Obtain:
Developer: Dynamic Measurement Group, Publisher: Voyager Sopris Learning and Amplify Education, Inc
info@dibels.org
859 Willamette Street, Suite 320, Eugene, OR 97410
541-431-6931 or toll free 888-943-1240
http://dibels.org
Initial Cost:
Free
Replacement Cost:
Free
Included in Cost:
Where the tool can be obtained:
  • Dynamic Measurement Group (free download version in black and white): http://dibels.org; 859 Willamette Street, Suite 320, Eugene, OR 97410; 541-431-6931 or toll free 888-943-1240; info@dibels.org
  • Voyager Sopris Learning (published print version in color): http://voyagersopris.com; 17855 Dallas Parkway, Suite 400, Dallas, TX 75287-6816; (800) 547-6747
  • Amplify (published mobile device version): www.amplify.com; 55 Washington Street, Suite 800, Brooklyn, NY 11201; (800) 823-1969
Cost by publisher:
  • Amplify: initial cost $14.90 per student; replacement cost $14.90 per student; license duration one year
  • Voyager Sopris Learning: initial cost $3.64 to $3.72 per student; replacement cost $3.62 to $3.74 per student; license duration one year
See the cost information noted above and in the footnotes, as well as costs related to training in Section B, Item 5. Computer and internet access are required for the mobile device version from Amplify.
Training Requirements:
1-4 hrs
Qualified Administrators:
No minimum qualifications specified.
Access to Technical Support:
Dynamic Measurement Group provides customer support for all Acadience Reading assessments, as well as support for the data management and reporting system, Acadience Data Management. Staff are available by phone and email on weekdays from 7am to 5pm Pacific Time, for no additional cost. The majority of customer support requests are resolved in less than one business day.
Assessment Format:
  • Direct observation
  • Performance measure
  • One-to-one
Scoring Time:
  • Scoring is automatic OR
  • 1 minute per student form
Scores Generated:
  • Raw score
  • Percentile score
  • Developmental benchmarks
Administration Time:
  • 5 minutes per group
Scoring Method:
  • Manually (by hand)
  • Automatically (computer-scored)
Technology Requirements:
Accommodations:
Acadience Reading is appropriate for most students for whom an instructional goal is to learn to read in English. For English language learners who are learning to read in English, Acadience Reading is appropriate for assessing and monitoring progress in acquisition of early reading skills. For all Acadience Reading measures (including Maze), students are never penalized for articulation or dialect differences that are part of their typical speech. In addition, Acadience Reading Maze includes a set of approved accommodations that assessors may use when appropriate (see the Acadience Reading Assessment Manual, p. 20-21). There are a few groups of students for whom Acadience Reading is not appropriate: (a) students who are learning to read in a language other than English; (b) students who are deaf; (c) students who have fluency-based speech disabilities such as stuttering (if the stuttering affects the student's response fluency within a one-minute timed assessment) and oral apraxia; and (d) students with severe disabilities for whom learning to read connected text is not an IEP goal. Assessment accommodations are used for those students for whom the standard administration conditions would not produce accurate results. Approved accommodations are those accommodations which are unlikely to change how the assessment functions. When approved accommodations are used, the scores can be reported and interpreted as official Acadience Reading scores (see Table 2.1 in the Acadience Reading Assessment Manual, p. 20, for a list of approved accommodations). Approved accommodations should be used only for students for whom the accommodations are necessary to provide an accurate assessment of student skills. Unapproved accommodations are accommodations that are likely to change how the assessment functions (such as modifying the timing rules). 
Scores from measures administered with unapproved accommodations should not be treated or reported as official Acadience Reading scores and cannot be compared to other Acadience Reading scores or benchmark goals but can be used to measure individual growth for a student. An unapproved accommodation may be used when (a) a student cannot be tested accurately using the standardized rules or approved accommodations, but the school would still like to measure progress for that student; or (b) a student’s Individualized Education Plan (IEP) requires testing with an unapproved accommodation. For more information about accommodations, see pages 20 to 21 of the Acadience Reading Assessment Manual.

Descriptive Information

Please provide a description of your tool:
Acadience Reading Maze (previously published under the DIBELS Next® mark) is a measure of reading comprehension. Maze assesses a student’s ability to construct meaning from text using word recognition skills, background information and prior knowledge, familiarity with linguistic properties such as syntax and morphology, and cause-and-effect reasoning skills. Maze is one of several measures that make up a broader reading assessment known as Acadience Reading.
The tool is intended for use with the following grade(s).
not selected Preschool / Pre - kindergarten
not selected Kindergarten
not selected First grade
not selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
selected Sixth grade
not selected Seventh grade
not selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
not selected 0-4 years old
not selected 5 years old
not selected 6 years old
not selected 7 years old
not selected 8 years old
not selected 9 years old
not selected 10 years old
not selected 11 years old
not selected 12 years old
not selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
not selected Students in general education
not selected Students with disabilities
not selected English language learners

ACADEMIC ONLY: What skills does the tool screen?

Reading
Phonological processing:
not selected RAN
not selected Memory
not selected Awareness
not selected Letter sound correspondence
not selected Phonics
not selected Structural analysis

Word ID
not selected Accuracy
not selected Speed

Nonword
not selected Accuracy
not selected Speed

Spelling
not selected Accuracy
not selected Speed

Passage
not selected Accuracy
not selected Speed

Reading comprehension:
not selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
selected Maze
not selected Sentence verification
not selected Other (please describe):


Listening comprehension:
not selected Multiple choice questions
not selected Cloze
not selected Constructed Response
not selected Retell
not selected Maze
not selected Sentence verification
not selected Vocabulary
not selected Expressive
not selected Receptive

Mathematics
Global Indicator of Math Competence
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Early Numeracy
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Concepts
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematics Computation
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Mathematic Application
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Fractions/Decimals
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Algebra
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

Geometry
not selected Accuracy
not selected Speed
not selected Multiple Choice
not selected Constructed Response

not selected Other (please describe):

Please describe specific domain, skills or subtests:
BEHAVIOR ONLY: Which category of behaviors does your tool target?


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.

Acquisition and Cost Information

Where to obtain:
Email Address
info@dibels.org
Address
859 Willamette Street, Suite 320, Eugene, OR 97410
Phone Number
541-431-6931 or toll free 888-943-1240
Website
http://dibels.org
Initial cost for implementing program:
Cost
$0.00
Unit of cost
Any
Replacement cost per unit for subsequent use:
Cost
$0.00
Unit of cost
Any
Duration of license
Unlimited
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
Where the tool can be obtained:
  • Dynamic Measurement Group (free download version in black and white): http://dibels.org; 859 Willamette Street, Suite 320, Eugene, OR 97410; 541-431-6931 or toll free 888-943-1240; info@dibels.org
  • Voyager Sopris Learning (published print version in color): http://voyagersopris.com; 17855 Dallas Parkway, Suite 400, Dallas, TX 75287-6816; (800) 547-6747
  • Amplify (published mobile device version): www.amplify.com; 55 Washington Street, Suite 800, Brooklyn, NY 11201; (800) 823-1969
Cost by publisher:
  • Amplify: initial cost $14.90 per student; replacement cost $14.90 per student; license duration one year
  • Voyager Sopris Learning: initial cost $3.64 to $3.72 per student; replacement cost $3.62 to $3.74 per student; license duration one year
See the cost information noted above and in the footnotes, as well as costs related to training in Section B, Item 5. Computer and internet access are required for the mobile device version from Amplify.
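As a rough illustration of the per-student, per-year pricing listed above, the sketch below estimates an initial-year licensing cost range for a given enrollment. The prices are the figures quoted above (the Amplify amount is assumed to be in U.S. dollars); the function name, the dictionary keys, and the enrollment of 120 students are hypothetical, and training costs are not included.

```python
# Per-student initial-year prices quoted in the cost information above.
# Assumption: the Amplify figure (14.90) is in U.S. dollars.
PRICE_PER_STUDENT = {
    "Dynamic Measurement Group (download)": (0.00, 0.00),
    "Voyager Sopris Learning (print)": (3.64, 3.72),
    "Amplify (mobile)": (14.90, 14.90),
}


def annual_cost_range(vendor: str, students: int) -> tuple[float, float]:
    """Return the (low, high) initial-year cost for a number of students."""
    if students < 0:
        raise ValueError("students must be non-negative")
    low, high = PRICE_PER_STUDENT[vendor]
    return round(low * students, 2), round(high * students, 2)


if __name__ == "__main__":
    # Hypothetical enrollment of 120 students.
    for vendor in PRICE_PER_STUDENT:
        low, high = annual_cost_range(vendor, students=120)
        print(f"{vendor}: ${low:,.2f} to ${high:,.2f}")
```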
Provide information about special accommodations for students with disabilities.
Acadience Reading is appropriate for most students for whom an instructional goal is to learn to read in English. For English language learners who are learning to read in English, Acadience Reading is appropriate for assessing and monitoring progress in acquisition of early reading skills. For all Acadience Reading measures (including Maze), students are never penalized for articulation or dialect differences that are part of their typical speech. In addition, Acadience Reading Maze includes a set of approved accommodations that assessors may use when appropriate (see the Acadience Reading Assessment Manual, p. 20-21). There are a few groups of students for whom Acadience Reading is not appropriate: (a) students who are learning to read in a language other than English; (b) students who are deaf; (c) students who have fluency-based speech disabilities such as stuttering (if the stuttering affects the student's response fluency within a one-minute timed assessment) and oral apraxia; and (d) students with severe disabilities for whom learning to read connected text is not an IEP goal. Assessment accommodations are used for those students for whom the standard administration conditions would not produce accurate results. Approved accommodations are those accommodations which are unlikely to change how the assessment functions. When approved accommodations are used, the scores can be reported and interpreted as official Acadience Reading scores (see Table 2.1 in the Acadience Reading Assessment Manual, p. 20, for a list of approved accommodations). Approved accommodations should be used only for students for whom the accommodations are necessary to provide an accurate assessment of student skills. Unapproved accommodations are accommodations that are likely to change how the assessment functions (such as modifying the timing rules). 
Scores from measures administered with unapproved accommodations should not be treated or reported as official Acadience Reading scores and cannot be compared to other Acadience Reading scores or benchmark goals but can be used to measure individual growth for a student. An unapproved accommodation may be used when (a) a student cannot be tested accurately using the standardized rules or approved accommodations, but the school would still like to measure progress for that student; or (b) a student’s Individualized Education Plan (IEP) requires testing with an unapproved accommodation. For more information about accommodations, see pages 20 to 21 of the Acadience Reading Assessment Manual.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected General education teacher
not selected Special education teacher
not selected Parent
not selected Child
not selected External observer
not selected Other
If other, please specify:

What is the administration setting?
selected Direct observation
not selected Rating scale
not selected Checklist
selected Performance measure
not selected Questionnaire
not selected Direct: Computerized
selected One-to-one
not selected Other
If other, please specify:

Does the tool require technology?
No

If yes, what technology is required to implement your tool? (Select all that apply)
not selected Computer or tablet
not selected Internet connection
not selected Other technology (please specify)

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?
selected Individual
not selected Small group   If small group, n=
selected Large group   If large group, n=
not selected Computer-administered
not selected Other
If other, please specify:

What is the administration time?
Time in minutes
5
per (student/group/other unit)
group

Additional scoring time:
Time in minutes
1
per (student/group/other unit)
student form

ACADEMIC ONLY: What are the discontinue rules?
selected No discontinue rules provided
not selected Basals
not selected Ceilings
not selected Other
If other, please specify:


Are norms available?
Yes
Are benchmarks available?
Yes
If yes, how many benchmarks per year?
Three
If yes, for which months are benchmarks available?
Beginning of year (months 1 - 3 of school year), middle of year (months 4 - 6 of the school year), and end of year (months 7 - 9 of the school year).
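The month ranges above can be expressed as a small lookup. This is only a sketch: the function name is hypothetical, and it assumes a nine-month school year numbered 1 through 9 as described in the answer above.

```python
def benchmark_period(school_month: int) -> str:
    """Map a month of the school year (1-9) to its benchmark period."""
    if not 1 <= school_month <= 9:
        raise ValueError("school_month must be between 1 and 9")
    if school_month <= 3:
        return "beginning of year"
    if school_month <= 6:
        return "middle of year"
    return "end of year"


if __name__ == "__main__":
    for month in range(1, 10):
        print(month, benchmark_period(month))
```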
BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
1-4 hrs
Please describe the minimum qualifications an administrator must possess.
selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
No
If No, please describe training costs:
The Acadience Reading Assessment Manual is available for free download along with the test materials. In addition, Dynamic Measurement Group offers a variety of training options to meet different needs and at different price points. Training options include online training, live online webinars, onsite training (hiring a trainer to come out to the school or district), and our Super Institute, which takes place each summer. Dynamic Measurement Group staff can work with schools, LEAs, regional agencies, and SEAs to develop customized training plans to meet their unique needs. We also have an Acadience Reading Mentor program, where a single attendee or small group of attendees can become Acadience Reading Mentors and receive access to our official training materials, which they can use to train others in their school or district. An individual teacher subscription to the online Acadience Reading Essential Workshop costs $129. Please note: Other training options may cost more or less depending on the circumstances and the number of attendees.
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
Dynamic Measurement Group provides customer support for all Acadience Reading assessments, as well as support for the data management and reporting system, Acadience Data Management. Staff are available by phone and email on weekdays from 7am to 5pm Pacific Time, for no additional cost. The majority of customer support requests are resolved in less than one business day.

Scoring

How are scores calculated?
selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
Yes
What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
selected Percentile score
not selected Grade equivalents
not selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
selected Developmental benchmarks
not selected Developmental cut points
not selected Equated
not selected Probability
not selected Lexile score
not selected Error analysis
not selected Composite scores
not selected Subscale/subtest scores
not selected Other
If other, please specify:

Does your tool include decision rules?
Yes
If yes, please describe.
The fundamental logic for developing the Acadience Reading benchmark goals and cut points for risk was to begin with an external outcome goal and work backward in a step-by-step system. Student skills at or above benchmark at the beginning of the year put odds in favor of the student achieving the middle-of-year benchmark goal. In turn, students with skills at or above benchmark in the middle of the year have odds in favor of achieving the end-of-year benchmark goal. We first obtained an external criterion measure (the GRADE Total Test Raw Score) at the end of the year with a level of performance that would represent adequate reading skills. The scores at the 40th and 20th percentiles on the GRADE compared to the GRADE normative sample were used as an approximation for adequate reading skills and a cut point for risk, respectively. Next, we specified the benchmark goal and cut point for risk on the end-of-year Acadience Reading Composite with respect to the end-of-year external criterion. Then, using the Acadience Reading Composite end-of-year goal as an internal criterion, we established the benchmark goals and cut points for risk on the middle-of-year Acadience Reading Composite. Finally, we established the benchmark goals and cut points for risk on the beginning-of-year Acadience Reading Composite using the middle-of-year Acadience Reading Composite as an internal criterion.
The primary design specification for benchmark goals was to establish a level of skill where students scoring at or above benchmark have favorable odds (80%-90%) of achieving subsequent reading outcomes. The primary specification for a cut point for risk is a level of skill where students scoring below that level have low odds (10%-20%) of achieving subsequent reading outcomes. A secondary specification was based on an examination of marginal percents. We aimed to keep the marginal percent of students in each score level consistent from predictor to criterion.
Another consideration was based on logistic regression predicting the odds of scoring at or above benchmark on the criterion based on the score on the predictor. We aimed to keep the predicted odds of achieving subsequent goals at 60% or higher for students obtaining the exact benchmark goal, and at 40% or less for students obtaining the exact score corresponding to the cut point for risk. Additional issues considered include the pattern of student performance in the scatterplot, the ROC curve analysis to evaluate the AUC, sensitivity and specificity, and the overall pattern of benchmark goals and cut points for risk across grades. The same standard setting methodology used for the Acadience Reading Composite was also used for each individual Acadience Reading component measure.
For details, please see the following documents:
  • DIBELS Next Benchmark Goals and Composite Score, p. 1-5.
  • DIBELS Next Technical Manual, p. 49-54, and p. 63-83 for Benchmark Goal Detail pages.
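The decision rules described above can be sketched in a few lines, under stated assumptions. The benchmark goal of 15, the cut point of 10, and the logistic coefficients are hypothetical values chosen for illustration, not published Acadience Reading figures. The level labels are descriptive rather than official, and a score exactly at the cut point is treated as not below it, following the wording "students scoring below that level".

```python
import math


def score_level(score: float, benchmark_goal: float, cut_point: float) -> str:
    """Place a score relative to a benchmark goal and a cut point for risk."""
    if cut_point > benchmark_goal:
        raise ValueError("cut_point must not exceed benchmark_goal")
    if score >= benchmark_goal:
        return "at or above benchmark"
    if score >= cut_point:
        return "below benchmark"
    return "below cut point for risk"


def predicted_probability(score: float, intercept: float, slope: float) -> float:
    """Logistic-regression probability of reaching the later goal."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * score)))


def meets_design_targets(benchmark_goal: float, cut_point: float,
                         intercept: float, slope: float) -> bool:
    """Check the 60%-or-higher and 40%-or-less targets described above."""
    at_goal = predicted_probability(benchmark_goal, intercept, slope)
    at_cut = predicted_probability(cut_point, intercept, slope)
    return at_goal >= 0.60 and at_cut <= 0.40


if __name__ == "__main__":
    # Hypothetical goal, cut point, and fitted coefficients.
    goal, cut, b0, b1 = 15, 10, -2.5, 0.2
    for s in (8, 10, 12, 15, 20):
        p = predicted_probability(s, b0, b1)
        print(s, score_level(s, goal, cut), round(p, 3))
    print(meets_design_targets(goal, cut, b0, b1))
```

In practice the coefficients would come from a logistic regression fitted to the predictor and criterion scores; they are hard-coded here only to keep the sketch self-contained.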
Can you provide evidence in support of multiple decision rules?
Yes
If yes, please describe.
Research evidence supporting the use of Acadience Reading measures for benchmark assessment three times per year (diagnostic screening) is found in the following documents:
  • DIBELS Next Technical Manual, p. 49-54, and p. 63-83 for Benchmark Goal Detail pages.
  • DIBELS Next Technical Adequacy Brief, Table 5, p. 12-13.
  • Using Curriculum-Based Measures to Predict Reading Test Scores on the Michigan Educational Assessment Program (Technical Report)
  • DIBELS Next and the SBAC ELA (Contemporary School Psych 2018)
  • Using DIBELS Next to Predict Performance on Statewide ELA Assessments: A Tale of Two Tests (NASP 2018)
Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
The Acadience Reading measures were developed to provide teachers with information they need to make decisions about instruction. The authors advocate a data-based decision-making model referred to as the Outcomes-Driven Model, because the data are used to make decisions to improve student outcomes by matching the amount and type of instructional support with the needs of the individual students. These steps of the model repeat each trimester as a student progresses through the grades. At the beginning of the trimester, the first step is to identify students who may need additional support. At the end of the trimester, the final step is to review outcomes, which also facilitates identifying students who need additional support for the next trimester. In this manner, educators can ensure that students who are on track to become proficient readers continue to make adequate progress, and that those students who are not on track receive the support they need to become proficient readers.
Step 1: Identify need for support early. This process occurs during benchmark assessment, and is also referred to as universal screening. The purpose is to identify those students who may need additional instructional support to achieve benchmark goals. The benchmark assessment also provides information regarding the performance of all students in the school with respect to benchmark goals. All students within a school or grade are tested three times per year on grade-level material. The testing occurs at the beginning, middle, and end of the school year.
Step 2: Validate need for support. The purpose of this step is to be reasonably confident that the student needs or does not need additional instructional support. Before making individual student decisions, it is important to consider additional information beyond the initial data obtained during benchmark testing. Teachers can always use additional assessment information and knowledge about a student to validate a score before making decisions about instructional support. If there is a discrepancy in the student’s performance relative to other information available about the student, or if there is a question about the accuracy of a score, the score can be validated by retesting the student using alternate forms of the Acadience Reading measures or additional diagnostic assessments as necessary.
Step 3: Plan and implement support. In general, for students who are meeting the benchmark goals, a good, research-based core classroom curriculum should meet their instructional needs, and they will continue to receive benchmark assessment three times per year to ensure they remain on track. Students who are identified as needing support are likely to require additional instruction or intervention in the skill areas where they are having difficulties.
Step 4: Evaluate and modify support as needed. Students who are receiving additional support should be progress monitored more frequently to ensure that the instructional support being provided is helping them get back on track. Students should be monitored on the measures that test the skill areas where they are having difficulties and receiving additional instructional support. Monitoring may occur once per month, once every two weeks, or as often as once per week. In general, students who need the most intensive instruction are progress monitored most frequently.
Step 5: Review outcomes. By looking at the benchmark assessment data for all students, schools can ensure that their instructional supports (both core curriculum and additional interventions) are working for all students. If a school identifies areas of instructional support that are not working as desired, the school can use the data to help make decisions on how to improve.
The use of Acadience Reading measures within the Outcomes-Driven Model is consistent with the most recent reauthorization of the Individuals with Disabilities Education Improvement Act (IDEA), which allows the use of a Response to Intervention (RtI) approach to identify children with learning disabilities. In an RtI approach to identification, early intervention is provided to students who are at risk for the development of learning difficulties. Data are gathered to determine which students are responsive to the intervention provided and which students need more intensive support (Fuchs & Fuchs, 2006). The Outcomes-Driven Model is based on foundational work with a problem-solving model (see Deno, 1989; Shinn, 1995; Tilly, 2008) and the initial application of the problem-solving model to early literacy skills (Kaminski & Good, 1998). The general questions addressed by a problem-solving model include: What is the problem? Why is it happening? What should be done about it? Did it work? (Tilly, 2008). The Outcomes-Driven Model was developed to address these questions, but within a prevention-oriented framework designed to preempt early reading difficulty and ensure step-by-step progress toward outcomes that will result in established, adequate reading achievement.

Technical Standards

Classification Accuracy & Cross-Validation Summary

Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6
Classification Accuracy for Criterion 1 Fall | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence
Classification Accuracy for Criterion 1 Winter | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence | Unconvincing evidence
Classification Accuracy for Criterion 1 Spring | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence
Classification Accuracy for Criterion 2 Fall | Partially convincing evidence | Convincing evidence | Convincing evidence | Partially convincing evidence
Classification Accuracy for Criterion 2 Winter | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence | Convincing evidence
Classification Accuracy for Criterion 2 Spring | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence | Partially convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available

Classification Accuracy - Criterion 1 Fall

Evidence Grade 3 Grade 4 Grade 5 Grade 6
Criterion measure Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE)
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 78 41 39 53
Cut Points - Corresponding performance score (numeric) on screener measure 5 10 12 14
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC) 0.86 0.85 0.84 0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.82 0.77 0.78 0.75
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.90 0.94 0.90 0.94
Statistics Grade 3 Grade 4 Grade 5 Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power
Sample Grade 3 Grade 4 Grade 5 Grade 6
Date
Sample Size
Geographic Representation        
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner
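
The rows under "Statistics" in these tables are all functions of the four classification counts (a, b, c, d). The page does not spell out the formulas, so the following is a minimal Python sketch assuming the conventional screening definitions. The function name and the counts are hypothetical and are not taken from the Maze studies.

```python
def classification_stats(a, b, c, d):
    """Screening statistics from a 2x2 classification table.

    a = true positives, b = false positives,
    c = false negatives, d = true negatives.
    """
    n = a + b + c + d
    return {
        "base_rate": (a + c) / n,                    # share truly at risk
        "overall_classification_rate": (a + d) / n,  # share classified correctly
        "sensitivity": a / (a + c),
        "specificity": d / (b + d),
        "false_positive_rate": b / (b + d),
        "false_negative_rate": c / (a + c),
        "positive_predictive_power": a / (a + b),
        "negative_predictive_power": d / (c + d),
    }


# Hypothetical counts, for illustration only.
stats = classification_stats(a=30, b=20, c=10, d=140)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and the false negative rate sum to 1, as do specificity and the false positive rate, which gives a quick consistency check on any reported table.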

Classification Accuracy - Criterion 1 Winter

Evidence Grade 3 Grade 4 Grade 5 Grade 6
Criterion measure Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE)
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 78 41 39 53
Cut Points - Corresponding performance score (numeric) on screener measure 7 12 13 14
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC) 0.88 0.84 0.84 0.80
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.82 0.76 0.77 0.69
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.93 0.92 0.92 0.92
Statistics Grade 3 Grade 4 Grade 5 Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power
Sample Grade 3 Grade 4 Grade 5 Grade 6
Date
Sample Size
Geographic Representation        
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Classification Accuracy - Criterion 1 Spring

Evidence Grade 3 Grade 4 Grade 5 Grade 6
Criterion measure Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE) Group Reading Assessment and Diagnostic Evaluation (GRADE)
Cut Points - Percentile rank on criterion measure 20 20 20 20
Cut Points - Performance score on criterion measure 78 41 39 53
Cut Points - Corresponding performance score (numeric) on screener measure 14 20 18 15
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC) 0.84 0.87 0.80 0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.76 0.80 0.73 0.75
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.92 0.94 0.88 0.92
Statistics Grade 3 Grade 4 Grade 5 Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power
Sample Grade 3 Grade 4 Grade 5 Grade 6
Date 2011 2011 2011 2011
Sample Size
Geographic Representation        
Male
Female
Other
Gender Unknown 219 186 195 103
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Other
Race / Ethnicity Unknown 219 186 195 103
Low SES
IEP or diagnosed disability
English Language Learner

Classification Accuracy - Criterion 2 Fall

Evidence Grade 3 Grade 4 Grade 5 Grade 6
Criterion measure California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster
Cut Points - Percentile rank on criterion measure
Cut Points - Performance score on criterion measure <= 315 <= 350 <= 350 <= 350
Cut Points - Corresponding performance score (numeric) on screener measure 5 10 12 14
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC) 0.84 0.86 0.87 0.83
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.81 0.83 0.85 0.79
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.84 0.88 0.89 0.86
Statistics Grade 3 Grade 4 Grade 5 Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power
Sample Grade 3 Grade 4 Grade 5 Grade 6
Date
Sample Size
Geographic Representation        
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Classification Accuracy - Criterion 2 Winter

Evidence Grade 3 Grade 4 Grade 5 Grade 6
Criterion measure California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster
Cut Points - Percentile rank on criterion measure
Cut Points - Performance score on criterion measure <= 315 <= 350 <= 350 <= 350
Cut Points - Corresponding performance score (numeric) on screener measure 7 12 13 14
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC) 0.82 0.82 0.83 0.86
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.78 0.79 0.80 0.83
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.85 0.84 0.85 0.90
Statistics Grade 3 Grade 4 Grade 5 Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power
Sample Grade 3 Grade 4 Grade 5 Grade 6
Date
Sample Size
Geographic Representation        
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Classification Accuracy - Criterion 2 Spring

Evidence Grade 3 Grade 4 Grade 5 Grade 6
Criterion measure California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster California Standards Test (CST): ELA, Reading Cluster
Cut Points - Percentile rank on criterion measure
Cut Points - Performance score on criterion measure <= 315 <= 350 <= 350 <= 350
Cut Points - Corresponding performance score (numeric) on screener measure 14 20 18 15
Classification Data - True Positive (a)
Classification Data - False Positive (b)
Classification Data - False Negative (c)
Classification Data - True Negative (d)
Area Under the Curve (AUC) 0.85 0.84 0.81 0.84
AUC Estimate’s 95% Confidence Interval: Lower Bound 0.82 0.82 0.79 0.81
AUC Estimate’s 95% Confidence Interval: Upper Bound 0.87 0.87 0.84 0.88
Statistics Grade 3 Grade 4 Grade 5 Grade 6
Base Rate
Overall Classification Rate
Sensitivity
Specificity
False Positive Rate
False Negative Rate
Positive Predictive Power
Negative Predictive Power
Sample Grade 3 Grade 4 Grade 5 Grade 6
Date 2011 2011 2011 2011
Sample Size
Geographic Representation        
Male
Female
Other
Gender Unknown
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Other
Race / Ethnicity Unknown
Low SES
IEP or diagnosed disability
English Language Learner

Reliability

Grade Grade 3 Grade 4 Grade 5 Grade 6
Rating Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
Reliability refers to the relative stability with which a test measures the same skills across minor differences in conditions. Three types of reliability are reported in the table below: alternate form reliability, coefficient alpha, and inter-rater reliability. Alternate form reliability is the correlation between different forms of a measure of the same skills. The coefficient reported is the average correlation among three forms of the measure. Coefficient alpha is a measure of reliability that is widely used in education research and represents the proportion of true score variance to total variance. Alpha incorporates information about the average inter-test correlation as well as the number of tests. Inter-rater reliability indicates the extent to which results generalize across assessors. The inter-rater reliability estimates reported here are based on two independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”). The two raters’ scores were then correlated.
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
The data used for assessing reliability came from students in third through sixth grade. The total sample size is 674 students from 13 schools within 5 school districts. The sample was drawn from two census regions (Pacific and North Central Midwest).
*Describe the analysis procedures for each reported type of reliability.
Alternate form reliability is reported as the average correlation among three alternate forms of the same test. Coefficient alpha treats each of the three tests as separate indicators and is calculated using the alternate form reliability, where the number of tests is equal to three.
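
The page does not give the formula used for alpha. One common form that matches the description above (alpha computed from the average inter-form correlation and the number of forms) is the standardized alpha, also known as the Spearman-Brown form. The sketch below assumes that form. The function name and the example correlation are hypothetical.

```python
def alpha_from_mean_correlation(mean_r, k=3):
    """Standardized coefficient alpha from the average inter-form correlation.

    mean_r: average correlation among the k alternate forms
    k:      number of forms treated as separate indicators (three here)
    """
    return k * mean_r / (1 + (k - 1) * mean_r)


# Hypothetical average alternate form correlation, for illustration only.
mean_r = 0.70
alpha = alpha_from_mean_correlation(mean_r, k=3)
print(f"alternate form r = {mean_r:.2f}, alpha (k=3) = {alpha:.3f}")
```

Under this form, alpha for a three-form composite is always at least as large as the average single-form correlation whenever that correlation is positive.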

*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Yes
Provide citations for additional published studies.
Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). DIBELS Next technical adequacy brief. Eugene, OR: Dynamic Measurement Group.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity

Grade Grade 3 Grade 4 Grade 5 Grade 6
Rating Convincing evidence Convincing evidence Partially convincing evidence Convincing evidence
Legend
Full Bubble: Convincing evidence
Half Bubble: Partially convincing evidence
Empty Bubble: Unconvincing evidence
Null Bubble: Data unavailable
d: Disaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
The California Standards Test (CST) is a statewide achievement test produced for California public schools. It was designed to assess the California content standards for English/language arts (ELA), mathematics, history–social science, and science in grades two through eleven. The Reading cluster of the ELA portion of the CST was chosen as the criterion. According to a technical report from ETS (2011), the CST items were developed and designed to conform to principles of item writing defined by ETS (ETS, 2002). In addition, the items selected underwent an extensive item review process designed to provide the best standards-based tests possible.
The AzMERIT is a statewide achievement test produced for Arizona. Arizona partnered with the American Institutes for Research (AIR) to develop this Arizona-specific assessment linked to, and designed to assess, the Arizona College and Career Ready Standards (ACCRS) for English Language Arts (ELA) and Math. According to the test developer, scores from the AzMERIT can be used to evaluate whether students have achieved the ACCRS by the end of the school year, as well as the effectiveness with which Arizona districts and schools teach students the ACCRS (AIR, 2017). A rigorous test development process was used to ensure item alignment with the ACCRS. Test blueprints specified the range and depth with which each standard was assessed. Items underwent extensive review by content experts, internal review by AIR experts, and review by parents and community members. Field testing was extensive and was accompanied by research examining the relation of the test to other measures (e.g., NAEP, SBAC, ACT). The AzMERIT ELA score is derived from performance in three areas: Reading for Information, Reading for Literature, and Writing and Language. There are four performance levels for the AzMERIT ELA score: 1-Minimally Proficient, 2-Partially Proficient, 3-Proficient, and 4-Highly Proficient. Scores at level 3 (Proficient) and level 4 (Highly Proficient) meet grade-level achievement standards. The minimum score for level 3 on the ELA portion of the AzMERIT is 2509 for Grade 3 and 2523 for Grade 4.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
The data used for assessing validity came from students in third through sixth grade. The total sample size is 4,249 students from 14 schools. The sample was drawn from the Pacific census region.
*Describe the analysis procedures for each reported type of validity.
Predictive validity is the correlation between the Acadience Reading Maze Adjusted Score at the beginning of the year and the CST and AzMERIT scores at the end of the school year. This coefficient represents the extent to which Acadience Reading Maze can predict later reading outcomes. Concurrent validity is the correlation between the Acadience Reading Maze Adjusted Score and the CST and AzMERIT scores when both are obtained at the end of the year. This coefficient represents the extent to which Acadience Reading Maze is related to important reading outcomes.
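
Both coefficients are correlations, and the results table asks for a 95% confidence interval around each one. The page does not say how the intervals were obtained. The sketch below assumes a Pearson correlation with an interval from the Fisher z transformation, which is one common approach. The function names and all scores are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for r via the Fisher z transformation."""
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


# Hypothetical fall Maze Adjusted Scores and spring criterion scores.
maze_fall = [8, 12, 15, 9, 20, 17, 11, 14]
criterion_spring = [310, 340, 365, 320, 400, 372, 335, 350]

r = pearson_r(maze_fall, criterion_spring)
lower, upper = fisher_ci(r, n=len(maze_fall))
print(f"predictive r = {r:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

Pairing fall Maze scores with spring criterion scores gives a predictive coefficient. Pairing spring Maze scores with the same spring criterion scores would give the concurrent coefficient.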

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). DIBELS Next technical adequacy brief. Eugene, OR: Dynamic Measurement Group.
Describe the degree to which the provided data support the validity of the tool.
Both the concurrent and predictive correlations are high. These strong correlations suggest that Acadience Reading Maze assesses skills relevant to broad reading outcomes.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subgroup Informant Age / Grade Test or Criterion n Median Coefficient 95% Confidence Interval Lower Bound 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.

Bias Analysis

Grade Grade 3 Grade 4 Grade 5 Grade 6
Rating Yes Yes Yes Yes
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
Yes
If yes,
a. Describe the method used to determine the presence or absence of bias:
Bias was conceptualized as a difference in classification accuracy between groups. This was assessed using a Cleary model with the dichotomous outcome of status on the criterion, in which the Maze Adjusted Score, subgroup, and the interaction between the two were used as predictors. If the subgroup and interaction terms did not add significantly to model fit, this was taken as evidence that Maze is not biased. Model fit was assessed using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and the likelihood ratio test (LRT). The effect size for bias was assessed using the difference in AUC for the ROC curves for the different groups. These models were tested for each grade, at each time of year.
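
The model comparison described above can be sketched as follows. This is an illustration, not the developers' analysis code. It assumes the two logistic models have already been fitted elsewhere and takes their log-likelihoods as input. The reduced model is assumed to have 2 parameters (intercept and Maze Adjusted Score), and the full model 4 (adding subgroup and the interaction). The function name and the log-likelihood values are hypothetical.

```python
import math


def compare_models(ll_reduced, ll_full, n, k_reduced=2, k_full=4):
    """Compare a no-bias (reduced) model with a bias (full) model.

    ll_reduced, ll_full: maximized log-likelihoods of the two models
    n: number of students
    k_reduced, k_full: number of estimated parameters in each model
    """
    if k_full - k_reduced != 2:
        raise ValueError("the closed-form p-value below assumes 2 added terms")
    aic_reduced = 2 * k_reduced - 2 * ll_reduced
    aic_full = 2 * k_full - 2 * ll_full
    bic_reduced = k_reduced * math.log(n) - 2 * ll_reduced
    bic_full = k_full * math.log(n) - 2 * ll_full
    lrt_stat = 2 * (ll_full - ll_reduced)
    # For a chi-square with 2 degrees of freedom, P(X > x) = exp(-x / 2).
    p_value = math.exp(-lrt_stat / 2)
    return {
        "aic_prefers_bias": aic_full < aic_reduced,
        "bic_prefers_bias": bic_full < bic_reduced,
        "lrt_stat": lrt_stat,
        "lrt_p": p_value,
        "lrt_prefers_bias": p_value < 0.05,
    }


# Hypothetical log-likelihoods for one grade and time of year.
result = compare_models(ll_reduced=-95.0, ll_full=-94.2, n=200)
print(result)
```

Lower AIC and BIC values indicate the preferred model. BIC penalizes the two added terms more heavily than AIC once the sample exceeds a handful of students, so it favors the no-bias model more often.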
b. Describe the subgroups for which bias analyses were conducted:
Bias was assessed between genders and between white and non-white students.
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Of the 9 models examining bias across ethnicities, the AIC and LRT favored a model without bias eight times, while the BIC favored a model without bias all nine times. Of the 21 models examining bias across genders, the AIC favored a model without bias 17 times, while the BIC favored a model without bias 20 times. Likewise, the likelihood ratio test favored a model with bias only three times out of 21 models. The results show that the rate of preferring a model with bias is near the global Type I error rate of .05, suggesting a lack of bias on the Maze measure.
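
The effect size named under item (a), the difference in AUC between groups, can be illustrated with a small sketch. It computes AUC as the probability that a randomly chosen at-risk student scores lower on the screener than a randomly chosen not-at-risk student, with ties counted as half. It assumes that lower Maze scores indicate risk. The function name, group labels, and scores are hypothetical.

```python
def auc(at_risk_scores, not_at_risk_scores):
    """AUC as the share of (at-risk, not-at-risk) pairs ranked correctly.

    A pair is ranked correctly when the at-risk student has the lower
    screener score; tied pairs count as half.
    """
    correct = 0.0
    for risk_score in at_risk_scores:
        for ok_score in not_at_risk_scores:
            if risk_score < ok_score:
                correct += 1.0
            elif risk_score == ok_score:
                correct += 0.5
    return correct / (len(at_risk_scores) * len(not_at_risk_scores))


# Hypothetical screener scores for two subgroups, split by criterion status.
auc_group_a = auc([5, 8, 10, 12], [11, 15, 18, 20, 22])
auc_group_b = auc([6, 9, 13, 14], [10, 12, 16, 19, 21])
print(f"AUC group A = {auc_group_a:.2f}")
print(f"AUC group B = {auc_group_b:.2f}")
print(f"AUC difference = {auc_group_a - auc_group_b:.2f}")
```

A difference near zero means the screener separates at-risk from not-at-risk students about equally well in both groups.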

Disclaimer

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.