Acadience Reading K-6 (aka DIBELS Next®)
Maze
Summary
Acadience Reading Maze uses standardized maze procedures for measuring reading comprehension. The purpose of a maze procedure is to measure the reasoning processes that constitute comprehension. Specifically, Maze assesses the student's ability to construct meaning from text using word recognition skills, background information and prior knowledge, familiarity with linguistic properties such as syntax and morphology, and reasoning skills. Maze can be given to a whole class at the same time, to a small group of students, or individually. Students are given a passage where approximately every seventh word has been replaced by a box containing the correct word and two distractor words. Using standardized directions, students are asked to read the passage silently and circle their word choices. The student receives credit for selecting the word that best fits the omitted word in the reading passage. The scores that are recorded are the number of correct and incorrect responses. The Maze Adjusted Score, which compensates for guessing, is calculated from these two counts: half the number of incorrect responses is subtracted from the number of correct responses, and the difference is rounded to the nearest whole number, with half-points rounded up.
- Where to Obtain:
- Acadience Learning Inc. and Voyager Sopris Learning
- info@acadiencelearning.org
- Acadience Learning: 859 Willamette Street, Suite 320, Eugene, OR 97401; Voyager Sopris: 17855 Dallas Parkway, Suite 400, Dallas, TX 75287-6816
- Acadience Learning: (541) 431-6931, (888) 943-1240; Voyager Sopris: (888) 399-1995
- Acadience Learning: https://acadiencelearning.org/; Voyager Sopris: http://voyagersopris.com
- Initial Cost:
- Free
- Replacement Cost:
- Free
- Included in Cost:
- Acadience Learning: All materials are available for free download at https://acadiencelearning.org/acadiencereading.html, including progress monitoring student materials for each grade, assessor directions and keys for each grade, the Acadience Reading K-6 Assessment Manual, and the Acadience Reading Technical Manual. Large print materials are also available. Voyager Sopris: There are three purchasing options for implementing progress monitoring materials: 1) Progress monitoring via online test administration and scoring; 2) Progress monitoring materials as part of the purchase of classroom sets, which also include benchmark materials; and 3) Individual progress monitoring materials (i.e., Assessor Materials, Student Booklets). Classroom sets contain everything needed for one person to conduct the benchmark assessment for 25 students and the progress monitoring assessment for up to five students.
- Approved accommodations are any accommodations that will not alter the standardization of the assessment. Approved Accommodations: 1. The use of colored overlays, filters, or lighting adjustments for students with visual impairments. 2. The use of student materials that have been enlarged or with larger print for students with visual impairments. 3. The use of assistive technology, such as hearing aids and assistive listening devices (ALDs), for students with hearing impairments. 4. The use of a marker or ruler to focus student attention on the materials for students who are not able to demonstrate their skills adequately without one.
- Training Requirements:
- Approximately 1-2 hours of training to cover foundations of Acadience Reading, as well as administration and scoring of Maze.
- Qualified Administrators:
- Paraprofessional-level training and adequate training on administration and scoring of Maze.
- Access to Technical Support:
- Acadience Learning: Customer support is available from 8:00am to 5:00pm PT, Monday through Friday by phone, email, or through Acadience Learning's website; Voyager Sopris: Customer support is available 8:00am to 6:00pm CT, Monday through Friday by phone, email, or through the Voyager Sopris website.
- Assessment Format:
- Individual
- Small group
- Large group
- Computer-administered
- Scoring Time:
- Scoring is automatic OR
- 1 minute per worksheet
- Scores Generated:
- Raw score
- Percentile score
- Developmental benchmarks
- Developmental cut points
- Administration Time:
- 3 minutes per student or worksheet
- Scoring Method:
- Manually (by hand)
- Automatically (computer-scored)
- Technology Requirements:
-
Tool Information
Descriptive Information
- Please provide a description of your tool:
- Acadience Reading Maze uses standardized maze procedures for measuring reading comprehension. The purpose of a maze procedure is to measure the reasoning processes that constitute comprehension. Specifically, Maze assesses the student's ability to construct meaning from text using word recognition skills, background information and prior knowledge, familiarity with linguistic properties such as syntax and morphology, and reasoning skills. Maze can be given to a whole class at the same time, to a small group of students, or individually. Students are given a passage where approximately every seventh word has been replaced by a box containing the correct word and two distractor words. Using standardized directions, students are asked to read the passage silently and circle their word choices. The student receives credit for selecting the word that best fits the omitted word in the reading passage. The scores that are recorded are the number of correct and incorrect responses. The Maze Adjusted Score, which compensates for guessing, is calculated from these two counts: half the number of incorrect responses is subtracted from the number of correct responses, and the difference is rounded to the nearest whole number, with half-points rounded up.
- Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
-
ACADEMIC ONLY: What dimensions does the tool assess?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
Acquisition and Cost Information
Administration
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- Approximately 1-2 hours of training to cover foundations of Acadience Reading, as well as administration and scoring of Maze.
- Please describe the minimum qualifications an administrator must possess.
- Paraprofessional-level training and adequate training on administration and scoring of Maze.
- No minimum qualifications
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- Acadience Learning: Customer support is available from 8:00am to 5:00pm PT, Monday through Friday by phone, email, or through Acadience Learning's website; Voyager Sopris: Customer support is available 8:00am to 6:00pm CT, Monday through Friday by phone, email, or through the Voyager Sopris website.
Scoring
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- Maze is a group or individually administered measure. The assessor asks students to read a passage and circle the word that makes the most sense in the story. The assessor scores the Maze worksheet after the student has completed it. The assessor corrects the worksheet and calculates the student's number of correct and incorrect responses. If a student completes the assessment before the allotted time (3 minutes) is up, the assessor does not prorate the score. The student receives 1 point for each correct word, minus half a point for each incorrect word. A response is correct if the student circled or otherwise marked the correct word. The assessor will mark a slash (/) through any incorrect responses. Incorrect responses include errors, boxes with more than one answer marked, and items left blank (if they occur before the last item the student attempted within the 3-minute time limit). Items left blank because the student could not get to them before time ran out do not need to be slashed and do not count as incorrect responses. If there are erasure marks, scratched out words, or any other extraneous markings, and the student’s final response is obvious, the assessor should score the item based on that response. Assessors record both scores (correct and incorrect) on the cover sheet. On the cover sheet, “C” designates correct responses and “I” designates incorrect responses. For progress monitoring, there is no scoring booklet for Maze, but there is a progress monitoring chart to record the scores. The Adjusted Score is a modified score that compensates for student guessing and is calculated using the following formula: Adjusted Score = number of correct responses – (number of incorrect responses ÷ 2). The result of the formula should then be rounded to the nearest whole number. Half-points (0.5) should be rounded up. The minimum Adjusted Score is 0. Negative numbers are not recorded.
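The scoring rules above can be sketched in a few lines of Python. This is a minimal illustration, not Acadience's own scoring software: the function names and the way responses are represented (a word, `None` for a blank, or a list when more than one word is marked) are assumptions made for the example.

```python
import math

def count_responses(marks, answer_key):
    """Count correct and incorrect Maze responses.

    marks: one entry per item, in passage order (hypothetical encoding):
      a string for the single word the student marked,
      None for an item left blank,
      a list/tuple when more than one word in the box was marked.
    """
    attempted = [i for i, mark in enumerate(marks) if mark is not None]
    if not attempted:
        return 0, 0
    last = attempted[-1]  # blanks after the last attempted item are not scored
    correct = incorrect = 0
    for mark, key in zip(marks[: last + 1], answer_key):
        if isinstance(mark, str) and mark == key:
            correct += 1
        else:
            # errors, multiply-marked boxes, and blanks before the last attempt
            incorrect += 1
    return correct, incorrect

def adjusted_score(correct, incorrect):
    """Adjusted Score = correct - incorrect / 2, half-points rounded up, minimum 0."""
    raw = correct - incorrect / 2
    return max(0, math.floor(raw + 0.5))

marks = ["cat", None, ["a", "b"], "dog", "x", None, None]
key = ["cat", "sat", "a", "dog", "y", "z", "w"]
c, i = count_responses(marks, key)
print(c, i, adjusted_score(c, i))
```

Using `math.floor(raw + 0.5)` rather than Python's built-in `round` keeps half-points rounding up, since `round` uses banker's rounding.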
- Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
- Yes
- ACADEMIC ONLY: Do you provide benchmarks for the slopes?
- No
- ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
- No
- Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
- The Acadience Reading K-6 measures were designed to be economical and efficient indicators of a student's progress toward achieving a general outcome such as reading or phonemic awareness, and to be used for both benchmark assessment and progress monitoring. Progress monitoring refers to the more frequent testing of students who may be at risk for future reading difficulty on the skill areas in which they are receiving instruction, to ensure that they are making adequate progress. Progress monitoring can be conducted using grade-level or out-of-grade materials, depending on the student's needs. Decisions about the skill areas and levels to monitor are made at the individual student level. Students who are receiving additional support should be monitored for progress more frequently to ensure that the instructional support being provided is helping them get back on track. Monitoring may occur once per month, once every two weeks, or as often as once per week. In general, students who need the most intensive instruction are monitored for progress most frequently. Progress monitoring materials contain alternate forms of the same measures administered during benchmark assessment. Each alternate form is of equivalent difficulty. Not all students will need progress monitoring. Progress monitoring materials are organized by measure, since students who need progress monitoring will typically be monitored on specific measures related to the instruction they are receiving, rather than on every measure for that grade. Material selected for progress monitoring must be sensitive to growth, yet still represent an ambitious goal. The standardized procedures for administering an Acadience Reading K-6 measure may apply when using Acadience Reading K-6 for progress monitoring. Progress monitoring data should be graphed and readily available to those who teach the student.
An aimline should be drawn from the student's current skill level (which may be the most recent benchmark assessment score) to the goal. Progress monitoring scores can then be plotted over time and examined to determine whether they indicate that the student is making adequate progress (i.e., whether they fall above or below the aimline). The Acadience Reading K-6 assessments were designed to support students of varied backgrounds. Passages were written with names that represent diverse cultural, racial, and ethnic groups. Acadience Reading K-6 is appropriate for most students for whom an instructional goal is to learn to read in English. For English language learners who are learning to read in English, Acadience Reading K-6 is appropriate for assessing and monitoring progress in acquisition of early reading skills.
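As a rough illustration of the aimline described above, the sketch below assumes a straight line from a starting (week, score) point to a goal (week, score) point and reports where each progress monitoring score falls relative to it. The function names, the week-based time axis, and the sample numbers are hypothetical; the source does not prescribe a decision rule beyond comparing scores to the aimline.

```python
def aimline_value(week, start, goal):
    """Expected score on the aimline at a given week (linear interpolation)."""
    (start_week, start_score), (goal_week, goal_score) = start, goal
    slope = (goal_score - start_score) / (goal_week - start_week)
    return start_score + slope * (week - start_week)

def compare_to_aimline(observations, start, goal):
    """Label each (week, score) observation as above, below, or on the aimline."""
    results = []
    for week, score in observations:
        expected = aimline_value(week, start, goal)
        if score > expected:
            position = "above"
        elif score < expected:
            position = "below"
        else:
            position = "on"
        results.append((week, score, expected, position))
    return results

# Hypothetical student: score of 10 at week 0, goal of 20 by week 20.
for row in compare_to_aimline([(4, 13), (8, 13), (12, 16)], (0, 10), (20, 20)):
    print(row)
```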
Rates of Improvement and End of Year Benchmarks
- Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in your manual or published materials?
- Yes
- If yes, specify the growth standards:
- Using Acadience Reading Pathways of Progress, the growth standards depend on the student's beginning of year performance relative to students with similar levels of initial skills, i.e., student performance is only compared to other students who have the same beginning of year score. Student scores above the 80th percentile are considered Well Above Typical progress. Student scores between the 60th and 79th percentile are considered Above Typical progress. Student scores between the 40th and 59th percentile are considered Typical progress. Student scores between the 20th and 39th percentile are considered Below Typical progress. Student scores below the 20th percentile are considered Well Below Typical progress.
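The percentile bands above map directly onto a small lookup. The sketch below is illustrative only: the function name is hypothetical, and treating a percentile of exactly 80 as Well Above Typical is an assumption, since the text says "above the 80th" for the top band and "60th to 79th" for the next.

```python
def pathway_category(percentile):
    """Map a growth percentile (relative to same-starting-score peers) to a band."""
    if percentile >= 80:  # assumption: exactly 80 falls in the top band
        return "Well Above Typical"
    if percentile >= 60:
        return "Above Typical"
    if percentile >= 40:
        return "Typical"
    if percentile >= 20:
        return "Below Typical"
    return "Well Below Typical"

for p in (85, 65, 50, 25, 10):
    print(p, pathway_category(p))
```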
- Are benchmarks for minimum acceptable end-of-year performance specified in your manual or published materials?
- Yes
- If yes, specify the end-of-year performance standards:
- Three primary end-of-year performance standards are specified: Well Below Benchmark, Below Benchmark, and At or Above Benchmark. These standards are used to indicate increasing odds of achieving At or Above Benchmark status at the next benchmark administration. End-of-year benchmark goals and cut points for risk: Grade 3 benchmark goal: 19, cut point: 14; Grade 4 benchmark goal: 24, cut point: 20; Grade 5 benchmark goal: 24, cut point: 18; Grade 6 benchmark goal: 21, cut point: 15.
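The end-of-year goals and cut points listed above can be applied as a simple classification. In this sketch the boundary handling is an assumption: a Maze Adjusted Score at or above the benchmark goal is treated as At or Above Benchmark, a score at or above the cut point but below the goal as Below Benchmark, and a score below the cut point as Well Below Benchmark. The dictionary and function names are hypothetical.

```python
# grade: (benchmark goal, cut point for risk), end-of-year values from the text above
END_OF_YEAR = {3: (19, 14), 4: (24, 20), 5: (24, 18), 6: (21, 15)}

def benchmark_status(grade, score):
    """Classify an end-of-year Maze Adjusted Score for grades 3 to 6."""
    goal, cut_point = END_OF_YEAR[grade]
    if score >= goal:
        return "At or Above Benchmark"
    if score >= cut_point:  # assumed: the cut point itself counts as Below Benchmark
        return "Below Benchmark"
    return "Well Below Benchmark"

print(benchmark_status(3, 19), "|", benchmark_status(4, 21), "|", benchmark_status(5, 10))
```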
- Date
- 2018
- Size
- 2,748,243
- Male
- 52%
- Female
- 48%
- Unknown
- 0%
- Eligible for free or reduced-price lunch
- 60%
- Other SES Indicators
- White, Non-Hispanic
- 45.63%
- Black, Non-Hispanic
- Hispanic
- American Indian/Alaska Native
- Asian/Pacific Islander
- Other
- Unknown
- Disability classification (Please describe)
- First language (Please describe)
- Language proficiency status (Please describe)
Performance Level
Reliability
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- Reliability refers to the relative stability with which a test measures the same skills across minor differences in conditions. Three types of reliability are reported in the table below: alternate form reliability, coefficient alpha, and inter-rater reliability. Alternate form reliability is the correlation between different measures of the same early literacy skills. The coefficient reported is the correlation between two forms of the measure. High alternate-form reliability coefficients suggest that these multiple forms are measuring the same construct. Coefficient alpha is a measure of reliability that is widely used in education research and represents the proportion of true score to total variance. Alpha incorporates information about the average inter-test correlation as well as the number of tests. Inter-rater reliability indicates the extent to which results generalize across assessors. The inter-rater reliability estimates reported represent the reliability of the directions and scoring procedures of the measures themselves as interpreted by the assessors administering the measure.
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- The data used for assessing reliability came from third through sixth grade. The total sample size is 674 students from 13 schools within 5 school districts. The sample was drawn from two census regions (Pacific and North Central Midwest).
- *Describe the analysis procedures for each reported type of reliability.
- Alternate form reliability is reported as the correlation between two alternate forms of the same test. Coefficient alpha treats the two tests as separate indicators and is calculated using the alternate form reliability, where the number of tests is equal to two. For inter-rater reliability, pairwise correlations were performed on a data set that included scores from the administrator of the measure and a shadow scorer, resulting in two scores of the same student performance on the Maze measure.
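The text does not give the formula used to derive alpha from the alternate form correlation. One common choice, assumed here, is standardized alpha, k·r / (1 + (k − 1)·r), which with k = 2 tests reduces to 2r / (1 + r). The sketch below pairs that with a plain Pearson correlation, which could serve for either the alternate form or the inter-rater coefficient. All names and sample data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def alpha_from_correlation(r, k=2):
    """Standardized alpha from an average inter-test correlation r and k tests (assumed formula)."""
    return k * r / (1 + (k - 1) * r)

form_a = [12, 15, 9, 20, 17]   # hypothetical scores on one Maze form
form_b = [11, 16, 10, 19, 15]  # same students on an alternate form
r = pearson(form_a, form_b)
print(round(r, 3), round(alpha_from_correlation(r), 3))
```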
*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- Yes
- Provide citations for additional published studies.
- Dewey, E. N., Latimer, R. J., Kaminski, R. A., & Good, R. H. (2011). DIBELS Next Development: Findings from Beta 2 Validation Study (Tech. Report No. 10). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Powell-Smith, K. A., Good, R. H., Latimer, R. J., Dewey, E. N., & Kaminski, R. A. (2011). DIBELS Next Benchmark Goals Study (Tech. Report No. 11). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). Acadience Reading K–6 Technical Adequacy Brief. Eugene, OR: Acadience Learning. Available: https://acadiencelearning.org. Please note that Dynamic Measurement Group is now Acadience Learning and Acadience Reading K-6 is also published as DIBELS Next. Some historical documents retain the original assessment name and company name.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- The Group Reading Assessment and Diagnostic Evaluation (GRADE) is an untimed, group-administered, norm-referenced reading achievement test appropriate for children in preschool through grade 12. The GRADE is comprised of 16 subtests within five components. Not all 16 subtests are used at each testing level. Various subtest scores are combined to form the Total Test composite score. The GRADE Total Test score is comprised of scores across subtests of the GRADE that vary by grade level. In kindergarten, the GRADE Total Test score is comprised of measures that assess phonics and phonemic and phonological awareness. In first and second grade, the GRADE Total Test includes word meaning, passage (or sentence) reading, and comprehension measures. In third grade, the GRADE Total Test is comprised of measures assessing word reading, vocabulary, and comprehension. In fourth, fifth, and sixth grade, the GRADE Total Test includes scores from measures of vocabulary and comprehension. The AzMERIT includes a number of different types of questions, including performance tasks that are multi-step assignments that ask students to apply their knowledge and skills to address real-world problems. In English Language Arts (ELA), the subtest examined in our analyses, students apply their research and writing skills. The test also includes traditional multiple choice questions, as well as interactive questions that require students to drag and drop their answers into a box, create equations, and fill in the answer. The California Standards Test (CST) is a statewide achievement test produced for California public schools and was designed to assess the California content standards for English/language arts (ELA), mathematics, history–social science, and science in grades 2-11. According to a technical report from ETS (2011), the CST items were developed and designed to conform to principles of item writing defined by ETS (ETS, 2002). 
In addition, the items selected underwent an extensive item review process designed to provide the best standards-based tests possible. The Reading cluster of the ELA portion of the CST was examined in our analyses.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- The GRADE data set included scores for students in third and fifth sixth grade. The total sample size is 382 students from 13 schools within 5 school districts. The sample was drawn from two census regions (Pacific and North Central Midwest). The AzMERIT data set included scores for students in third and fourth grade. The total sample size was 1,253 students from 16 schools in 1 large-city school district in 1 Mountain West US state. 54% of students were Hispanic/Latino, 23% were White, 11% were Black/African American, 8% were American Indian/Native Alaskan, 5% were Multiracial, and 4% were Asian/Native Hawaiian/Pacific Islander. The CST data set included 2,986 students in fourth through sixth grade from 14 schools in 1 large-suburban school district in 1 Pacific West US state. Approximately 46% of students were White and 38% were Hispanic/Latino. Thirty-one percent of students in the district qualified for free/reduced lunch and 20% were English Language Learners.
- *Describe the analysis procedures for each reported type of validity.
- Predictive validity is the correlation between the Maze Adjusted Score at the beginning of the year and the GRADE, AzMERIT, or CST (as indicated) score at the end of the school year. This coefficient represents the extent to which Maze can predict later reading outcomes. Concurrent validity is the correlation between the Maze Adjusted Score and the GRADE, AzMERIT, or CST (as indicated) measure both at the end of the year. This coefficient represents the extent to which Maze is related to important reading outcomes.
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). Acadience Reading K–6 Technical Adequacy Brief. Eugene, OR: Acadience Learning. Dewey, E. N., Latimer, R. J., Kaminski, R. A., & Good, R. H. (2011). DIBELS Next Development: Findings from Beta 2 Validation Study (Tech. Report No. 10). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Powell-Smith, K. A., Good, R. H., Latimer, R. J., Dewey, E. N., & Kaminski, R. A. (2011). DIBELS Next Benchmark Goals Study (Tech. Report No. 11). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Please note that Dynamic Measurement Group is now Acadience Learning and Acadience Reading K-6 is also published as DIBELS Next. Some historical documents retain the original assessment name and company name.
- Describe the degree to which the provided data support the validity of the tool.
- Both the concurrent and predictive correlations are high. These strong correlations suggest that Acadience Reading Maze assesses skills relevant to broad reading outcomes.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Bias Analysis
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating | No | No | No | No |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- Yes
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- Bias was conceptualized as a difference in classification accuracy between groups. This was assessed using a Cleary model with the dichotomous outcome of status on the criterion, where the Maze Adjusted Score, subgroup, and the interaction between the two were used as predictors. If adding the subgroup and interaction terms did not significantly improve model fit, this was taken as evidence that Maze is not biased. Model fit was assessed using the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and the likelihood ratio test (LRT). The effect size for bias was assessed using the difference in AUC for the ROC curves for the different groups. These models were tested for each grade, at each time of year.
- b. Describe the subgroups for which bias analyses were conducted:
- Bias was assessed across genders and between White and non-White students.
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Of the nine models examining bias across ethnicities, the AIC and LRT each favored a model without bias eight times, while the BIC favored a model without bias all nine times. Of the 21 models examining bias across genders, the AIC favored a model without bias 17 times, while the BIC favored a model without bias 20 times. Likewise, the likelihood ratio test favored a model with bias only three times out of 21 models. The results show that the rate of preferring a model with bias is near the global Type I error rate of .05, suggesting a lack of bias on the Maze measure.
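The model comparison described in item (a) can be sketched as follows, given the log-likelihoods of two already-fitted logistic models. This is an illustration under stated assumptions rather than the authors' analysis code: the reduced model is assumed to have two parameters (intercept and Maze Adjusted Score) and the full model four (adding subgroup and the interaction), and the function and key names are hypothetical. With exactly two added parameters, the chi-square tail probability for the LRT has the closed form exp(−statistic / 2), so no statistics library is needed.

```python
import math

def compare_bias_models(ll_reduced, ll_full, n, k_reduced=2, k_full=4, alpha=0.05):
    """Compare a no-bias (reduced) and bias (full) model by AIC, BIC, and LRT.

    ll_reduced, ll_full: maximized log-likelihoods of the fitted models.
    n: number of students used to fit both models.
    """
    if k_full - k_reduced != 2:
        raise ValueError("this sketch only handles a 2-parameter difference")

    def aic(ll, k):
        return 2 * k - 2 * ll

    def bic(ll, k):
        return k * math.log(n) - 2 * ll

    lrt_stat = max(0.0, 2 * (ll_full - ll_reduced))
    p_value = math.exp(-lrt_stat / 2)  # chi-square survival function, 2 df
    return {
        "aic_favors_no_bias": aic(ll_reduced, k_reduced) <= aic(ll_full, k_full),
        "bic_favors_no_bias": bic(ll_reduced, k_reduced) <= bic(ll_full, k_full),
        "lrt_favors_no_bias": p_value >= alpha,
        "lrt_p_value": p_value,
    }

# Hypothetical log-likelihoods for one grade and time of year.
print(compare_bias_models(ll_reduced=-100.0, ll_full=-99.0, n=200))
```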
Growth Standards
Sensitivity: Reliability of Slope
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- Describe the sample, including size and characteristics. Please provide documentation showing that the sample was composed of students in need of intensive intervention. A sample of students with intensive needs should satisfy one of the following criteria: (1) all students scored below the 30th percentile on a local or national norm, or the sample mean on a local or national test fell below the 25th percentile; (2) students had an IEP with goals consistent with the construct measured by the tool; or (3) students were non-responsive to Tier 2 instruction. Evidence based on an unknown sample, or a sample that does not meet these specifications, may not be considered.
- The sample consisted of students who were identified as being "Well Below Benchmark" using the benchmark assessment of Acadience Reading at the beginning of year. Being Well Below Benchmark corresponds to being below the 19th, 18th, 19th, and 10th percentiles for third, fourth, fifth, and sixth grades, respectively. Students were only selected if they had a minimum of 15 observations.
- Describe the frequency of measurement (for each student in the sample, report how often data were collected and over what span of time).
- Progress monitoring data were collected throughout the school year at the discretion of the administering school, but not more frequently than once per week. Any student who had fewer than fifteen progress monitoring assessments was excluded from the analysis.
- Describe the analysis procedures.
- Reliability of slope was calculated as the ratio of true score variance to observed total variance. The true score variance estimate came from a hierarchical linear model based estimate of the variance in progress monitoring slopes (using the R package lme4). The observed score variance was calculated as the variance of the ordinary least squares slopes created for each student who met the aforementioned inclusion criteria. Confidence intervals were calculated using bootstrap estimation.
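A minimal sketch of the ratio described above follows. The hierarchical-model estimate of slope variance is taken as a given input (in the source it came from a model fitted with lme4 in R, which is not reproduced here), and only the per-student ordinary least squares slopes and the final ratio are computed. The function names and sample data are hypothetical, and the bootstrap confidence interval is omitted.

```python
from statistics import variance

def ols_slope(times, scores):
    """Ordinary least squares slope of scores regressed on time for one student."""
    n = len(times)
    mean_t, mean_y = sum(times) / n, sum(scores) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, scores))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx

def slope_reliability(true_slope_variance, student_series):
    """Ratio of model-based (true) slope variance to the variance of OLS slopes.

    true_slope_variance: slope variance estimate from a hierarchical linear model
                         (supplied by the caller; not estimated here).
    student_series: list of (times, scores) pairs, one per student.
    """
    slopes = [ols_slope(times, scores) for times, scores in student_series]
    return true_slope_variance / variance(slopes)

series = [
    ([0, 1, 2], [0, 1, 2]),
    ([0, 1, 2], [0, 2, 4]),
    ([0, 1, 2], [0, 3, 6]),
]
print(slope_reliability(0.5, series))
```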
In the table below, report reliability of the slope (e.g., ratio of true slope variance to total slope variance) by grade level (if relevant).
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability of the slope data that is disaggregated by subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?
- No
If yes, fill in data for each subgroup with disaggregated reliability of the slope data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Sensitivity: Validity of Slope
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- For the Acadience Maze progress monitoring assessment, we used the Acadience Oral Reading Fluency Words Correct score at the end of the subsequent year as the outcome measure. For instance, to calculate the validity of grade 3 progress monitoring slopes, we used the Oral Reading Fluency Words Correct score at the end of grade 4. While the criterion is internal in the sense that both the progress monitoring assessment and the criterion are Acadience measures, it is external in the sense that it is distinct and separate from the Maze progress monitoring system. Indeed, there is no shared method variance between the two: (a) the Maze assessment requires students to read a passage silently and fill in the blank at approximately every 7th word by selecting, from a choice of three words, the word that makes the most sense in the passage; (b) the Oral Reading Fluency Words Correct assessment requires a student to read a passage aloud accurately and fluently. In addition, there is no overlap of item samples: the passages used for the Maze assessment are completely different from the passages used for the Oral Reading Fluency Words Correct assessment. These requirements (external measures, no shared method variance, no overlap of item samples) serve to ensure a conceptual distance between the slope of Maze and the criterion. In the reported analysis we also increased the length of time between the slope of Maze and the criterion measure by examining outcomes at the end of the subsequent academic year. So, for example, the validity of the slope of progress on the third-grade Maze assessment was examined with respect to end-of-fourth-grade Oral Reading Fluency Words Correct. In sum, we believe that the combination of an alternative measure of reading skills (Maze vs. Oral Reading Fluency Words Correct) and the length of time between the end of progress monitoring and the criterion (an entire year between the last progress monitoring occasion and the criterion) provides a sufficiently powerful examination of the validity of slope.
- Describe the sample(s), including size and characteristics. Please provide documentation showing that the sample was composed of students in need of intensive intervention. A sample of students with intensive needs should satisfy one of the following criteria: (1) all students scored below the 30th percentile on a local or national norm, or the sample mean on a local or national test fell below the 25th percentile; (2) students had an IEP with goals consistent with the construct measured by the tool; or (3) students were non-responsive to Tier 2 instruction. Evidence based on an unknown sample, or a sample that does not meet these specifications, may not be considered.
- The sample consisted of students who were identified as being "Well Below Benchmark" on the Acadience Reading benchmark assessment at the beginning of the year. Being Well Below Benchmark corresponds to being below the 19th, 18th, 19th, and 10th percentiles for third, fourth, fifth, and sixth grades, respectively. Students were selected only if they had a minimum of 15 observations.
- Describe the frequency of measurement (for each student in the sample, report how often data were collected and over what span of time).
- Progress monitoring data were collected throughout the school year at the discretion of the administering school, but not more frequently than once per week. Any student who had fewer than fifteen progress monitoring assessments was excluded from the analysis.
- Describe the analysis procedures for each reported type of validity.
- Validity of slope was assessed using the partial correlations between the students' ordinary least squares slope and the criterion, while controlling for the students' ordinary least squares intercept.
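The analysis described above can be sketched roughly as follows. This is a minimal illustration, not the publisher's actual code: the input layout (a list of `(weeks, scores, criterion)` tuples per student), the function names, and the use of residual-based partial correlation are all assumptions made for the example. It fits an ordinary least squares line to each student's progress monitoring scores, drops students with fewer than 15 observations, and then correlates slope with the criterion after removing the linear effect of the intercept from both.

```python
import numpy as np

MIN_OBSERVATIONS = 15  # from the text: students with fewer assessments were excluded


def ols_line(weeks, scores):
    """Return (slope, intercept) of the least-squares line through one student's scores."""
    slope, intercept = np.polyfit(
        np.asarray(weeks, dtype=float), np.asarray(scores, dtype=float), deg=1
    )
    return float(slope), float(intercept)


def residuals(y, x):
    """Residuals of y after regressing it on x with a constant term."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef


def slope_validity(students):
    """Partial correlation of slope with the criterion, controlling for intercept.

    `students` is a hypothetical list of (weeks, scores, criterion) tuples,
    where `criterion` is the end-of-next-year outcome score.
    Returns (partial_r, number_of_students_used).
    """
    rows = [
        (*ols_line(weeks, scores), criterion)
        for weeks, scores, criterion in students
        if len(scores) >= MIN_OBSERVATIONS
    ]
    slopes, intercepts, criteria = (np.array(col, dtype=float) for col in zip(*rows))
    # Partial correlation: correlate what is left of slope and criterion
    # once the linear contribution of the intercept is removed from each.
    r = np.corrcoef(residuals(slopes, intercepts), residuals(criteria, intercepts))[0, 1]
    return float(r), len(rows)
```

Calling `slope_validity(data)` on one grade's records would return the partial correlation together with the number of students who met the 15-observation minimum; how the reported confidence intervals were derived is not stated in the source and is not covered here.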
In the table below, report predictive validity of the slope (correlation between the slope and achievement outcome) by grade level (if relevant).
NOTE: The TRC suggests controlling for initial level when the correlation for slope without such control is not adequate.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- The OLS slopes show moderate to strong partial correlations with a criterion that is both separated from progress monitoring by an entire year and based on a conceptually different measure of reading skills. This provides strong evidence for the validity of the slope.
- Do you have validity of the slope data that is disaggregated by subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?
- No
If yes, fill in data for each subgroup with disaggregated validity of the slope data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
Alternate Forms
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- Describe the sample for these analyses, including size and characteristics:
- What is the number of alternate forms of equal and controlled difficulty?
- If IRT based, provide evidence of item or ability invariance
- If computer administered, how many items are in the item bank for each grade level?
- If your tool is computer administered, please note how the test forms are derived instead of providing alternate forms:
Decision Rules: Setting & Revising Goals
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- In your manual or published materials, do you specify validated decision rules for how to set and revise goals?
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
NOTE: The TRC expects evidence for this standard to include an empirical study that compares a treatment group to a control and evaluates whether student outcomes increase when decision rules are in place.
Decision Rules: Changing Instruction
Grade | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|
Rating |
- In your manual or published materials, do you specify validated decision rules for when changes to instruction need to be made?
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
NOTE: The TRC expects evidence for this standard to include an empirical study that compares a treatment group to a control and evaluates whether student outcomes increase when decision rules are in place.
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.