Acadience Reading K-6 (aka DIBELS Next®)
Oral Reading Fluency Words Correct
Summary
Oral Reading Fluency Words Correct (ORF WC) is a measure of advanced phonics and word attack skills and accurate and fluent reading of connected text. The ORF passages and procedures are based on the program of research and development of Curriculum-Based Measurement of reading by Stan Deno and colleagues at the University of Minnesota (Deno, 1989). Students are given an unfamiliar, grade-level passage of text and asked to read for 1 minute. Errors such as substitutions, omissions, and hesitations for more than 3 seconds are marked while listening to the student read aloud. The Words Correct score is calculated by subtracting the number of errors from the total words read.
- Where to Obtain:
- Acadience Learning Inc. and Voyager Sopris Learning
- info@acadiencelearning.org
- Acadience Learning: 859 Willamette Street, Suite 320, Eugene, OR 97401; Voyager Sopris: 17855 Dallas Parkway, Suite 400, Dallas, TX 75287-6816
- Acadience Learning: (541) 431-6931, (888) 943-1240; Voyager Sopris: (888) 399-1995
- Acadience Learning: https://acadiencelearning.org/; Voyager Sopris: http://voyagersopris.com
- Initial Cost:
- Free
- Replacement Cost:
- Free
- Included in Cost:
- Acadience Learning: All materials are available for free download at https://acadiencelearning.org/acadiencereading.html, including progress monitoring student materials for each grade, assessor scoring booklets, a large print edition of all student materials, the Acadience Reading K-6 Assessment Manual, and the Acadience Reading Technical Manual. Voyager Sopris: There are three purchasing options for implementing progress monitoring materials: 1) Progress monitoring via online test administration and scoring; 2) Progress monitoring materials as part of the purchase of classroom sets, which also include benchmark materials; and 3) Individual progress monitoring materials (i.e., Assessment Book, Scoring Booklets). Classroom sets contain everything needed for one person to conduct the benchmark assessment for 25 students and the progress monitoring assessment for up to five students.
- Approved accommodations are any accommodations that will not alter the standardization of the assessment. Specific approved accommodations include, but are not limited to: 1. The use of colored overlays, filters, or lighting adjustments for students with visual impairments. 2. The use of student materials that have been enlarged or with larger print for students with visual impairments. 3. The use of assistive technology, such as hearing aids and assistive listening devices (ALDs), for students with hearing impairments. 4. The use of a marker or ruler to focus student attention on the materials for students who are not able to demonstrate their skills adequately without one.
- Training Requirements:
- Two to four hours of training to cover foundations of Acadience Reading, as well as administration and scoring of Oral Reading Fluency.
- Qualified Administrators:
- Paraprofessional-level training and adequate training on administration and scoring of Oral Reading Fluency.
- Access to Technical Support:
- Acadience Learning: Customer support is available from 8:00am to 5:00pm PT, Monday through Friday by phone, email, or through Acadience Learning's website; Voyager Sopris: Customer support is available 8:00am to 6:00pm CT, Monday through Friday by phone, email, or through the Voyager Sopris website.
- Assessment Format:
- Individual
- Computer-administered
- Scoring Time:
- Scoring is automatic OR
- 1 minute per passage
- Scores Generated:
- Raw score
- Percentile score
- Developmental benchmarks
- Developmental cut points
- Administration Time:
- 1 minute per passage
- Scoring Method:
- Manually (by hand)
- Automatically (computer-scored)
- Technology Requirements:
-
Tool Information
Descriptive Information
- Please provide a description of your tool:
- Oral Reading Fluency Words Correct (ORF WC) is a measure of advanced phonics and word attack skills and accurate and fluent reading of connected text. The ORF passages and procedures are based on the program of research and development of Curriculum-Based Measurement of reading by Stan Deno and colleagues at the University of Minnesota (Deno, 1989). Students are given an unfamiliar, grade-level passage of text and asked to read for 1 minute. Errors such as substitutions, omissions, and hesitations for more than 3 seconds are marked while listening to the student read aloud. The Words Correct score is calculated by subtracting the number of errors from the total words read.
- Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
-
ACADEMIC ONLY: What dimensions does the tool assess?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
Acquisition and Cost Information
Administration
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- Two to four hours of training to cover foundations of Acadience Reading, as well as administration and scoring of Oral Reading Fluency.
- Please describe the minimum qualifications an administrator must possess.
- Paraprofessional-level training and adequate training on administration and scoring of Oral Reading Fluency.
- No minimum qualifications
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- Acadience Learning: Customer support is available from 8:00am to 5:00pm PT, Monday through Friday by phone, email, or through Acadience Learning's website; Voyager Sopris: Customer support is available 8:00am to 6:00pm CT, Monday through Friday by phone, email, or through the Voyager Sopris website.
Scoring
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- During progress monitoring, one Oral Reading Fluency passage is administered. The ORF Words Correct score is the number of words read correctly in 1 minute. The student receives 1 point for each word read correctly in 1 minute. As the student reads the passage out loud during the assessment, the assessor follows along in the scoring booklet, leaving blank any words the student reads correctly and ignoring inserted words. The assessor marks a slash (/) through any words read incorrectly including the following: substitutions, skipped words, hesitations for more than 3 seconds, words read out of order, and words that are sounded out but not read as a whole word. The assessor marks an "sc" over any words that the student self-corrects within 3 seconds. At the end of 1 minute, the assessor puts a bracket after the last word read and does not score any student responses after 1 minute. The Words Correct score is calculated by subtracting the number of errors from the total words read. For benchmark assessment/universal screening, three passages are administered. For each passage, the assessor determines the total number of words and subtracts the errors (including skipped words) to obtain the words correct score for the passage. To obtain the final score, the assessor determines the median (middle) words correct score across the three passages.
- Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
- Yes
- ACADEMIC ONLY: Do you provide benchmarks for the slopes?
- No
- ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
- No
- Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
- The Acadience Reading K-6 measures were designed to be economical and efficient indicators of a student's progress toward achieving a general outcome such as reading or phonemic awareness and to be used for both benchmark assessment and progress monitoring. Progress monitoring refers to the more frequent testing of students who may be at risk for future reading difficulty on the skill areas in which they are receiving instruction, to ensure that they are making adequate progress. Progress monitoring can be conducted using grade-level or out-of-grade materials, depending on the student's needs. Decisions about the skill areas and levels to monitor are made at the individual student level. Students who are receiving additional support should be monitored for progress more frequently to ensure that the instructional support being provided is helping them get back on track. Monitoring may occur once per month, once every two weeks, or as often as once per week. In general, students who need the most intensive instruction are monitored for progress most frequently. Progress monitoring materials contain alternate forms of the same measures administered during benchmark assessment. Each alternate form is of equivalent difficulty. Not all students will need progress monitoring. Progress monitoring materials are organized by measure, since students who need progress monitoring will typically be monitored on specific measures related to the instruction they are receiving, rather than on every measure for that grade. Material selected for progress monitoring must be sensitive to growth, yet still represent an ambitious goal. The standardized procedures for administering an Acadience Reading K-6 measure may apply when using Acadience Reading K-6 for progress monitoring. Progress monitoring data should be graphed and readily available to those who teach the student. 
An aimline should be drawn from the student's current skill level (which may be the most recent benchmark assessment score) to the goal. Progress monitoring scores can then be plotted over time and examined to determine whether the student is making adequate progress (i.e., whether the scores fall above or below the aimline). The Acadience Reading K-6 assessments were designed to support students of varied backgrounds. Passages were written with names that represent diverse cultural, racial, and ethnic groups. Acadience Reading K-6 is appropriate for most students for whom an instructional goal is to learn to read in English. For English language learners who are learning to read in English, Acadience Reading K-6 is appropriate for assessing and monitoring progress in acquisition of early reading skills.
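The scoring rule and aimline comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the publisher's software: the function names, the straight-line aimline, and the use of weeks as the time unit are assumptions made for the example.

```python
from statistics import median


def words_correct(total_words_read: int, errors: int) -> int:
    """Words Correct = total words read in 1 minute minus marked errors."""
    if errors < 0 or total_words_read < errors:
        raise ValueError("errors must be between 0 and total_words_read")
    return total_words_read - errors


def benchmark_score(passages):
    """Median Words Correct across the three benchmark passages.

    `passages` is a list of three (total_words_read, errors) tuples.
    """
    if len(passages) != 3:
        raise ValueError("benchmark assessment uses three passages")
    return median(words_correct(total, errs) for total, errs in passages)


def aimline_value(start_score, goal_score, total_weeks, week):
    """Expected score at `week` on a straight line from start to goal.

    Assumption: the aimline is linear and time is counted in weeks.
    """
    slope = (goal_score - start_score) / total_weeks
    return start_score + slope * week


def at_or_above_aimline(score, start_score, goal_score, total_weeks, week):
    """True when a progress monitoring score is at or above the aimline."""
    return score >= aimline_value(start_score, goal_score, total_weeks, week)


# Hypothetical benchmark administration: three passages, median is reported.
print(benchmark_score([(52, 4), (47, 2), (60, 7)]))
# Hypothetical progress check: start at 40, goal of 70 in 10 weeks, week 4.
print(at_or_above_aimline(55, 40, 70, 10, 4))
```

Progress monitoring uses a single passage, so only `words_correct` applies there; the median step is specific to benchmark assessment.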
Rates of Improvement and End of Year Benchmarks
- Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in your manual or published materials?
- Yes
- If yes, specify the growth standards:
- Using Acadience Reading Pathways of Progress, the growth standards depend on the student's beginning-of-year performance relative to students with similar levels of initial skills, i.e., student performance is only compared to other students who have the same beginning-of-year score. Student scores above the 80th percentile are considered Well Above Typical progress. Student scores between the 60th and 79th percentile are considered Above Typical progress. Student scores between the 40th and 59th percentile are considered Typical progress. Student scores between the 20th and 39th percentile are considered Below Typical progress. And student scores below the 20th percentile are considered Well Below Typical progress.
- Are benchmarks for minimum acceptable end-of-year performance specified in your manual or published materials?
- Yes
- If yes, specify the end-of-year performance standards:
- Three primary end-of-year performance standards are specified: Well Below Benchmark, Below Benchmark, and At or Above Benchmark. These standards are used to indicate increasing odds of achieving At or Above Benchmark status at the next benchmark administration. End-of-year benchmark goals and cut points for risk: Grade 1 Benchmark Goal: 47, Cut point: 32; Grade 2 Benchmark Goal: 87, Cut point: 65; Grade 3 Benchmark Goal: 100, Cut point: 80; Grade 4 Benchmark Goal: 115, Cut point: 95; Grade 5 Benchmark Goal: 130, Cut point: 105; Grade 6 Benchmark Goal: 120, Cut point: 95.
- Date
- 2018
- Size
- 2,748,243
- Male
- 52%
- Female
- 48%
- Unknown
- 0%
- Eligible for free or reduced-price lunch
- 60%
- Other SES Indicators
- White, Non-Hispanic
- 45.63%
- Black, Non-Hispanic
- Hispanic
- American Indian/Alaska Native
- Asian/Pacific Islander
- Other
- Unknown
- Disability classification (Please describe)
- First language (Please describe)
- Language proficiency status (Please describe)
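The end-of-year benchmark goals, cut points, and Pathways of Progress percentile bands listed earlier in this section lend themselves to a simple lookup. The sketch below is illustrative only. It assumes that a score at or above the benchmark goal is At or Above Benchmark, that a score from the cut point up to the goal is Below Benchmark, and that a score below the cut point is Well Below Benchmark. It also assumes that a growth percentile of exactly 80 counts as Well Above Typical, which the text leaves ambiguous. The function names are hypothetical.

```python
# End-of-year ORF Words Correct (benchmark goal, cut point) by grade,
# copied from the performance standards stated in this section.
END_OF_YEAR = {
    1: (47, 32),
    2: (87, 65),
    3: (100, 80),
    4: (115, 95),
    5: (130, 105),
    6: (120, 95),
}


def benchmark_status(grade: int, words_correct: int) -> str:
    """Classify an end-of-year score (boundary handling is an assumption)."""
    goal, cut_point = END_OF_YEAR[grade]
    if words_correct >= goal:
        return "At or Above Benchmark"
    if words_correct >= cut_point:
        return "Below Benchmark"
    return "Well Below Benchmark"


def pathway_of_progress(growth_percentile: float) -> str:
    """Map a growth percentile to a Pathways of Progress band.

    The percentile is relative to students with the same
    beginning-of-year score, as described in the text.
    """
    if growth_percentile >= 80:  # assumption: exactly 80 is Well Above
        return "Well Above Typical"
    if growth_percentile >= 60:
        return "Above Typical"
    if growth_percentile >= 40:
        return "Typical"
    if growth_percentile >= 20:
        return "Below Typical"
    return "Well Below Typical"


print(benchmark_status(2, 70))
print(pathway_of_progress(45))
```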
Performance Level
Reliability
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- Reliability refers to the relative stability with which a test measures the same skills across minor differences in conditions. Three types of reliability are reported in the table below, alternate-form reliability, alpha, and inter-rater reliability. Alternate-form reliability is the correlation between different measures of the same early literacy skills. The coefficient reported is the average correlation among three forms of the measure. High alternate-form reliability coefficients suggest that these multiple forms are measuring the same construct. Coefficient alpha is a measure of reliability that is widely used in education research and represents the proportion of true score to total variance. Alpha incorporates information about the average inter-test correlation as well as the number of tests. Inter-rater reliability indicates the extent to which results generalize across assessors. The inter-rater reliability estimates reported here are based on two independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”). The two raters’ scores were then correlated.
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- The data used for assessing alternate-form reliability and calculating alpha was collected for 140 students in grades 1-6 (Oral Reading Fluency Readability Study (Tech. Report No. 7)). Student participants were from one elementary and one middle school located in one state of the Mountain West region of the United States. Students whose teachers volunteered to participate were recruited for participation in the study. Students receiving English language reading instruction in first- through sixth-grade general education classrooms were eligible for participation. Eligible students included those with disabilities as well as English language learners, provided they had the response capabilities to participate and their parents did not decline participation on their behalf. Participants were selected systematically from those students meeting the inclusion criteria. Each school site selected the nth student meeting the inclusion criteria from the participating teachers' class lists at each grade level (e.g., every fifth student at each grade level first to sixth) until 25 students per grade were selected. Demographic data were gathered at the school level from the NCES website (NCES, 2007, http://nces.ed.gov/). At the elementary school, 13% of the student population was reported as American Indian or Alaska Native, 4% as Asian, 1% as Black or African American, <1% as Hispanic, and 81% as White. Thirty-nine percent of the students were eligible for free/reduced lunch. The middle school reported 6% of the student population as American Indian or Alaska Native, 2% as Asian, <1% as Black or African American, 2% as Hispanic, and 89% as White. Fifty-six percent of the student population was eligible for free/reduced lunch.
The data used for assessing inter-rater reliability was collected for 122 students in grades 2-6. This data was collected as part of a larger study (i.e., Benchmark Goals Study (Tech. Report No. 11)) with a sample size of 3,816. Participants were from 13 elementary schools in 5 school districts in 5 states across three regions of the United States (East North Central Midwest, West North Central Midwest, and Pacific West). Students were selected from general education classrooms in which they were receiving English language reading instruction, including students with disabilities and English language learners, provided they had the response capabilities to participate. The site coordinator in each participating district (5 total) was asked to conduct shadow-scoring for 5 students in each grade. They were asked to select students at random from grade-level lists, and if possible, to have every examiner participate as both assessor and shadow-scorer. Demographic data were gathered at the school level from the NCES website (NCES, 2008, http://nces.ed.gov/). Across all participating schools, <1% of the student population was reported as American Indian or Alaska Native, 1% as Asian, <1% as Black or African American, 4% as Hispanic, and 94% as White. One of the school districts did not report data on free/reduced lunch eligibility. In the other four districts, 19% of the student population was eligible for free/reduced lunch.
- *Describe the analysis procedures for each reported type of reliability.
- Alternate-form reliability for grades 2-5: Each student in the study read a series of Acadience Reading passages. First-grade students read 29 passages and students in grades 2-5 read 32 passages. Passages were administered in a random order specific to each participating student (i.e., each student had their own specific testing schedule). The passages were organized into triads (sets of three forms or passages) and one dyad (set of two passages). Alternate-form reliability of the triads was determined by correlating each triad with every other triad in that grade (e.g., Triad 1 x Triad 2, Triad 1 x Triad 3, etc.). The alternate-form reliability reported below is the median alternate-form reliability between triads of Acadience Reading passages and other grade-level triads. Alternate form reliability for grade 6: Students were administered their Acadience Reading benchmark assessment in the fall. Two weeks later, students were administered three alternate forms of ORF passages. The alternate-form reliability reported below is the ___________ alternate-form reliability between the _________________. Coefficient alpha: Coefficient alpha treats each of the three passages as separate indicators and is calculated using the alternate-form reliability, where the number of tests is equal to three. Inter-rater reliability: The inter-rater reliability estimates reported here are based on two independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”). The inter-rater reliability coefficient is the correlation between these two independent assessors. ORF passages are administered in triads, thus the inter-rater reliability that is reported is based on three forms (i.e., triad).
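The description of coefficient alpha above (built from the alternate-form reliability with the number of tests equal to three) matches the standardized form of alpha, which is equivalent to the Spearman-Brown formula. The source does not state the exact formula, so the sketch below is an assumption about how such a value could be computed, and the input correlation is hypothetical rather than a reported result.

```python
def standardized_alpha(mean_r: float, k: int = 3) -> float:
    """Standardized coefficient alpha for k parallel forms.

    mean_r: average correlation between forms (here, the
            alternate-form reliability of single passages).
    k:      number of forms treated as separate indicators.

    Formula (assumed, not quoted from the source):
        alpha = k * r / (1 + (k - 1) * r)
    """
    if k < 2:
        raise ValueError("k must be at least 2")
    return (k * mean_r) / (1 + (k - 1) * mean_r)


# Hypothetical alternate-form reliability of 0.90 across three passages.
print(round(standardized_alpha(0.90, k=3), 3))
```

Because alpha under this formula rises with both the average correlation and the number of forms, a three-passage score is expected to be more reliable than any single passage.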
*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- Yes
- Provide citations for additional published studies.
- Dewey, E. N., Latimer, R. J., Kaminski, R. A., & Good, R. H. (2011). DIBELS Next Development: Findings from Beta 2 Validation Study (Tech. Report No. 10). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Powell-Smith, K. A., Good, R. H., Latimer, R. J., Dewey, E. N., & Kaminski, R. A. (2011). DIBELS Next Benchmark Goals Study (Tech. Report No. 11). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Powell-Smith, K. A., Good, R. H., & Atkins, T. (2010). DIBELS Next Oral Reading Fluency Readability Study (Tech. Report No. 7). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). Acadience Reading K–6 Technical Adequacy Brief. Eugene, OR: Acadience Learning. Available: https://acadiencelearning.org. Please note that Dynamic Measurement Group is now Acadience Learning and Acadience Reading K-6 is also published as DIBELS Next. Some historical documents use the original company and assessment name.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- The Group Reading Assessment and Diagnostic Evaluation (GRADE) is an untimed, group-administered, norm-referenced reading achievement test appropriate for children in preschool through grade 12. The GRADE is comprised of 16 subtests within five components. Not all 16 subtests are used at each testing level. Various subtest scores are combined to form the Total Test composite score. The GRADE Total Test raw score was compared to all Acadience measures given during the year, providing both predictive criterion-related validity correlations for beginning- and middle-of-year Acadience measures and concurrent criterion-related validity data for end-of-year Acadience measures. The GRADE Total Test score is comprised of scores across subtests of the GRADE that vary by grade level. In kindergarten, the GRADE Total Test score is comprised of measures that assess phonics and phonemic and phonological awareness. In third grade, GRADE Total Test is comprised of measures assessing word reading, vocabulary, and comprehension. In fourth, fifth, and sixth grade, GRADE Total Test includes scores from measures of vocabulary and comprehension. The SBAC (Smarter Balanced Assessment Consortium) is a summative Common-Core aligned assessment administered to students in grades 3-8 at the end of the school year (i.e., spring). The ELA (English Language Arts) subtest, which is what we examined, is comprised of computer adaptive (CAT) and performance tasks (PT) and measures four claims: Reading, Writing, Speaking/Listening, and Research. The CAT consists of machine scored and short-text items assessing all four claims. The PT includes two or three research items requiring both short-text responses and a full written response and assesses only the Writing and Research claims. Scores are reported for overall ELA performance as well as for each claim. 
Arizona’s Measurement of Educational Readiness to Inform Teaching (AzMERIT) is the CCSS-aligned assessment used and commissioned by the state of Arizona administered to students in grades 3-8 at the end of the school year (i.e., spring). The AzMERIT includes a number of different types of questions, including performance tasks that are multi-step assignments that ask students to apply their knowledge and skills to address real-world problems. In English Language Arts (ELA), the subtest examined in our analyses, students apply their research and writing skills. The test also includes traditional multiple choice questions, as well as interactive questions that require students to drag and drop their answers into a box, create equations, and fill in the answer. The California Standards Test (CST) is a statewide achievement test produced for California public schools and was designed to assess the California content standards for English/language arts (ELA), mathematics, history–social science, and science in grades 2-11. According to a technical report from ETS (2011), the CST items were developed and designed to conform to principles of item writing defined by ETS (ETS, 2002). In addition, the items selected underwent an extensive item review process designed to provide the best standards-based tests possible. The Reading cluster of the ELA portion of the CST was examined in our analyses.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- The GRADE data set included students in third through sixth grade. The total sample size is 677 students from 13 schools within 5 school districts. The sample was drawn from two census regions (Pacific and North Central Midwest). The SBAC data set included 1,973 students in third through fifth grade from 18 schools, including 5 schools in rural areas, in 1 school district in 1 Pacific West US state. Approximately 83% of students were White, 12% were Hispanic/Latino, 2% were Multiracial, 1% Asian, and less than 1% American Indian/Alaska Native, Native Hawaiian/Pacific Islander, and Black/African American, respectively. The AzMERIT data set included 1,252 students in third and fourth grade from 16 schools in 1 large-city school district in 1 Mountain West US state. Approximately 54% of students were Hispanic/Latino, 23% were White, 11% were Black/African American, 8% were American Indian/Native Alaskan, 5% were Multiracial, and 4% were Asian/Native Hawaiian/Pacific Islander. The CST data set included 5,391 students in second through sixth grade from 14 schools in 1 large-suburban school district in 1 Pacific West US state. Approximately 46% of students were White and 38% were Hispanic/Latino. Thirty-one percent of students in the district qualified for free/reduced lunch and 20% were English Language Learners.
- *Describe the analysis procedures for each reported type of validity.
- Predictive validity is the correlation between Oral Reading Fluency Words Correct at the beginning of the year and the GRADE, SBAC, AzMERIT, or CST (as indicated) score at the end of the school year. This coefficient represents the extent to which Oral Reading Fluency Words Correct can predict later reading outcomes. Concurrent validity is the correlation between the Oral Reading Fluency Words Correct score and the GRADE, SBAC, AzMERIT, or CST (as indicated) measure both at the end of the year. This coefficient represents the extent to which the Words Correct score is related to important reading outcomes.
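Both validity coefficients described above are correlations between paired scores. The source does not name the correlation statistic, so the sketch below assumes a Pearson correlation. The score lists are invented for illustration and are not data from any of the studies cited here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with n >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired scores for five students.
fall_orf = [20, 35, 50, 65, 80]         # beginning-of-year ORF Words Correct
spring_orf = [45, 60, 72, 90, 104]      # end-of-year ORF Words Correct
spring_criterion = [410, 430, 455, 470, 500]  # end-of-year criterion score

predictive = pearson_r(fall_orf, spring_criterion)    # fall ORF vs. spring criterion
concurrent = pearson_r(spring_orf, spring_criterion)  # spring ORF vs. spring criterion
print(round(predictive, 2), round(concurrent, 2))
```

The same function would apply to the shadow-scoring design for inter-rater reliability, with the two lists holding each assessor's scores for the same administrations.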
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- ORF is a measure of advanced phonics and word attack skills, accurate and fluent reading of connected text, and reading comprehension. The ORF passages and procedures are based on the program of research and development of Curriculum-Based Measurement of reading by Stan Deno and colleagues at the University of Minnesota (Deno, 1989). The ORF passages were designed to represent the different types of text that students will encounter, including a mix of narrative and expository, with different types of passages and content within those categories. A range of topics and themes was selected so that each student would encounter familiar topics and unfamiliar topics. The passages were designed to be authentic text, so they include irregular words and are not written entirely in decodable text. Passages were written and revised by professional authors according to the design specifications noted in the Technical Manual. All passages were required to meet readability criteria for the grade level as measured by the Acadience Learning Passage Revision Utility, which is software that identifies the target word length, rare words, and sentence length for a passage and provides guidance when a passage is outside any of the target ranges specified by the Acadience Learning Passage Difficulty Index. The initial passage set included 40 passages for each grade that met the criteria. A readability study was conducted to examine actual student performance on all of the passages and further control differences in passage difficulty within each grade level.
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Dewey, E. N., Powell-Smith, K. A., Good, R. H., & Kaminski, R. A. (2015). Acadience Reading K–6 Technical Adequacy Brief. Eugene, OR: Acadience Learning. Available: https://acadiencelearning.org. Powell-Smith, K. A., Good, R. H., Latimer, R. J., Dewey, E. N., & Kaminski, R. A. (2011). DIBELS Next Benchmark Goals Study (Tech. Report No. 11). Eugene, OR: Dynamic Measurement Group. Available: https://acadiencelearning.org. Good, R. H., Powell-Smith, K. A., Abbott, M., VanLoo, D., Warnock, A. N., & Latimer, R. J. (2018). Using DIBELS Next to Predict Performance on Statewide ELA Assessments. Paper presented at the National Association of School Psychologists' Annual Convention, Chicago, IL. Available: https://acadiencelearning.org. Good, R. H., Powell-Smith, K. A., Abbott, M., Dewey, E. N., Warnock, A. N., & VanLoo, D. (2017). Examining the Association Between DIBELS Next and the SBAC ELA Achievement Standard. Poster presented at the Pacific Coast Research Conference, San Diego, CA. Available: https://acadiencelearning.org. Powell-Smith, K. A., Good, R. H., Plahy, C., & Hunter, M. P. (2013). Decision Utility of DIBELS Next for the California Standards Test. Presented at the National Association of School Psychologists' Annual Convention. Available: https://acadiencelearning.org. Please note that Dynamic Measurement Group is now Acadience Learning and Acadience Reading K-6 is also published as DIBELS Next. Some historical documents use the original company and assessment name.
- Describe the degree to which the provided data support the validity of the tool.
- Both the concurrent and predictive correlations are generally high for ORF Words Correct. These strong correlations suggest that Oral Reading Fluency Words Correct is measuring skills relevant to reading outcomes. Given the wide range of skills assessed on the GRADE, these data support the conclusion that the Oral Reading Fluency measure is an excellent indicator of reading proficiency.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Bias Analysis
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating | No | No | No | No | No | No |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- Yes
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- Bias was conceptualized as a difference in classification accuracy between groups. It was assessed using a Cleary model with the dichotomous outcome of status on the criterion, where the Oral Reading Fluency score, subgroup, and the interaction between the two were used as predictors. If adding the subgroup and interaction terms does not significantly improve model fit, there is evidence that Oral Reading Fluency is not biased. Model fit was assessed using the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the likelihood ratio test (LRT). The effect size for bias was assessed using the difference in AUC for the ROC curves for the different groups. These models were tested for each grade, at each time of year.
- b. Describe the subgroups for which bias analyses were conducted:
- Bias was assessed across genders and among white and non-white students.
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Of the 9 models examining bias across ethnicities, the AIC and BIC favored a model without bias all nine times, and the likelihood ratio test showed that adding ethnic group to the logistic regression did not significantly improve model fit. Of the 21 models examining bias across genders, the AIC favored a model without bias 17 times while the BIC favored a model without bias 20 times. Likewise, the likelihood ratio test favored a model with bias only once out of 21 models. The results show that the rate of preferring a model with bias is near the global Type I error rate of .05, suggesting a lack of bias on the Oral Reading Fluency measure.
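The model-comparison step described in (a) can be sketched as follows. This is a minimal illustration, not the vendor's analysis code: it assumes the two logistic regressions (ORF only, and ORF plus subgroup plus interaction) have already been fitted elsewhere, and it takes their log-likelihoods as inputs. The function names, the log-likelihood values, and the score lists are all hypothetical. Because the larger model adds exactly two parameters, the chi-square p-value for the likelihood ratio test has the closed form exp(-x/2).

```python
import math


def information_criteria(log_lik, n_params, n_obs):
    """Return (AIC, BIC) for a fitted model; lower values indicate better fit."""
    aic = 2 * n_params - 2 * log_lik
    bic = n_params * math.log(n_obs) - 2 * log_lik
    return aic, bic


def lrt_p_value_df2(log_lik_reduced, log_lik_full):
    """Likelihood ratio test when the full model adds two parameters
    (subgroup and the ORF-by-subgroup interaction).
    For 2 degrees of freedom the chi-square survival function is exp(-x / 2)."""
    statistic = 2 * (log_lik_full - log_lik_reduced)
    return statistic, math.exp(-statistic / 2)


def auc(scores_met, scores_not_met):
    """AUC as the probability that a student who met the criterion has a higher
    ORF score than one who did not (ties count as half)."""
    wins = 0.0
    for p in scores_met:
        for q in scores_not_met:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_met) * len(scores_not_met))


# Hypothetical log-likelihoods for one grade and time of year (illustrative only).
n_students = 1000
ll_no_bias = -520.0   # intercept + ORF score (2 parameters)
ll_bias = -519.0      # adds subgroup + interaction (4 parameters)

aic_no_bias, bic_no_bias = information_criteria(ll_no_bias, 2, n_students)
aic_bias, bic_bias = information_criteria(ll_bias, 4, n_students)
lrt_stat, lrt_p = lrt_p_value_df2(ll_no_bias, ll_bias)

print("AIC favors no-bias model:", aic_no_bias < aic_bias)
print("BIC favors no-bias model:", bic_no_bias < bic_bias)
print("LRT rejects no-bias model at .05:", lrt_p < 0.05)

# Effect size: difference in AUC between two hypothetical subgroups.
auc_group_a = auc([60, 72, 85], [30, 45, 66])
auc_group_b = auc([58, 70, 90], [35, 50, 58])
print("AUC difference:", auc_group_b - auc_group_a)
```

In this sketch a model "without bias" is favored when its AIC or BIC is lower, and a model "with bias" is favored by the LRT only when the p-value falls below the chosen alpha.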
Growth Standards
Sensitivity: Reliability of Slope
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- Describe the sample, including size and characteristics. Please provide documentation showing that the sample was composed of students in need of intensive intervention. A sample of students with intensive needs should satisfy one of the following criteria: (1) all students scored below the 30th percentile on a local or national norm, or the sample mean on a local or national test fell below the 25th percentile; (2) students had an IEP with goals consistent with the construct measured by the tool; or (3) students were non-responsive to Tier 2 instruction. Evidence based on an unknown sample, or a sample that does not meet these specifications, may not be considered.
- The sample consisted of students who were identified as being "Well Below Benchmark" using the benchmark assessment of Acadience Reading at beginning of year. Being Well Below Benchmark corresponds to being below the 14th, 26th, 22nd, 26th, and 24th percentiles for second, third, fourth, fifth, and sixth grades, respectively. Students were only selected if they had a minimum of 15 observations.
- Describe the frequency of measurement (for each student in the sample, report how often data were collected and over what span of time).
- Progress monitoring data were collected throughout the school year at the discretion of the administering school, but not more frequently than once per week. Any student who had fewer than fifteen progress monitoring assessments was excluded from the analysis.
- Describe the analysis procedures.
- Reliability of slope was calculated as the ratio of true score variance to observed total variance. The true score variance estimate came from a hierarchical linear model-based estimate of the variance in progress monitoring slopes (using the R package lme4). The observed score variance was calculated as the variance of the ordinary least squares slopes computed for each student who met the aforementioned inclusion criteria. Confidence intervals were calculated using bootstrap estimation.
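As a rough illustration of this ratio, the sketch below computes per-student OLS slopes and divides an estimate of true slope variance by the observed slope variance. It is not the analysis reported here: the true-variance estimate in the source came from a hierarchical linear model fitted with lme4, whereas this sketch substitutes a simpler method-of-moments estimate (observed slope variance minus the mean sampling-error variance of the slopes). The bootstrap confidence intervals are omitted, and the function names and data are hypothetical.

```python
from statistics import mean, variance


def ols_slope_and_error_variance(weeks, scores):
    """Fit score = intercept + slope * week by ordinary least squares.
    Returns the slope and the squared standard error of that slope."""
    n = len(weeks)
    x_bar, y_bar = mean(weeks), mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in weeks)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(weeks, scores))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(weeks, scores))
    residual_variance = sse / (n - 2)
    return slope, residual_variance / sxx


def slope_reliability(students):
    """Ratio of estimated true slope variance to observed slope variance.
    ASSUMPTION: true variance is approximated by method of moments,
    standing in for the HLM-based estimate described in the text."""
    fits = [ols_slope_and_error_variance(w, s) for w, s in students]
    observed_var = variance([slope for slope, _ in fits])
    error_var = mean([err for _, err in fits])
    true_var = max(0.0, observed_var - error_var)
    return true_var / observed_var


# Hypothetical progress monitoring data: (weeks, words correct) per student.
students = [
    ([0, 1, 2, 3, 4], [20, 22, 21, 25, 26]),
    ([0, 1, 2, 3, 4], [15, 18, 22, 24, 29]),
    ([0, 1, 2, 3, 4], [30, 30, 32, 31, 33]),
]
print("Estimated reliability of slope:", slope_reliability(students))
```

With only a handful of observations per student the error term dominates, which is one reason a minimum number of observations per student (fifteen in the reported sample) matters for this statistic.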
In the table below, report reliability of the slope (e.g., ratio of true slope variance to total slope variance) by grade level (if relevant).
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability of the slope data that is disaggregated by subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?
- No
If yes, fill in data for each subgroup with disaggregated reliability of the slope data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Sensitivity: Validity of Slope
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- For the Acadience Oral Reading Fluency Words Correct progress monitoring assessment in first grade, we used the Acadience Retell at the end of the subsequent year as the criterion measure. For the Acadience Oral Reading Fluency Words Correct progress monitoring assessment in second through fifth grades, we used the Acadience Maze at the end of the subsequent year as the criterion measure. While the criterion is internal in the sense that both the progress monitoring assessment and the criterion are Acadience measures, the criterion is external in the sense that it is distinct and separate from the Oral Reading Fluency Words Correct progress monitoring system. Indeed, there is no shared method variance between the two: (a) the Oral Reading Fluency Words Correct assessment requires a student to read a passage aloud accurately and fluently, (b) the Retell assessment requires the student to read a passage orally and tell about what they have read, and (c) the Maze assessment requires students to read a passage silently and fill in blanks for approximately every 7th word by selecting, from a choice of three words, the word that makes the most sense in the passage. In addition, there is no overlap of item samples: the passages used for the Oral Reading Fluency Words Correct assessment are completely different from (a) the passages used for the Retell assessment at the end of the subsequent year and (b) the passages used for the Maze assessment. These requirements (external measures, no shared method variance, no overlap of item samples) serve to ensure a conceptual distance between the slope of Oral Reading Fluency Words Correct and the criterion so that there is no artificial overlap. In the reported analysis we increased the length of time between the slope of Oral Reading Fluency Words Correct and the criterion measure by examining the criterion at the end of the subsequent academic year, over 12 months later.
So, for example, the validity of slope of progress on the third-grade Oral Reading Fluency Words Correct assessment was examined with respect to the end-of-fourth-grade Maze assessment. In sum, we believe that using both an alternative measure of reading skills (Oral Reading Fluency vs. Retell or Maze) and the length of time between the end of progress monitoring and the criterion (an entire year between the progress monitoring occasion and the criterion) provides a sufficiently powerful examination of the validity of slope.
- Describe the sample(s), including size and characteristics. Please provide documentation showing that the sample was composed of students in need of intensive intervention. A sample of students with intensive needs should satisfy one of the following criteria: (1) all students scored below the 30th percentile on a local or national norm, or the sample mean on a local or national test fell below the 25th percentile; (2) students had an IEP with goals consistent with the construct measured by the tool; or (3) students were non-responsive to Tier 2 instruction. Evidence based on an unknown sample, or a sample that does not meet these specifications, may not be considered.
- The sample consisted of students who were identified as being "Well Below Benchmark" using the benchmark assessment of Acadience Reading at beginning of year. Being Well Below Benchmark corresponds to being below the 14th, 26th, 22nd, 26th, and 24th percentiles for second, third, fourth, fifth, and sixth grades, respectively. Students were only selected if they had a minimum of 15 observations.
- Describe the frequency of measurement (for each student in the sample, report how often data were collected and over what span of time).
- Progress monitoring data were collected throughout the school year at the discretion of the administering school, but not more frequently than once per week. Any student who had fewer than fifteen progress monitoring assessments was excluded from the analysis.
- Describe the analysis procedures for each reported type of validity.
- Validity of slope was assessed using the partial correlations between the students' ordinary least squares slope and the criterion, while controlling for the students' ordinary least squares intercept.
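The partial correlation described above can be sketched with the standard first-order formula, which removes the linear effect of the intercept from both the slope and the criterion. This is an illustrative sketch under assumptions, not the vendor's code: the function names and the three small lists of values are hypothetical, and the per-student OLS slopes and intercepts are assumed to have been computed already.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(slope, criterion, intercept):
    """Correlation between slope and criterion, controlling for intercept:
    (r_sc - r_si * r_ci) / sqrt((1 - r_si^2) * (1 - r_ci^2))."""
    r_sc = pearson_r(slope, criterion)
    r_si = pearson_r(slope, intercept)
    r_ci = pearson_r(criterion, intercept)
    return (r_sc - r_si * r_ci) / math.sqrt((1 - r_si ** 2) * (1 - r_ci ** 2))


# Hypothetical per-student values (illustrative only).
ols_slopes = [0.8, 1.1, 1.5, 0.6, 1.9, 1.2]          # words correct per week
ols_intercepts = [22.0, 30.0, 18.0, 35.0, 25.0, 28.0]  # initial level
criterion_scores = [12, 17, 15, 14, 21, 16]           # end of subsequent year

print("Partial correlation:",
      partial_correlation(ols_slopes, criterion_scores, ols_intercepts))
```

When the intercept is uncorrelated with both other variables, the partial correlation reduces to the simple correlation between slope and criterion.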
In the table below, report predictive validity of the slope (correlation between the slope and achievement outcome) by grade level (if relevant).
NOTE: The TRC suggests controlling for initial level when the correlation for slope without such control is not adequate.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- The moderate partial correlations between the OLS slopes and a criterion that is separated by an entire year and is a conceptually different measure of reading skills provide good evidence for validity.
- Do you have validity of the slope data that is disaggregated by subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?
- No
If yes, fill in data for each subgroup with disaggregated validity of the slope data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
Alternate Forms
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- Describe the sample for these analyses, including size and characteristics:
- What is the number of alternate forms of equal and controlled difficulty?
- If IRT based, provide evidence of item or ability invariance
- If computer administered, how many items are in the item bank for each grade level?
- If your tool is computer administered, please note how the test forms are derived instead of providing alternate forms:
Decision Rules: Setting & Revising Goals
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- In your manual or published materials, do you specify validated decision rules for how to set and revise goals?
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
NOTE: The TRC expects evidence for this standard to include an empirical study that compares a treatment group to a control and evaluates whether student outcomes increase when decision rules are in place.
Decision Rules: Changing Instruction
Grade | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 |
---|---|---|---|---|---|---|
Rating |
- In your manual or published materials, do you specify validated decision rules for when changes to instruction need to be made?
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
NOTE: The TRC expects evidence for this standard to include an empirical study that compares a treatment group to a control and evaluates whether student outcomes increase when decision rules are in place.
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.