mCLASS: Reading 3D
Text Reading and Comprehension (TRC)

Summary

mCLASS:3D - Text Reading and Comprehension (TRC) is a set of screening and progress monitoring measures for grades K-6. TRC is an individually administered assessment using leveled readers from a book set to determine a student’s instructional reading level. During this measure, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions. Assessors observe and record the student’s oral reading behaviors throughout the administration of TRC to determine reading accuracy, reading fluency, and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and the student’s instructional reading level. While the student reads from the set of leveled readers, the teacher follows along on a handheld device, recording the student’s performance as the child reads. The handheld software offers a pre-loaded class list indicating required assessment tasks, provides the teacher with directions and prompts to ensure standardized, accurate administration, and automates the precise timing requirements. Upon completion of each task, the handheld automatically calculates the student’s score and provides a risk evaluation. Student performance data are securely and immediately transferred to the Web-based mCLASS reporting system. The mCLASS:3D Web site offers a range of reports at the district, school, class, and individual student level for further analysis. The set of measures in the screening is designed to be administered at the beginning, middle, and end of the year, with alternate forms of all measures available for progress monitoring between screening windows.

Where to Obtain:
Amplify Education, Inc.
support@amplify.com
55 Washington Street Suite 800 Brooklyn, NY 11201-1071
(800) 823-1969
https://www.amplify.com/
Initial Cost:
$20.90 per student
Replacement Cost:
$20.90 per student per year
Included in Cost:
The basic pricing plan is an annual per student license of $20.90. For users already using an mCLASS assessment product, the cost per student to add mCLASS:3D is $6 per student.
mCLASS allows administration of the TRC assessment on mobile devices, letting teachers record student responses with a tap of a button, along with other observations made during an assessment, for a deeper interpretation of students’ skills. It has embedded scripts for prompts and directions to ensure standardized administration, so all students receive the same opportunity to perform. The mCLASS Platform provides a comprehensive service for managing the staff organizational structure and student enrollment data, providing online reporting and analysis tools for users in different roles, from administrators to classroom teachers, and supporting our mobile assessment delivery system. It supports the Now What Tools, which make assessment results actionable for teachers by translating data into practical instructional support with tools for small-group instruction, item-level analysis, and parent letters. Educators and administrators can immediately access student data using reports that are designed to influence instruction and inform administrative decisions. mCLASS is an assessment instrument well suited to capturing the developing reading skills of students with disabilities, with a few exceptions: a) students who are deaf; b) students who have fluency-based speech disabilities, e.g., stuttering, oral apraxia; c) students who are learning to read in a language other than English or Spanish; d) students with severe disabilities. Use of mCLASS is appropriate for all other students, including those with disabilities who are receiving special education supports and for whom reading connected text is an IEP goal. For students receiving special education, it may be necessary to adjust goals and timelines and to provide accommodations as part of the administration.
The purpose of accommodation is to facilitate assessment for children for whom a standard administration may not provide an accurate estimate of their skills in the core early literacy skill areas. Valid and acceptable accommodations are ones that are unlikely to change substantially the meaning or interpretation of a student’s scores. The list of valid and acceptable accommodations for TRC administration is available upon request.
Training Requirements:
4 - 8 hours of training.
Qualified Administrators:
Examiners must receive training in the assessment administration and scoring.
Access to Technical Support:
Amplify’s Customer Care Center offers complete user-level support from 7:00 a.m. to 7:00 p.m. EST, Monday through Friday. Customers may contact a customer support representative via telephone, e-mail, or electronically through the mCLASS website. Calls to the Customer Care Center’s toll-free number are answered immediately by an automated attendant and routed to customer support agents according to regional expertise. Additionally, customers have self-service access to instructions, documents, and frequently asked questions on our website. The research staff and product teams are available to answer questions about the content within the assessments. Larger implementations have a designated account manager to support ongoing successful implementation.
Assessment Format:
  • Individual
  • Small group
  • Large group
  • Computer-administered
Scoring Time:
  • Scoring is automatic OR
  • 0 minutes per student
Scores Generated:
  • Raw score
  • Developmental benchmarks
Administration Time:
  • 7 minutes per student
Scoring Method:
  • Manually (by hand)
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection

Tool Information

Descriptive Information

Please provide a description of your tool:
mCLASS:3D - Text Reading and Comprehension (TRC) is a set of screening and progress monitoring measures for grades K-6. TRC is an individually administered assessment using leveled readers from a book set to determine a student’s instructional reading level. During this measure, students are asked to read a book and complete a number of follow-up tasks, which may include responding to oral comprehension questions, completing a retell, and/or writing responses to comprehension questions. Assessors observe and record the student’s oral reading behaviors throughout the administration of TRC to determine reading accuracy, reading fluency, and comprehension of the text. The comprehension components help assessors determine whether the student understands the meaning of the text and the student’s instructional reading level. While the student reads from the set of leveled readers, the teacher follows along on a handheld device, recording the student’s performance as the child reads. The handheld software offers a pre-loaded class list indicating required assessment tasks, provides the teacher with directions and prompts to ensure standardized, accurate administration, and automates the precise timing requirements. Upon completion of each task, the handheld automatically calculates the student’s score and provides a risk evaluation. Student performance data are securely and immediately transferred to the Web-based mCLASS reporting system. The mCLASS:3D Web site offers a range of reports at the district, school, class, and individual student level for further analysis. The set of measures in the screening is designed to be administered at the beginning, middle, and end of the year, with alternate forms of all measures available for progress monitoring between screening windows.
Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
selected
not selected
The tool is intended for use with the following grade(s).
not selected Preschool / Pre - kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
selected Fourth grade
selected Fifth grade
selected Sixth grade
not selected Seventh grade
not selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
not selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
selected 10 years old
selected 11 years old
selected 12 years old
not selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What dimensions does the tool assess?

Reading
selected Global Indicator of Reading Competence
not selected Listening Comprehension
not selected Vocabulary
not selected Phonemic Awareness
selected Decoding
selected Passage Reading
not selected Word Identification
selected Comprehension

Spelling & Written Expression
not selected Global Indicator of Spelling Competence
not selected Global Indicator of Written Expression Competence

Mathematics
not selected Global Indicator of Mathematics Comprehension
not selected Early Numeracy
not selected Mathematics Concepts
not selected Mathematics Computation
not selected Mathematics Application
not selected Fractions
not selected Algebra

Other
Please describe specific domain, skills or subtests:


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
BEHAVIOR ONLY: Which category of behaviors does your tool target?

Acquisition and Cost Information

Where to obtain:
Email Address
support@amplify.com
Address
55 Washington Street Suite 800 Brooklyn, NY 11201-1071
Phone Number
(800) 823-1969
Website
https://www.amplify.com/
Initial cost for implementing program:
Cost
$20.90
Unit of cost
student
Replacement cost per unit for subsequent use:
Cost
$20.90
Unit of cost
student
Duration of license
year
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
The basic pricing plan is an annual per student license of $20.90. For users already using an mCLASS assessment product, the cost per student to add mCLASS:3D is $6 per student.
Provide information about special accommodations for students with disabilities.
mCLASS allows administration of the TRC assessment on mobile devices, letting teachers record student responses with a tap of a button, along with other observations made during an assessment, for a deeper interpretation of students’ skills. It has embedded scripts for prompts and directions to ensure standardized administration, so all students receive the same opportunity to perform. The mCLASS Platform provides a comprehensive service for managing the staff organizational structure and student enrollment data, providing online reporting and analysis tools for users in different roles, from administrators to classroom teachers, and supporting our mobile assessment delivery system. It supports the Now What Tools, which make assessment results actionable for teachers by translating data into practical instructional support with tools for small-group instruction, item-level analysis, and parent letters. Educators and administrators can immediately access student data using reports that are designed to influence instruction and inform administrative decisions. mCLASS is an assessment instrument well suited to capturing the developing reading skills of students with disabilities, with a few exceptions: a) students who are deaf; b) students who have fluency-based speech disabilities, e.g., stuttering, oral apraxia; c) students who are learning to read in a language other than English or Spanish; d) students with severe disabilities. Use of mCLASS is appropriate for all other students, including those with disabilities who are receiving special education supports and for whom reading connected text is an IEP goal. For students receiving special education, it may be necessary to adjust goals and timelines and to provide accommodations as part of the administration.
The purpose of accommodation is to facilitate assessment for children for whom a standard administration may not provide an accurate estimate of their skills in the core early literacy skill areas. Valid and acceptable accommodations are ones that are unlikely to change substantially the meaning or interpretation of a student’s scores. The list of valid and acceptable accommodations for TRC administration is available upon request.

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
not selected
not selected
not selected
not selected
not selected
not selected
If other, please specify:

BEHAVIOR ONLY: What is the administration format?
not selected
not selected
not selected
not selected
not selected
If other, please specify:

BEHAVIOR ONLY: What is the administration setting?
not selected
not selected
not selected
not selected
not selected
not selected
not selected
If other, please specify:

Does the program require technology?

If yes, what technology is required to implement your program? (Select all that apply)
selected
selected
not selected

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?
selected
selected    If small group, n=
selected    If large group, n=
selected
not selected
If other, please specify:

What is the administration time?
Time in minutes
7
per (student/group/other unit)
student

Additional scoring time:
Time in minutes
0
per (student/group/other unit)
student

How many alternate forms are available, if applicable?
Number of alternate forms
20+
per (grade/level/unit)

ACADEMIC ONLY: What are the discontinue rules?
not selected
not selected
selected
not selected
If other, please specify:

BEHAVIOR ONLY: Can multiple students be rated concurrently by one administrator?
If yes, how many students can be rated concurrently?

Training & Scoring

Training

Is training for the administrator required?
Yes
Describe the time required for administrator training, if applicable:
4 - 8 hours of training.
Please describe the minimum qualifications an administrator must possess.
Examiners must receive training in the assessment administration and scoring.
not selected No minimum qualifications
Are training manuals and materials available?
Yes
Are training manuals/materials field-tested?
Yes
Are training manuals/materials included in cost of tools?
No
If No, please describe training costs:
For first-time mCLASS users, a 2-day in-person training is available for $575 in-office or $4,800 onsite for a maximum of 25 participants.
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:
Amplify’s Customer Care Center offers complete user-level support from 7:00 a.m. to 7:00 p.m. EST, Monday through Friday. Customers may contact a customer support representative via telephone, e-mail, or electronically through the mCLASS website. Calls to the Customer Care Center’s toll-free number are answered immediately by an automated attendant and routed to customer support agents according to regional expertise. Additionally, customers have self-service access to instructions, documents, and frequently asked questions on our website. The research staff and product teams are available to answer questions about the content within the assessments. Larger implementations have a designated account manager to support ongoing successful implementation.

Scoring

BEHAVIOR ONLY: What types of scores result from the administration of the assessment?
Score
Observation Behavior Rating
not selected Frequency
not selected Duration
not selected Interval
not selected Latency
not selected Raw score
Conversion
Observation Behavior Rating
not selected Rate
not selected Percent
not selected Standard score
not selected Subscale/ Subtest
not selected Composite
not selected Stanine
not selected Percentile ranks
not selected Normal curve equivalents
not selected IRT based scores
Interpretation
Observation Behavior Rating
not selected Error analysis
not selected Peer comparison
not selected Rate of change
not selected Dev. benchmarks
not selected Age-Grade equivalent
How are scores calculated?
selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:
Teachers follow and score along, marking reading errors and responses to orally presented questions, as students read and respond in real time. The final reading accuracy and comprehension scores, and the instructional reading level based on the student’s performance as recorded by the teacher, are calculated automatically by the software.

Do you provide basis for calculating performance level scores?
Yes

What is the basis for calculating performance level and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
not selected Percentile score
not selected Grade equivalents
not selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
selected Developmental benchmarks
not selected Developmental cut points
not selected Equated
not selected Probability
not selected Lexile score
not selected Error analysis
not selected Composite scores
not selected Subscale/subtest scores
not selected Other
If other, please specify:

Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
Raw scores are provided as the student’s reading level, reported as a level from A through Z. A student’s reading level is a composite of reading accuracy and comprehension of text. Cut points for determining reading level are provided: a book level is determined to be the student’s reading level only if the student reaches both the accuracy and the comprehension cut points for that level. Developmental benchmarks for each measure, grade, and time of year (beginning, middle, end) classify each student’s score as Above Proficient, Proficient, Below Proficient, or Far Below Proficient.
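The two-gate rule described above (a book level is attained only when both the accuracy and the comprehension cut points are met) can be sketched as follows. This is a minimal illustration: the cut-point values, function names, and data shapes are assumptions, not the published mCLASS:3D scoring tables.

```python
from typing import Optional

# Example cut points for a single book level. The actual mCLASS:3D cut
# points vary by book level and are published separately; these values
# are hypothetical placeholders.
EXAMPLE_CUTS = {"accuracy": 0.90, "comprehension": 4}

def meets_level(accuracy: float, comprehension: int,
                cuts: dict = EXAMPLE_CUTS) -> bool:
    """A book level is attained only if BOTH cut points are reached."""
    return (accuracy >= cuts["accuracy"]
            and comprehension >= cuts["comprehension"])

def instructional_level(results: dict) -> Optional[str]:
    """Given {book_level: (accuracy, comprehension)} for the levels a
    student read, return the highest level (in A-Z order) at which both
    cut points were met, or None if no level was attained."""
    attained = [level for level, (acc, comp) in results.items()
                if meets_level(acc, comp)]
    return max(attained) if attained else None
```

For example, a student who meets both cut points at levels C and D but falls short at level E would be assigned level D under this rule.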
Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
Yes
ACADEMIC ONLY: Do you provide benchmarks for the slopes?
No
ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
No
What is the basis for calculating slope and percentile scores?
not selected Age norms
selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
TRC is a set of screening and progress monitoring measures for grades K-6. Text Reading and Comprehension (TRC) is an individually administered assessment using leveled readers from a book set to determine a student’s instructional reading level. During this measure, students are asked to read a benchmark book and complete a number of follow-up tasks, which may include Oral Comprehension, Retelling, and/or Written Comprehension. Assessors observe and record the student’s oral reading behaviors through the administration of TRC to determine reading accuracy, fluency and various comprehension components. The comprehension components help assessors determine whether the student understands the meaning of the text and the student’s instructional reading level. The instructional reading level, a composite score of reading accuracy and comprehension is used to classify students in one of four proficiency levels. The skills assessed in TRC include those skills that must be mastered for any student learning to read in English. The materials were subject to multiple rounds of review by content development experts and school-based professionals to ensure they were culturally relevant and free from bias. Field testing data (including qualitative feedback from educators) also indicate lack of bias in the results. The observational administration allows for flexibility with respect to issues of linguistic diversity through professional judgment based on the student’s responses and prior knowledge of his/her speech patterns. A list of approved accommodations is available upon request.

Rates of Improvement and End of Year Benchmarks

Is minimum acceptable growth (slope of improvement or average weekly increase in score by grade level) specified in your manual or published materials?
Yes
If yes, specify the growth standards:
Growth norms are available from the Center upon request. Student progress percentiles that account for students’ initial skills were used to determine how much growth, as documented by instructional reading level, to expect over the course of a school year. Amplify (2015). TRC Atlas National Growth Norms 2014 - 2015. Unpublished technical report.
Are benchmarks for minimum acceptable end-of-year performance specified in your manual or published materials?
Yes
If yes, specify the end-of-year performance standards:
Performance standards are available for three time points in the year: BOY (Beginning of Year), MOY (Middle of Year), and EOY (End of Year). The EOY rows give the end-of-year performance expectations.

Grade | TOY | Far Below Prof. | Below Prof. | Prof. | Above Prof.
K | BOY | < PC | PC | RB | ≥ A
K | MOY | ≤ RB | A | B | ≥ C
K | EOY | ≤ A | B | C-D | ≥ E
1 | BOY | ≤ A | B | C-D | ≥ E
1 | MOY | ≤ C | D-E | F-G | ≥ H
1 | EOY | ≤ E | F-H | I | ≥ J
2 | BOY | ≤ E | F-H | I | ≥ J
2 | MOY | ≤ H | I | J-K | ≥ L
2 | EOY | ≤ J | K | L-M | ≥ N
3 | BOY | ≤ J | K | L-M | ≥ N
3 | MOY | ≤ K | L-M | N | ≥ O
3 | EOY | ≤ L | M-N | O-P | ≥ Q
4 | BOY | ≤ L | M-N | O-P | ≥ Q
4 | MOY | ≤ N | O-P | Q | ≥ R
4 | EOY | ≤ P | Q | R-S | ≥ T
5 | BOY | ≤ P | Q | R-S | ≥ T
5 | MOY | ≤ Q | R-S | T | ≥ U
5 | EOY | ≤ S | T | U-V | ≥ W
6 | BOY | ≤ S | T | U-V | ≥ W
6 | MOY | ≤ U | V | W-X | ≥ Y
6 | EOY | ≤ V | W-X | Y-Z | *
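A proficiency lookup over one row of such a table can be sketched as below. The level ordering (PC and RB precede the lettered levels) and the Grade 1 EOY boundaries are taken from the table above; the function and variable names are illustrative assumptions.

```python
# Text levels in ascending order: PC and RB precede the lettered levels.
LEVELS = ["PC", "RB"] + [chr(c) for c in range(ord("A"), ord("Z") + 1)]

# Upper bounds (Far Below, Below, Proficient) for one row of the table:
# Grade 1 EOY reads "<= E | F-H | I | >= J".
GRADE1_EOY_BOUNDS = ("E", "H", "I")

def classify(level: str, bounds=GRADE1_EOY_BOUNDS) -> str:
    """Map a text reading level to its proficiency band for one
    grade/time-of-year row of the benchmark table."""
    i = LEVELS.index(level)
    far_max, below_max, prof_max = (LEVELS.index(b) for b in bounds)
    if i <= far_max:
        return "Far Below Proficient"
    if i <= below_max:
        return "Below Proficient"
    if i <= prof_max:
        return "Proficient"
    return "Above Proficient"
```

With the Grade 1 EOY row, for instance, a student at level G would fall in the Below Proficient band and a student at level M in the Above Proficient band.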
What is the basis for specifying minimum acceptable growth and end of year benchmarks?
selected
selected
not selected Other
If other, please specify:

If norm-referenced, describe the normative profile.

National representation (check all that apply):
Northeast:
not selected New England
not selected Middle Atlantic
Midwest:
not selected East North Central
not selected West North Central
South:
not selected South Atlantic
not selected East South Central
not selected West South Central
West:
not selected Mountain
not selected Pacific

Local representation (please describe, including number of states)
End-of-year benchmarks are criterion-referenced: the TRC performance standards indicate proficiency with respect to student performance against the grade-level expectations of the Common Core State Standards in English Language Arts. Cut points defining the performance standards were determined according to standard-setting procedures during a workshop held in 2015. The following report provides more information on the development of benchmarks: Amplify (2015). mCLASS:Reading 3D - Text Reading and Comprehension: Setting Rigorous Early Reading Performance Standards to Prepare Students for Career and College. Unpublished technical report. A copy of this report is available from the Center upon request.

Minimum acceptable growth is norm-referenced. Growth norms were calculated using data from the 2014-2015 school year. They account for students’ initial skills and document what constitutes average, below-average, and above-average progress. The following report provides more information on the development of growth norms: Amplify (2015). TRC Atlas National Growth Norms 2014 - 2015. Unpublished technical report. A copy of this report is available from the Center upon request.
Date
2014 - 2015 School Year
Size
164633
Gender (Percent)
Male
Female
Unknown
SES indicators (Percent)
Eligible for free or reduced-price lunch
Other SES Indicators
Race/Ethnicity (Percent)
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Asian/Pacific Islander
Other
Unknown
Disability classification (Please describe)


First language (Please describe)


Language proficiency status (Please describe)
Do you provide, in your user’s manual, norms which are disaggregated by race or ethnicity? If so, for which race/ethnicity?
not selected White, Non-Hispanic
not selected Black, Non-Hispanic
not selected Hispanic
not selected American Indian/Alaska Native
not selected Asian/Pacific Islander
not selected Other
not selected Unknown

If criterion-referenced, describe procedure for specifying criterion for adequate growth and benchmarks for end-of-year performance levels.
The TRC performance standards indicate proficiency with respect to student performance against the grade-level expectations of the Common Core State Standards in English Language Arts. Cut points defining the performance standards were determined according to standard setting procedures during a workshop held in 2015. The following report provides more information on the development of benchmarks: Amplify (2015). mCLASS:Reading 3D - Text Reading and Comprehension: Setting Rigorous Early Reading Performance Standards to Prepare Students for Career and College. Unpublished technical report. A copy of this report is available from the Center upon request.

Describe any other procedures for specifying adequate growth and minimum acceptable end of year performance.

Performance Level

Reliability

Grade Kindergarten
Grade 1
Grade 2
Grade 3
Grade 4
Grade 5
Rating Convincing evidence Convincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence Partially convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
The current submission includes Cronbach’s alpha as an indicator of internal consistency. Cronbach’s alpha quantifies the degree to which the items on an assessment all measure the same underlying construct. To avoid missing responses, it was calculated using data from students in each grade who were reading books at the grade-proficient text level. The current submission also includes evidence for inter-rater reliability. The assessment is administered by a single rater with a single student at a time, so it is important for raters to be consistent.
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Cronbach’s alpha was calculated using data from the mCLASS data system for the 2016-2017 school year. In total, 2,513 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 47 percent male, 45 percent female, and 8 percent unspecified gender; 25 percent black, 23 percent white, 20 percent Hispanic, 4 percent American Indian/Alaska Native, 4 percent multiracial, and 21 percent other or unspecified race. Thirty-two percent of the sample was eligible for free or reduced-price lunch, while eligibility was unknown for 52 percent of the sample. Study 1: Three raters assessed 33 students from two schools in two Southern states during the 2013-2014 end-of-year benchmark administration period. Among the students, representation was as follows: 8 from kindergarten, 10 from Grade 1, 4 from Grade 2, 4 from Grade 3, 2 from Grade 4, and 5 from Grade 5. The sample was 39 percent female and 61 percent male; 9 percent white, 21 percent Hispanic, 67 percent black, and 3 percent other races. The raters were two Amplify consultants. Study 2: The sample was composed of students in Grade 4 (n = 15), Grade 5 (n = 15), and Grade 6 (n = 10); 39 percent of the students were female and 61 percent male; 67 percent of students were black, 21 percent were Hispanic, 9 percent were white, and 3 percent were of other ethnicity. The raters were four Amplify consultants.
*Describe the analysis procedures for each reported type of reliability.
Cronbach’s alpha is used as the indicator of internal consistency; it quantifies the degree to which the items on an assessment all measure the same underlying construct. To avoid missing responses, students in each grade who were reading books at the grade-proficient text level were used to compute Cronbach’s alpha. The 95% confidence interval of Cronbach’s alpha is computed using the bootstrap method: 1,000 samples are drawn from the data with replacement, alpha is calculated for each sample, and the 2.5% and 97.5% quantiles are taken. In presenting inter-rater reliability (IRR) evidence, raters’ scores are typically compared using intraclass correlations (ICC). The ICC is one of the most commonly used statistics for assessing IRR on ordinal, interval, or ratio variables and is suitable for studies with two or more coders (Hallgren, 2012). Cicchetti (1994) provides cutoffs for ICC values: IRR is poor for values less than 0.40, fair for values between 0.40 and 0.59, good for values between 0.60 and 0.74, and excellent for values between 0.75 and 1.00. IRR estimates reported here are based on two or more independent assessors simultaneously scoring student performance during a single test administration (“shadow-scoring”). The reliability coefficients presented therefore represent the degree to which the administration and scoring procedures for the TRC components lead to consistent results, generalizing across administrators. Study 1: Inter-rater reliability evidence is provided for each subcomponent of the assessment: reading record accuracy, retell/recall, and oral comprehension. Grade-specific IRR is slightly lower than overall IRR due to relatively small sample sizes in each grade. All values are above 0.40, with the exceptions of reading record accuracy in Grades 2 and 4 and oral comprehension in Grade 5. According to Cicchetti’s criteria, overall IRR is classified as excellent for reading record accuracy, oral comprehension, and retell/recall.
Study 2: The IRR results for Study 2, including overall book performance, reading record accuracy, and oral comprehension performance, are provided for the entire sample and for each grade. All values are above 0.40. The lower-bound CIs of book performance for Grades 4 and 6 and the lower-bound CI of oral comprehension for Grade 6 are below 0.40, possibly due to the small sample sizes. According to Cicchetti’s criteria, overall IRR is classified as fair for overall book performance, excellent for reading record accuracy, and good for oral comprehension.
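The percentile-bootstrap procedure and the Cicchetti interpretation bands described above can be sketched as follows. This is a minimal, stdlib-only illustration under assumed data shapes (one list of item scores per student), not Amplify’s actual analysis code.

```python
import random
import statistics

def cronbach_alpha(rows):
    """rows: one list of item scores per student (equal length k).
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(rows[0])
    items = list(zip(*rows))  # one tuple of scores per item
    item_var_sum = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)

def bootstrap_ci(rows, n_boot=1000, seed=0):
    """Percentile-bootstrap 95% CI: resample students with replacement,
    recompute alpha each time, take the 2.5% and 97.5% quantiles."""
    rng = random.Random(seed)
    alphas = sorted(cronbach_alpha([rng.choice(rows) for _ in rows])
                    for _ in range(n_boot))
    return alphas[int(0.025 * n_boot)], alphas[int(0.975 * n_boot) - 1]

def cicchetti_band(icc):
    """Cicchetti (1994) interpretation bands for ICC values."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"
```

An ICC of 0.55, for instance, would be reported as "fair" and 0.80 as "excellent" under these bands.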

*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.

Type of Reliability | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
Yes
Provide citations for additional published studies.
Amplify (2015). mCLASS Reading3D - Amplify Atlas Book Set Technical Manual, 2nd Edition. Brooklyn, NY: Author.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Reliability | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating Partially convincing evidence d Convincing evidence d Convincing evidence d Convincing evidence d Convincing evidence d Convincing evidence d
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
DIBELS Next was chosen as the criterion measure. DIBELS Next measures are brief, powerful indicators of foundational early literacy skills that are quick to administer and score; serve as universal screening (or benchmark assessment) and progress monitoring; identify students in need of intervention support; evaluate the effectiveness of interventions; and support the RtI/multi-tiered model. DIBELS Next includes six measures: First Sound Fluency (FSF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), DIBELS Oral Reading Fluency (DORF), and Daze. An overall composite score is calculated from a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is an appropriate criterion measure given the strong reliability and validity evidence demonstrated across studies (see the DIBELS Next technical manual for details: Good et al., 2013), and its Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills, and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
The current submission includes evidence for concurrent and predictive validity. Concurrent validity was calculated using data from the mCLASS data system for the 2016-2017 school year. In total, 182,259 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 47 percent male, 45 percent female, and 8 percent unspecified gender; 30 percent Hispanic, 22 percent black, 18 percent white, 3 percent Asian, Native Hawaiian or Pacific Islander, 3 percent multiracial, 0.3 percent American Indian/Alaska Native, and 24 percent other race or unspecified race. Twenty-nine percent of the sample is eligible for free and reduced lunch, while eligibility for free and reduced lunch for 52% of the sample is unknown. Predictive validity was calculated using data from the mCLASS data system for the 2016-2017 school year. In total, 173,224 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 47 percent male, 45 percent female, and 8 percent unspecified gender; 32 percent Hispanic, 22 percent black, 19 percent white, 3 percent Asian, Native Hawaiian or Pacific Islander, 3 percent multiracial, 0.4 percent American Indian/Alaska Native, and 21 percent other race or unspecified race. Thirty-two percent of the sample is eligible for free and reduced lunch, while eligibility for free and reduced lunch for 53% of the sample is unknown.
*Describe the analysis procedures for each reported type of validity.
Evidence of concurrent validity is often presented as a correlation between the assessment and an external criterion measure. Instructional reading levels determined from the administration of the Atlas edition of TRC should correlate highly with other accepted procedures and measures that determine overall reading achievement, including accuracy and comprehension. The degree of correlation between two conceptually related, concurrently administered tests suggests that the tests measure the same underlying psychological constructs or processes. The correlation of final instructional reading level on TRC with the Composite score on DIBELS Next at the end of year is computed to provide concurrent validity evidence. Predictive validity provides an estimate of the extent to which student performance on TRC predicts scores on the criterion measure administered at a later point in time, defined as more than three months in this study. The correlation of final instructional reading level on TRC at the middle of year with the Composite score from the subsequent administration of DIBELS Next at the end of year is computed to provide predictive validity evidence. For both analyses, the 95% confidence interval of the correlation is computed using the “stats” package in R (R Development Core Team, 2017).
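R's cor.test() in the "stats" package reports a Fisher z-transform confidence interval for Pearson's r; the same computation can be sketched as follows. This is an illustration of the standard formula, not the exact code used in the study.

```python
import math
import numpy as np

def pearson_ci(x, y):
    """Pearson r with a 95% Fisher z-transform confidence interval,
    mirroring the interval reported by R's cor.test()."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    r = float(np.corrcoef(x, y)[0, 1])
    z = math.atanh(r)                  # Fisher z-transform of r
    se = 1.0 / math.sqrt(n - 3)        # approximate standard error of z
    zcrit = 1.959963984540054          # normal quantile, qnorm(0.975)
    lo = math.tanh(z - zcrit * se)     # back-transform the bounds
    hi = math.tanh(z + zcrit * se)
    return r, lo, hi
```

Because the interval is built on the z scale and back-transformed, it is asymmetric around r and stays within (-1, 1), which is why the reported lower bounds can sit close to, but never at, the coefficient itself.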

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Validity | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
Yes
Provide citations for additional published studies.
Amplify (2015). mCLASS Reading3D - Amplify Atlas Book Set Technical Manual, 2nd Edition. Brooklyn, NY: Author.
Describe the degree to which the provided data support the validity of the tool.
The table above summarizes the concurrent and predictive validity evidence for each grade. Across Grades K to 6, concurrent validity coefficients range from 0.71 to 0.82, demonstrating strong correlations between final instructional reading level on TRC and the DIBELS Next Composite score at the end of year. The lower bounds of the 95% confidence intervals are all above 0.70. Across Grades K to 6, predictive validity coefficients are in the range of 0.59 to 0.79. The lower bounds of the 95% confidence intervals are above 0.70 for Grades 1 to 6. The correlation with the DIBELS Next Composite score is slightly lower in Kindergarten than in the other grades, possibly because text levels are much less variable in the lower grades due to a floor effect in Kindergarten. It is also possible that predictive validity is not as strong in Kindergarten due to the nature of reading growth at that age: students’ literacy skills at this time are widely variable and change rapidly in response to instruction.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
Yes

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Validity | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
Yes
Provide citations for additional published studies.
Amplify (2015). mCLASS Reading3D - Amplify Atlas Book Set Technical Manual, 2nd Edition. Brooklyn, NY: Author.

Bias Analysis

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating No No No No No No
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
No
If yes,
a. Describe the method used to determine the presence or absence of bias:
b. Describe the subgroups for which bias analyses were conducted:
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.

Growth Standards

Sensitivity: Reliability of Slope

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating Partially convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
Describe the sample, including size and characteristics. Please provide documentation showing that the sample was composed of students in need of intensive intervention. A sample of students with intensive needs should satisfy one of the following criteria: (1) all students scored below the 30th percentile on a local or national norm, or the sample mean on a local or national test fell below the 25th percentile; (2) students had an IEP with goals consistent with the construct measured by the tool; or (3) students were non-responsive to Tier 2 instruction. Evidence based on an unknown sample, or a sample that does not meet these specifications, may not be considered.
The sample of students included in analyses of slope of improvement were those identified as in need of intensive intervention based on their DIBELS Next Composite scores. All students included scored below the Well Below Benchmark cut point, a research-based criterion indicative of the need for intensive instructional supports and ongoing progress monitoring. The odds of students who obtain scores in the Well Below Benchmark zone meeting grade-level expectations at the next screening or benchmarking period without intensive intervention supports are 10 to 20 percent (Good, R. H., Kaminski, R., Dewey, E., Wallin, J., Powell-Smith, K., & Latimer, R. (2013). DIBELS Next Technical Manual. Eugene, OR: Dynamic Measurement Group, Inc.). In total, 8,415 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 58 percent male, 43 percent female, and 8 percent unspecified gender; 37 percent black, 23 percent Asian/Pacific Islander, 18 percent Hispanic, 8 percent white, 1 percent American Indian/Alaska Native, 4 percent multiracial, and 19 percent other race or unspecified race. Thirty-six percent of the sample is eligible for free and reduced lunch, while eligibility for free and reduced lunch for 54% of the sample is unknown.
Describe the frequency of measurement (for each student in the sample, report how often data were collected and over what span of time).
Reliability of slope was calculated for students who had multiple TRC data points over the course of a school year, which amounts to an average of one data point every three weeks.
Describe the analysis procedures.
HLM was used to compute the reliability and validity of the slope. Reliability of the slope is the ratio of the true score variance to the total variance. The true score variance is the random slope variance from a multilevel regression with a random intercept and random slope; the total variance is the estimate of the total variance of each student’s individual slope of improvement. The correlation of the slope of improvement with the Composite score from the administration of DIBELS Next at the end of year is computed to provide validity evidence. The 95% confidence interval of the correlation is computed using the “stats” package in R (R Development Core Team, 2017).
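The ratio of true slope variance to total slope variance that HLM estimates can be illustrated with a simplified method-of-moments version: fit an ordinary least-squares slope per student, estimate each slope's sampling variance, and subtract the average sampling variance from the observed between-student slope variance. This is a sketch of the idea, not the HLM estimator itself, and the function name is ours.

```python
import numpy as np

def slope_reliability(times, scores) -> float:
    """Illustrative ratio of true slope variance to total slope variance.
    times: (n_timepoints,); scores: (n_students, n_timepoints)."""
    t = np.asarray(times, dtype=float)
    Y = np.asarray(scores, dtype=float)
    tc = t - t.mean()                        # centered time predictor
    sxx = (tc ** 2).sum()
    slopes = Y @ tc / sxx                    # per-student OLS slopes
    fitted = Y.mean(axis=1, keepdims=True) + np.outer(slopes, tc)
    resid_var = ((Y - fitted) ** 2).sum(axis=1) / (len(t) - 2)
    total_var = slopes.var(ddof=1)           # observed slope variance
    error_var = (resid_var / sxx).mean()     # mean slope sampling variance
    true_var = max(total_var - error_var, 0.0)
    return true_var / total_var
```

With roughly one observation every three weeks across a school year, each student contributes enough points that the sampling variance of the individual slope is small relative to the between-student variance, which is what drives the ratio toward 1.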

In the table below, report reliability of the slope (e.g., ratio of true slope variance to total slope variance) by grade level (if relevant).

Type of Reliability | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability of the slope data that is disaggregated by subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?
No

If yes, fill in data for each subgroup with disaggregated reliability of the slope data.

Type of Reliability | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Sensitivity: Validity of Slope

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating Convincing evidence Convincing evidence Convincing evidence Partially convincing evidence Convincing evidence Partially convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
DIBELS Next was chosen as the criterion measure. DIBELS Next measures are brief, powerful indicators of foundational early literacy skills that are quick to administer and score; serve as universal screening (or benchmark assessment) and progress monitoring; identify students in need of intervention support; evaluate the effectiveness of interventions; and support the RtI/multi-tiered model. DIBELS Next includes six measures: First Sound Fluency (FSF), Letter Naming Fluency (LNF), Phoneme Segmentation Fluency (PSF), Nonsense Word Fluency (NWF), DIBELS Oral Reading Fluency (DORF), and Daze. An overall composite score is calculated from a student’s scores on grade-specific measures to provide an overall indication of literacy skills. DIBELS Next is an appropriate criterion measure given the strong reliability and validity evidence demonstrated across studies (see the DIBELS Next technical manual for details: Good et al., 2013), and its Composite Score is a powerful indicator of overall reading skill. DIBELS Next serves a similar purpose to TRC: to provide an indicator of risk or proficiency with grade-appropriate reading skills, and to measure growth in reading skills over time. While both DIBELS Next and TRC are available within the mCLASS platform, they are completely separate assessments.
Describe the sample(s), including size and characteristics. Please provide documentation showing that the sample was composed of students in need of intensive intervention. A sample of students with intensive needs should satisfy one of the following criteria: (1) all students scored below the 30th percentile on a local or national norm, or the sample mean on a local or national test fell below the 25th percentile; (2) students had an IEP with goals consistent with the construct measured by the tool; or (3) students were non-responsive to Tier 2 instruction. Evidence based on an unknown sample, or a sample that does not meet these specifications, may not be considered.
The sample of students included in analyses of slope of improvement were those identified as in need of intensive intervention based on their DIBELS Next Composite scores. All students included scored below the Well Below Benchmark cut point, a research-based criterion indicative of the need for intensive instructional supports and ongoing progress monitoring. The odds of students who obtain scores in the Well Below Benchmark zone meeting grade-level expectations at the next screening or benchmarking period without intensive intervention supports are 10 to 20 percent (Good, R. H., Kaminski, R., Dewey, E., Wallin, J., Powell-Smith, K., & Latimer, R. (2013). DIBELS Next Technical Manual. Eugene, OR: Dynamic Measurement Group, Inc.). In total, 7,741 students were included from 18 states across the following geographic divisions: East North Central, East South Central, Middle Atlantic, Mountain, New England, Pacific, South Atlantic, West North Central, and West South Central. The sample was composed of participants from the following demographic categories: 58 percent male, 43 percent female, and 8 percent unspecified gender; 37 percent black, 23 percent Asian/Pacific Islander, 18 percent Hispanic, 8 percent white, 1 percent American Indian/Alaska Native, 4 percent multiracial, and 19 percent other race or unspecified race. Thirty-six percent of the sample is eligible for free and reduced lunch, while eligibility for free and reduced lunch for 54% of the sample is unknown.
Describe the frequency of measurement (for each student in the sample, report how often data were collected and over what span of time).
Validity of slope was calculated for students who had multiple TRC data points over the course of a school year, which amounts to an average of one data point every three weeks.
Describe the analysis procedures for each reported type of validity.
HLM was used to compute the reliability and validity of the slope. Reliability of the slope is the ratio of the true score variance to the total variance. The true score variance is the random slope variance from a multilevel regression with a random intercept and random slope; the total variance is the estimate of the total variance of each student’s individual slope of improvement. The correlation of the slope of improvement with the Composite score from the administration of DIBELS Next at the end of year is computed to provide validity evidence. The 95% confidence interval of the correlation is computed using the “stats” package in R (R Development Core Team, 2017).

In the table below, report predictive validity of the slope (correlation between the slope and achievement outcome) by grade level (if relevant).
NOTE: The TRC suggests controlling for initial level when the correlation for slope without such control is not adequate.

Type of Validity | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
The table above summarizes the evidence for validity of slope for each grade. Across Grades K through 6, coefficients range from 0.435 to 0.770, demonstrating a relationship with the external measure and sensitivity to student learning.
Do you have validity of the slope data that is disaggregated by subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?
No

If yes, fill in data for each subgroup with disaggregated validity of the slope data.

Type of Validity | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% CI Lower Bound | 95% CI Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.

Alternate Forms

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating Unconvincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence Convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
Describe the sample for these analyses, including size and characteristics:
Methods: To provide empirical evidence for the comparability of books (i.e., alternate-form reliability) for students who are at risk overall on TRC, student performance data collected during the 2016-2017 end-of-year window were submitted to ANOVAs for each text complexity level (i.e., A through Z), with individual books serving as the between-subjects factor and student final performance on each book administered as the dependent variable.

Results: The range of eta-squared values (η²) resulting from the analysis of at-risk student performance on the books at each text complexity level is presented below, along with corresponding sample sizes and numbers of books. Interpreting these η² values as effect sizes, a very small, nonsignificant amount of variability is accounted for by differences in the books administered at each grade; alternate-form reliability of TRC for at-risk students is therefore demonstrated.

Reliability Type | Grade | n (Books) | n (Students) | η² (range)
Alt. Form | K | 9 | 7,410 | 0.00-0.04
Alt. Form | 1 | 28 | 8,362 | 0.01-0.04
Alt. Form | 2 | 36 | 8,736 | 0.00-0.05
Alt. Form | 3 | 36 | 4,594 | 0.01-0.05
Alt. Form | 4 | 45 | 4,116 | 0.00-0.07
Alt. Form | 5 | 54 | 2,836 | 0.00-0.05
Alt. Form | 6 | 29 | 381 | 0.00-0.08
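The η² effect size reported above is SS_between divided by SS_total from a one-way ANOVA, with books as the groups. A minimal sketch follows; the function name and inputs are illustrative, not the study's actual analysis code.

```python
import numpy as np

def eta_squared(groups) -> float:
    """Eta-squared effect size for a one-way ANOVA: SS_between / SS_total.
    groups is a list of score arrays, e.g. one array per book
    at a given text complexity level."""
    groups = [np.asarray(g, dtype=float) for g in groups]
    allv = np.concatenate(groups)
    grand = allv.mean()
    ss_total = ((allv - grand) ** 2).sum()
    ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
    return ss_between / ss_total
```

An η² near zero, as in the table above, means book-to-book differences explain almost none of the variance in student performance at a level, which is the alternate-form argument.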
What is the number of alternate forms of equal and controlled difficulty?
There are more than 300 books available at each text complexity level in TRC, as the mCLASS home website allows teachers to add additional progress monitoring books to the system. Development of these books according to the text leveling gradient specified previously ensures comparability of content among books at each level.
If IRT based, provide evidence of item or ability invariance
If computer administered, how many items are in the item bank for each grade level?
If your tool is computer administered, please note how the test forms are derived instead of providing alternate forms:

Decision Rules: Setting & Revising Goals

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating Unconvincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
In your manual or published materials, do you specify validated decision rules for how to set and revise goals?
Yes
If yes, specify the decision rules:
We recommend using a goal-oriented rule for evaluating a student’s response to intervention that is straightforward for teachers to understand and use. Decisions about a student’s progress are based on comparing TRC scores plotted on a graph against the aimline, the expected rate of progress as determined by either research-based benchmark expectations or expectations for growth. Goals should only be increased, never decreased. If a student’s progress exceeds the expected goal, the goal should be increased and/or the intervention or instruction modified to be less intensive.
What is the evidentiary basis for these decision rules?
NOTE: The TRC expects evidence for this standard to include an empirical study that compares a treatment group to a control and evaluates whether student outcomes increase when decision rules are in place.
This recommended decision rule is based on early work with CBM (Fuchs, 1988, 1989) and precision teaching (White & Haring, 1980) and allows for a minimum of three data points to be gathered before any decision is made. Fuchs, L. S. (1988). Effects of computer-managed instruction on teachers' implementation of systematic monitoring programs and student achievement. Journal of Educational Research, 81, 294-304. Fuchs, L. S. (1989). Evaluating solutions: Monitoring progress and revising intervention plans. In M. R. Shinn (Ed.), Curriculum-based measurement: Assessing special children. New York: Guilford Press. White, O. R., & Haring, N. G. (1980). Exceptional teaching (2nd ed.). Columbus, OH: Merrill.

Decision Rules: Changing Instruction

Grade Kindergarten Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Rating Unconvincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence Unconvincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
In your manual or published materials, do you specify validated decision rules for when changes to instruction need to be made?
Yes
If yes, specify the decision rules:
In general, it is recommended that support be continued until a student achieves at least three points at or above the goal. If a decision is made to discontinue support, it is recommended that progress monitoring continue weekly for at least one month to ensure that the student is able to maintain growth without the supplemental support. The frequency of progress monitoring can then be faded gradually as the child’s progress continues to be sufficient. We suggest that educational professionals consider instructional modifications when student performance falls below the aimline for three consecutive points.
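The two rules above (three consecutive points below the aimline as a trigger for modifying instruction, and at least three points at or above the goal before fading support) can be expressed as simple checks. The function names and thresholds here are illustrative conveniences, not part of the mCLASS software.

```python
def needs_instructional_change(scores, aimline, k=3):
    """True if the last k progress-monitoring points all fall below
    the aimline; scores and aimline are parallel sequences of equal
    length (one aimline value per monitoring date)."""
    if len(scores) < k:
        return False                      # too few points to decide
    recent = list(zip(scores, aimline))[-k:]
    return all(s < a for s, a in recent)

def can_fade_support(scores, goal, k=3):
    """True once at least k monitoring points are at or above the goal,
    the stated condition for considering discontinuation of support."""
    return sum(s >= goal for s in scores) >= k
```

Both checks require a minimum of three data points, consistent with the evidentiary basis cited below that no decision is made before three points are gathered.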
What is the evidentiary basis for these decision rules?
NOTE: The TRC expects evidence for this standard to include an empirical study that compares a treatment group to a control and evaluates whether student outcomes increase when decision rules are in place.
This recommended decision rule is based on early work with CBM (Fuchs, 1988, 1989) and precision teaching (White & Haring, 1980) and allows for a minimum of three data points to be gathered before any decision is made. Fuchs, L. S. (1988). Effects of computer-managed instruction on teachers' implementation of systematic monitoring programs and student achievement. Journal of Educational Research, 81, 294-304. Fuchs, L. S. (1989). Evaluating solutions: Monitoring progress and revising intervention plans. In M. R. Shinn (Ed.), Curriculum-based measurement: Assessing special children. New York: Guilford Press. White, O. R., & Haring, N. G. (1980). Exceptional teaching (2nd ed.). Columbus, OH: Merrill.

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.