Lexia RAPID Assessment
Reading
Summary
Lexia RAPID™ Assessment for Grades K-12 helps teachers and educational leaders make decisions that promote reading success. This research-based, computer-adaptive reading and language screening assessment allows educators to gather predictive, norm-referenced data up to three times a year, with immediate scoring and reports. RAPID for Grades K-2 measures students’ foundational skills in the key reading and language domains of Word Recognition, Academic Language, and Reading Comprehension. RAPID for Grades 3-12 measures complex knowledge, understanding, and application of skills within these domains. Combining the resources and knowledge of two organizations committed to literacy research and innovation, RAPID was developed through an ongoing partnership between Lexia Learning and the Florida Center for Reading Research (FCRR). Educators can use RAPID to screen K-12 students, to identify students who need more support to reach a rigorous end-of-year goal, and to identify students’ strengths and weaknesses in word recognition, academic language, and reading comprehension. In addition, RAPID provides a robust bank of high-quality instructional materials that address common skill gaps.
- Where to Obtain:
- Developer: Lexia Learning and Florida Center for Reading Research
- Publisher: Lexia Learning, A Rosetta Stone Company
- support@lexialearning.com
- 300 Baker Avenue, Concord, MA 01742
- 800-435-3942
- www.lexialearning.com
- Initial Cost:
- $7.20 per student
- Replacement Cost:
- Contact vendor for pricing details.
- Included in Cost:
- Replacement cost: annual licensing rates are subject to change. RAPID Assessment is accessed through an annual subscription-based service. Licenses can be purchased at the individual student level or through a school site license model. Pricing rates are subject to volume and duration. RAPID can be administered through a browser or on an iPad, so a device (computer or iPad) and a stable Internet connection are required to administer the assessment. Headphones are also strongly recommended. RAPID administration training is very brief (less than 10 minutes) and can be completed independently by teachers or paraprofessional educators. Training on all aspects of RAPID is available free to all customers through a series of asynchronous training-on-demand modules, as well as several guides and training resource manuals. For additional training around RAPID tasks, scores, and reports, schools can purchase an Implementation Support Package for $3,500.00. This package provides access to ongoing in-person training and remote support throughout the year. Individual trainings can also be purchased in a remote or in-person format. Bundle pricing for licenses plus services is available.
- Designed for ease of use, RAPID promotes students’ access to assessment with computer-adaptive tasks and flexible administration options for timing and setting. These options can be utilized by any student. While RAPID for grades 3-12 is typically administered in a group setting and RAPID for grades K-2 is administered one-on-one, all students can take the untimed RAPID assessment at any time, in any school setting where a stable internet connection is available. Allowable accommodations for RAPID enable students to participate in the assessment by adjusting assessment presentation and student response modes. These accommodations do not impact the reliability and validity of RAPID’s measurement of student reading abilities. A student’s Individualized Education Plan (IEP), Section 504 Plan, or other formal documentation will state the student’s eligibility and need for specific accommodations. At the discretion of educators and administrators, allowable accommodations may also be utilized for students with special circumstances. To ensure that students with accommodations have equal access to the assessment, it is important that accommodations for RAPID be consistent with those used for classroom assessment. Examples of allowable accommodations include visual magnification, use of touch-screen overlays, providing student placeholders and/or scrap paper, assisting with practice questions, and student dictation of responses. Additional allowable accommodations can be found in the RAPID Administration and Accommodations Resource, available to all RAPID customers.
- Training Requirements:
- Less than 1 hr of training
- Qualified Administrators:
- No minimum qualifications specified.
- Access to Technical Support:
- All customers have access to Lexia’s Customer Support team, which is available by phone (800-507-2772) or email (support@lexialearning.com) Monday through Friday, 8:00 a.m. to 6:00 p.m. EST. For an additional cost, schools can purchase a yearlong Implementation Support Package (ISP), which includes assessment planning support as well as in-person and remote training. In-person or remote a la carte trainings can also be purchased separately.
- Assessment Format:
- Direct: Computerized
- Scoring Time:
- Scoring is automatic
- Scores Generated:
- Percentile score
- IRT-based score
- Probability
- Administration Time:
- 40 minutes per group
- Scoring Method:
- Automatically (computer-scored)
- Technology Requirements:
- Computer or tablet
- Internet connection
- Accommodations:
- Designed for ease of use, RAPID promotes students’ access to assessment with computer-adaptive tasks and flexible administration options for timing and setting. These options can be utilized by any student. While RAPID for grades 3-12 is typically administered in a group setting and RAPID for grades K-2 is administered one-on-one, all students can take the untimed RAPID assessment at any time, in any school setting where a stable internet connection is available. Allowable accommodations for RAPID enable students to participate in the assessment by adjusting assessment presentation and student response modes. These accommodations do not impact the reliability and validity of RAPID’s measurement of student reading abilities. A student’s Individualized Education Plan (IEP), Section 504 Plan, or other formal documentation will state the student’s eligibility and need for specific accommodations. At the discretion of educators and administrators, allowable accommodations may also be utilized for students with special circumstances. To ensure that students with accommodations have equal access to the assessment, it is important that accommodations for RAPID be consistent with those used for classroom assessment. Examples of allowable accommodations include visual magnification, use of touch-screen overlays, providing student placeholders and/or scrap paper, assisting with practice questions, and student dictation of responses. Additional allowable accommodations can be found in the RAPID Administration and Accommodations Resource, available to all RAPID customers.
Descriptive Information
- Please provide a description of your tool:
- Lexia RAPID™ Assessment for Grades K-12 helps teachers and educational leaders make decisions that promote reading success. This research-based, computer-adaptive reading and language screening assessment allows educators to gather predictive, norm-referenced data up to three times a year, with immediate scoring and reports. RAPID for Grades K-2 measures students’ foundational skills in the key reading and language domains of Word Recognition, Academic Language, and Reading Comprehension. RAPID for Grades 3-12 measures complex knowledge, understanding, and application of skills within these domains. Combining the resources and knowledge of two organizations committed to literacy research and innovation, RAPID was developed through an ongoing partnership between Lexia Learning and the Florida Center for Reading Research (FCRR). Educators can use RAPID to screen K-12 students, to identify students who need more support to reach a rigorous end-of-year goal, and to identify students’ strengths and weaknesses in word recognition, academic language, and reading comprehension. In addition, RAPID provides a robust bank of high-quality instructional materials that address common skill gaps.
ACADEMIC ONLY: What skills does the tool screen?
- Please describe specific domain, skills or subtests:
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
Acquisition and Cost Information
Administration
- Are norms available?
- Yes
- Are benchmarks available?
- No
- If yes, how many benchmarks per year?
- If yes, for which months are benchmarks available?
- BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
- If yes, how many students can be rated concurrently?
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- Less than 1 hr of training
- Please describe the minimum qualifications an administrator must possess.
- No minimum qualifications
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- All customers have access to Lexia’s Customer Support team, which is available by phone (800-507-2772) or email (support@lexialearning.com) Monday through Friday, 8:00 a.m. to 6:00 p.m. EST. For an additional cost, schools can purchase a yearlong Implementation Support Package (ISP), which includes assessment planning support as well as in-person and remote training. In-person or remote a la carte trainings can also be purchased separately.
Scoring
- Do you provide basis for calculating performance level scores?
- Yes
- Does your tool include decision rules?
- Yes
- If yes, please describe.
- Schools can use the assessment to evaluate students at risk for difficulty on individual tasks or across all tasks. Percentile rank scores allow educators to look at students’ individual task performance and identify students at risk for significant difficulty in a particular area. Phonological Awareness in Kindergarten, Word Reading in first and second grades, and Reading Comprehension in grades three and beyond are key tasks that can be used to identify students who demonstrate reading and comprehension needs. The RAPID reports place students into instructional groups based on their percentile rank on the assessment tasks. For grades K-2, scores below the 30th percentile place the student into specific, task-focused skill groups based on areas of learning opportunity. For grades 3-12, RAPID uses the 25th percentile for placement into instructional groups to ensure students receive additional literacy support in their respective areas of need (see the illustrative sketch below).
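A minimal sketch of the placement rule described above, assuming only the two percentile cut-points named in the text; the function name and group labels are illustrative, not RAPID’s actual report wording.

```python
def instructional_group(grade: str, percentile: float) -> str:
    """Illustrative placement logic for the decision rules described above.

    Grades K-2 use a 30th-percentile cut; grades 3-12 use a 25th-percentile
    cut. Function name and labels are hypothetical, not Lexia's wording.
    """
    cut_point = 30 if grade in {"K", "1", "2"} else 25
    if percentile < cut_point:
        return "task-focused skill group"  # flagged for additional support
    return "core instruction"

# Example: a 2nd grader at the 28th percentile is flagged for support.
print(instructional_group("2", 28))  # -> task-focused skill group
```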
- Can you provide evidence in support of multiple decision rules?
- No
- If yes, please describe.
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- Note that administration time varies substantially by grade level; the number given in the implementation information section is an average across grades. Individual and group administration timing estimates: Grades K-2, 15-30 minutes per individual; Grades 3-12, 30-45 minutes per group.
- Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
- As a computer-adaptive screening tool, RAPID efficiently gathers information on student needs in a short period of time. The computerized nature allows each task to be standardized in a way that minimizes discrepancies across test administrations and maximizes interrater reliability, which enhances equity for all students. All Grades 3-12 tasks and half of the K-2 screening tasks are computer scored. For the portion of K-2 tasks that are teacher administered, the scoring process can be taught in a three-minute video and allows teachers to score student responses in a way that is unknown to the student, to avoid any impact of feedback on student performance. The adaptive nature of the assessment allows for short administration times, with some tasks reliably estimating performance for some students in as few as 5 items (a generic adaptive-loop sketch follows this section). This process reduces student frustration and boredom, which maximizes standardization and reliability. As each task is on its own scale, individual areas can be measured over the course of the year. Multiple steps were taken during content and psychometric development to ensure that RAPID is appropriate for use with a wide range of students, including those from culturally and linguistically diverse backgrounds and those with a range of disabilities. As part of the item review process, a varied group of educators read and reviewed the item passages and content (at least three individuals per item/passage). Part of their review addressed issues associated with bias and sensitivity. After norming, DIF analyses were conducted to evaluate bias for subgroups. In addition, continuous review of items for bias and sensitivity is conducted on an ongoing basis. As noted in Question C3, a number of standard accommodations are allowable for all students, as well as a number of additional accommodations that can be provided for students with documented needs.
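The adaptive administration described above follows the general logic of computer-adaptive testing. The sketch below is a generic, illustrative CAT loop under a Rasch model, not RAPID’s published algorithm: it administers the most informative remaining item, re-estimates ability, and stops once the estimate is precise enough, which is how a student can finish in as few as 5 items.

```python
import math
import random

def prob_correct(theta: float, b: float) -> float:
    """Rasch-model probability of a correct response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses):
    """Grid-search ML estimate of ability and its standard error."""
    grid = [g / 10.0 for g in range(-40, 41)]
    def loglik(t):
        return sum(math.log(prob_correct(t, b)) if x
                   else math.log(1.0 - prob_correct(t, b))
                   for b, x in responses)
    theta = max(grid, key=loglik)
    info = sum(prob_correct(theta, b) * (1.0 - prob_correct(theta, b))
               for b, _ in responses)          # Fisher information
    return theta, 1.0 / math.sqrt(info)        # SE = 1 / sqrt(information)

def run_cat(bank, true_theta, se_threshold=0.6, min_items=5, max_items=20):
    """Administer items adaptively until the ability estimate is precise."""
    responses = []
    theta = 0.0                                # start at the scale midpoint
    for _ in range(max_items):
        used = {b for b, _ in responses}
        item = min((b for b in bank if b not in used),
                   key=lambda d: abs(d - theta))   # most informative item
        correct = 1 if random.random() < prob_correct(true_theta, item) else 0
        responses.append((item, correct))
        theta, se = estimate_theta(responses)
        if len(responses) >= min_items and se < se_threshold:
            break                              # precise enough; stop early
    return theta, se, len(responses)

bank = [i / 4.0 for i in range(-12, 13)]       # item difficulties -3..+3
print(run_cat(bank, true_theta=1.0))
```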
Technical Standards
Classification Accuracy & Cross-Validation Summary
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Classification Accuracy Fall | | | | | | | | | | | |
Classification Accuracy Winter | | | | | | | | | | | |
Classification Accuracy Spring | | | | | | | | | | | |
Stanford Early School Achievement Test (SESAT)
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- The Stanford Early School Achievement Test (SESAT) was used for students in Kindergarten. The criterion measure was external to the screening tool system and represents an assessment of general reading ability.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- The classification accuracy of scores from RAPID as it pertains to risk status on the SESAT/Stanford 10/FCAT/OST ELA was evaluated by dichotomizing screening task scores as ‘1’ for not at-risk for reading difficulties and ‘0’ for at-risk for reading difficulties; students could then be classified based on their dichotomized performances on both RAPID and the SESAT/Stanford 10. As such, students could be identified as not at-risk on the screening tasks and demonstrating grade-level performance on the outcome measures (i.e., specificity or true negatives), at-risk on the screening task scores and below grade-level performance on the outcome measures (i.e., sensitivity or true positives), not at-risk based on the screening task scores but not at grade level on the outcome measures (i.e., false-negative error), or at-risk on the screening task scores but at grade level on the outcome measures (i.e., false-positive error). Decisions concerning appropriate cut-points for screening measures are based on the level of correct classification that is desired from the screening assessments. A variety of statistics were examined through the application of a confusion matrix (e.g., sensitivity, specificity, positive and negative predictive power); a worked example appears at the end of this subsection. The area under the curve (AUC) was examined through logistic regressions, where students’ performance on the SESAT/Stanford 10/FCAT/OST ELA was coded as ‘1’ for performance at or above the 20th percentile and ‘0’ for scores below this target. This dichotomous variable was then regressed on RAPID’s screening tasks at each grade level. The 20th percentile of the SESAT/Stanford-10/FCAT/OST ELA was chosen as the cut-point. Because these are nationally normed assessments of reading comprehension, low performance indicates that students are performing significantly below the expectations of their grade level and will likely need supplemental support to achieve grade-level success. The 40th percentile is often denoted as generally at-risk, so performance at or below the 20th percentile indicates a significant level of risk aligning to intensive need (Foorman, Petscher, Lefsky, & Toste, 2010).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- No
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
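A minimal sketch of how the confusion-matrix statistics reported in this review (sensitivity, specificity, predictive power, and related rates) are derived from the four classification counts. The function name is illustrative; the counts used are the Kindergarten winter cells reported in the tables below (TP = 24, FP = 17, FN = 15, TN = 117).

```python
def screening_stats(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive screening accuracy statistics from confusion-matrix counts."""
    n = tp + fp + fn + tn
    return {
        "base_rate": (tp + fn) / n,                  # proportion truly at-risk
        "overall_classification_rate": (tp + tn) / n,
        "sensitivity": tp / (tp + fn),               # at-risk students caught
        "specificity": tn / (tn + fp),               # not-at-risk students cleared
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (tn + fn),
    }

# Kindergarten, winter: TP=24, FP=17, FN=15, TN=117 (see tables below).
# Reproduces the reported values, e.g., sensitivity 0.62, specificity 0.87.
for name, value in screening_stats(24, 17, 15, 117).items():
    print(f"{name}: {value:.2f}")
```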
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Stanford Achievement Test 10 (SAT-10)
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- The Stanford Achievement Test 10 (Stanford-10) was used for 1st, 2nd, 3rd, 5th, 6th, and 8th grades. The Stanford Achievement Test Series was chosen because it reflects a global reading outcome that focuses on reading and comprehension. The criterion measure was external to the screening tool system and represents an assessment of general reading ability.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- The classification accuracy of scores from RAPID as it pertains to risk status on the SESAT/Stanford 10/FCAT/OST ELA was evaluated by dichotomizing screening task scores as ‘1’ for not at-risk for reading difficulties and ‘0’ for at-risk for reading difficulties; students could then be classified based on their dichotomized performances on both RAPID and the SESAT/Stanford 10. As such, students could be identified as not at-risk on the screening tasks and demonstrating grade-level performance on the outcome measures (i.e., specificity or true negatives), at-risk on the screening task scores and below grade-level performance on the outcome measures (i.e., sensitivity or true positives), not at-risk based on the screening task scores but not at grade level on the outcome measures (i.e., false-negative error), or at-risk on the screening task scores but at grade level on the outcome measures (i.e., false-positive error). Decisions concerning appropriate cut-points for screening measures are based on the level of correct classification that is desired from the screening assessments. A variety of statistics were examined through the application of a confusion matrix (e.g., sensitivity, specificity, positive and negative predictive power). The area under the curve (AUC) was examined through logistic regressions, where students’ performance on the SESAT/Stanford 10/FCAT/OST ELA was coded as ‘1’ for performance at or above the 20th percentile and ‘0’ for scores below this target. This dichotomous variable was then regressed on RAPID’s screening tasks at each grade level. The 20th percentile of the SESAT/Stanford-10/FCAT/OST ELA was chosen as the cut-point. Because these are nationally normed assessments of reading comprehension, low performance indicates that students are performing significantly below the expectations of their grade level and will likely need supplemental support to achieve grade-level success. The 40th percentile is often denoted as generally at-risk, so performance at or below the 20th percentile indicates a significant level of risk aligning to intensive need (Foorman, Petscher, Lefsky, & Toste, 2010).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- No
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Florida Comprehensive Assessment Test (FCAT)
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- The Florida Comprehensive Assessment Test (FCAT) was used for students in 4th, 7th, 9th, and 10th grades. The criterion measure was external to the screening tool system and represents an assessment of general reading ability.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- The classification accuracy of scores from RAPID as it pertains to risk status on the SESAT/Stanford 10/FCAT/OST ELA was evaluated by dichotomizing screening task scores as ‘1’ for not at-risk for reading difficulties and ‘0’ for at-risk for reading difficulties; students could then be classified based on their dichotomized performances on both RAPID and the SESAT/Stanford 10. As such, students could be identified as not at-risk on the screening tasks and demonstrating grade-level performance on the outcome measures (i.e., specificity or true negatives), at-risk on the screening task scores and below grade-level performance on the outcome measures (i.e., sensitivity or true positives), not at-risk based on the screening task scores but not at grade level on the outcome measures (i.e., false-negative error), or at-risk on the screening task scores but at grade level on the outcome measures (i.e., false-positive error). Decisions concerning appropriate cut-points for screening measures are based on the level of correct classification that is desired from the screening assessments. A variety of statistics were examined through the application of a confusion matrix (e.g., sensitivity, specificity, positive and negative predictive power). The area under the curve (AUC) was examined through logistic regressions, where students’ performance on the SESAT/Stanford 10/FCAT/OST ELA was coded as ‘1’ for performance at or above the 20th percentile and ‘0’ for scores below this target. This dichotomous variable was then regressed on RAPID’s screening tasks at each grade level. The 20th percentile of the SESAT/Stanford-10/FCAT/OST ELA was chosen as the cut-point. Because these are nationally normed assessments of reading comprehension, low performance indicates that students are performing significantly below the expectations of their grade level and will likely need supplemental support to achieve grade-level success. The 40th percentile is often denoted as generally at-risk, so performance at or below the 20th percentile indicates a significant level of risk aligning to intensive need (Foorman, Petscher, Lefsky, & Toste, 2010).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- No
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Ohio State Test for English Language Arts (OST ELA)
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- The Ohio State Test for English Language Arts (OST ELA) was also used for 6th-8th grades. The criterion measure was external to the screening tool system and represents an assessment of general reading ability.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- The classification accuracy of scores from RAPID as it pertains to risk status on the SESAT/Stanford 10/FCAT/OST ELA was evaluated by dichotomizing screening task scores as ‘1’ for not at-risk for reading difficulties and ‘0’ for at-risk for reading difficulties; students could then be classified based on their dichotomized performances on both RAPID and the SESAT/Stanford 10. As such, students could be identified as not at-risk on the screening tasks and demonstrating grade-level performance on the outcome measures (i.e., specificity or true negatives), at-risk on the screening task scores and below grade-level performance on the outcome measures (i.e., sensitivity or true positives), not at-risk based on the screening task scores but not at grade level on the outcome measures (i.e., false-negative error), or at-risk on the screening task scores but at grade level on the outcome measures (i.e., false-positive error). Decisions concerning appropriate cut-points for screening measures are based on the level of correct classification that is desired from the screening assessments. A variety of statistics were examined through the application of a confusion matrix (e.g., sensitivity, specificity, positive and negative predictive power). The area under the curve (AUC) was examined through logistic regressions, where students’ performance on the SESAT/Stanford 10/FCAT/OST ELA was coded as ‘1’ for performance at or above the 20th percentile and ‘0’ for scores below this target. This dichotomous variable was then regressed on RAPID’s screening tasks at each grade level. The 20th percentile of the SESAT/Stanford-10/FCAT/OST ELA was chosen as the cut-point. Because these are nationally normed assessments of reading comprehension, low performance indicates that students are performing significantly below the expectations of their grade level and will likely need supplemental support to achieve grade-level success. The 40th percentile is often denoted as generally at-risk, so performance at or below the 20th percentile indicates a significant level of risk aligning to intensive need (Foorman, Petscher, Lefsky, & Toste, 2010).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- No
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Cross-Validation
- Has a cross-validation study been conducted?
- No
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
Classification Accuracy - Winter
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Criterion measure | Stanford Early School Achievement Test (SESAT) | Stanford Achievement Test 10 (SAT-10) | Stanford Achievement Test 10 (SAT-10) | Stanford Achievement Test 10 (SAT-10) | Florida Comprehensive Assessment Test (FCAT) | Stanford Achievement Test 10 (SAT-10) | Stanford Achievement Test 10 (SAT-10) | Florida Comprehensive Assessment Test (FCAT) | Stanford Achievement Test 10 (SAT-10) | Florida Comprehensive Assessment Test (FCAT) | Florida Comprehensive Assessment Test (FCAT) |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | |||||||||||
Cut Points - Corresponding performance score (numeric) on screener measure | |||||||||||
Classification Data - True Positive (a) | 24 | 19 | 21 | 62 | 67 | 66 | 116 | 124 | 81 | 67 | 60 |
Classification Data - False Positive (b) | 17 | 9 | 23 | 55 | 41 | 84 | 71 | 75 | 109 | 67 | 83 |
Classification Data - False Negative (c) | 15 | 12 | 20 | 39 | 27 | 28 | 71 | 83 | 46 | 47 | 63 |
Classification Data - True Negative (d) | 117 | 131 | 164 | 394 | 365 | 442 | 642 | 518 | 464 | 305 | 384 |
Area Under the Curve (AUC) | 0.84 | 0.91 | 0.83 | 0.86 | 0.91 | 0.88 | 0.88 | 0.87 | 0.83 | 0.82 | 0.80 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.77 | 0.85 | 0.75 | 0.83 | 0.88 | 0.85 | 0.85 | 0.85 | 0.80 | 0.79 | 0.75 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.91 | 0.96 | 0.92 | 0.89 | 0.94 | 0.91 | 0.90 | 0.90 | 0.87 | 0.86 | 0.83 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Base Rate | 0.23 | 0.18 | 0.18 | 0.18 | 0.19 | 0.15 | 0.21 | 0.26 | 0.18 | 0.23 | 0.21 |
Overall Classification Rate | 0.82 | 0.88 | 0.81 | 0.83 | 0.86 | 0.82 | 0.84 | 0.80 | 0.78 | 0.77 | 0.75 |
Sensitivity | 0.62 | 0.61 | 0.51 | 0.61 | 0.71 | 0.70 | 0.62 | 0.60 | 0.64 | 0.59 | 0.49 |
Specificity | 0.87 | 0.94 | 0.88 | 0.88 | 0.90 | 0.84 | 0.90 | 0.87 | 0.81 | 0.82 | 0.82 |
False Positive Rate | 0.13 | 0.06 | 0.12 | 0.12 | 0.10 | 0.16 | 0.10 | 0.13 | 0.19 | 0.18 | 0.18 |
False Negative Rate | 0.38 | 0.39 | 0.49 | 0.39 | 0.29 | 0.30 | 0.38 | 0.40 | 0.36 | 0.41 | 0.51 |
Positive Predictive Power | 0.59 | 0.68 | 0.48 | 0.53 | 0.62 | 0.44 | 0.62 | 0.62 | 0.43 | 0.50 | 0.42 |
Negative Predictive Power | 0.89 | 0.92 | 0.89 | 0.91 | 0.93 | 0.94 | 0.90 | 0.86 | 0.91 | 0.87 | 0.86 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Date | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 | Jan-Feb 2013 |
Sample Size | 173 | 171 | 228 | 550 | 500 | 620 | 900 | 800 | 700 | 486 | 590 |
Geographic Representation | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) | South Atlantic (FL) |
Male | 55.1% | 51.4% | 51.0% | 47.0% | 45.8% | 50.0% | 52.1% | 55.1% | |||
Female | 44.9% | 47.6% | 49.0% | 53.0% | 53.6% | 50.0% | 47.9% | 44.9% | |||
Other | |||||||||||
Gender Unknown | |||||||||||
White, Non-Hispanic | 33.1% | 35.6% | 36.9% | 36.0% | 33.8% | 33.0% | 35.0% | 36.9% | |||
Black, Non-Hispanic | 16.2% | 19.3% | 24.6% | 19.1% | 17.8% | 18.1% | 16.0% | 16.9% | 17.0% | 25.1% | 23.1% |
Hispanic | 17.3% | 34.5% | 36.8% | 42.0% | 39.6% | 40.0% | 43.0% | 43.8% | 43.0% | 34.0% | 35.1% |
Asian/Pacific Islander | |||||||||||
American Indian/Alaska Native | 3.1% | 2.0% | 3.1% | 3.0% | 3.0% | 3.0% | 2.1% | 2.0% | |||
Other | 3.1% | 3.0% | 5.0% | 3.0% | 3.0% | 3.0% | 3.1% | 3.1% | |||
Race / Ethnicity Unknown | |||||||||||
Low SES | 81.5% | 43.9% | 47.4% | 66.0% | 60.4% | 61.0% | 61.0% | 60.6% | 61.0% | 60.1% | 53.1% |
IEP or diagnosed disability | |||||||||||
English Language Learner | 11.0% | 16.4% | 18.0% | 11.1% | 10.0% | 13.1% | 14.0% | 12.9% | 15.0% | 7.0% | 10.0% |
Classification Accuracy - Spring
Evidence | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|
Criterion measure | Ohio State Test for English Language Arts (OST ELA) | Ohio State Test for English Language Arts (OST ELA) | Ohio State Test for English Language Arts (OST ELA) |
Cut Points - Percentile rank on criterion measure | 20 | 20 | 20 |
Cut Points - Performance score on criterion measure | |||
Cut Points - Corresponding performance score (numeric) on screener measure | |||
Classification Data - True Positive (a) | 41 | 25 | 33 |
Classification Data - False Positive (b) | 20 | 17 | 27 |
Classification Data - False Negative (c) | 15 | 10 | 15 |
Classification Data - True Negative (d) | 78 | 94 | 90 |
Area Under the Curve (AUC) | 0.86 | 0.87 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.79 | 0.80 | 0.85 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.92 | 0.93 | 0.94 |
Statistics | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|
Base Rate | 0.36 | 0.24 | 0.29 |
Overall Classification Rate | 0.77 | 0.82 | 0.75 |
Sensitivity | 0.73 | 0.71 | 0.69 |
Specificity | 0.80 | 0.85 | 0.77 |
False Positive Rate | 0.20 | 0.15 | 0.23 |
False Negative Rate | 0.27 | 0.29 | 0.31 |
Positive Predictive Power | 0.67 | 0.60 | 0.55 |
Negative Predictive Power | 0.84 | 0.90 | 0.86 |
Sample | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|
Date | Apr – May 2018 | Apr – May 2018 | Apr – May 2018 |
Sample Size | 154 | 146 | 165 |
Geographic Representation | East North Central (OH) | East North Central (OH) | East North Central (OH) |
Male | 48.7% | 52.1% | 55.2% |
Female | 51.3% | 47.9% | 44.8% |
Other | |||
Gender Unknown | |||
White, Non-Hispanic | 89.0% | 93.2% | 90.9% |
Black, Non-Hispanic | 1.3% | 0.7% | 3.0% |
Hispanic | 1.3% | 2.1% | 1.2% |
Asian/Pacific Islander | |||
American Indian/Alaska Native | |||
Other | 6.2% | 6.1% | |
Race / Ethnicity Unknown | |||
Low SES | |||
IEP or diagnosed disability | 7.1% | 10.3% | 12.7% |
English Language Learner | 5.8% | 8.2% | 12.1% |
Reliability
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Rating | | | | | | | | | | | |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- Reliability estimates for RAPID are provided in the form of marginal reliability and the standard error of measurement (SEM). Because RAPID is a computer-adaptive assessment without a fixed form, traditional reliability estimates such as Cronbach’s alpha are not appropriate for evaluating consistency or inconsistency in student performance. Marginal reliability: the IRT equivalent of traditional reliability, a function of the theta scores and the average of the expected error variance. Standard error of measurement (SEM): an estimate of the amount of variance that might be observed in an individual’s performance given repeated testing. Only through repeated testing would it be possible to define the individual’s true ability, as scores may fluctuate from one day of testing to the next. Because it is unreasonable to test a student repeatedly in order to capture his or her true ability, an interval can be constructed to observe the extent to which the score may fluctuate (see the worked interval below).
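For example, the interval described above is conventionally constructed as a 95% confidence band around the observed score (a standard psychometric convention, assumed here rather than quoted from the RAPID manual):

$$\mathrm{CI}_{95\%} = \hat{\theta} \pm 1.96 \times \mathrm{SEM}$$

Using the Kindergarten Phonological Awareness values reported below (mean = 315.86, SEM = 31.95), a student scoring at the mean would have an interval of approximately $315.86 \pm 62.62$, or about 253 to 378.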
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- Marginal reliability and the standard error of measurement are based on samples of students that participated in RAPID’s calibration and validation studies. Approximately 9,000 students in kindergarten through second grade participated in these studies. Average demographic information for the sample in grades kindergarten through two was as follows: 40% White, 31% Hispanic, 23% Black, 6% Other; 65% eligible for free/reduced-price lunch; 18% limited English proficient. A total of 44,780 students in 3rd-10th grades participated in RAPID’s calibration and validation studies. Average demographic information for the sample in grades three through ten was as follows: 41% White, 30% Hispanic, 23% Black, 6% Other; 60% eligible for free/reduced-price lunch (FRL); 8% English language learners (ELL).
- *Describe the analysis procedures for each reported type of reliability.
- Marginal reliability was computed as follows: (see attached doc)
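The attached document is not reproduced here. For reference, a standard IRT formulation consistent with the description above (variance of theta scores relative to the average expected error variance) is:

$$\bar{\rho} = \frac{\sigma^{2}_{\hat{\theta}} - \overline{\sigma^{2}_{e}}}{\sigma^{2}_{\hat{\theta}}} = 1 - \frac{\overline{\sigma^{2}_{e}}}{\sigma^{2}_{\hat{\theta}}}$$

where $\sigma^{2}_{\hat{\theta}}$ is the variance of the estimated ability (theta) scores and $\overline{\sigma^{2}_{e}}$ is the average of the expected error variances.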
*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
Mean Performance Scores and Standard Error of Measurement (SEM)

Grade | Task | Mean | SEM |
---|---|---|---|
K | Phonological Awareness | 315.86 | 31.95 |
1 | Word Reading | 503.6 | 25.33 |
2 | Word Reading | 614.64 | 38.57 |

Mean Performance Scores and Standard Error of Measurement (SEM) for the Reading Comprehension Task

Grade | N | Mean | SEM |
---|---|---|---|
3 | 325 | 386.03 | 28.69 |
4 | 322 | 440.07 | 32.96 |
5 | 302 | 497.25 | 36.49 |
6 | 431 | 499.96 | 37.63 |
7 | 426 | 524.45 | 39.67 |
8 | 461 | 571.71 | 48.61 |
9 | 703 | 583.06 | 39.26 |
10 | 626 | 589.72 | 44.65 |
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Rating | | | | | | | | | | | |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- All criterion measures were external to the screening tool system and represent assessments of general reading ability. In order to collect evidence of predictive validity, the Stanford Early School Achievement Test (SESAT) was used for students in Kindergarten, whereas the Stanford Achievement Test 10 was used for 1st-10th grades. The Stanford Achievement Test Series was chosen because it reflects a global reading outcome that focuses on reading and comprehension. For the purpose of collecting evidence of convergent validity, the FCAT was used for 3rd-10th grades. Evidence of concurrent validity was collected in an independent study with the Lexile Reading Scale in cooperation with MetaMetrics. Validity evidence for RAPID also comes from the degree and stability of the relationship of RAPID performance scores across multiple periods of time. This type of evidence supports the construct validity of RAPID and the underlying stability of the score scale.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- Predictive Validity: A key consideration in aligning RAPID performance scores to the Stanford 10 was to ensure that the sample data reflected normative performance in the United States at large. Percentile ranks from the Stanford 10 were used to evaluate the distribution of scores in the alignment sample against national norms, which allowed for adjustments where needed to bring the observed sample in line with national norms. As such, the sample was stratified to reflect normative reading comprehension risk levels, whereby the base rate of reading comprehension risk was 50% on the SESAT (Kindergarten) and the Stanford 10 (1st-10th grades). A total of 1,651 students participated in the study in Kindergarten-2nd grade, whereas 5,170 students participated in 3rd-10th grades. Convergent Validity: Out of the 5,170 students in the normed sample, a total of 5,118 students had also taken the FCAT in 3rd-10th grades. Concurrent Validity: A total of 832 students participated in the Lexile study in grades 1 and 2, and 2,551 students participated in grades 3-10. Construct Validity: A total of 2,572 students participated in Kindergarten and 1st grade. The samples of students in 3rd-10th grades ranged from 223 (10th grade) to 2,329 (3rd grade).
- *Describe the analysis procedures for each reported type of validity.
- Predictive Validity: Predictive validity was evaluated by using RAPID tasks to predict later reading comprehension performance on the SESAT and Stanford 10. The analyses were run in two ways. First, a correlation analysis (Pearson product-moment) was used to evaluate the strength of the relations between each screening task’s performance scores and scores on the SESAT/Stanford 10. Second, a multiple regression analysis was run to estimate the total amount of variance that the linear combination of the predictors explained in Stanford 10 reading comprehension performance (a brief computational sketch follows). Convergent Validity: Convergent validity was evaluated by testing the relations (Pearson product-moment coefficients) between RAPID and the FCAT. Concurrent Validity: Evidence is expressed as the degree of relationship to performance on another assessment measuring achievement in the same construct administered close in time. This form of validity was expressed as a Pearson correlation coefficient between the RAPID score and the Lexile score. Construct Validity: Construct validity was estimated as the Pearson correlation coefficient between students’ RAPID scores from the spring 2017 administration and their RAPID scores from the spring 2018 administration for Kindergarten and 1st grades, whereas winter administrations were used for 3rd-10th grades.
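A minimal sketch of the two predictive-validity steps described above (task-level Pearson correlations, then a multiple regression for the jointly explained variance), using numpy on synthetic placeholder data; array names and coefficients are illustrative, not RAPID data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data: three screening-task scores and one criterion score.
n = 500
tasks = rng.normal(size=(n, 3))                  # e.g., three RAPID tasks
criterion = tasks @ np.array([0.5, 0.3, 0.4]) + rng.normal(scale=0.8, size=n)

# Step 1: Pearson product-moment correlation of each task with the criterion.
for j in range(tasks.shape[1]):
    r = np.corrcoef(tasks[:, j], criterion)[0, 1]
    print(f"task {j}: r = {r:.2f}")

# Step 2: multiple regression, i.e., variance in the criterion explained by
# the linear combination of all tasks (R^2).
X = np.column_stack([np.ones(n), tasks])         # add an intercept column
beta, *_ = np.linalg.lstsq(X, criterion, rcond=None)
resid = criterion - X @ beta
r2 = 1 - resid.var() / criterion.var()
print(f"R^2 = {r2:.2f}")
```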
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- see doc attached above
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- For grades K-2, the validity coefficients provide moderate evidence of predictive and construct validity. For grades 3-10, the validity coefficients provide moderate to strong evidence of predictive, concurrent, convergent and construct validity.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
Bias Analysis
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 | Grade 9 | Grade 10 |
---|---|---|---|---|---|---|---|---|---|---|---|
Rating | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- Yes
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- DIF testing was conducted with a multiple indicator multiple cause (MIMIC) analysis. A series of standardized and expected-score effect size measures were generated using VisualDF software to quantify various technical aspects of score differentiation between the contrasted groups. First, the signed item difference in the sample (SIDS) index was created, which describes the average unstandardized difference in expected scores between the groups. The second effect size calculated was the unsigned item difference in the sample (UIDS), which can be used as a supplement to the SIDS. Lastly, an expected score standardized difference (ESSD) was generated, computed similarly to a Cohen’s (1988) d statistic. As such, it is interpreted as a measure of the standard-deviation difference between the groups in the expected score response, with values of .2 regarded as small, .5 as medium, and .8 as large. One standard formulation of these indices is sketched below.
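A standard formulation of these indices (an assumption here, following the effect-size taxonomy implemented in VisualDF rather than documentation specific to RAPID), with $E_{Fj}$ and $E_{Rj}$ the expected item scores for examinee $j$ under the focal- and reference-group parameter estimates:

$$\mathrm{SIDS} = \frac{1}{N}\sum_{j=1}^{N}\left(E_{Fj} - E_{Rj}\right), \qquad \mathrm{UIDS} = \frac{1}{N}\sum_{j=1}^{N}\left|E_{Fj} - E_{Rj}\right|$$

$$\mathrm{ESSD} = \frac{\bar{E}_{F} - \bar{E}_{R}}{SD_{\mathrm{pooled}}}$$

When SIDS and UIDS diverge, item-level differences are partially canceling in opposite directions; ESSD is read like Cohen’s d (.2 small, .5 medium, .8 large).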
- b. Describe the subgroups for which bias analyses were conducted:
- For RAPID, DIF testing was conducted comparing: Black-White students, Hispanic-White students, Black-Hispanic students, students eligible for Free or Reduced Priced Lunch (FRL) with students not receiving FRL, and English language learner (ELL) to non-ELL students. Average demographic information for the sample in grades kindergarten through two was as follows: 40% White, 31% Hispanic, 23% Black, 6% Other; 65% eligible for free/reduced price lunch; 18% limited English proficient. Average demographic information for the sample in grades three through ten was as follows: 41% White, 30% Hispanic, 23% Black, 6% Other; 60% eligible for Free/Reduced Price lunch (FRL); 8% English language learners (ELL).
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
- Fewer than 10% of items demonstrated DIF, and most flagged items had effect sizes below .50. Such items were studied further over time; however, the absence of differential bias in the prediction of risk tempers the item-level DIF findings.
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.