ADAM: Adaptive Diagnostic Assessment of Mathematics
Mathematics
Summary
The Adaptive Diagnostic Assessment of Mathematics (ADAM) is a web-based diagnostic assessment and screener that is part of the Let’s Go Learn integrated assessment and instruction platform. It provides a single vertically aligned grade-level equivalency score for grades K through 7 by assessing mastery across 44 sub-tests that represent foundational mathematics skills. These sub-tests are organized into a scope and sequence of skills and concepts, aligned with the progression in which mathematics is typically taught.

Key Features and Design: 1) Precise Scope and Sequence Alignment: ADAM assesses skills and concepts in a manner consistent with instructional best practices, enabling it to pinpoint students’ mastery levels effectively. 2) Adaptive Algorithm: The assessment automatically adjusts to each student’s performance, skipping mastered skills while probing deeper into areas of struggle. 3) Domains Covered: ADAM evaluates five critical domains: Numbers and Operations, Measurement, Data Analysis, Geometry, and Algebraic Thinking, ensuring comprehensive coverage of foundational mathematics.

Instructional Applications: 1) Special Education: ADAM identifies present levels of academic achievement and functional performance (PLAAFP) for IEP development, offering precise diagnostic data. 2) Tiered Interventions: The assessment identifies skill gaps so that educators can group students effectively and provide targeted instruction. 3) Progress Monitoring: By capturing growth over time, ADAM supports ongoing instructional adjustments and tracking of student progress.

Administration Details: 1) Duration and Flexibility: The assessment typically takes 30 to 60 minutes, depending on the student’s grade level and skill breadth, and may be administered in multiple sittings. 2) Frequency: ADAM is commonly administered three times a year, with 12–18 weeks of instruction recommended between administrations. 3) Targeted Use: It is suitable for screening all students or specific groups, including those at risk of academic failure or requiring individualized support.

Actionable Reporting: ADAM generates detailed reports immediately upon completion. These reports include: 1) analysis of skill/concept gaps and strengths; 2) specific instructional recommendations for addressing deficiencies; 3) grouping tools to support differentiated instruction in small-group or whole-class settings; and 4) individualized learning paths when paired with the optional Math Edge program, which prescribes targeted online lessons for each student.

Sophisticated Adaptivity: ADAM’s design ensures that students are assessed only on relevant skills. For example, if a student demonstrates mastery of single-digit but not multi-digit multiplication, items assessing the concept of area will use simpler calculations, gradually increasing in complexity as the student succeeds. Each skill is assessed using multiple items to confirm mastery, supporting accurate instructional planning.

In summary, ADAM is a powerful and flexible tool for screening, diagnosing, and monitoring student learning needs in mathematics. Its adaptive design and actionable reports make it practical and effective for educators, whether used as a universal screener or as a diagnostic tool for individualized learning plans.
- Where to Obtain:
- Let's Go Learn, Inc.
- help@letsgolearn.com
- 705 Wellesley Ave., Kensington, CA 94708
- 888-618-7323
- www.letsgolearn.com
- Initial Cost:
- $8.00 per student
- Replacement Cost:
- $8.00 per student per year
- Included in Cost:
- ADAM operates on a per-student annual licensing fee, which varies with the scale of implementation and any additional features requested, such as progress monitoring. The cost ranges from $8 to $13 per student per year, with discounts available for multi-year subscriptions. The annual license fee includes:
  - Student Access: Online access to assessments and progress monitoring.
  - Educator Tools: Full access to management and reporting tools.
  - User Resources: Comprehensive asynchronous professional development resources.
  - Infrastructure and Maintenance: Account setup, secure hosting, and all program updates, enhancements, and maintenance during the active license term.
  - Support Services: Unlimited access to U.S.-based customer support via toll-free phone and email during business hours.
  - Professional Development: Professional development is required for implementation and is available at an additional cost ranging from $500 to $2,500, depending on the scope and delivery method.
- Training Requirements:
- 1 hour
- Qualified Administrators:
- No minimum qualifications specified.
- Access to Technical Support:
- Let’s Go Learn provides a dedicated partnership success specialist plus unlimited access to in-house technical support during business hours to every customer.
- Assessment Format:
- Direct: Computerized
- Scoring Time:
- Scoring is automatic
- Scores Generated:
- Raw score
- Percentile score
- Grade equivalents
- Developmental benchmarks
- Developmental cut points
- Composite scores
- Subscale/subtest scores
- Other: ADAM also provides present levels for IEPs and a next-skills/concepts gap report that teachers can use to set goals for students.
- Administration Time:
- 45 minutes per student or group
- Scoring Method:
- Automatically (computer-scored)
- Technology Requirements:
- Computer or tablet
- Internet connection
- Accommodations:
- ADAM includes a variety of accommodations and accessibility features, ensuring its usability for most students, including those with disabilities. It incorporates Universal Accessibility Features that are available to all students without requiring intervention from educators to enable them. These features include untimed testing, the ability to navigate the assessment using a keyboard, and adjustable audio volume. Additionally, universally accessible audio support is integrated into all items for grades K-7, which not only reduces potential reading bias but also assists students who may require audio support for text comprehension. ADAM also includes processes and tools specifically designed to support students who require accommodations as determined by Individualized Education Programs (IEPs) or 504 plans. Since fall 2015, ADAM has met the Level AA standard under the Web Content Accessibility Guidelines (WCAG 2.0), with documented exceptions. While all students have access to Universal Accessibility Features, accommodations beyond these features are typically determined by IEP teams or other educational professionals. Let’s Go Learn provides guidance and training to educators on how to implement accommodations effectively, but the ultimate decision and application of accommodations rest with the educators working directly with individual students.
Descriptive Information
- Please provide a description of your tool:
- The Adaptive Diagnostic Assessment of Mathematics (ADAM) is a web-based diagnostic assessment and screener that is part of the Let’s Go Learn integrated assessment and instruction platform. It provides a single vertically aligned grade-level equivalency score for grades K through 7 by assessing mastery across 44 sub-tests that represent foundational mathematics skills. These sub-tests are organized into a scope and sequence of skills and concepts, aligned with the progression in which mathematics is typically taught.

Key Features and Design: 1) Precise Scope and Sequence Alignment: ADAM assesses skills and concepts in a manner consistent with instructional best practices, enabling it to pinpoint students’ mastery levels effectively. 2) Adaptive Algorithm: The assessment automatically adjusts to each student’s performance, skipping mastered skills while probing deeper into areas of struggle. 3) Domains Covered: ADAM evaluates five critical domains: Numbers and Operations, Measurement, Data Analysis, Geometry, and Algebraic Thinking, ensuring comprehensive coverage of foundational mathematics.

Instructional Applications: 1) Special Education: ADAM identifies present levels of academic achievement and functional performance (PLAAFP) for IEP development, offering precise diagnostic data. 2) Tiered Interventions: The assessment identifies skill gaps so that educators can group students effectively and provide targeted instruction. 3) Progress Monitoring: By capturing growth over time, ADAM supports ongoing instructional adjustments and tracking of student progress.

Administration Details: 1) Duration and Flexibility: The assessment typically takes 30 to 60 minutes, depending on the student’s grade level and skill breadth, and may be administered in multiple sittings. 2) Frequency: ADAM is commonly administered three times a year, with 12–18 weeks of instruction recommended between administrations. 3) Targeted Use: It is suitable for screening all students or specific groups, including those at risk of academic failure or requiring individualized support.

Actionable Reporting: ADAM generates detailed reports immediately upon completion. These reports include: 1) analysis of skill/concept gaps and strengths; 2) specific instructional recommendations for addressing deficiencies; 3) grouping tools to support differentiated instruction in small-group or whole-class settings; and 4) individualized learning paths when paired with the optional Math Edge program, which prescribes targeted online lessons for each student.

Sophisticated Adaptivity: ADAM’s design ensures that students are assessed only on relevant skills. For example, if a student demonstrates mastery of single-digit but not multi-digit multiplication, items assessing the concept of area will use simpler calculations, gradually increasing in complexity as the student succeeds. Each skill is assessed using multiple items to confirm mastery, supporting accurate instructional planning.

In summary, ADAM is a powerful and flexible tool for screening, diagnosing, and monitoring student learning needs in mathematics. Its adaptive design and actionable reports make it practical and effective for educators, whether used as a universal screener or as a diagnostic tool for individualized learning plans.
ACADEMIC ONLY: What skills does the tool screen?
- Please describe specific domain, skills or subtests:
- ADAM assesses five domains (Number and Operations, Algebra and Algebraic Thinking, Measurement, Data Analysis, and Geometry). For the domain of Number and Operations, a present level may be identified in the following 14 sub-tests: Numbers, Place Value, Comparing and Ordering, Addition of Whole Numbers, Subtraction of Whole Numbers, Multiplication of Whole Numbers, Division of Whole Numbers, Fractions, Number Theory, Decimal Operations, Percentages, Ratios and Proportions, Positive and Negative Integers, and Exponents. Within the domain of Measurement, the sub-tests covered are: Money, Time, Temperature, Length, Weight, Capacity & Volume, and Rate. Within the domain of Data Analysis, the sub-tests include: Patterns & Sorting, Data Representation, Simple Probability, Outcomes, Displaying Data, Measures of Central Tendency, Ordered Pairs, and Samples. Within the domain of Geometry, the sub-tests are: Location & Direction, 2D Shapes, 3D Shapes, Triangles, Quadrilaterals, Area & Perimeter, Lines, Circles, Angles, Volume & Surface Area, and Geometric Relationships. Finally, within the domain of Algebraic Thinking, the sub-tests include: Relationships, Expressions & Problem Solving, Equations, and Graphing Algebraic Relationships.
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
Acquisition and Cost Information
Administration
- Are norms available?
- Yes
- Are benchmarks available?
- Yes
- If yes, how many benchmarks per year?
- Let’s Go Learn recommends administering ADAM three times a year. ADAM provides benchmark scores in the form of grade-level equivalency scores at the sub-test, domain, and overall assessment levels. ADAM also provides intervention tiers for schools using an RtI or MTSS academic framework.
- If yes, for which months are benchmarks available?
- Fall, winter, spring
- BEHAVIOR ONLY: Can students be rated concurrently by one administrator?
- If yes, how many students can be rated concurrently?
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- 1 hour
- Please describe the minimum qualifications an administrator must possess.
- No minimum qualifications specified.
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- Let’s Go Learn provides a dedicated partnership success specialist plus unlimited access to in-house technical support during business hours to every customer.
Scoring
- Do you provide basis for calculating performance level scores?
- Yes
- Does your tool include decision rules?
- No
- If yes, please describe.
- Can you provide evidence in support of multiple decision rules?
- No
- If yes, please describe.
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- ADAM employs a single vertical grade-level scoring algorithm spanning grades K through 7, designed to reflect the specific skills and concepts taught across the five domains of mathematics at each grade level. This scoring structure ensures that student performance can be measured with absolute comparability over time, making it particularly effective for tracking growth from kindergarten through grade 7. Unlike norm-referenced assessments, ADAM’s scoring does not obscure growth trends, especially in middle school, where national performance data often depresses normed growth indicators. Each skill or concept is assessed with a minimum of three test items, and the assessment consists of 44 sub-tests, each representing a scope and sequence of related skills or concepts. Students progress by mastering distinct sets of skills and concepts and regress when they fail to demonstrate mastery, with scoring based on achieving a ceiling condition—mastery of a specific skill set followed by failure at the next level of difficulty. This approach produces 44 individual grade-level equivalency scores that are integrated into a weighted scoring algorithm. In total, ADAM encompasses 283 skill or concept sets that span foundational mathematics for grades K through 7. The scoring structure is explicitly designed to test skills and concepts directly rather than predict scores, resulting in absolute rather than relative performance measures. As students demonstrate mastery of new skills, ADAM dynamically adjusts starting points to previous high points, reducing inefficiencies by avoiding reassessment of already-mastered content. The adaptive logic also accounts for potential regression by requiring mastery of a ceiling condition in each administration, ensuring that scores accurately reflect each student’s current level of mastery.
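The ceiling condition described above (mastery of a skill set followed by failure at the next level of difficulty) can be illustrated with a minimal sketch. This is not Let’s Go Learn’s actual algorithm; the function name and the boolean encoding of mastery (True when a student answers enough items, e.g. three or more, correctly) are assumptions for illustration only:

```python
def find_ceiling(mastery: list) -> int:
    """Return the index of the highest consecutively mastered skill set.

    `mastery` is an ordered list of booleans over a sub-test's skill sets
    (True = mastered). The ceiling is established when a mastered set is
    followed by a failed set. Returns -1 if the first set is failed.
    """
    ceiling = -1
    for i, mastered in enumerate(mastery):
        if mastered:
            ceiling = i
        else:
            break  # first failure after the last mastered set: ceiling reached
    return ceiling

# A student masters sets 0-2 and fails set 3; the ceiling is index 2,
# so later sets need not be administered in this sub-test.
print(find_ceiling([True, True, True, False, True]))  # 2
```

In a sketch like this, the index of the ceiling would map onto one of the 44 sub-test grade-level equivalency scores that feed the weighted scoring algorithm; the actual mapping and weighting are not published in this profile.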
- Describe the tool’s approach to screening, samples (if applicable), and/or test format, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
- ADAM includes intervention screening reports that classify students into three tiers: Tier 1, Tier 2, or Tier 3. Its grade-level equivalency scoring system simplifies interpretation by providing a decimal score that corresponds to the student’s grade level and time of year. This allows educators to calculate the student’s gap by subtracting the decimal grade at the time of testing from the ADAM score. Three intervention screening report formats are available: beginning-of-year, standard, and end-of-year, each categorizing students based on their chronological grade level. ADAM also provides national norms expressed as percentiles for beginning-of-year, mid-year, and end-of-year testing, offering an alternative, percentile-based approach to intervention classification for those familiar with that method. Let’s Go Learn ensures that ADAM is appropriate for a diverse range of students, including those from culturally and linguistically diverse populations and students with disabilities. All items are designed to be developmentally, linguistically, and culturally appropriate. Audio support is included for all math items requiring reading, minimizing the potential for reading bias. For students who require accommodations, such as large print or additional time, ADAM’s design ensures that most will not need further intervention to complete the assessment. The interface automatically optimizes text, image, and number scaling, and the assessment emphasizes maintaining standard administration and interpretation without compromising its purpose or validity. ADAM was developed using universal design principles, with a commitment to fair, engaging, authentic, rigorous, and culturally inclusive assessments. A diverse review committee establishes standards that item writers adhere to, considering bias and sensitivity in accordance with Let’s Go Learn’s Sensitivity, Fairness, and Accessibility Guidelines.
Items are constructed to focus students’ attention on the task without introducing bias or distraction. Every item undergoes rigorous review for sensitivity, fairness, and potential bias, including but not limited to cultural, linguistic, socioeconomic, religious, geographic, color-blind, gender, and disability biases. Items that fail to meet these criteria are revised, rejected, or removed. The Assessment Development team employs differential item functioning (DIF) analyses and p-value outlier screening to identify items with potential bias in performance metrics. Items exhibiting severe DIF are removed from the item bank, while flagged items are revised and re-piloted. These practices align with periodic quality reviews to maintain a high standard of fairness and accessibility in the assessment.
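The gap calculation described in the report formats above (ADAM grade-level equivalency score minus the student’s chronological decimal grade at the time of testing) can be sketched as follows. The function name and the convention of expressing time of year as a decimal fraction of the grade are illustrative assumptions:

```python
def adam_gap(adam_score: float, grade: int, year_fraction: float) -> float:
    """Gap = ADAM grade-level equivalency score minus the student's
    chronological decimal grade (grade + fraction of the school year elapsed).
    A negative gap means the student is performing below grade level.
    """
    chronological = grade + year_fraction
    return round(adam_score - chronological, 1)

# A 4th grader tested mid-year (decimal grade 4.5) with an ADAM score of 3.2
# shows a gap of about -1.3, i.e. roughly 1.3 grade levels behind.
print(adam_gap(3.2, 4, 0.5))  # -1.3
```

A district could then apply its own tiering thresholds to this gap; the specific thresholds ADAM’s reports use are not stated in this profile.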
Technical Standards
Classification Accuracy & Cross-Validation Summary
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Classification Accuracy Fall | | | | | | | | | |
Classification Accuracy Winter | | | | | | | | | |
Classification Accuracy Spring | | | | | | | | | |




Smarter Balanced Assessment (SBA) Mathematics
Classification Accuracy
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- The criterion measure is the Smarter Balanced Assessment (SBA) Mathematics test for grades 3-8. The SBA is an end-of-year state summative assessment administered in the spring in various states. The scaled scores and performance bands defined in the Smarter Balanced 2018-2019 Summative Technical Report were used to classify students. Students who scored below the score corresponding to the 10th percentile on the SBA for the given grade were classified as at-risk and students who scored at or above the score corresponding to the 10th percentile were classified as not-at-risk. For grades K-2, the grade 3 SBA scores were used as the criterion for calculation of predictive classification accuracy, as states do not administer SBA before grade 3. As such, the criterion was administered 1-3 years after the ADAM administration.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- For grades 3-8, the screening measure was administered at three time points in the 2018-19 academic year: Fall (between 08/01/2018 and 10/15/2018), Winter (between 12/01/2018 and 03/01/2019), and Spring (between 04/01/2019 and 06/15/2019). The criterion measure (SBA) was administered in the Spring of 2019. The Spring ADAM scores, taken close in time to the SBA, represent concurrent classification accuracy, while the Fall and Winter scores represent predictive classification accuracy. For grades K through 2, the screening measure was administered at three time points in earlier school years, prior to the SBA administration in the Spring of 2019.
- Describe how the classification analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- Cut points on the criterion measure (SBA) were determined as the scale score corresponding to the 10th percentile defined in the Smarter Balanced 2018-2019 Summative Technical Report for the given subject and grade. This cut point follows the definition of students in need of intensive intervention provided by NCII’s Technical Review Committee. Students who scored below the score corresponding to the 10th percentile on the SBA for the given grade were classified as at-risk and students who scored at or above the score corresponding to the 10th percentile were classified as not-at-risk. Cut points on the screening measure (ADAM) were empirically identified as grade-level scores that best align with SBA’s 10th percentile scores for each subject, grade and testing window. Using these cut scores, students were classified as at-risk if they scored below the cut score on ADAM for the given testing window, or not-at-risk if they scored at or above the cut. Classification indices between at-risk/not-at-risk on ADAM and at-risk/not-at-risk on the SBA assessment are calculated per the formulas in the classification worksheet. For students in grades 3-8, screening scores in the Fall, Winter and Spring of the 2018-19 academic year were used for at-risk classification on the criterion measure administered in Spring 2019 at the same grade level. For students in grade K, screening scores from the 2015-16 academic year were used for at-risk classification on the criterion measure for the same students in grade 3 in Spring 2019. For students in grade 1, Fall, Winter, and Spring screening scores from the 2016-17 academic year were used for at-risk classification on the criterion measure for the same students in grade 3 in Spring 2019. For students in grade 2, Fall, Winter, and Spring screening scores from the 2017-18 academic year were used for at-risk classification on the criterion measure for the same students in grade 3 in Spring 2019. 
A post-COVID note: We are submitting data collected before the COVID-19 pandemic. Because neither our assessment nor current math standards have changed since the 2018-19 school year, we believe these data remain the most valid for comparison to the SBA and its cut scores for at-risk classification.
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- Some students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students. In addition, roughly 50% of students who take ADAM may be using our math intervention, but for these students we do not know the fidelity with which they received it.
Cross-Validation
- Has a cross-validation study been conducted?
- Yes
- If yes,
- Describe the criterion (outcome) measure(s) including the degree to which it/they is/are independent from the screening measure.
- The criterion measure is the Smarter Balanced Assessment (SBA) Mathematics test for grades 3-8. For grades K-2, SBA scores for the same students in 2019 were used as the criterion for calculation of predictive classification accuracy. The SBA is an end-of-year state summative assessment administered in the Spring in various states. The percentile scores defined in the Smarter Balanced 2018–19 Summative Technical Report are used to classify students. Students who scored below the score corresponding to the 10th percentile on the SBA for the given grade were classified as at-risk and students who scored at or above the score corresponding to the 10th percentile were classified as not-at-risk.
- Describe when screening and criterion measures were administered and provide a justification for why the method(s) you chose (concurrent and/or predictive) is/are appropriate for your tool.
- For grades 3-8, the screening measure was administered at three time points in the 2018-19 academic year: Fall (between 08/01/2018 and 10/15/2018), Winter (between 12/01/2018 and 03/01/2019), and Spring (between 04/01/2019 and 06/15/2019). The criterion measure (SBA) was administered in the Spring of 2019. The Spring ADAM scores, taken close in time to the SBA, represent concurrent classification accuracy, while the Fall and Winter scores represent predictive classification accuracy. For grades K through 2, the screening measure was administered at three time points in earlier school years, prior to the SBA administration in the Spring of 2019.
- Describe how the cross-validation analyses were performed and cut-points determined. Describe how the cut points align with students at-risk. Please indicate which groups were contrasted in your analyses (e.g., low risk students versus high risk students, low risk students versus moderate risk students).
- For the cross-validation study, we used a K-fold cross-validation method, splitting the sample into K=5 parts and using 4 parts (80% of the sample) for the classification accuracy study and 1 part (20% of the sample) for cross-validation. The timing of measure administration was therefore the same as in the main Classification Accuracy study. To validate our results, we used the same cut points as the main Classification Accuracy study for both the criterion measure (SBA) and the screening measure (ADAM) when performing the classification analyses on the cross-validation sample. Cut points on the criterion measure (SBA) were determined as the scale score corresponding to the 10th percentile defined in the Smarter Balanced 2018-19 Summative Technical Report for the given subject and grade. This cut point follows the definition of students in need of intensive intervention provided by NCII’s Technical Review Committee. Students who scored below the score corresponding to the 10th percentile on the SBA test for the given grade were classified as at-risk, and students who scored at or above it were classified as not-at-risk. Cut points on the screening measure (ADAM) were the same scores identified in the main Classification Accuracy study. Students were designated as actually “at risk” or “not at risk” by rank-ordering their SBA state scale scores and using the 10th percentile rank point within the study sample as the cut score, disaggregated by grade and subject area. Students were designated actually “at risk” when their state scale scores fell below the 10th percentile rank point. Because ADAM uses a grade-level equivalency score, we rank risk based on how far a student’s total ADAM grade-level score falls below the decimal grade corresponding to their Fall, Winter, or Spring ADAM assessment. For the purposes of this cross-validation analysis, the gap threshold varied by grade.
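The K=5 partitioning described above can be sketched generically as follows. This is an illustration of K-fold splitting under the stated 80/20 design, not the study’s actual sampling code; the function name and seeding are assumptions:

```python
import random

def five_fold_split(student_ids, seed=0):
    """Shuffle the sample and split it into K=5 folds of near-equal size.

    In each round of K-fold cross-validation, 4 folds (80% of the sample)
    would be used to identify cut points and 1 held-out fold (20%) used to
    check classification accuracy with those same cut points.
    """
    ids = list(student_ids)
    random.Random(seed).shuffle(ids)  # fixed seed for reproducibility
    k = 5
    return [ids[i::k] for i in range(k)]

# 100 students split into 5 folds of 20 each.
folds = five_fold_split(range(100))
print([len(f) for f in folds])  # [20, 20, 20, 20, 20]
```

Applying the main study’s cut points, rather than re-deriving them, on each held-out fold is what makes this a validation of the original cut scores.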
- Were the children in the study/studies involved in an intervention in addition to typical classroom instruction between the screening measure and outcome assessment?
- Yes
- If yes, please describe the intervention, what children received the intervention, and how they were chosen.
- Some students may have been involved in various interventions in their particular schools, but we do not know which interventions or which students. In addition, roughly 50% of students who take ADAM may be using our math intervention, but for these students we do not know the fidelity with which they received it.
Classification Accuracy - Fall
Evidence | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|
Criterion measure | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics |
Cut Points - Percentile rank on criterion measure | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Cut Points - Performance score on criterion measure | 2324 | 2324 | 2324 | 2362 | 2373 | 2367 | 2374 | 2375 |
Cut Points - Corresponding performance score (numeric) on screener measure | 0.5 | 1.4 | 2.2 | 3.1 | 3.7 | 4.1 | 4.6 | 5.1 |
Classification Data - True Positive (a) | 630 | 899 | 821 | 1441 | 1570 | 1580 | 1791 | 1090 |
Classification Data - False Positive (b) | 2451 | 2290 | 1755 | 1880 | 3280 | 3018 | 3374 | 2550 |
Classification Data - False Negative (c) | 179 | 170 | 140 | 250 | 178 | 265 | 360 | 269 |
Classification Data - True Negative (d) | 9841 | 10291 | 10853 | 18267 | 18051 | 18124 | 17562 | 11980 |
Area Under the Curve (AUC) | 0.85 | 0.84 | 0.91 | 0.91 | 0.91 | 0.90 | 0.90 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.83 | 0.83 | 0.90 | 0.89 | 0.90 | 0.89 | 0.88 | 0.88 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.86 | 0.85 | 0.91 | 0.93 | 0.92 | 0.91 | 0.92 | 0.90 |
Statistics | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|
Base Rate | 0.06 | 0.08 | 0.07 | 0.08 | 0.08 | 0.08 | 0.09 | 0.09 |
Overall Classification Rate | 0.80 | 0.82 | 0.86 | 0.90 | 0.85 | 0.86 | 0.84 | 0.82 |
Sensitivity | 0.78 | 0.84 | 0.85 | 0.85 | 0.90 | 0.86 | 0.83 | 0.80 |
Specificity | 0.80 | 0.82 | 0.86 | 0.91 | 0.85 | 0.86 | 0.84 | 0.82 |
False Positive Rate | 0.20 | 0.18 | 0.14 | 0.09 | 0.15 | 0.14 | 0.16 | 0.18 |
False Negative Rate | 0.22 | 0.16 | 0.15 | 0.15 | 0.10 | 0.14 | 0.17 | 0.20 |
Positive Predictive Power | 0.20 | 0.28 | 0.32 | 0.43 | 0.32 | 0.34 | 0.35 | 0.30 |
Negative Predictive Power | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 |
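Each statistic in the table above is a simple function of the 2×2 classification counts (a–d) reported in the Evidence table. As a sanity check, the Grade 1 Fall column can be recomputed from its reported counts — a minimal sketch for verification, not part of the published analysis:

```python
# Recompute screening statistics from a 2x2 classification table.
# Counts are the Grade 1 Fall values reported above: a = true positives,
# b = false positives, c = false negatives, d = true negatives.
a, b, c, d = 630, 2451, 179, 9841
n = a + b + c + d                 # 13101, matching the reported sample size

base_rate = (a + c) / n           # proportion of students truly at risk
overall = (a + d) / n             # overall classification rate
sensitivity = a / (a + c)         # at-risk students correctly flagged
specificity = d / (b + d)         # not-at-risk students correctly passed
ppv = a / (a + b)                 # positive predictive power
npv = d / (c + d)                 # negative predictive power
```

Rounded to two decimals, these reproduce the Grade 1 column (sensitivity 0.78, specificity 0.80, PPV 0.20, NPV 0.98).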
Sample | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|
Date | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 |
Sample Size | 13101 | 13650 | 13569 | 21838 | 23079 | 22987 | 23087 | 15889 |
Geographic Representation | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) |
Male | 48.3% | 50.0% | 50.1% | 52.6% | 50.6% | 50.5% | 50.8% | 50.7% |
Female | 51.7% | 48.6% | 49.9% | 52.0% | 49.2% | 49.5% | 49.2% | 49.3% |
Other | ||||||||
Gender Unknown | ||||||||
White, Non-Hispanic | 25.8% | 25.9% | 23.8% | 25.3% | 23.6% | 22.3% | 22.6% | 23.7% |
Black, Non-Hispanic | 8.4% | 7.2% | 7.6% | 6.9% | 6.4% | 9.4% | 10.2% | 9.1% |
Hispanic | 43.8% | 42.4% | 45.3% | 52.6% | 50.0% | 49.9% | 52.3% | 51.6% |
Asian/Pacific Islander | 13.1% | 13.7% | 12.3% | 12.6% | 10.5% | 10.9% | 9.2% | 8.4% |
American Indian/Alaska Native | 0.2% | 0.2% | 0.2% | 0.2% | 0.2% | 0.2% | 0.2% | 0.2% |
Other | 7.3% | 5.0% | 6.1% | 3.2% | 5.9% | 5.2% | 3.2% | 4.1% |
Race / Ethnicity Unknown | 1.5% | 4.2% | 4.7% | 3.8% | 3.2% | 2.1% | 2.3% | 2.9% |
Low SES | ||||||||
IEP or diagnosed disability | ||||||||
English Language Learner |
Classification Accuracy - Winter
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Criterion measure | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics |
Cut Points - Percentile rank on criterion measure | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Cut Points - Performance score on criterion measure | 2292 | 2292 | 2292 | 2292 | 2334 | 2351 | 2348 | 2362 | 2360 |
Cut Points - Corresponding performance score (numeric) on screener measure | .35 | .9 | 1.9 | 2.6 | 3.5 | 4 | 4.4 | 5 | 5.5 |
Classification Data - True Positive (a) | 181 | 496 | 625 | 594 | 611 | 583 | 575 | 467 | 451 |
Classification Data - False Positive (b) | 775 | 1799 | 1813 | 1618 | 1653 | 1689 | 1693 | 1535 | 1478 |
Classification Data - False Negative (c) | 90 | 138 | 149 | 138 | 152 | 149 | 150 | 125 | 97 |
Classification Data - True Negative (d) | 2220 | 6301 | 6306 | 6029 | 6078 | 6091 | 6138 | 6296 | 6102 |
Area Under the Curve (AUC) | 0.78 | 0.82 | 0.83 | 0.83 | 0.82 | 0.82 | 0.82 | 0.80 | 0.80 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.77 | 0.81 | 0.83 | 0.82 | 0.80 | 0.80 | 0.81 | 0.79 | 0.79 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.79 | 0.83 | 0.83 | 0.84 | 0.84 | 0.84 | 0.83 | 0.81 | 0.82 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Base Rate | 0.08 | 0.07 | 0.09 | 0.09 | 0.09 | 0.09 | 0.08 | 0.07 | 0.07 |
Overall Classification Rate | 0.74 | 0.78 | 0.78 | 0.79 | 0.79 | 0.78 | 0.78 | 0.80 | 0.81 |
Sensitivity | 0.67 | 0.78 | 0.81 | 0.81 | 0.80 | 0.80 | 0.79 | 0.79 | 0.82 |
Specificity | 0.74 | 0.78 | 0.78 | 0.79 | 0.79 | 0.78 | 0.78 | 0.80 | 0.81 |
False Positive Rate | 0.26 | 0.22 | 0.22 | 0.21 | 0.21 | 0.22 | 0.22 | 0.20 | 0.19 |
False Negative Rate | 0.33 | 0.22 | 0.19 | 0.19 | 0.20 | 0.20 | 0.21 | 0.21 | 0.18 |
Positive Predictive Power | 0.19 | 0.22 | 0.26 | 0.27 | 0.27 | 0.26 | 0.25 | 0.23 | 0.23 |
Negative Predictive Power | 0.96 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 | 0.98 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Date | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 |
Sample Size | 3266 | 8734 | 8893 | 8379 | 8494 | 8512 | 8556 | 8423 | 8128 |
Geographic Representation | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) |
Male | 51.9% | 49.2% | 50.2% | 50.2% | 50.7% | 50.2% | 51.1% | 49.7% | 50.2% |
Female | 48.1% | 50.6% | 49.8% | 49.8% | 49.3% | 49.8% | 48.9% | 50.3% | 49.8% |
Other | |||||||||
Gender Unknown | |||||||||
White, Non-Hispanic | 16.2% | 24.6% | 28.1% | 23.0% | 25.2% | 24.9% | 23.8% | 22.4% | 23.7% |
Black, Non-Hispanic | 7.9% | 8.6% | 6.9% | 7.5% | 6.8% | 6.1% | 8.1% | 9.3% | 8.9% |
Hispanic | 51.9% | 44.6% | 42.0% | 44.1% | 48.1% | 49.9% | 47.1% | 50.1% | 51.2% |
Asian/Pacific Islander | 17.4% | 14.1% | 14.2% | 14.5% | 12.8% | 11.6% | 13.0% | 12.9% | 9.8% |
American Indian/Alaska Native | 0.3% | 0.5% | 0.4% | 0.3% | 0.1% | 0.2% | 0.5% | 0.2% | 0.3% |
Other | 5.1% | 6.2% | 5.1% | 6.0% | 3.5% | 4.6% | 5.4% | 3.1% | 4.0% |
Race / Ethnicity Unknown | 1.2% | 1.2% | 3.3% | 4.6% | 3.5% | 2.7% | 2.1% | 2.0% | 2.1% |
Low SES | |||||||||
IEP or diagnosed disability | |||||||||
English Language Learner |
Classification Accuracy - Spring
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Criterion measure | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics |
Cut Points - Percentile rank on criterion measure | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Cut Points - Performance score on criterion measure | 2292 | 2334 | 2292 | 2292 | 2334 | 2351 | 2348 | 2362 | 2360 |
Cut Points - Corresponding performance score (numeric) on screener measure | .6 | 1.2 | 2.3 | 3 | 3.7 | 4.2 | 4.6 | 5.2 | 5.8 |
Classification Data - True Positive (a) | 540 | 630 | 899 | 913 | 1524 | 1570 | 1481 | 1787 | 1121 |
Classification Data - False Positive (b) | 2216 | 2401 | 2191 | 1765 | 1780 | 2702 | 2918 | 3370 | 2450 |
Classification Data - False Negative (c) | 235 | 179 | 172 | 146 | 228 | 225 | 262 | 372 | 311 |
Classification Data - True Negative (d) | 6413 | 9791 | 10136 | 10754 | 19066 | 19451 | 18128 | 17554 | 12476 |
Area Under the Curve (AUC) | 0.79 | 0.85 | 0.84 | 0.91 | 0.92 | 0.91 | 0.90 | 0.90 | 0.89 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.78 | 0.83 | 0.83 | 0.90 | 0.91 | 0.90 | 0.89 | 0.88 | 0.88 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.80 | 0.86 | 0.85 | 0.92 | 0.93 | 0.92 | 0.91 | 0.92 | 0.90 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Base Rate | 0.08 | 0.06 | 0.08 | 0.08 | 0.08 | 0.07 | 0.08 | 0.09 | 0.09 |
Overall Classification Rate | 0.74 | 0.80 | 0.82 | 0.86 | 0.91 | 0.88 | 0.86 | 0.84 | 0.83 |
Sensitivity | 0.70 | 0.78 | 0.84 | 0.86 | 0.87 | 0.87 | 0.85 | 0.83 | 0.78 |
Specificity | 0.74 | 0.80 | 0.82 | 0.86 | 0.91 | 0.88 | 0.86 | 0.84 | 0.84 |
False Positive Rate | 0.26 | 0.20 | 0.18 | 0.14 | 0.09 | 0.12 | 0.14 | 0.16 | 0.16 |
False Negative Rate | 0.30 | 0.22 | 0.16 | 0.14 | 0.13 | 0.13 | 0.15 | 0.17 | 0.22 |
Positive Predictive Power | 0.20 | 0.21 | 0.29 | 0.34 | 0.46 | 0.37 | 0.34 | 0.35 | 0.31 |
Negative Predictive Power | 0.96 | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Date | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 |
Sample Size | 9404 | 13001 | 13398 | 13578 | 22598 | 23948 | 22789 | 23083 | 16358 |
Geographic Representation | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) |
Male | 49.1% | 49.0% | 50.7% | 51.8% | 50.2% | 50.4% | 49.9% | 51.2% | 50.3% |
Female | 50.9% | 51.0% | 48.5% | 48.2% | 49.8% | 49.6% | 50.1% | 48.8% | 49.7% |
Other | |||||||||
Gender Unknown | |||||||||
White, Non-Hispanic | 18.8% | 24.1% | 25.6% | 29.9% | 24.3% | 24.8% | 23.9% | 21.9% | 24.8% |
Black, Non-Hispanic | 9.1% | 9.3% | 8.2% | 6.4% | 7.0% | 5.9% | 8.3% | 10.2% | 9.2% |
Hispanic | 50.2% | 44.8% | 41.7% | 44.0% | 49.9% | 49.1% | 49.0% | 51.9% | 50.1% |
Asian/Pacific Islander | 15.5% | 12.8% | 15.8% | 10.2% | 11.2% | 11.1% | 11.7% | 10.2% | 7.9% |
American Indian/Alaska Native | 0.4% | 0.3% | 0.2% | 0.3% | 0.2% | 0.2% | 0.3% | 0.3% | 0.4% |
Other | 5.0% | 7.5% | 4.1% | 5.5% | 3.9% | 4.7% | 4.7% | 3.5% | 4.8% |
Race / Ethnicity Unknown | 1.0% | 1.2% | 3.7% | 3.7% | 3.5% | 4.2% | 2.1% | 2.0% | 2.8% |
Low SES | |||||||||
IEP or diagnosed disability | |||||||||
English Language Learner |
Cross-Validation - Fall
Evidence | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|
Criterion measure | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics |
Cut Points - Percentile rank on criterion measure | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Cut Points - Performance score on criterion measure | 2324 | 2324 | 2324 | 2362 | 2373 | 2367 | 2374 | 2375 |
Cut Points - Corresponding performance score (numeric) on screener measure | .5 | 1.4 | 2.2 | 3.1 | 3.7 | 4.1 | 4.6 | 5.1 |
Classification Data - True Positive (a) | 122 | 183 | 160 | 292 | 304 | 310 | 350 | 230 |
Classification Data - False Positive (b) | 494 | 450 | 355 | 380 | 656 | 613 | 683 | 518 |
Classification Data - False Negative (c) | 37 | 36 | 33 | 55 | 35 | 57 | 78 | 55 |
Classification Data - True Negative (d) | 1967 | 2021 | 2165 | 3640 | 3610 | 3617 | 3506 | 2374 |
Area Under the Curve (AUC) | 0.83 | 0.82 | 0.89 | 0.90 | 0.91 | 0.89 | 0.89 | 0.86 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.82 | 0.81 | 0.88 | 0.90 | 0.90 | 0.88 | 0.88 | 0.85 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.84 | 0.83 | 0.89 | 0.91 | 0.92 | 0.90 | 0.90 | 0.87 |
Statistics | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|
Base Rate | 0.06 | 0.08 | 0.07 | 0.08 | 0.07 | 0.08 | 0.09 | 0.09 |
Overall Classification Rate | 0.80 | 0.82 | 0.86 | 0.90 | 0.85 | 0.85 | 0.84 | 0.82 |
Sensitivity | 0.77 | 0.84 | 0.83 | 0.84 | 0.90 | 0.84 | 0.82 | 0.81 |
Specificity | 0.80 | 0.82 | 0.86 | 0.91 | 0.85 | 0.86 | 0.84 | 0.82 |
False Positive Rate | 0.20 | 0.18 | 0.14 | 0.09 | 0.15 | 0.14 | 0.16 | 0.18 |
False Negative Rate | 0.23 | 0.16 | 0.17 | 0.16 | 0.10 | 0.16 | 0.18 | 0.19 |
Positive Predictive Power | 0.20 | 0.29 | 0.31 | 0.43 | 0.32 | 0.34 | 0.34 | 0.31 |
Negative Predictive Power | 0.98 | 0.98 | 0.98 | 0.99 | 0.99 | 0.98 | 0.98 | 0.98 |
Sample | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|
Date | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 |
Sample Size | 2620 | 2690 | 2713 | 4367 | 4605 | 4597 | 4617 | 3177 |
Geographic Representation | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) |
Male | 48.1% | 50.1% | 49.9% | 52.9% | 50.1% | 51.1% | 49.9% | 50.3% |
Female | 51.9% | 49.9% | 50.1% | 51.7% | 49.9% | 48.9% | 50.1% | 49.7% |
Other | ||||||||
Gender Unknown | ||||||||
White, Non-Hispanic | 25.6% | 26.1% | 23.7% | 26.2% | 23.1% | 23.3% | 21.9% | 26.1% |
Black, Non-Hispanic | 8.2% | 7.9% | 7.4% | 7.1% | 6.3% | 9.6% | 10.1% | 8.1% |
Hispanic | 43.6% | 42.4% | 46.0% | 52.4% | 50.3% | 48.9% | 52.1% | 49.7% |
Asian/Pacific Islander | 13.3% | 14.0% | 11.2% | 12.3% | 10.7% | 10.7% | 9.8% | 8.0% |
American Indian/Alaska Native | 0.2% | 0.4% | 0.3% | 0.3% | 0.3% | 0.3% | 0.3% | 0.6% |
Other | 7.4% | 4.9% | 6.8% | 3.1% | 6.1% | 5.3% | 3.3% | 4.6% |
Race / Ethnicity Unknown | 1.7% | 4.3% | 4.6% | 3.1% | 3.2% | 1.9% | 2.5% | 2.9% |
Low SES | ||||||||
IEP or diagnosed disability | ||||||||
English Language Learner |
Cross-Validation - Winter
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Criterion measure | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics |
Cut Points - Percentile rank on criterion measure | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Cut Points - Performance score on criterion measure | 2292 | 2292 | 2292 | 2292 | 2334 | 2351 | 2348 | 2362 | 2360 |
Cut Points - Corresponding performance score (numeric) on screener measure | .35 | .9 | 1.9 | 2.6 | 3.5 | 4 | 4.4 | 5 | 5.5 |
Classification Data - True Positive (a) | 35 | 97 | 130 | 115 | 124 | 120 | 112 | 90 | 87 |
Classification Data - False Positive (b) | 151 | 365 | 353 | 325 | 332 | 340 | 342 | 315 | 298 |
Classification Data - False Negative (c) | 17 | 30 | 32 | 30 | 39 | 27 | 32 | 28 | 22 |
Classification Data - True Negative (d) | 430 | 1250 | 1263 | 1205 | 1203 | 1215 | 1225 | 1251 | 1218 |
Area Under the Curve (AUC) | 0.76 | 0.80 | 0.82 | 0.82 | 0.81 | 0.81 | 0.81 | 0.80 | 0.80 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.75 | 0.79 | 0.82 | 0.81 | 0.80 | 0.80 | 0.80 | 0.78 | 0.79 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.77 | 0.81 | 0.83 | 0.83 | 0.82 | 0.82 | 0.81 | 0.81 | 0.82 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Base Rate | 0.08 | 0.07 | 0.09 | 0.09 | 0.10 | 0.09 | 0.08 | 0.07 | 0.07 |
Overall Classification Rate | 0.73 | 0.77 | 0.78 | 0.79 | 0.78 | 0.78 | 0.78 | 0.80 | 0.80 |
Sensitivity | 0.67 | 0.76 | 0.80 | 0.79 | 0.76 | 0.82 | 0.78 | 0.76 | 0.80 |
Specificity | 0.74 | 0.77 | 0.78 | 0.79 | 0.78 | 0.78 | 0.78 | 0.80 | 0.80 |
False Positive Rate | 0.26 | 0.23 | 0.22 | 0.21 | 0.22 | 0.22 | 0.22 | 0.20 | 0.20 |
False Negative Rate | 0.33 | 0.24 | 0.20 | 0.21 | 0.24 | 0.18 | 0.22 | 0.24 | 0.20 |
Positive Predictive Power | 0.19 | 0.21 | 0.27 | 0.26 | 0.27 | 0.26 | 0.25 | 0.22 | 0.23 |
Negative Predictive Power | 0.96 | 0.98 | 0.98 | 0.98 | 0.97 | 0.98 | 0.97 | 0.98 | 0.98 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Date | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 |
Sample Size | 633 | 1742 | 1778 | 1675 | 1698 | 1702 | 1711 | 1684 | 1625 |
Geographic Representation | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) |
Male | 53.1% | 49.9% | 50.8% | 49.9% | 50.1% | 50.1% | 51.5% | 49.6% | 50.1% |
Female | 50.1% | 50.1% | 49.2% | 50.1% | 49.9% | 49.9% | 48.5% | 50.4% | 49.9% |
Other | |||||||||
Gender Unknown | |||||||||
White, Non-Hispanic | 16.4% | 23.9% | 28.9% | 23.9% | 24.8% | 25.1% | 22.9% | 22.4% | 23.7% |
Black, Non-Hispanic | 8.1% | 8.4% | 7.1% | 7.2% | 6.7% | 6.3% | 8.0% | 9.3% | 8.9% |
Hispanic | 53.4% | 43.9% | 41.2% | 43.0% | 47.1% | 50.1% | 46.9% | 50.1% | 51.2% |
Asian/Pacific Islander | 18.6% | 15.1% | 14.2% | 20.5% | 12.9% | 11.1% | 12.9% | 12.9% | 9.8% |
American Indian/Alaska Native | 0.5% | 0.6% | 0.5% | 0.6% | 0.3% | 0.3% | 0.7% | 0.2% | 0.3% |
Other | 5.1% | 6.5% | 5.2% | 5.9% | 4.6% | 4.5% | 5.5% | 3.1% | 4.0% |
Race / Ethnicity Unknown | 1.1% | 1.6% | 2.9% | 4.9% | 3.6% | 2.6% | 3.1% | 2.0% | 2.1% |
Low SES | |||||||||
IEP or diagnosed disability | |||||||||
English Language Learner |
Cross-Validation - Spring
Evidence | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Criterion measure | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics | Smarter Balanced Assessment (SBA) Mathematics |
Cut Points - Percentile rank on criterion measure | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Cut Points - Performance score on criterion measure | 2292 | 2292 | 2292 | 2292 | 2334 | 2351 | 2348 | 2362 | 2360 |
Cut Points - Corresponding performance score (numeric) on screener measure | .6 | 1.2 | 2.3 | 3 | 3.7 | 4.2 | 4.6 | 5.2 | 5.8 |
Classification Data - True Positive (a) | 106 | 124 | 177 | 179 | 303 | 316 | 398 | 352 | 227 |
Classification Data - False Positive (b) | 447 | 483 | 441 | 355 | 359 | 542 | 581 | 670 | 498 |
Classification Data - False Negative (c) | 49 | 32 | 36 | 30 | 43 | 44 | 55 | 72 | 60 |
Classification Data - True Negative (d) | 1278 | 1961 | 2005 | 2151 | 3812 | 3887 | 3623 | 3522 | 2486 |
Area Under the Curve (AUC) | 0.78 | 0.83 | 0.83 | 0.89 | 0.89 | 0.90 | 0.89 | 0.88 | 0.88 |
AUC Estimate’s 95% Confidence Interval: Lower Bound | 0.77 | 0.82 | 0.82 | 0.88 | 0.88 | 0.89 | 0.88 | 0.88 | 0.86 |
AUC Estimate’s 95% Confidence Interval: Upper Bound | 0.79 | 0.84 | 0.84 | 0.89 | 0.89 | 0.91 | 0.89 | 0.88 | 0.89 |
Statistics | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Base Rate | 0.08 | 0.06 | 0.08 | 0.08 | 0.08 | 0.08 | 0.10 | 0.09 | 0.09 |
Overall Classification Rate | 0.74 | 0.80 | 0.82 | 0.86 | 0.91 | 0.88 | 0.86 | 0.84 | 0.83 |
Sensitivity | 0.68 | 0.79 | 0.83 | 0.86 | 0.88 | 0.88 | 0.88 | 0.83 | 0.79 |
Specificity | 0.74 | 0.80 | 0.82 | 0.86 | 0.91 | 0.88 | 0.86 | 0.84 | 0.83 |
False Positive Rate | 0.26 | 0.20 | 0.18 | 0.14 | 0.09 | 0.12 | 0.14 | 0.16 | 0.17 |
False Negative Rate | 0.32 | 0.21 | 0.17 | 0.14 | 0.12 | 0.12 | 0.12 | 0.17 | 0.21 |
Positive Predictive Power | 0.19 | 0.20 | 0.29 | 0.34 | 0.46 | 0.37 | 0.41 | 0.34 | 0.31 |
Negative Predictive Power | 0.96 | 0.98 | 0.98 | 0.99 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 |
Sample | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Date | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 | Spring 2019 |
Sample Size | 1880 | 2600 | 2659 | 2715 | 4517 | 4789 | 4657 | 4616 | 3271 |
Geographic Representation | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) | East North Central (MI) New England (CT) Pacific (CA, WA) |
Male | 52.1% | 49.4% | 49.8% | 50.1% | 50.2% | 50.3% | 50.4% | 49.7% | 49.9% |
Female | 47.9% | 50.6% | 50.2% | 49.9% | 49.8% | 49.7% | 47.5% | 50.3% | 50.1% |
Other | |||||||||
Gender Unknown | |||||||||
White, Non-Hispanic | 16.0% | 23.9% | 28.6% | 22.5% | 24.9% | 25.2% | 23.5% | 23.2% | 23.9% |
Black, Non-Hispanic | 7.9% | 8.7% | 7.1% | 7.6% | 6.8% | 6.2% | 8.0% | 9.0% | 9.1% |
Hispanic | 52.9% | 43.9% | 41.1% | 44.4% | 48.3% | 48.8% | 46.3% | 51.0% | 50.1% |
Asian/Pacific Islander | 17.0% | 15.2% | 14.6% | 14.8% | 12.3% | 11.5% | 12.8% | 12.4% | 9.8% |
American Indian/Alaska Native | 0.4% | 0.7% | 0.5% | 0.5% | 0.3% | 0.3% | 0.4% | 0.3% | 0.4% |
Other | 4.8% | 6.3% | 5.2% | 6.1% | 3.5% | 4.9% | 5.0% | 2.1% | 2.7% |
Race / Ethnicity Unknown | 1.0% | 1.3% | 2.9% | 4.1% | 3.9% | 3.1% | 1.9% | 2.0% | 4.0% |
Low SES | |||||||||
IEP or diagnosed disability | |||||||||
English Language Learner |
Reliability
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Rating | (rating icons not reproduced in this format) |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- The Adaptive Diagnostic Assessment of Mathematics (ADAM) is a criterion-referenced, adaptive assessment designed to measure student mastery of mathematical concepts across multiple sub-tests. Marginal Reliability (Item Response Theory-Based, Applied to the Total Math Score): - Evaluates the precision of student ability estimates using ADAM’s total math score, rather than internal consistency of test items. - Since ADAM’s total math score is a weighted composite of multiple sub-tests (e.g., Fractions, Geometry, Algebraic Thinking), it already accounts for the instructional importance of each math domain at different grade levels. - Instead of calculating marginal reliability separately for each sub-test, it is computed directly on the total math score, ensuring that the final reliability estimate accurately reflects the overall assessment structure. - This approach prevents redundant calculations and ensures that the reliability estimate properly accounts for differences in skill emphasis across grades.
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- Marginal Reliability Sample Size: The full ADAM test dataset, typically including thousands of student responses across all sub-tests. Characteristics: Students across multiple grade levels, ensuring broad applicability. Each student’s total math score was analyzed, rather than separate sub-test reliability scores, to reflect how different mathematical domains contribute at different grade levels. The dataset reflects the adaptive nature of ADAM, where students received different sets of items based on their performance, and marginal reliability was calculated to ensure precise ability estimates for overall mathematical proficiency.
- *Describe the analysis procedures for each reported type of reliability.
- Marginal Reliability Analysis (Applied to the Total Math Score): - Ability estimates (θ) were calculated for each student’s total weighted math score, rather than for each individual sub-test. - The Standard Error of Measurement (SEM) was computed for each student’s ability estimate, reflecting the precision of measurement. - The variance of ability estimates across all students was calculated to determine how much scores naturally vary in the population. - Marginal reliability was computed using the formula: R_marginal = 1 − Mean(SEM²) / Var(θ). - By applying this calculation to the total math score, the reliability estimate reflects the combined contribution of all mathematical domains, weighted by their instructional importance at each grade level.
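The marginal reliability computation described above can be illustrated with a small simulation — a sketch using made-up ability estimates and standard errors, not ADAM's actual scoring data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical IRT ability estimates (theta) and per-student standard errors
# of measurement; in practice these would come from scoring ADAM's total
# weighted math score for each student.
theta = rng.normal(loc=0.0, scale=1.0, size=5000)
sem = rng.uniform(0.25, 0.40, size=5000)

# R_marginal = 1 - Mean(SEM^2) / Var(theta):
# the share of observed score variance not attributable to measurement error.
marginal_reliability = 1.0 - np.mean(sem**2) / np.var(theta)
```

With standard errors that are small relative to the spread of ability estimates, the resulting coefficient approaches 1; larger SEMs drive it down.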
*In the table(s) below, report the results of the reliability analyses described above (e.g., internal consistency or inter-rater reliability coefficients).
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- Yes
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of Reliability | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Marginal reliability across K-8: White, n = 1,834, r = .91; Hispanic, n = 2,564, r = .90; Black, n = 697, r = .88; Asian, n = 678, r = .88.
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8 |
---|---|---|---|---|---|---|---|---|---|
Rating | (rating icons not reproduced in this format) |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- The validity of a tool like ADAM is supported by how well it measures what it claims to measure and how effectively it informs decision-making. ADAM’s design centers on assessing specific skills and concepts aligned with modern math content standards, ensuring it is both relevant and precise. Its adaptive nature allows students to be assessed across a broad spectrum of abilities, irrespective of their enrolled grade level, so ADAM captures each student’s current mastery without being constrained by predefined grade boundaries. ADAM’s structure, with 44 criterion-referenced sub-tests, further enhances its validity: these sub-tests directly assess mastery of specific skills and concepts rather than predicting mastery from proxies. This approach, combined with ADAM’s granular organization, provides precise and actionable results. The vertically aligned scoring system ensures stability and continuity across assessments, eliminating floor and ceiling effects, so ADAM can assess growth and performance accurately over time. Evidence of the tool’s validity is further demonstrated through external correlation: ADAM’s total grade-equivalency score and the total scaled score of the Smarter Balanced Assessment (SBA) show a Pearson correlation coefficient of 0.87, indicating a strong relationship between the two measures. This correlation underscores ADAM’s appropriateness as a diagnostic tool, as it aligns well with other established measures of student performance in mathematics. Somewhat surprisingly, because ADAM has no floor or ceiling when given to students in grades 1 through 7 (drawing on a K-8 item range), one might expect correlations to dip, since the SBA focuses on grade-level items. Nonetheless, correlations remain high, indicating that the overlap in instructional content covered by the two assessments outweighs the diagnostic-versus-accountability difference between them.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- For the concurrent validity analyses, the sample consisted of approximately 141,000 students who completed both the screening measure and the criterion measure during the Spring testing window. This sample included students across all performance levels, ensuring comprehensive representation. For predictive validity analyses in grades K-2, with the Smarter Balanced Assessment (SBA) used as the criterion measure, the sample included students who completed the screening measure 12 to 36 months prior to the administration of the criterion measure. The sample size was approximately 26,000 students, encompassing a diverse range of performance levels. For predictive validity analyses in grades 3-8, the sample included students who completed the screening measure in the Fall testing window and the criterion measure in the subsequent Spring. The sample size was approximately 121,000 students, with all performance levels represented.
- *Describe the analysis procedures for each reported type of validity.
- For both concurrent and predictive validity analyses, Pearson correlation coefficients were calculated to examine the relationship between the screening measure and the criterion measure. The correlations ranged from 0.75 to 0.87, and 95% confidence intervals were computed around the Pearson r coefficients to ensure statistical precision. For grades K-2, Spring screening scores were correlated with criterion scores from subsequent academic years. Specifically, screening scores in grades K, 1, and 2 were correlated with criterion scores in grade 3, resulting in Pearson correlation coefficients of 0.75, 0.76, and 0.78, respectively. This analysis demonstrates the predictive strength of the screening measure over time. For grades 3-8, predictive validity was assessed by correlating screening scores from the Fall with criterion scores from the Spring of the same academic year. Concurrent validity was analyzed by correlating Spring screening scores with Spring criterion scores from the same year.
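Confidence intervals around Pearson correlations are conventionally obtained via the Fisher z-transformation; the exact procedure used in these analyses is not documented, so the following is an illustrative sketch under that assumption:

```python
import math

def pearson_ci(r: float, n: int, z_crit: float = 1.96) -> tuple[float, float]:
    """Approximate 95% confidence interval for a Pearson r
    using Fisher's z-transformation."""
    z = math.atanh(r)                # map r to the z scale
    se = 1.0 / math.sqrt(n - 3)      # standard error on the z scale
    lo = math.tanh(z - z_crit * se)  # back-transform bounds to the r scale
    hi = math.tanh(z + z_crit * se)
    return lo, hi

# With r = 0.87 and the ~141,000-student concurrent sample,
# the interval is extremely tight around the point estimate.
lo, hi = pearson_ci(0.87, 141000)
```

The large sample sizes reported above are why the confidence intervals add little uncertainty: the standard error shrinks with the square root of n.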
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
---|---|---|---|---|---|---|---|---
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- Predictive validity coefficients for grades K-2 were positive and significant, ranging from 0.75 to 0.78. These correlations reflect the relationship between screening measures administered 12 to 36 months prior and subsequent criterion measures. While slightly lower than coefficients from measures taken closer in time, they still exceed the minimum threshold of 0.60, indicating a positive relationship between ADAM and high-stakes statewide assessments, even over an extended period. Concurrent and predictive validity coefficients for grades 3-8 generally exceeded 0.80, demonstrating a strong alignment between ADAM and criterion state summative assessments in the same subject and grade.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
Type of Validity | Subgroup | Informant | Age / Grade | Test or Criterion | n | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
---|---|---|---|---|---|---|---|---
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published validity studies:
- No
- Provide citations for additional published studies.
Bias Analysis
Grade | Kindergarten | Grade 1 | Grade 2 | Grade 3 | Grade 4 | Grade 5 | Grade 6 | Grade 7 | Grade 8
---|---|---|---|---|---|---|---|---|---
Rating | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided | Not Provided
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- Yes
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- Analysis Method: We used Differential Item Functioning (DIF) analysis to check for bias. This involves comparing how different groups (e.g., males vs. females) respond to the same questions in ADAM after students are matched on overall performance. We look for any questions that one group finds consistently easier or harder than the other, which might suggest bias.
- b. Describe the subgroups for which bias analyses were conducted:
- Gender (Female vs. Male) Race/Ethnicity (Black or African American and Latino vs. Caucasian) Language ability (English Learners vs. non–English Learners) Educational needs (Students with Disabilities vs. General Education students) Socioeconomic status (Economically Disadvantaged vs. Not Economically Disadvantaged)
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
- Our analysis used a random sample of students to estimate the difficulty level of each item separately for each subgroup. We used statistical software tailored for DIF analysis in educational assessments to ensure precise results. The analysis identifies whether any particular question consistently shows a different difficulty level for one group after matching on overall performance, which would indicate bias.
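The vendor does not specify its DIF statistic or software, but a common procedure for dichotomously scored items is Mantel-Haenszel: stratify examinees by total score, build a 2x2 (group x correct) table per stratum, pool an odds ratio, and map it to the ETS delta scale, where |delta| >= 1.5 is the usual flag for large DIF. An illustrative sketch under those assumptions:

```python
import math
import random
from collections import defaultdict

def mh_dif(responses, groups, item):
    """Mantel-Haenszel DIF for one item.

    responses: list of dicts mapping item name -> 0/1 score
    groups:    parallel list of 'ref' / 'focal' labels
    Returns (pooled odds ratio, ETS delta), where delta = -2.35 * ln(OR).
    """
    # stratum (total score) -> [[ref right, ref wrong], [focal right, focal wrong]]
    strata = defaultdict(lambda: [[0, 0], [0, 0]])
    for resp, g in zip(responses, groups):
        row = 0 if g == "ref" else 1
        col = 0 if resp[item] == 1 else 1
        strata[sum(resp.values())][row][col] += 1
    num = den = 0.0
    for (a, b), (c, d) in strata.values():
        n = a + b + c + d
        if n:
            num += a * d / n  # ref-right * focal-wrong
            den += b * c / n  # ref-wrong * focal-right
    odds = num / den
    return odds, -2.35 * math.log(odds)

# Simulated responses with no built-in group difference (illustrative only)
random.seed(1)
items = ["q1", "q2", "q3", "q4", "q5"]
responses, groups = [], []
for i in range(2000):
    ability = random.gauss(0, 1)
    responses.append({q: int(random.gauss(ability, 1) > 0) for q in items})
    groups.append("ref" if i % 2 == 0 else "focal")
odds, delta = mh_dif(responses, groups, "q1")
```

Because the simulated groups are exchangeable, delta should land near zero, well inside the ETS "negligible DIF" band.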
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.