iFAB (Individualized Formative Assessment of Behavior)
DBR-MIS: Academic Engagement

Summary

Individualized Formative Assessment of Behavior (iFAB) is a flexible, feasible, and psychometrically sound mobile-enabled, web-based system for behavioral progress monitoring of elementary students in grades K-3. Using iFAB, teachers or other school staff can generate assessment plans, enter data, record events (e.g., a change of intervention plan), and review auto-generated time series charts of student progress. Developed from the ground up for the purpose of formative behavior assessment, iFAB can assess both academic enablers (i.e., Academic Engagement, Organization Skills, Interpersonal Skills) and problem behaviors that commonly occur in school settings (Disruptive, Oppositional, Interpersonal Conflict, Anxious-Depressed, and Social Withdrawal). The system offers considerable flexibility in that ratings may be completed daily (direct behavior rating method) or weekly (formative behavior rating method). Users also have the choice of rating student behavior using single-item or multi-item scales. This submission applies to the DBR rating method, the multi-item scale, and the Academic Engagement behavior construct.

Where to Obtain:
Self
r.volpe@neu.edu
617-702-6220
https://www.ifabonline.com/
Initial Cost:
Free
Replacement Cost:
Free
Included in Cost:
At this point, we are making the tools available at no cost.
NA
Training Requirements:
Training not required
Qualified Administrators:
No minimum qualifications specified.
Access to Technical Support:
Assessment Format:
  • Rating scale
Scoring Time:
  • Scoring is automatic (0 additional minutes)
Scores Generated:
  • Raw score
  • Composite scores
Administration Time:
  • 3 minutes per student
Scoring Method:
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection

Tool Information

Descriptive Information

Please provide a description of your tool:
Individualized Formative Assessment of Behavior (iFAB) is a flexible, feasible, and psychometrically sound mobile-enabled, web-based system for behavioral progress monitoring of elementary students in grades K-3. Using iFAB, teachers or other school staff can generate assessment plans, enter data, record events (e.g., a change of intervention plan), and review auto-generated time series charts of student progress. Developed from the ground up for the purpose of formative behavior assessment, iFAB can assess both academic enablers (i.e., Academic Engagement, Organization Skills, Interpersonal Skills) and problem behaviors that commonly occur in school settings (Disruptive, Oppositional, Interpersonal Conflict, Anxious-Depressed, and Social Withdrawal). The system offers considerable flexibility in that ratings may be completed daily (direct behavior rating method) or weekly (formative behavior rating method). Users also have the choice of rating student behavior using single-item or multi-item scales. This submission applies to the DBR rating method, the multi-item scale, and the Academic Engagement behavior construct.
Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
not selected
selected
The tool is intended for use with the following grade(s).
not selected Preschool / Pre-kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
not selected Fourth grade
not selected Fifth grade
not selected Sixth grade
not selected Seventh grade
not selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
not selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
not selected 10 years old
not selected 11 years old
not selected 12 years old
not selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What dimensions does the tool assess?

Reading
not selected Global Indicator of Reading Competence
not selected Listening Comprehension
not selected Vocabulary
not selected Phonemic Awareness
not selected Decoding
not selected Passage Reading
not selected Word Identification
not selected Comprehension

Spelling & Written Expression
not selected Global Indicator of Spelling Competence
not selected Global Indicator of Written Expression Competence

Mathematics
not selected Global Indicator of Mathematics Comprehension
not selected Early Numeracy
not selected Mathematics Concepts
not selected Mathematics Computation
not selected Mathematics Application
not selected Fractions
not selected Algebra

Other
Please describe specific domain, skills or subtests:


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
Academic Engagement
BEHAVIOR ONLY: Which category of behaviors does your tool target?
Both

Acquisition and Cost Information

Where to obtain:
Email Address
r.volpe@neu.edu
Address
Phone Number
617-702-6220
Website
https://www.ifabonline.com/
Initial cost for implementing program:
Cost
$0.00
Unit of cost
Currently, we are making the tools available at no cost.
Replacement cost per unit for subsequent use:
Cost
$0.00
Unit of cost
NA
Duration of license
We will renew access on an annual basis.
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
At this point, we are making the tools available at no cost.
Provide information about special accommodations for students with disabilities.
NA

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
selected
selected
not selected
not selected
not selected
selected
If other, please specify:
Teacher completed

BEHAVIOR ONLY: What is the administration format?
not selected
selected
not selected
not selected
not selected
If other, please specify:

BEHAVIOR ONLY: What is the administration setting?
selected
selected
not selected
not selected
not selected
not selected
not selected
If other, please specify:

Does the program require technology?

If yes, what technology is required to implement your program? (Select all that apply)
selected
selected
not selected

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?
not selected
not selected    If small group, n=
not selected    If large group, n=
not selected
selected
If other, please specify:
These are teacher rating forms

What is the administration time?
Time in minutes
3
per (student/group/other unit)
student

Additional scoring time:
Time in minutes
0
per (student/group/other unit)

How many alternate forms are available, if applicable?
Number of alternate forms
4
per (grade/level/unit)
There are 4 alternate methods for each construct.

ACADEMIC ONLY: What are the discontinue rules?
not selected
not selected
not selected
not selected
If other, please specify:

BEHAVIOR ONLY: Can multiple students be rated concurrently by one administrator?

If yes, how many students can be rated concurrently?
This is not known, but the tools were designed for rating a small number of students receiving classroom interventions.

Training & Scoring

Training

Is training for the administrator required?
No
Describe the time required for administrator training, if applicable:
We have purposely evaluated the measures as administered with no training.
Please describe the minimum qualifications an administrator must possess.
We made a conscious choice to design formative behavior assessment tools that required no training. As such, no training was involved in our psychometric studies.
selected No minimum qualifications
Are training manuals and materials available?
No
Are training manuals/materials field-tested?
No
Are training manuals/materials included in cost of tools?
If No, please describe training costs:
NA
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:

Scoring

BEHAVIOR ONLY: What types of scores result from the administration of the assessment?
Score
Observation Behavior Rating
not selected Frequency
not selected Duration
not selected Interval
not selected Latency
selected Raw score
Conversion
Observation Behavior Rating
not selected Rate
not selected Percent
not selected Standard score
selected Subscale/ Subtest
selected Composite
not selected Stanine
not selected Percentile ranks
not selected Normal curve equivalents
not selected IRT based scores
Interpretation
Observation Behavior Rating
not selected Error analysis
not selected Peer comparison
not selected Rate of change
not selected Dev. benchmarks
not selected Age-Grade equivalent
How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
No

What is the basis for calculating performance level and percentile scores?
not selected Age norms
not selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
not selected Percentile score
not selected Grade equivalents
not selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
not selected Developmental benchmarks
not selected Developmental cut points
not selected Equated
not selected Probability
not selected Lexile score
not selected Error analysis
selected Composite scores
not selected Subscale/subtest scores
not selected Other
If other, please specify:

Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
Multi-item scales generate scale composites and also allow users to track scores by item as they tend to overlap with specific target behaviors. Broad composites for academic enablers and problem behaviors also are generated.
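The composite logic described above can be sketched as follows. This is an illustrative assumption, not the published iFAB scoring algorithm: the item values, scale names, and the averaging rule are hypothetical stand-ins to show how item ratings might roll up into scale and broad composites.

```python
import numpy as np

def scale_composite(item_ratings):
    """Average the item ratings on one multi-item scale (assumed rule)."""
    return float(np.mean(item_ratings))

def broad_composite(scale_scores):
    """Average scale composites into a broad domain composite (assumed rule)."""
    return float(np.mean(list(scale_scores.values())))

# One day's hypothetical ratings on multi-item academic enabler scales
scales = {
    "Academic Engagement": scale_composite([5, 4, 5, 4]),
    "Organization Skills": scale_composite([3, 4, 3]),
    "Interpersonal Skills": scale_composite([4, 4, 5]),
}
academic_enablers = broad_composite(scales)  # broad academic enablers composite
```

Tracking scores by item, as the system allows, would simply mean charting the individual item values alongside these composites.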
Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
No
ACADEMIC ONLY: Do you provide benchmarks for the slopes?
ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
What is the basis for calculating slope and percentile scores?
not selected Age norms
not selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Assessment plans can be tailored to each individual student in that teachers/consultants select relevant scales for monitoring. They can decide which of the four methods to employ for progress monitoring. Our psychometric studies were conducted on diverse groups of students in both regular education and special education. We are currently seeking funding to explore the measurement invariance of the iFAB measures.

Levels of Performance and Usability

Are levels of performance specified in your manual or published materials?
No
If yes, specify the levels of performance and how they are used for progress monitoring:

What is the basis for specifying levels of performance?
not selected
not selected
not selected Other
If other, please specify:

If norm-referenced, describe the normative profile.

National representation (check all that apply):
Northeast:
not selected New England
not selected Middle Atlantic
Midwest:
not selected East North Central
not selected West North Central
South:
not selected South Atlantic
not selected East South Central
not selected West South Central
West:
not selected Mountain
not selected Pacific

Local representation (please describe, including number of states)
Date
Size
Gender (Percent)
Male
Female
Unknown
SES indicators (Percent)
Eligible for free or reduced-price lunch
Other SES Indicators
Race/Ethnicity (Percent)
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Asian/Pacific Islander
Other
Unknown
Disability classification (Please describe)


First language (Please describe)


Language proficiency status (Please describe)
If criterion-referenced, describe procedures for specifying levels of performance.

Describe any other procedures for specifying levels of performance.

Has a usability study been conducted on your tool (i.e., a study that examines the extent to which the tool is convenient and practicable for use?)

If yes, please describe, including the results:
We administered the Usage Rating Profile-Assessment (URP-A) to teachers participating in several phases of our psychometric studies. The table below summarizes their ratings of the iFAB tools for two samples: 21 teachers participating in reliability studies, and teachers participating in the treatment sensitivity studies, which involved the implementation of daily behavior report card (DBRC) interventions over several weeks. The URP-A is rated on a 6-point scale ranging from 1 (Strongly Disagree) to 6 (Strongly Agree), with higher scores indicating greater acceptability. Mean item scores in our samples were around 5 (range 4.69 to 5.27), which supports the usability of our tools.

URP-A Subscale | Reliability Sample | DBRC Sample
Feasibility    | 4.69               | 4.89
Understanding  | 4.87               | 5.27
Feasibility    | 5.24               | 5.20


Has a social validity study been conducted on your tool (i.e., a study that examines the significance of goals, appropriateness of procedures (e.g., ethics, cost, practicality), and the importance of treatment effects)?
Yes
If yes, please describe, including the results:
We have published several papers that outline our iterative scale development process. The process involved engagement with our consumer advisory panel, LEA partners, and our scientific advisory panel. Included in our screening of potential items was consideration of whether the items reflected suitable targets for classroom interventions and whether the items represented malleable behaviors. The development of the formative rating measures and the web-based system involved an iterative process that included feedback from teachers using the system. This feedback included ratings on the URP-A and focus groups with a subset of teachers and school administrators.

Performance Level

Reliability

Age / Grade
Informant
Grades K-3
Teacher
Rating Partially convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
Internal consistency: This is a relevant indicator of item homogeneity and item quality for our two multi-item methods (DBR-MIS and FBRM-MIS). It is not specific to formative assessment measures, but is an important consideration for any multi-item scale or test generating a composite score.

Temporal stability and dependability across occasions: These are important to examine in formative behavioral assessment because inconsistency of ratings across measurement occasions is a common source of error. For formative measures, we want to know the extent to which changes in scores can be attributed to change in the target student's behavior as opposed to time-related error.

Inter-rater: Although it is typical to employ a single teacher rater in school-based behavioral progress monitoring (see Chafouleas, Christ, Riley-Tillman, Briesch, & Chanese, 2007), users may be interested in interrater reliability, which indexes the degree to which ratings conducted by different raters during the same observation interval are associated. We calculated intra-class correlations (ICC) for two raters who observed students during the same interval. Each rating summarized teacher observations for a single school day. We report the ICC for a single rater's score, using a two-way mixed-effects model with fixed raters, along with its 95% confidence interval.
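A single-rater, absolute-agreement ICC of the kind discussed above can be computed from a two-way ANOVA decomposition. The sketch below implements the generic textbook ICC(2,1) formula (Shrout & Fleiss conventions); it is a minimal illustration with made-up ratings, not the authors' analysis code.

```python
import numpy as np

def icc_single_rater(X):
    """ICC(2,1): two-way model, absolute agreement, single rater.
    X is an (n subjects x k raters) matrix of ratings."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    grand = X.mean()
    # Mean squares for subjects (rows), raters (columns), and residual
    ms_rows = k * np.sum((X.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((X.mean(axis=0) - grand) ** 2) / (k - 1)
    resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical: two raters scoring the same 5 students on the same day
ratings = np.array([[4, 5], [2, 2], [5, 5], [1, 2], [3, 3]])
icc = icc_single_rater(ratings)
```

A confidence interval for the ICC would come from the F-distribution of the same mean squares; statistical packages report it alongside the point estimate.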
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Internal Consistency A (from Volpe et al., 2019): Participants were general and special education teachers from 35 public school districts across 13 states. Investigators e-mailed principals and school psychologists from partner network schools to provide information regarding the study. Subsequent to administrator approval, interested teachers in participating schools were contacted directly by investigators and provided with a brief description of the purpose and procedures of the study. A total of 307 K-3 teachers each completed ratings for one randomly selected student in their class. Teachers were primarily female (95.8%) with a wide range of teaching experience. The student sample was comprised of 187 (60.9%) males and 120 (39.1%) females, and students were nearly evenly distributed across grade levels. The composition of students by race and ethnicity was as follows: 67.1% White, 13.0% Black, 15.0% Hispanic, 3.3% Asian, 1.0% American Indian/Alaska Native, and 5.2% unknown. Approximately 35% of students were receiving special education services at the time of data collection.

Dependability (from Volpe et al., 2023): Ninety-one K-5 educators (i.e., teachers, paraprofessionals, and intervention specialists) from elementary schools located in the Northeast, Midwest, and West of the United States participated in the study. Educators were mainly female (n = 86) and White (n = 87), and their age ranged from 21 to 62 (M = 39.00, SD = 10.40) years. On average, they had 12.70 (SD = 7.87) years of experience. Each educator rated one student in their classroom. The largest number of participating students were classified as male (n = 61), and their age ranged from 5 to 10 (M = 7.36, SD = 1.39) years. The majority of students were White (n = 60), followed by Black (n = 17), Hispanic (n = 10), and biracial (n = 4). Approximately 45% of the students received special education services (n = 41). Once all student participants were identified, educators were randomly assigned to one of the following four conditions: (a) DBR-SIS, (b) FBRM-SIS, (c) DBR-MIS, or (d) FBRM-MIS. For the multi-item scales (DBR-MIS and FBRM-MIS), we also report internal consistency. Cronbach's alpha was calculated on the first ratings completed by teachers for each student. For DBR-MIS, the first rating occurred after one school day at the beginning of the week (typically, on Monday). For FBRM-MIS, the first rating occurred at the end of one school week (typically, on Friday).

MTMM (test-retest): A total of 68 elementary school teachers were recruited from schools in Boston and the surrounding metro area. The mean age of teachers was 37.82 (range 25-67), and their mean years of teaching experience was 11.49 (range 1 to 40).

Inter-rater: The same pairs of raters completed ratings for students on 3 consecutive school days across all 8 DBR-MIS scales. Primary raters were 16 general and 3 special education kindergarten through third-grade teachers from schools in the Northeastern United States. Most of the primary raters were female (n = 17), and 18 were White, with race data missing for one teacher. Primary raters had 13.9 years of teaching experience (SD = 8.9 years). The secondary raters were comprised of 6 paraeducators, 1 special education teacher, and 9 student teachers. Most of the secondary raters were White (17) and female (17), with one being Black and race data missing for one. They had a mean of 7.8 years of experience (SD = 10 years). Each teacher pair rated one student. Approximately 74% of students were White, with the remainder being Black. The mean age of students was 7.84 (SD = 1.14). Of the 19 students, 8 had an IEP.
*Describe the analysis procedures for each reported type of reliability.
As part of our initial development of the scales comprising the iFAB, we conducted a series of exploratory factor analyses for one of our four assessment methods (DBR-MIS). In our three published studies (Briesch et al., 2022; Daniels, Briesch, & Volpe, 2021; Volpe, Chaffee, Yeung, & Briesch, 2019), we reported both coefficient alpha and coefficient omega. We used these indices as indicators of item homogeneity and item quality.

We examined the dependability of all four methods across all scales in a recently published series of G studies (Volpe, Matta, Briesch, & Owens, 2023). Here we conducted a set of parallel G studies to investigate differences in dependability across our four rating methods. A single-facet design was used, with students (s) as the object of measurement and occasion (o) as the facet. The design was fully crossed, with all students evaluated on five occasions (s x o). Time-related variance has been well documented as a source of error in formative behavioral assessment, and in this study we were interested in the extent to which a single week of data collection might generate scores with acceptable levels of dependability. We were particularly interested in making comparisons across methods and constructs.

We are currently preparing several manuscripts examining the criterion-related validity of the iFAB measures. We are using a multi-trait multi-method (MTMM) approach, wherein we examine associations between the iFAB measures and a group of criterion measures. The diagonal of the MTMM matrix contains coefficients of stability for each measure. We report the 1-week stability coefficients for each scale.

Inter-rater: Inter-rater reliability was assessed for each of the 8 iFAB DBR-MIS scales using a two-way random-effects intraclass correlation coefficient with absolute agreement. Note that all ICCs are for a single rater as opposed to the average across raters.
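For a fully crossed single-facet (s x o) G study like the one described above, the variance components and the absolute dependability coefficient (Phi) can be estimated from a two-way ANOVA decomposition. The sketch below uses standard expected-mean-squares formulas with simulated ratings; the simulated data and variable names are assumptions for illustration, not the authors' data or code.

```python
import numpy as np

def g_study_dependability(X, n_o=None):
    """Variance components and absolute dependability (Phi) for a fully
    crossed students (s) x occasions (o) design, one rating per cell.
    n_o lets you project Phi for a different number of occasions."""
    X = np.asarray(X, dtype=float)
    n_s, n_occ = X.shape
    if n_o is None:
        n_o = n_occ
    grand = X.mean()
    ms_s = n_occ * np.sum((X.mean(axis=1) - grand) ** 2) / (n_s - 1)
    ms_o = n_s * np.sum((X.mean(axis=0) - grand) ** 2) / (n_occ - 1)
    resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
    ms_res = np.sum(resid ** 2) / ((n_s - 1) * (n_occ - 1))
    var_so_e = ms_res                               # interaction + error (confounded)
    var_s = max((ms_s - ms_res) / n_occ, 0.0)       # student (universe-score) variance
    var_o = max((ms_o - ms_res) / n_s, 0.0)         # occasion variance
    phi = var_s / (var_s + (var_o + var_so_e) / n_o)
    return var_s, var_o, var_so_e, phi

# Hypothetical: 27 students rated on 5 consecutive school days
rng = np.random.default_rng(1)
student = rng.normal(0, 1, size=(27, 1))            # stable student differences
scores = 3 + student + rng.normal(0, 0.4, size=(27, 5))
var_s, var_o, var_so_e, phi = g_study_dependability(scores)
phi2 = g_study_dependability(scores, n_o=2)[3]      # Phi projected to 2 occasions
```

Projecting Phi to fewer occasions (the D-study step) is how one asks how many daily ratings are needed before the coefficient crosses a criterion such as 0.80.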

*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Scale: Academic Engagement (DBR-MIS)
Informant: Teacher
Grade: K-5
N (raters): 27
N (participants): 27
Dependability coefficient: 0.94
Number of measurement occasions to criterion: 2 daily ratings to reach a coefficient >0.80

Volpe, R. J., Matta, M., & Briesch, A. M. (2023). Formative behavioral assessment across eight constructs: Dependability of direct behavior ratings and formative rating measures. Journal of School Psychology. https://doi.org/10.1016/j.jsp.2023.101251
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity

Age / Grade
Informant
Grades K-3
Teacher
Rating Partially convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
Engagement: The iFAB Engagement scales were designed to measure student academic engagement. Because the ACES Engagement subscale is an established measure of student academic engagement, we hypothesize strong positive correlations with that measure. As the inattentive symptoms of ADHD are associated with low levels of academic engagement, we hypothesize that the ADHDSC-4 Inattention subscale and the BASC-3 Attention Problems subscale (both established measures of the inattentive symptoms of ADHD) will demonstrate strong negative correlations with the iFAB Engagement scales.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
Elementary school teachers were recruited from schools in Boston and the surrounding metro area. To avoid range restriction in this correlational design, a general sample of students was selected (as opposed to selecting at-risk students alone). Specifically, teachers were asked to complete a packet of rating forms for one male and one female student selected at random. The overall sample was stratified by gender and grade level in order to ensure equal proportions of male and female students in grades K-1 and 2-3. Each of the two students was assigned to one of the following two conditions:

• Daily condition: Teachers completed the 8 DBR scales using both SIS and MIS methods each day. Both the order of DBR formats (SIS and MIS) and the order of scales (e.g., Disruptive, Social Withdrawal) were randomized. The daily ratings completed over the course of 1 week were averaged to form a composite rating for each scale.

• Weekly condition: Teachers completed the 8 DBR scales using both SIS and MIS methods at the end of each week. Both the order of DBR formats (SIS and MIS) and the order of scales were randomized. Teachers were also asked to complete a set of established rating measures, including the ACES, ADHD Symptom Checklist-IV, BASC-3, BRIEF-2 (Plan/Organize and Organization of Materials subscales), CDI-2, and SAS-TR, at the end of the week (i.e., on the weekend).

Description of teachers in the Daily condition: A total of 71 kindergarten through third-grade teachers completed ratings for the study. Teacher raters were between 25 and 67 years of age (M = 37.96; SD = 10.28) and had between 1 and 40 years of teaching experience (M = 11.64; SD = 9.0). All teachers were female, with the overwhelming majority identifying as White. One teacher identified as Hispanic.

Description of students in the Daily condition: There were 72 students in the Daily condition (34 boys, 33 girls); sex data were missing for 5 students. Kindergarten students comprised 33.3% of the sample, first-graders 26.4%, second-graders 16.7%, and third-graders 16.7%; grade data were missing for 5 cases. Approximately 68% of the sample was White, 13.9% Black, 1.4% Hispanic, 2.8% Asian, and 5.6% other. In regard to ethnicity, 20.8% of students were Hispanic. A total of 11 students were receiving some kind of special education services or were in the process of being evaluated for special education eligibility. Student participants were between 5 and 9 years of age (M = 6.7; SD = 1.23). According to current NCES statistics on school-aged student demographics (White = 44%, Hispanic/Latino = 28%, Black non-Hispanic = 15%, Asian = 6%, two or more races = 5%, American Indian/Alaskan Native = 1%, and Native Hawaiian/Pacific Islander < 1%), White students were overrepresented in our sample. Hispanic students were underrepresented, Asian students were somewhat underrepresented, and the representation of Black students was similar to national estimates.
*Describe the analysis procedures for each reported type of validity.
To examine concurrent validity, we computed bivariate correlations between each iFAB measure and the criterion measures. We performed bootstrapping with 1,000 bootstrap samples, setting the desired confidence level to 95%; the resulting output provided correlation coefficients along with the lower and upper bounds of the 95% confidence intervals.
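A bootstrapped confidence interval for a bivariate correlation of the kind described above can be sketched as follows. This is a generic percentile-bootstrap illustration with simulated data; the percentile method, seed, and variable names are assumptions, not the authors' exact procedure or software output.

```python
import numpy as np

def bootstrap_corr_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Pearson correlation with a percentile bootstrap CI.
    Resamples (x, y) pairs with replacement n_boot times."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rng = np.random.default_rng(seed)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)        # resample pairs with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return r, lo, hi

# Simulated example: a criterion measure strongly related to a rating scale
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)
r, lo, hi = bootstrap_corr_ci(x, y)
```

The returned `lo` and `hi` correspond to the lower and upper bounds reported in the validity tables.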

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Bias Analysis

Age / Grade: Informant Grades K-3
Teacher
Rating Not Provided
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
No
If yes,
a. Describe the method used to determine the presence or absence of bias:
b. Describe the subgroups for which bias analyses were conducted:
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.

Growth Standards

Sensitivity to Behavior Change

Age / Grade: Informant Grades K-3
Teacher
Rating Data unavailable
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
Describe evidence that the monitoring system produces data that are sensitive to detect incremental change (e.g., small behavior change in a short period of time such as every 20 days, or more frequently depending on the purpose of the construct). Evidence should be drawn from samples targeting the specific population that would benefit from intervention. Include in this example a hypothetical illustration (with narrative and/or graphics) of how these data could be used to monitor student performance frequently enough and with enough sensitivity to accurately assess change:

Reliability (Intensive Population): Reliability for Students in Need of Intensive Intervention

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Data unavailable
Offer a justification for each type of reliability reported, given the type and purpose of the tool:
Describe the sample(s), including size and characteristics, for each reliability analysis conducted:
Describe the analysis procedures for each reported type of reliability:

In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Report results by age range or grade level (if relevant) and include detail about the type of reliability data, statistic generated, and sample size and demographic information.

Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity (Intensive Population): Validity for Students in Need of Intensive Intervention

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Dash
Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
Describe the sample(s), including size and characteristics, for each validity analysis conducted.
Describe the analysis procedures for each reported type of validity.
In the table(s) below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
If yes, fill in data for each subgroup with disaggregated validity data.
Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.

Decision Rules: Data to Support Intervention Change

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Data unavailable
Are validated decision rules for when changes to the intervention need to be made specified in your manual or published materials?
No
If yes, specify the decision rules:
What is the evidentiary basis for these decision rules?

Decision Rules: Data to Support Intervention Selection

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Data unavailable
Are validated decision rules for what intervention(s) to select specified in your manual or published materials?
No
If yes, specify the decision rules:
What is the evidentiary basis for these decision rules?

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.