iFAB (Individualized Formative Assessment of Behavior)
DBR-MIS: Academic Engagement

Summary

Individualized Formative Assessment of Behavior (iFAB) is a flexible, feasible, and psychometrically sound mobile-enabled, web-based system for behavioral progress monitoring of elementary students in grades K-3. Using iFAB, teachers or other school staff can generate assessment plans, enter data, record events (e.g., a change of intervention plan), and review auto-generated time series charts of student progress. Developed from the ground up for the purpose of formative behavior assessment, iFAB can assess both academic enablers (i.e., Academic Engagement, Organization Skills, Interpersonal Skills) and problem behaviors that commonly occur in school settings (Disruptive, Oppositional, Interpersonal Conflict, Anxious-Depressed, and Social Withdrawal). The system offers considerable flexibility in that ratings may be completed daily (direct behavior rating method) or weekly (formative behavior rating method). Users also have the choice of rating student behavior using single-item or multi-item scales. This submission applies to the DBR rating method, the multi-item scale, and the Academic Engagement behavior construct.

Where to Obtain:
Self
r.volpe@neu.edu
617-702-6220
https://www.ifabonline.com/
Initial Cost:
Free
Replacement Cost:
Free
Included in Cost:
At this point, we are making the tools available at no cost.
NA
Training Requirements:
Training not required
Qualified Administrators:
No minimum qualifications specified.
Access to Technical Support:
Assessment Format:
  • Rating scale
Scoring Time:
  • Scoring is automatic (0 additional minutes)
Scores Generated:
  • Raw score
  • Composite scores
Administration Time:
  • 3 minutes per student
Scoring Method:
  • Automatically (computer-scored)
Technology Requirements:
  • Computer or tablet
  • Internet connection

Tool Information

Descriptive Information

Please provide a description of your tool:
Individualized Formative Assessment of Behavior (iFAB) is a flexible, feasible, and psychometrically sound mobile-enabled, web-based system for behavioral progress monitoring of elementary students in grades K-3. Using iFAB, teachers or other school staff can generate assessment plans, enter data, record events (e.g., a change of intervention plan), and review auto-generated time series charts of student progress. Developed from the ground up for the purpose of formative behavior assessment, iFAB can assess both academic enablers (i.e., Academic Engagement, Organization Skills, Interpersonal Skills) and problem behaviors that commonly occur in school settings (Disruptive, Oppositional, Interpersonal Conflict, Anxious-Depressed, and Social Withdrawal). The system offers considerable flexibility in that ratings may be completed daily (direct behavior rating method) or weekly (formative behavior rating method). Users also have the choice of rating student behavior using single-item or multi-item scales. This submission applies to the DBR rating method, the multi-item scale, and the Academic Engagement behavior construct.
Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
not selected
selected
The tool is intended for use with the following grade(s).
not selected Preschool / Pre-kindergarten
selected Kindergarten
selected First grade
selected Second grade
selected Third grade
not selected Fourth grade
not selected Fifth grade
not selected Sixth grade
not selected Seventh grade
not selected Eighth grade
not selected Ninth grade
not selected Tenth grade
not selected Eleventh grade
not selected Twelfth grade

The tool is intended for use with the following age(s).
not selected 0-4 years old
selected 5 years old
selected 6 years old
selected 7 years old
selected 8 years old
selected 9 years old
not selected 10 years old
not selected 11 years old
not selected 12 years old
not selected 13 years old
not selected 14 years old
not selected 15 years old
not selected 16 years old
not selected 17 years old
not selected 18 years old

The tool is intended for use with the following student populations.
selected Students in general education
selected Students with disabilities
selected English language learners

ACADEMIC ONLY: What dimensions does the tool assess?

Reading
not selected Global Indicator of Reading Competence
not selected Listening Comprehension
not selected Vocabulary
not selected Phonemic Awareness
not selected Decoding
not selected Passage Reading
not selected Word Identification
not selected Comprehension

Spelling & Written Expression
not selected Global Indicator of Spelling Competence
not selected Global Indicator of Written Expression Competence

Mathematics
not selected Global Indicator of Mathematics Comprehension
not selected Early Numeracy
not selected Mathematics Concepts
not selected Mathematics Computation
not selected Mathematics Application
not selected Fractions
not selected Algebra

Other
Please describe specific domain, skills or subtests:


BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
Academic Engagement
BEHAVIOR ONLY: Which category of behaviors does your tool target?
Both

Acquisition and Cost Information

Where to obtain:
Email Address
r.volpe@neu.edu
Address
Phone Number
617-702-6220
Website
https://www.ifabonline.com/
Initial cost for implementing program:
Cost
$0.00
Unit of cost
Currently, we are making the tools available at no cost.
Replacement cost per unit for subsequent use:
Cost
$0.00
Unit of cost
NA
Duration of license
We will renew access on an annual basis.
Additional cost information:
Describe basic pricing plan and structure of the tool. Provide information on what is included in the published tool, as well as what is not included but required for implementation.
At this point, we are making the tools available at no cost.
Provide information about special accommodations for students with disabilities.
NA

Administration

BEHAVIOR ONLY: What type of administrator is your tool designed for?
selected
selected
not selected
not selected
not selected
selected
If other, please specify:
Teacher completed

BEHAVIOR ONLY: What is the administration format?
not selected
selected
not selected
not selected
not selected
If other, please specify:

BEHAVIOR ONLY: What is the administration setting?
selected
selected
not selected
not selected
not selected
not selected
not selected
If other, please specify:

Does the program require technology?

If yes, what technology is required to implement your program? (Select all that apply)
selected
selected
not selected

If your program requires additional technology not listed above, please describe the required technology and the extent to which it is combined with teacher small-group instruction/intervention:

What is the administration context?
not selected
not selected    If small group, n=
not selected    If large group, n=
not selected
selected
If other, please specify:
These are teacher rating forms

What is the administration time?
Time in minutes
3
per (student/group/other unit)
student

Additional scoring time:
Time in minutes
0
per (student/group/other unit)

How many alternate forms are available, if applicable?
Number of alternate forms
4
per (grade/level/unit)
There are 4 alternate methods for each construct.

ACADEMIC ONLY: What are the discontinue rules?
not selected
not selected
not selected
not selected
If other, please specify:

BEHAVIOR ONLY: Can multiple students be rated concurrently by one administrator?

If yes, how many students can be rated concurrently?
This is not known, but the tools were designed for rating a small number of students receiving classroom interventions.

Training & Scoring

Training

Is training for the administrator required?
No
Describe the time required for administrator training, if applicable:
We have purposely evaluated the measures as administered with no training.
Please describe the minimum qualifications an administrator must possess.
We made a conscious choice to design formative behavior assessment tools that required no training. As such, no training was involved in our psychometric studies.
selected No minimum qualifications
Are training manuals and materials available?
No
Are training manuals/materials field-tested?
No
Are training manuals/materials included in cost of tools?
If No, please describe training costs:
NA
Can users obtain ongoing professional and technical support?
Yes
If Yes, please describe how users can obtain support:

Scoring

BEHAVIOR ONLY: What types of scores result from the administration of the assessment?
Score
Observation Behavior Rating
not selected Frequency
not selected Duration
not selected Interval
not selected Latency
selected Raw score
Conversion
Observation Behavior Rating
not selected Rate
not selected Percent
not selected Standard score
selected Subscale/ Subtest
selected Composite
not selected Stanine
not selected Percentile ranks
not selected Normal curve equivalents
not selected IRT based scores
Interpretation
Observation Behavior Rating
not selected Error analysis
not selected Peer comparison
not selected Rate of change
not selected Dev. benchmarks
not selected Age-Grade equivalent
How are scores calculated?
not selected Manually (by hand)
selected Automatically (computer-scored)
not selected Other
If other, please specify:

Do you provide basis for calculating performance level scores?
No

What is the basis for calculating performance level and percentile scores?
not selected Age norms
not selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

What types of performance level scores are available?
selected Raw score
not selected Standard score
not selected Percentile score
not selected Grade equivalents
not selected IRT-based score
not selected Age equivalents
not selected Stanines
not selected Normal curve equivalents
not selected Developmental benchmarks
not selected Developmental cut points
not selected Equated
not selected Probability
not selected Lexile score
not selected Error analysis
selected Composite scores
not selected Subscale/subtest scores
not selected Other
If other, please specify:

Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
Multi-item scales generate scale composites and also allow users to track scores by item as they tend to overlap with specific target behaviors. Broad composites for academic enablers and problem behaviors also are generated.
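The composite logic described above can be sketched as follows. This is an illustrative assumption, not the published iFAB scoring algorithm: the item values, scale names, and the averaging rule are hypothetical stand-ins to show how item ratings might roll up into scale and broad composites.

```python
import numpy as np

def scale_composite(item_ratings):
    """Average the item ratings on one multi-item scale (assumed rule)."""
    return float(np.mean(item_ratings))

def broad_composite(scale_scores):
    """Average scale composites into a broad domain composite (assumed rule)."""
    return float(np.mean(list(scale_scores.values())))

# One day's hypothetical ratings on multi-item academic enabler scales
scales = {
    "Academic Engagement": scale_composite([5, 4, 5, 4]),
    "Organization Skills": scale_composite([3, 4, 3]),
    "Interpersonal Skills": scale_composite([4, 4, 5]),
}
academic_enablers = broad_composite(scales)  # broad academic enablers composite
```

Tracking scores by item, as the system allows, would simply mean charting the individual item values alongside these composites.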
Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
No
ACADEMIC ONLY: Do you provide benchmarks for the slopes?
ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
What is the basis for calculating slope and percentile scores?
not selected Age norms
not selected Grade norms
not selected Classwide norms
not selected Schoolwide norms
not selected Stanines
not selected Normal curve equivalents

Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Assessment plans can be tailored to each individual student in that teachers/consultants select relevant scales for monitoring. They can decide which of the four methods to employ for progress monitoring. Our psychometric studies were conducted on diverse groups of students in both regular education and special education. We are currently seeking funding to explore the measurement invariance of the iFAB measures.

Levels of Performance and Usability

Are levels of performance specified in your manual or published materials?
No
If yes, specify the levels of performance and how they are used for progress monitoring:

What is the basis for specifying levels of performance?
not selected
not selected
not selected Other
If other, please specify:

If norm-referenced, describe the normative profile.

National representation (check all that apply):
Northeast:
not selected New England
not selected Middle Atlantic
Midwest:
not selected East North Central
not selected West North Central
South:
not selected South Atlantic
not selected East South Central
not selected West South Central
West:
not selected Mountain
not selected Pacific

Local representation (please describe, including number of states)
Date
Size
Gender (Percent)
Male
Female
Unknown
SES indicators (Percent)
Eligible for free or reduced-price lunch
Other SES Indicators
Race/Ethnicity (Percent)
White, Non-Hispanic
Black, Non-Hispanic
Hispanic
American Indian/Alaska Native
Asian/Pacific Islander
Other
Unknown
Disability classification (Please describe)


First language (Please describe)


Language proficiency status (Please describe)
If criterion-referenced, describe procedures for specifying levels of performance.

Describe any other procedures for specifying levels of performance.

Has a usability study been conducted on your tool (i.e., a study that examines the extent to which the tool is convenient and practicable for use?)

If yes, please describe, including the results:
We administered the Usage Rating Profile-Assessment (URP-A) to teachers participating in several phases of our psychometric studies. The table below summarizes their ratings of the iFAB tools for two samples: 21 teachers participating in reliability studies, and teachers participating in the treatment sensitivity studies, which involved the implementation of daily behavior report card (DBRC) interventions over several weeks. The URP-A is rated on a 6-point scale ranging from 1 (Strongly Disagree) to 6 (Strongly Agree), with higher scores indicating greater acceptability. Mean item scores in our samples were around 5 (range 4.69 to 5.27), which supports the usability of our tools.

URP-A Subscale | Reliability Sample | DBRC Sample
Feasibility    | 4.69               | 4.89
Understanding  | 4.87               | 5.27
Feasibility    | 5.24               | 5.20


Has a social validity study been conducted on your tool (i.e., a study that examines the significance of goals, appropriateness of procedures (e.g., ethics, cost, practicality), and the importance of treatment effects)?
Yes
If yes, please describe, including the results:
We have published several papers that outline our iterative scale development process. The process involved engagement with our consumer advisory panel, LEA partners, and our scientific advisory panel. Included in our screening of potential items was consideration of whether the items reflected suitable targets for classroom interventions and whether the items represented malleable behaviors. The development of the formative rating measures and the web-based system involved an iterative process that included feedback from teachers using the system. This feedback included ratings on the URP-A and focus groups with a subset of teachers and school administrators.

Performance Level

Reliability

Age / Grade
Informant
Grades K-3
Teacher
Rating Partially convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
*Offer a justification for each type of reliability reported, given the type and purpose of the tool.
Internal consistency: This is a relevant indicator of item homogeneity and item quality for our two multi-item methods (DBR-MIS and FBRM-MIS). It is not specific to formative assessment measures, but is an important consideration for any multi-item scale or test generating a composite score.

Temporal stability and dependability across occasions: These are important to examine in formative behavioral assessment because inconsistency of ratings across measurement occasions is a common source of error. For formative measures, we want to know the extent to which changes in scores can be attributed to change in the target student's behavior as opposed to time-related error.

Inter-rater: Although it is typical to employ a single teacher rater in school-based behavioral progress monitoring (see Chafouleas, Christ, Riley-Tillman, Briesch, & Chanese, 2007), users may be interested in interrater reliability, which indexes the degree to which ratings conducted by different raters during the same observation interval are associated. We calculated intra-class correlations (ICC) for two raters who observed students during the same interval. Each rating summarized teacher observations for a single school day. We report the ICC for a single rater's score, using a two-way mixed-effects model with fixed raters, along with its 95% confidence interval.
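A single-rater, absolute-agreement ICC of the kind discussed above can be computed from a two-way ANOVA decomposition. The sketch below implements the generic textbook ICC(2,1) formula (Shrout & Fleiss conventions); it is a minimal illustration with made-up ratings, not the authors' analysis code.

```python
import numpy as np

def icc_single_rater(X):
    """ICC(2,1): two-way model, absolute agreement, single rater.
    X is an (n subjects x k raters) matrix of ratings."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    grand = X.mean()
    # Mean squares for subjects (rows), raters (columns), and residual
    ms_rows = k * np.sum((X.mean(axis=1) - grand) ** 2) / (n - 1)
    ms_cols = n * np.sum((X.mean(axis=0) - grand) ** 2) / (k - 1)
    resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
    ms_err = np.sum(resid ** 2) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical: two raters scoring the same 5 students on the same day
ratings = np.array([[4, 5], [2, 2], [5, 5], [1, 2], [3, 3]])
icc = icc_single_rater(ratings)
```

A confidence interval for the ICC would come from the F-distribution of the same mean squares; statistical packages report it alongside the point estimate.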
*Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
Internal Consistency A (from Volpe et al., 2019): Participants were general and special education teachers from 35 public school districts across 13 states. Investigators e-mailed principals and school psychologists from partner network schools to provide information regarding the study. Subsequent to administrator approval, interested teachers in participating schools were contacted directly by investigators and provided with a brief description of the purpose and procedures of the study. A total of 307 K-3 teachers each completed ratings for one randomly selected student in their class. Teachers were primarily female (95.8%) with a wide range of teaching experience. The student sample was comprised of 187 (60.9%) males and 120 (39.1%) females, and students were nearly evenly distributed across grade levels. The composition of students by race and ethnicity was as follows: 67.1% White, 13.0% Black, 15.0% Hispanic, 3.3% Asian, 1.0% American Indian/Alaska Native, and 5.2% unknown. Approximately 35% of students were receiving special education services at the time of data collection.

Dependability (from Volpe et al., 2023): Ninety-one K-5 educators (i.e., teachers, paraprofessionals, and intervention specialists) from elementary schools located in the Northeast, Midwest, and West of the United States participated in the study. Educators were mainly female (n = 86) and White (n = 87), and their age ranged from 21 to 62 (M = 39.00, SD = 10.40) years. On average, they had 12.70 (SD = 7.87) years of experience. Each educator rated one student in their classroom. The largest number of participating students were classified as male (n = 61), and their age ranged from 5 to 10 (M = 7.36, SD = 1.39) years. The majority of students were White (n = 60), followed by Black (n = 17), Hispanic (n = 10), and biracial (n = 4). Approximately 45% of the students received special education services (n = 41). Once all student participants were identified, educators were randomly assigned to one of the following four conditions: (a) DBR-SIS, (b) FBRM-SIS, (c) DBR-MIS, or (d) FBRM-MIS. For the multi-item scales (DBR-MIS and FBRM-MIS), we also report internal consistency. Cronbach's alpha was calculated on the first ratings completed by teachers for each student. For DBR-MIS, the first rating occurred after one school day at the beginning of the week (typically, on Monday). For FBRM-MIS, the first rating occurred at the end of one school week (typically, on Friday).

MTMM (test-retest): A total of 68 elementary school teachers were recruited from schools in Boston and the surrounding metro area. The mean age of teachers was 37.82 (range 25-67), and their mean years of teaching experience was 11.49 (range 1 to 40).

Inter-rater: The same pairs of raters completed ratings for students on 3 consecutive school days across all 8 DBR-MIS scales. Primary raters were 16 general and 3 special education kindergarten through third-grade teachers from schools in the Northeastern United States. Most of the primary raters were female (n = 17), and 18 were White, with race data missing for one teacher. Primary raters had 13.9 years of teaching experience (SD = 8.9 years). The secondary raters were comprised of 6 paraeducators, 1 special education teacher, and 9 student teachers. Most of the secondary raters were White (17) and female (17), with one being Black and race data missing for one. They had a mean of 7.8 years of experience (SD = 10 years). Each teacher pair rated one student. Approximately 74% of students were White, with the remainder being Black. The mean age of students was 7.84 (SD = 1.14). Of the 19 students, 8 had an IEP.
*Describe the analysis procedures for each reported type of reliability.
As part of our initial development of the scales comprising the iFAB, we conducted a series of exploratory factor analyses for one of our four assessment methods (DBR-MIS). In our three published studies (Briesch et al., 2022; Daniels, Briesch, & Volpe, 2021; Volpe, Chaffee, Yeung, & Briesch, 2019), we reported both coefficient alpha and coefficient omega. We used these indices as indicators of item homogeneity and item quality.

We examined the dependability of all four methods across all scales in a recently published series of G studies (Volpe, Matta, Briesch, & Owens, 2023). Here we conducted a set of parallel G studies to investigate differences in dependability across our four rating methods. A single-facet design was used, with students (s) as the object of measurement and occasion (o) as the facet. The design was fully crossed, with all students evaluated on five occasions (s x o). Time-related variance has been well documented as a source of error in formative behavioral assessment, and in this study we were interested in the extent to which a single week of data collection might generate scores with acceptable levels of dependability. We were particularly interested in making comparisons across methods and constructs.

We are currently preparing several manuscripts examining the criterion-related validity of the iFAB measures. We are using a multi-trait multi-method (MTMM) approach, wherein we examine associations between the iFAB measures and a group of criterion measures. The diagonal of the MTMM matrix contains coefficients of stability for each measure. We report the 1-week stability coefficients for each scale.

Inter-rater: Inter-rater reliability was assessed for each of the 8 iFAB DBR-MIS scales using a two-way random-effects intraclass correlation coefficient with absolute agreement. Note that all ICCs are for a single rater as opposed to the average across raters.
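For a fully crossed single-facet (s x o) G study like the one described above, the variance components and the absolute dependability coefficient (Phi) can be estimated from a two-way ANOVA decomposition. The sketch below uses standard expected-mean-squares formulas with simulated ratings; the simulated data and variable names are assumptions for illustration, not the authors' data or code.

```python
import numpy as np

def g_study_dependability(X, n_o=None):
    """Variance components and absolute dependability (Phi) for a fully
    crossed students (s) x occasions (o) design, one rating per cell.
    n_o lets you project Phi for a different number of occasions."""
    X = np.asarray(X, dtype=float)
    n_s, n_occ = X.shape
    if n_o is None:
        n_o = n_occ
    grand = X.mean()
    ms_s = n_occ * np.sum((X.mean(axis=1) - grand) ** 2) / (n_s - 1)
    ms_o = n_s * np.sum((X.mean(axis=0) - grand) ** 2) / (n_occ - 1)
    resid = X - X.mean(axis=1, keepdims=True) - X.mean(axis=0, keepdims=True) + grand
    ms_res = np.sum(resid ** 2) / ((n_s - 1) * (n_occ - 1))
    var_so_e = ms_res                               # interaction + error (confounded)
    var_s = max((ms_s - ms_res) / n_occ, 0.0)       # student (universe-score) variance
    var_o = max((ms_o - ms_res) / n_s, 0.0)         # occasion variance
    phi = var_s / (var_s + (var_o + var_so_e) / n_o)
    return var_s, var_o, var_so_e, phi

# Hypothetical: 27 students rated on 5 consecutive school days
rng = np.random.default_rng(1)
student = rng.normal(0, 1, size=(27, 1))            # stable student differences
scores = 3 + student + rng.normal(0, 0.4, size=(27, 5))
var_s, var_o, var_so_e, phi = g_study_dependability(scores)
phi2 = g_study_dependability(scores, n_o=2)[3]      # Phi projected to 2 occasions
```

Projecting Phi to fewer occasions (the D-study step) is how one asks how many daily ratings are needed before the coefficient crosses a criterion such as 0.80.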

*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Scale: Academic Engagement (DBR-MIS)
Informant: Teacher
Grade: K-5
N (raters): 27
N (participants): 27
Dependability coefficient: 0.94
Number of measurement occasions to criterion: 2 daily ratings to reach a coefficient >0.80

Volpe, R. J., Matta, M., & Briesch, A. M. (2023). Formative behavioral assessment across eight constructs: Dependability of direct behavior ratings and formative rating measures. Journal of School Psychology. https://doi.org/10.1016/j.jsp.2023.101251
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No

If yes, fill in data for each subgroup with disaggregated reliability data.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity

Age / Grade
Informant
Grades K-3
Teacher
Rating Partially convincing evidence
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
*Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
Engagement: The iFAB Engagement scales were designed to measure student academic engagement. Because the ACES Engagement subscale is an established measure of student academic engagement, we hypothesize strong positive correlations with that measure. As the inattentive symptoms of ADHD are associated with low levels of academic engagement, we hypothesize that the ADHDSC-4 Inattention subscale and the BASC-3 Attention Problems subscale (both established measures of the inattentive symptoms of ADHD) will demonstrate strong negative correlations with the iFAB Engagement scales.
*Describe the sample(s), including size and characteristics, for each validity analysis conducted.
Elementary school teachers were recruited from schools in Boston and the surrounding metro area. To avoid range restriction in this correlational design, a general sample of students was selected (as opposed to selecting at-risk students alone). Specifically, teachers were asked to complete a packet of rating forms for one male and one female student selected at random. The overall sample was stratified by gender and grade level in order to ensure equal proportions of male and female students in grades K-1 and 2-3. Each of the two students was assigned to one of the following two conditions:

• Daily condition: Teachers completed the 8 DBR scales using both SIS and MIS methods each day. Both the order of DBR formats (SIS and MIS) and the order of scales (e.g., Disruptive, Social Withdrawal) were randomized. The daily ratings completed over the course of 1 week were averaged to form a composite rating for each scale.

• Weekly condition: Teachers completed the 8 DBR scales using both SIS and MIS methods at the end of each week. Both the order of DBR formats (SIS and MIS) and the order of scales were randomized. Teachers were also asked to complete a set of established rating measures, including the ACES, ADHD Symptom Checklist-IV, BASC-3, BRIEF-2 (Plan/Organize and Organization of Materials subscales), CDI-2, and SAS-TR, at the end of the week (i.e., on the weekend).

Description of teachers in the Daily condition: A total of 71 kindergarten through third-grade teachers completed ratings for the study. Teacher raters were between 25 and 67 years of age (M = 37.96; SD = 10.28) and had between 1 and 40 years of teaching experience (M = 11.64; SD = 9.0). All teachers were female, with the overwhelming majority identifying as White. One teacher identified as Hispanic.

Description of students in the Daily condition: There were 72 students in the Daily condition (34 boys, 33 girls); sex data were missing for 5 students. Kindergarten students comprised 33.3% of the sample, first-graders 26.4%, second-graders 16.7%, and third-graders 16.7%; grade data were missing for 5 cases. Approximately 68% of the sample was White, 13.9% Black, 1.4% Hispanic, 2.8% Asian, and 5.6% other. In regard to ethnicity, 20.8% of students were Hispanic. A total of 11 students were receiving some kind of special education services or were in the process of being evaluated for special education eligibility. Student participants were between 5 and 9 years of age (M = 6.7; SD = 1.23). According to current NCES statistics on school-aged student demographics (White = 44%, Hispanic/Latino = 28%, Black non-Hispanic = 15%, Asian = 6%, two or more races = 5%, American Indian/Alaskan Native = 1%, and Native Hawaiian/Pacific Islander < 1%), White students were overrepresented in our sample. Hispanic students were underrepresented, Asian students were somewhat underrepresented, and the representation of Black students was similar to national estimates.
*Describe the analysis procedures for each reported type of validity.
To examine concurrent validity, we computed bivariate correlations between each iFAB measure and the criterion measures. We performed bootstrapping with 1,000 bootstrap samples, setting the desired confidence level to 95%; the resulting output provided correlation coefficients along with the lower and upper bounds of the 95% confidence intervals.
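A bootstrapped confidence interval for a bivariate correlation of the kind described above can be sketched as follows. This is a generic percentile-bootstrap illustration with simulated data; the percentile method, seed, and variable names are assumptions, not the authors' exact procedure or software output.

```python
import numpy as np

def bootstrap_corr_ci(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Pearson correlation with a percentile bootstrap CI.
    Resamples (x, y) pairs with replacement n_boot times."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rng = np.random.default_rng(seed)
    n = len(x)
    r = np.corrcoef(x, y)[0, 1]
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, n)        # resample pairs with replacement
        boot[b] = np.corrcoef(x[idx], y[idx])[0, 1]
    lo, hi = np.percentile(boot, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return r, lo, hi

# Simulated example: a criterion measure strongly related to a rating scale
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = x + 0.1 * rng.normal(size=200)
r, lo, hi = bootstrap_corr_ci(x, y)
```

The returned `lo` and `hi` correspond to the lower and upper bounds reported in the validity tables.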

*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?

If yes, fill in data for each subgroup with disaggregated validity data.

Type of Subscale Subgroup Informant Age / Grade Test or Criterion n
(sample/
examinees)
n
(raters)
Median Coefficient 95% Confidence Interval
Lower Bound
95% Confidence Interval
Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published reliability studies:
Provide citations for additional published studies.

Bias Analysis

Age / Grade: Informant Grades K-3
Teacher
Rating Not Provided
Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
No
If yes,
a. Describe the method used to determine the presence or absence of bias:
b. Describe the subgroups for which bias analyses were conducted:
c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.

Growth Standards

Sensitivity to Behavior Change

Age / Grade: Informant Grades K-3
Teacher
Rating Data unavailable
Legend
Full BubbleConvincing evidence
Half BubblePartially convincing evidence
Empty BubbleUnconvincing evidence
Null BubbleData unavailable
dDisaggregated data available
Describe evidence that the monitoring system produces data that are sensitive to detect incremental change (e.g., small behavior change in a short period of time such as every 20 days, or more frequently depending on the purpose of the construct). Evidence should be drawn from samples targeting the specific population that would benefit from intervention. Include in this example a hypothetical illustration (with narrative and/or graphics) of how these data could be used to monitor student performance frequently enough and with enough sensitivity to accurately assess change:

Reliability (Intensive Population): Reliability for Students in Need of Intensive Intervention

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Data unavailable
Offer a justification for each type of reliability reported, given the type and purpose of the tool:
Describe the sample(s), including size and characteristics, for each reliability analysis conducted:
Describe the analysis procedures for each reported type of reliability:

In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Report results by age range or grade level (if relevant) and include detail about the type of reliability data, statistic generated, and sample size and demographic information.

Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.
Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of reliability analysis not compatible with above table format:
Manual cites other published reliability studies:
No
Provide citations for additional published studies.

Validity (Intensive Population): Validity for Students in Need of Intensive Intervention

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Dash
Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
Describe the sample(s), including size and characteristics, for each validity analysis conducted.
Describe the analysis procedures for each reported type of validity.
In the table(s) below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.
Describe the degree to which the provided data support the validity of the tool.
Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
If yes, fill in data for each subgroup with disaggregated validity data.
Type | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound
Results from other forms of validity analysis not compatible with above table format:
Manual cites other published validity studies:
No
Provide citations for additional published studies.

Decision Rules: Data to Support Intervention Change

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Data unavailable
Are validated decision rules for when changes to the intervention need to be made specified in your manual or published materials?
No
If yes, specify the decision rules:
What is the evidentiary basis for these decision rules?

Decision Rules: Data to Support Intervention Selection

Age / Grade: Grades K-3 | Informant: Teacher | Rating: Data unavailable
Are validated decision rules for what intervention(s) to select specified in your manual or published materials?
No
If yes, specify the decision rules:
What is the evidentiary basis for these decision rules?

Data Collection Practices

Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.