DBR-SIS (Direct Behavior Rating - Single Item Scale)
Disruptive Behavior
Summary
As a behavioral assessment methodology, DBR combines characteristics of systematic direct observations and behavioral rating scales. Specifically, DBR-SIS reflects a teacher’s rating of the proportion of time in which a target student was observed to engage in a specific behavior during a specified observation period, using a scale from 0 (never) to 10 (always). For example, if a student received a score of 8 out of 10 on a DBR-SIS form while being observed for Academic Engagement over a 20-minute period, this score would be interpreted as the student being academically engaged during 80% of the period. While observation periods and settings may vary depending on student- and behavior-specific factors, DBR-SIS forms reflecting student behaviors are always completed immediately following the observation.
- Where to Obtain:
- PAR, Inc., in conjunction with authors Chafouleas & Riley-Tillman
- https://dbr.education.uconn.edu/
- Initial Cost:
- Contact vendor for pricing details.
- Replacement Cost:
- Contact vendor for pricing details.
- Included in Cost:
- Training is free of charge via the online training module: http://dbrtraining.education.uconn.edu/
- Training Requirements:
- Less than one hour of training. Training is free of charge via the online training module: http://dbrtraining.education.uconn.edu/
- Qualified Administrators:
- There are no minimum qualifications of the examiner.
- Access to Technical Support:
- Assessment Format:
- Direct observation
- Rating scale
- Scoring Time:
- Scoring is automatic OR
- 1 minute per student
- Scores Generated:
-
- Administration Time:
- 15 minutes per student
- Scoring Method:
- Automatically (computer-scored)
- Technology Requirements:
- Computer or tablet
- Internet connection
Tool Information
Descriptive Information
- Please provide a description of your tool:
- As a behavioral assessment methodology, DBR combines characteristics of systematic direct observations and behavioral rating scales. Specifically, DBR-SIS reflects a teacher’s rating of the proportion of time in which a target student was observed to engage in a specific behavior during a specified observation period, using a scale from 0 (never) to 10 (always). For example, if a student received a score of 8 out of 10 on a DBR-SIS form while being observed for Academic Engagement over a 20-minute period, this score would be interpreted as the student being academically engaged during 80% of the period. While observation periods and settings may vary depending on student- and behavior-specific factors, DBR-SIS forms reflecting student behaviors are always completed immediately following the observation.
- Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
-
ACADEMIC ONLY: What dimensions does the tool assess?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
- Disruptive Behavior - A student action that interrupts regular school or classroom activity.
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
Externalizing
Acquisition and Cost Information
Administration
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- Less than one hour of training. Training is free of charge via the online training module: http://dbrtraining.education.uconn.edu/
- Please describe the minimum qualifications an administrator must possess.
- There are no minimum qualifications of the examiner.
- No minimum qualifications
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- Yes
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- Can users obtain ongoing professional and technical support?
- No
- If Yes, please describe how users can obtain support:
Scoring
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- The scoring format is a 0-10 scale, with items rated following each observation period. There is a single rating (item) per sub-domain. As previously described, three domains form school-based core behavioral competencies (academically engaged, disruptive, respectful). Observers estimate the proportion of time that a student exhibited each behavioral competency during the observation period and convert that percentage to the 0-10 scale, with “0” indicating that the competency was not observed and “10” indicating that the competency was observed throughout the entire observation period. To calculate level of performance for each sub-domain, it is recommended that the average rating across more than 5 occasions be used. Rate of change is evaluated at the individual level, consistent with single-subject design logic, under which 5 or more data points are recommended (minimum of 3).
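The conversion and averaging described above can be illustrated with a short sketch. The function names, the rounding rule (nearest whole scale point), and the default occasion threshold are assumptions made for illustration; they are not taken from the DBR-SIS materials.

```python
from statistics import mean


def percent_to_dbr(percent_of_time: float) -> int:
    """Convert an estimated percentage of time (0-100) to the 0-10 scale.

    Rounding to the nearest whole point is an assumption of this sketch.
    """
    if not 0 <= percent_of_time <= 100:
        raise ValueError("percent_of_time must be between 0 and 100")
    return int(percent_of_time / 10 + 0.5)


def dbr_to_percent(rating: int) -> float:
    """Interpret a 0-10 rating as a percentage of the observation period."""
    if not 0 <= rating <= 10:
        raise ValueError("rating must be between 0 and 10")
    return rating * 10.0


def level_of_performance(ratings, min_occasions=6):
    """Average the ratings for one sub-domain across rating occasions.

    min_occasions=6 reflects the 'more than 5 occasions' recommendation;
    it is a parameter so a different threshold can be applied.
    """
    if len(ratings) < min_occasions:
        raise ValueError("not enough rating occasions for a stable average")
    return mean(ratings)


# Hypothetical Disruptive Behavior ratings for one student on six occasions.
db_ratings = [2, 3, 1, 2, 4, 0]
average_db = level_of_performance(db_ratings)
```

A rating of 8 maps back to 80% of the observation period via `dbr_to_percent(8)`, which matches the Academic Engagement example in the tool description.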
- Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
- No
- ACADEMIC ONLY: Do you provide benchmarks for the slopes?
- ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
- Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
Levels of Performance and Usability
- Date
- 2011-2012 School Year
- Size
- 629 (Fall), 606 (Winter), 609 (Spring)
- Male
- 52 (Fall), 52.1 (Winter), 51.7 (Spring)
- Female
- 48 (Fall), 47.9 (Winter), 48.3 (Spring)
- Unknown
- Eligible for free or reduced-price lunch
- Mean rate of students eligible for free or reduced-price lunch in schools within the sample: 36.4%
- Other SES Indicators
- White, Non-Hispanic
- 81.4 (Fall), 82.8 (Winter), 82.3 (Spring)
- Black, Non-Hispanic
- Hispanic
- American Indian/Alaska Native
- Asian/Pacific Islander
- Other
- Unknown
- Disability classification (Please describe)
- First language (Please describe)
- Language proficiency status (Please describe)
Performance Level
Reliability
Age / Grade Informant | Grades K-5 Teacher | Grades 6-8 Teacher |
---|---|---|
Rating | | |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- DBR-SIS was originally developed to mirror the formative data streams provided by systematic direct observation. As such, issues of reliability (particularly which types to emphasize) can be openly discussed and debated. In this section, we present a two-pronged approach to reliability that includes (a) intraclass correlations and (b) generalizability theory. The first approach involves reliability estimated by converting intraclass correlation coefficients to reliability coefficients using the approach suggested by Shrout & Fleiss (1979), with data obtained from studies designed to examine DBR-SIS for screening purposes. These reliability estimates are based on large samples of students across a diverse range of general classroom settings, address a wide range of grade levels, and consider the variability between students and within observations. These data provide insight into the consistency of student ratings across observation periods and indicate that ratings are very stable across observation periods. In the second approach, generalizability theory, reliability data are calculated through dependability studies to demonstrate how reliability varies with the number of observations and days observed. This approach is appropriate because DBR-SIS data are ratings, and the ability to generalize scores (that is, to assume a student would receive a similar rating from a different observer) is of key concern. Generalizability studies allow for reliability estimates across several thresholds, in this case determining how many observations are needed to obtain various estimates. This provides practitioners with a range of administration options depending on the type of decision to be made (e.g., low-stakes or high-stakes intervention decisions).
Reliability coefficients assuming assessment considerations (differing numbers of observations, type of rater scoring students) are discussed in the sources listed below, which purposely sampled from classrooms in which variability in student behavior was expected (e.g. inclusive classroom with intensive intervention needs).
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- Sample information for the Johnson et al. (2016) study:

Subgroup | Fall | Winter | Spring |
---|---|---|---|
Male | 52.0% | 52.1% | 51.7% |
Female | 48.0% | 47.9% | 48.3% |
White | 81.4% | 82.8% | 82.3% |
Black | 12.2% | 11.0% | 11.3% |
Asian/Pacific Islander | 1.7% | 1.7% | 1.7% |
Non-Hispanic | 92.5% | 92.4% | 92.6% |
Hispanic | 7.5% | 7.6% | 7.4% |
Other | 3.7% | 3.5% | 3.6% |
Multiracial | 1.0% | 1.0% | 0.9% |

Sample information for the Chafouleas et al. (2013) study: Grades K-5: 51.7% female; White, Non-Hispanic (N = 553; 89.6%), White, Hispanic (N = 12; 1.9%), Black (N = 9; 1.5%), American Indian or Alaska Native (N = 2; 0.3%), Asian (N = 13; 2.1%), Other (N = 8; 1.3%), missing (N = 20; 3.2%). Grades 6-8: 46.3% female, 89.7% White, non-Hispanic.

Sample information for the Chafouleas et al. (2010) study: Seven 8th-grade students attending an inclusive language arts classroom. Student demographics included: 3 boys/4 girls, 6 Hispanic/1 African-American, 4 receiving special education services. Raters included the classroom teacher, a special education teacher who provided services in the classroom, and two research assistants. In the actual study, raters observed students three times a day over six consecutive days for a period of 45-60 minutes. The reliability coefficients below are presented separately for classroom teachers and research assistants, across a variety of total observations.
- *Describe the analysis procedures for each reported type of reliability.
- Analysis procedures for the Johnson et al. (2016) study: Average DBR-SIS DB scores across 6-10 observations per student were used for analysis. Specifically, reliability was calculated from a one-way intraclass correlation coefficient (ICC) that examined variability between students and within observations, corresponding to ICC(1,k), using a formula proposed by Shrout and Fleiss (1979). The average ICC (k = 6-10) was selected.

Analysis procedures for the Chafouleas et al. (2013) study: Students were rated on DB by teachers across 5-10 data points, and these scores were averaged to obtain a mean value. ICCs were then calculated using a formula in accordance with Shrout and Fleiss’ (1979) recommendations. ICCs were examined for each DBR-SIS behavior target to assess the appropriateness of this within-student DBR-SIS data aggregation.

Analysis procedures for the Chafouleas et al. (2010) study: Four primary facets of interest were identified (i.e., person, rater, day, and rating occasion). Every student was rated on every occasion by every rater and, given that the goal was to generalize results beyond the specific students, raters, and rating occasions examined in the study, all facets were considered to be random. An ANOVA with Type III sum of squares was used to derive all variance components (Chafouleas et al., 2010).
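As a rough illustration of the one-way ICC described above, the sketch below computes the single-rating and average-rating coefficients from the between-student and within-student mean squares of a one-way random-effects ANOVA. It assumes a balanced design (every student has the same number of ratings), which is a simplification: the studies report 5-10 ratings per student. The function name and the data are hypothetical and are not drawn from any of the cited studies.

```python
def one_way_icc(ratings):
    """Return (ICC(1,1), ICC(1,k)) for a balanced students-by-ratings table.

    ratings: list of rows, one row per student, each holding k ratings.
    """
    n = len(ratings)            # number of students (targets)
    k = len(ratings[0])         # ratings per student
    if n < 2 or k < 2:
        raise ValueError("need at least 2 students and 2 ratings each")
    if any(len(row) != k for row in ratings):
        raise ValueError("this sketch assumes a balanced design")

    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]

    # Between-students mean square (BMS) and within-students mean square (WMS).
    bms = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    wms = sum(
        (x - m) ** 2 for row, m in zip(ratings, row_means) for x in row
    ) / (n * (k - 1))

    icc_single = (bms - wms) / (bms + (k - 1) * wms)   # one rating
    icc_average = (bms - wms) / bms                    # mean of k ratings
    return icc_single, icc_average


# Illustrative ratings only: six students, four rating occasions each.
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc_1, icc_k = one_way_icc(example)
```

The average-rating coefficient is the relevant one when a student's mean across occasions, rather than a single rating, is the score being interpreted.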
*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Additional reliability data is available from the Center upon request.
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
Age / Grade Informant | Grades K-5 Teacher | Grades 6-8 Teacher |
---|---|---|
Rating | | |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- Concurrent validity serves as the primary source of validity data for DBR-SIS. As described, DBR-SIS is intended for formative use. As such, a primary source of validity data comes from concurrent comparisons with a variety of behavior assessment measures. While no single behavior assessment method combines both teacher ratings and formative assessment, comparisons with the Behavioral and Emotional Screening System and the Student Risk Screening Scale (teacher ratings), both established and technically sound screening measures, provide information about the validity of DBR-SIS.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- Sample information for the Johnson et al. (2016) study:

Subgroup | Fall | Winter | Spring |
---|---|---|---|
Male | 52.0% | 52.1% | 51.7% |
Female | 48.0% | 47.9% | 48.3% |
White | 81.4% | 82.8% | 82.3% |
Black | 12.2% | 11.0% | 11.3% |
Asian/Pacific Islander | 1.7% | 1.7% | 1.7% |
Non-Hispanic | 92.5% | 92.4% | 92.6% |
Hispanic | 7.5% | 7.6% | 7.4% |
Other | 3.7% | 3.5% | 3.6% |
Multiracial | 1.0% | 1.0% | 0.9% |

Sample information for the Kilgus, Riley-Tillman, Chafouleas, Christ, & Welsh (2014) study: The sample consisted of 1108 students in the 1st, 4th, and 7th grades sampled from 13 schools across three geographic regions (Northeast, Southeast, Midwest). Specifically, the sample consisted of 410 first-grade students (31 teachers), 354 fourth-grade students (25 teachers), and 344 seventh-grade students (23 teachers). By region, the sample consisted of 28 teachers at the Northeast site (first-grade n = 8, fourth-grade n = 9, and seventh-grade n = 11), 29 teachers at the Southeast site (first-grade n = 14, fourth-grade n = 10, and seventh-grade n = 5), and 22 teachers at the Midwest site (first-grade n = 9, fourth-grade n = 6, and seventh-grade n = 7). The largest group of students was identified as White, non-Hispanic (n = 536; 48.38%); 141 as White, Hispanic (12.73%); 297 as Black or African American (26.81%); 20 as American Indian or Alaskan Native (1.81%); 45 as Asian American (4.06%); and 32 as Other (2.89%). Race/ethnicity data were not provided for 37 students (3.33%). A review of data indicated that the student sample at each geographic site was representative of its corresponding state population with regard to gender and race/ethnicity, with a slight underrepresentation of White, non-Hispanic students.

Sample information for the Chafouleas et al. (2013) study: Elementary (K-5): 617 elementary students (K: 90; 1st: 116; 2nd: 106; 3rd: 92; 4th: 122; 5th: 91); Lower Elementary (K-2): 312; Upper Elementary (3-5): 305; 51.7% female; White, Non-Hispanic (N = 553; 89.6%); White, Hispanic (N = 12; 1.9%); Black (N = 9; 1.5%); American Indian or Alaska Native (N = 2; 0.3%); Asian (N = 13; 2.1%); Other (N = 8; 1.3%); Missing (N = 20; 3.2%). Middle School (6-8): 214 middle school students (6th: 18; 7th: 155; 8th: 41); 46.3% female; 89.7% White, non-Hispanic.
- *Describe the analysis procedures for each reported type of validity.
- Analysis procedures for the Johnson et al. (2016) study: Correlation coefficients were calculated between BESS T-scores and mean DBR-SIS DB scores. Analysis procedures for the Kilgus et al. (2014) study: Pearson product-moment bivariate correlations between screening scales (e.g., DB, BESS, and SRSS) were calculated across grades. Analysis procedures for the Chafouleas et al. (2013) study: Concurrent validity was evaluated by calculating Pearson product-moment correlation coefficients (r) between mean DBR-SIS DB scores and computed SRSS summed scores and BESS T-scores.
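The concurrent validity analyses above share two steps: averaging each student's DBR-SIS DB ratings, then correlating those means with a criterion score. A minimal sketch of that procedure follows; the function names and all data values are hypothetical and serve only to show the computation.

```python
from math import sqrt


def student_means(ratings_by_student):
    """Average each student's repeated DBR-SIS DB ratings."""
    return [sum(r) / len(r) for r in ratings_by_student]


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical data: five DB ratings per student and one criterion T-score each.
db_ratings = [[1, 0, 2, 1, 1], [4, 5, 3, 4, 4], [7, 6, 8, 7, 7], [2, 3, 2, 2, 1]]
criterion_t_scores = [45, 58, 70, 50]

r = pearson_r(student_means(db_ratings), criterion_t_scores)
```

Because higher DB ratings and higher criterion scores both indicate more risk in this setup, a positive `r` is the expected direction.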
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Internal Validity: The following steps were taken to protect against threats to internal validity: (a) counterbalancing of measure presentation, (b) random order assignment of students on individual measures, and (c) random selection of students within classrooms. Presentation order was counterbalanced by measure through the random assignment of conditions to teacher participants, with corrections made after random assignment to ensure an even distribution of conditions within site and grade group.
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- Results of the Johnson et al. (2016) study: The DB scale, on which lower scores indicate less risk (e.g., lower disruption), was positively correlated with the BESS-T scale, on which lower scores also indicate less risk, across all grades and time points. All correlations differed significantly from 0 at the p < .001 level using the Holm–Bonferroni correction for Type I error inflation (Holm, 1979). These results, in addition to the steps taken to protect against threats to internal validity (see above), provide evidence supporting the validity of DBR-SIS DB scores. Results of the Kilgus et al. (2014) study: Bivariate correlations between the BESS and DBR-SIS DB and between the SRSS and DBR-SIS DB were all in the expected direction (i.e., DB scores were positively correlated with BESS and SRSS risk scores) and were statistically significant at the p < .001 level. Results of the Chafouleas et al. (2013) study: All correlations between DBR-SIS DB and BESS-T scores and between DBR-SIS DB and SRSS scores were statistically significant at the .001 level and in the expected direction. Additionally, the influence of subgroup size (e.g., ratings of students within two vs. three subgroups) was taken into consideration, and no differences in correlations were found.
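For readers unfamiliar with the Holm-Bonferroni correction named above, the following sketch shows the standard step-down procedure: p-values are tested from smallest to largest against progressively less strict thresholds, and testing stops at the first non-rejection. The function name and the p-values are hypothetical; the study's own p-values are not reproduced here.

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Return a list of booleans: True where the null hypothesis is rejected."""
    m = len(p_values)
    # Indices of the p-values, ordered from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The smallest p is compared with alpha/m, the next with alpha/(m-1), ...
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values are retained
    return reject


# Hypothetical p-values for four correlations.
decisions = holm_bonferroni([0.01, 0.04, 0.03, 0.005], alpha=0.05)
```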
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Bias Analysis
Age / Grade: Informant | Grades K-5 Teacher | Grades 6-8 Teacher |
---|---|---|
Rating | No | No |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- No
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- b. Describe the subgroups for which bias analyses were conducted:
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Growth Standards
Sensitivity to Behavior Change
Age / Grade: Informant | Grades K-5 Teacher | Grades 6-8 Teacher |
---|---|---|
Rating | | |
- Describe evidence that the monitoring system produces data that are sensitive to detect incremental change (e.g., small behavior change in a short period of time such as every 20 days, or more frequently depending on the purpose of the construct). Evidence should be drawn from samples targeting the specific population that would benefit from intervention. Include in this example a hypothetical illustration (with narrative and/or graphics) of how these data could be used to monitor student performance frequently enough and with enough sensitivity to accurately assess change:
- Evidence that DBR-SIS can produce data that are sensitive enough to detect incremental change (e.g., small behavior change in a short period of time) is provided in the 3 studies below. Actual, not hypothetical, data are available to demonstrate how DBR-SIS has been used to monitor student performance on a frequent basis to inform decisions about student performance. The studies below represent a continuum from classwide (middle school, elementary) to individual (elementary) student focus. Graphs are provided in 2 of the 3 manuscripts (JOBE, AEI) to illustrate how the data show enough sensitivity to assess change; the third manuscript (Exceptional Children) presents aggregated information in table format only, given the volume of data.

Chafouleas, S. M., Sanetti, L. M. H., Kilgus, S. P., & Maggin, D. M. (2012). Evaluating sensitivity to behavioral change across consultation cases using Direct Behavior Rating Single-Item Scales (DBR-SIS). Exceptional Children, 78, 491-505.

Abstract. In this study, the sensitivity of Direct Behavior Rating Single Item Scales (DBR-SIS) for assessing behavior change in response to an intervention was evaluated. Data from 20 completed behavioral consultation cases involving a diverse sample of elementary participants and contexts utilizing a common intervention in an A-B design were included in analyses. Secondary purposes of the study were to investigate the utility of five metrics proposed for understanding behavioral response as well as the correspondence among these metrics and teachers’ ratings of intervention acceptability. Overall, results suggest that DBR-SIS demonstrated sensitivity to behavior change regardless of the metric used. Furthermore, there was limited association between student change and teachers’ ratings of acceptability.

Chafouleas, S. M., Sanetti, L. M. H., Jaffery, R., & Fallon, L. (2012). Research to practice: An evaluation of a class-wide intervention package involving self-management and a group contingency on behavior of middle school students. Journal of Behavioral Education, 21, 34-57. doi:10.1007/s10864-011-9135-8

Abstract. The effectiveness of an intervention package involving self-management and a group contingency at increasing appropriate classroom behaviors was evaluated in a sample of middle school students. Participants included all students in each of the 3 eighth-grade general education classrooms and their teachers. The intervention package included strategies recommended as part of best practice in classroom management to involve both building skill (self-management) and reinforcing appropriate behavior (group contingency). Data sources involved assessment of targeted behaviors using Direct Behavior Rating single-item scales completed by students and systematic direct observations completed by external observers. Outcomes suggested that, on average, student behavior moderately improved during intervention as compared to baseline when examining observational data for off-task behavior. Results for Direct Behavior Rating data were not as pronounced across all targets and classrooms in suggesting improvement for students. Limitations and future directions, along with implications for school-based practitioners working in middle school general education settings, are discussed.

Riley-Tillman, T. C., Methe, S. A., & Weegar, K. (2009). Examining the use of Direct Behavior Rating methodology on classwide formative assessment: A case study. Assessment for Effective Intervention, 34, 242-250. doi:10.1177/1534508409333879

Abstract. High-quality formative assessment data are critical to the successful application of any problem-solving model (e.g., response to intervention). Formative data available for a wide variety of outcomes (academic, behavior) and targets (individual, class, school) facilitate effective decisions about needed intervention supports and responsiveness to those supports. The purpose of the current case study is to provide preliminary examination of direct behavior rating methods in class-wide assessment of engagement. A class-wide intervention is applied in a single-case design (B-A-B-A), and both systematic direct observation and direct behavior rating are used to evaluate effects. Results indicate that class-wide direct behavior rating data are consistent with systematic direct observation across phases, suggesting that in this case study, direct behavior rating data are sensitive to classroom-level intervention effects. Implications for future research are discussed.

In addition, the following study provides evidence that DBR-SIS for both Academic Engagement and Disruptive Behavior is also sensitive to change in a population with intensive needs. In this study, all students had demonstrated weakness in social competence, and a majority were formally diagnosed with autism or emotional disturbance. The study demonstrated that when behavior changed over time, both DBR and systematic direct observation (SDO) changed accordingly. In this case, SDO was used as a marker to document DBR sensitivity to change for both academic engagement and disruptive behavior.

Kilgus, S. P., Riley-Tillman, T. C., Stichter, J. P., Schoemann, A., & Owens, S. (in press). Examining the concurrent criterion-related validity of Direct Behavior Rating Single Item Scales (DBR-SIS) with students with high functioning autism. Assessment for Effective Intervention.

A line of research has supported the development and validation of Direct Behavior Rating – Single Item Scales (DBR-SIS) for use in progress monitoring. Yet, this research was largely conducted within the general education setting with typically developing children. It is unknown whether the tool may be defensibly used with students exhibiting more substantial concerns, including students with social competence difficulties. The purpose of this investigation was to examine the concurrent validity of DBR-SIS in a middle school sample of students exhibiting substantial social competence concerns (n = 58). Students were assessed using both DBR-SIS and systematic direct observation (SDO) across three target behaviors. Each student was enrolled in one of two interventions: the Social Competence Intervention or a business-as-usual control condition. Students were assessed across three time points, including baseline, mid-intervention, and post-intervention. A review of across-time correlations indicated small to moderate correlations between DBR-SIS and SDO data (r = .25-.45). Results further suggested that the relationships between DBR-SIS and SDO targets were small to large at baseline. Correlations attenuated over time, though differences across time points were not statistically significant. This was with the exception of academic engagement correlations, which remained moderate-high across all time points.
Reliability (Intensive Population): Reliability for Students in Need of Intensive Intervention
Age / Grade Informant | Grades K-5 Teacher | Grades 6-8 Teacher |
---|---|---|
Rating | | |
- Offer a justification for each type of reliability reported, given the type and purpose of the tool:
- DBR-SIS was originally developed to mirror the formative data streams provided by systematic direct observation. As such, issues of reliability (particularly which types to emphasize) can be openly discussed and debated. In this section, we present a two-pronged approach to reliability that includes (a) intraclass correlations and (b) minimum-N estimation, both drawn from the following study: Kilgus, S. P., Riley-Tillman, T. C., Stichter, J. P., Schoemann, A. M., & Bellesheim, K. (2016). Reliability of Direct Behavior Ratings – Social Competence (DBR-SC) data: How many ratings are necessary? School Psychology Quarterly, 31, 431-442. The first approach involves reliability estimated by converting intraclass correlation coefficients to reliability coefficients using the approach suggested by Shrout & Fleiss (1979), with data obtained from a study designed to examine DBR-SIS with a population of children who have intensive needs related to social competence. These data provide insight into the consistency of student ratings across observation periods and indicate that ratings are very stable across observation periods. In addition, the analysis estimated the minimum number of observations necessary to reach a reliability of .8. This second part is particularly important because it provides guidance on how many observations are necessary for the estimate to be reliable.
- Describe the sample(s), including size and characteristics, for each reliability analysis conducted:
- Participants met the following inclusion criteria: (a) student aged 11 to 14, (b) diagnosis of ASD or Special Education eligibility criteria of autism or school-identified social need, and (c) cognitive functioning (i.e., full-scale IQ) within 2.0 standard deviations of the mean. A sample of 33 students at six schools constituted the SCI-A group and 30 students at six schools constituted the BAU group, for a total of 63 participants. Two students were dropped from analyses because of misreported IQ scores, and one additional student was dropped because of a lack of data on outcome measures. The resulting sample includes 60 students (29 SCI-A and 31 BAU). Parent consent and student assent were obtained before the start of the study. Across all student participants, 55 students were male and 5 were female. The majority of participants met criteria for special education services, specifically 43.33% in the Autism category, 25% in the Emotional Disturbance category, and 20% in the Other Health Impairment category. Two students met eligibility for Specific Learning Disability, and one student met eligibility for Speech/Language Impairment. Four students did not have a current individualized education plan (IEP), and one student had a Section 504 Plan without an IEP.
- Describe the analysis procedures for each reported type of reliability:
- To assess DBR-SC performance, intraclass correlation coefficients (ICCs) were first calculated to evaluate the consistency of DBR-SC data points across time within students. ICCs and other statistics were calculated separately for different time points and different groups (SCI-A and BAU). This resulted in a 2 (treatment groups) × 3 (times of assessment) mixed design with repeated measures on the time of assessment. ICCs were calculated via a two-level unconditional multilevel structural equation model, with DBR-SC observations at level 1 and students at level 2 of the model. For each model, variances and covariances of each DBR-SC subscale were estimated both between and within observations. The ICC was computed as the ratio of between-group variance to total variance (between-group variance + within-group variance). Next, ICCs were used to generate reliability estimates in accordance with recommendations from Shrout and Fleiss (1979).
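The variance-ratio definition of the ICC above, and the question of how many ratings are needed to reach a reliability of .8, can be sketched as follows. The source does not state how the minimum number of ratings was estimated; this sketch assumes the Spearman-Brown relationship, which is the same formula that converts a single-rating ICC into the reliability of a k-rating average. All function names and variance values are hypothetical.

```python
from math import ceil


def icc_from_variances(between_var, within_var):
    """ICC as between-group variance over total variance."""
    return between_var / (between_var + within_var)


def reliability_of_mean(icc, k):
    """Reliability of the average of k ratings (Spearman-Brown form)."""
    return k * icc / (1 + (k - 1) * icc)


def min_ratings_for(icc, target=0.8):
    """Smallest k for which the k-rating average reaches the target."""
    if not 0 < icc < 1 or not 0 < target < 1:
        raise ValueError("icc and target must lie strictly between 0 and 1")
    k = target * (1 - icc) / (icc * (1 - target))
    # The small tolerance guards against floating-point results such as
    # 4.000000000000001 being rounded up to 5.
    return max(1, ceil(k - 1e-9))


# Hypothetical variance components: 2.0 between students, 6.0 within students.
icc = icc_from_variances(2.0, 6.0)      # single-rating ICC
needed = min_ratings_for(icc, 0.8)      # ratings needed to reach .8
```

Under these made-up variances, a single rating is not very dependable, but averaging enough ratings brings the reliability of the mean up to the target.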
In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Report results by age range or grade level (if relevant) and include detail about the type of reliability data, statistic generated, and sample size and demographic information.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
Reliability Type | Age or Grade | n (examinees) | n (raters) | Median Coefficient | Confidence Interval |
---|---|---|---|---|---|
ICC | 11-14 Years Old | 23 | 60 | 0.793 - 0.947 | - |
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
- If yes, fill in data for each subgroup with disaggregated reliability data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity (Intensive Population): Validity for Students in Need of Intensive Intervention
Age / Grade: Informant | Grades K-5: Teacher | Grades 6-8: Teacher |
---|---|---|
Rating | | |
- Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- Concurrent validity is the primary source of validity evidence presented for DBR-SIS. Because DBR-SIS is intended for formative use, this evidence comes chiefly from concurrent comparisons with systematic direct observation (SDO). Kilgus, S. P., Riley-Tillman, T. C., Stichter, J. P., Schoemann, A., & Owens, S. (in press). Examining the concurrent criterion-related validity of Direct Behavior Rating Single Item Scales (DBR-SIS) with students with high functioning autism. Assessment for Effective Intervention. SDO data were collected across 15-min observation sessions, each of which was divided into 30-sec intervals. The SDO employed both partial interval recording and momentary time sampling to estimate the percentage of time target students engaged in relevant classroom behaviors. Three OCF behaviors were considered as part of this study. Academic engagement (SDO-AE) was defined as physical orientation to the teacher or current stimuli or active participation in the lesson or social interaction. Disruptive behavior (SDO-DB) was defined as purposeful engagement in behavior that interrupts the natural flow of academic instruction or classroom functioning. Noncompliance (SDO-NC) was defined as failure to follow or complete verbal or gestural behavioral directions provided by the teacher to a group or target student within 5 seconds. SDO-DB and SDO-NC were coded using partial interval recording (a behavior was marked as having occurred if it was observed at any point within a 30-sec interval), whereas SDO-AE was coded using momentary time sampling (a behavior was marked as having occurred if it was observed at the end of a 30-sec interval). Partial interval recording was deemed appropriate given the typically irregular and brief, albeit still interruptive, nature of both DB and NC. Momentary time sampling was considered appropriate given the expectation of frequent and nearly continuous AE within the classroom.
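The difference between the two SDO recording methods described above can be sketched in Python. This is an illustration under stated assumptions, not the study's coding procedure: the second-by-second `trace` representation, the function names, and the example data are all hypothetical.

```python
from typing import List, Sequence


def split_intervals(trace: Sequence[int], interval_len: int) -> List[Sequence[int]]:
    """Cut a second-by-second 0/1 behavior trace into equal-length intervals."""
    if interval_len <= 0 or len(trace) == 0 or len(trace) % interval_len != 0:
        raise ValueError("trace must be non-empty and a whole number of intervals")
    return [trace[i:i + interval_len] for i in range(0, len(trace), interval_len)]


def partial_interval_pct(trace: Sequence[int], interval_len: int = 30) -> float:
    """Percent of intervals in which the behavior occurred at ANY point."""
    intervals = split_intervals(trace, interval_len)
    return 100.0 * sum(any(iv) for iv in intervals) / len(intervals)


def momentary_time_sampling_pct(trace: Sequence[int], interval_len: int = 30) -> float:
    """Percent of intervals in which the behavior was occurring at the END."""
    intervals = split_intervals(trace, interval_len)
    return 100.0 * sum(bool(iv[-1]) for iv in intervals) / len(intervals)


# Hypothetical trace: four 3-second intervals (shortened for readability).
trace = [0, 1, 0,  0, 0, 0,  0, 0, 1,  1, 1, 1]
print(partial_interval_pct(trace, interval_len=3))         # 75.0
print(momentary_time_sampling_pct(trace, interval_len=3))  # 50.0
```

In this made-up trace, the brief behavior in the first interval counts under partial interval recording but not under momentary time sampling, which is why partial interval recording suits brief, irregular behaviors and tends to give higher estimates.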
- Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- Participants met the following inclusion criteria: (a) student aged 11 to 14, (b) diagnosis of ASD or Special Education eligibility criteria of autism or school-identified social need, and (c) cognitive functioning (i.e., full-scale IQ) within 2.0 standard deviations of the mean. A sample of 33 students at six schools constituted the SCI-A group and 30 students at six schools constituted the BAU group for a total of 63 participants. Two students were dropped from analyses because of misreported IQ scores and one additional student was dropped because of a lack of data on outcome measures. The resulting sample includes 60 students (29 SCI-A and 31 BAU). Parent consent and student assent were obtained before the start of the study. Across all student participants, 55 students were male and 5 were female. The majority of participants met criteria for special education services, specifically 43.33% in the Autism category, 25% in the Emotional Disturbance category, and 20% in the Other Health Impairment category. Two students met eligibility for Specific Learning Disability, and one student met eligibility for Speech/Language Impairment. Four students did not have a current individualized education plan (IEP), and one student had a Section 504 Plan without an IEP.
- Describe the analysis procedures for each reported type of validity.
- Correlation coefficients were calculated to examine the relationship between each DBR and SDO target within each time point (i.e., pre, mid, post). Hypothesized convergent relations corresponded to the pairings of (a) DBR-AE and SDO-AE, (b) DBR-DB and SDO-DB, and (c) DBR-RS and SDO-NC. All other DBR-SDO pairings were hypothesized to be discriminant relations and were thus expected to be lower in magnitude than convergent relations. We followed Cohen’s (1988) guidelines for effect size interpretations of correlation magnitudes, where r ≥ .10 was considered small, r ≥ .30 medium, and r ≥ .50 large. To limit over-interpretation of spurious or non-meaningful relations, conclusions regarding the presence of concurrent criterion-related validity were limited to medium and large correlations. Next, correlation coefficients were compared across time points within each DBR-SDO pairing to examine the extent to which correlation magnitude varied over time. This testing was accomplished via chi-square nested model comparisons between a model with correlations freely estimated across time and a model that specified correlational equivalence (H0: ρ1 = ρ2 = ρ3). Finally, a single overall correlation was estimated and evaluated within each DBR-SDO pair to evaluate the relationship between each measure across all time points. All correlations were estimated with Mplus v. 7.11 (Muthén & Muthén, 1998–2013).
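The interpretation rule above (Cohen's thresholds, with validity conclusions limited to medium and large correlations) can be written as a short Python sketch. The function names and the "negligible" label for values below .10 are hypothetical, and applying the thresholds to the absolute value of r is an assumption, since the source states them only for positive r.

```python
def cohen_label(r: float) -> str:
    """Label a correlation using the thresholds .10 / .30 / .50 cited above.

    Assumption: thresholds are applied to |r|; "negligible" is a label added
    here for values below .10 and does not come from the source.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    magnitude = abs(r)
    if magnitude >= 0.50:
        return "large"
    if magnitude >= 0.30:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "negligible"


def counts_as_validity_evidence(r: float) -> bool:
    """Only medium and large correlations are interpreted as validity evidence."""
    return cohen_label(r) in {"medium", "large"}


# Hypothetical correlations, for illustration only.
for r in (0.55, -0.35, 0.12, 0.05):
    print(r, cohen_label(r), counts_as_validity_evidence(r))
```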
- In the table(s) below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
- If yes, fill in data for each subgroup with disaggregated validity data.
Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Decision Rules: Data to Support Intervention Change
Age / Grade: Informant | Grades K-5: Teacher | Grades 6-8: Teacher |
---|---|---|
Rating | | |
- Are validated decision rules for when changes to the intervention need to be made specified in your manual or published materials?
- No
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
Decision Rules: Data to Support Intervention Selection
Age / Grade: Informant | Grades K-5: Teacher | Grades 6-8: Teacher |
---|---|---|
Rating | | |
- Are validated decision rules for what intervention(s) to select specified in your manual or published materials?
- No
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.