Momentary Time-Sampling

Academic Engagement

 

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

Momentary time-sampling (MTS) for academic engagement is a non-commercial assessment and, therefore, does not have a formal pricing plan. MTS may be considered free to use.

 

Replacement Cost:

Not applicable

 

Included in Cost:

Not applicable

Technology Requirements:

  • No information provided

 

Training Requirements:

  • 8 or more hours of training

 

Qualified Administrators:

  • No minimum qualifications specified

 

Accommodations:

No information provided

 

Where to Obtain:

Momentary time-sampling is described in numerous books, articles, and presentations. Its methods are simple and transparent.

 

Access to Technical Support:

MTS is a common methodology for direct observation, and technical support should be available from any expert in behavior assessment.

 

Momentary time-sampling is a behavior assessment methodology within systematic direct observation wherein an observation period is divided into intervals, and behavior during each interval is scored as an occurrence if the behavior is occurring at the moment the interval begins or ends (depending on the specific procedures used).

 

By dividing the number of intervals scored as occurrences by the total number of intervals, MTS provides an estimate of the proportion or percentage of an observation period during which a target behavior was occurring (i.e., “prevalence”). Depending on the characteristics of the underlying behavior, MTS may also be used to estimate the frequency of a target behavior (see Suen & Ary, 1989).

 

However, this latter use of MTS is uncommon in the social sciences literature. Like interval-recording procedures (i.e., partial-interval and whole-interval recording), MTS can be used with a number of interval lengths, observation durations, and target behaviors depending upon the target assessment question.
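The scoring rule described above can be illustrated with a minimal Python sketch. It assumes a hypothetical second-by-second record of whether the student is engaged, and it scores the behavior at the last second of each interval (one of the two conventions mentioned above). The function name, the 15-second default, and the example record are illustrative assumptions, not part of any published MTS protocol.

```python
from typing import Sequence


def mts_prevalence(engaged: Sequence[bool], interval_s: int = 15) -> float:
    """Estimate prevalence with momentary time-sampling.

    engaged[t] is True if the target behavior is occurring at second t.
    Only the final second of each interval is scored (assumed convention);
    any trailing partial interval is ignored.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    n_intervals = len(engaged) // interval_s
    if n_intervals == 0:
        raise ValueError("record is shorter than one interval")
    # Score each interval by the momentary state at its last second.
    scored = [engaged[(i + 1) * interval_s - 1] for i in range(n_intervals)]
    return sum(scored) / n_intervals


# Hypothetical 2-minute record: engaged for 80 s, then off-task for 40 s.
record = [True] * 80 + [False] * 40
estimate = mts_prevalence(record, interval_s=15)   # 5 of 8 intervals scored
true_prevalence = sum(record) / len(record)        # continuous-record value
print(f"MTS estimate: {estimate:.3f}, continuous record: {true_prevalence:.3f}")
```

In this made-up record the MTS estimate (0.625) is close to, but not identical with, the continuous value (about 0.667), which is the kind of discrepancy the validity studies summarized later quantify.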

Assessment Format:

  • Individual
  • Group

 

Administration Time:

  • Variable

 

Scoring Time:

  • Variable

 

Scoring Method:

  • Calculated automatically
  • Calculated manually

 

Scores Generated:

  • Frequency
  • Duration
  • Percent
  • Peer Comparison

 

 

Reliability

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: Half-filled bubble, Half-filled bubble

Justify the appropriateness of each type of reliability reported:

No qualifying evidence provided.

 

Describe the sample characteristics for each reliability analysis conducted:

Sample information for Briesch, Chafouleas, & Riley-Tillman (2010) study:

Examinee sample: 12 students. Mean age = 5 years 11 months. White = 10, African-American = 1, Asian = 1. Female = 7, Male = 5.

SDO rater sample: 2 researchers, trained with videos to 95% IOA criterion between observers (kappa = .89). Training lasted 8 hours.

Sample information for Wood, Hojnoski, Laracy, & Olson (2015) study:

Examinee sample: 24 children. Female = 11, Male = 13. Age range = 38 – 65 months. Mean age = 51 months, SD = 8.30 months. Majority Caucasian. Primary Language of English = 23, Spanish = 1. Special education services for speech/language = 6. Special education services for unidentified needs = 1.

Rater sample: 3 researchers. Trained using three training videos with criterion of 85% agreement on 3 consecutive videos met.

Sample information for Zakszeski, Hojnoski, & Wood (2017) study:

Examinee sample: 24 children. Female = 11, Male = 13. Age range = 38 – 66 months. Mean age = 51 months, SD = 8 months. White = 16. Primary Language of English = 23, Spanish = 1. Special education services = 7.

Rater sample: 3 researchers. Trained using three training videos with criterion of 85% agreement on all behavior categories.

Sample information for Wood & Hintze (2004) study:

Examinee sample: 14 students. 100% in fifth grade. Female = 7, male = 7. Mean age = 12 years 2 months (SD = 1.5 months). Gen ed = 12, special education = 3. Caucasian = 12, African American = 2.

Rater sample: 5 school psychology graduate students. 100% female. Trained for 4 hours against master-coded video. All demonstrated >90% agreement with master codes.

Sample information for Johnson, Chafouleas, & Briesch (2017) study:

Examinee sample: 1 elementary-aged student. White Male.

Rater sample: 10 school psychology graduate students. Hours of prior SDO training: M = 7.2, SD = 5.0.

Sample information for Briesch, Volpe, & Ferguson (2014) study:

Examinee sample: 16 students. 100% in 7th grade. Male = 12, female = 4. 100% ethnic minority group.

Two subsamples: general classroom group and eligible for intervention group.

Rater sample: 4 school psychology graduate students, trained with videos to 95% IOA criterion between observer and researcher.

Sample information for Ferguson, Briesch, Volpe, & Daniels (2012) study:

Examinee sample: 20 students. 100% in 7th grade. 11 = male, 9 = female. 100% students of color.

Rater sample: 2 school psychology graduate students, trained using three 10-min videos, demonstrating 88% IOA / .74 kappa.

 

Describe the analysis procedures for each reported type of reliability:

No qualifying evidence provided.

 

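Subscale: Academically Engaged; Form: Researcher; Age Range: Early childhood/K

Type of Reliability: G theory
Age or Grade: Early childhood / Kindergarten
n (examinees): 12
n (raters): 2
Coefficient: Eρ² for one observation per day across:
  • 1 day = 0.50
  • 5 days = 0.83
  • 10 days = 0.91
  • 15 days = 0.93
  • 20 days = 0.98
  • 100 days = 0.99
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Early childhood / Kindergarten
n (examinees): 12
n (raters): 2
Coefficient: Eρ² for three observations per day across:
  • 1 day = 0.73
  • 5 days = 0.93
  • 10 days = 0.96
  • 15 days = 0.97
  • 20 days = 0.98
  • 100 days = 0.99
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Early childhood / Kindergarten
n (examinees): 12
n (raters): 2
Coefficient: Phi for one observation per day across:
  • 1 day = 0.48
  • 5 days = 0.82
  • 10 days = 0.90
  • 15 days = 0.93
  • 20 days = 0.97
  • 100 days = 0.99
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Early childhood / Kindergarten
n (examinees): 12
n (raters): 2
Coefficient: Phi for three observations per day across:
  • 1 day = 0.70
  • 5 days = 0.92
  • 10 days = 0.96
  • 15 days = 0.97
  • 20 days = 0.97
  • 100 days = 0.99
Confidence Interval: Not reported

Type of Reliability: Interobserver agreement
Age or Grade: Early childhood / Kindergarten
n (examinees): 24
n (raters): 3
Coefficient:
  • Observation period ranged from 10:19 to 19:59 (min:sec), mean of 14 min.
  • Mean IOA (percent agreement) = 95.5%, range = 91.2% - 100%.
  • Kappa = 0.89
Confidence Interval: Not reported

Type of Reliability: Interobserver agreement
Age or Grade: Early childhood / Kindergarten
n (examinees): 8
n (raters): 2
Coefficient: Kappa = 0.754
Confidence Interval: Not reported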
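For readers less familiar with the two interobserver agreement statistics reported above, the following Python sketch shows how percent agreement and Cohen's kappa can be computed from two observers' interval-by-interval MTS scores. The observer records are invented for illustration, and the cited studies may have aggregated agreement differently (for example, per session or per behavior category).

```python
from typing import Sequence


def percent_agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Proportion of intervals on which two observers gave the same score."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("records must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two observers scoring each interval as 0 or 1."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    p_a = sum(a) / n  # proportion of intervals observer A scored as occurrence
    p_b = sum(b) / n  # proportion of intervals observer B scored as occurrence
    # Agreement expected by chance if the observers scored independently.
    p_expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical scores for ten intervals (1 = engaged at the sampled moment).
observer_1 = [1, 1, 1, 0, 0, 1, 1, 0, 1, 1]
observer_2 = [1, 1, 0, 0, 0, 1, 1, 0, 1, 1]
print(f"percent agreement = {percent_agreement(observer_1, observer_2):.2f}")
print(f"kappa = {cohens_kappa(observer_1, observer_2):.3f}")
```

Kappa is lower than raw percent agreement because it discounts the agreement that would be expected by chance given each observer's base rate of scoring an occurrence.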

 

 

Subscale: Academically Engaged; Form: Researcher; Age Range: Elementary school

Type of Reliability: G theory
Age or Grade: Elementary school
n (examinees): 14
n (raters): 5
Coefficient: Eρ² (length of observation = 15 minutes, observers = 1)
  • 10 days, 2 obs per day = 0.63
  • 10 days, 1 obs per day = 0.50
  • 3 days, 1 obs per day = 0.25
  • 20 days, 2 obs per day = 0.71
  • 40 days, 4 obs per day = 0.83
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Elementary school
n (examinees): 14
n (raters): 5
Coefficient: Phi (length of observation = 15 minutes, observers = 1)
  • 10 days, 2 obs per day = 0.62
  • 10 days, 1 obs per day = 0.46
  • 3 days, 1 obs per day = 0.25
  • 20 days, 2 obs per day = 0.62
  • 40 days, 4 obs per day = 0.83
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Elementary school
n (examinees): 1
n (raters): 10
Coefficient: Eρ² (length of observation = 10 minutes, number of behaviors observed = 1), by number of raters:
  • 1 = 0.86
  • 2 = 0.92
  • 3 = 0.95
  • 4 = 0.96
  • 5 = 0.97
  • 6 = 0.97
  • 7 = 0.98
  • 8 = 0.98
  • 9 = 0.98
  • 10 = 0.98
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Elementary school
n (examinees): 1
n (raters): 10
Coefficient: Phi (length of observation = 10 minutes, number of behaviors observed = 1), by number of raters:
  • 1 = 0.76
  • 2 = 0.87
  • 3 = 0.91
  • 4 = 0.93
  • 5 = 0.94
  • 6 = 0.95
  • 7 = 0.96
  • 8 = 0.96
  • 9 = 0.97
  • 10 = 0.97
Confidence Interval: Not reported

 

 

Subscale: Academically Engaged; Form: Researcher; Age Range: Middle school

Type of Reliability: G theory
Age or Grade: Middle school
n (examinees): 16
n (raters): 4
Coefficient: Phi (general group):
  • 2 observers, 20 min period, 5 observations = 0.87
  • 1 observer, 20 min period, 2 observations = 0.71
  • 1 observer, 20 min period, 4 observations = 0.82
  • 1 observer, 20 min period, 10 observations = 0.91
Phi (eligible group):
  • 2 observers, 20 min period, 5 days = 0.75
  • 1 observer, 20 min period, 4 observations = 0.70
  • 1 observer, 20 min period, 8 observations = 0.81
  • 1 observer, 20 min period, 10 observations = 0.84
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Middle school
n (examinees): 20
n (raters): 2
Coefficient: Rater not in model; all observations based on assumption of a single rater. 2 days, 6 five-minute observations (original study conditions).
  • Eρ² = 0.71
  • Phi = 0.70
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Middle school
n (examinees): 20
n (raters): 2
Coefficient: Phi
  • 2 days, 20 five-minute observations = 0.74
  • 3 days, 3 five-minute observations = 0.71
  • 4 days, 2 five-minute observations = 0.72
  • 5 days, 2 five-minute observations = 0.76
  • 3 days, 9 five-minute observations = 0.80
  • 4 days, 5 five-minute observations = 0.81
  • 5 days, 3 five-minute observations = 0.80
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Middle school
n (examinees): Not reported
n (raters): Not reported
Coefficient: Eρ² (observations within one day)
  • 1 five-minute obs = 0.46
  • 2 five-minute obs = 0.63
  • 3 five-minute obs = 0.72
  • 4 five-minute obs = 0.77
  • 5 five-minute obs = 0.81
  • 6 five-minute obs = 0.83
  • 7 five-minute obs = 0.85
  • 8 five-minute obs = 0.87
  • 9 five-minute obs = 0.88
  • 10 five-minute obs = 0.89
  • 11 five-minute obs = 0.90
  • 12 five-minute obs = 0.91
Confidence Interval: Not reported

Type of Reliability: G theory
Age or Grade: Middle school
n (examinees): 20
n (raters): 2
Coefficient: Phi (observations within one day)
  • 1 five-minute obs = 0.43
  • 2 five-minute obs = 0.61
  • 3 five-minute obs = 0.70
  • 4 five-minute obs = 0.75
  • 5 five-minute obs = 0.79
  • 6 five-minute obs = 0.82
  • 7 five-minute obs = 0.84
  • 8 five-minute obs = 0.86
  • 9 five-minute obs = 0.87
  • 10 five-minute obs = 0.89
  • 11 five-minute obs = 0.89
  • 12 five-minute obs = 0.90
Confidence Interval: Not reported
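The G theory rows above project the generalizability coefficient (Eρ²) and the dependability coefficient (Phi) for different numbers of days and observations. As a rough illustration of how such decision-study projections are produced, the Python sketch below assumes one particular design: persons crossed with days, with observations nested within days. The designs in the cited studies differ (some include raters as a facet), and the variance components used here are hypothetical values chosen only to make the arithmetic visible; they are not estimates from any of the studies.

```python
def d_study(var_p: float, var_d: float, var_pd: float,
            var_o_in_d: float, var_po_in_d_e: float,
            n_days: int, n_obs_per_day: int) -> tuple[float, float]:
    """Project E-rho-squared and Phi for a person x (observation:day) design.

    var_p          person (universe-score) variance
    var_d          day variance
    var_pd         person-by-day interaction variance
    var_o_in_d     observation-within-day variance
    var_po_in_d_e  person-by-observation-within-day variance plus residual
    """
    n_total = n_days * n_obs_per_day
    # Relative error: only terms that change the rank ordering of persons.
    rel_error = var_pd / n_days + var_po_in_d_e / n_total
    # Absolute error: adds terms that shift everyone's scores up or down.
    abs_error = rel_error + var_d / n_days + var_o_in_d / n_total
    e_rho2 = var_p / (var_p + rel_error)
    phi = var_p / (var_p + abs_error)
    return e_rho2, phi


# Hypothetical variance components (NOT taken from any cited study).
components = dict(var_p=40.0, var_d=5.0, var_pd=10.0,
                  var_o_in_d=5.0, var_po_in_d_e=40.0)

for days, obs in [(1, 1), (5, 2), (10, 2)]:
    e_rho2, phi = d_study(**components, n_days=days, n_obs_per_day=obs)
    print(f"{days} day(s), {obs} obs/day: E-rho2 = {e_rho2:.2f}, Phi = {phi:.2f}")
```

Because absolute error includes every term in relative error plus the day and observation main effects, Phi can never exceed Eρ² under this formulation, which matches the pattern in the coefficients reported above.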

 

 

 

Validity

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: Empty bubble, Empty bubble

Describe and justify the criterion measures used to demonstrate validity:

No qualifying evidence provided.

 

Describe the sample characteristics for each validity analysis conducted:

Sample information for Wood, Hojnoski, Laracy, & Olson (2015) study:

Examinee sample: 24 children. Female = 11, Male = 13. Age range = 38 – 65 months. Mean age = 51 months, SD = 8.30 months. Majority Caucasian. Primary Language of English = 23, Spanish = 1. Special education services for speech/language = 6. Special education services for unidentified needs = 1

Rater sample: 3 researchers. Trained using three training videos with criterion of 85% agreement on 3 consecutive videos met.

Sample information for Saudargas & Zanolli (1990) study:

Examinee sample: 16 students. Grade 1 = 2, Grade 2 = 1, Grade 3 = 5, Grade 4 = 8.

Rater sample: 2 graduate students. Trained using videotapes.

 

Describe the analysis procedures for each reported type of validity:

No qualifying evidence provided.

 

Subscale: Academically Engaged; Form: Researcher; Age Range: Early childhood/K

Type of Validity: Convergent
Age or Grade: Early childhood / Kindergarten
Test or Criterion: Continuous Duration Recording (CDR)
n (examinees): 24
n (raters): 3
Coefficient:
  • Relative percent difference with CDR. Average = 8.31%, SD = 7.65, Range = 0.49% to 31.59%.
  • Correlation (across students) with CDR = 0.83, p < 0.001
  • Absolute difference with CDR. Average = 0.06, SD = 0.05, p < 0.001.
  • Correlations between MTS-derived rank ordering of student engagement and teacher- or expert-nominated rankings of student engagement, using Spearman’s rho: Rho (teacher ranking) = 0.34 (p = 0.108); Rho (expert ranking) = 0.71 (p < 0.001)
Confidence Interval / Measurement Error:
  • Absolute mean measurement error against CDR = 6.28%
  • Mean measurement error against CDR = -3.35%

Type of Validity: Convergent
Age or Grade: Early childhood / Kindergarten
Test or Criterion: CDR
n (examinees): 24
n (raters): 3
Coefficient: Pearson’s r = 0.890 (p < .01)
Confidence Interval / Measurement Error: Measurement error = 2.04% (% from MTS minus % from CDR)

 

Subscale: Academically Engaged; Form: Researcher; Age Range: Elementary

Type of Validity: Convergent
Age or Grade: Elementary
Test or Criterion: Continuous Observation
n (examinees): 16
n (raters): 2
Coefficient: Less than 9% discrepancy identified between scores derived from MTS and continuous observation for 18 of 22 observations (82%)*
Confidence Interval: Not reported

* Based upon visual analysis from 20-min observation periods, which suggested similar patterns of behavior for most days. In almost all cases, trend followed across days, even when level was discrepant.

 

Describe the degree to which the provided data support the validity of the tool:

As is true for evidence of sensitivity to change, validity evidence for estimates of academic engagement derived from 15-second MTS procedures is sparse, in part because time-sampling procedures in general, and MTS specifically, are often viewed as a gold-standard measure when continuous observation is not feasible.

More recently, Wood, Hojnoski, Laracy, and Olson (2015) examined the error of MTS-derived estimates of prevalence relative to those derived from continuous observation. MTS was found to be the least error-prone method when compared with partial-interval (PI) and whole-interval (WI) sampling. Absolute mean error (across students) was 6.28%, while mean measurement error, which preserves the direction of over- or underestimation, was -3.35%. The Pearson correlation coefficient between MTS-derived estimates and those from continuous observation was .83, and Spearman’s rho, a non-parametric rank-order correlation coefficient, was .71 when MTS-derived rankings were compared with expert rankings of student engagement. In a follow-up to this study, Zakszeski, Hojnoski, and Wood (2017) again compared MTS-derived estimates of prevalence with those derived from continuous observation. The Pearson correlation coefficient between the two sets of estimates was .890 (p < .01), with an observed measurement error of 2.04% (the percentage derived from continuous observation subtracted from the percentage derived from MTS).

In a less quantitative study, Saudargas and Zanolli (1990) used visual analysis to examine patterns of engagement estimates derived from both continuous observation and MTS. In almost all cases, trends in the two data patterns were consistent across days, even when level was discrepant. Quantitative results reported by the authors indicate a discrepancy of less than 9% between scores derived from MTS and continuous observation for 18 of 22 observations (82%).
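The error statistics discussed above (signed mean error, absolute mean error, and the Pearson correlation between MTS and continuous-observation estimates) can be sketched in Python as follows. The per-student percentages are invented for illustration and do not reproduce data from any cited study; the sign convention (MTS minus continuous observation) follows the description above.

```python
from math import sqrt
from statistics import mean
from typing import Sequence


def error_summary(mts: Sequence[float], cdr: Sequence[float]) -> dict:
    """Compare MTS estimates with continuous duration recording (CDR).

    Errors are computed as MTS minus CDR, so a negative mean error
    indicates that MTS underestimated engagement on average.
    """
    if len(mts) != len(cdr) or len(mts) < 2:
        raise ValueError("need two equal-length samples of at least 2 values")
    signed = [m - c for m, c in zip(mts, cdr)]
    # Pearson correlation computed from deviations about each mean.
    m_bar, c_bar = mean(mts), mean(cdr)
    cov = sum((m - m_bar) * (c - c_bar) for m, c in zip(mts, cdr))
    ss_m = sum((m - m_bar) ** 2 for m in mts)
    ss_c = sum((c - c_bar) ** 2 for c in cdr)
    return {
        "mean_error": mean(signed),
        "mean_absolute_error": mean(abs(e) for e in signed),
        "pearson_r": cov / sqrt(ss_m * ss_c),
    }


# Hypothetical percent-engaged estimates for four students.
mts_pct = [60.0, 75.0, 40.0, 90.0]
cdr_pct = [65.0, 70.0, 50.0, 90.0]
summary = error_summary(mts_pct, cdr_pct)
print(summary)
```

Reporting both the signed and the absolute error is useful because over- and underestimates can cancel in the signed mean, making it smaller in magnitude than the absolute mean, as in the Wood et al. (2015) figures.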

Bias Analysis Conducted

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: No, No

Have additional analyses been conducted to establish whether the tool is or is not biased against demographic subgroups (e.g., students who vary by race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)?

Bias Analysis Method:

No qualifying evidence provided.

 

Subgroups Included:

No qualifying evidence provided.

 

Bias Analysis Results:

No qualifying evidence provided.

Sensitivity

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: dash, dash

Describe evidence that the monitoring system produces data that are sensitive to detect incremental change (i.e., small behavior change in a short period of time):

Evidence on sensitivity to change comparing MTS with 15-second intervals for academic engagement is somewhat difficult to identify, given that SDO procedures are often put forth as the gold standard against which other methods are evaluated when continuous observation is not available. The congruence between data derived from continuous observation and MTS has been examined in studies such as Sharp, Mudford, and Elliffe (2015), but these are methodology-wide and do not pertain to estimates of academic engagement specifically. However, knowing specific characteristics of the academic engagement that will be observed may help bolster evidence towards sensitivity to change.

Reliability: Intensive Population

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: dash, dash

Justify the appropriateness of each type of reliability reported:

No qualifying evidence provided.

 

Describe the sample characteristics for each reliability analysis conducted:

No qualifying evidence provided.

 

Describe reliability of the slope analyses conducted with a population of students in need of intensive intervention:

No qualifying evidence provided.

Validity: Intensive Population

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: dash, dash

Describe and justify the criterion measures used to demonstrate validity:

No qualifying evidence provided.

 

Describe the sample characteristics for each validity analysis conducted:

No qualifying evidence provided.

 

Describe predictive validity of the slope of improvement analyses conducted with a population of students in need of intensive intervention:

No qualifying evidence provided.

 

Describe the degree to which the provided data support the validity of the tool:

No qualifying evidence provided.

Decision Rules: Changing Intervention

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: dash, dash

Specification of validated decision rules for when changes to the intervention should be made:

No qualifying evidence provided.

 

Evidentiary basis for these rules:

No qualifying evidence provided.

Decision Rules: Choosing Intervention

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Rating: dash, dash

Specification of validated decision rules to inform intervention selection:

No qualifying evidence provided.

 

Evidentiary basis for these rules:

No qualifying evidence provided.

Administration Format

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Data
  • Direct Observation
  • Direct Observation

Admin & Scoring Time

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Data
  • Variable
  • Variable

Scoring Format

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Data
  • Manually-scored
  • Computer-scored
  • Manually-scored
  • Computer-scored

Levels of Performance

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Data
  • Variable
  • Variable

Specify the levels of performance and how they are used for progress monitoring:

Fellers, G., & Saudargas, R. A. (1987). Classroom Behaviors of LD and Nonhandicapped Girls. Learning Disability Quarterly, 10(3), 231. http://doi.org/10.2307/1510495

Observed behavior of two groups of 15 female students (LD and non-LD; total n = 30) across grades 2, 4, and 5 from public elementary schools. LD and non-LD students were matched based on classroom (i.e., one for each group drawn from each classroom). Observed using SECOS system, which utilizes a combined definition of academic engagement called “schoolwork” with 15s MTS procedures. Students were observed at least three times for 20 minutes across two weeks.

Percentage of total intervals during which “seatwork” was indicated, as mean (M) and standard deviation (SD).

LD group. M = 68.3%, SD = 12.7%.

Non-LD group. M = 73.9%, SD = 14.3%.

 

Slate, J. R., & Saudargas, R. A. (1986). Differences in Learning Disabled and Average Students’ Classroom Behaviors. Learning Disability Quarterly, 9(1), 61. http://doi.org/10.2307/1510402

Observed behavior of two groups of 14 male students (LD and non-LD; total n = 28) across grades 3, 4, and 5 from public elementary schools. Of LD group, White = 7, Black = 7. Of non-LD group, White = 6, Black = 8. Observed using SECOS system, which utilizes a combined definition of academic engagement called “schoolwork” with 15s MTS procedures. Students were observed four to six times for 20 minutes across 10 weeks.

Percentage of total intervals during which “seatwork” was indicated, as mean (M) and standard deviation (SD).

LD group. M = 67.9%, SD = 12.1%.

Non-LD group. M = 68.1%, SD = 8.53%.

 

Slate, J. R., & Saudargas, R. A. (1986). Differences in the classroom behaviors of behaviorally disordered and regular class children. Behavioral Disorders, 45–53

Observed behavior of two groups of 13 male students (behaviorally disordered [BD] and non-BD; total n = 26) across grades 3, 4, and 5 from public elementary schools. Observed using SECOS system, which utilizes a combined definition of academic engagement called “schoolwork” with 15s MTS procedures. Students were observed four times for 20 minutes, with each individual student’s observations occurring within a single two-week period.

Percentage of total intervals during which “seatwork” was indicated, as mean (M) and standard deviation (SD).

BD group. M = 66.83%, SD = 14.38%.

Non-BD group. M = 67.52%, SD = 7.40%.

 

Zigmond, N., Kerr, M. M., & Schaeffer, A. (1988). Behavior patterns of learning disabled and non-learning-disabled adolescents in high school academic classes. Remedial and Special Education, 9(2), 6–11.

Observed behavior of three groups of students: students with LD, students with emotional disturbance (ED), and a control group of students. Observed using 15s MTS procedures of on-task behavior. Students were observed twice weekly for 30 minutes.

LD group: n = 36. Male = 28, Female = 8. Grades 9 to 11.

ED group: n = 8.  Male = 7, Female = 1. Grades 9 to 12.

Control students: typical students, randomly selected at each observation of a student with LD or ED.

Number of total intervals during which “on-task” was indicated, as mean (M) and standard deviation (SD). Total intervals = 15.

LD group. M = 8.49, SD = 2.734

ED group. M = 8.78, SD = 1.974

Control group. M = 8.82, SD = 1.742

C5. Usability Study

If a usability study has been conducted on your tool, describe the results of the study:

Riley-Tillman, T., Chafouleas, S., Briesch, A., & Eckert, T. (2008). Daily Behavior Report Cards and Systematic Direct Observation: An investigation of the acceptability, reported training and use, and decision reliability among school psychologists. Journal of Behavioral Education, 17(4), 313-327. doi:10.1007/s10864-008-9070-5

The broader class of systematic direct observation (SDO) methodologies, which includes MTS, has been examined in a combined usability and social validity study conducted by Riley-Tillman, Chafouleas, Briesch, and Eckert (2008). The total sample size across two samples of school psychologists was 191 (92 in Study 1, 99 in Study 2). Most respondents worked in public schools (83.7% in Study 1, 88.9% in Study 2), were female (76.1%, 74.7%), practiced with a “Masters plus 30” credential (48.9%, 41.4%), and were fairly evenly split across years in practice, urbanicity, and age group served. Responses to 16 Likert-type items (1 = strongly disagree, 6 = strongly agree) indicated that SDO procedures were generally perceived as acceptable to very acceptable (mean scores for positively worded items ranged from 4.4 to 5.1 across samples). Items addressing the time demands and intrusiveness of the procedures for teachers/staff, school psychologists, and the general classroom environment each began with the stem “The use of this technique was overly intrusive on…” and received mean ratings of 2.0 to 2.8 on the same scale, indicating that respondents perceived the procedures as low to moderately intrusive. Mean responses to the item “This technique provides a feasible method of assessing the effectiveness of an intervention” were 4.7 and 4.8 across samples.

Usability Study

Age/Grade: Informant
Early Childhood / K: Teacher
Grades K-5: Teacher
Data: Yes, Yes

If a social validity study has been conducted on your tool, describe the results of the study:

Many of the items in the study described above (Riley-Tillman, Chafouleas, Briesch, and Eckert, 2008) relate to the specific social validity of SDO procedures. For instance, mean responses for “This technique should prove effective in monitoring an intervention” were 4.9 and 5.0 across samples, “Use of this technique was a good way to handle the child’s problems” were 4.7 and 4.4 across samples, and “Overall, using this technique would be beneficial for the child” were 4.9 and 4.6 across samples.