Momentary Time-Sampling
Academic Engagement
Summary
Momentary time-sampling (MTS) is a behavior assessment methodology within systematic direct observation wherein an observation period is divided into intervals, and behavior during each interval is scored as an occurrence if the behavior is occurring at the moment the interval begins or ends (depending on the specific procedures used). By dividing intervals scored as occurrences by the total number of intervals, MTS provides an estimate of the proportion or percentage of an observation period during which a target behavior was occurring (i.e., “prevalence”). Depending on the characteristics of the underlying behavior, MTS may also be used to estimate the frequency of a target behavior (see Suen & Ary, 1989). However, this latter use of MTS is uncommon in the social sciences literature. Like interval-recording procedures (i.e., partial-interval and whole-interval recording), MTS can be used with a number of interval lengths, observation durations, and target behaviors depending upon the target assessment question. This review focuses on the use of MTS with 15-second intervals, with a target behavior of academic engagement (or AE; defined as including both passive and active engagement, as described below) and a focus on individual students (rather than progress monitoring academic engagement for an entire class or small group). The studies reviewed are those that explicitly examine the reliability, validity, and levels of performance of data derived from MTS with 15-second intervals for AE with individual students as the target.
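The scoring rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the behavior stream is represented as one true/false value per second, and the function names (`mts_score`, `prevalence`), the `sample_at` parameter, and the example stream are all hypothetical, not part of any published MTS protocol.

```python
from typing import List, Sequence


def mts_score(stream: Sequence[bool], interval_len: int = 15,
              sample_at: str = "end") -> List[bool]:
    """Score each complete interval by the behavior state at a single moment.

    stream: assumed per-second record of whether the behavior is occurring.
    sample_at: "start" or "end" of each interval (procedures differ).
    """
    if sample_at not in ("start", "end"):
        raise ValueError("sample_at must be 'start' or 'end'")
    n_intervals = len(stream) // interval_len  # ignore any trailing partial interval
    scores = []
    for i in range(n_intervals):
        first = i * interval_len
        moment = first if sample_at == "start" else first + interval_len - 1
        scores.append(bool(stream[moment]))
    return scores


def prevalence(scores: Sequence[bool]) -> float:
    """Proportion of intervals scored as an occurrence."""
    if not scores:
        raise ValueError("no intervals to score")
    return sum(scores) / len(scores)


# Hypothetical 60-second stream: engaged 20 s, not engaged 25 s, engaged 15 s.
stream = [True] * 20 + [False] * 25 + [True] * 15
end_scores = mts_score(stream, interval_len=15, sample_at="end")
start_scores = mts_score(stream, interval_len=15, sample_at="start")
true_prevalence = sum(stream) / len(stream)  # what continuous observation would give
```

In this made-up stream the two sampling choices give different estimates (0.50 at interval end, 0.75 at interval start) around the continuous value of about 0.58, which is why the specific procedure should be reported alongside the data.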
- Where to Obtain:
- John Hintze & William Matthews
- Initial Cost:
- Free
- Replacement Cost:
- Free
- Included in Cost:
- Momentary time-sampling is described in numerous books, articles, and presentations. Its methods are simple and transparent, so MTS may be considered free to use. Training in the use of MTS typically occurs within the context of a graduate course on behavior assessment (e.g., school psychology, special education, applied behavior analysis), so some costs may be associated with its use. Such training is not necessarily a “prerequisite” to use of the instrument, however, and its cost will vary by user (from free to thousands of dollars).
- The information provided on MTS in published tools varies widely.
- Training Requirements:
- Generally, 8 hours or more; however, training time varies by study. Fellers & Saudargas, 1987: 12 hours (using SECOS system); Briesch, Chafouleas, & Riley-Tillman, 2010: 8 hours (using only MTS); Hintze & Matthews, 2004: 4 hours (using only MTS); Slate & Saudargas, 1986b: 13-15 hours (using SECOS system).
- Qualified Administrators:
- No minimum qualifications specified.
- Access to Technical Support:
- MTS is a common methodology for direct observation, and technical support should be available from any expert in behavior assessment.
- Assessment Format:
-
- Direct observation
- Scoring Method:
-
- Manually (by hand)
- Automatically (computer-scored)
- Technology Requirements:
-
Tool Information
Descriptive Information
- Please provide a description of your tool:
- Momentary time-sampling (MTS) is a behavior assessment methodology within systematic direct observation wherein an observation period is divided into intervals, and behavior during each interval is scored as an occurrence if the behavior is occurring at the moment the interval begins or ends (depending on the specific procedures used). By dividing intervals scored as occurrences by the total number of intervals, MTS provides an estimate of the proportion or percentage of an observation period during which a target behavior was occurring (i.e., “prevalence”). Depending on the characteristics of the underlying behavior, MTS may also be used to estimate the frequency of a target behavior (see Suen & Ary, 1989). However, this latter use of MTS is uncommon in the social sciences literature. Like interval-recording procedures (i.e., partial-interval and whole-interval recording), MTS can be used with a number of interval lengths, observation durations, and target behaviors depending upon the target assessment question. This review focuses on the use of MTS with 15-second intervals, with a target behavior of academic engagement (or AE; defined as including both passive and active engagement, as described below) and a focus on individual students (rather than progress monitoring academic engagement for an entire class or small group). The studies reviewed are those that explicitly examine the reliability, validity, and levels of performance of data derived from MTS with 15-second intervals for AE with individual students as the target.
- Is your tool designed to measure progress towards an end-of-year goal (e.g., oral reading fluency) or progress towards a short-term skill (e.g., letter naming fluency)?
-
ACADEMIC ONLY: What dimensions does the tool assess?
- BEHAVIOR ONLY: Please identify which broad domain(s)/construct(s) are measured by your tool and define each sub-domain or sub-construct.
- This review focuses on examinations of the properties of data derived from MTS with 15-second intervals for measuring AE, when this construct is defined as including both active (e.g., writing on a piece of paper) and passive (e.g., looking at the teacher during a lecture) engagement. In other words, a student could engage in active or passive engagement in order to be considered to be engaging in AE.
- BEHAVIOR ONLY: Which category of behaviors does your tool target?
Externalizing
Acquisition and Cost Information
Administration
Training & Scoring
Training
- Is training for the administrator required?
- Yes
- Describe the time required for administrator training, if applicable:
- Generally, 8 hours or more; however, training time varies by study. Fellers & Saudargas, 1987: 12 hours (using SECOS system); Briesch, Chafouleas, & Riley-Tillman, 2010: 8 hours (using only MTS); Hintze & Matthews, 2004: 4 hours (using only MTS); Slate & Saudargas, 1986b: 13-15 hours (using SECOS system).
- Please describe the minimum qualifications an administrator must possess.
- No minimum qualifications
- Are training manuals and materials available?
- Yes
- Are training manuals/materials field-tested?
- No
- Are training manuals/materials included in cost of tools?
- Yes
- If No, please describe training costs:
- Can users obtain ongoing professional and technical support?
- Yes
- If Yes, please describe how users can obtain support:
- MTS is a common methodology for direct observation, and technical support should be available from any expert in behavior assessment.
Scoring
- Please describe the scoring structure. Provide relevant details such as the scoring format, the number of items overall, the number of items per subscale, what the cluster/composite score comprises, and how raw scores are calculated.
- Typically, when calculating level of performance, comparisons are made either within-student (“absolute” decision-making, using data collected for a specific student across time) or between a target student and a peer (“relative” decision-making, as takes place within the BOSS). Prevalence is the most frequent score resulting from MTS and is calculated by summing the number of intervals scored as an occurrence and dividing this value by the total number of intervals observed. Prevalence can be converted into a percentage (by multiplying prevalence by 100) or into a duration estimate (by multiplying prevalence by the length of the observation period). Frequency can also be calculated according to formulas found in Suen & Ary (1989) when certain criteria for the interval length and behavior stream are met.
- Do you provide basis for calculating slope (e.g., amount of improvement per unit in time)?
- No
- ACADEMIC ONLY: Do you provide benchmarks for the slopes?
- ACADEMIC ONLY: Do you provide percentile ranks for the slopes?
- Describe the tool’s approach to progress monitoring, behavior samples, test format, and/or scoring practices, including steps taken to ensure that it is appropriate for use with culturally and linguistically diverse populations and students with disabilities.
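The scoring arithmetic described in this section (prevalence, percentage, and duration estimate) can be illustrated with a short Python sketch. The function name `summarize_mts`, its return keys, and the example observation are assumptions made for illustration; the frequency formulas from Suen & Ary (1989) are not reproduced here.

```python
from typing import Dict, Sequence


def summarize_mts(interval_scores: Sequence[bool],
                  interval_seconds: float = 15.0) -> Dict[str, float]:
    """Convert interval-by-interval MTS scores into the summary scores
    described in the text: prevalence, percentage, and duration estimate."""
    n_intervals = len(interval_scores)
    if n_intervals == 0:
        raise ValueError("at least one interval is required")
    occurrences = sum(1 for s in interval_scores if s)
    prev = occurrences / n_intervals                    # proportion of intervals
    observation_seconds = n_intervals * interval_seconds
    return {
        "prevalence": prev,
        "percentage": prev * 100,                       # prevalence x 100
        "duration_estimate_seconds": prev * observation_seconds,
    }


# Hypothetical 10-minute observation: 40 intervals of 15 s, 30 scored as engaged.
example_scores = [True] * 30 + [False] * 10
summary = summarize_mts(example_scores, interval_seconds=15.0)
```

For this hypothetical observation the summary is a prevalence of 0.75, a percentage of 75, and an estimated 450 seconds of engagement out of 600. A relative comparison would apply the same function to a peer's scores and compare the two results.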
Levels of Performance and Usability
- Date
- 1986-1988
- Size
- Small separate samples.
- Male
- Female
- Unknown
- Eligible for free or reduced-price lunch
- Other SES Indicators
- White, Non-Hispanic
- Black, Non-Hispanic
- Hispanic
- American Indian/Alaska Native
- Asian/Pacific Islander
- Other
- Unknown
- Disability classification (Please describe)
- First language (Please describe)
- Language proficiency status (Please describe)
Performance Level
Reliability
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- *Offer a justification for each type of reliability reported, given the type and purpose of the tool.
- *Describe the sample(s), including size and characteristics, for each reliability analysis conducted.
- BRIESCH, CHAFOULEAS, & RILEY-TILLMAN (2010): Examinee sample: 12 students. Mean age = 5 years 11 months. White = 10, African-American = 1, Asian = 1. Female = 7, Male = 5. SDO rater sample: 2 researchers, trained with videos to 95% IOA criterion between observers (kappa = .89). Training lasted 8 hours.
- WOOD, HOJNOSKI, LARACY, & OLSON (2015): Examinee sample: 24 children. Female = 11, Male = 13. Age range = 38 – 65 months. Mean age = 51 months, SD = 8.30 months. Majority Caucasian. Primary Language of English = 23, Spanish = 1. Special education services for speech/language = 6. Special education services for unidentified needs = 1. Rater sample: 3 researchers. Trained using three training videos with criterion of 85% agreement on 3 consecutive videos met.
- ZAKSZESKI, HOJNOSKI, & WOOD (2017): Examinee sample: 24 children. Female = 11, Male = 13. Age range = 38 – 66 months. Mean age = 51 months, SD = 8 months. White = 16. Primary Language of English = 23, Spanish = 1. Special education services = 7. Rater sample: 3 researchers. Trained using three training videos with criterion of 85% agreement on all behavior categories.
- *Describe the analysis procedures for each reported type of reliability.
*In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Include detail about the type of reliability data, statistic generated, and sample size and demographic information.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- *Briesch, Chafouleas, & Riley-Tillman, 2010 +Hintze & Matthews, 2004 ^(Johnson, Chafouleas, & Briesch, 2017)
- Manual cites other published reliability studies:
- Yes
- Provide citations for additional published studies.
- Briesch, A. M., Chafouleas, S. M., & Riley-Tillman, T. C. (2010). Generalizability and dependability of behavior assessment methods to estimate academic engagement: A comparison of systematic direct observation and direct behavior rating. School Psychology Review, 39(3), 408.
- Wood, B. K., Hojnoski, R. L., Laracy, S. D., & Olson, C. L. (2015). Comparison of Observational Methods and Their Relation to Ratings of Engagement in Young Children. Topics in Early Childhood Special Education, 0271121414565911.
- Zakszeski, B. N., Hojnoski, R. L., & Wood, B. K. (2017). Considerations for Time Sampling Interval Durations in the Measurement of Young Children’s Classroom Engagement. Topics in Early Childhood Special Education, 37(1), 42-53.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated reliability data.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- *Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- *Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- WOOD, HOJNOSKI, LARACY, & OLSON (2015): Examinee sample: 24 children. Female = 11, Male = 13. Age range = 38 – 65 months. Mean age = 51 months, SD = 8.30 months. Majority Caucasian. Primary Language of English = 23, Spanish = 1. Special education services for speech/language = 6. Special education services for unidentified needs = 1. Rater sample: 3 researchers. Trained using three training videos with criterion of 85% agreement on 3 consecutive videos met.
- SAUDARGAS & ZANOLLI (1990): Examinee sample: 16 students. Grade 1 = 2, Grade 2 = 1, Grade 3 = 5, Grade 4 = 8. Rater sample: 2 graduate students. Trained using videotapes.
- *Describe the analysis procedures for each reported type of validity.
*In the table below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- *Wood, Hojnoski, Laracy, & Olson, 2015 +Zakszeski, Hojnoski, & Wood, 2017 ^Saudargas & Zanolli, 1990
- Manual cites other published reliability studies:
- Yes
- Provide citations for additional published studies.
- Wood, B. K., Hojnoski, R. L., Laracy, S. D., & Olson, C. L. (2015). Comparison of Observational Methods and Their Relation to Ratings of Engagement in Young Children. Topics in Early Childhood Special Education, 0271121414565911.
- Zakszeski, B. N., Hojnoski, R. L., & Wood, B. K. (2017). Considerations for Time Sampling Interval Durations in the Measurement of Young Children’s Classroom Engagement. Topics in Early Childhood Special Education, 37(1), 42-53.
- Describe the degree to which the provided data support the validity of the tool.
- As is true for information regarding sensitivity to change, validity evidence for estimates of academic engagement derived from MTS 15-second procedures is sparse, given that time-sampling procedures in general and MTS specifically are often viewed as a gold standard measure when continuous observation is not feasible. Recently, however, Wood, Hojnoski, Laracy, and Olson (2015) examined the error of MTS-derived estimates of prevalence when compared to those derived from continuous observation. MTS was found to be the least error-prone estimate when compared to partial-interval (PI) and whole-interval (WI) sampling. Absolute mean error (across students) was 6.28%, while the signed mean error, which preserves the direction of over- or underestimation, was -3.35%. The Pearson correlation coefficient between MTS-derived estimates and those from continuous observation was .83, and Spearman’s rho, a non-parametric rank-order correlation coefficient, was .71 when MTS-derived estimates were compared to expert rankings of student engagement. In a follow-up to this study, Zakszeski, Hojnoski, and Wood (2017) examined the error of MTS-derived estimates of prevalence when compared to those derived from continuous observation. The Pearson correlation coefficient between MTS-derived estimates and those from continuous observation was .890 (p < .01), with an observed measurement error of 2.04% (percentage derived from continuous observation subtracted from percentage derived from MTS). In a less quantitative study, Saudargas and Zanolli (1990) used visual analysis to examine patterns of engagement estimates derived from both continuous observation and MTS. In almost all cases, trends in the two data patterns were consistent across days, even when level was discrepant. Quantitative results reported by the authors indicate that the discrepancy between scores derived from MTS and continuous observation was less than 9% for 18 of 22 observations (82%).
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
If yes, fill in data for each subgroup with disaggregated validity data.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
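The error statistics reported in the validity discussion above (signed error as MTS minus continuous observation, absolute mean error, and the Pearson correlation) can be computed as in the following Python sketch. The function names and the four pairs of percentages are hypothetical and chosen only to show the arithmetic; they are not data from any of the cited studies.

```python
import math
from typing import List, Sequence


def signed_errors(mts: Sequence[float], continuous: Sequence[float]) -> List[float]:
    """Per-student error: MTS estimate minus continuous-observation value.
    Negative values mean MTS underestimated engagement."""
    if len(mts) != len(continuous):
        raise ValueError("inputs must have the same length")
    return [m - c for m, c in zip(mts, continuous)]


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation, written out from its definition."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical engagement percentages for four students.
mts_pct = [80.0, 60.0, 90.0, 50.0]
continuous_pct = [85.0, 55.0, 95.0, 55.0]

errors = signed_errors(mts_pct, continuous_pct)
mean_signed_error = mean(errors)                        # keeps over/underestimation
mean_absolute_error = mean([abs(e) for e in errors])    # ignores direction
r = pearson_r(mts_pct, continuous_pct)
```

The signed and absolute means can differ substantially because positive and negative errors cancel in the former; reporting both, as the cited studies do, distinguishes systematic bias from overall inaccuracy.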
Bias Analysis
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | No | No |
- Have you conducted additional analyses related to the extent to which your tool is or is not biased against subgroups (e.g., race/ethnicity, gender, socioeconomic status, students with disabilities, English language learners)? Examples might include Differential Item Functioning (DIF) or invariance testing in multiple-group confirmatory factor models.
- No
- If yes,
- a. Describe the method used to determine the presence or absence of bias:
- b. Describe the subgroups for which bias analyses were conducted:
- c. Describe the results of the bias analyses conducted, including data and interpretative statements. Include magnitude of effect (if available) if bias has been identified.
Growth Standards
Sensitivity to Behavior Change
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- Describe evidence that the monitoring system produces data that are sensitive to detect incremental change (e.g., small behavior change in a short period of time such as every 20 days, or more frequently depending on the purpose of the construct). Evidence should be drawn from samples targeting the specific population that would benefit from intervention. Include in this example a hypothetical illustration (with narrative and/or graphics) of how these data could be used to monitor student performance frequently enough and with enough sensitivity to accurately assess change:
Reliability (Intensive Population): Reliability for Students in Need of Intensive Intervention
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- Offer a justification for each type of reliability reported, given the type and purpose of the tool:
- Describe the sample(s), including size and characteristics, for each reliability analysis conducted:
- Describe the analysis procedures for each reported type of reliability:
In the table(s) below, report the results of the reliability analyses described above (e.g., model-based evidence, internal consistency or inter-rater reliability coefficients). Report results by age range or grade level (if relevant) and include detail about the type of reliability data, statistic generated, and sample size and demographic information.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Do you have reliability data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
- If yes, fill in data for each subgroup with disaggregated reliability data.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of reliability analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Validity (Intensive Population): Validity for Students in Need of Intensive Intervention
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- Describe each criterion measure used and explain why each measure is appropriate, given the type and purpose of the tool.
- Describe the sample(s), including size and characteristics, for each validity analysis conducted.
- Describe the analysis procedures for each reported type of validity.
- In the table(s) below, report the results of the validity analyses described above (e.g., concurrent or predictive validity, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
- Describe the degree to which the provided data support the validity of the tool.
- Do you have validity data that are disaggregated by gender, race/ethnicity, or other subgroups (e.g., English language learners, students with disabilities)?
- No
- If yes, fill in data for each subgroup with disaggregated validity data.
| Type of | Subscale | Subgroup | Informant | Age / Grade | Test or Criterion | n (sample/examinees) | n (raters) | Median Coefficient | 95% Confidence Interval Lower Bound | 95% Confidence Interval Upper Bound |
|---|---|---|---|---|---|---|---|---|---|---|
- Results from other forms of validity analysis not compatible with above table format:
- Manual cites other published reliability studies:
- No
- Provide citations for additional published studies.
Decision Rules: Data to Support Intervention Change
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- Are validated decision rules for when changes to the intervention need to be made specified in your manual or published materials?
- No
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
Decision Rules: Data to Support Intervention Selection
| Age / Grade: Informant | Early childhood / K: Researcher | Grades K-5: Researcher |
|---|---|---|
| Rating | | |
- Are validated decision rules for what intervention(s) to select specified in your manual or published materials?
- No
- If yes, specify the decision rules:
-
What is the evidentiary basis for these decision rules?
Data Collection Practices
Most tools and programs evaluated by the NCII are branded products which have been submitted by the companies, organizations, or individuals that disseminate these products. These entities supply the textual information shown above, but not the ratings accompanying the text. NCII administrators and members of our Technical Review Committees have reviewed the content on this page, but NCII cannot guarantee that this information is free from error or reflective of recent changes to the product. Tools and programs have the opportunity to be updated annually or upon request.