Observation Survey of Early Literacy Achievement

Reading

Cost

Technology, Human Resources, and Accommodations for Special Needs

Service and Support

Purpose and Other Implementation Information

Usage and Reporting

Initial Cost:

Costs associated with An Observation Survey of Early Literacy Achievement are per teacher. No additional costs are incurred per student.

 

Training costs are part of Reading Recovery implementation budgets. Reading Recovery sites employ a teacher leader who is trained in the administration, scoring, and interpretation of the Observation Survey.

 

That teacher leader is responsible for training teachers to use the tool for screening, for selecting children for intervention, for monitoring progress, and for making Reading Recovery exit decisions. Reading Recovery teachers are also trained to use outcomes of the Observation Survey to inform their teaching of each individual child.

 

Replacement Cost:

No information provided; contact vendor for details.

 

Included in Cost:

The following information is included within the book:

1. Theoretical information about observing and recording early literacy behaviors

2. Theoretical information about the processes of reading and writing

3. Information about early detection of literacy difficulties and early intervention

4. A chapter on each task of the Observation Survey that provides (a) rationales for the task; (b) procedures for administering, scoring, and interpreting the task; (c) student protocols; and (d) scoring sheets/record sheets.

5. A chapter on how to summarize, interpret, and use the results of all six tasks

6. Forms to show change over time (progress monitoring)

7. Appendices which include

  • New Zealand norms (including stanines and percentile ranks, inter-correlations, and validity and reliability reports)
  • Administration and Score Sheets
  • U.S. norms (including stanines and percentile ranks) and correlations

 

Technology Requirements:
No information provided; contact vendor for details.

 

Training Requirements:

  • 4-8 hours of training

 

Qualified Administrators:

  • Professionals

 

Accommodations:

Because the Observation Survey measures authentic literacy knowledge and is administered individually, accommodations are made based on teacher observations of each child.

 

Where to Obtain:

Website: www.heinemann.com              

Address: Heinemann,  P.O. Box 6926, Portsmouth, NH, 03802-6926        

Phone number: 800-225-5800


Access to Technical Support:

Trained leaders

An Observation Survey of Early Literacy Achievement comprises six systematic, standard observation tasks developed in research studies. The tasks yield a composite and comprehensive assessment of the literacy performance of young learners. All tasks have the qualities of sound assessment instruments with reliability and validity and discrimination indices established in research. Children are assessed individually by a specially trained teacher.

 

The Observation Survey tasks are designed to allow children to work with the complexities of written language and allow the examiner to be confident of characteristics of good measurement instruments: a standard task; a standard way of administering the task; ways of knowing when we can rely on our observation and make valid comparisons; and a task that relates to real-world tasks (content validity).

 

Because young children begin literacy learning and instruction in early grades with individually unique knowledge and confusions, assessment of beginning reading and writing should inform the examiner of literacy behaviors along several dimensions of learning. Therefore, early assessments must be wide-ranging, with tasks to observe

  • Concepts about print (how print encodes information)
  • The reading of continuous text
  • Letter knowledge
  • Reading vocabulary
  • Writing vocabulary
  • Phonemic awareness and sound-symbol relationships

 

It is important to know how learning is proceeding in each of these areas and to identify problems early. No one task should be used in isolation because it may assess only one aspect of early literacy behavior. Therefore, all six tasks of the Observation Survey are considered as a composite when screening and selecting children for Reading Recovery (a literacy intervention for first graders who are struggling). Because of the comprehensive nature of the assessment of each individual, results are used for screening, to guide teaching, and to monitor progress.

Assessment Format:

  • One-to-one

 

Administration Time:

  • 15-45 minutes per student

 

Scoring Time:

  • 15+ minutes per student

 

Scoring Method:

Raw scores are converted to stanines and all six tasks are considered as a composite to identify children who need a literacy intervention.

For the composite measure, the raw scores on the six tasks for each student are summed to compute a total raw score. Each student’s total raw score is converted to an IRT-based scale score that can range from 0 to 800 points. The IRT-based scale scores were derived by treating each task as a polytomously scored item, and the six tasks were scaled based on a partial-credit Rasch model using the Winsteps computer program. Separate random samples of student task responses from Fall 2009, mid-year (late December 2009 to early February 2010), or Spring 2010 were used as data to create the unidimensional scale that reflects literacy growth.
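The source names a partial-credit Rasch model but does not print it. For orientation, the standard textbook form of that model is sketched below, with each of the six tasks playing the role of item i. The symbols (person ability θ_n, step difficulties δ_ik, maximum task score m_i) are conventional notation and are assumptions here, not taken from the source; the rescaling of ability estimates to the 0 to 800 metric is not specified in the source and is not shown.

```latex
P(X_{ni} = x) =
  \frac{\exp\left( \sum_{k=0}^{x} (\theta_n - \delta_{ik}) \right)}
       {\sum_{h=0}^{m_i} \exp\left( \sum_{k=0}^{h} (\theta_n - \delta_{ik}) \right)},
  \qquad x = 0, 1, \dots, m_i,
  \qquad \text{with } \sum_{k=0}^{0} (\theta_n - \delta_{ik}) \equiv 0 .
```

Read this as the probability that student n earns score x on task i; higher ability shifts probability toward higher task scores.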

 

Scores Generated:

  • Raw score
  • Percentile score
  • Stanines
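To illustrate how percentile ranks and stanines relate, here is a minimal sketch that maps a percentile rank to a stanine using the conventional normal-curve bands (4%, 7%, 12%, 17%, 20%, 17%, 12%, 7%, 4%). The function name and the band boundaries are assumptions for illustration; the norm tables published with the Observation Survey remain the authoritative conversion from raw scores.

```python
from bisect import bisect_right

# Cumulative percentage at which each of stanines 2 through 9 begins,
# following the conventional 4-7-12-17-20-17-12-7-4 split (an assumption;
# not taken from the Observation Survey norm tables).
_STANINE_STARTS = [4, 11, 23, 40, 60, 77, 89, 96]


def stanine_from_percentile(percentile_rank):
    """Return the stanine (1-9) for a percentile rank between 0 and 100."""
    if not 0 <= percentile_rank <= 100:
        raise ValueError("percentile_rank must be between 0 and 100")
    # Count how many band starts are at or below the percentile rank.
    return bisect_right(_STANINE_STARTS, percentile_rank) + 1
```

Under these bands, a percentile rank of 50 falls in stanine 5 and a percentile rank of 3 falls in stanine 1.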

 

 

Classification Accuracy

Grade 1
Criterion 1, Fall: Full bubble
Criterion 1, Winter: Full bubble
Criterion 1, Spring: dash
Criterion 2, Fall: Full bubble
Criterion 2, Winter: dash
Criterion 2, Spring: dash

Primary Sample

 

Criterion 1: Slosson Oral Reading Test-Revised, Grade Equivalent of 1.4
Criterion 2: Observation Survey of Early Literacy Achievement, Text Reading Level

Grade 1

Statistic | Criterion 1, Fall | Criterion 1, Winter | Criterion 2, Fall
Cut points | 421 | 498 | 428
Base rate in the sample for children requiring intensive intervention | 0.06 | 0.06 | 0.23
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 | 0.20
False Positive Rate | 0.19 | 0.19 | 0.19
False Negative Rate | 0.23 | 0.17 | 0.29
Sensitivity | 0.77 | 0.83 | 0.71
Specificity | 0.81 | 0.81 | 0.81
Positive Predictive Power | 0.38 | 0.40 | 0.52
Negative Predictive Power | 0.96 | 0.97 | 0.90
Overall Classification Rate | 0.81 | 0.81 | 0.79
Area Under the Curve (AUC) | 0.87 | 0.90 | 0.85
AUC 95% Confidence Interval Lower | 0.84 | 0.88 | 0.83
AUC 95% Confidence Interval Upper | 0.89 | 0.92 | 0.87
At 90% Sensitivity, specificity equals | 0.63 | 0.71 | 0.58
At 80% Sensitivity, specificity equals | 0.77 | 0.84 | 0.72
At 70% Sensitivity, specificity equals | 0.85 | 0.88 | 0.82
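The accuracy statistics reported above all derive from a two-by-two table that crosses the screening decision (scale score below or above the cut point) with the criterion outcome. The sketch below shows the usual definitions, assuming a "positive" means a child flagged as at-risk by the screener. The function name and the counts are hypothetical and are not taken from the study sample.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return screening accuracy statistics from confusion-matrix counts.

    tp: flagged by the screener and at-risk on the criterion
    fp: flagged by the screener but not at-risk on the criterion
    fn: not flagged but at-risk on the criterion
    tn: not flagged and not at-risk on the criterion
    """
    total = tp + fp + fn + tn
    return {
        "base_rate": (tp + fn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
        "false_negative_rate": fn / (fn + tp),
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (tn + fn),
        "overall_classification_rate": (tp + tn) / total,
    }


# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=80, fp=150, fn=20, tn=750)
```

Note that sensitivity and the false negative rate sum to 1, as do specificity and the false positive rate, which matches the pattern in the reported figures.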

 

 

Additional Classification Accuracy

The following are provided for context and did not factor into the Classification Accuracy ratings.

 

Disaggregated Data

Criterion 1: Slosson Oral Reading Test-Revised, Grade Equivalent of 1.4
Criterion 2: Observation Survey of Early Literacy Achievement, Text Reading Level

Time of Year: Fall

Grade 1

Statistic | Criterion 1, African Americans | Criterion 1, English Language Learners | Criterion 2, African Americans | Criterion 2, English Language Learners
Cut points | 421 | 421 | 428 | 428
Base rate in the sample for children requiring intensive intervention | 0.20 | 0.20 | 0.34 | 0.34
Base rate in the sample for children considered at-risk, including those with the most intensive needs | n/a | n/a | n/a | n/a
False Positive Rate | 0.24 | 0.33 | 0.22 | 0.24
False Negative Rate | 0.18 | 0.14 | 0.28 | 0.20
Sensitivity | 0.82 | 0.85 | 0.72 | 0.80
Specificity | 0.76 | 0.67 | 0.78 | 0.76
Positive Predictive Power | 0.47 | 0.40 | 0.62 | 0.63
Negative Predictive Power | 0.94 | 0.95 | 0.84 | 0.88
Overall Classification Rate | 0.77 | 0.71 | 0.76 | 0.77
Area Under the Curve (AUC) | 0.88 | 0.88 | 0.85 | 0.85
AUC 95% Confidence Interval Lower | 0.85 | 0.85 | 0.82 | 0.82
AUC 95% Confidence Interval Upper | 0.91 | 0.91 | 0.88 | 0.88
At 90% Sensitivity, specificity equals | 0.64 | 0.63 | 0.63 | 0.60
At 80% Sensitivity, specificity equals | 0.75 | 0.73 | 0.72 | 0.75
At 70% Sensitivity, specificity equals | 0.85 | 0.80 | 0.78 | 0.80

 

 

Cross-Validation Sample

Criterion 1: Slosson Oral Reading Test-Revised, Grade Equivalent of 1.4
Criterion 2: Observation Survey of Early Literacy Achievement, Text Reading Level

Grade 1

Statistic | Criterion 1, Fall | Criterion 1, Winter | Criterion 2, Fall
Cut points | 421 | 498 | 428
Base rate in the sample for children requiring intensive intervention | 0.05 | 0.05 | 0.05
Base rate in the sample for children considered at-risk, including those with the most intensive needs | 0.20 | 0.20 | 0.20
False Positive Rate | 0.20 | 0.19 | 0.21
False Negative Rate | 0.30 | 0.20 | 0.28
Sensitivity | 0.70 | 0.80 | 0.72
Specificity | 0.80 | 0.81 | 0.79
Positive Predictive Power | 0.36 | 0.40 | 0.50
Negative Predictive Power | 0.94 | 0.96 | 0.91
Overall Classification Rate | 0.79 | 0.81 | 0.78
Area Under the Curve (AUC) | 0.84 | 0.90 | 0.85
AUC 95% Confidence Interval Lower | 0.82 | 0.88 | 0.83
AUC 95% Confidence Interval Upper | 0.87 | 0.91 | 0.86
At 90% Sensitivity, specificity equals | 0.60 | 0.69 | 0.59
At 80% Sensitivity, specificity equals | 0.72 | 0.81 | 0.73
At 70% Sensitivity, specificity equals | 0.80 | 0.91 | 0.80

 

Reliability

Grade 1
Rating: Full bubble
  1. Justification for each type of reliability reported, given the type and purpose of the tool:

Not provided

  2. Description of the sample(s), including size and characteristics, for each reliability analysis conducted:

Not provided

  3. Description of the analysis procedures for each reported type of reliability:

Not provided

  4. Reliability of performance level score (e.g., model-based, internal consistency, inter-rater reliability):

Type of Reliability | Age or Grade | n | Coefficient | Confidence Interval
Alpha | 1st | 7,926 | 0.87 | 0.87, 0.89
Split-half | 1st | 7,926 | 0.89 | 0.87, 0.89
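The source does not describe how the alpha coefficient was computed. As a rough illustration only, the sketch below applies the standard Cronbach's alpha formula, assuming each of the six tasks is treated as one "item" and each row holds one student's task scores. The function name and the tiny data set are hypothetical.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per student).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance of total scores)
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item (column) across students.
    item_variances = [pvariance([row[i] for row in scores]) for i in range(k)]
    # Variance of each student's summed score.
    total_variance = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical scores: three students, two perfectly consistent items.
example_alpha = cronbach_alpha([[1, 1], [2, 2], [3, 3]])
```

With perfectly consistent items, as in the toy data, alpha reaches its maximum of 1; real task scores would give a lower value.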

 

Disaggregated Reliability

The following disaggregated reliability data are provided for context and did not factor into the Reliability rating.

Type of Reliability | Subgroup | Age or Grade | n | Coefficient | Confidence Interval
Alpha | African American | 1st | 1,150 | 0.87 | 0.87, 0.89
Split-half | African American | 1st | 1,150 | 0.89 | 0.87, 0.89
Alpha | Hispanic | 1st | 859 | 0.88 | 0.87, 0.89
Split-half | Hispanic | 1st | 859 | 0.89 | 0.87, 0.89

 

Validity

Grade 1
Rating: Full bubble
  1. Description of each criterion measure used and explanation as to why each measure is appropriate, given the type and purpose of the tool:

Not provided

  2. Description of the sample(s), including size and characteristics, for each validity analysis conducted:

Not provided

  3. Description of the analysis procedures for each reported type of validity:

Not provided

  4. Validity for the performance level score (e.g., concurrent, predictive, evidence based on response processes, evidence based on internal structure, evidence based on relations to other variables, and/or evidence based on consequences of testing), and the criterion measures:

Type of Validity | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval
Predictive | 1st | Spring 2010 Slosson Oral Reading Test-Revised | 878 | 0.72 | 0.67, 0.77
Predictive | 1st | Mid-Year Slosson Oral Reading Test-Revised | 789 | 0.75 | 0.70, 0.80
Predictive | 1st | Spring 2010 Text Reading Level | 7,152 | 0.74 | 0.69, 0.79
Predictive | 1st | Mid-Year Scale Score | 7,147 | 0.83 | 0.78, 0.88
Construct | 1st | Fall 2009 Slosson Oral Reading Test-Revised | 820 | 0.78 | 0.73, 0.83

 

  5. Results for other forms of validity (e.g. factor analysis) not conducive to the table format:

CONTENT VALIDITY

The content of all tasks of the Observation Survey aligns with national reading standards and is well-documented in research. Developed for use in early-grade classrooms, Observation Survey tasks each “assess an area of literacy knowledge which provides an essential foundation for progress in reading and writing. The content of the tasks represents what is actually taught in the classroom” (Clay, 2002, 2006, p. 159). Denton et al. (2006) say that the wide implementation of the instrument suggests there is agreement about the content validity of the Observation Survey.

 

All of the tasks in the Observation Survey represent real-world tasks and literacy instructional expectations in early-grade classrooms. Support for content validity of each task is summarized below.

 

Letter Identification

Concept Assessed: Letter Knowledge

Content Validity: It is common practice in the early grades for teachers to find out how many letters a child knows and if the child can visually distinguish letters one from another. In this task, all lowercase and uppercase letters (plus print forms of ‘a’ and ‘g’) are assessed. Children can respond with the letter name, a sound the letter makes, or a word beginning with the letter. Because this is a closed knowledge set, the task is most valid during the time a child is acquiring letter knowledge. The content represents what is actually taught in classrooms.

 

Ohio Word Test

Concept Assessed: Word Knowledge (Reading Vocabulary)

Content Validity: Word knowledge correlates strongly with text reading performance, and teachers of early readers seek ways to document the number of words a child knows and how this developing knowledge changes across time. The content of this task represents typical classroom expectations.

 

This task is structured to sample words that the children have had some opportunities to learn: words that occur frequently in their texts in school. Word lists (lists of 20 words) are drawn from the Dolch list of high-frequency words. Three equivalent lists are available. Because these lists include very high-frequency words, the task is most valid over the period when a child is acquiring an initial reading vocabulary.

 

In addition to information about a child’s knowledge of words in isolation and how this knowledge develops over time, the teacher can learn how a child works with words through attempts and self-corrections.

 

Concepts About Print

Concept Addressed: Print Knowledge

(what children know about the way spoken language is represented in print)

Content Validity: Conventions used for printed language must be learned so the child can attend to the essential visual information on the page. The task assesses a child’s current knowledge of 24 print concepts. Content validity is supported through the use of a specially designed book that allows the child to demonstrate knowledge of print concepts in an authentic setting.

 

Writing Vocabulary

Concept Addressed: Writing Vocabulary

(a child’s personal resource of known words)

Content Validity: In this task, the child writes words he knows for a period of 10 minutes. The teacher may prompt the child in various ways to think of other known words. The score represents the number of words written independently and also serves as a screen on the child’s visual attention to print, sound sequence, motor control, and useful approximations. This task represents classroom instructional expectations.

 

Hearing and Recording Sounds in Words

Concept Addressed: Phonemic Awareness and Letter/Sound Relationships

Content Validity: The importance of phonemic awareness and representing phonemes with letters or clusters of letters is well documented in the research literature. Teachers can use the score on this dictation task (with a sampling of 37 phonemes) as an indicator of a child’s developing knowledge in this area. This task measures classroom instructional practices.

 

The teacher dictates a passage (five forms available) and asks the child to say the words slowly and write letters to represent the sounds. Each phoneme recorded in a way that is acceptable in English is counted.

 

Because this is a closed knowledge set, the task is most valid over the period when a child is acquiring this knowledge and before the words in the passage become part of a child’s known writing vocabulary.

 

Running Records of Text Reading

Concept Addressed: Instructional Reading Level for Reading Continuous Text

(also child’s behaviors while reading real books)

Content Validity: The score on this task represents the highest text level (from texts representing a gradient of difficulty) that a child reads at 90% accuracy or higher. The task is an authentic assessment of the reading of continuous text. The testing packet used for Reading Recovery in the United States has been shown to be a stable measure of reading performance that represents escalating gradients of difficulty.
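The scoring rule described here (highest text level read at 90% accuracy or higher) can be sketched as follows. The sketch assumes accuracy is computed as running words minus errors, divided by running words, which is the usual running-record convention but is not spelled out in the source. The function name, the tuple layout, and the sample records are hypothetical.

```python
def highest_instructional_level(records, threshold=0.90):
    """Return the highest text level read at or above the accuracy threshold.

    records: iterable of (level, running_words, errors) tuples.
    Returns None when no level meets the threshold.
    """
    best = None
    for level, running_words, errors in records:
        if running_words <= 0:
            continue  # skip empty or invalid records
        accuracy = (running_words - errors) / running_words
        if accuracy >= threshold and (best is None or level > best):
            best = level
    return best


# Hypothetical running records: (text level, running words, errors).
sample_records = [(3, 50, 2), (4, 60, 5), (5, 70, 9)]
text_reading_level = highest_instructional_level(sample_records)
```

In the sample, levels 3 and 4 are read at 96% and about 92% accuracy, while level 5 drops to about 87%, so the reported level is 4.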

 

Evidence from research and classroom practice confirms that text difficulty relates to a reader’s developing competencies. For learning to occur, the difficulty level of reading materials should present challenges from which the child can learn—texts that are not too hard or too easy.

 

As the child reads the texts, the teacher uses established conventions to record behaviors in order to analyze the child’s reading behaviors.

 

 

  6. Describe the degree to which the provided data support the validity of the tool:

Not provided

 

 

Disaggregated Validity

The following disaggregated validity data are provided for context and did not factor into the Validity rating.

Type of Validity | Subgroup | Age or Grade | Test or Criterion | n | Coefficient | Confidence Interval
None

Results for other forms of disaggregated validity (e.g. factor analysis) not conducive to the table format

Not provided

Sample Representativeness

Grade 1
Rating: Full bubble

Primary Classification Accuracy Sample

Criterion 1: Slosson Oral Reading Test-Revised, Grade Equivalent of 1.4

Representation: 39 states
Date: 2016-17
Size: 2,207
Male: 49.5%
Female: 50.1%
Unknown: 0.4%
Free or reduced-price lunch: 31%
White, Non-Hispanic: 64.3%
Black, Non-Hispanic: 12.8%
Hispanic: 6.3%
American Indian/Alaska Native: 0.7%
Asian/Pacific Islander: 4.6%
Other: 11.3%
Unknown: 0%
Disability classification: Speech and language impairment, 3%; Other health impairment, 0.6%; Developmental delay, 0.5%; Specific learning disability, 0.4%; All other categories, 1.2%
First language: English, 86%; Spanish, 7.8%; Some other language, 5.1%; Unknown, 1.1%
Language proficiency status: 13.3% are English Language Learners

 

Cross Validation Sample

Criterion 1: Slosson Oral Reading Test-Revised, Grade Equivalent of 1.4

Time of Year: Fall

Representation: 40 states
Date: 2015-2016
Size: 2,361
Male: 46.9%
Female: 52.8%
Unknown: 0.3%
Free or reduced-price lunch: 35.1%
White, Non-Hispanic: 64.3%
Black, Non-Hispanic: 13.4%
Hispanic: 5.1%
American Indian/Alaska Native: 0.8%
Asian/Pacific Islander: 3.4%
Other: 13%
Unknown: 0%
Disability classification: Speech and language impairment, 2.8%; Other health impairment, 0.5%; Developmental delay, 0.6%; Specific learning disability, 0.3%; All other categories, 1.1%
First language: English, 86.3%; Spanish, 9.0%; Some other language, 4.1%; Unknown, 0.6%
Language proficiency status: 13.7% are English Language Learners

 

Bias Analysis Conducted

Grade 1
Rating: No
  1. Description of the method used to determine the presence or absence of bias:

Not provided

  2. Description of the subgroups for which bias analyses were conducted:

Not provided

  3. Description of the results of the bias analyses conducted, including data and interpretative statements:

Not provided

Administration Format

Grade 1: Individual

Administration & Scoring Time

Grade 1: 15-45 minutes

Scoring Format

Grade 1: Manual

Types of Decision Rules

Grade 1: None

Evidence Available for Multiple Decision Rules

Grade 1: No