Empirical Article - Assessments
Reference:
Burns, M. K., & Senesac, B. V. (2005). Comparison of dual discrepancy criteria to assess response to intervention.
Journal of School Psychology, 43, 393-406.
Purpose:
This study extends
previous research on dual discrepancy (level and growth) by comparing the validity and diagnostic implications of
four dual discrepancy (DD) models. Two main questions were addressed: Which approach
to DD best differentiates reading skills as measured by a norm-referenced test of reading achievement? What prevalence rates
are associated with the four DD models?
Population/Sample:
There were 146
students in this study from first (14.6%), second (37.7%), and third (47.7%) grades who were nominated by their teachers as
having reading difficulties. All of these students scored at or below the 25th percentile on a district-administered
group test of reading. There was an equal distribution of females (50%) and males (50%).
The sample included 27.8% Caucasian, 2.8% African American, and 2.6% Hispanic students, and 57.71% of the students were
eligible for the federal free or reduced lunch program.
Methods:
-Two
interventions were used in this study. Students either participated in a remedial program entitled Help One Student to Succeed
(HOSTS) or received Title I support. HOSTS is a comprehensive literacy program in which trained volunteer tutors supplement
classroom reading instruction. Individual reading and computer-based assessments are conducted with children having reading
difficulties, and these data are used to design personalized interventions for each student, including daily and weekly lesson
plans and four thirty-minute tutoring sessions each week. Tutoring was available Monday through Friday, and students may
have had different tutors on different days. The HOSTS teacher is present during tutoring sessions to monitor the instruction
and provide feedback to the tutors.
- Title I support was selected as the comparison condition even though the intervention it provides can vary. It included
weekly individual reading instruction from the Title I consultant or small-group instruction in class or in a pull-out model.
Five of the nine schools used the HOSTS program and four used Title I services. The five HOSTS schools were randomly selected
from all schools using the program, and the comparison schools were matched on student enrollment, percentage of students
receiving free or reduced lunch, average student-to-teacher ratio, and percentage of children who passed the fourth-grade tests.
- Data from the
Dynamic Indicators of Basic Early Literacy Skills (DIBELS) test were collected mid-year and end-of-year. DIBELS is an assessment
developed from the curriculum-based measurement model to assess early literacy development (phonological awareness) for kindergarten
and first graders.
-The Gray Oral
Reading Test (GORT-4) was used to measure level of reading proficiency.
-Students were
seen individually in January to take the tests, and again in May of the same school year.
- The DIBELS
median fluency rate from the end-of-year assessment served as the post-intervention reading level score, and the growth score
was computed by subtracting the mid-year score from the end-of-year score. Growth scores were then rank-ordered within each
of the three grades to create responsiveness groups using normative criteria; mean growth was also computed so that
responsiveness could be judged against one standard deviation below the mean. Students whose post-intervention (end-of-year)
DIBELS score fell within the “at-risk” criterion and whose fluency change score
fell at or below the non-responsiveness criterion were considered dually discrepant.
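The classification steps above can be sketched in code. This is only an illustrative sketch: the score lists, the at-risk cutoff of 40 words per minute, and the simple rank-based percentile method are hypothetical assumptions for demonstration, not the study's actual data or exact statistical procedures.

```python
# Hypothetical sketch of the dual discrepancy (DD) classification described above.
# All values and cutoffs below are illustrative, not the study's data.

def growth_scores(mid_year, end_year):
    """Growth = end-of-year fluency minus mid-year fluency."""
    return [end - mid for mid, end in zip(mid_year, end_year)]

def percentile_cutoff(values, pct):
    """Value at the given percentile using a simple rank-based method
    (an assumption; the study's exact method may differ)."""
    ordered = sorted(values)
    index = max(0, int(round(pct / 100 * len(ordered))) - 1)
    return ordered[index]

def dually_discrepant(end_year, growth, at_risk_cutoff, growth_cutoff):
    """A student is DD if end-of-year level is at or below the at-risk
    criterion AND growth is at or below the non-responsiveness criterion."""
    return [level <= at_risk_cutoff and g <= growth_cutoff
            for level, g in zip(end_year, growth)]

# Illustrative data: DIBELS median words-correct-per-minute for one grade.
mid = [20, 35, 50, 28, 60, 15]
end = [30, 55, 80, 33, 95, 22]
growth = growth_scores(mid, end)        # [10, 20, 30, 5, 35, 7]
g_cut = percentile_cutoff(growth, 25)   # 25th-percentile growth criterion
flags = dually_discrepant(end, growth, at_risk_cutoff=40, growth_cutoff=g_cut)
```

With these made-up numbers, only students whose end-of-year level is at risk *and* whose growth falls in the bottom quarter are flagged, which is the "dual" part of the criterion: a low level alone, or slow growth alone, is not enough.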
Findings:
-All three percentile-rank
models significantly differentiated the reading scores of DD students from those of other students, but the one-standard-deviation
approach did not. Reading scores were compared across the two groups of student response.
A chi-square statistic was computed for DD and non-DD students; the 50th-percentile criterion led to a significant
effect for grade, but this may have been due to an error.
-Prevalence rates in the sample were 23.3%
for the 25th-percentile criterion, 28.1% for the 33rd percentile, 41.8% for the 50th percentile, and
12.3% for the one-standard-deviation criterion. Of the 74 students who received the intervention, eight were DD under the
25th-percentile criterion, 11 under the 33rd percentile, 22 under the 50th percentile, and three under the one-standard-deviation
criterion. The estimated prevalence rate for children in 1st, 2nd, and 3rd grades was
3.7% for the 25th percentile, 4.4% for the 33rd percentile, 6.6% for the 50th percentile, and
1.9% for one standard deviation below the mean.
Implications for Education:
This study showed that dual discrepancy (DD) was more effective with the 25th- and 33rd-percentile
rank criteria than with the 50th-percentile and one-standard-deviation criteria. According to the author,
the consistency of the prevalence rates suggests that the DD method is reliable. The author states that these data
should be compared with national prevalence data for learning disabilities to thoroughly understand the utility of this model.
He also states that further research is needed to identify best practices for diagnosing learning disabilities within
a response-to-intervention model.
I think the results may have been affected
by the fact that the Title I program allowed for group instruction, whereas the HOSTS program did not. The
results may also have been less reliable because, in the HOSTS program, students may have received a different
tutor (with a different teaching method) each time they went to tutoring. In addition, I would want to know how
the results would differ if, for example, an Informal Reading Inventory were administered as well.
Furthermore, I believe that no one or two tests can determine everything a student knows or does not
know. Many factors can affect test scores; for example, an intelligent
child may be having a bad day or may not be feeling well. That is why teachers should administer ongoing
assessments, which yield more reliable results.
Although the author stated that his study was reliable, I am not sure that it was. He noted
that there may have been an error in the scores, and he stated that further research was needed
to refine DD and that additional comparisons of definitions of non-responsiveness based on growth data were needed. More
research within the study itself would have been needed to truly establish its reliability.