Educational Data Systems
 
 
EDS Publications

The following are original EDS publications based on work by our research team:

Objectivity & Multidimensionality: An Alternating Least Squares Algorithm For Imposing Rasch-Like Standards Of Objectivity On Highly Multidimensional Datasets (PDF, 937 KB)

by Mark Moulton, Ph.D.

Abstract: To an increasing degree, psychometric applications (e.g., predicting music preferences) are characterized by highly multidimensional, incomplete datasets. While the data mining and machine learning fields offer effective algorithms for such data, few specify Rasch-like conditions of objectivity. On the other hand, while Rasch models specify conditions of objectivity—made necessary by the imperative of fairness in educational testing—they do not decisively extend those conditions to multidimensional spaces. This paper asks the following questions: What must a multidimensional psychometric model do in order to be classified as "objective" in Rasch’s sense? What algorithm can meet these requirements? The paper describes a form of "alternating least squares" matrix decomposition (NOUS) that meets these requirements to a large degree. It shows that when certain well-defined empirical criteria are met, such as fit to the model, ability to predict "pseudo-missing" cells, and structural invariance, NOUS person and item parameters and their associated predictions can be assumed to be invariant and sample-free with all the benefits this implies. The paper also describes those conditions under which the model can be expected to fail. Demonstrations of NOUS mathematical properties are performed using an open-source implementation of NOUS called Damon.
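The alternating least squares idea at the heart of the paper can be pictured with a generic matrix-completion toy (this is not the NOUS algorithm itself; the function names, rank-2 setting, and synthetic data are illustrative assumptions):

```python
import numpy as np

def als_complete(X, rank=2, n_iter=50, seed=0):
    """Alternating least squares on a matrix with missing (NaN) cells.

    Alternately solves for row factors R and column factors C so that
    X is approximated by R @ C.T over the observed cells only; the same
    product then predicts the missing cells.
    """
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(X)
    n, m = X.shape
    R = rng.normal(size=(n, rank))
    C = rng.normal(size=(m, rank))
    for _ in range(n_iter):
        # Fix C; solve a small least-squares problem for each row of R.
        for i in range(n):
            obs = mask[i]
            if obs.any():
                R[i], *_ = np.linalg.lstsq(C[obs], X[i, obs], rcond=None)
        # Fix R; solve for each column factor in C.
        for j in range(m):
            obs = mask[:, j]
            if obs.any():
                C[j], *_ = np.linalg.lstsq(R[obs], X[obs, j], rcond=None)
    return R @ C.T

# Synthetic rank-2 data with one "pseudo-missing" cell held out.
rng = np.random.default_rng(1)
true = rng.normal(size=(20, 2)) @ rng.normal(size=(2, 15))
X = true.copy()
X[3, 7] = np.nan                 # hide one cell
est = als_complete(X, rank=2)
print(abs(est[3, 7] - true[3, 7]))   # prediction error for the hidden cell
```

On exact low-rank data the held-out cell is recovered almost perfectly; real response data add noise and misfit, which is where the paper's empirical criteria (fit, pseudo-missing prediction, structural invariance) come in.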


EdScale: How to Measure Growth using Formative Exams Without Common Persons or Items (PDF, 384 KB)

by Mark Moulton, Ph.D.

Abstract: A new method is proposed for equating multidimensional formative exams without common persons or items. A multidimensional IRT (MIRT) model called NOUS is used to link students who take different formative exams by exploiting scores received on a common test administered at some point in the recent past, such as the California Standards Test. The "past test" vector is projected into the multidimensional subspace of the two formative exams, and students located in the same subspace are projected onto the common “past test” vector, allowing apples-to-apples comparisons between students on a common, well-understood metric. Growth measurement is handled by applying a timeseries function to expected growth rates computed from previous years. The methodology is presented in connection with a scaling product developed in California, called EdScale, which is used to measure student growth on benchmark exams developed or purchased independently by school districts.
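The projection step described above can be pictured with a small geometric sketch (purely illustrative; the coordinates and the "past test" vector are made up, and the actual linkage is performed by the NOUS model):

```python
import numpy as np

# Hypothetical student coordinates in the 2-D subspace spanned by two
# formative exams.
students_A = np.array([[1.2, 0.3], [0.5, 1.0], [-0.2, 0.8]])  # took exam A
students_B = np.array([[0.9, 0.6], [0.1, -0.4]])              # took exam B
past_test = np.array([0.8, 0.6])   # common "past test" vector, same subspace

u = past_test / np.linalg.norm(past_test)  # unit direction of the past test
scores_A = students_A @ u                  # scalar projections onto it
scores_B = students_B @ u
# scores_A and scores_B now lie on one common, well-understood metric,
# allowing apples-to-apples comparison across different formative exams.
print(scores_A, scores_B)
```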


The California Reading First Year 7 Evaluation Report (PDF, 1.2 MB)

by Diane Haager, Ph.D.; Renuka Dhar; Mark Moulton, Ph.D.; Susan McMillan, Ph.D.

Abstract: The last in a seven-year series of evaluation reports of the California Reading First program (2003-09), this report generally replicates the findings of previous reports, with particular focus on schools that have been in the program since the second year of Reading First funding (Cohort 2). It finds that for Cohort 2 schools growth has been significant, that higher-implementing schools post larger gains than lower-implementing schools, and that Reading First schools generally outperform a statistical control group, though the differences are not significant in all cases. Reading First schools have posted higher growth rates than non-Reading First schools and have proven to be particularly effective in helping low-performing students and English Learners move to higher proficiency levels. These effects, while smaller in Cohort 2 schools than for the Reading First population as a whole, are in substantial agreement with a meta-analysis of California Reading First effect sizes presented in the Year 6 Evaluation Report, which found that, overall, the Reading First effect is educationally meaningful and has a high degree of statistical significance. The report also finds that program implementation declined in 2008-09, that principal participation and teacher perceptions are the strongest predictors of success, that the program is responsible for building capacity at state and local levels, and that it has created a sustainable, comprehensive structure for reading/language arts instruction. The report concludes with lessons learned over the seven-year course of the evaluation.

In addition to the primary report, the Appendices [5.5 MB] are also available.


The California Reading First Year 6 Evaluation Report (PDF, 1.1 MB)

by Diane Haager, Ph.D.; Craig Heimbichner; Renuka Dhar; Mark Moulton, Ph.D.; Susan McMillan, Ph.D.

Abstract: Replicating the results of previous Reading First Evaluation Reports, the Year 6 Report documents the achievement and implementation of more than 800 schools that have received Reading First funding since 2003. Achievement is measured using STAR scores. Implementation is measured with data from surveys administered to Reading First teachers, coaches, and principals, using a 3-Facet Rasch Model. The report finds that Reading First schools show significantly higher gains (p < 0.05) than a "statistical control group," and that High Implementation Reading First schools show significantly higher gains than Low Implementation Reading First schools. This is true even for achievement in grade 5, despite the fact that Reading First is administered only in grades K-3. The Year 6 Report includes a meta-analysis of the Reading First effect to date, which effectively removes any doubt regarding the overall efficacy of the program. It also includes quantitative and qualitative information regarding the efficacy of individual program elements and the usefulness of the interim assessment system, and devotes chapters to the effect of Reading First on English Learners and Special Education classrooms.

In addition to the primary report, the Appendices [1 MB] are also available.


Reading First Supplemental Survey Report, March 2008 (PDF, 225 KB)

by Craig Heimbichner; Susan McMillan, Ph.D.; Mark H. Moulton, Ph.D.; Renuka Dhar

Abstract: Developed in response to 2007-08 Senate Bill (SB) 77, Provision 6, this report presents results of a survey given online to administrators of California school districts (Local Educational Agencies, or LEAs) that were eligible to participate in the federal Reading First program. The survey solicited information regarding why districts chose to participate or not in the Reading First program, its perceived strengths and weaknesses, and for districts that chose not to participate in Reading First, the reading programs, coaching, and professional development that these districts offer in grades K-3 instead. The report compares the achievement of eligible participating and non-participating districts on the grade 2 and grade 3 English Language Arts California Standards Test over a five-year period. Subject to limitations in sample size, findings include: eligible non-Reading First LEAs were concerned about too many requirements when opting not to apply; approximately 50%-60% of eligible non-Reading First LEAs use the Houghton-Mifflin reading program, one of the two programs required in Reading First; teachers in eligible non-Reading First LEAs receive less professional development than those in Reading First LEAs; reading coaches are less available and under-utilized in eligible non-Reading First LEAs; perceptions of Reading First by participating LEAs are very positive; and Reading First schools and LEAs show stronger achievement growth than eligible non-participating schools and LEAs on the CSTs for English Language Arts. Both the survey and the achievement results support the finding that Reading First has been beneficial for participating LEAs, replicating results in The California Reading First Year 5 Evaluation Report.

In addition to the primary report, the Appendices [185 KB] are also available.


The California Reading First Year 5 Evaluation Report (PDF, 1 MB)

by Diane Haager, Ph.D.; Renuka Dhar; Mark Moulton, Ph.D.; Susan McMillan, Ph.D.

Abstract: Replicating the results of the Year 4 Report, the Year 5 Reading First Evaluation Report, commissioned by the California Department of Education in accordance with No Child Left Behind, documents the achievement and implementation of three cohorts of schools that have received Reading First funding since 2003. Achievement was measured primarily using STAR scores. Implementation was measured with data from surveys administered to Reading First teachers, coaches, and principals using a 4-Facet Rasch Model. The report finds that Reading First schools show significantly higher gains (p < 0.05) than a "statistical control group," and that High Implementation Reading First schools show significantly higher gains than Low Implementation Reading First schools. The report concludes that California's Reading First program is effective, and that its effectiveness is directly related to the degree to which it is implemented. Additional chapters document perceptions of the various reading program elements, the performance of English learners, and the performance of students in waivered classrooms. We find that Reading First is effective with the English learner population, and that non-waivered classrooms are more effective than waivered classrooms.

In addition to the primary report, the Appendices [7.5 MB] are also available.


One Ruler, Many Tests: A Primer on Test Equating (PDF, 750 KB)

by Mark Moulton, Ph.D.

Abstract: Given the variety of languages, cultures, and curricular priorities across APEC countries, it would seem difficult to unite around a common set of teaching and learning standards for purposes of international communication. Yet the nascent field of “psychometrics” offers practical solutions to such problems by its ability to “equate” tests that differ in difficulty and even content, and by its ability to set standards that have the same meaning across tests and countries. After summarizing the principles behind classical and modern educational measurement, the paper discusses several technologies that can make it possible for APEC countries to jump the language barrier without sacrificing local imperatives. These technologies include: local, national and international item banks, computer adaptive testing, and the Lexile framework.


NOUS 2003 Demo Spreadsheet (Excel, 1.5 MB)

by Mark Moulton, Ph.D.

Abstract: At EDS we make use of three powerful psychometric tools: WinSteps (Rasch Model), Facets (the Many Facets Rasch Model), and NOUS (Non-Unidimensional Scaling). The first two are well-known but NOUS, developed by Mark Moulton and Howard Silsdorf, is the secret tool behind much of EDS's psychometric work, especially in the area of equating benchmark exams. It combines important principles of the Rasch Model with concepts drawn from multidimensional matrix methods such as Singular Value Decomposition and Alternating Least Squares to yield a tool that is able to analyze dimensionally complex data sets and solve equating problems that are widely considered to be intractable. For all of its apparent complexity, the algorithm is actually quite simple and is presented here in complete detail with Excel formulas and helpful annotations.

This spreadsheet requires Microsoft Excel 2007.


The California Reading First Year 4 Evaluation Report (PDF, 740 KB)

by Diane Haager, Ph.D.; Renuka Dhar; Mark Moulton, Ph.D.; Susan McMillan, Ph.D.

Abstract: Replicating the results of the Year 3 Report, the Year 4 Reading First Evaluation Report, commissioned by the California Department of Education in accordance with No Child Left Behind, documents the achievement and implementation of three cohorts of schools that have received Reading First funding since 2003. Achievement was measured primarily using STAR scores. Implementation was measured with data from surveys administered to Reading First teachers, coaches, and principals using a 4-Facet Rasch Model. The report finds that Reading First schools show significantly higher gains (p < 0.05) than a "statistical control group," and that High Implementation Reading First schools show significantly higher gains than Low Implementation Reading First schools. The report concludes that California's Reading First program is effective, and that its effectiveness is directly related to the degree to which it is implemented.

In addition to the primary report, the Appendices [740 KB] are also available.


Multidimensional Equating (PDF, 275 KB)
Linking Multidimensional Test Forms by Constructing an Objective N-Space

by Mark Moulton, Ph.D.; Howard A. Silsdorf

Abstract: Form equating methods have proceeded under the assumption that test forms should be unidimensional, both across forms and within each form. This assumption is necessary when the data are fit to a unidimensional model, such as Rasch. When the assumption is violated, variations in the dimensional mix of the items on each test form, as well as in the mix of skills in the student population, can lead to problematic testing anomalies. The assumption ceases to be necessary, however, when data are fit to an appropriate multidimensional model. In such a scenario, it becomes possible to reproduce the same composite dimension rigorously across multiple test forms, even when the relative mix of dimensions embodied in the items on each form varies substantially. This paper applies one such multidimensional model, NOUS, to a simulated multidimensional dataset and shows how it avoids the pitfalls that can arise when fitting the same data to a single dimension. Some implications of equating multidimensional forms are discussed.


The California Reading First Year 3 Evaluation Report (PDF, 1 MB)

by Diane Haager, Ph.D.; Renuka Dhar; Mark Moulton, Ph.D.; Seema Varma, Ph.D.

Abstract: The Year 3 Reading First Evaluation Report, commissioned by the California Department of Education in accordance with No Child Left Behind, documents the achievement and implementation of three cohorts of schools that have received Reading First funding since 2003. Achievement was measured primarily using STAR scores. Implementation was measured with data from surveys administered to Reading First teachers, coaches, and principals, analyzed using a 4-Facet Rasch Model. The report finds that Reading First schools show somewhat higher gains than comparable non-Reading First schools, and that High Implementation Reading First schools show significantly higher gains than Low Implementation Reading First schools and comparable non-Reading First schools. The report concludes that California's Reading First program is effective, and that its effectiveness is directly related to the degree to which it is implemented.

In addition to the primary report, the Appendices [650 KB] are also available.


One Use of a Non-Unidimensional Scaling (NOUS) Model (PDF)
Transferring Information Across Dimensions and Subscales

by Mark H. Moulton, Ph.D.

Abstract: Test administrators sometimes ask for student performance on test subscales having few items, rendering them unreliable and hard to equate. Worse, subscales sometimes embody orthogonally distinct secondary dimensions as well. Traditional Rasch analysis offers reasonable solutions in some cases, but not all, and is not a general solution. This paper proposes a general solution using a Rasch-derived non-unidimensional scaling measurement model, called NOUS, which transfers information across items, subscales, and dimensions. Drawing examples from a recent state exam, it shows that NOUS yields measures for short subscales that are comparable to unidimensional measures computed using long forms of the same subscale. It concludes by discussing applications for multidimensional equating, student-level diagnostics, and measurement of performance on open-ended items.


Weighting and Calibration (PDF)
Merging Rasch Reading and Math Subscale Measures into a Composite Measure

by Mark H. Moulton, Ph.D.

Initially presented at the American Educational Research Association (AERA) 2004 Annual Meeting.

Abstract: While the emergence of Rasch and related IRT methodologies has made it routine to update tests across administrations without altering the original Pass/Fail standard, their insistence on unidimensionality raises a problem when the standard combines performance on multiple dimensions, such as mathematics and language. How should one combine a student's mathematics and language measures into a Pass/Fail decision on composite ability when the two scales embody different dimensions and logit units and lack common items? Using client-determined weights and student expected scores, we present a simple method, developed for a recent high-stakes certification exam, for combining unrelated subscales to produce composite logit measures without sacrificing the advantages of unidimensional IRT methodologies.
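As a loose illustration of combining client weights with expected scores (the difficulties, weights, student measures, and logit back-transform below are assumptions for illustration, not the paper's exact procedure):

```python
import math

def expected_score(theta, difficulties):
    """Expected proportion-correct for a Rasch person at ability theta."""
    return sum(1 / (1 + math.exp(-(theta - d)))
               for d in difficulties) / len(difficulties)

# Assumed subscale item difficulties (separate logit scales) and
# client-determined weights.
math_items = [-1.0, -0.2, 0.4, 1.1]
lang_items = [-0.8, 0.0, 0.3, 0.9, 1.5]
weights = {"math": 0.6, "lang": 0.4}

theta_math, theta_lang = 0.7, -0.1   # one student's subscale measures
composite_p = (weights["math"] * expected_score(theta_math, math_items)
               + weights["lang"] * expected_score(theta_lang, lang_items))
composite_logit = math.log(composite_p / (1 - composite_p))  # back to logits
print(round(composite_logit, 3))
```

The weighted expected score keeps each subscale in its own metric until the final step, where a single composite logit can be compared against a Pass/Fail cut.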


Rasch Demo Spreadsheet (Excel)

by Mark H. Moulton, Ph.D.

Abstract: This Excel file was developed to help students and practitioners of the Rasch Model get a simple and intuitive look at what goes on "under the hood" of most Rasch programs for dichotomous data. The Excel workbook shows all the matrices, formulas, and iterations needed to understand how person abilities, item difficulties, standard errors, expected values, and misfit statistics are computed. It also allows the user to perform simple experiments to see the effects of missing and misfitting data. You will find that, beneath the sophistication and apparent inscrutability of modern Rasch software, the model and its estimation algorithm are surprisingly easy to understand.

The workbook most closely emulates the UCON algorithm described by Wright and Stone in Best Test Design.

The spreadsheet is fully documented using Excel comments, which may be turned on and off as desired.
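For readers without Excel, the flavor of the iteration can be sketched in a few lines (a generic damped joint-maximum-likelihood routine in the spirit of UCON; the simulated data, damping, and iteration count are assumptions, not the workbook's exact formulas):

```python
import numpy as np

# Simulate dichotomous Rasch data from known parameters.
rng = np.random.default_rng(42)
true_b = np.linspace(-2, 2, 50)         # person abilities (logits)
true_d = np.linspace(-2, 2, 25)         # item difficulties (logits)
P_true = 1 / (1 + np.exp(-(true_b[:, None] - true_d[None, :])))
X = (rng.random(P_true.shape) < P_true).astype(int)

# Drop persons with extreme (all-0 or all-1) scores, which have no finite
# estimate; a full program would treat extreme items the same way.
keep = (0 < X.sum(axis=1)) & (X.sum(axis=1) < X.shape[1])
X, true_b = X[keep], true_b[keep]

b = np.zeros(X.shape[0])                # ability estimates
d = np.zeros(X.shape[1])                # difficulty estimates
for _ in range(100):
    P = 1 / (1 + np.exp(-(b[:, None] - d[None, :])))  # expected values
    info = P * (1 - P)                                # Fisher information
    # Damped Newton updates: residual score over information.
    b += np.clip((X - P).sum(axis=1) / info.sum(axis=1), -1, 1)
    d -= np.clip((X - P).sum(axis=0) / info.sum(axis=0), -1, 1)
    d -= d.mean()                                     # anchor the origin
```

The same ingredients appear in the workbook: expected values, information, misfit residuals, and iteration until person and item estimates settle.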


Preliminary Item Statistics Using Point-Biserial Correlation and P-Values (PDF)

by Seema Varma, Ph.D.

Abstract: This document demonstrates the usefulness of the point-biserial correlation for item analysis. Step-by-step computation of the point-biserial correlation is shown in an Excel demo sheet, and the SPSS syntax for computing the correlation is also provided. A dummy dataset is used to help the reader interpret the statistic.
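The step-by-step computation can be mirrored outside Excel as well (a minimal sketch; the dummy data and the corrected-total criterion are assumptions):

```python
import numpy as np

def point_biserial(item, criterion):
    """Point-biserial: Pearson correlation of a 0/1 item score with a
    continuous criterion (here, the total score excluding the item)."""
    p = item.mean()                    # proportion answering correctly
    m1 = criterion[item == 1].mean()   # criterion mean, correct group
    m0 = criterion[item == 0].mean()   # criterion mean, incorrect group
    return (m1 - m0) / criterion.std() * np.sqrt(p * (1 - p))

# Dummy 0/1 response matrix: rows are examinees, columns are items.
rng = np.random.default_rng(0)
X = (rng.random((200, 10)) < np.linspace(0.3, 0.8, 10)).astype(int)
for j in range(X.shape[1]):
    corrected = X.sum(axis=1) - X[:, j]   # total score excluding item j
    print(f"item {j}: p = {X[:, j].mean():.2f}, "
          f"r_pb = {point_biserial(X[:, j], corrected):.3f}")
```

The mean-difference formula above is algebraically identical to the ordinary Pearson correlation between the dichotomous item and the criterion.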


Point Biserials Demo Spreadsheet (Excel)

by Seema Varma, Ph.D.

Abstract: This Excel file accompanies the above document and demonstrates how point-biserials can be computed with Microsoft Excel.

The spreadsheet is fully documented using Excel comments, which may be turned on and off as desired.


Redistribution policy:

These publications may be quoted or distributed without prior consent of the authors, subject to the condition that the authors be correctly and fully acknowledged and that quotations be kept in context.


