Data SGP is a research project dedicated to analyzing and publishing multi-proxy sedimentary geochemical (iron, carbon, sulfur) data from the Neoproterozoic through Paleozoic eras. This work requires assembling or generating large data sets and using sophisticated statistical analysis to place the data in a meaningful context.
The resulting analyses and publications will provide valuable information for scientists, educators, and students studying shale geochemistry and the geologic record. We hope that the dissemination of this data will lead to improved understanding and new discoveries in these areas of research.
While the term “big data” is often used to describe data sets that are too large for traditional data management applications, the data at the center of the Data SGP project is quite modest by comparison. An analysis of global Facebook interactions, for example, could easily generate much larger volumes of data. Consequently, we think the Data SGP project is more properly described as a ‘medium data’ application.
Using a variety of statistical methods, SGP calculates a percentile measure that places a student’s current test score performance in the context of other students with similar prior test scores. The results of these calculations are displayed on the growth graphs in Star and on the individual student profile/growth dashboard.
Consider Simon, a sixth grader who achieved a scale score of 370 on this year’s statewide assessment in English language arts (ELA). The comparison groups used to establish his growth percentile take into account up to five years of previous test scores. Specifically, Simon’s growth percentile is calculated by comparing his current assessment performance to that of students who scored similarly on previous assessments from fifth grade through sixth grade.
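The idea of ranking a student against academic peers can be illustrated with a deliberately simplified sketch. It is not the SGP methodology itself, which relies on more sophisticated statistical modeling across several prior years; it only ranks a current score among students whose single prior score falls within a small band. The function name, the band width, Simon's prior score of 350, and the cohort scores are all made up for illustration; only the current score of 370 comes from the example above.

```python
def simple_growth_percentile(student_prior, student_current, cohort, band=10):
    """Percentile rank of a student's current score among academic peers.

    cohort: list of (prior_score, current_score) tuples for other students.
    band: hypothetical half-width that defines "similar" prior scores.
    """
    # Academic peers: students whose prior score is within `band` points.
    peers = [cur for prior, cur in cohort if abs(prior - student_prior) <= band]
    if not peers:
        return None  # no comparison group, so no percentile can be reported
    # Share of peers scoring strictly below the student, as a percentage.
    below = sum(1 for cur in peers if cur < student_current)
    return round(100 * below / len(peers))


# Made-up cohort of (prior score, current score) pairs.
cohort = [(350, 360), (355, 365), (348, 372), (352, 380), (400, 420), (300, 310)]
simon_sgp = simple_growth_percentile(student_prior=350, student_current=370, cohort=cohort)
print(simon_sgp)
```

In this toy cohort, four students have prior scores close to Simon's, and two of them scored below 370 this year, so the sketch places him at the 50th percentile of his peer group. The two students with very different prior scores are excluded from the comparison entirely.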
For this reason, it is possible for two students to have the same SGP even if they had different MCAS scale score histories. In order to have the same SGP, both students must be in a growth comparison group with the same academic peer group at each of the different MCAS testing administrations.
Even so, it is very common for a student’s score to move up and down in successive administrations of the state assessment. Movement in a single administration is often attributed to temporary factors such as illness, weather conditions, or other issues that can affect student performance. This is why it is important to view growth over time and to consider a student’s long-term progress when evaluating academic achievement.
To run SGP analyses, one needs access to the sgpData data set and the R software environment. R is available for Windows, macOS, and Linux and, as an open-source project, can be downloaded free of charge. Although running SGP analyses is relatively straightforward, we recommend that new users spend some time getting familiar with the software and the general concepts behind the calculations before diving in. In almost all cases, errors that occur when running SGP analyses result from improper data preparation, and these errors are usually fairly easy to correct.
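Because most errors trace back to data preparation, a quick check of the input records before running an analysis can save time. The following is a minimal sketch in Python rather than R, and it is not part of the SGP software. The field names (`student_id`, `year`, `grade`, `scale_score`) and the sample records are hypothetical and should be adapted to whatever layout your data actually uses.

```python
def find_data_problems(records):
    """Return a list of human-readable problems found in score records.

    Each record is a dict; the keys used here are illustrative only.
    """
    required = ("student_id", "year", "grade", "scale_score")
    problems = []
    seen = set()  # (student_id, year) pairs already encountered
    for i, rec in enumerate(records):
        # Flag rows where a required field is absent or empty.
        missing = [k for k in required if rec.get(k) is None]
        if missing:
            problems.append(f"row {i}: missing {', '.join(missing)}")
            continue
        # Scores must be numeric to be ranked or modeled.
        if not isinstance(rec["scale_score"], (int, float)):
            problems.append(f"row {i}: non-numeric scale_score")
        # A student should have at most one score per year.
        key = (rec["student_id"], rec["year"])
        if key in seen:
            problems.append(f"row {i}: duplicate student/year")
        seen.add(key)
    return problems


# Made-up records containing three typical preparation mistakes.
records = [
    {"student_id": "A1", "year": 2023, "grade": 5, "scale_score": 350},
    {"student_id": "A1", "year": 2024, "grade": 6, "scale_score": 370},
    {"student_id": "A1", "year": 2024, "grade": 6, "scale_score": 371},
    {"student_id": "B2", "year": 2024, "grade": 6, "scale_score": None},
    {"student_id": "C3", "year": 2024, "grade": 6, "scale_score": "n/a"},
]
problems = find_data_problems(records)
for p in problems:
    print(p)
```

Run on the sample records, the check reports a duplicate student/year entry, a missing score, and a score stored as text, which are the kinds of issues that are easy to fix once they have been located.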