What stuck out to me most about the reading on Big Data by Boyd and Crawford was the assertion that big data is not just a matter of ones and zeroes, but is subjective: it takes interpretation to shape and organize it into logical sets. And once it is produced, it requires more interpretation to analyze it, given its context. This makes me wonder: what about big data (or data in general) is actually objective? When it is in its raw form, can we then say that the data is objective? I would assume that a researcher has to collect the raw data. To begin to make data sets, a researcher has to know what raw data to choose from. Doesn’t the researcher’s process of collecting raw data then require another layer of interpretation? Does this therefore imply subjectivity even when it comes to raw data? Then what about it is actually “raw” if it is a set of data that was interpreted and narrowed down (just like Big Data) into a “set of raw data to further group into sets of Big Data”? I think that the term “Big” automatically implies a categorization of a certain type of data that has been distinguished (as distinctly different from other types of data), interpreted, and narrowed down from other types of data. This categorization implies that it was interpreted at some point, which implies its inherent subjectivity. Then again I ponder the question: what is actually objective about data? I’ve always assumed that data was objective, so there has to be something objective about it. Possibly, objectivity only exists in the initial record of the singular statistic, such as when a person runs a mile and their exact time is recorded. But once there is a mass of data and the sorting and interpretation begins, the subjectivity of data comes to light.
Maya, this focus on the selectivity of data is very important and I hope others read it. (It reminds me of a great book called “Raw Data is an Oxymoron.”) This post also complements Rei’s post this week on numbers. If the post claims that there “must” be something objective about the data, then I’d ask Mitchell’s question: What is it about the way that data is represented that persuades us that there is such a moment? In the running example, would we have to include the context and motivations for running to understand the data? Is the data knowable without that?