Data | Culture, Media, and Data

Adjusting a graph of carbon emissions

Matthew Gancayco, November 20, 2020 1 Comment

Data

Notes:

The first adjustment I wanted to make from the original was making the company in question more apparent. The original had all of the companies the same color, so they were hard to distinguish. I also wanted to add context to the carbon emission values. I discovered the total amount of carbon emissions for the time period. Although these companies are the largest contributors to carbon emissions, they are not the only ones to blame. Their totals are minuscule in comparison to the grand total of the world. The entire world is accountable, not just these companies.

Comments

Maya Stepansky says:

November 22, 2020 at 5:51 pm

Hi Matthew! I thought your modifications to the Pemex visualization made a lot of sense. I liked the way you made the visualization more accessible by distinguishing the company in question from the other companies by giving it a different color, as well as the way you took into account the total context by showing that most of the carbon emissions do not come from these companies. I also like the way that you practically made a political statement in the process of changing the visualization, because you emphasized the importance of accountability and accuracy in visualizations, and specifically when it comes to carbon emissions. It made me realize how easy it is for these visualizations to focus on one particular thing that results in a misrepresentation of the full picture. I think that last contribution had the biggest effect on this visualization because it moved the emphasis of this visualization from Pemex to a general focus on the outrageous amount of carbon emissions that are being released globally. It makes me wonder, if it isn’t these companies that are dominating carbon emissions globally, then what is? Possibly, it would make sense to change the title of the visualization from one that emphasizes Pemex to something else that focuses on the large amount of global carbon emissions being released in the world. If this change is made, it might not even make sense to highlight Pemex anymore; possibly it would make more sense to highlight the Global Total. I am also wondering, was there a specific intention behind changing the bar graph from horizontal to vertical?

Poll Watchers in Clayton, GA

Emily Yu, November 18, 2020 1 Comment

Data

Note: the first picture is the original data visualization (in this instance, I’m considering an Excel table to be a data table), which contains information about certified poll watchers for Clayton County, GA. The second is a spatial map plotting the location from which each certified poll watcher for Clayton County, GA originates.

Looking at the first data table, the way in which it is formatted makes it hard to draw any kind of correlation between the individual pieces of data. As a result, it appears that this data table and information is somehow neutral and doesn’t have the capability of being woven into a narrative. One of the most pressing questions that I wanted the data set to answer was whether there was any correlation between birth location and party affiliation. Thus, creating this second visualization was a way to spatialize these data points and see correlations that would not have been easy to see with the first visualization.

The few takeaways from this map are that there seem to be more Democratic poll watchers than for any other party and that these poll watchers seem to come from a wider range of locations. In contrast, the Republican party seems to have poll watchers concentrated in a few locations. Another insight is that those who were born in the county seemed to be chosen at a higher rate, leading into questions such as whether the selection process for poll watchers just inherently favors individuals born in the county or whether they are just more likely to be politically active, hence a greater sample size to choose poll watchers from.

In creating this second visualization, one of the largest takeaways is that transforming data visualizations into other formats can allow for further questioning and weaving of narratives than other types of data visualizations.

Comments

Jerome Desrosiers says:

November 18, 2020 at 3:23 pm

Hey Emily, like you said, I also think that tables don’t allow you to see or make connections between the rows/columns. The same thing happened to me when I was just looking at the straight data table of the visualization I made. It didn’t seem to be showing anything. What’s interesting and funny to me is that making a new visualization is simply doing the work for us by doing one more level of interpretation.

I think the chart you made has so many levels that are impossible to see when looking at the data table. Your map has the geographical aspects to it (+ the name of the locations when putting the cursor on a square). This makes me wonder if you had to find the latitude and longitude for each or was it already included in the data? There are also the many colored squares representing the political affiliation of the poll workers. There is just as much information in both the table and the map, yet the second visualization can be interpreted and understood much quicker.

Coronavirus Cases by Zipcode in Hawaii

Cynthia Vu, November 17, 2020 1 Comment

Data

Original Visualization:

New Visualization:

I was really curious to look at the distribution of coronavirus cases in Hawaii. I found a very simple map that displayed coronavirus cases broken down by zip codes. A lot of Hawaii is dominated by natural landscapes that have no residents. However, one thing I realized from staying in Hawaii is how non-uniform the population distribution is. This is true for plenty of states, and I was reminded of the different election map visualizations we viewed that tried to demonstrate the idea that “dirt doesn’t vote”.

My edited map still uses the same data about coronavirus cases, but I’ve included new data about the populations of the different zip codes and the relative ranks of those population sizes. This provides more context with which to analyze the distribution of coronavirus cases, as a viewer can compare the population size to the number of COVID cases in a particular zip code and calculate cases per capita. I’ve also highlighted a few of these population ranks, including zip code 96720, which contains the 11th largest population in Hawaii but only has between 6 – 30 coronavirus cases. The other zip codes with population ranks 1 – 10 all have 60+ coronavirus cases. By taking a closer look at this comparable data, new and potentially actionable information is revealed. Zip code 96720 represents Hilo, which is where I am currently staying.
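The per-capita comparison described above can be sketched in a few lines of Python. This is only an illustration: the case bins (6 – 30 and 1 – 5) come from the map, but the population figures below are hypothetical placeholders, not the real values for these zip codes. Because the map reports cases as ranges, the sketch returns a low and a high rate rather than a single number.

```python
from dataclasses import dataclass


@dataclass
class ZipCode:
    code: str
    population: int   # HYPOTHETICAL population, for illustration only
    cases_low: int    # lower bound of the reported case bin
    cases_high: int   # upper bound of the reported case bin


def cases_per_10k(z):
    """Return (low, high) cases per 10,000 residents for one zip code."""
    scale = 10_000 / z.population
    return (z.cases_low * scale, z.cases_high * scale)


# Case bins are the ones shown on the map; populations are made up.
zips = [
    ZipCode("96720", 45_000, 6, 30),
    ZipCode("96863", 2_000, 1, 5),
]

for z in zips:
    low, high = cases_per_10k(z)
    print(f"{z.code}: {low:.1f} to {high:.1f} cases per 10,000 residents")
```

With real populations substituted in, a small zip code with a handful of cases can turn out to have a higher rate than a large one with many more cases, which is the kind of context the raw counts hide.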

Even in my new visualization, which I think provides better context, there is still a lot of missing data that could clarify specific details about each zip code and give greater context to a reading of the coronavirus case distribution. For example, zip code 96863 is in last place for population size but still has 1 – 5 coronavirus cases. It actually corresponds to a Marine Corps Base, so its residents live in very close proximity and likely also have faster access to tests. Some of these zip codes reflect areas where millionaires like Bill Gates or Beyonce have bought out sections of land, while others reflect actual urban centers. Assumptions about where these activities might be occurring cannot be intuited from just the population information. There is still a lot missing from my map.

Comments

Rei Zhang says:

November 17, 2020 at 10:55 pm

Cynthia,

I think your instinct to normalize by population is also the first step I would have thought of. I agree that it’s interesting that code 96720 has such a low proportion of cases; I wonder if demographics play a role (more seniors/middle aged people that would have already contracted COVID-19 or are taking social distancing more seriously).

I also think your point about “dirt not voting” is really important when we consider visualizations that have population baked in; it’s really easy to conflate land area, visually, with numerical size.

Maya’s Visualization — Weekly COVID incident rates in Berlin

Maya Stepansky, November 17, 2020 2 Comments

Data

Revisualization

Original

As you probably know from class, I had trouble embedding my visualization (I think it had to do with the specific visualization that I chose from Datawrapper). The original visualization shows the weekly COVID incident rates in the city of Berlin by age group over each week from March to October. I found that though the visualization emphasized weekly rates, that was not clearly reflected in the initial visual. I also thought that the heat map was slightly misleading because it didn’t show how the overall weekly rates fluctuated. The only real relationship shown was the increase and decrease in rates per age group by rough approximation of color gradation. In an effort to make this visualization more accessible and intuitive, as well as to highlight the actual comparison of weekly rates, I decided to make a stacked bar graph that clearly reflected the change in rates by both week AND age group. This made it overall much easier to see trends and correlations.

Comments

Lauren McGrath says:

November 18, 2020 at 3:42 pm

Hi Maya,

Your comparison of these two visualizations based on the same data made me think about the visualization of US immigration as metaphorical rings that we discussed in class.

As an aggregate, the original viz is a visually striking image in an “artistic” sense, but as we discussed in lecture, it doesn’t reveal the complexities of the data like your version does (being able to see the weekly labels, the colors showing age categories rather than incidence rates).

I think with data visualizations, some authors have the intention of making data “beautiful” while others really want to show the irregularities, patterns, or inconsistencies in data. I don’t think that there is a “right” way to visualize data, as long as the author is able to be forthcoming about their methods and reasoning for their creation. I think your comparison here does a great job of showing those dynamics!

Ailee Mendoza says:

November 18, 2020 at 5:58 pm

Maya, after class yesterday I couldn’t stop thinking about your visualization. Considering that, admittedly, all I really did in my visualization was change the chart style into bar form (what I personally deem to be the most intelligible), I really appreciated how you were able to simplify a chart that was initially so “crowded” and intimidating.

I started to think about the criteria for data visualizations that I’ve arbitrarily designated as “intelligible” and the reasoning behind these criteria. I think the temporal aspect you introduced in your version of the visualization was particularly compelling to this self-reflection. The addition of axes immediately gave me a sense of direction, of progression, of narrative. It’s crazy how aesthetically and intellectually satisfying the presence of time is in a graph. Whereas before I had to interact with the data in order to orient myself temporally, now it has been laid out for me in a single tagline, if you will.

I honestly wonder if I am the only person who has this perspective on data- does simplicity and “obviousness” appeal to you? Do you privilege immediate accessibility or creativity? Or both?

Black vs White Race Representation Ratios for Various Interactions with Police, 2015 to 2019

Ailee Mendoza, November 17, 2020 1 Comment

Data

Description:

I seem to have lost the original visualization, so I’ll just describe as best I can the choices I made in creating my version. The question Prof. H posed yesterday about what factor should be the most privileged in constructing a data representation really got me thinking about which of those 4 factors have inherently guided my thought processes and my creativity. I mentioned this in my comment on Maya’s visualization, but I really think this question highlights how 1) anything interpretive or creative is subjective and 2) I personally privilege accessibility/comprehensibility. To be honest, I’m not really a fan at all of frequency ratios. For someone like me, the original dataset that only showed the frequency ratios was hard to understand without great intellectual effort. I wanted to figure out a way to supplement these ratios with a visual component to accompany and to give these numbers some “immediate” significance. I think the bar chart format allows for an easy and concrete way in which we can interpret the significance of the frequency ratios at first glance. I do, however, appreciate interactivity in data representation as well. I scrolled past Cynthia’s visualization and liked how you could learn more about specific territories in Hawaii by hovering over them- but then we have to ask ourselves which information will be available at “first glance” and which is only accessible when you interact with it? These choices end up producing very different narratives with the same set of data…

Comments

Matthew Gancayco says:

November 22, 2020 at 7:46 pm

Hi Ailee,

First off, I agree with your point that anything interpretive is subjective. If the reader is not directed or told exactly the point you are trying to get across, the data is going to mean different things for everyone. For instance, in your graph the data that stood out to me the most was the fact that the ratio was higher for blacks fleeing vs not fleeing. My immediate assumption would have been that those who aren’t fleeing would be more likely to be killed as there is a possible confrontation, but the opposite was the case.

Your choice to make the data into a bar chart does give it immediate significance. I also like how aesthetically you ordered the different figures so that the ratios scaled ascendingly and descendingly. One possible change that could change the narrative could be adding either another minority group’s data or the data for the population as a whole. That could add a whole other layer of depth.

Comprehending COVID-19: Exploring Testing Capacity’s Confounding Influence

Zack Kurtovich, November 17, 2020 1 Comment

Data

Description:

For this assignment, I thought it would be interesting to investigate each state’s COVID testing capacity to try to gauge its confounding influence on current understandings of the distribution of coronavirus cases within the United States. The data for my visualization was obtained through the data set provided below, which was created as part of the COVID project. As you can see, the original creators decided to list the states, which I thought limited the audience’s interpretation when it comes to comparing larger regional and inter-state discrepancies. While most discourse and data pertaining to COVID-19 is centered around the number of cases, the impact of testing is often overlooked. As such, I thought it would be valuable to map out the differences to visually assess the landscape and identify any patterns, as it lends insight into the accuracy of each state’s confirmed case rate.

After completing the first graphic, I then similarly mapped out each state’s confirmed case rate to cross reference between the two visualizations. This yielded a few key observations. Of these, I personally found Oregon’s extremely low testing capacity and low confirmed case rate to be the most noteworthy, as I have always considered them ahead of the curve in terms of listening to the science and making data-driven decisions (marijuana legalization, drug decriminalization, etc). This has really made me question what is driving this trend – why is their testing capacity so low? Is it a lack of state or federal funding? Pennsylvania also really stood out to me because it had the lowest testing capacity (0.2154%), but still had a relatively high confirmed case rate (0.6019%), which indicates that the pandemic has probably hit that area much worse in reality. Texas and Florida also reflected this dynamic, although I wasn’t necessarily shocked due to the press coverage these states have received in recent months. Lastly, I was surprisingly impressed by Alaska’s response to the pandemic. Despite being a Red state, Alaska appears to have been able to capitalize on their geographic isolation and low population density, as they not only have the highest testing capacity (1.15%), but also the second lowest confirmed case rate behind Hawaii.

Original Data Visualization:

Comments

Emily Yu says:

November 18, 2020 at 11:03 am

Hi Zack, I think your choice of using spatial maps in order to visualize the original data visualization really helped to contextualize all the floating numbers in the chart by allowing us to compare each state not only to itself but also to all other states. The insights that you brought up are really interesting in that the news mostly focuses on the lack of testing capabilities, yet your maps reveal and allow us to question and guess why certain states, like Alaska, were able to respond better than others.

I was thinking maybe another way to approach this visualization is to take the difference, for each state, between testing capacity and confirmed coronavirus cases (testing capacity - confirmed coronavirus cases). This might lead to more insights about the level of over or under preparedness that the states might have in terms of testing than having to kind of guess at the difference between two shades of colors. I also wonder if there’s a way to further disaggregate this data to see the distribution of testing capabilities as well as coronavirus cases to see if there’s any correlation.

20 Centuries of Carbon Dioxide Concentration in the Atmosphere

Rei Zhang, November 17, 2020 1 Comment

Data

This is my chart:

Based on this chart:

I thought that it would be nice to include a more historical aspect and contextualize the effect of the Industrial Revolution by pulling carbon data from the beginning of the Common Era.

What I don’t like as much about my visualization are the overlapping labels, although they do help emphasize the similarity of atmospheric carbon levels until humans start emitting many tons more of CO2.

Comments

Cynthia Vu says:

November 22, 2020 at 4:41 pm

Hi Rei,

I really liked the original Six Decades of Carbon Dioxide Concentration in the Atmosphere chart and I’m happy you chose to work with it! I agree with your instincts to contextualize some of the changes in Carbon Dioxide Concentration over time by trying to indicate historical context. Personally, when I was considering working with this chart, I thought about mapping or labelling significant historical events that we might claim had a direct influence on the rising CO2 levels–for example, the invention of the steam engine. However, I wonder about the possibility of providing too much context on a data visualization. How much significance should we place on context in a data visualization? Or should context remain outside of the visualization, as we have done with our accompanying paragraphs?

You mentioned disliking the overlapping labels on your visualization, although I agree that this has the effect of helping the 20th century CO2 levels stand out. Perhaps the solution might be to decrease the step of the y-axis ppm levels, so that we can see more minute changes between the CO2 levels before the 20th century.

data wrapper

Anna Durak, November 17, 2020 1 Comment

Data

Comments

Lauren McGrath says:

November 27, 2020 at 1:26 pm

Hi Anna,

I thought this visualization was so cool with the tabs at the top. I additionally thought you were very thoughtful in organizing them, from state population to amount of COVID19 tests administered per state to those tests per 1000 people.

I thought your ordering of those tabs provided a great example of the powers of aggregation and de-aggregation that we’ve been discussing. I think this applies in the sense of not only viewing total numbers but then per 1000 so that we can “compare” states, but also de-aggregating positive and negative COVID tests from the total administered in the population.

I think that one way this graph could be interpreted is not in the sense of “rising” COVID cases indicating increased testing, but looking at the testing availability that we now have; I remember at the beginning of the pandemic there was great worry over the accessibility of testing and who would be allowed to be tested. On the other hand, like we talked about with the pixelated person, pixelations of states into statistics “per 1000” are more of value because of their ability for comparison. The animated aspect of your viz demonstrates this; as the viewer cycles through the tabs, they watch states flip from the right side of the chart to the left. This change is enabled by our ability to now “compare.”

Datawrapper: COVID19 trends

Lauren McGrath, November 17, 2020 1 Comment

Data

both of these were based on:

Hi all,

So like I said in class, I saw this table by the COVID19 project (the embed doesn’t seem to be updating right now but I’ll double check on it) that essentially walked the viewer through calculations, starting on the left-hand columns and working towards the right-hand columns, that resulted in a designation of an “up” or “down” arrow value for a state in their COVID19 hospitalizations. This table showed aggregations of data and a specific sequence of calculations (like dividing the number of cases by the population of the state to make it comparable) that ended in a simple symbol. I wondered what the effect would be if each of these columns were to be visualized on a map. Seeing the data dis-aggregated from the table made me realize how many different dynamics are at play with COVID19 data; as you can probably tell, the images of the two maps I created evoke drastically different reactions.

Comments

Matthew Gancayco says:

November 27, 2020 at 12:38 am

Hi Lauren,

Converting the table to a map makes the data much more comprehensible. I found reading the data in the table for all of the states and the capital to be a bit overwhelming. By consolidating all of the data onto the map it is much easier to see how covid is playing out in each state. The map is also helpful in identifying trends that are forming in certain areas of the country, as well as making the trends going on in all of the states more apparent. I was previously unaware how hard the midwest was getting hit by the pandemic, and the map really highlights that.

I agree that the two maps created different reactions, and they support the same narrative. The pandemic is getting worse, especially in the places that were not previously a problem. Cases are rising throughout the country, but the percentage is drastically increasing in a condensed area in the middle of the country.

I think it would be interesting if there was a third map for cases in that day, just to see how the map compared to the per capita map. Perhaps that map could be used for people hoping to pose a counter argument.

Global Carbon Budgets

Grace Logan, November 17, 2020 1 Comment

Data

Original:

The original visualization shows a target temperature, that is, the number of degrees Celsius that the global temperature will rise, with a corresponding emissions budget, a percentage of how much of that budget has been used, a trend line showing how rapidly yearly emissions need to fall to stay within the target temperature, and the number of years to halve emissions. There is a lot of information here. In my opinion, too much information for this one visualization. I think it is straightforward and effectively communicates that different target temperatures will have very different trajectories for how much we need to curb emissions. But I felt like the more visual representations, that is, the percentage of budget used bar and the trend line for yearly emissions, are too small to be as impactful as they could have been. I realize that this visualization was probably made to stand on its own, but I think for our projects it would be more helpful to break up the components to make them more impactful, and because we have the room to explain and discuss the content I think there is no reason not to.

My Visualizations:

I have taken a piece from the original and turned it into a bar chart that shows the global carbon budget for each target temperature. I think that putting it in a bar chart emphasizes how different these budgets are per target temperature and makes the “budget used” more visible. In the original, the small percentage bar that was meant to represent this gets lost and the variation is harder to see. Originally I wanted to also make a second visualization to show a line graph of the trajectory for emissions at each target, but my one semester of math is failing me. The data set has emissions for each year from 2021 to 2100 that they used to produce the trend line. I realize that the solution is probably a simple equation and formula added to a new column, but I do not currently know what it should be. Anyways, to reiterate, my main reason for doing this was to show that with our projects we have the space to break down these components and discuss them further so they do not need to be crammed into one visualization.
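One possible form for that formula, offered only as a sketch: if yearly emissions fall by a constant fraction so that they halve every h years (an assumption about how the original trend line was built, not something the data set confirms), then emissions in year t are E(t) = E0 × 0.5^((t − t0) / h), where E0 is the emissions in the starting year t0. The Python below fills in such a column; the starting value of 40.0 and the 10-year halving time are hypothetical placeholders, not figures from the data set.

```python
def emissions_trajectory(start_year, end_year, start_emissions, halving_years):
    """Yearly emissions under a constant halving time (exponential decay).

    Assumes emissions halve every `halving_years` years, starting from
    `start_emissions` in `start_year`. Returns a dict of year -> emissions.
    """
    return {
        year: start_emissions * 0.5 ** ((year - start_year) / halving_years)
        for year in range(start_year, end_year + 1)
    }


# HYPOTHETICAL inputs: 40.0 units emitted in 2021, halving every 10 years.
trajectory = emissions_trajectory(2021, 2100, 40.0, 10)

for year in (2021, 2031, 2041, 2100):
    print(year, round(trajectory[year], 2))
```

Running the function once per target temperature, each with its own halving time, would give one line per target for the line graph.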

Comments

Anna Durak says:

November 27, 2020 at 11:43 pm

Hi Grace!! I really like how you honed in on the specificities in the data. It was really cool to see the original visualization and then what you made it into. I agree with you, your visualization is very comprehensible and easy to follow. The specific data that you included really allows the viewer to focus on the budget solely and therefore gather a very strong “thick” understanding of that particular narrative. While the original one is very informative, I do find myself skimming the information rather than really soaking it in like in your version. When I looked at the original table, I had not realized how much remaining budget each of the target temperatures had left. I think that it gets lost in the information, but your graph conveys the budget really well and concisely, which I feel is really important in conveying the point. It really makes it feel like this is a very attainable goal, whereas I feel it gets lost in the percentages of the first graph. I think this original visualization has a lot of potential for a lot more data visualizations to really break down all of the data.

Joe Bartusek, November 17, 2020 1 Comment

Data

Comments

Grace Logan says:

November 24, 2020 at 3:04 pm

Hi Joe, I think that this chart could be really useful for your group’s project. I recall that you had yet to come up with a relevant visualization since data was difficult to come by. I did not watch the Buzzfeed video, so I do not know if they take up the issue of poverty and the relationship that you are trying to represent here. Maybe if it is not discussed in the Buzzfeed video you could bring this visualization and the data that it comes from into your project, if it does not further complicate things. Other than that, I like that it is interactive; hovering over the chart definitely makes it more readable. The one thing that I am confused about is the size of the bars. This could just be my own data illiteracy, but I thought that if they are showing proportions then they would all be the same height? Explaining this would be one thing that would make this chart more accessible.

Jerome’s data visualization

Jerome Desrosiers, November 17, 2020 1 Comment

Data

Comments

Zack Kurtovich says:

November 27, 2020 at 1:58 pm

Hi Jerome!

I loved your decision to utilize a traditional scatterplot as a means of providing the viewer with a feel for the enormity of the problem while attempting to preserve the humanity and identity of the individuals represented by each datum. As we’ve extensively discussed over the course of this semester, negotiating this tension between abstraction/ aggregation and personalization is often the most challenging and difficult component of data visualization, so I found the scatterplot to be an especially impressive and impactful contribution to discourse surrounding police brutality and racial injustice. As I mentioned when you presented last week in class, what I find truly compelling about your graph, aside from the sheer size and scale of the victim pool, are the outliers, the people that don’t fall into the typical age range. The data points for Jeremy Mardis and Kameron Prescott particularly struck me, as they were both only six years old when they were murdered by the police. This, I suppose, is one of your visualization’s greatest attributes- it evokes larger questions, it encourages the audience to learn more about the people behind the data points. Through its sort of humanized abstraction, your graph tells a crucial story of an entrenched system of violence perpetrated against the very people police officers are sworn to protect, which successfully challenges traditional characterizations of these shootings as isolated incidents committed by a “few bad apples”.

For future analysis, it might be valuable to further contextualize this data with more information about the victims beyond simply their government name, such as their date of death, race, state of legal residence, or some determined metric for socio-economic status (occupation, etc). While I am a firm proponent of the power of names, it is ultimately limited in terms of capturing a person’s identity. For example, after conducting some cursory research, I discovered that Jeremy Mardis, the six-year-old victim I previously mentioned, was actually white, which indicates an important nuance that should probably be accounted for in the future. I also think it would be extremely interesting to examine the perpetrating police officers to obtain further insight into the context enabling and facilitating these shootings. What was the racial composition of the officers that pulled the trigger? How many of them actually faced repercussions, and how were these punishments distributed? The answers to these questions may help us further understand why interactions between the police and certain demographics are often unsuccessful. I truly loved your visualization, thank you for tackling such an important issue. Have a great Thanksgiving break!

Data (but only if you ask for it…)

Jerome Desrosiers, November 3, 2020 1 Comment

Data

Over the past week, I tried to keep track of the data that I produced while using my favorite phone applications. At first I went into my settings to try and find the hidden data that is not offered to us unless we look for it. The problem is, once I found it, I wasn’t able to read it in order to understand what I had produced. As you can see in the first picture included with this post, the text is impossible to understand.

I then searched for more information that would be easier to understand, like my daily average screen time. All my friends have been talking about it except me, and the reason is that I haven’t updated my phone for fear of slowing it down. This meant I had no way to find such data unless I downloaded another application. I then went on Spotify, and as soon as I opened the app, the algorithm had figured out what I had been listening to and offered me multiple playlists that I would probably like.

Lastly, since I spent most of the time on my phone on YouTube, I decided to see what kind of data I produced. The first thing that I found was the main page, which offers me videos based on my search history and previously watched videos.

What I found interesting while going over my produced data is that it was never given to me automatically. I always had to make the effort to find the data I produced while using applications. It is data that I produced yet it is hidden from us.

Comments

Zack Kurtovich says:

November 5, 2020 at 10:49 pm

Hey Jerome, I loved this post – it really touches on some core themes that I’ve also been wrestling with. I also found the selective exhibition of our data to be extremely disconcerting. Spotify clearly hangs on to your data, as evidenced by their “Unwrapped” playlists, so why do they choose to function as a black box instead of allowing us to really dig into the ones and zeros of what we produce? Is it because of a perceived lack of demand or is it intentionally designed to obstruct us from monitoring our data? This underscores the importance of informed consent while simultaneously calling its existence into question. It’d be super interesting to take a look at the regulatory precedent and see if there’s a way we can promote greater transparency and protect our proprietorship over our data. Can we even really consider it our data if we know that they collect our data and choose to engage in it anyways? At what point do we relinquish ownership? Your post evoked a lot of questions – overall a great read!
