EC – Data Cultures

Posted on November 8, 2018 by EC

Lynching Data?


In the article “Lynching, Visualization, and Visibility” Mullen explores how lynching as a topic in religious history differs in many ways from lynching as a subject of data collection. The article first looks at studies of lynchings that have no stories behind them. These studies chart when and where the lynchings occurred, but they put no story behind them. I think this is problematic because they are looking at a very controversial piece of history, and by looking only at the raw data a lot of the story is lost. One line in the article reads, “lynching was a ritual that made power visible, yet its power depended in part on its lack of visibility in the official records.” Because the lynchings were not recorded as legitimate data, making data visualizations of them now is somewhat inaccurate: we are unsure of what information is really accounted for. The article then goes on to address how data visualizations today could also be skewed, because records are not kept of things such as when police use brute force. I think we should be learning from the past and from our mistakes, and taking into account how people are being treated now. We should be addressing racial conflicts. The article shows how records of lynching could have helped do that, and how we should now be doing the same with police brutality by understanding when it really goes on, not just sometimes.

As RF said in their blog, “the context in which we read and visualize data can sometimes be just as powerful as the insights themselves.” I think this statement is really true, but I think the lynching article raises a contradiction to it, because it shows how sometimes there might not be enough data collected to create a good data visualization.

Posted on November 4, 2018 by EC

Feminism in the Data Field

In the article “Feminist Data Visualization” D’Ignazio and Klein discuss how feminist theory can be intertwined with information visualization research. They address how feminist practice can be brought into areas such as STS and HCI. To expand thinking about the data visualization field, they show how research on feminism can push its boundaries. I think it’s important to explore where women currently sit in the power hierarchy and to continue doing research that stretches the data field to its greatest boundaries. I like the article’s idea of “working backwards”: first looking at the individual or institution that has an idea, and then making a data visualization with that important information in mind.

Classmate BL references how this article is good about showing how “incorporating a much more humanistic perspective” allows us to explore even more data than we have before. I think this is a good point about how people need to be more open about using different methods to explore already developed fields.

Posted on October 25, 2018 by EC

Can the Humanities Topic Model?

In the article “The Digital Humanities Contribution to Topic Modeling” Meeks and Weingart address how topic modeling has become more frequent in the humanities. When they describe how “You’re introduced to topics, and how a computer came to generate them automatically without any prior knowledge of word definitions or grammar,” it really highlights how new topic modeling is to the humanities world, and why people can question whether it is really a good fit. They address how topic modeling is beneficial and could help people make advances in this field. I think it is important that they conclude with the idea that topic modeling cannot take over the humanities; it should just be a new tool that scholars can use to help. Because topic modeling is so new in this field, it is important to remember that it should not be the be-all and end-all of analyzing data, and that people will still have to review the results.

J-OS stated “it’s important to how exactly the topic modeling program is being run to know what exactly the results are saying.” I agree that people have to be extremely careful about what types of analysis they use with what type of data. It really shows that right now human analysis is very necessary.

Posted on October 22, 2018 by EC

Words Count…

In the article “Making Meaning Count” Sinclair and Rockwell explore the differences in word choice in descriptions of black NFL prospects and white NFL prospects. They note that although written words are not the flashiest way to communicate, they are still the most common way: “In this simple sense, text is already a type of visualization,” and therefore analyzing the text is worthwhile. I think creating the interactive was an interesting way to analyze the word choices, and it can really demonstrate the differences. They then go on to discuss the logistics of text analysis, including how it can synthesize data and how we can then put the results into a visual platform such as a wordle. I think text analysis definitely needs to be developed further but could be a huge tool in data analysis. It could be used to analyze extremely long texts, but it can also make mistakes, and that’s why humans also need to analyze the data to ensure that things are not being missed.

One of my classmates wrote “I think you can be especially creative when the data is in the form of text.” I would agree with this statement. Because text is in such a simple form, there are so many different ways that you can go with it and so many different angles you can take when analyzing it.

Posted on October 13, 2018 by EC

Words?

In the article “Words that Have Made History, or Modeling the Dynamics of Linguistic Changes” Maciej Eder explores how collected data on words in different languages can show how a language has changed over time. The reader is informed of the methods used to find these changes: the researchers first set a hypothesis about how a language changed, and then they take data from before and after the time of the supposed change in order to see, based on randomly selected words, whether the language really did change. The article then goes on to explain that even with this data it is tough to recognize if there was a language change, because if the words are sampled at random and the change is based on a single word, the data may be skewed. I think that exploring language change is an extremely interesting thing to use data for. I would never have thought that trend lines could be applied to the English language. In JN’s post they address how sometimes studies can be irrelevant by saying, “I think using this data to find general trends has the potential to be very misleading.” I think this may apply to these trend lines. I feel that taking random words to see changes in certain words or certain parts of the language may not be very effective, and may create data from which we draw conclusions that may or may not make sense.

Posted on October 10, 2018 by EC

Low-pay leads to Low Happiness

In the journal article “The effects of low-pay and unemployment on psychological well being: a logistic regression approach” Theodossiou explores the correlation between unemployment and mental health issues. The issue is explored mainly through people’s answers to a series of questions, which makes me wonder if people are unhappy because of their unemployment or because they are most likely surrounded by other unemployed people, so that unhappiness has become a consensus rather than an individual feeling. I think it is important how the article addresses why unemployment might make people so unhappy. Of employment it reads, “it may be a source of prestige and social recognition, a basis for self-respect and sense of worth, an opportunity for social participation or merely a way of earning a living.” This suggests that people who are unemployed are typically unsatisfied, and the data allows the article to explore to what level they are unsatisfied. I think it is also important that they look at the subgroups, since we know from the Simpson’s Paradox reading that subgroups can paint a different picture. In this case they seem to get the same result but with different reasoning; for example, young people are more unhappy out of boredom and lack of purpose than out of stress. I think this article does a complete job of addressing how unemployment can affect many different aspects of mental health, and I think they use their data well to show multiple different correlations that are present in this data set. I think this was one of the more comprehensive articles in collecting data and addressing the concrete conclusions that can be made from it.

In the Simpson’s Paradox blog from last week the question was raised, in regard to correlation, “Is this a connection that humans are forcing or something that happens on its own?” I think in this particular study they were able to present accurate and sufficient information to show that the correlation between unemployment and unhappiness is one that is really evident and not one that humans are forcing.

Posted on October 8, 2018 by EC

Simpson’s Paradox: Is the data telling the right story?

The article that addresses Simpson’s Paradox begins by addressing the problem with people’s observations of data. It most clearly describes how data can lead to different conclusions when it is split into sub-groups than when all of the data is kept together. The graph that analyzes alcohol intake and IQ levels is extremely interesting. It is evident that across the group of people, those with higher IQs consume more alcohol, but within each individual, the less alcohol they consume the higher their IQ. This illustrates Simpson’s Paradox: the overall conclusion goes against what is really true. The article gives many good examples of this paradox appearing in data. In going on to address how it should be dealt with, it offers many good suggestions for how scientists should be aware of their data and how they should assess outcomes like this. I think the idea of looking at clustering within data to see if trends might be going in conflicting directions is one of the best ways to address this issue. The authors conclude that data has to be analyzed very carefully in order to reach the correct conclusions. I thought this article did a good job of presenting examples of Simpson’s Paradox and explaining the errors in conclusions that can be made because of it. It is an interesting topic because it is probably present in many data sets. Last week Jackson Hayes addressed how data can be manipulated: “I was also intrigued by the way data can easily vary based on different variables.” I think this relates to Simpson’s Paradox because it shows how data in many ways can represent the wrong things. All in all I think both articles bring important aspects of data analysis to light.
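
To make the alcohol and IQ pattern above concrete, here is a minimal Python sketch. The numbers are invented purely for illustration and are not taken from the article, and the names `slope`, `groups`, `within`, and `pooled` are my own. It fits a simple trend line to each person separately and then to everyone pooled together.

```python
from statistics import mean

def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_bar, y_bar = mean(xs), mean(ys)
    num = sum((x - x_bar) * (y - y_bar) for x, y in points)
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# Invented data, for illustration only: (alcohol intake, test score)
# measured three times for each of three hypothetical people.
groups = {
    "person_a": [(1, 100), (2, 98), (3, 96)],
    "person_b": [(4, 110), (5, 108), (6, 106)],
    "person_c": [(7, 120), (8, 118), (9, 116)],
}

# Trend within each person: score falls as intake rises.
within = {name: slope(pts) for name, pts in groups.items()}

# Trend when all points are pooled: score rises with intake.
pooled = slope([p for pts in groups.values() for p in pts])

for name, s in within.items():
    print(name, "within-person slope:", round(s, 2))
print("pooled slope:", round(pooled, 2))
```

In this made-up data every within-person slope is negative while the pooled slope is positive, which is the reversal the reading describes. It is only a sketch of the idea, not the method the article's authors used.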

Posted on October 4, 2018 by EC

Martel Reading Response- Elizabeth Cullen

In “Differences Do Not Matter: Exploring the Wage Gap for Same-Sex Behaving Men” Michael E. Martel explores the previously studied statistics that confirm gay men make noticeably less than heterosexual men, and emphasizes that the data shows this wage gap is in no way created by a difference in skill level. The article uses GSS and Census data to make its initial claim that the wage gap exists, and a table references multiple studies finding that a penalty in earnings comes with being gay. I think the presence of the table really emphasizes the wage gap and is an effective use of data to paint a picture of the issue for the reader. It is stated in the reading that “[they] are interested in differential treatment that gay men experience at work. Those who identify as gay are more likely to indicate to employers and coworkers that they are gay [Carpenter 2005],” (Martel 39). I think this claim makes their assumptions about the sources of the wage gap more valid because they are using data that has an obvious correlation. They then go on to compare it to the wage gap for minorities, which also emphasizes the problem that is at stake. The conclusion reiterates the point that the wage gap is clearly evident between gay men and heterosexual men. I think this article used data in a really straightforward way that supported their claims and their assumptions about where this wage gap comes from. Where their reasons for a claim were not clear, they were able to support them with data that was correct.

One of my classmates made the claim that the study “misses out a big chunk of potential data from people who aren’t in a cohabiting relationship.” I think this is a good point, since even a gay man who is not in a cohabiting relationship should be recognized in the workplace. Although I think this is a valid claim, I believe that they left that group of people out because they wanted to ensure that the data came from a group of extremely similar people, and a cohabiting relationship is what they decided would mark a similar group of people that would provide good supporting data.