MG – Data Cultures

Posted on December 10, 2018 by MG

Guiding Principles for FAIR Data

This article caused me to think significantly about the way I present data in my projects and also what data is accessible to me. It also made me think about the way that data has historically been inaccessible to some groups. In my Environmental Science class we recently watched a movie about the Love Canal, an environmental disaster that was wrought by the lack of information about a certain community’s exposure to chemicals. Not only were the residents unaware of the fact that they were exposed to a significant amount of chemicals, but when they sought out information on their exposure, the data was very difficult to read because of the density and use of scientific language. The data in this situation clearly went against the FAIR guidelines because it was not accessible. As Michael commented, the FAIR guidelines help ensure “effective research and future studies.” In the case of the Love Canal, the data was unable to be effective or influence future studies until it was made more accessible by a scientist who translated it into something more understandable by members outside of the scientific community. This article made me think immediately of the Love Canal situation and made me realize the importance of making all data FAIR.

Posted on December 6, 2018 by MG

What you can, can’t and shouldn’t do with social media data

This post made me immediately think of Donald Trump’s tweets. I think that in this day and age, especially with Trump, social media has gone beyond being simply a platform for socializing and has become something almost like a newspaper. For instance, Trump tweets his every move. If social media were more privatized and protected, it would be impossible for reporters and other nations to access the information they need. I don’t necessarily think that Twitter is the appropriate platform to publish national decisions, but I appreciate its accessibility.

One question I had while reading this article was how different levels of privacy relate to how much of one’s data researchers are able to access. For instance, does it matter whether one’s Instagram account is public or private? Was the research based primarily on accounts that were public? Or do the terms and conditions give permission to some researchers regardless of the privacy settings of the account?

I also disagreed with the point made about how the ability to conduct research without direct consent allowed researchers to avoid the Bradley effect, Hawthorne effect, and response bias. I think that people cultivate their online personas, and while they might not be cultivated with the intention of responding to certain research questions, they are not truly authentic.

I did not see any other annotations to comment on.

Posted on November 27, 2018 by MG

Debates in the Digital Humanities Response

At the beginning of the article, David asked if people who were posting online weren’t being counted as credible sources. His question made me think of the issue of fake news, and of which online sources we trust and assume to be credible. As we have discussed in the past few weeks, one danger of data analysis is the misrepresentation of data in graphs. It is no surprise that the misuse of technology and data analytics can result in misleading or false conclusions (as shown in Civilian Casualties and Searching for Black Girls). This is not meant to discredit all digital humanities studies, but to acknowledge the danger when these tools are used irresponsibly, and to understand why the credibility of digital humanities scholars is questioned. In an age where information is readily accessible at all times, and where there is a plethora of online articles, it is the reader’s job to analyze the source and determine its credibility. I think that, in response to David’s question, we should assume credibility with a certain amount of doubt that allows us to critically examine the work before assuming its validity.

Posted on November 6, 2018 by MG

Feminist Data Visualization

Michael’s annotation highlighting the importance of whose voices are represented in the data strikes me as one of the key arguments of D’Ignazio and Klein’s article Feminist Data Visualization. Because data collection is a process designed by the collector, it is inherently biased and controlled by the collector. This consequently limits the scope of the data to the resources possessed by the collector. Feminist theory in the context of data visualization aims to enlarge this scope to be as inclusive as possible by emphasizing the perspectives of the many people who have been marginalized and whose voices have historically been excluded. Yet Klein and D’Ignazio recognize that even feminist theory runs into limitations when considering those who are gender non-conforming or transgender.

Michael notes how this question is a recurring theme in many of the articles we are reading. This idea reminded me of Hans Rosling’s 200 Countries 200 Years video, which briefly addressed the silencing of certain narratives through averages. For example, the video addressed the intersection of life expectancy and wealth by analyzing country averages for both categories. Rosling acknowledged that certain countries, say China, when split into provinces, had numbers that fell all over the graph and far from the average. Specifically, more rural provinces were poorer and had shorter life expectancies, whereas Shanghai had higher levels of wealth and longer life expectancies. Rosling, the designer of the visualization, could have chosen not to divide China into provinces and thus left out the voice of the poorer and more marginalized communities. It was his choice to include such a narrative, but he only did so for China. Rosling therefore left many voices out of his data by using mostly averages per country.
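To make the point about averages concrete for myself, here is a minimal sketch of how a single country-level number can hide regional differences. The region names and figures are invented for illustration and are not taken from Rosling’s data, and the average is a simple unweighted one (a real comparison would probably weight by population).

```python
from statistics import mean

# Invented figures for illustration only; these are NOT real data.
life_expectancy_by_region = {
    "Region A (urban)": 80,
    "Region B": 70,
    "Region C (rural)": 60,
}

values = list(life_expectancy_by_region.values())

# The single number a country-level chart would show (unweighted average).
country_average = mean(values)

# The gap between the best-off and worst-off regions, which the average hides.
spread = max(values) - min(values)

print("Country average:", country_average)
print("Gap between regions:", spread)
for region, value in life_expectancy_by_region.items():
    print(region, "differs from the average by", value - country_average)
```

Only one of the three regions actually sits at the average; the other two are ten years above and below it, and a chart showing just the country figure would not reveal either.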

Feminist Data Visualization aims to minimize the number of voices left out of data collection, so that the resulting conclusions and visualizations are more inclusive.

Posted on November 1, 2018 by MG

Using Metadata to find Paul Revere

The language and tone of this article were initially confusing, and seemed out of place given the content of the article. Written in the prose of 18th century England, the dated language of the article is a sharp contrast to the technological terms and concepts that the article is focused on. Regardless, the author puts forth a clear and easily understood explanation of the way s/he used matrices to draw conclusions about a certain dataset. Matrix functions can be used to manipulate data in ways that expand the bounds of the original dataset. This expansion allows us to draw new conclusions about the given data, conclusions that we could not have reached by looking at the original dataset. Though the old-fashioned English tone of the article felt out of place, the explanation was straightforward, and the status of the author as a “low-level operative” allowed him/her to use simple wording to plainly describe what was going on with the data. Additionally, the placing of the article in the 18th century eliminated most of the technical jargon that often makes some readings dense and confusing. Thus, although the tone of the author seemed out of place, it was ultimately helpful in understanding the concepts discussed.
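As I understand the matrix step, a table of who belongs to which group can be multiplied by its own transpose to get a new table of how many groups each pair of people share. The sketch below tries this on a tiny made-up example. The names, groups, and memberships are hypothetical and are not the article’s data, and the helper functions are my own rather than anything the author wrote.

```python
# Hypothetical membership table: rows are people, columns are groups.
# A 1 means the person belongs to the group. All values are invented.
people = ["Ann", "Ben", "Cal"]
groups = ["Club A", "Club B"]
membership = [
    [1, 0],  # Ann is only in Club A
    [1, 1],  # Ben is in both clubs
    [0, 1],  # Cal is only in Club B
]


def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x * y for x, y in zip(row, column)) for column in zip(*b)]
        for row in a
    ]


# People x people: each cell counts the groups two people have in common.
person_by_person = matmul(membership, transpose(membership))

# Groups x groups: each cell counts the people two groups have in common.
group_by_group = matmul(transpose(membership), membership)

for name, row in zip(people, person_by_person):
    print(name, row)
for name, row in zip(groups, group_by_group):
    print(name, row)
```

Nothing in the original table says directly that Ann and Ben are connected, but the people-by-people table shows they share one group, while Ann and Cal share none. That is the kind of new conclusion the multiplication makes visible.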

Posted on October 25, 2018 by MG

The Digital Humanities Contribution to Topic Modeling

When I use a calculator for a math problem, I never question the answer I am given. This is because I trust the computer doing the mathematical analysis that gives me the answer. I trust the answer that I am given without challenging it. How does this differ when the computations and analysis being performed are on the humanities and digital literature? Should we trust computers the same amount? In my previous post I touched on this idea, and Meeks and Weingart mention it in their critique of Topic Modeling. They discuss how Topic Modeling extends beyond the capacities of humans and opens new doors of understanding. However, their critique targets our willingness to accept the conclusions drawn by these models without challenge or question. What stuck out to me most from this article was Meeks and Weingart’s call to use caution when accepting the results of Topic Modeling. They emphasize the interpretive capacities of human scholars, and urge us to think critically about the results that ensue. This is especially important in avoiding situations like those brought up in the Civilian Casualties article, ones where the data can reflect a bias and corroborate a stereotype that isn’t necessarily true. This isn’t to say that criminal analysis data is the same as topic modeling, but it points to the same possibility: that the data we trust and the models we rely on might be producing conclusions that we should not take at face value. The article stressed to me the importance of not losing our human intuition to the presumption that computers know everything. Topic Modeling can open many doors, but we must look through those doors with caution.

Posted on October 23, 2018 by MG

Jockers and Underwood Reading

This reading gave me a new perspective on the capabilities of digital text analysis and visualization. I had previously thought that digital analysis of text would be limiting. How could a computer that works on a system of 0’s and 1’s analyze text in more ways than my brain could? However, this article made me realize the breadth of possibilities present with digital text analysis. I found it particularly interesting when the authors explained how analysis could reveal to the reader new ways to look at the reading. The authors take into account that the interpretations generated by the analysis methods are to be questioned like any human source and interpretation, which I found important. I think we often believe that digitized calculations and analysis are correct without questioning them, even though we would question the conclusions of our peers. The suggestion to question the results of computers, along with the span of possibilities they are capable of, makes me see computers almost in a more human way. I’ve generally thought of computers as able to compute numbers quickly; however, I had never considered them able to analyze texts in the way this article showed me.

Posted on October 11, 2018 by MG

Theodossiou Reading Response

Yesterday was National Mental Health Awareness Day, making this most recent article all the more relevant. I. Theodossiou’s findings surprised me for a few reasons. Firstly, I was surprised by his conclusions about gender, and the fact that women reported being less affected by unemployment. In a society that generally stereotypes women as mentally unstable, Theodossiou’s findings clearly exposed this as a stereotype. Additionally, given the prominence of mental health awareness among young people, I was surprised that Theodossiou’s results showed middle-aged people as having higher odds of poorer mental health. These conclusions, which refuted the generalizations that I held, reminded me of Devin’s comment that the study was statistical, not based on a theory or preconceptions. By taking a statistical approach, Theodossiou was able to avoid biases in his data collection and analysis. His lack of bias is perhaps what allowed his findings to counter many stereotypes about those who experience mental health conditions.