Insignificant Data, Significant Conclusions


Something I found interesting about the article “Using Metadata to Find Paul Revere” was how metadata can be used not to show minute details, but rather to reveal the sweeping relationships of a social network. In this particular article, metadata is used to make broad thematic connections between things like people and the organizations they were involved in. Another thing I found interesting was that the conclusions drawn from the metadata in this instance rely on “no actual conversations” or transcribed meetings. Very general information can be used to reach significant conclusions. This seems especially important in areas where specific information is difficult to come by, which is often the case with less prominent historical subjects.

 

J-OS also mentioned that these small pieces of data, which may seem very basic, can actually be extremely useful in visualizing social networks and the connections between people. Such analysis can draw valid conclusions and help identify the protagonists in a vast social network.
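The mechanics behind this kind of inference are surprisingly simple. Here is a minimal sketch in plain Python (the people and organizations are made up, not the article’s actual dataset): given nothing but membership metadata, counting the organizations each pair of people shares already surfaces who sits at the center of the network.

```python
from itertools import combinations

# Hypothetical membership metadata: person -> organizations they belonged to.
memberships = {
    "Revere": {"StAndrewsLodge", "NorthCaucus", "LondonEnemies"},
    "Warren": {"NorthCaucus", "LondonEnemies"},
    "Adams":  {"NorthCaucus"},
    "Church": {"StAndrewsLodge", "LondonEnemies"},
}

# Co-membership ties: for each pair of people, count shared organizations.
# (Equivalent to multiplying the person-by-organization matrix by its transpose.)
ties = {
    frozenset((a, b)): len(memberships[a] & memberships[b])
    for a, b in combinations(memberships, 2)
}

# Whoever accumulates the most shared memberships is the network's hub.
centrality = {p: sum(n for pair, n in ties.items() if p in pair)
              for p in memberships}
print(max(centrality, key=centrality.get))  # prints "Revere" for this toy data
```

No conversations or meeting transcripts appear anywhere in the input; the “who knows whom” picture falls out of the membership lists alone.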

 

Issues with Topic Modeling



Something that really stuck out to me in Meeks and Weingart’s article “The Digital Humanities Contribution to Topic Modeling” was the difficulties they highlight in using topic modeling. The first point that drew my attention was the idea that “what we might have identified as cohesive ‘topics’ are more complex than simple thematic connections”. It makes sense that people may assume the topics they extract from modeling represent cohesive ideas and themes running through a piece of literature, but in reality a discerned topic can be very broad. Topics lead into subtopics, and those subtopics have subtopics of their own. Modeling may give the umbrella topic of a piece, but it often lacks specificity about the subjects addressed within that larger topic. This leads into the next point I found interesting: that “different methodological choices may lead to contrasting results”. People use different code and methods to extract topics, and even slight differences can lead to varying conclusions about the text. This creates ambiguity: which topic model is more representative? Which is more valid, and why? These questions are all debatable, which makes any conclusion drawn from a topic model somewhat vague.
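The sensitivity to methodological choices is easy to see even without a full topic model. A toy sketch in plain Python (the corpus and stopword list are made up, and this is simple word counting rather than a real topic model, but the same sensitivity applies): changing one preprocessing decision changes which words surface as the “topic”.

```python
from collections import Counter

# Hypothetical two-document corpus.
docs = [
    "the people of the town met at the town meeting",
    "the harbor and the tea and the people of the town",
]

def top_words(docs, stopwords=frozenset(), n=3):
    """Rank words by frequency across the corpus, after a preprocessing choice."""
    counts = Counter(w for d in docs for w in d.split() if w not in stopwords)
    return [w for w, _ in counts.most_common(n)]

# Choice A: no stopword filtering at all.
print(top_words(docs))
# Choice B: filter with a (hypothetical) stopword list.
print(top_words(docs, stopwords=frozenset({"the", "of", "and", "at"})))
```

Same texts, two defensible preprocessing choices, two different “topics”; which list better represents the corpus is exactly the kind of debatable question the article raises.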

 

In response to my colleague CD: the line “no amount of counting can produce meaning” also resonated with me. The article makes great points about the uses of topic modeling and how effective it can be at certain tasks, but the reality remains that, as you said, “the power of interpretation and meaning still lies in the hands of the individual”. Great point; I agree fully!

Analysis May Be Computerized, but Meaning Remains Human



Sinclair and Rockwell’s piece on “Text Analysis and Visualization” emphasizes a recurring theme of text analysis: the inhuman nature of the process. They recognize the necessity of text analysis, mentioning that some 200 billion emails, 5 billion Google searches, and hundreds of hours of video are produced every day. The organization, storage, and retrieval of all that material is only possible through text-based processes and searches. However, the authors also note text analysis’s recurring limitation: “[It provides] a snapshot, but [doesn’t provide] exploration and experimentation”. The reality remains that “text analysis” uses the word analysis loosely. The analysis of the physical text itself and the statistical modeling of words can easily be computerized, but the ability to discern meaning and draw conclusions from text remains a human trait. This is very significant to me because it reassures me that the human mind is unique and irreplaceable. We can use computers to find the text we are looking for, or to determine which texts in a vast collection are worth our time, but our ability to contextually analyze text and find true meaning is seemingly unmatched and remains human.
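The “organization, storage, and retrieval” side really is mechanical. A minimal sketch in plain Python (the documents are made up) of the inverted index behind text-based search shows the gap: the machine retrieves matching documents instantly, yet nothing in it understands a word of them.

```python
from collections import defaultdict

# Hypothetical corpus: document id -> text.
docs = {
    1: "metadata can reveal a social network",
    2: "topic modeling finds themes in text",
    3: "a search engine retrieves text by keyword",
}

# Inverted index: word -> set of document ids containing it.
index = defaultdict(set)
for doc_id, text in docs.items():
    for word in text.split():
        index[word].add(doc_id)

def search(*words):
    """Return ids of documents containing every query word (pure string match)."""
    sets = [index.get(w, set()) for w in words]
    return set.intersection(*sets) if sets else set()

print(search("text"))             # documents 2 and 3 contain "text"
print(search("text", "keyword"))  # only document 3 contains both words
```

The retrieval is a snapshot in exactly the authors’ sense: deciding which of the returned documents actually matters, and why, is still left to the reader.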

 

In response to my colleague NL, I too find it very interesting how people still communicate primarily through text. You mention the text analysis used in toy commercials and how word-cloud visualizations show the major role certain words play in advertising. Although text is the most frequent means of communication, I wonder whether it remains the most effective. Visual and audio communication can often portray emotion in ways that textual communication cannot. I don’t have an answer to this question, but it’s something to think about!

 

Computers’ Difficulty with Humanness



Something I found very interesting in Jeremy Binder’s article about text mining is that computers seem to have great difficulty dealing with the human reality of language. Binder argues that when studying literary and cultural texts, text-mining software can pull out key words or sentences, but its analysis focuses primarily on the literal meanings of words. This could lead to many misinterpretations: language and slang change so frequently over time that evaluating texts on literal meaning could lead to false conclusions about the information presented. As a result, as Binder recognizes, text mining is often better utilized as “statistical methods in applications like search engines, spellcheckers, autocomplete features, and computer vision systems”. This makes sense, because these applications don’t need to account for the fluid nature of language; they work strictly from spelling or literal meaning.
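Binder’s point about where statistical methods do succeed can be made concrete. A minimal autocomplete sketch in plain Python (the words and frequencies are made up): it ranks candidates purely by spelling and past frequency, with no notion of what any word means, which is exactly why this kind of application sidesteps the interpretation problem.

```python
from collections import Counter

# Hypothetical frequency data: how often each word was typed before.
freq = Counter({"boston": 40, "both": 25, "bottle": 10, "harbor": 30})

def autocomplete(prefix, k=2):
    """Suggest the k most frequent previously seen words starting with prefix.
    Purely statistical: matches spelling, never meaning."""
    matches = [w for w in freq if w.startswith(prefix)]
    return sorted(matches, key=lambda w: -freq[w])[:k]

print(autocomplete("bo"))  # ['boston', 'both']
```

If “boston” picked up a new slang meaning tomorrow, this code would behave identically; that indifference is a feature for autocomplete and a flaw for literary analysis.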

 

 

In response to “Money=Happiness?” by ZC: I agree! The study presents a convincing case at first, but when you really break down the method of data collection, it starts to sound much less convincing. You also raised a point I had overlooked: the sample size is far too small. I agree; good point.

 

Is This Study Even Relevant?



I had difficulty recognizing the relevance of this study for two reasons. First, it was conducted in 1992. I would find it extremely interesting to run the study again today and see how the results differ. My hypothesis is that the original study is no longer relevant, because the social and economic climate of 1992 is drastically different from today’s. Mental health issues are now far more prevalent and seem to affect a larger percentage of the population to a greater degree, rendering the study’s findings inapplicable today. The second issue I had, which is unavoidable, is simply what the study measures. It asks about people’s personal mental wellbeing, a measure I find very subjective. Two people could be in exactly the same situation, and one could be perfectly happy while the other is utterly distraught. Using this data to find general trends has the potential to be very misleading.

 

Can We Trust Anything?



Something I found interesting in Guinnane and O’Grada’s “Mortality in the North Dublin Union” is the authors’ statement that “no measure is entirely immune to outside conditions”. If data itself is a collection of measurements, whether direct or indirect, then no data is immune to outside conditions. If that’s the case, then theoretically no data is truly independent. And if all data is affected or influenced in some way by outside conditions, can any data be truly representative of the situation it’s attempting to model? How do we know which data to trust and which not to trust? People often try to draw conclusions from data that seems representative but may not be. How do we mitigate the effects of these outside conditions and preserve the purity and trustworthiness of our data?

 

Reading Response 1



One point I found interesting in Engerman and Fogel’s “Guest Editors’ Forward” is how they reference an analysis of data for Trinidad and seem to claim those results have significance on a global scale. I find this very misleading, because Trinidad is a small island with its own unique population, healthcare system, diet, and cultural traditions, all of which have specific effects on its specific population. The island is not at all a representative sample. How can the authors claim that trends found in the Trinidad population are significant for, or at all representative of, trends in the global population? I found this point unconvincing; even if the island’s population trends are true, the sample is too unrepresentative to support global claims.

Reading Response 2



Something I found interesting in “The Changing State of Recidivism” is that the method of data collection suggests, at least to me, that there could be biases. The author states that “The federal Bureau of Justice Statistics collects data submitted voluntarily by state departments of corrections and parole”. Voluntary submission could introduce bias, because states that choose not to report their data could be doing so for a specific reason. As a result of voluntary submission, “only 23 states provided data for the entire 2005-15 time frame”. This shows how incomplete the collected data is, as not even half of the states reported. That said, maybe the data from the states that didn’t report is consistent with the states that did. But we don’t know, because we don’t have that data, and I therefore find it safer to be skeptical and make no assumptions about the data than to risk making false ones.