In response to The Guiding Principles For Findable, Accessible, Interoperable And Reusable Data


In reading “The Guiding Principles for Findable, Accessible, Interoperable and Reusable Data,” I learned about how to make data available and how to put data to use fairly. I believe the information put forth by this article is incredibly important and should be seen as a way to combat the rise of “fake news” in our current social and political spheres. The article stresses that data should be findable, accessible, interoperable and reusable to all in order for it to be considered FAIR data. Although I agree completely, I cannot help but think about top secret documents and the information they hold, and wonder: because their data isn’t accessible to the greater population, is that data considered illegitimate?
My fellow classmate RC comments, “Historically, data is often tightly held and those who create data can be biased in their creation and analysis so opening up data flows will create more equitable data”. Though I agree with this statement, could we not hope that a CIA officer dealing with top secret documents would analyze the data better than I would? Because I do not have access to it, is it that much more likely to be open to bias?

Locating Place Names At Scale: Using Natural Language Processing To Identify Geographical Information In Text


In the article “Locating Place Names At Scale: Using Natural Language Processing To Identify Geographical Information In Text,” Lauren Tilton discusses understanding historical data differently by using programming to read that data in a broader context. The author gives the example of geographic place names: at first glance, a historical record reading “Paris” could mean Paris, France or Paris, Texas, and only with broader context can we deduce which is which. Computer programming has streamlined this process through Named Entity Recognition (NER), a natural language processing technique, applied here to interviews about American experiences during the New Deal to give a broader understanding of movement and place in America. I echo DA’s sentiments when they say, “I think this article was a great way to explore the fault in some programming and show how searching for specific things does not always get you the results you need. Sometimes we need to look outside what we are actually looking for.”
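
The article itself does not include code, but as a minimal sketch of how NER works, here is what tagging place names looks like with the spaCy library (my own choice of tool; the authors may have used something else entirely):

```python
# A minimal NER sketch using spaCy -- my own illustration, not the authors'
# actual pipeline. Requires: pip install spacy
# and: python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

# A made-up sentence in the spirit of the article's "Paris" example.
doc = nlp("She left Paris, Texas, and settled near New Orleans during the New Deal.")

# "GPE" is spaCy's label for geopolitical entities: cities, states, countries.
for ent in doc.ents:
    if ent.label_ == "GPE":
        print(ent.text)
```

Notice that the tagger only finds the place names; deciding whether “Paris” means Paris, France or Paris, Texas still requires the broader context the article stresses.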

Sentiment Analysis and Subjectivity


I found the piece “Sentiment Analysis and Subjectivity” by Bing Liu interesting, as it really expanded on the idea of computational text analysis as a means of analyzing data. When we first began to look into text mining in class, I was a little confused as to how computers were able to take something like a news article or opinion piece, something so subjective and in the “grey area,” and turn it into something so black and white without going through the human thought process of synthesizing the information. Liu’s writing helped me understand this, as he talks about how exactly computers synthesize information for “opinion mining.”

Opinion mining works by examining each individual sentence, and the wording within that sentence, against subjectivity classifications that help determine the overall consensus of the written piece. After learning how the actual process works, I find it curious to compare “Sentiment Analysis and Subjectivity” with what a peer of mine (BM) said in their post on “Text Mining/Language Standardization”: that a “computer will never understand the emotional values and ever-changing expressions of human beings”. From what I read today, I think computers are getting incredibly close to understanding the emotional values of human beings, thanks to the work done in opinion mining and the demand for a better way to synthesize the public’s opinions of issues and products at a larger scale. Although there is still a long way to go, I think this significant headway will be the basis for future breakthroughs.
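
Liu is surveying a research area rather than one piece of software, but a minimal sketch with the TextBlob library (my choice of tool, not anything Liu names) shows what sentence-by-sentence polarity and subjectivity scores look like:

```python
# Sentence-level sentiment scoring with TextBlob -- an illustration of the
# kind of classification Liu describes, not his actual method.
# Requires: pip install textblob && python -m textblob.download_corpora
from textblob import TextBlob

review = TextBlob(
    "The battery life is great. The screen is terrible. The phone was released in 2019."
)

# polarity runs from -1 (negative) to +1 (positive);
# subjectivity runs from 0 (objective) to 1 (subjective).
for sentence in review.sentences:
    print(sentence, sentence.sentiment.polarity, sentence.sentiment.subjectivity)
```

The two opinion sentences should score as strongly subjective with opposite polarity, while the release date should score near zero on both, which is exactly the sentence-level sorting of opinion from fact described above.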

Feminist Data Visualization!


Before reading “Feminist Data Visualization” by Catherine D’Ignazio and Lauren F. Klein, I was curious how feminism, the advocacy of women’s rights on the basis of the equality of the sexes, could be intertwined with data visualization. I saw feminism as something that ebbed and flowed and was maybe a little subjective at times. In contrast, I saw data visualization as something purely objective. I thought a mix of the two would be confusing and, honestly, a stretch. After reading “Feminist Data Visualization,” I now understand I was very wrong. Feminist thought braided into data visualization advances women’s rights in a way that is both modern and practical. What I found most interesting was the authors’ focus on epistemology: who is included in dominant ways of producing and communicating knowledge, and whose perspectives are marginalized. From this focus on epistemology, feminist data visualization holds six principles constant when discussing data synthesis and visualization: Rethink Binaries; Embrace Pluralism; Examine Power and Aspire to Empowerment; Consider Context; Legitimize Embodiment and Affect; and Make Labor Visible. I think these principles are a great way to evaluate data as a means to better both people and society. As MLC said, “critical thinking about all of these categories will allow the audience and the author to remove some of the societal inequalities that all STEM fields currently have.” At first I saw these principles as a means to skew the objective information available, but with my growing understanding that data is constantly skewed and can never be truly objective, I believe that having concrete principles in place when someone is working with data leads to a more positively skewed outcome.

Social Media Analysis of Historical Figures


The article “Using Metadata to Find Paul Revere” by Kieran Healy was very interesting because it gives the reader an understanding of how the methods of social network analysis can be applied to historical data. Paralleling people on social media today with the historical figure Paul Revere drives home the point that today’s information-stripping technologies can be used to find important people of the past through patterns of repetition. The relationships between people, the interconnections between groups, and the overarching figures allow historians to learn who the key players of history were, just as they do today. Also, by chronicling and annotating every social connection between people and groups, the author suggests, you could one day learn more about a person’s social and personal life. My peer, NL, relates this article to the Six Degrees of Francis Bacon project “because they both showed that important people can be found using unbiased network analysis”. The one thing I found a little troubling was how social media analysis would work in the future. Looking at the information we have from the past, only the most famous were written about; only the most important were part of organizations, social groups and clubs. Now, whether someone is the most important or the least important person, there is an enormous amount of information on every human being. How will someone know that Barack Obama is more important than, say, Snooki 100 years from now?
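
Healy’s essay works from real membership lists; the sketch below is a toy version of the same idea using the networkx library (my own tooling choice, with a tiny roster loosely based on the essay’s data, not his full dataset):

```python
# A toy membership-network analysis in the spirit of Healy's essay.
# The roster below is a small illustrative excerpt, not his actual data.
# Requires: pip install networkx
import networkx as nx
from networkx.algorithms import bipartite

# Bipartite graph: people on one side, organizations on the other.
memberships = [
    ("Revere", "NorthCaucus"), ("Revere", "LongRoomClub"), ("Revere", "TeaParty"),
    ("Warren", "NorthCaucus"), ("Warren", "LongRoomClub"),
    ("Adams",  "NorthCaucus"),
    ("Church", "NorthCaucus"), ("Church", "TeaParty"),
]
G = nx.Graph()
G.add_edges_from(memberships)

# Project onto the people: two people are linked, with a weight,
# for every organization they share.
people = {person for person, _ in memberships}
P = bipartite.weighted_projected_graph(G, people)

# Eigenvector centrality surfaces the best-connected individual
# from membership metadata alone.
centrality = nx.eigenvector_centrality(P, weight="weight")
for person, score in sorted(centrality.items(), key=lambda item: -item[1]):
    print(person, round(score, 3))
```

On this toy roster Revere scores highest simply because he shares the most memberships, which is the point Healy dramatizes: metadata alone can single out the key player.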

Lynching, Visualization and Visibility!


In the article “Lynching, Visualization and Visibility,” Lincoln Mullen highlights how data can be perceived differently without context alongside it. In the piece, Mullen looks at historical data on lynchings, a horrific and saddening part of American history. He also highlights the work of another scholar, Mathews, who looked at the same data alongside a strong religious understanding of the act. As a result, Mathews drew a very different conclusion from the historical data than Mullen did. Mathews visualized the data by looking not only at the empirical numbers but also at the history and religious significance behind the act, to understand why it happened. He looked at the trends in the data and took from them not their statistical significance, but what they said about the religious understanding of lynching. I found this wildly confusing at first. Mathews argued that a lack of lynching data could mean, essentially, that more lynchings were happening during that time period, on the understanding that lynching was “a ritual that made power visible, yet its power depended in part on its lack of visibility in the official records”. So, at times when there was less information available about lynchings, more lynchings may have been taking place. To echo what RF said in their blog post, “the context in which we read and visualize data can sometimes be just as powerful as the insights themselves.” Reading this article was incredibly eye-opening, as it calls into question every record ever taken of lynchings and shows that those same records have given us a warped understanding of history.

DCS Response 2 for 10.16.2018


After reading “Alien Reading: Text-Mining, Language Standardization and the Humanities”, I was more aware of how hard it is to fully understand the written word, especially when you are a computer trying to read and synthesize the information in a text. As the article states, computers “tend to privilege the informational over the aesthetic dimensions of language; and they primarily consist of prose”, which makes it extremely difficult to fully automate the understanding of a text. For text mining, technical and informational genres suit the task better, as they are easier to process than multiple styles of writing at once. This topic was really interesting to me as an English major, because my main line of work, currently, is to connect with writers and readers through written communication, whether my own writing or someone else’s. That this connection is something a computer cannot fully automate is wildly interesting, powerful, and honestly comforting. I am happy that the written word cannot be so easily understood. I am happy that it takes more than software to understand humans and their written thoughts.
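
As a small illustration of what the article means by privileging the informational over the aesthetic (my own sketch, using scikit-learn’s bag-of-words vectorizer, which the article itself does not mention): once a text is reduced to word counts, a line of poetry and a scrambled paraphrase of it become indistinguishable.

```python
# Bag-of-words representations discard word order, rhythm, and line breaks --
# the "aesthetic dimensions" the article says text-mining tends to lose.
# Requires: pip install scikit-learn
from sklearn.feature_extraction.text import CountVectorizer

poem = "so much depends upon a red wheel barrow"
scrambled = "a red wheel barrow upon so much depends"

vectorizer = CountVectorizer()
matrix = vectorizer.fit_transform([poem, scrambled])

# The two rows come out identical: to the model, the poem and its
# scrambled paraphrase are the same text.
print(vectorizer.get_feature_names_out())
print(matrix.toarray())
```

Both rows of the matrix are identical, so everything that made the first string feel like a poem has vanished from the representation.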