MS-C – Data Cultures

Posted on November 29, 2018 by MS-C

FAIR!

I thought the article from FORCE11 for today was very interesting, and it seems like the group working on this idea did an in-depth, impressive job coming up with these standards for data on the internet. The FAIR acronym they came up with was clear and well done, giving specifics for each part, and it made me realize the importance of having legitimate data on the internet. I was also intrigued by their definition of a “candidate FAIRport”: 1. Contains FAIR Data Objects (to be judged by the endorsing authority). 2. Provides these Data Objects under well defined accessibility for Re-use. 3. Has a full and open description of all technologies, controlled vocabularies and formats used. Each aspect of data being “FAIR” as they describe it comes with a lot of complications and important side notes, which were interesting and at times confusing to read about. My classmate “TB” said that metadata subjectivity could have used more discussion, and I agree. I hadn’t thought about this before reading TB’s post, but TB said “For things that are objective such as author and character count, other metadata could be less objective”, which convinced me that this part of the article needed more discussion and explanation.

Posted on November 27, 2018 by MS-C

What is Scholarship?

The article for today, “Developing Things: Notes toward an Epistemology of Building in the Digital Humanities”, was extremely interesting to me. It discusses whether building in the digital humanities counts as “scholarship”, a question I went back and forth on myself while reading. At first, I thought it should of course be considered scholarship because of the level of skill and complexity that goes into digital work, but the middle of the article changed my mind a bit, saying “Repairing cars requires a high level of technical skill; the intellectual nature of chess is beyond dispute; mining coal is backbreaking work. No one confuses these activities with scholarship.” This is very true, and it made me think that digital work should not be discredited for its difficulty or impressiveness, but that it may simply not fall under the category of “scholarship”. However, the internet defines the word scholarship as “academic study or achievement; learning of a high level”, which makes me think digital work should be able to fall under that category. I went back and forth a few times and am still unsure what my opinion is, but I would like to read more arguments about this question.

I thought it was interesting that fellow classmate “TB” noted that “it just seems that whatever point they are arguing has little effect on what the results of any research would be”. I agree with this, and I do not yet understand why this question is so relevant if it does not affect any of the data or analysis in these works.

Posted on November 13, 2018 by MS-C

Issues that can arise with lack of specificity in text

For the reading “Locating Place Names At Scale: Using Natural Language Processing To Identify Geographical Information In Text”, I mostly focused on the visualizations and on the importance of specificity. Text that lacks specificity can cause real errors in the data taken from it, which is why NER (Named Entity Recognition) seems extremely important and useful to me for collecting data from text. I liked both of the visualizations presented in this article because they were clear, easy to read, and centered on a specific location. However, I would be interested to see how both of these visualizations would look if the entire country were in the picture. The two questions posed in the post by TB intrigued me: “What if the location found is not what it seems to be? Does the algorithm discard the term if there is too much ambiguity?”. While the process is useful and important, questions like these show that there are definitely possible problems with it.

Posted on November 8, 2018 by MS-C

The Effectiveness of Visualizations with History

For today’s class, I read “Lynching, Visualization, and Visibility”. Regardless of the reasoning of the people who carried out lynchings, lynching is obviously an inexcusable, terrible part of history. Nonetheless, data can be effective in analyzing it for a few reasons. Finding the details and patterns behind how it happened, and being able to look at it in different ways with different visualizations such as the ones shown in this article, can give people a better perspective on the seriousness of the topic and more knowledge about how things like this can be prevented. The visualizations presented in this article definitely show aspects of lynching that many people are not aware of. “KL” also posted about this article, and she used the phrase “thoroughly illustrate the past” to describe the visualizations. I thought this was a great way of expressing how data visualizations about history can be effective.

Posted on November 6, 2018 by MS-C

Women are Important in Data Analysis

For this class, I read “Feminist Data Visualization” by Catherine D’Ignazio and Lauren F. Klein, which was relevant to me as a girl studying in this field. Reading about the feminist approach that they took was interesting because of all the different aspects of feminism that can be analyzed through data, many of which have changed over time: power, context, embodiment. I enjoyed the connection between humanities and visualization that was discussed because, from what I’ve come across so far, the importance of humanities in data isn’t emphasized enough. Contributions from feminist thought can make a big impact. The “Design Process Questions” at the end of each section were good for my critical thinking as I read, and they made me realize how important generating effective questions is when analyzing data.

Posted on November 1, 2018 by MS-C

The Use of Numbers in Data

I read “Using Metadata to Find Paul Revere” for class today and found it extremely interesting. At first, I took note of how they used 0 and 1 to represent whether or not each person was in an organization. This is often done when collecting data, and it led me to wonder why. However, as I kept reading and saw how many other cool things could be done with that simple layout of data, I realized how effective using 0 and 1 was and how effective numbers in general can be. This was seen in the matrix that showed which organizations are linked through the people who belong to both of them. For example, the North Caucus and St. Andrew’s Lodge had three people who belonged to both, so there was a three where those two connected in the matrix. This could not have been shown as easily or clearly without numbers, for example if they had used “yes” and “no” to show whether people belonged to the organizations. This was so interesting to me and gave me a better handle on the use of certain techniques in data, especially techniques that allow you to rearrange the data in so many ways, like what was done in this article.
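To see for myself why 0 and 1 work so well here, the idea can be sketched in a few lines of Python. This is only a minimal sketch with made-up people and organizations (the names and memberships below are hypothetical, not the article's data, and `shared_members` is my own helper, not something from the article). Multiplying two 0/1 entries gives 1 only when a person belongs to both organizations, so adding those products up over all people counts the shared members.

```python
# Hypothetical membership table (not the article's data):
# each row is a person, each column is an organization, 1 = member, 0 = not.
orgs = ["Org X", "Org Y", "Org Z"]
membership = {
    "Person A": [1, 1, 0],
    "Person B": [1, 1, 0],
    "Person C": [1, 0, 1],
    "Person D": [0, 1, 1],
}


def shared_members(membership, orgs):
    """Count, for every pair of organizations, the people who belong to both."""
    rows = list(membership.values())
    n = len(orgs)
    # row[i] * row[j] is 1 only if that person is in both organization i and j.
    return [
        [sum(row[i] * row[j] for row in rows) for j in range(n)]
        for i in range(n)
    ]


shared = shared_members(membership, orgs)
for name, counts in zip(orgs, shared):
    print(name, counts)
```

In this toy table, Org X and Org Y share two people (Person A and Person B), so a 2 appears where they meet. The numbers on the diagonal are simply each organization's total membership. With “yes” and “no” instead of 1 and 0, none of this multiplying and adding would be possible without converting the answers first.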

Posted on October 23, 2018 by MS-C

The Necessity of Text

In the reading “Jockers and Underwood [Text Analysis and Visualization]”, my main reaction was about the purpose of text and its uses in the present, considering how complex and advanced our technological world is becoming. Text communication can be avoided in numerous ways: people can listen to audiobooks instead of reading, and talk-to-text features exist for sending mobile messages. It occurred to me that, over time, text is becoming less and less necessary for people on a daily basis. However, seeing the text visualizations and reading about text analysis and its uses in this article led me to the opinion that text will likely always be crucial, especially for data analysis. The Wordle word cloud visualizations and the Voyant Tools standard reading skin from this article were two text-based figures that stood out to me as extremely effective ways of viewing and analyzing data. Text provides an essential means of analyzing data and deriving meaning from it, in ways that cannot be replaced by processes that avoid text completely.

Posted on October 3, 2018 by MS-C

Prisoners Are People Too

What I found most interesting in “The Changing State of Recidivism” was the table of statistics about small differences in prisoner characteristics at the end of the methodology section, which showed the original offenses, genders, and ages at release for prisoners from 23 states. The table has three columns: the percentage of the group that was released in 2005, the percentage that was released in 2012, and the difference between the two. A few different aspects of this chart stuck out to me. The first was the difference between the male and female release percentages. In both the 2005 release and the 2012 release, the gap between the male and female percentages is approximately 76 percentage points, with males being the higher percentage in both cases. The ratio of males to females being released was extremely high, much higher than I would have guessed, and it was practically the same in 2005 and 2012. In addition, from 2005 to 2012, the male percentage decreased by exactly 0.4 points and the female percentage increased by exactly 0.4 points. This shows that although there is an extremely high male to female release ratio, it is potentially getting lower very slowly over time. The other aspect of the chart that interested me was that the share of releases in the 18-34 age group increased from 2005 to 2012 (by 0.4 points), while the share in the 35-54 age group actually decreased by 3.0 points. For some reason the middle age group of 35-54 makes up less of the released population over time, while the younger age group of 18-34 is very slowly increasing. A final aspect of the chart that stood out to me was that the share of releases for violent crimes increased by 3.0 points from 2005 to 2012, while the share for drug crimes decreased by 4.3 points. Both of these are somewhat significant changes, and they led me to contemplate why each occurred.
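To get a feel for how large that male to female gap is, I worked backward from the approximate 76-point difference. This is a rough sketch of my own arithmetic, not the report's method, and it assumes the male and female percentages add up to 100%.

```python
# Assumption: the male and female shares add up to 100% of releases.
difference = 76.0  # approximate male-minus-female gap, in percentage points

# If male + female = 100 and male - female = difference, then:
male_share = (100.0 + difference) / 2
female_share = (100.0 - difference) / 2
ratio = male_share / female_share

print(f"male share is about {male_share:.1f}%")
print(f"female share is about {female_share:.1f}%")
print(f"about {ratio:.1f} men released for every woman")
```

Under that assumption, the shares come out to roughly 88% male and 12% female, or a little more than seven men released for every woman.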

As for something a classmate said, I would like to reference Mia Gates’s annotation saying “Important to note that the data was submitted voluntarily. That makes me wonder why certain states opted out, and if certain data was omitted.” I thought this was an interesting point, and it made me think about which states were included in the data and why. I noted that only two of the northeast states (New York, Pennsylvania) volunteered their information about recidivism, and I wondered why there was such a lack of representation from the northeast, where I both live and go to school.