Blog Post #8: What you can, can’t and shouldn’t do with Social Media Data


What you CAN Do:

  1. Collect data MUCH faster and more accurately than before
  2. Use the Word Mapper app, which is very good for linguistic data collection
  3. Get rid of…
    • Bradley effect: People tend to tell researchers what they think they want to hear
    • Response bias: The sample of people willing to do an experiment/survey differs in a meaningful way from the population as a whole
    • Observer’s paradox/Hawthorne effect: People change their behavior when they know they’re being observed

What you CAN’T Do:

  1. Can’t be sure what the source of your data is: You have no clue what general demographic categories your sample represents.
  2. Can’t escape inherent sampling bias: Social media users tend to be from Western, educated, industrialized, rich and democratic (WEIRD) societies.
  3. Can’t violate developer’s agreements: Developer’s agreements vary between platforms, but most limit the amount of data you can fetch and store, and whether and how you can share it with other researchers (e.g., 50,000 Tweets for Twitter).

What you SHOULD Do:

  1. Respect the wishes of users. There are three principles of ethical human subjects research:
    1. Respect for Persons: People should be anonymous and be guaranteed protection
    2. Beneficence: Maximize possible benefits, minimize possible harms (NEVER harm)
    3. Justice: Both the risks and benefits of research should be distributed equally.
  2. Safeguard their best interests: Be careful with the data you take, and make sure that it won’t hurt ANY of the subjects personally.
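
One practical way to act on the "respect for persons" point is to strip usernames out of collected posts before analyzing or sharing them. Here is a minimal sketch in Python of how that might look. The helper name `pseudonymize`, the record layout, and the sample posts are all made up for illustration, and this is only pseudonymization: the text of a post can still identify its author if someone searches for it.

```python
import hashlib
import secrets


def pseudonymize(handle: str, salt: bytes) -> str:
    """Hypothetical helper: swap a user handle for a salted SHA-256 label."""
    digest = hashlib.sha256(salt + handle.lower().encode("utf-8")).hexdigest()
    return "user_" + digest[:12]


# Made-up sample records, not real social media data.
posts = [
    {"handle": "@Alice", "text": "y'all coming tonight?"},
    {"handle": "@bob", "text": "you guys coming tonight?"},
    {"handle": "@alice", "text": "see y'all there"},
]

# Random salt: keep it secret and store it separately from the data.
salt = secrets.token_bytes(16)

# Same user always gets the same label, but the handle itself is dropped.
cleaned = [
    {"user": pseudonymize(p["handle"], salt), "text": p["text"]}
    for p in posts
]

for row in cleaned:
    print(row)
```

Because the same handle always maps to the same label, you can still group posts by speaker without keeping the original usernames in the dataset.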

Blog Post #7: About Sharing Things


This article, posted by the FORCE11 group, outlines the rules, regulations, and goals for data that is shared. I think it is really great that a group like this stepped up to the task and made it possible for everyone, humans AND machines, to use data that is published out into the world. I agree with my classmate initialed RF (post is here: http://courses.shroutdocs.org/dcs104-fall2018/2018/11/28/e-universe/), who said that these bylaws allow the data universe to be equal, and equal=good so yay!

Reading Response #6: On the 10% Rule


This study, carried out by Vejune Zemaityte, Deb Verhoeven, and Bronwyn Coate, discusses the general implications of the 10 percent rule in the Australian movie industry. Most movie producers use the rule to estimate the income of an American film released in Australia: they assume that it will bring in roughly 10% of the revenue it did in America. Zemaityte and her colleagues carried out this study to show that the rule is quite inaccurate, essentially invalidating its use for this purpose.
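
To see what testing a rule of thumb like this could look like, here is a small Python sketch. It is not the authors' method. The ratio is a parameter, set to 0.10 on my assumption that the rule means Australian takings are about a tenth of US takings, and the film names and box office figures are invented for illustration only.

```python
def rule_prediction(us_gross: float, ratio: float = 0.10) -> float:
    """Predicted Australian gross under the rule of thumb (ratio is an assumption)."""
    return us_gross * ratio


def relative_error(predicted: float, actual: float) -> float:
    """Signed error of the prediction as a fraction of the actual figure."""
    return (predicted - actual) / actual


# Invented figures in millions of dollars: (title, US gross, Australian gross).
films = [
    ("Film A", 200.0, 25.0),
    ("Film B", 50.0, 4.0),
    ("Film C", 120.0, 12.0),
]

for title, us_gross, au_gross in films:
    predicted = rule_prediction(us_gross)
    error = relative_error(predicted, au_gross)
    print(f"{title}: predicted {predicted:.1f}, actual {au_gross:.1f}, error {error:+.0%}")
```

If the errors for real films were spread widely around zero, that would support the argument that the rule is too rough to rely on for any single film.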

Reading Response #5: A Report Has Come Here


In the article “A Report Has Come Here”, Lauren Klein outlines how topic modeling works, in contrast to the reading we had by Meeks and Weingart. Klein explains that by using programming languages such as R, we can surface the main ideas in data collected from papers, essays, interviews, etc. As one of my classmates wrote in their blog post (http://courses.shroutdocs.org/dcs104-fall2018/2018/10/30/a-report-has-come-here-the-bright-side-of-text-analysis-and-visualization-in-digital-humanities/), searching for main ideas without these programs would be like finding a needle in a haystack, but with them it is much easier.

Reading Response #4: Words That Have Made History, Or Modeling The Dynamics Of Linguistic Changes


In his paper “Words That Have Made History, Or Modeling The Dynamics Of Linguistic Changes”, Maciej Eder discusses the history of quantitative modeling of words, along with the new technologies and the advancements that have come because of them. As my classmate (Ben Lyons) wrote in his journal entry, the data collected from text mining can, if done correctly, be very effectively manipulated by whoever is running the study. So, people using this methodology for their studies need to be careful to capture the full picture, instead of only what they are looking for.

Reading Response #3: The effects of low-pay and unemployment on psychological well-being: A logistic regression approach


The scientific study by I. Theodossiou, “The Effects of Low-pay and Unemployment on Psychological Well-being: A Logistic Regression Approach,” aims to show how the psychological well-being of individual employees may affect their productivity. The main problem that I have with this study is that Theodossiou attempts to quantify human emotions, which he himself acknowledges change on a day-to-day basis. I have to agree with one student in my class (initials JB), who wrote, “I still find myself uncertain of how much this data is really telling us.” For example, Theodossiou decided to correlate all of the test subjects’ emotions with the amount of money they get paid, but there may also be external factors that affect productivity in the workplace (e.g., marriage issues).

Reading Response #2: Mortality in the North Dublin Union during the Great Famine


In their paper “Mortality in the North Dublin Union during the Great Famine”, Timothy Guinnane and Cormac Ó Gráda intend to show a correlation between the mortality rates in workhouses in Ireland and the mortality rates due to the Great Famine, a disaster that fell essentially on people of the lower class. While reading this study, I found it quite eye-opening that they used the mortality rates from only one workhouse, instead of many, which would have minimized the effect of outliers. One of my fellow classmates brought up an interesting point as well, writing that, “Data, here too, could be vary depending on perspective”. This brings up a point that I hadn’t really thought about until now: if the people directing the study wanted to make a point, couldn’t they just choose one specific workhouse in order to prove it?

Blog Post #1: On the PEW study


PEW’s overview of the methodology for a study intended to show the decrease in re-incarceration rates in America’s prisons was clearly extremely biased and unprofessional. I just can’t get over the fact that a professional organization for these types of things can be so pathetic in its methodology.