Text Mining and Language Standardization


Jeffrey M. Binder’s ‘Alien Reading’ introduces us to the controversial and uncharted world of text mining and language standardization. In an age where written information is growing explosively, the ability to quickly break down, categorize, and locate snippets of text is extremely compelling for researchers and linguists. However, the difficulty of this task lies in the fluidity of language itself. Converting language into data for statistical analysis is inherently problematic, because language is dynamic and constantly changing. What a word or phrase means to one person may mean something completely different to someone else. Thus, creating a method of standardization is controversial. The same issue appears whenever a model “overfits” language, fitting one fixed set of usages so closely that it misreads others. Text mining and language standardization need to strike a balance: the technology must be fast and conclusive enough to be useful while still accounting for the ever-shifting nature of language.

In addition, text mining faces issues of context. When models rely on words, their spellings, and their respective definitions, the algorithms run into trouble determining which meaning is actually intended. This phenomenon surfaces in Matthew Jockers’s book Macroanalysis. We see a “particular use of stream [that] is not related to the “jet stream” or to the “stream of immigrants” entering the United States in the 1850s.” Rather, this stream refers to running water.
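
To make the context problem concrete, here is a minimal sketch in Python. The three sentences are invented for illustration (they are not taken from Macroanalysis), and the helper names `tokens` and `context` are hypothetical. The sketch only shows that a plain word count treats every “stream” as the same thing, while the neighboring words are what tell the senses apart.

```python
import re
from collections import Counter

# Invented example sentences, one per sense of "stream".
sentences = [
    "The jet stream shifted south over the winter.",
    "A stream of immigrants arrived in the 1850s.",
    "She crossed the cold stream on foot.",
]


def tokens(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


# A simple word count sees three identical tokens and no difference in meaning.
counts = Counter(word for sentence in sentences for word in tokens(sentence))
stream_count = counts["stream"]


def context(text, target, window=2):
    """Return the words within `window` positions of each use of `target`."""
    words = tokens(text)
    return [
        words[max(0, i - window): i + window + 1]
        for i, word in enumerate(words)
        if word == target
    ]


for sentence in sentences:
    print(context(sentence, "stream"))
```

The count alone cannot separate weather, migration, and running water. Looking at the surrounding words helps, but someone still has to decide how much context is enough, which is where human judgment comes back in.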

With the issues of overfitting and context misjudgment, these text mining algorithms face serious obstacles. If they continue along this path without serious consideration and critical analysis by the humans on the other side, these algorithms could be responsible for a great deal of confirmation bias down the line. One could easily imagine an algorithm sacrificing nuance for efficiency, leading to a serious misuse of information.

Words?


In the article “Words that Have Made History, or Modeling the Dynamics of Linguistic Changes,” Maciej Eder explores how data collected on words in different languages can show how a language has changed over time. The reader is informed of the methods used to find these changes: the researchers first set a hypothesis about how a language changed, and then they take data from before and after the time of the supposed change to see whether the language really did change, based on randomly selected words. The article then goes on to explain that even with this data it is tough to recognize whether there was a language change, because if the researchers are taking random words and the change is based on a single word, the data may be skewed. I think that exploring language change is an extremely interesting thing to use data for. I would never have thought that trend lines could be applied to the English language. In JN’s post they address how studies can sometimes be irrelevant by saying, “I think using this data to find general trends has the potential to be very misleading.” I think this may apply to these trend lines. I feel that taking random words to see changes in certain words or certain parts of the language may not be very effective, and may create data that we draw conclusions from whether or not those conclusions make sense.
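
As a rough illustration of what a trend line over word frequencies looks like, here is a minimal Python sketch. It is not Eder's method. The years and frequencies are invented, and `linear_trend` is a hypothetical helper that fits an ordinary least-squares line.

```python
def linear_trend(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Invented data: uses of one word per 10,000 words at five points in time.
years = [1800, 1850, 1900, 1950, 2000]
frequency = [12.0, 10.0, 8.0, 6.0, 4.0]

slope, intercept = linear_trend(years, frequency)
print(f"change per year: {slope:.3f}")
```

A single word with a steep slope like this one could dominate a small random sample of words, which is the skew the article warns about.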

Reading Response #3: The effects of low-pay and unemployment on psychological well-being: A logistic regression approach


The scientific study by I. Theodossiou, named “The Effects of Low-pay and Unemployment on Psychological Well-being: A Logistic Regression Approach,” aims to show how the psychological well-being of individual employees may affect their productivity. The main problem that I have with this study is that Theodossiou attempts to quantify human emotions, which he himself acknowledges change on a day-to-day basis. I have to agree with one student in my class (initials JB), who wrote, “I still find myself uncertain of how much this data is really telling us.” For example, Theodossiou decided to correlate all of the test subjects’ emotions with the amount of money they get paid, but there may also be external factors that affect productivity in the workplace (e.g., marriage issues).

On Psychological Well-Being and Unemployment


Those currently unemployed are more likely to report a lower sense of well-being, and thus unemployment is likely not voluntary, concludes a study titled “The effects of low-pay and unemployment on psychological well-being: a logistic regression approach” by Theodossiou. Through self-reported data and a wide array of dummy variables, the study finds correlations between age/gender and the mental toll of being unemployed.

While I found many of these reported findings believable and perhaps even intuitive, I still find myself uncertain of how much this data is really telling us. First off, self-reported data always raises some questions about validity. In what fashion and context is it being obtained? We see the literal text of the question they were asked, but the answer to the question “Have you recently been feeling reasonably happy, all things considered?” could change by the hour for any given person. Also, who is being asked this? Just anyone 16 to 91 years old? How is this sample found? Are those who are more likely to respond to this sort of study more likely to answer in a certain way? We cannot say from this piece.

Also, the conclusion that if people don’t like doing something, and they do that thing, then it must be involuntary, seems unconvincing. That is not to say I believe this conclusion is necessarily incorrect, but human emotions and reasoning are not perfect. It may seem fair to assume that people are unemployed involuntarily, but using the fact that unemployed people are unhappy to show that it is against their will doesn’t feel like it tells us much. Would it be expected that people who are unemployed (which means they are actively searching for a job) would be much happier? I don’t think anyone would expect that. I believe this study could have told us more with some more data.

 

Reading Response 10/11 (Theodossiou)


I found this reading very engaging! The topic of how unemployment affects psychology is very interesting. However, this seems to me to be a hard field to collect data on. Everyone is wired differently psychologically, so a broad logistic regression on a sample of 7,897 individuals from a 1992 study makes me skeptical. It was also interesting that this study wasn’t guided by a formal theory and that economic theory provided little guidance. This was surprising because the whole study is based around the economic status of a person, so one would think that a theory would support some of the findings. I was also confused by some of the language that was used, such as “The interpretation of each continuous quantitative risk factor is that the antilog of the logistic coefficient represents the estimated increase in the odds of being in each subsequent range of the relevant measure of psychological well-being per unit increase in the particular characteristic such as age in years.” This is a lot of information that is strung out and hard to interpret (at least for me it is). Some of the work that is cited throughout the text makes me question the validity of this paper further, specifically when the author goes into how additional characteristics must be considered to determine an individual’s level of psychological well-being. All the sources say different things in regard to age and well-being.
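
One way to unpack the quoted sentence: the “antilog of the logistic coefficient” is simply the exponential of the coefficient, usually called an odds ratio. Here is a minimal Python sketch of that arithmetic. The coefficient value is hypothetical and is not taken from Theodossiou's tables, and `odds_ratio` is an illustrative helper name.

```python
import math


def odds_ratio(coefficient, units=1.0):
    """Multiplicative change in the odds for a `units` increase in a predictor."""
    return math.exp(coefficient * units)


# Hypothetical coefficient for age in years (not a value from the paper).
beta_age = 0.02

one_year = odds_ratio(beta_age)
ten_years = odds_ratio(beta_age, units=10)

print(f"{one_year:.4f}")   # 1.0202: the odds rise by about 2% per extra year of age
print(f"{ten_years:.4f}")  # 1.2214: the odds rise by about 22% per extra ten years
```

An odds ratio above 1 means the odds go up as the characteristic increases, below 1 means they go down, and exactly 1 (a coefficient of zero) means no change.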

6.2 Reading


The article, “The effects of low-pay and unemployment on psychological well-being: A logistic regression approach,” discusses the correlation between unemployment and mental distress. The results showed that, generally, if you are unemployed, you are more likely to encounter mental distress. One of my peers mentioned that, “Since this study was conducted in a relatively strong economic period, [he/she] would assume most people out of the labor market would be due to choice and their psychological well-being would not be affected. [He/She] was surprised to see that being out of the labor force had significant effects on well-being for certain subgroups.” However, I think that no matter how much an economy is prospering, there will still be those who face unemployment unwillingly. While there may be people who choose not to work, there is always a significant group that is struggling due to unemployment. This will ultimately leave an impact on you, as seen in the data given in the article.

Gender discrimination


One of the main reasons why people struggle to be happy or find happiness is not the “tragedies” that happen in their lives but rather the social stigmas attached to what goes on in people’s lives. No matter what gender, people are going to be disappointed with themselves if they get let go or become unemployed. The data suggesting that men become more depressed and anxious when they are unemployed speaks to the social construction that men are supposed to provide and be the main source of income for their family. Although gender norms have broken some barriers, society still needs to do a better job of perceiving both genders as equal. The post “Unemployment and Psychological Well-being” brought up a very valid point about how in-depth this survey was. In my opinion, it is hard to conduct this kind of study and collect this data because everyone has different struggles.

Money = Happiness?


Again, this looks like a study that claims to do great research about a cool topic only to be full of subjective questions, which cause misleading data. Theodossiou’s work in studying the relationship between employment status and mental well-being is definitely intriguing, but I question many of the methods. To begin with, the data used comes from a sample of 7897 individuals, a sample that is too small in my opinion. Next, the respondent questions are hard to interpret and leave room for confounding variables. One of the questions is “Have you recently been feeling reasonably happy, all things considered?” I take offense to this question because it can elicit responses unrelated to employment-based happiness. A recent death in the family, as well as many other factors, plays into psychological well-being, and I’m not seeing work done to control for them. Even the scale is subjective, as pointed out by cmcclancy. Phrases like “as usual” are open to interpretation and vary from person to person. All in all, not a fan of the work done in this paper.

Effects of Low Pay and Unemployment on Psychological Health Reflection


In the article, “The Effects of Low Pay and Unemployment on well being: a logistic regression approach,” the author explores the correlation between unemployment and mental distress. Through his data, he finds that people who are unemployed have a greater chance of developing depression, anxiety, and low self-esteem. He explains that even a low-paying job can increase someone’s general happiness. The author took his data from almost 8000 different individuals between 16 and 91 through a survey with six different questions regarding how the individual was feeling about life. The author then correlated the results with unemployment using a logistic regression approach. However, I was confused as to how he determined how many of the individuals in the data were unemployed. I found this part of the article unclear. Anyhow, taking his results on trust, he found that unemployed individuals have a higher chance of having psychological issues. In my opinion, his argument was not incredibly persuasive. I thought it was pretty intuitive that unemployment is correlated with low self-esteem, but it was not his data or argument that persuaded me. I found the article quite interesting, but it did not change my stance on the subject in any way.

Sadly, I am not sure how to find everyone else’s blog posts, so I am not sure how to respond to them! I will ask in class!

Class 6.2 Reading Response – Unemployment and Well-Being


This article was very informative in terms of how the status of being “unemployed” can have a really negative impact on people’s psychological well-being in comparison to somebody who is employed in a low-paid job. One thing that I mentioned in my annotation for this article was the idea of a “discouraged worker,” an economic concept which describes an adult between 18 and 65 years old (legal employment age) who has not found employment after long-term unemployment. This person was originally actively seeking employment, but eventually became discouraged to the point where they are considered to have left the workforce because they stopped seeking employment. I’d be curious to see how this concept would play into the article, because it would most likely represent the lowest psychological well-being: somebody who has not only become unemployed, but has been classified as leaving the workforce entirely. I think in general, the article was well-written and used survey data really well. Building off this blog post (http://courses.shroutdocs.org/dcs104-fall2018/2018/10/08/simpsons-paradox-is-the-data-telling-the-right-story/) from one of my classmates, I believe that it addresses a really important concept of how data can tell a story. In the context of this article, the data is telling a story, but is the story complete without considering the concept of a “discouraged worker”?