Can the Humanities Topic Model?


In the article “The Digital Humanities Contribution to Topic Modeling,” Meeks and Weingart address how topic modeling has become more frequent in the humanities. Their remark that “You’re introduced to topics, and how a computer came to generate them automatically without any prior knowledge of word definitions or grammar” highlights how new topic modeling is to the humanities, and why people can question whether it is really a good fit. They argue that topic modeling is beneficial and could help people make advances in the field. I think it is important that they conclude with the idea that topic modeling cannot take over the humanities; it should simply be a new tool that scholars can use. Because topic modeling is so new to this field, it is important to remember that it should not be the be-all and end-all of analyzing data, and that people will still have to review its results.

J-OS stated, “it’s important to [know] how exactly the topic modeling program is being run to know what exactly the results are saying.” I agree that people have to be extremely careful about which types of analysis they use with which types of data. It shows that, right now, human analysis is still very necessary.

To Model a Topic


The piece “The Digital Humanities Contribution to Topic Modeling” by Meeks and Weingart discusses topic modeling. More specifically, the authors look at the place of topic modeling within the digital humanities. The post also pulls together resources from others about topic modeling. In fact, much of its information depends on links that are external to the piece, so without reading those links it is hard to understand the point the authors are making. The piece also mentions two tools for topic modeling, MALLET and Paper Machines. However, the article offers little explanation of what makes them popular or how they function. Overall, the post talks about the field of topic modeling very generally and relays rather vague opinions of others (“modeling points the way to a computing that is of as well as in the humanities”). I think that, to get at the tangible, important ideas of this article, it is necessary to read the external links.

Topic Modeling


Topic modeling is a distinctive approach to extracting information from a collection of documents and digesting the results. This article describes topic modeling as treating works as “buckets of words” and as providing seductive but obscure results in the form of easily interpreted (and manipulated) “topics.” Many tools exist for topic modeling, including newer ones that fall under the category of machine learning. From the article, we note, “The work in this issue integrates the Natural Language Processing technique of topic modeling with network representation, GIS, and information visualization. This approach takes advantage of the growing accessibility of tools and methods that had until recently required great resources (technical, professional, and financial).” With new and more accessible tools, we expect topic modeling to become more widely used.
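To make the “buckets of words” idea concrete, here is a minimal sketch in Python. It is only an illustration, not the method used in the article: the function name `bucket_of_words` and the two sample sentences are made up for this example. It shows that once a text is reduced to word counts, word order and grammar are gone.

```python
from collections import Counter
import re


def bucket_of_words(text):
    """Reduce a text to unordered word counts, discarding grammar and word order.

    Hypothetical helper for illustration only.
    """
    return Counter(re.findall(r"[a-z']+", text.lower()))


# Two made-up sentences that use the same words in a different order.
doc_a = "The historian reads the archive"
doc_b = "The archive reads the historian"

# The sentences mean different things, but their "buckets" are identical.
print(bucket_of_words(doc_a) == bucket_of_words(doc_b))  # prints True
```

A topic model starts from counts like these, which is one reason its results still need a human reader to interpret them.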

 

A main concern with topic modeling is the usefulness and ease of use of its tools. According to the article, “none of the authors in this issue simply run and accept the results as “useful” or “interesting” for humanities scholarship. Instead, they critically wrestle with the process. Their work is done with as much of a focus on what the computational techniques obscure as reveal.” The next step for topic modeling will be to ensure that it reveals information that is genuinely useful for analysis, with more accuracy and efficiency than we see in manual techniques.

How Topic Modeling Shows Gender Inequality in the Humanities


As discussed in the previous reflections, data mining, text analysis, sentiment analysis and now topic modeling are contributing in ways that were once unimaginable. These advancements in data modeling have expanded its impact on the humanities, specifically literature. Using topic modeling, computers are able to recognize patterns, themes, and keywords in texts. This has allowed for better research and analysis on human society in many different ways. Data’s ability to solve issues is no longer constrained to numbers. Today computers can crack codes, analyze speeches and extract key elements in literature much faster and more efficiently than humans could previously. The study “What, Where, When, and Sometimes Why: Data Mining Two Decades of Women’s History Abstracts” uses these skills to recognize gender inequality in the literature of the past and in current article publication rates. Using “word frequency,” computers run programs that count how often words such as “he”, “him”, “his”, “she”, “her” and “hers” appear. The results showed that male pronouns were used significantly more often than female pronouns, which clearly raises questions about gender equality more broadly (considering that the evidence comes from literature). NB discusses the pattern of white NFL players more frequently being called “intelligent” and black players more frequently being called “natural”. He writes: “studies like the one in the article are easy to conduct, but are they completely relevant and accurate in their findings?” He further questions whether this trend could partly arise from the differing numbers of black and white players on each NFL team across the league. The same may apply to today’s study. Since women were less represented in academia the further we go back in time, some of the data we collect today on the literature of the past may not perfectly reflect the gender inequality in today’s society.
In retrospect, it’s important to know how exactly the topic modeling program is being run to know what exactly the results are saying.
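A word-frequency count of this kind can be sketched in a few lines of Python. This is a minimal illustration under my own assumptions, not the code used in the study: the function `pronoun_counts`, the two pronoun sets, and the sample sentence are all hypothetical.

```python
from collections import Counter
import re

# Pronoun lists taken from the examples mentioned above.
MALE = {"he", "him", "his"}
FEMALE = {"she", "her", "hers"}


def pronoun_counts(text):
    """Count male and female pronouns in a text (hypothetical helper)."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return {
        "male": sum(counts[w] for w in MALE),
        "female": sum(counts[w] for w in FEMALE),
    }


# Made-up sample sentence, not taken from the study's data.
sample = "He gave his notes to her, and she thanked him."
print(pronoun_counts(sample))  # prints {'male': 3, 'female': 2}
```

Raw counts like these say nothing about how many men and women were being written about in the first place, which is why the way the program is run matters for what the results mean.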

 

8.2 Reading


One of my peers commented that they “hadn’t considered the issue with specificity, or lack thereof” when it came to topic modeling. They continue, saying, “Especially when only looking at abstracts from these articles, there can be a lack of specificity involved with this analysis.” However, when you are looking at thousands of articles, it is tedious to read through all of them when there are alternative options that give you the most important information. In the article “What, Where, When, and Sometimes Why: Data Mining Two Decades of Women’s History Abstracts,” the author used topic modeling to analyze a little over half a million essays and articles about women’s history. This, in my opinion, is the most efficient way to go about it, because I am sure that there were moments within each article or essay where the author spoke about things that weren’t important to their overall point, and being able to skip over or omit these insignificant portions makes the process much quicker. While I can see where my classmate was coming from, this method simply makes the best use of limited time.

Class 8.2 Reading Response – Topic Modeling for Women’s History


This article demonstrated a very practical use for topic modeling by looking at trends in women’s history and attempting to debunk certain myths and popular beliefs. It looked at over half a million abstracts throughout this process, which clearly provides a lot of data points. An interesting aspect that my colleague discusses in relation to topic modeling that I hadn’t considered is the issue with specificity, or lack thereof. Especially when only looking at abstracts from these articles, there can be a lack of specificity involved with this analysis. At the same time, a topic modeling analysis looking at the entirety of the articles would be a much more time-consuming and expensive process, given the sheer volume of words and data points. I think in certain cases there can be a tradeoff between the specificity of the data and the quantity of data that you have to work with. It is up to the analyst to decide which he or she values more. In the case of this article, I think that having more data points and only using the abstracts was the correct decision, as the purpose of the article was to conduct an analysis of all of women’s history. Given this, the authors would likely want to include as many data points as possible.

Issues with Topic Modeling


Something that really stuck out to me in Meeks and Weingart’s article “The Digital Humanities Contribution to Topic Modeling” was the set of difficulties it highlights in using topic modeling. The first point that drew my attention was the idea that “what we might have identified as cohesive ‘topics’ are more complex than simple thematic connections”. It makes sense that people may assume the topics they extract from modeling represent cohesive ideas and themes throughout a piece of literature, but in reality a topic can be very broad. Topics lead into subtopics, and those subtopics have their own subtopics. This kind of modeling may give the umbrella topic of a piece, but it can often lack specificity about the subjects addressed within the larger topic. This leads into the next point I found interesting: that “different methodological choices may lead to contrasting results”. People will use different code and methods to try to extract topics, and even slight differences could lead to varying conclusions about the text. This creates uncertainty about the text: which topic model is more representative? Which is better validated, and why? These questions are all debatable, which leaves any conclusion drawn from a topic model somewhat vague.
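A small Python sketch can show how one methodological choice changes the result. This is not a topic model and is not taken from the article; it only counts words, and the function `top_words`, the short stopword list, and the sample sentence are my own assumptions for illustration. The one choice it varies is whether common “stopwords” are removed before counting.

```python
from collections import Counter
import re

# Tiny illustrative stopword list; real tools ship much longer ones.
STOPWORDS = {"the", "of", "and", "in", "a", "to"}


def top_words(text, n=3, remove_stopwords=False):
    """Return the n most frequent words, optionally dropping stopwords first."""
    words = re.findall(r"[a-z]+", text.lower())
    if remove_stopwords:
        words = [w for w in words if w not in STOPWORDS]
    return [word for word, _ in Counter(words).most_common(n)]


# Made-up sample sentence.
text = "The history of women and the history of labor in the archive of the city"

print(top_words(text))                         # most frequent words, stopwords kept
print(top_words(text, remove_stopwords=True))  # same text, stopwords removed
```

With stopwords kept, the most frequent word is “the”; with them removed, it is “history”. The text is identical in both runs, so the difference comes entirely from the analyst’s choice.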

 

In response to my colleague CD, the line that “no amount of counting can produce meaning” also resonated with me. They make great points in the article about the uses of topic modeling and how it can be incredibly effective at certain things, but the reality remains that, as you said, “the power of interpretation and meaning still lies in the hands of the individual”. Great point, I agree fully!

The Necessity of Text


In the reading “Jockers and Underwood [Text Analysis and Visualization]”, my main reaction concerned the purpose of text and its uses in the present, considering how complex and advanced our technological world is becoming. Text communication can be avoided in numerous ways: people can listen to audiobooks instead of reading, and talk-to-text features exist for sending mobile messages. It occurred to me that, over time, text is becoming less and less necessary for people on a daily basis. However, seeing text visualizations in use and reading about text analysis in this article led me to the opinion that text will likely always be crucial, especially for data analysis. The Wordle word cloud visualizations and the Voyant Tools standard reading skin from this article were two text-based figures that stuck out to me as extremely effective ways of viewing and analyzing data. Text provides essential means for analyzing data and deriving meaning from it, in ways that cannot be replaced by processes that avoid text completely.

Meaning in Text


I found Jockers and Underwood’s discussion of how to find meaning in digital text very interesting. What particularly struck me was the statement that “no amount of counting can produce meaning”. It sheds light on the fact that although digital text analysis can do a lot of things, such as rearranging words and grouping them together, ultimately the power of interpretation and meaning still lies in the hands of the individual. Nonetheless, text analysis and visualization help us reach that meaning, as they allow us to experiment with the representation and isolate different factors, highlighting connections and relationships that we would otherwise have missed if we weren’t able to visualize the data.

The idea behind this reading strongly reminded me of the “Alien Reading”. PE expresses in their post on that reading that they found happiness in the fact that computers cannot so easily understand the written word, that it “takes more than a software to understand humans and their written thoughts”. I thought this was a beautiful takeaway from the reading, and a thought I strongly agree with. More and more these days, computers allow humans to take shortcuts in a lot of things, but not in deriving meaning. That preserves humans’ ability to think and be conscious, a characteristic that sets us apart in the animal world and makes us human.

Jockers and Underwood Reading


This reading gave me a new perspective on the capabilities of digital text analysis and visualization. I had previously thought that digital analysis of text would be limiting. How could a computer that works on a system of 0’s and 1’s analyze text in more ways than my brain could? However, this article made me realize the breadth of possibilities that digital text analysis offers. I found it particularly interesting when the authors explained how analysis could reveal new ways for the reader to look at a text. The authors take into account that the interpretations generated by these analysis methods are to be questioned like any other human source and interpretation, which I found important. I think we often believe that digitized calculations and analyses are correct without questioning them, even though we would question the conclusions of our peers. The suggestion to question the results of computers, together with the span of possibilities they are capable of, makes me see computers in an almost more human way. I’ve generally thought of computers as able to compute numbers quickly, but I had never considered them able to analyze texts in the way this article helped me understand.