Text Analysis


The article by Lauren Klein presents both an interesting historical story and the significance of text analysis under these conditions. Klein prefaces the story by outlining the relationship between Jefferson and his servant James. Through text analysis, we seek to determine the true relationship between James and Jefferson, specifically in regard to the debate over freedom.

User ‘ecullen’ highlights the line, “none of the information that I’ve just told you is immediately evident in the letter on the screen”, which in context emphasizes that certain phrases might not be picked up unless a full text analysis has taken place. The text under debate here is the phrase “former servant James”, which, as we now know, refers directly to Jefferson’s former servant James Hemings. Moving forward, we note the power of text analysis: although the analysis itself cannot help us understand the meaning behind phrases, we do note the amazing findings that can follow when the discovery of a new word or term changes our understanding of an article.

Class 9.1 Reading Response – Reconstructing Historical Social Networks


This article was particularly interesting to read because I’ve actually thought about how difficult it must have been to understand specific human social interactions before everything was so digitized, which wasn’t that long ago at all. Going back almost 200 more years and trying to understand interactions would be much more difficult. Using biographical data makes sense under these circumstances, as there aren’t many clear indicators of social interactions in earlier historical periods that would provide enough data points. As the article mentions, letters are clearly good indicators of social interactions, but they don’t represent full populations by any means. In response to my colleague’s post, I think it’s interesting to see the collaboration of historians and computer scientists, and how useful a strong background in data analysis can be when attempting to better understand history, especially given how hard it can be to interpret certain sources of historical information.

A Report Has Come Here


I found this article very interesting. The in-depth look into James Hemings and his life was very revealing. It is easy to gather information from documents, but I had never seen how people attempt to gain information from the lack of documents. There were no official documents about James Hemings, but by using many different tools historians were able to gain some insight into his life from Thomas Jefferson’s documented letter. This article was a great teaching moment, showing me that the lack of information on a person or thing does not mean that the story must remain unknown. Using certain tools and human intuition, we can arrive at an account that “offers some acknowledgement of the lives and stories that will forever remain unknown”.

Response: The Digital Humanities Contribution to Topic Modeling


I found “The Digital Humanities Contribution to Topic Modeling” extremely interesting and informative. It seemed to be a well-researched piece by guest editors who made the sometimes complex world of topic modeling much easier to understand, with multiple references to academic journals and more specific insights into topic modeling. The critical engagement section at the end was especially useful in offering actual insights into the information given in the article, for example saying, “Traditional humanities scholars often equate digital humanities with technological optimism. Rather the opposite is true: digital humanists offer the jaundiced realization that computational techniques like topic modeling — long held inaccessible and unapproachable and therefore unassailable — are not an upgrade from simplistic human-driven research, but merely more tools in the ever-growing shed” (Meeks). I completely agree with the statement: topic modeling, long held inaccessible and thought of as too complex for the average reader, cannot necessarily replace human research, but it can offer an additional tool to an ever-growing audience. A classmate also noted that the article “was very clear and concise, and really got to, what I think, is the crux of the issue for topic modeling” (MS-A), which I completely agree with: topic modeling is important and necessary, and its expansion to a broader audience is good for all users.

The Digital Humanities Contribution to Topic Modeling


When I use a calculator for a math problem, I never question the answer I am given. This is because I trust the computer doing the mathematical analysis that gives me the answer. I trust the answer that I am given without challenging it. How does this differ when the computations and analysis are being performed on the humanities and digital literature? Should we trust computers the same amount? In my previous post I touched on this idea, and Meeks and Weingart mention it in their critique of Topic Modeling. They discuss how Topic Modeling extends beyond the capacities of humans and opens new doors of understanding. However, their critique lies in our willingness to accept the conclusions drawn by these models without challenge or question. What stuck out to me most from this article was Meeks and Weingart’s call to use caution when accepting the results of Topic Modeling. They emphasize the interpretive capacities of human scholars, and urge us to think critically about the results that ensue. This is especially important in avoiding situations like those brought up in the Civilian Casualties article, where the data can reflect a bias and corroborate a stereotype that isn’t necessarily true. This isn’t to say that criminal analysis data is the same as topic modeling, but it points to the same possibility: the data we trust and the models we rely on might be producing conclusions that we should not take at face value. The article stressed to me the importance of not losing our human intuition to the presumption that computers know everything. Topic Modeling can open many doors, but we must look through those doors with caution.

Response to “The Digital Humanities Contribution to Topic Modeling”


First, looking back at the blog posts from last week, I really enjoyed reading the blog titled “Analysis May be Computerized, but Meaning Remains Human.” The author, in my opinion, correctly noted that although computers performed most of the work, there is still an element of this analysis that is very human. We still get to interpret the results we get from algorithms. The point is that those results are not the be-all and end-all. It is still up to humans to interpret them in a responsible manner.

With that said, I really enjoyed the reading for this week. It was very clear and concise, and really got to what I think is the crux of the issue for topic modeling. This reading highlighted the idea that it is very easy to look at topic modeling and think that this is a powerful tool that we should use as much as possible. On the other hand, it is also very easy to look at topic modeling and immediately disregard it. After all, how could a computer algorithm give any real meaning to written text? This article pushes back on both schools of thought, and notes that the advantages that come from topic modeling lie in the middle of these two schools. That is, topic modeling adds an additional data point, and it is up to humans to decide what weight to put on the analysis. As the authors of the reading say, topic modeling is simply a tool in a shed, and it is up to the researcher to interpret and weight the results.

Digital Humanities


Topic modeling is an important field that will allow computer programs to better analyze and synthesize categorical data. While the programs we have right now aren’t perfect, they’re improving with machine learning, and our understanding of their code increases over time. In “The Digital Humanities Contribution to Topic Modeling,” Elijah Meeks and Scott Weingart argue that the tools we have at our disposal are too blunt at the moment for them to be taken seriously. While I agree with this sentiment, I don’t believe we should give up on the whole field just because it hasn’t produced a perfect finished product yet. I also disapprove of their writing style, because the complexity of the words they used distracted me from the content of the piece.

EC said in the post “Can the Humanities Topic Model?” that we should remember that topic modeling is just a tool to help us, and shouldn’t be the only thing we use to analyze the works in the humanities. We shouldn’t allow ourselves to become too dependent on computers and programs to do our work for us, and this is a great example of where we have more knowledge about something than a computer does, at least for the moment.

Can Computer Scientists and Humanists be Friends?


In “The Digital Humanities Contribution to Topic Modeling,” I found it interesting how much the authors stress the importance of humanists’ role in topic modeling. Due to the availability and magnitude of text data in recent years, topic modeling has exploded in popularity. Because topic models work by consuming vast corpora of text, the results of topic models are generally widespread or blanket statements about the data itself. For this reason I agree with the authors that understanding how these methods work is critical. However, is it really enough to understand the inner workings of the algorithm alone?

Meeks and Weingart believe that in some cases the debate surrounding topic models is too concerned with the success of the algorithm itself, as opposed to the human space that the algorithm is working in. As a field, topic modelers have become so obsessed with understanding the strengths and weaknesses of topic models that they have lost focus on what is really important: interpreting language.

This point of contention will be difficult for researchers to balance in the future. Society rewards speed and profit, making it difficult to ensure that our models are not only accurate but ethical. My hope is that humanists and computer scientists will work together to make topic models more accurate and in turn less futile.

In SJ’s reading response, they bring up a very compelling argument against topic models, which is that sheer abundance of word count should not be mistaken for abundance of meaning. Language and text are not always an exact science but also an art. There is certainly room for human emotion within a text, and I agree with SJ that sometimes these emotions can trump frequency.

Why Topic Modeling is important


The discipline of women’s studies and the analysis of gender in regard to women has only become expansive since 1985. Even though this scholarship is finally joining history and giving people the resources and opportunity to learn about the history of women and their struggle, topic modeling shows that there is a lot more progress still to be made in women’s studies. From a quantitative analysis, it is quite obvious that the study of womanhood is centrally focused on “gender”, “women”, or a “group of women”, since these words are mentioned at least once every 1500 words and the majority of articles in this field are highly concentrated around them. It is crucial for data analysis and topic modeling to be a part of researching women’s studies because they give statistical data showing the frequency of the words used to describe this field of study. Without these numbers, society might simply be satisfied with the progress that has been made and check the box rather than seeking improvement. A blog post written by a peer of mine puts this perfectly. It explains that since topic modeling can measure the frequency of words, it exposes the inequality between genders and in the discourse of women’s studies.
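
To make the "at least once every 1500 words" idea concrete, here is a minimal sketch of how such a rate could be computed. It is only an illustration, not the method used in the reading: the function name `rate_per_window`, the simple tokenizer, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter


def rate_per_window(text, terms, window=1500):
    """Return how often each term appears per `window` words of `text`.

    Hypothetical helper for illustration; tokenization is a simple
    lowercase split on letters and apostrophes.
    """
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        # Avoid dividing by zero on empty input.
        return {term: 0.0 for term in terms}
    counts = Counter(words)
    # Scale the raw count to a rate per `window` words.
    return {term: counts[term] * window / total for term in terms}


# Made-up sample text, not taken from any abstract.
sample = "Gender shapes how women appear in history. Women wrote, and women organized."
rates = rate_per_window(sample, ["gender", "women"])
print(rates)
```

A rate of 1.0 or more for a term means it appears, on average, at least once every 1500 words. A count like this only measures frequency; deciding what that frequency means is still up to the reader.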

Limitations of Topic Modeling


Over the course of the last week or so, our class has researched, read about, and been taught the principal ideas of topic modeling and text analysis. I have learned that these two types of data extraction are extremely efficient and can also lead to the discovery of valuable information that would not have been found without the analysis of the compiled text. For instance, we read about the subtle differences in racial treatment revealed through the descriptions of white and black NFL athletes. I found NB’s example of Colin Kaepernick to be extremely relevant when reading about this: “This doesn’t come as a huge surprise considering the recent events in the NFL involving quarterback Colin Kaepernick. Kaepernick, with both the IQ and athleticism to be an elite quarterback in the league, still has yet to be signed because he publicly expressed his disgust in the current state of the league.” While the information obtained through this specific example of text analysis is important, Sharon Block’s “What, Where, When, and Sometimes Why: Data Mining Two Decades of Women’s History Abstracts” raised some essential questions for me. The only reservation I have about her topic analysis research is that the scale of her topic modeling may have been too large. If a researcher concludes that the number of times words appear in a text ultimately dictates, in this case, equality of information across genders, word count is not reasonable evidence for that claim. I am not sure how one would go about collecting more substantial data for this, but I do not think that counting specific words can prove that there is inequality among sources. Inequality in regard to publishing history relates to the fundamental issue of what gets published, that is, whether something written about gender or women’s history is published at all. The data in this case would not be found in word counts, but in the number of abstracts that were presented and published and the number that were presented and not published. Topic modeling cannot efficiently answer this question.