Historical text sources are often rich in information about where a subject or object was made, used, and stored; geographical references run through textual documents of any era. Embedding this information as structured metadata sounds appealing, because any document could then be tagged with the places it mentions. The challenge is that extracting that sort of geospatial information at scale is difficult: sorting through hundreds or thousands of documents to pull out the locations named in each one has long been an obstacle in digital work. An article from DH2018 in Mexico City gives the example of a text that simply says "Paris." Did the writer mean Paris, France, or Paris, Texas, USA? Close reading of the surrounding document would probably reveal that the French capital was intended, but it might just as well be Texas. That sort of close reading is simply impossible across hundreds or thousands of documents, so digital scholars rely on computational methods that identify and geolocate place-based data. Techniques such as Named Entity Recognition (NER), a task within natural language processing (NLP), are used to find and label geospatial entities, such as countries, states, and cities, at scale.
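To make that last step concrete, here is a minimal sketch of how place names might be pulled from a passage with an off-the-shelf NER pipeline. It assumes Python with the spaCy library and its pretrained en_core_web_sm English model; the article names NER and NLP in general rather than any particular tool, and the sample text is invented for illustration.

import spacy

# Load a small pretrained English pipeline that includes an NER component.
nlp = spacy.load("en_core_web_sm")

# Hypothetical passage standing in for a historical document.
text = (
    "The manuscript was copied in Paris in 1687, shipped to "
    "merchants in Lisbon, and later catalogued in Mexico City."
)

doc = nlp(text)

# GPE = geopolitical entity (countries, states, cities); LOC covers
# other locations such as rivers and mountain ranges.
places = [ent.text for ent in doc.ents if ent.label_ in ("GPE", "LOC")]
print(places)  # e.g. ['Paris', 'Lisbon', 'Mexico City']

Note that NER only returns the string "Paris"; resolving that string to coordinates (France versus Texas) still requires a separate geocoding or toponym-disambiguation step, which is exactly the hard part the article describes.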