Student of Computational Linguistics and High Performance Computing
I am a third-year PhD student in Computational Linguistics at Indiana University. I am interested in computational linguistics for under-resourced languages, morphologically rich languages, and African languages.
I am also interested in NLP tasks that relate to the sentiments and emotions of the author. This includes tasks like stance detection, sentiment analysis, and irony/sarcasm detection.
TL;DR: The parser can be found at https://parser.ksteimel.duckdns.org. What’s with the name? Mti ([m̩ti]) is the Swahili word for tree. This parser generates trees. Motivation My department in Linguistics had…
Motivation I am a PhD student in Computational Linguistics. As such I have both the need to experiment with deep learning frameworks and little money to build a powerful deep…
Operating System I used Ubuntu 19.04 partially because I wanted to try out the April release of Ubuntu and I knew that the newer kernels were more compatible with Vega…
Deep learning has historically been dominated by NVIDIA GPUs. The NVIDIA CUDA API is a proprietary standard for writing code to run on graphics processing hardware. CUDA is tightly integrated…
For a project on maliciousness detection that I am working on, I needed an unsupervised stemming method. We were examining the role that text cleanup plays in the classification task….
Just wanted to give a heads up: with newer versions of scikit-learn (I believe starting with 0.20), the random forest implementation will use 1 job in the default setting as…
I ran 1000 iterations for each of the cross-language part of speech tagging experiments discussed in this paper. For each iteration, the training data was a different subsample of all…
This is a presentation introducing what Computational Linguistics is at a high level. Keep in mind that this is for an undergraduate Introduction to Linguistics course. Some things are simplified…
Julia versions 0.7 and 1.0 have been beta tested and released! The official downloads are available here. However, it seems like these are running a bit slow for me. I…
I am working on integrating distantly supervised sentiment tweets into irony/sarcasm detection for tweets. Previous results that I’ve obtained have shown that the MPQA lexicon improved results during cross validation…
I have my first results from using an SVM tagger on Wanga data extracted from the fieldworks corpus. They’re quite good. With only 1500 Wanga words, noun class identification reaches…
My problem I have developed quite a setup for doing machine learning and other tasks (80+ CPU cores and 150+ GB of RAM combined). It is nice when the…