Computational Linguistics Projects

I am a third-year PhD student in Computational Linguistics at Indiana University. I am interested in computational linguistics for under-resourced languages, morphologically rich languages, and African languages.

I am also interested in NLP tasks that relate to the sentiments and emotions of the author, such as stance detection, sentiment analysis, and irony/sarcasm detection.

Blog posts about computational linguistics

Oct 6, 2019

Mti

TLDR: The parser can be found at https://parser.ksteimel.duckdns.org. What’s with the name? Mti ([m̩ti]) is the Swahili word for tree. This parser generates trees. Motivation: My department in Linguistics had…

Aug 7, 2019

Scrapyard Deep Learning - Part 1

Motivation: I am a PhD student in Computational Linguistics. As such, I have both the need to experiment with deep learning frameworks and little money to build a powerful deep…

Apr 23, 2019

Building an AMD Deep Learning Machine - Software Stack

Operating System: I used Ubuntu 19.04 partially because I wanted to try out the April release of Ubuntu and I knew that the newer kernels were more compatible with Vega…

Apr 22, 2019

Building an AMD Deep Learning Machine - Motivation and Hardware

Deep learning has historically been dominated by NVIDIA GPUs. The NVIDIA CUDA API is a proprietary standard for writing code to run on graphics processing hardware. CUDA is tightly integrated…

Dec 20, 2018

PyPy and Python speed comparison

For a project on maliciousness detection that I am working on, I needed an unsupervised stemming method. We were examining the role that text cleanup plays in the classification task….

Nov 19, 2018

Scikit-Learn Random Forest Parallelism inside of GridSearch

Just a heads-up: with newer versions of scikit-learn (I believe starting with 0.20), the random forest implementation will use 1 job in the default setting as…

Nov 16, 2018

Luyia POS tagging results with 1000 runs

I ran 1000 iterations for each of the cross-language part of speech tagging experiments discussed in this paper. For each iteration, the training data was a different subsample of all…

Nov 16, 2018

Introduction presentation to Computational Linguistics

This is a presentation introducing what Computational Linguistics is at a high level. Keep in mind that this is for an undergraduate Introduction to Linguistics course. Some things are simplified…

Aug 9, 2018

Julia 0.7 released (mirror download provided)

Julia versions 0.7 and 1.0 have been beta tested and released! The official downloads are available here. However, it seems like these are running a bit slow for me. I…

May 3, 2018

Research Activities Spring 2018

I am working on integrating distantly supervised sentiment tweets into irony/sarcasm detection for tweets. Previous results that I’ve obtained have shown that the MPQA lexicon improved results during cross validation…

Apr 11, 2018

Part-of-Speech tagging in Wanga

I have my first results from using an SVM tagger on Wanga data extracted from the fieldworks corpus. They’re quite good. With only 1500 Wanga words, noun class identification reaches…

Mar 13, 2018

Pachyderm for NLP?

My problem: I have developed quite a setup for doing machine learning and other tasks (80+ CPU cores and 150+ GB of RAM combined). It is nice when the…