Second Sentiment Analysis Experiment on Naive Bayes With NLTK: Bigrams
In my last post I experimented with techniques such as stopword removal and the bag-of-words model, and got some acceptable results. In this post, I'm going to try bigrams to see if I can increase the accuracy.
I changed the code a little bit to use bigrams as features.
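For reference, a minimal sketch of bigram feature extraction with NLTK could look like the following. It is not the exact code from this experiment: the function name `bigram_word_feats`, the default `n=200`, and the toy word list are assumptions for illustration. It keeps every single word as a feature and adds the `n` highest-scoring bigrams on top.

```python
import itertools

from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures


def bigram_word_feats(words, score_fn=BigramAssocMeasures.chi_sq, n=200):
    """Return a feature dict of all unigrams plus the n best-scoring bigrams.

    `n=200` is an assumed default, not a value taken from the experiment.
    """
    finder = BigramCollocationFinder.from_words(words)
    # nbest ranks every adjacent word pair by score_fn and keeps the top n.
    best_bigrams = finder.nbest(score_fn, n)
    # Unigrams are plain strings, bigrams are (word1, word2) tuples.
    return {feature: True for feature in itertools.chain(words, best_bigrams)}


# Toy input, purely illustrative.
example = bigram_word_feats(["not", "good", "not", "good", "very", "bad", "movie"], n=3)
```

The resulting dict can be passed to `nltk.NaiveBayesClassifier.train` in the same way as a bag-of-words feature dict, paired with its sentiment label.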
I decided to use chi_sq as the scoring function, as suggested in this post. However, the accuracy dropped significantly, to 19.7530864198%. My guess is that my dataset (~100 documents for each sentiment) is not large enough for bigrams. But this is only a guess, so I'm going to increase the dataset and test it again.
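One possible way to see why a small dataset could hurt is to compute the chi-square score by hand. The sketch below is a plain-Python stand-in for the 2x2 contingency-table formula, not NLTK's own implementation, and the sentence it runs on is made up. On that toy input, a pair of words that each appear only once gets the same maximum score as a pair that repeats, so on a small corpus the top-ranked bigrams may be dominated by one-off pairs. This is an illustration of the guess above, not something verified on the real data.

```python
from collections import Counter


def chi_sq_scores(words):
    """Score each adjacent word pair with Pearson's chi-square on a 2x2 table.

    The total is approximated by the number of words, which is an assumption
    of this sketch.
    """
    n = len(words)
    word_counts = Counter(words)
    pair_counts = Counter(zip(words, words[1:]))
    scores = {}
    for (w1, w2), n_ii in pair_counts.items():
        n_io = word_counts[w1] - n_ii  # w1 occurrences outside this pair
        n_oi = word_counts[w2] - n_ii  # w2 occurrences outside this pair
        n_oo = n - n_ii - n_io - n_oi  # everything else
        denom = (n_ii + n_io) * (n_ii + n_oi) * (n_io + n_oo) * (n_oi + n_oo)
        numer = n * (n_ii * n_oo - n_io * n_oi) ** 2
        scores[(w1, w2)] = numer / denom if denom else 0.0
    return scores


# Made-up 12-word text: "the movie" repeats, "truly awful" appears once.
toy_words = "the movie was good the movie was bad a truly awful plot".split()
toy_scores = chi_sq_scores(toy_words)
```

Here `toy_scores[("truly", "awful")]` and `toy_scores[("the", "movie")]` both come out at 12.0, while `toy_scores[("was", "good")]` is lower.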