05 / Project
Sentiment Analysis on 1.6M Tweets
Large-scale NLP on the Kaggle Twitter dataset
sentiment-analysis.stack
The Problem
Understanding public sentiment at scale requires a pipeline that can clean, vectorize, and classify millions of noisy, informal tweets.
What I Built
A sentiment classifier (positive / negative / neutral) trained on 1.6M tweets using a classical NLP pipeline.
Approach
Tweets are tokenized and normalized with NLTK, vectorized with gensim-based embeddings, and classified using scikit-learn models, with SciPy/NumPy/Pandas handling the numerical heavy lifting.
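As a rough illustration of the first two stages (normalizing a tweet, then turning its tokens into a fixed-length feature vector), here is a minimal sketch. It is not the project's actual code: regular expressions stand in for the NLTK tokenizer, a tiny hand-written dictionary (`toy_vectors`) stands in for gensim-trained embeddings, and the helper names `normalize_tweet` and `average_embedding` are hypothetical. The resulting vector is the kind of input a scikit-learn classifier would consume.

```python
import re

# Assumed cleaning rules for this sketch; the project itself uses NLTK.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
TOKEN_RE = re.compile(r"[a-z0-9']+")


def normalize_tweet(text):
    """Lowercase a tweet, drop URLs and @mentions, and split it into tokens."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    text = text.replace("#", " ")  # keep the hashtag word, drop the symbol
    return TOKEN_RE.findall(text)


def average_embedding(tokens, vectors, dim):
    """Average the vectors of known tokens; return zeros if none are known."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(column) / len(known) for column in zip(*known)]


# Toy 2-dimensional embeddings (hypothetical values, for illustration only).
toy_vectors = {"love": [1.0, 0.0], "hate": [-1.0, 0.0], "movie": [0.0, 1.0]}

tokens = normalize_tweet("@bob I LOVE this movie!! http://t.co/xyz #cinema")
features = average_embedding(tokens, toy_vectors, dim=2)
print(tokens)
print(features)
```

Averaging word vectors is only one simple way to get a tweet-level representation; it discards word order but keeps the feature size fixed regardless of tweet length.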