GitHub Repository: YStrano/DataScience_GA
Path: blob/master/lessons/lesson_13/practice/solution-code/sentiment_analysis-lab-solutions.ipynb
Kernel: Python 2

Airline Tweets Sentiment Analysis Lab

Author: Phillippa Thomson (NYC)


You are going to be analyzing tweets about airlines. These have been hand-tagged with sentiment. There are three categories: positive, neutral, and negative.

Use VADER to calculate sentiment for each tweet, and see if you can correctly predict the hand-tagged sentiment.

What is the accuracy? Print out a heatmap to see where your model performs well, and where it performs poorly.
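If you haven't used VADER before, here is a minimal sketch of what its analyzer returns for a single string. The example tweet is made up, and you may need a one-time nltk.download('vader_lexicon') call if the lexicon isn't installed:

import nltk
nltk.download('vader_lexicon')  # one-time download of the VADER lexicon

from nltk.sentiment.vader import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()
# polarity_scores returns a dict with 'neg', 'neu', 'pos' (proportions of the text)
# and 'compound' (a normalized score between -1 and 1).
print(sia.polarity_scores("@united worst flight ever, two hour delay"))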

import pandas as pd
import numpy as np
from sklearn.metrics import (classification_report, confusion_matrix,
                             accuracy_score, precision_score, recall_score)
import seaborn as sns
import matplotlib.pyplot as plt
%matplotlib inline
tweets = pd.read_csv("../../data/Tweets.csv")
tweets.head()

1. Preview the airline_sentiment column.

  • What percentage of reviews are positive, neutral, and negative?

tweets['airline_sentiment'].value_counts() / len(tweets)
negative    0.626913
neutral     0.211680
positive    0.161407
Name: airline_sentiment, dtype: float64
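Equivalently, value_counts has a normalize flag that does the division for you:

tweets['airline_sentiment'].value_counts(normalize=True)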

2. Load the SentimentIntensityAnalyzer from VADER and add the compound, negative, neutral, and positive scores to the DataFrame.

from nltk.sentiment.vader import SentimentIntensityAnalyzer

sia = SentimentIntensityAnalyzer()

compound = []
neg = []
neu = []
pos = []
for tweet in tweets['text']:
    sent = sia.polarity_scores(tweet)
    compound.append(sent['compound'])
    neg.append(sent['neg'])
    neu.append(sent['neu'])
    pos.append(sent['pos'])
/home/alex/anaconda3/lib/python2.7/site-packages/nltk/twitter/__init__.py:20: UserWarning: The twython library has not been installed. Some functionality from the twitter package will not be available.
  warnings.warn("The twython library has not been installed. "
tweets['compound'] = compound
tweets['neg'] = neg
tweets['neu'] = neu
tweets['pos'] = pos
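As an aside, the loop above can be replaced with a vectorized alternative using pandas apply. A sketch, assuming the four score columns have not already been added:

# Equivalent to the loop: score every tweet, then expand each result dict into columns.
scores = tweets['text'].apply(sia.polarity_scores).apply(pd.Series)
tweets = pd.concat([tweets, scores[['compound', 'neg', 'neu', 'pos']]], axis=1)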
tweets.head()

3. Store airline_sentiment in y to use as labels and create an appropriate feature matrix, X.

y = tweets['airline_sentiment']
X = tweets[['compound', 'neg', 'neu', 'pos']]

4. Fit a model of your choice to predict airline_sentiment and cross-validate.

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score, train_test_split

rf = RandomForestClassifier()
rf.fit(X,y)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)
cross_val_score(rf, X, y) # versus the baseline (63%), this is a little weak.
array([ 0.66769105, 0.66188525, 0.67985243])
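In this scikit-learn version, cross_val_score defaults to 3-fold cross-validation, which is why three scores come back; averaging them gives a single number to compare against the baseline:

# Mean cross-validated accuracy (~0.67 given the fold scores above).
print(cross_val_score(rf, X, y).mean())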
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.30, random_state=14)
rf.fit(X_train, y_train)
RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_split=1e-07, min_samples_leaf=1,
            min_samples_split=2, min_weight_fraction_leaf=0.0,
            n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
            verbose=0, warm_start=False)
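The lab also asks for the accuracy; a minimal check on the held-out 30% split, using the accuracy_score already imported above:

# Test-set accuracy of the fitted forest.
print(accuracy_score(y_test, rf.predict(X_test)))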

5. Display the confusion matrix.

  • What reviews are difficult to identify?

conmat = np.array(confusion_matrix(y_test, rf.predict(X_test)))
confusion = pd.DataFrame(conmat, index=['negative', 'neutral', 'positive'],
                         columns=['Pred neg', 'Pred neutral', 'Pred pos'])

plt.figure(figsize=(6, 6))
heat = sns.heatmap(confusion, annot=True, annot_kws={"size": 20}, cmap='Blues',
                   fmt='g', cbar=False)
plt.xticks(rotation=0, fontsize=14)
plt.yticks(fontsize=14)
plt.title("Confusion Matrix", fontsize=20)
[Figure: confusion matrix heatmap rendered in the notebook]

6. Print the classification report and discuss the characteristics of the model.

print(classification_report(y_test, rf.predict(X_test)))
             precision    recall  f1-score   support

   negative       0.73      0.88      0.80      2794
    neutral       0.33      0.14      0.20       902
   positive       0.58      0.52      0.55       696

avg / total       0.62      0.67      0.64      4392

The model does OK with negative tweets (the predominant class) but quite poorly with neutral ones.

To put this in perspective: human concordance, the probability that two people assign the same sentiment to an observation, is usually around 70-80%, and our baseline is already 63%. Even small increases in accuracy quickly move us toward the theoretical maximum performance.
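For reference, the 63% baseline is just the majority-class rate from question 1, i.e. the accuracy of always predicting "negative":

# Majority-class baseline: always predict 'negative'.
print((tweets['airline_sentiment'] == 'negative').mean())  # ~0.627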