Book a Demo!
CoCalc Logo Icon
StoreFeaturesDocsShareSupportNewsAboutPoliciesSign UpSign In
YStrano
GitHub Repository: YStrano/DataScience_GA
Path: blob/master/lessons/lesson_13/extra-materials/bayes_theorem_iris.ipynb
1904 views
Kernel: Python 2

Applying Bayes' Theorem to Iris Classification

Can Bayes' theorem help us to solve a classification problem, namely predicting the species of an iris?

Preparing the Data

We'll read the iris data into a DataFrame, and round up all of the measurements to the next integer:

import pandas as pd import numpy as np
# Read the iris data into a DataFrame. url = 'http://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data' col_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'species'] iris = pd.read_csv(url, header=None, names=col_names) iris.head()
# Apply the ceiling function to the numeric columns. iris.loc[:, 'sepal_length':'petal_width'] = iris.loc[:, 'sepal_length':'petal_width'].apply(np.ceil) iris.head()

Deciding How to Make a Prediction

Let's say I have an out-of-sample iris with the following measurements: 7, 3, 5, 2. How might I predict the species?

# Show all observations with features 7, 3, 5, 2. iris[(iris.sepal_length==7) & (iris.sepal_width==3) & (iris.petal_length==5) & (iris.petal_width==2)]
# Count the species for these observations. iris[(iris.sepal_length==7) & (iris.sepal_width==3) & (iris.petal_length==5) & (iris.petal_width==2)].species.value_counts()
Iris-versicolor 13 Iris-virginica 4 dtype: int64
# Count the species for all observations. iris.species.value_counts()
Iris-setosa 50 Iris-versicolor 50 Iris-virginica 50 dtype: int64

Let's frame this as a conditional probability problem: What is the probability of some particular species, given the measurements 7, 3, 5, and 2?

P(species  7352)P(species \ | \ 7352)

We could calculate the conditional probability for each of the three species, and then predict the species with the highest probability:

P(setosa  7352)P(setosa \ | \ 7352)P(versicolor  7352)P(versicolor \ | \ 7352)P(virginica  7352)P(virginica \ | \ 7352)

Calculating the Probability of Each Species

Bayes' theorem gives us a way to calculate these conditional probabilities.

Let's start with versicolor:

P(versicolor  7352)=P(7352  versicolor)×P(versicolor)P(7352)P(versicolor \ | \ 7352) = \frac {P(7352 \ | \ versicolor) \times P(versicolor)} {P(7352)}

We can calculate each of the terms on the right side of the equation:

P(7352  versicolor)=1350=0.26P(7352 \ | \ versicolor) = \frac {13} {50} = 0.26P(versicolor)=50150=0.33P(versicolor) = \frac {50} {150} = 0.33P(7352)=17150=0.11P(7352) = \frac {17} {150} = 0.11

Therefore, Bayes' theorem says the probability of versicolor given these measurements is:

P(versicolor  7352)=0.26×0.330.11=0.76P(versicolor \ | \ 7352) = \frac {0.26 \times 0.33} {0.11} = 0.76

Let's repeat this process for virginica and setosa:

P(virginica  7352)=0.08×0.330.11=0.24P(virginica \ | \ 7352) = \frac {0.08 \times 0.33} {0.11} = 0.24P(setosa  7352)=0×0.330.11=0P(setosa \ | \ 7352) = \frac {0 \times 0.33} {0.11} = 0

We predict that the iris is a versicolor, since that species had the highest conditional probability.

Summary

  1. We framed a classification problem as three conditional probability problems.

  2. We used Bayes' theorem to calculate those conditional probabilities.

  3. We made a prediction by choosing the species with the highest conditional probability.

Bonus: The Intuition Behind Bayes' Theorem

Let's make some hypothetical adjustments to the data to demonstrate how Bayes' theorem makes intuitive sense:

Pretend that more of the existing versicolors had measurements of 7352:

  • P(7352  versicolor)P(7352 \ | \ versicolor) would increase, thus increasing the numerator.

  • It would make sense that, given an iris with measurements of 7352, the probability of it being a versicolor would also increase.

Pretend that most of the existing irises were versicolor:

  • P(versicolor)P(versicolor) would increase, thus increasing the numerator.

  • It would make sense that the probability of any iris being a versicolor (regardless of measurements) would also increase.

Pretend that 17 of the setosas had measurements of 7352:

  • P(7352)P(7352) would double, thus doubling the denominator.

  • It would make sense that given an iris with measurements of 7352, the probability of it being a versicolor would be cut in half.