Introduction
Problem Type
Poisson regression is the regression we use to estimate the effect that explanatory variables have on a dependent variable that is count data (non-negative integers). When the expected count is modelled through a linear combination of the explanatory variables, typically via a log link, Poisson regression is a special case of a generalized linear model.
It's "Poisson" mainly because we use the Poisson distribution to model the likelihood of the dependent variable.
What we get out of this type of model is the relative contribution of each explanatory variable to the value of the dependent variable.
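As a minimal sketch of the model described above, assume a log link, so that the expected count is `exp(X @ beta)`, and fit the coefficients by maximum likelihood with Newton's method. This is a stand-in for illustration rather than the notebook's own fitting procedure (the HPD intervals reported later suggest Bayesian sampling); the function names and the toy data are made up.

```python
import math

import numpy as np


def poisson_log_likelihood(beta, X, y):
    """Log-likelihood of counts y under Poisson(mu), with mu = exp(X @ beta)."""
    eta = X @ beta  # linear combination of the explanatory variables
    log_factorials = np.array([math.lgamma(v + 1.0) for v in y])
    return float(np.sum(y * eta - np.exp(eta) - log_factorials))


def fit_poisson_newton(X, y, n_iter=50):
    """Maximum-likelihood coefficients via Newton's method (illustrative only)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = np.exp(X @ beta)                  # expected count for each row
        gradient = X.T @ (y - mu)              # gradient of the log-likelihood
        information = X.T @ (X * mu[:, None])  # X^T diag(mu) X
        beta = beta + np.linalg.solve(information, gradient)
    return beta


# Made-up toy data: an intercept column plus one binary explanatory variable.
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 3.0, 3.0, 5.0])

beta_hat = fit_poisson_newton(X, y)
rate_ratios = np.exp(beta_hat)  # multiplicative effect of each column on the expected count
```

Under a log link, exponentiating a coefficient gives the factor by which the expected count is multiplied when that variable increases by one. In this toy data the second group's mean count (4) is twice the first group's (2), so the second rate ratio comes out at 2.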
Data structure
To use this model, structure the data as follows:
Each row is one measurement.
The columns should be:
One column per explanatory variable.
Use an ordinal encoding where the categories have a natural order; strictly categorical (unordered) variables should be binarized, that is, one-hot encoded.
One column for the dependent variable.
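The layout above can be sketched with pandas. The column names, categories, and values below are all made up for illustration; only `Series.map` and `pandas.get_dummies` are real library calls.

```python
import pandas as pd

# Made-up rows: each row is one measurement (names and values are illustrative).
raw = pd.DataFrame(
    {
        "ship_type": ["A", "B", "C", "A"],  # strictly categorical
        "build_period": ["1960-64", "1965-69", "1970-74", "1965-69"],  # ordered
        "damage_incidents": [0, 3, 1, 6],  # dependent variable (counts)
    }
)

# Ordinal variable: map the ordered categories to integers that keep the order.
period_order = {"1960-64": 0, "1965-69": 1, "1970-74": 2}
raw["build_period"] = raw["build_period"].map(period_order)

# Strictly categorical variable: binarize into one 0/1 column per category.
model_data = pd.get_dummies(raw, columns=["ship_type"], dtype=int)
print(model_data)
```

If the model also includes an intercept, passing `drop_first=True` to `get_dummies` drops one indicator column, which avoids perfectly collinear columns.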
Extensions to the model
None.
Reporting summarized findings
Here are examples of how to summarize the findings.
For every increase in X, we expect to see an increase in Y by mean (95% HPD: [lower, upper]).
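One way to fill in this template is to compute the mean and the 95% HPD (the narrowest interval containing 95% of the draws) from posterior samples of a coefficient. The sketch below assumes the samples are available as a one-dimensional array; `hpd_interval`, `summarize_effect`, and the stand-in draws are hypothetical and not taken from the notebook.

```python
import numpy as np


def hpd_interval(samples, mass=0.95):
    """Narrowest interval that contains `mass` of the samples."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    n = sorted_samples.size
    window = int(np.ceil(mass * n))  # number of samples the interval must cover
    # Width of every candidate interval spanning `window` consecutive samples.
    widths = sorted_samples[window - 1:] - sorted_samples[: n - window + 1]
    start = int(np.argmin(widths))
    return float(sorted_samples[start]), float(sorted_samples[start + window - 1])


def summarize_effect(name, samples, mass=0.95):
    """Fill the reporting template with the posterior mean and HPD bounds."""
    lower, upper = hpd_interval(samples, mass)
    return (
        f"For every increase in {name}, we expect to see an increase in Y by "
        f"{np.mean(samples):.2f} ({mass:.0%} HPD: [{lower:.2f}, {upper:.2f}])."
    )


# Stand-in draws for illustration only; real draws would come from the fitted model.
rng = np.random.default_rng(42)
fake_draws = rng.normal(loc=0.8, scale=0.1, size=5_000)
print(summarize_effect("log10(months in service)", fake_draws))
```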
Other notes
None.
Exploratory Data Analysis
The best interpretation of this result is that the log10 of the number of months a ship has been in service is the strongest positive contributor to the number of damage incidents that the ship sustains.
Posterior Predictive Checks
Let's see what the PPC looks like. We will sample 10,000 predicted values for each row in the dataframe, and plot the 95% HPD of the values.
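The sampling step can be sketched with NumPy alone, assuming a log link and posterior draws of the coefficients stored as an array with one row per draw. The design matrix, observed counts, and coefficient draws below are made-up stand-ins, and `hpd_per_column` is a hypothetical helper, not the notebook's code.

```python
import numpy as np


def hpd_per_column(draws, mass=0.95):
    """Narrowest interval containing `mass` of the draws, for each column."""
    sorted_draws = np.sort(draws, axis=0)
    n = sorted_draws.shape[0]
    window = int(np.ceil(mass * n))
    widths = sorted_draws[window - 1:] - sorted_draws[: n - window + 1]
    start = np.argmin(widths, axis=0)  # best window start for each column
    cols = np.arange(draws.shape[1])
    return sorted_draws[start, cols], sorted_draws[start + window - 1, cols]


rng = np.random.default_rng(0)

# Made-up data: intercept plus one explanatory variable, three rows.
X = np.array([[1.0, 0.5], [1.0, 1.0], [1.0, 1.5]])
observed = np.array([2, 4, 9])

# Stand-in posterior draws of the coefficients (one row per draw).
n_draws = 10_000
beta_draws = rng.normal(loc=[0.2, 1.2], scale=0.05, size=(n_draws, 2))

# One predicted count per posterior draw and per data row.
mu = np.exp(beta_draws @ X.T)  # shape (n_draws, n_rows)
predicted = rng.poisson(mu)    # shape (n_draws, n_rows)

lower, upper = hpd_per_column(predicted)
covered = (observed >= lower) & (observed <= upper)  # is each row inside its 95% HPD?
```

Rows where `covered` is `False` are the ones the model predicts poorly, which is what the plot of predictions against actual values makes visible.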
Let's plot the posterior distribution of predictions vs. actual.