Summary Statistics Tutorial
Summary statistics for continuous data fall into two categories: central tendency and dispersion. The mean and median describe central tendency; the standard deviation, variance and range describe the spread of the data.
Let's calculate the mean, median, standard deviation, variance and range of the radius_worst variable.
Range of radius_worst (minimum and maximum):
- 7.93
- 36.04
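As a minimal sketch of how these five statistics can be computed, the Python snippet below uses only the standard-library `statistics` module. The list `radius_worst` here holds made-up illustrative values, not the real column, and the variable names are assumptions for the example. Note that `stdev` and `variance` compute the sample versions (dividing by n - 1).

```python
import statistics

# Hypothetical values standing in for the radius_worst column;
# these numbers are illustrative only and are not taken from the dataset.
radius_worst = [12.5, 14.0, 16.5, 18.0, 25.0]

# Central tendency
mean_value = statistics.mean(radius_worst)
median_value = statistics.median(radius_worst)

# Dispersion
std_dev = statistics.stdev(radius_worst)      # sample standard deviation (n - 1)
variance = statistics.variance(radius_worst)  # sample variance (n - 1)
value_range = (min(radius_worst), max(radius_worst))  # (minimum, maximum)

print("mean:", mean_value)
print("median:", median_value)
print("standard deviation:", std_dev)
print("variance:", variance)
print("range:", value_range)
```

With a real data frame, the same statistics would be computed on the actual column rather than on a hand-written list.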
Correlation Tutorial
Correlation measures the strength of the relationship between two numeric vectors. The correlation coefficient conveys both the magnitude and the direction of that relationship. It ranges from -1 to 1: values near -1 indicate a strong negative relationship, values near +1 indicate a strong positive relationship, and values near 0 indicate little to no correlation.
There is a strong correlation between the mean smoothness and mean compactness measures.
However, there is little to no correlation between mean texture and mean symmetry measures.
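To make the coefficient concrete, here is a small sketch of the Pearson correlation coefficient written in plain Python. It assumes the tutorial's correlation is the Pearson coefficient, which the text does not state explicitly. The function name `pearson_correlation` and the sample vectors are hypothetical and are not values from the dataset.

```python
import math

def pearson_correlation(x, y):
    """Return the Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(ss_x * ss_y)

# Illustrative vectors (made up for this example)
x = [1, 2, 3, 4, 5]
increasing = [2, 4, 6, 8, 10]   # moves with x
decreasing = [10, 8, 6, 4, 2]   # moves against x

r_positive = pearson_correlation(x, increasing)
r_negative = pearson_correlation(x, decreasing)
print(r_positive, r_negative)
```

The two illustrative vectors are perfectly linear in `x`, so they sit at the two extremes of the coefficient's range; real measurements fall somewhere in between.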
Hypothesis Testing (Student's T-Test) Tutorial
To test whether the difference in mean measurements between two groups is significantly different from 0, we can use Student's t-test to calculate the p-value.
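The following is a rough sketch of a two-sample Student's t-test in plain Python, assuming equal variances in the two groups (the pooled-variance form). The function names, the numerical integration used for the p-value, and the sample groups are all assumptions made for illustration; in practice a statistics library such as SciPy (`scipy.stats.ttest_ind`) would normally be used instead.

```python
import math
import statistics

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(t * t / df))

def two_sided_p_value(t_stat, df, steps=10000):
    """Two-sided p-value, integrating the density over [0, |t|] with Simpson's rule."""
    upper = abs(t_stat)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # probability mass between 0 and |t|
    return max(0.0, 1.0 - 2.0 * area)

def student_t_test(group_a, group_b):
    """Return (t statistic, two-sided p-value), assuming equal variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    if pooled_var == 0:
        raise ValueError("t statistic is undefined when both groups are constant")
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / std_err
    return t_stat, two_sided_p_value(t_stat, df)

# Illustrative groups (made-up values, not from the dataset)
group_a = [10, 11, 12, 13, 14]
group_b = [20, 21, 22, 23, 24]
t_stat, p_value = student_t_test(group_a, group_b)
print(t_stat, p_value)
```

A small p-value (commonly below 0.05) suggests that the difference between the group means is unlikely to be 0. If the equal-variance assumption is doubtful, Welch's t-test is the usual alternative.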
Correlation Plot Tutorial
Other Plots Tutorial