{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Project 1: Deaths by tuberculosis\n", "\n", "by Gerry McGinty, 11 May 2018.\n", "\n", "This is the project notebook for Week 1 of The Open University's [_Learn to code for Data Analysis_](http://futurelearn.com/courses/learn-to-code) course.\n", "\n", "In 2000, the United Nations set eight Millennium Development Goals (MDGs) to reduce poverty and diseases, improve gender equality and environmental sustainability, etc. Each goal is quantified and time-bound, to be achieved by the end of 2015. Goal 6 is to have halted and started reversing the spread of HIV, malaria and tuberculosis (TB).\n", "TB doesn't make headlines like Ebola, SARS (severe acute respiratory syndrome) and other epidemics, but is far deadlier. For more information, see the World Health Organization (WHO) page .\n", "\n", "Given the population and number of deaths due to TB in some countries during one year, the following questions will be answered: \n", "\n", "- What is the total, maximum, minimum and average number of deaths in that year?\n", "- Which countries have the most and the least deaths?\n", "- What is the death rate (deaths per 100,000 inhabitants) for each country?\n", "- Which countries have the lowest and highest death rate?\n", "\n", "The death rate allows for a better comparison of countries with widely different population sizes." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## The data\n", "\n", "The data consists of total population and total number of deaths due to TB (excluding HIV) in 2013 in every country in the world.\n", "\n", "The data was taken in July 2015 from (population) and (deaths). The uncertainty bounds of the number of deaths were ignored.\n", "\n", "The data was collected into an Excel file which should be in the same folder as this notebook." 
] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"<thead>\n",
"<tr><th></th><th>Country</th><th>Population (1000s)</th><th>TB deaths</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>0</th><td>Afghanistan</td><td>30552</td><td>13000.00</td></tr>\n",
"<tr><th>1</th><td>Albania</td><td>3173</td><td>20.00</td></tr>\n",
"<tr><th>2</th><td>Algeria</td><td>39208</td><td>5100.00</td></tr>\n",
"<tr><th>3</th><td>Andorra</td><td>79</td><td>0.26</td></tr>\n",
"<tr><th>4</th><td>Angola</td><td>21472</td><td>6900.00</td></tr>\n",
"<tr><th>5</th><td>Antigua and Barbuda</td><td>90</td><td>1.20</td></tr>\n",
"<tr><th>6</th><td>Argentina</td><td>41446</td><td>570.00</td></tr>\n",
"<tr><th>7</th><td>Armenia</td><td>2977</td><td>170.00</td></tr>\n",
"<tr><th>8</th><td>Australia</td><td>23343</td><td>45.00</td></tr>\n",
"<tr><th>9</th><td>Austria</td><td>8495</td><td>29.00</td></tr>\n",
"<tr><th>10</th><td>Azerbaijan</td><td>9413</td><td>360.00</td></tr>\n",
"<tr><th>11</th><td>Bahamas</td><td>377</td><td>1.80</td></tr>\n",
"<tr><th>12</th><td>Bahrain</td><td>1332</td><td>9.60</td></tr>\n",
"<tr><th>13</th><td>Bangladesh</td><td>156595</td><td>80000.00</td></tr>\n",
"<tr><th>14</th><td>Barbados</td><td>285</td><td>2.00</td></tr>\n",
"<tr><th>15</th><td>Belarus</td><td>9357</td><td>850.00</td></tr>\n",
"<tr><th>16</th><td>Belgium</td><td>11104</td><td>18.00</td></tr>\n",
"<tr><th>17</th><td>Belize</td><td>332</td><td>20.00</td></tr>\n",
"<tr><th>18</th><td>Benin</td><td>10323</td><td>1300.00</td></tr>\n",
"<tr><th>19</th><td>Bhutan</td><td>754</td><td>88.00</td></tr>\n",
"<tr><th>20</th><td>Bolivia (Plurinational State of)</td><td>10671</td><td>430.00</td></tr>\n",
"<tr><th>21</th><td>Bosnia and Herzegovina</td><td>3829</td><td>190.00</td></tr>\n",
"<tr><th>22</th><td>Botswana</td><td>2021</td><td>440.00</td></tr>\n",
"<tr><th>23</th><td>Brazil</td><td>200362</td><td>4400.00</td></tr>\n",
"<tr><th>24</th><td>Brunei Darussalam</td><td>418</td><td>13.00</td></tr>\n",
"<tr><th>25</th><td>Bulgaria</td><td>7223</td><td>150.00</td></tr>\n",
"<tr><th>26</th><td>Burkina Faso</td><td>16935</td><td>1500.00</td></tr>\n",
"<tr><th>27</th><td>Burundi</td><td>10163</td><td>2300.00</td></tr>\n",
"<tr><th>28</th><td>Côte d'Ivoire</td><td>20316</td><td>4000.00</td></tr>\n",
"<tr><th>29</th><td>Cabo Verde</td><td>499</td><td>150.00</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>164</th><td>Suriname</td><td>539</td><td>12.00</td></tr>\n",
"<tr><th>165</th><td>Swaziland</td><td>1250</td><td>1100.00</td></tr>\n",
"<tr><th>166</th><td>Sweden</td><td>9571</td><td>13.00</td></tr>\n",
"<tr><th>167</th><td>Switzerland</td><td>8078</td><td>17.00</td></tr>\n",
"<tr><th>168</th><td>Syrian Arab Republic</td><td>21898</td><td>450.00</td></tr>\n",
"<tr><th>169</th><td>Tajikistan</td><td>8208</td><td>570.00</td></tr>\n",
"<tr><th>170</th><td>Thailand</td><td>67010</td><td>8100.00</td></tr>\n",
"<tr><th>171</th><td>The former Yugoslav republic of Macedonia</td><td>2107</td><td>33.00</td></tr>\n",
"<tr><th>172</th><td>Timor-Leste</td><td>1133</td><td>990.00</td></tr>\n",
"<tr><th>173</th><td>Togo</td><td>6817</td><td>810.00</td></tr>\n",
"<tr><th>174</th><td>Tonga</td><td>105</td><td>2.50</td></tr>\n",
"<tr><th>175</th><td>Trinidad and Tobago</td><td>1341</td><td>29.00</td></tr>\n",
"<tr><th>176</th><td>Tunisia</td><td>10997</td><td>230.00</td></tr>\n",
"<tr><th>177</th><td>Turkey</td><td>74933</td><td>310.00</td></tr>\n",
"<tr><th>178</th><td>Turkmenistan</td><td>5240</td><td>1300.00</td></tr>\n",
"<tr><th>179</th><td>Tuvalu</td><td>10</td><td>2.80</td></tr>\n",
"<tr><th>180</th><td>Uganda</td><td>37579</td><td>4100.00</td></tr>\n",
"<tr><th>181</th><td>Ukraine</td><td>45239</td><td>6600.00</td></tr>\n",
"<tr><th>182</th><td>United Arab Emirates</td><td>9346</td><td>64.00</td></tr>\n",
"<tr><th>183</th><td>United Kingdom of Great Britain and Northern I...</td><td>63136</td><td>340.00</td></tr>\n",
"<tr><th>184</th><td>United Republic of Tanzania</td><td>49253</td><td>6000.00</td></tr>\n",
"<tr><th>185</th><td>United States of America</td><td>320051</td><td>490.00</td></tr>\n",
"<tr><th>186</th><td>Uruguay</td><td>3407</td><td>40.00</td></tr>\n",
"<tr><th>187</th><td>Uzbekistan</td><td>28934</td><td>2200.00</td></tr>\n",
"<tr><th>188</th><td>Vanuatu</td><td>253</td><td>16.00</td></tr>\n",
"<tr><th>189</th><td>Venezuela (Bolivarian Republic of)</td><td>30405</td><td>480.00</td></tr>\n",
"<tr><th>190</th><td>Viet Nam</td><td>91680</td><td>17000.00</td></tr>\n",
"<tr><th>191</th><td>Yemen</td><td>24407</td><td>990.00</td></tr>\n",
"<tr><th>192</th><td>Zambia</td><td>14539</td><td>3600.00</td></tr>\n",
"<tr><th>193</th><td>Zimbabwe</td><td>14150</td><td>5700.00</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>194 rows × 3 columns</p>\n",
"</div>
" ] }, "execution_count": 9, "metadata": { }, "output_type": "execute_result" } ], "source": [ "import warnings\n", "warnings.simplefilter('ignore', FutureWarning)\n", "\n", "from pandas import *\n", "data = read_excel('WHO POP TB all.xls')\n", "data" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## The range of the problem\n", "\n", "The column of interest is the last one." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "tbColumn = data['TB deaths']" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "The total number of deaths in 2013 is:" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "1072677.97" ] }, "execution_count": 11, "metadata": { }, "output_type": "execute_result" } ], "source": [ "tbColumn.sum()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "The largest and smallest number of deaths in a single country are:" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "240000.0" ] }, "execution_count": 12, "metadata": { }, "output_type": "execute_result" } ], "source": [ "tbColumn.max()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0.0" ] }, "execution_count": 13, "metadata": { }, "output_type": "execute_result" } ], "source": [ "tbColumn.min()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "From zero (to 2 significant figures) to almost a quarter of a million deaths is a huge range. The average number of deaths, over all countries in the data, can give a better idea of the seriousness of the problem in each country.\n", "The average can be computed as the mean or the median. 
Given the wide range of deaths, the median is probably a more sensible average measure." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "tbColumn.mean()" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "315.0" ] }, "execution_count": 14, "metadata": { }, "output_type": "execute_result" } ], "source": [ "tbColumn.median()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "The median is far lower than the mean. This indicates that some of the countries had a very high number of TB deaths in 2013, pushing the value of the mean up." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## The most affected\n", "\n", "To see the most affected countries, the table is sorted in ascending order by the last column, which puts those countries in the last rows." ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"<thead>\n",
"<tr><th></th><th>Country</th><th>Population (1000s)</th><th>TB deaths</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>147</th><td>San Marino</td><td>31</td><td>0.00</td></tr>\n",
"<tr><th>125</th><td>Niue</td><td>1</td><td>0.01</td></tr>\n",
"<tr><th>111</th><td>Monaco</td><td>38</td><td>0.03</td></tr>\n",
"<tr><th>3</th><td>Andorra</td><td>79</td><td>0.26</td></tr>\n",
"<tr><th>129</th><td>Palau</td><td>21</td><td>0.36</td></tr>\n",
"<tr><th>40</th><td>Cook Islands</td><td>21</td><td>0.41</td></tr>\n",
"<tr><th>118</th><td>Nauru</td><td>10</td><td>0.67</td></tr>\n",
"<tr><th>76</th><td>Iceland</td><td>330</td><td>0.93</td></tr>\n",
"<tr><th>68</th><td>Grenada</td><td>106</td><td>1.10</td></tr>\n",
"<tr><th>5</th><td>Antigua and Barbuda</td><td>90</td><td>1.20</td></tr>\n",
"<tr><th>113</th><td>Montenegro</td><td>621</td><td>1.20</td></tr>\n",
"<tr><th>152</th><td>Seychelles</td><td>93</td><td>1.40</td></tr>\n",
"<tr><th>105</th><td>Malta</td><td>429</td><td>1.50</td></tr>\n",
"<tr><th>143</th><td>Saint Kitts and Nevis</td><td>54</td><td>1.60</td></tr>\n",
"<tr><th>11</th><td>Bahamas</td><td>377</td><td>1.80</td></tr>\n",
"<tr><th>14</th><td>Barbados</td><td>285</td><td>2.00</td></tr>\n",
"<tr><th>144</th><td>Saint Lucia</td><td>182</td><td>2.20</td></tr>\n",
"<tr><th>99</th><td>Luxembourg</td><td>530</td><td>2.20</td></tr>\n",
"<tr><th>44</th><td>Cyprus</td><td>1141</td><td>2.30</td></tr>\n",
"<tr><th>174</th><td>Tonga</td><td>105</td><td>2.50</td></tr>\n",
"<tr><th>50</th><td>Dominica</td><td>72</td><td>2.70</td></tr>\n",
"<tr><th>137</th><td>Qatar</td><td>2169</td><td>2.70</td></tr>\n",
"<tr><th>179</th><td>Tuvalu</td><td>10</td><td>2.80</td></tr>\n",
"<tr><th>145</th><td>Saint Vincent and the Grenadines</td><td>109</td><td>3.10</td></tr>\n",
"<tr><th>126</th><td>Norway</td><td>5043</td><td>4.40</td></tr>\n",
"<tr><th>146</th><td>Samoa</td><td>190</td><td>6.10</td></tr>\n",
"<tr><th>121</th><td>New Zealand</td><td>4506</td><td>6.30</td></tr>\n",
"<tr><th>103</th><td>Maldives</td><td>345</td><td>7.60</td></tr>\n",
"<tr><th>12</th><td>Bahrain</td><td>1332</td><td>9.60</td></tr>\n",
"<tr><th>164</th><td>Suriname</td><td>539</td><td>12.00</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>160</th><td>South Sudan</td><td>11296</td><td>4500.00</td></tr>\n",
"<tr><th>119</th><td>Nepal</td><td>27797</td><td>4600.00</td></tr>\n",
"<tr><th>2</th><td>Algeria</td><td>39208</td><td>5100.00</td></tr>\n",
"<tr><th>193</th><td>Zimbabwe</td><td>14150</td><td>5700.00</td></tr>\n",
"<tr><th>184</th><td>United Republic of Tanzania</td><td>49253</td><td>6000.00</td></tr>\n",
"<tr><th>181</th><td>Ukraine</td><td>45239</td><td>6600.00</td></tr>\n",
"<tr><th>46</th><td>Democratic People's Republic of Korea</td><td>24895</td><td>6700.00</td></tr>\n",
"<tr><th>4</th><td>Angola</td><td>21472</td><td>6900.00</td></tr>\n",
"<tr><th>158</th><td>Somalia</td><td>10496</td><td>7700.00</td></tr>\n",
"<tr><th>31</th><td>Cameroon</td><td>22254</td><td>7800.00</td></tr>\n",
"<tr><th>170</th><td>Thailand</td><td>67010</td><td>8100.00</td></tr>\n",
"<tr><th>88</th><td>Kenya</td><td>44354</td><td>9100.00</td></tr>\n",
"<tr><th>163</th><td>Sudan</td><td>37964</td><td>9700.00</td></tr>\n",
"<tr><th>30</th><td>Cambodia</td><td>15135</td><td>10000.00</td></tr>\n",
"<tr><th>100</th><td>Madagascar</td><td>22925</td><td>12000.00</td></tr>\n",
"<tr><th>0</th><td>Afghanistan</td><td>30552</td><td>13000.00</td></tr>\n",
"<tr><th>141</th><td>Russian Federation</td><td>142834</td><td>17000.00</td></tr>\n",
"<tr><th>190</th><td>Viet Nam</td><td>91680</td><td>17000.00</td></tr>\n",
"<tr><th>115</th><td>Mozambique</td><td>25834</td><td>18000.00</td></tr>\n",
"<tr><th>159</th><td>South Africa</td><td>52776</td><td>25000.00</td></tr>\n",
"<tr><th>116</th><td>Myanmar</td><td>53259</td><td>26000.00</td></tr>\n",
"<tr><th>134</th><td>Philippines</td><td>98394</td><td>27000.00</td></tr>\n",
"<tr><th>58</th><td>Ethiopia</td><td>94101</td><td>30000.00</td></tr>\n",
"<tr><th>36</th><td>China</td><td>1393337</td><td>41000.00</td></tr>\n",
"<tr><th>47</th><td>Democratic Republic of the Congo</td><td>67514</td><td>46000.00</td></tr>\n",
"<tr><th>128</th><td>Pakistan</td><td>182143</td><td>49000.00</td></tr>\n",
"<tr><th>78</th><td>Indonesia</td><td>249866</td><td>64000.00</td></tr>\n",
"<tr><th>13</th><td>Bangladesh</td><td>156595</td><td>80000.00</td></tr>\n",
"<tr><th>124</th><td>Nigeria</td><td>173615</td><td>160000.00</td></tr>\n",
"<tr><th>77</th><td>India</td><td>1252140</td><td>240000.00</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>194 rows × 3 columns</p>\n",
"</div>
" ] }, "execution_count": 15, "metadata": { }, "output_type": "execute_result" } ], "source": [ "data.sort_values('TB deaths')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "The table raises the possibility that a large number of deaths may be partly due to a large population. To compare the countries on an equal footing, the death rate per 100,000 inhabitants is computed." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<div>\n",
"<table>\n",
"<thead>\n",
"<tr><th></th><th>Country</th><th>Population (1000s)</th><th>TB deaths</th><th>TB deaths (per 100,000)</th></tr>\n",
"</thead>\n",
"<tbody>\n",
"<tr><th>147</th><td>San Marino</td><td>31</td><td>0.00</td><td>0.000000</td></tr>\n",
"<tr><th>111</th><td>Monaco</td><td>38</td><td>0.03</td><td>0.078947</td></tr>\n",
"<tr><th>126</th><td>Norway</td><td>5043</td><td>4.40</td><td>0.087250</td></tr>\n",
"<tr><th>120</th><td>Netherlands</td><td>16759</td><td>20.00</td><td>0.119339</td></tr>\n",
"<tr><th>137</th><td>Qatar</td><td>2169</td><td>2.70</td><td>0.124481</td></tr>\n",
"<tr><th>166</th><td>Sweden</td><td>9571</td><td>13.00</td><td>0.135827</td></tr>\n",
"<tr><th>121</th><td>New Zealand</td><td>4506</td><td>6.30</td><td>0.139814</td></tr>\n",
"<tr><th>185</th><td>United States of America</td><td>320051</td><td>490.00</td><td>0.153101</td></tr>\n",
"<tr><th>16</th><td>Belgium</td><td>11104</td><td>18.00</td><td>0.162104</td></tr>\n",
"<tr><th>32</th><td>Canada</td><td>35182</td><td>62.00</td><td>0.176226</td></tr>\n",
"<tr><th>8</th><td>Australia</td><td>23343</td><td>45.00</td><td>0.192777</td></tr>\n",
"<tr><th>113</th><td>Montenegro</td><td>621</td><td>1.20</td><td>0.193237</td></tr>\n",
"<tr><th>44</th><td>Cyprus</td><td>1141</td><td>2.30</td><td>0.201578</td></tr>\n",
"<tr><th>82</th><td>Israel</td><td>7733</td><td>16.00</td><td>0.206905</td></tr>\n",
"<tr><th>167</th><td>Switzerland</td><td>8078</td><td>17.00</td><td>0.210448</td></tr>\n",
"<tr><th>45</th><td>Czech Republic</td><td>10702</td><td>28.00</td><td>0.261633</td></tr>\n",
"<tr><th>76</th><td>Iceland</td><td>330</td><td>0.93</td><td>0.281818</td></tr>\n",
"<tr><th>60</th><td>Finland</td><td>5426</td><td>17.00</td><td>0.313306</td></tr>\n",
"<tr><th>43</th><td>Cuba</td><td>11266</td><td>37.00</td><td>0.328422</td></tr>\n",
"<tr><th>3</th><td>Andorra</td><td>79</td><td>0.26</td><td>0.329114</td></tr>\n",
"<tr><th>9</th><td>Austria</td><td>8495</td><td>29.00</td><td>0.341377</td></tr>\n",
"<tr><th>105</th><td>Malta</td><td>429</td><td>1.50</td><td>0.349650</td></tr>\n",
"<tr><th>65</th><td>Germany</td><td>82727</td><td>300.00</td><td>0.362639</td></tr>\n",
"<tr><th>81</th><td>Ireland</td><td>4627</td><td>18.00</td><td>0.389021</td></tr>\n",
"<tr><th>177</th><td>Turkey</td><td>74933</td><td>310.00</td><td>0.413703</td></tr>\n",
"<tr><th>99</th><td>Luxembourg</td><td>530</td><td>2.20</td><td>0.415094</td></tr>\n",
"<tr><th>48</th><td>Denmark</td><td>5619</td><td>24.00</td><td>0.427122</td></tr>\n",
"<tr><th>11</th><td>Bahamas</td><td>377</td><td>1.80</td><td>0.477454</td></tr>\n",
"<tr><th>86</th><td>Jordan</td><td>7274</td><td>35.00</td><td>0.481166</td></tr>\n",
"<tr><th>83</th><td>Italy</td><td>60990</td><td>310.00</td><td>0.508280</td></tr>\n",
"<tr><th>...</th><td>...</td><td>...</td><td>...</td><td>...</td></tr>\n",
"<tr><th>29</th><td>Cabo Verde</td><td>499</td><td>150.00</td><td>30.060120</td></tr>\n",
"<tr><th>58</th><td>Ethiopia</td><td>94101</td><td>30000.00</td><td>31.880639</td></tr>\n",
"<tr><th>4</th><td>Angola</td><td>21472</td><td>6900.00</td><td>32.134873</td></tr>\n",
"<tr><th>131</th><td>Papua New Guinea</td><td>7321</td><td>2400.00</td><td>32.782407</td></tr>\n",
"<tr><th>31</th><td>Cameroon</td><td>22254</td><td>7800.00</td><td>35.049879</td></tr>\n",
"<tr><th>106</th><td>Marshall Islands</td><td>53</td><td>21.00</td><td>39.622642</td></tr>\n",
"<tr><th>160</th><td>South Sudan</td><td>11296</td><td>4500.00</td><td>39.837110</td></tr>\n",
"<tr><th>193</th><td>Zimbabwe</td><td>14150</td><td>5700.00</td><td>40.282686</td></tr>\n",
"<tr><th>0</th><td>Afghanistan</td><td>30552</td><td>13000.00</td><td>42.550406</td></tr>\n",
"<tr><th>153</th><td>Sierra Leone</td><td>6092</td><td>2600.00</td><td>42.678923</td></tr>\n",
"<tr><th>39</th><td>Congo</td><td>4448</td><td>2000.00</td><td>44.964029</td></tr>\n",
"<tr><th>95</th><td>Lesotho</td><td>2074</td><td>960.00</td><td>46.287367</td></tr>\n",
"<tr><th>159</th><td>South Africa</td><td>52776</td><td>25000.00</td><td>47.370017</td></tr>\n",
"<tr><th>33</th><td>Central African Republic</td><td>4616</td><td>2200.00</td><td>47.660312</td></tr>\n",
"<tr><th>116</th><td>Myanmar</td><td>53259</td><td>26000.00</td><td>48.818040</td></tr>\n",
"<tr><th>96</th><td>Liberia</td><td>4294</td><td>2100.00</td><td>48.905449</td></tr>\n",
"<tr><th>13</th><td>Bangladesh</td><td>156595</td><td>80000.00</td><td>51.087199</td></tr>\n",
"<tr><th>100</th><td>Madagascar</td><td>22925</td><td>12000.00</td><td>52.344602</td></tr>\n",
"<tr><th>92</th><td>Lao People's Democratic Republic</td><td>6770</td><td>3600.00</td><td>53.175775</td></tr>\n",
"<tr><th>62</th><td>Gabon</td><td>1672</td><td>910.00</td><td>54.425837</td></tr>\n",
"<tr><th>117</th><td>Namibia</td><td>2303</td><td>1300.00</td><td>56.448111</td></tr>\n",
"<tr><th>30</th><td>Cambodia</td><td>15135</td><td>10000.00</td><td>66.072019</td></tr>\n",
"<tr><th>47</th><td>Democratic Republic of the Congo</td><td>67514</td><td>46000.00</td><td>68.134017</td></tr>\n",
"<tr><th>115</th><td>Mozambique</td><td>25834</td><td>18000.00</td><td>69.675621</td></tr>\n",
"<tr><th>71</th><td>Guinea-Bissau</td><td>1704</td><td>1200.00</td><td>70.422535</td></tr>\n",
"<tr><th>158</th><td>Somalia</td><td>10496</td><td>7700.00</td><td>73.361280</td></tr>\n",
"<tr><th>172</th><td>Timor-Leste</td><td>1133</td><td>990.00</td><td>87.378641</td></tr>\n",
"<tr><th>165</th><td>Swaziland</td><td>1250</td><td>1100.00</td><td>88.000000</td></tr>\n",
"<tr><th>124</th><td>Nigeria</td><td>173615</td><td>160000.00</td><td>92.157936</td></tr>\n",
"<tr><th>49</th><td>Djibouti</td><td>873</td><td>870.00</td><td>99.656357</td></tr>\n",
"</tbody>\n",
"</table>\n",
"<p>194 rows × 4 columns</p>\n",
"</div>
" ] }, "execution_count": 16, "metadata": { }, "output_type": "execute_result" } ], "source": [ "populationColumn = data['Population (1000s)']\n", "data['TB deaths (per 100,000)'] = tbColumn * 100 / populationColumn\n", "data.sort_values('TB deaths (per 100,000)')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Conclusions\n", "\n", "The total number of deaths in the world in 2013 due to TB is over 1 million. The median shows that half of these coutries had fewer than 315. The much higher mean (over 5500) indicates that some countries had a very high number. The least affected were San Marino and Niue, with 0 and 0.01 deaths respectively, and the most affected were China and India with 160 thousand and 240 thousand deaths in a single year. However, taking the population size into account, the least affected were San Marino (again) and Monaco with less than 0.08 deaths per 100 thousand inhabitants, and the most affected were Nigeria and Djinouti with over 92 deaths per 100,000 inhabitants.\n", "\n", "One should not forget that most values are estimates, and that the chosen countries are a small sample of all the world's countries. Nevertheless, they convey the message that TB is still a major cause of fatalities, and that there is a huge disparity between countries, with several ones being highly affected." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (Ubuntu Linux)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.2" } }, "nbformat": 4, "nbformat_minor": 0 }