{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "As with `NumPy`, there's a fairly standard `import` statement that gets used for Pandas. Our textbook author also mentions importing `DataFrame` and `Series` specifically since these are used so frequently." ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "import pandas as pd\n", "import numpy as np\n", "\n", "from pandas import Series, DataFrame" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Getting the FIPS data\n", "\n", "In the NumPy version of this exercise, we manually downloaded and parsed a file that contained our FIPS code database. Pandas provides some pre-built parsing capabilities, and is able to parse information from \n", "* CSV (Comma Separated Values)\n", "* tab-delimited files\n", "* JSON\n", "* HTML tables\n", "* Data stored in an SQL database\n", "\n", "The USDA has a page that contains FIPS codes. First let's try just plain `pd.read_html` to see what it does." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "15\n" ] } ], "source": [ "url = 'https://www.nrcs.usda.gov/wps/portal/nrcs/detail/?cid=nrcs143_013696'\n", "tables = pd.read_html(url)\n", "print(len(tables))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Yikes, that's a lot of tables (for those of you who have some HTML experience, this is another reason why table-based page layout is not a very good idea!) \n", "\n", "Let's refine our search to look for one that contains the text **\"FIPS\"**." 
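(As an aside, each of the text-based formats above has a dedicated reader function that returns a `DataFrame`. A minimal sketch using small in-memory documents — the data here is made up for illustration; `pd.read_html` and `pd.read_sql` follow the same pattern, given a URL/HTML string or a query plus database connection:)

```python
import io

import pandas as pd

csv_text = "Name,FIPS\nAlabama,1\nAlaska,2\n"

# CSV and tab-delimited files both go through read_csv; sep controls the delimiter
df_csv = pd.read_csv(io.StringIO(csv_text))
df_tsv = pd.read_csv(io.StringIO(csv_text.replace(",", "\t")), sep="\t")

# JSON documents parse the same way
df_json = pd.read_json(io.StringIO('[{"Name": "Alabama", "FIPS": 1}]'))
```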
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "10\n" ] } ], "source": [ "tables = pd.read_html(url, match=\"FIPS\")\n", "print(len(tables))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "That's a little better, but not great. Here's a version where you identify the table by an attribute it has in the HTML - this requires a bit of HTML knowledge, but it works best. It would be even better if our table had an **id** attribute, but it doesn't." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "1\n" ] } ], "source": [ "tables = pd.read_html(url, attrs={'class':'data'})\n", "print(len(tables))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "OK, now that we've got it down to 1 table, we can pick it out and see what it looks like." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
0AlabamaAL1
1AlaskaAK2
2ArizonaAZ4
3ArkansasAR5
4CaliforniaCA6
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "0 Alabama AL 1\n", "1 Alaska AK 2\n", "2 Arizona AZ 4\n", "3 Arkansas AR 5\n", "4 California CA 6" ] }, "execution_count": 5, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips = tables[0]\n", "fips.head()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now look at the data at the end (when adjustments are needed, they usually occur at the beginning or the end of the table).\n", "\n" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
45VirginiaVA51
46WashingtonWA53
47West VirginiaWV54
48WisconsinWI55
49WyomingWY56
50American SamoaAS60
51GuamGU66
52Northern Mariana IslandsMP69
53Puerto RicoPR72
54Virgin IslandsVI78
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "45 Virginia VA 51\n", "46 Washington WA 53\n", "47 West Virginia WV 54\n", "48 Wisconsin WI 55\n", "49 Wyoming WY 56\n", "50 American Samoa AS 60\n", "51 Guam GU 66\n", "52 Northern Mariana Islands MP 69\n", "53 Puerto Rico PR 72\n", "54 Virgin Islands VI 78" ] }, "execution_count": 6, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips.tail(n=10)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Let's get rid of the last 5 rows, since they correspond to territories and not states. We can apply the same type of array slicing syntax that worked in NumPy." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
45VirginiaVA51
46WashingtonWA53
47West VirginiaWV54
48WisconsinWI55
49WyomingWY56
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "45 Virginia VA 51\n", "46 Washington WA 53\n", "47 West Virginia WV 54\n", "48 Wisconsin WI 55\n", "49 Wyoming WY 56" ] }, "execution_count": 7, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips = fips[0:-5]\n", "fips.tail()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Since the Census site uses strings with leading zeros to identify a state, let's transform the FIPS column. First, pull out the desired column of the `DataFrame`, assign it to a variable, and print out its type." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.series.Series'>\n", "0 1\n", "1 2\n", "2 4\n", "3 5\n", "4 6\n", "Name: FIPS, dtype: int64\n" ] } ], "source": [ "fips_codes = fips['FIPS']\n", "print(type(fips_codes))\n", "print(fips_codes.head())" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now use a **list comprehension** to generate a list containing the string versions of the FIPS codes." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "['1',\n", " '2',\n", " '4',\n", " '5',\n", " '6',\n", " '8',\n", " '9',\n", " '10',\n", " '12',\n", " '13',\n", " '15',\n", " '16',\n", " '17',\n", " '18',\n", " '19',\n", " '20',\n", " '21',\n", " '22',\n", " '23',\n", " '24',\n", " '25',\n", " '26',\n", " '27',\n", " '28',\n", " '29',\n", " '30',\n", " '31',\n", " '32',\n", " '33',\n", " '34',\n", " '35',\n", " '36',\n", " '37',\n", " '38',\n", " '39',\n", " '40',\n", " '41',\n", " '42',\n", " '44',\n", " '45',\n", " '46',\n", " '47',\n", " '48',\n", " '49',\n", " '50',\n", " '51',\n", " '53',\n", " '54',\n", " '55',\n", " '56']" ] }, "execution_count": 9, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_codes_as_strings = [ str(code) for 
code in fips[\"FIPS\"] ]\n", "fips_codes_as_strings" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Pandas gives us a simpler way to convert data types, however, using the `astype` **method** on a `Series` object." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0 1\n", "1 2\n", "2 4\n", "3 5\n", "4 6\n", "5 8\n", "6 9\n", "7 10\n", "8 12\n", "9 13\n", "10 15\n", "11 16\n", "12 17\n", "13 18\n", "14 19\n", "15 20\n", "16 21\n", "17 22\n", "18 23\n", "19 24\n", "20 25\n", "21 26\n", "22 27\n", "23 28\n", "24 29\n", "25 30\n", "26 31\n", "27 32\n", "28 33\n", "29 34\n", "30 35\n", "31 36\n", "32 37\n", "33 38\n", "34 39\n", "35 40\n", "36 41\n", "37 42\n", "38 44\n", "39 45\n", "40 46\n", "41 47\n", "42 48\n", "43 49\n", "44 50\n", "45 51\n", "46 53\n", "47 54\n", "48 55\n", "49 56\n", "Name: FIPS, dtype: string" ] }, "execution_count": 10, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_codes_as_strings = fips_codes.astype('string')\n", "fips_codes_as_strings" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "But, we want these to have a leading zero for values under 10. Let's write a function to do this for us." 
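(For comparison: once the codes are strings, the `.str` accessor can do this padding in one vectorized step with `zfill`. A minimal sketch on toy data; below we build the same behavior by hand to see how element-wise conversion works.)

```python
import pandas as pd

codes = pd.Series([1, 2, 4, 56], name="FIPS")

# zfill left-pads each string with zeros to the requested width
padded = codes.astype("string").str.zfill(2)
```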
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "def convert(value):\n", "    if value >= 10:\n", "        return str(value)\n", "    else:\n", "        return '0' + str(value)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Try it with a few values from `fips`: the first, one in the middle, and the last.\n", "\n" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "01\n", "13\n", "56\n" ] } ], "source": [ "print(convert(fips[\"FIPS\"][0]))\n", "print(convert(fips[\"FIPS\"][9]))\n", "print(convert(fips[\"FIPS\"][49]))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Note that the following call doesn't work. Why not? (Hint: plain `[ ]` indexing on a `Series` with an integer index looks up **labels**, and there is no label `-1`.)\n", "\n" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "fips_series = fips[\"FIPS\"]\n", "\n", "#fips_series[-1]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "The `Series` and `DataFrame` objects provide a separate indexer named `iloc` for working positionally, rather than with **index values**." ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0 1\n", "1 2\n", "2 4\n", "3 5\n", "4 6\n", "5 8\n", "6 9\n", "7 10\n", "8 12\n", "9 13\n", "Name: FIPS, dtype: int64" ] }, "execution_count": 14, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips[\"FIPS\"].iloc[0:10]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now we can use the `map` method along with `convert` to get new values for each of the values in the \"FIPS\" column. `map` is a method of a `Series` that returns a **new** `Series` based on calling a function on each value of the original `Series`."
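A tiny standalone illustration of that behavior (toy values, not the real FIPS data):

```python
import pandas as pd

s = pd.Series([1, 9, 26])

# map calls the function once per element and collects the results in a new Series
doubled = s.map(lambda v: v * 2)

# the same zero-padding idea as convert, written inline; s itself is untouched
padded = s.map(lambda v: '0' + str(v) if v < 10 else str(v))
```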
] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false }, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_428/1832877244.py:1: SettingWithCopyWarning: \n", "A value is trying to be set on a copy of a slice from a DataFrame.\n", "Try using .loc[row_indexer,col_indexer] = value instead\n", "\n", "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n", " fips[\"FIPS\"] = fips[\"FIPS\"].map(convert)\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
0AlabamaAL01
1AlaskaAK02
2ArizonaAZ04
3ArkansasAR05
4CaliforniaCA06
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "0 Alabama AL 01\n", "1 Alaska AK 02\n", "2 Arizona AZ 04\n", "3 Arkansas AR 05\n", "4 California CA 06" ] }, "execution_count": 15, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips[\"FIPS\"] = fips[\"FIPS\"].map(convert)\n", "fips.head()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "And now we can replace the index with the FIPS values, and get rid of the FIPS column. First remind ourselves of the structure of our original `DataFrame`." ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
0AlabamaAL01
1AlaskaAK02
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "0 Alabama AL 01\n", "1 Alaska AK 02" ] }, "execution_count": 16, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips.head(n=2)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now create a new `DataFrame` named `fips_new` with the converted FIPS `Series` serving as the **index**." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "fips_new = DataFrame(data={\"Name\": fips[\"Name\"], \"Postal Code\": fips[\"Postal Code\"]}, index=fips[\"FIPS\"])" ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal Code
FIPS
01NaNNaN
02NaNNaN
04NaNNaN
05NaNNaN
06NaNNaN
\n", "
" ], "text/plain": [ " Name Postal Code\n", "FIPS \n", "01 NaN NaN\n", "02 NaN NaN\n", "04 NaN NaN\n", "05 NaN NaN\n", "06 NaN NaN" ] }, "execution_count": 18, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.head(n=5)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Ugh, that doesn't seem to have worked very well! When combining `Series` objects like this, Pandas always aligns values that have the same **index**. Since our new `DataFrame` uses a completely different index than the original `Series` objects, this approach doesn't work." ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "RangeIndex(start=0, stop=50, step=1)\n", "Index(['01', '02', '04', '05', '06', '08', '09', '10', '12', '13', '15', '16',\n", " '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28',\n", " '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40',\n", " '41', '42', '44', '45', '46', '47', '48', '49', '50', '51', '53', '54',\n", " '55', '56'],\n", " dtype='object', name='FIPS')\n" ] } ], "source": [ "print(fips.index)\n", "print(fips_new.index)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Here's one approach we can try: using the `values` property of a `Series` object to get a more \"raw\" form of the data."
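The alignment behavior is easy to reproduce on a toy example: handing `DataFrame` a labeled `Series` along with a different index yields all-`NaN` values, while the raw `.values` array is placed positionally.

```python
import pandas as pd

s = pd.Series([10, 20], index=["a", "b"])

# The Series is reindexed against ["x", "y"]; no labels match, so every value is NaN
df_aligned = pd.DataFrame({"col": s}, index=["x", "y"])

# A bare ndarray carries no labels, so values are placed by position instead
df_raw = pd.DataFrame({"col": s.values}, index=["x", "y"])
```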
] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['Alabama' 'Alaska' 'Arizona' 'Arkansas' 'California' 'Colorado'\n", " 'Connecticut' 'Delaware' 'Florida' 'Georgia' 'Hawaii' 'Idaho' 'Illinois'\n", " 'Indiana' 'Iowa' 'Kansas' 'Kentucky' 'Louisiana' 'Maine' 'Maryland'\n", " 'Massachusetts' 'Michigan' 'Minnesota' 'Mississippi' 'Missouri' 'Montana'\n", " 'Nebraska' 'Nevada' 'New Hampshire' 'New Jersey' 'New Mexico' 'New York'\n", " 'North Carolina' 'North Dakota' 'Ohio' 'Oklahoma' 'Oregon' 'Pennsylvania'\n", " 'Rhode Island' 'South Carolina' 'South Dakota' 'Tennessee' 'Texas' 'Utah'\n", " 'Vermont' 'Virginia' 'Washington' 'West Virginia' 'Wisconsin' 'Wyoming']\n" ] } ], "source": [ "print(fips[\"Name\"].values)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "What might the type of `fips[\"Name\"].values` be?" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'numpy.ndarray'>\n" ] } ], "source": [ "print(type(fips[\"Name\"].values))" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Let's try to create our new data frame using the **values** from the Series, so that we're not trying to match indexes." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePost Code
FIPS
01AlabamaAL
02AlaskaAK
04ArizonaAZ
05ArkansasAR
06CaliforniaCA
\n", "
" ], "text/plain": [ " Name Post Code\n", "FIPS \n", "01 Alabama AL\n", "02 Alaska AK\n", "04 Arizona AZ\n", "05 Arkansas AR\n", "06 California CA" ] }, "execution_count": 22, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new = DataFrame(data={\"Name\": fips[\"Name\"].values, \"Post Code\": fips[\"Postal Code\"].values}, index=fips[\"FIPS\"])\n", "fips_new.head()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Use the `info` method to see some details about the `fips_new` `DataFrame`." ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "<class 'pandas.core.frame.DataFrame'>\n", "Index: 50 entries, 01 to 56\n", "Data columns (total 2 columns):\n", " # Column Non-Null Count Dtype \n", "--- ------ -------------- ----- \n", " 0 Name 50 non-null object\n", " 1 Post Code 50 non-null object\n", "dtypes: object(2)\n", "memory usage: 1.2+ KB\n" ] } ], "source": [ "fips_new.info()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Take a look at the **index** associated with `fips_new`."
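(Aside: a labeled index effectively turns the frame into a keyed lookup table. On toy data — two made-up rows, not the full table — calling `to_dict` on a column makes that explicit.)

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alabama", "Michigan"]},
                  index=pd.Index(["01", "26"], name="FIPS"))

# A single column plus the index collapses to an ordinary dict keyed by FIPS code
lookup = df["Name"].to_dict()
```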
] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Index(['01', '02', '04', '05', '06', '08', '09', '10', '12', '13', '15', '16',\n", " '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28',\n", " '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40',\n", " '41', '42', '44', '45', '46', '47', '48', '49', '50', '51', '53', '54',\n", " '55', '56'],\n", " dtype='object', name='FIPS')\n" ] } ], "source": [ "print(fips_new.index)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now we can get the information about Michigan like this:" ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Name Michigan\n", "Post Code MI\n", "Name: 26, dtype: object\n", "Name Alabama\n", "Post Code AL\n", "Name: 01, dtype: object\n" ] } ], "source": [ "# Get row by index key value\n", "print(fips_new.loc['26'])\n", "# Get row by position\n", "print(fips_new.iloc[0])" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "### Other ways to create the new `DataFrame`" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "First, we can use the `set_index` method, which uses an **existing column** as the index for the `DataFrame`." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
0AlabamaAL01
1AlaskaAK02
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "0 Alabama AL 01\n", "1 Alaska AK 02" ] }, "execution_count": 26, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips.head(n=2)" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
FIPS
01AlabamaAL01
02AlaskaAK02
04ArizonaAZ04
05ArkansasAR05
06CaliforniaCA06
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "FIPS \n", "01 Alabama AL 01\n", "02 Alaska AK 02\n", "04 Arizona AZ 04\n", "05 Arkansas AR 05\n", "06 California CA 06" ] }, "execution_count": 27, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new = fips.set_index(keys=\"FIPS\", inplace=False, drop=False)\n", "fips_new.head(n=5)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "It kind of looks like there's an index value **FIPS**. Let's see if there is." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "# We can't just use the [ ] notation to try to get an entry out by index, since for a DataFrame \n", "# this will look for a column with that name.\n" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "ename": "KeyError", "evalue": "'FIPS'", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36mget_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m 3620\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 3621\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcasted_key\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3622\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0merr\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/_libs/index.pyx\u001b[0m in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc\u001b[0;34m()\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/_libs/index.pyx\u001b[0m in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc\u001b[0;34m()\u001b[0m\n", "\u001b[0;32mpandas/_libs/hashtable_class_helper.pxi\u001b[0m in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item\u001b[0;34m()\u001b[0m\n", "\u001b[0;32mpandas/_libs/hashtable_class_helper.pxi\u001b[0m in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item\u001b[0;34m()\u001b[0m\n", "\u001b[0;31mKeyError\u001b[0m: 'FIPS'", "\nThe above exception was the direct cause of the following exception:\n", "\u001b[0;31mKeyError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/tmp/ipykernel_428/3242763737.py\u001b[0m in \u001b[0;36m\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mfips_new\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mloc\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m'FIPS'\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m__getitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m 965\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 966\u001b[0m \u001b[0mmaybe_callable\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mcom\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mapply_if_callable\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 967\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_getitem_axis\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mmaybe_callable\u001b[0m\u001b[0;34m,\u001b[0m 
\u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 968\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 969\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_is_scalar_access\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mkey\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m_getitem_axis\u001b[0;34m(self, key, axis)\u001b[0m\n\u001b[1;32m 1200\u001b[0m \u001b[0;31m# fall thru to straight lookup\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1201\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_validate_key\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1202\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_get_label\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1203\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1204\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_get_slice_axis\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mslice_obj\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mslice\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 
"\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/core/indexing.py\u001b[0m in \u001b[0;36m_get_label\u001b[0;34m(self, label, axis)\u001b[0m\n\u001b[1;32m 1151\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_get_label\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mlabel\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mint\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1152\u001b[0m \u001b[0;31m# GH#5667 this will fail if the label is not present in the axis.\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 1153\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mobj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mxs\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mlabel\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0maxis\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0maxis\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 1154\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 1155\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0m_handle_lowerdim_multi_index_axis0\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mtup\u001b[0m\u001b[0;34m:\u001b[0m \u001b[0mtuple\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/core/generic.py\u001b[0m in \u001b[0;36mxs\u001b[0;34m(self, key, axis, level, drop_level)\u001b[0m\n\u001b[1;32m 3874\u001b[0m \u001b[0mnew_index\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mindex\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mloc\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3875\u001b[0m \u001b[0;32melse\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 
3876\u001b[0;31m \u001b[0mloc\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mindex\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3877\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3878\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0misinstance\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mloc\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mndarray\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;32m/usr/local/lib/python3.8/dist-packages/pandas/core/indexes/base.py\u001b[0m in \u001b[0;36mget_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m 3621\u001b[0m \u001b[0;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_engine\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mget_loc\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcasted_key\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3622\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mKeyError\u001b[0m \u001b[0;32mas\u001b[0m \u001b[0merr\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m-> 3623\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mKeyError\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mkey\u001b[0m\u001b[0;34m)\u001b[0m \u001b[0;32mfrom\u001b[0m \u001b[0merr\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 3624\u001b[0m \u001b[0;32mexcept\u001b[0m \u001b[0mTypeError\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 3625\u001b[0m \u001b[0;31m# If we have a listlike key, _check_indexing_error will raise\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", "\u001b[0;31mKeyError\u001b[0m: 'FIPS'" ] } ], "source": [ "fips_new.loc['FIPS']" ] }, { "cell_type": 
"markdown", "metadata": { "collapsed": false }, "source": [ "So the display of _FIPS_ in the table view is just Pandas telling us the **name** of the index, not an actual value." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Index(['01', '02', '04', '05', '06', '08', '09', '10', '12', '13', '15', '16',\n", " '17', '18', '19', '20', '21', '22', '23', '24', '25', '26', '27', '28',\n", " '29', '30', '31', '32', '33', '34', '35', '36', '37', '38', '39', '40',\n", " '41', '42', '44', '45', '46', '47', '48', '49', '50', '51', '53', '54',\n", " '55', '56'],\n", " dtype='object', name='FIPS')" ] }, "execution_count": 31, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.index" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeFIPS
FIPS
01AlabamaAL01
02AlaskaAK02
\n", "
" ], "text/plain": [ " Name Postal Code FIPS\n", "FIPS \n", "01 Alabama AL 01\n", "02 Alaska AK 02" ] }, "execution_count": 32, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.head(n=2)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now let's get rid of the **FIPS** column (which could have been done with we called `set_index`)" ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal Code
FIPS
01AlabamaAL
02AlaskaAK
\n", "
" ], "text/plain": [ " Name Postal Code\n", "FIPS \n", "01 Alabama AL\n", "02 Alaska AK" ] }, "execution_count": 33, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.drop(\"FIPS\", axis=1, inplace=True)\n", "fips_new.head(2)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Adding data to the DataFrame object\n", "Now we'll set the region for each of the Midwest states. First, set the region to 0 for **all the states**." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeRegion
FIPS
55WisconsinWI0
56WyomingWY0
\n", "
" ], "text/plain": [ " Name Postal Code Region\n", "FIPS \n", "55 Wisconsin WI 0\n", "56 Wyoming WY 0" ] }, "execution_count": 34, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new['Region'] = 0\n", "fips_new.tail(2)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now set the region to 1 for Michigan and Ohio, on the assumption that we **already know** the FIPS codes are **'26'** and **'39'**" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeRegion
FIPS
26MichiganMI1
39OhioOH1
\n", "
" ], "text/plain": [ " Name Postal Code Region\n", "FIPS \n", "26 Michigan MI 1\n", "39 Ohio OH 1" ] }, "execution_count": 71, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.loc[['26', '39'], 'Region'] = 1\n", "fips_new.loc[['26', '39']]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "What if we don't know what the codes are? We can first get the index values for the rows using **boolean indexing**." ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "FIPS\n", "01 False\n", "02 False\n", "04 False\n", "05 False\n", "06 False\n", "08 False\n", "09 False\n", "10 False\n", "12 False\n", "13 False\n", "15 False\n", "16 False\n", "17 True\n", "18 True\n", "19 False\n", "20 False\n", "21 False\n", "22 False\n", "23 False\n", "24 False\n", "25 False\n", "26 False\n", "27 False\n", "28 False\n", "29 False\n", "30 False\n", "31 False\n", "32 False\n", "33 False\n", "34 False\n", "35 False\n", "36 False\n", "37 False\n", "38 False\n", "39 False\n", "40 False\n", "41 False\n", "42 False\n", "44 False\n", "45 False\n", "46 False\n", "47 False\n", "48 False\n", "49 False\n", "50 False\n", "51 False\n", "53 False\n", "54 False\n", "55 False\n", "56 False\n", "Name: Postal Code, dtype: bool\n" ] } ], "source": [ "indiana = fips_new['Postal Code'] == 'IN'\n", "illinois = fips_new['Postal Code'] == 'IL'\n", "print(type(illinois.values))\n", "import numpy as np\n", "#either = np.logical_or(indiana,illinois)\n", "either = indiana | illinois\n", "print(either)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Let's see the rows in `fips_new` matched by this boolean array." ] }, { "cell_type": "code", "execution_count": 76, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeRegion
FIPS
17IllinoisIL0
18IndianaIN0
\n", "
" ], "text/plain": [ " Name Postal Code Region\n", "FIPS \n", "17 Illinois IL 0\n", "18 Indiana IN 0" ] }, "execution_count": 76, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.loc[either]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "And now we can set the `Region` column." ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "fips_new.loc[either, 'Region'] = 1" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Here's an alternative way.\n", "\n", "Get the **index** associated with those rows in `either`" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "in_ill_index = fips_new[either].index\n", "in_ill_index" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "And now we are basically able to do what we did for Michigan and Ohio when we knew those index values in advance." ] }, { "cell_type": "code", "execution_count": 78, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeRegion
FIPS
17IllinoisIL1
18IndianaIN1
26MichiganMI1
39OhioOH1
\n", "
" ], "text/plain": [ " Name Postal Code Region\n", "FIPS \n", "17 Illinois IL 1\n", "18 Indiana IN 1\n", "26 Michigan MI 1\n", "39 Ohio OH 1" ] }, "execution_count": 78, "metadata": { }, "output_type": "execute_result" } ], "source": [ "#fips_new.loc[in_ill_index, 'Region'] = 1\n", "fips_new[fips_new['Region'] == 1]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Here's a way to do it all in one step. Not necessarily recommended, because it's kind of hard to figure out exactly what it's doing, but it works!" ] }, { "cell_type": "code", "execution_count": 84, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "FIPS\n", "27 1\n", "55 1\n", "Name: Region, dtype: int64" ] }, "execution_count": 84, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.loc[(fips_new['Postal Code'] == 'WI') | (fips_new['Postal Code'] == 'MN'), 'Region'] = 1" ] }, { "cell_type": "code", "execution_count": 82, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal CodeRegion
FIPS
17IllinoisIL1
18IndianaIN1
26MichiganMI1
27MinnesotaMN1
39OhioOH1
55WisconsinWI1
\n", "
" ], "text/plain": [ " Name Postal Code Region\n", "FIPS \n", "17 Illinois IL 1\n", "18 Indiana IN 1\n", "26 Michigan MI 1\n", "27 Minnesota MN 1\n", "39 Ohio OH 1\n", "55 Wisconsin WI 1" ] }, "execution_count": 82, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.loc[fips_new['Region'] == 1]" ] }, { "cell_type": "code", "execution_count": 83, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
NamePostal Code
FIPS
26MichiganMI
39OhioOH
\n", "
" ], "text/plain": [ " Name Postal Code\n", "FIPS \n", "26 Michigan MI\n", "39 Ohio OH" ] }, "execution_count": 83, "metadata": { }, "output_type": "execute_result" } ], "source": [ "fips_new.loc[['26', '39'], ['Name', 'Postal Code']]" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "# Getting the population data\n", "\n", "The cell below contains the function written previously to get us the census population data from api.census.gov\n", "\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "import requests\n", "import json\n", "import numpy as np\n", "\n", "def download_pop_data(year, states):\n", " states_list = \",\".join(states)\n", "\n", " base_url = 'http://api.census.gov/data'\n", " application = \"pep/charagegroups\"\n", " query = f\"get=POP&for=state:{states_list}\"\n", "\n", "\n", " url = f\"{base_url}/{year}/{application}?{query}\"\n", "\n", " response = requests.get(url)\n", "\n", " if not (200 <= response.status_code <= 299):\n", " return (False, None)\n", "\n", " state_data = json.loads(response.content)\n", " state_data = np.array(state_data[1:])\n", "\n", " population = state_data[:,0].astype('int')\n", " state = state_data[:, 1].astype('int')\n", "\n", "\n", " data = np.column_stack((state, population))\n", " years = np.full(data.shape[0],year, dtype='int')\n", "\n", " year_data = np.column_stack((years, data))\n", " return (True, year_data)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "download_pop_data(2015, fips_new.index)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Let's modify this function to return a Pandas `DataFrame` instead of a NumPy `ndarray`, using the FIPS code as the index." 
] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "def download_pop_data_pd(year, states):\n", " states_list = \",\".join(states)\n", "\n", " base_url = 'http://api.census.gov/data'\n", " application = \"pep/charagegroups\"\n", " query = f\"get=POP&for=state:{states_list}\"\n", "\n", "\n", " url = f\"{base_url}/{year}/{application}?{query}\"\n", "\n", " response = requests.get(url)\n", "\n", " if not (200 <= response.status_code <= 299):\n", " return (False, None)\n", "\n", " state_data = json.loads(response.content)\n", " state_data = np.array(state_data[1:])\n", "\n", " data = DataFrame(data{'Population': })\n", "\n", "\n", " return (True, data)" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "result, df = download_pop_data_pd(2015, fips_new.index)\n", "\n", "if result:\n", " print(df.head())\n", " print(df.dtypes)\n", "else:\n", " print(\"Failed to download population data for 2015 using the Pandas method\")" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Here's one way for us to generate all of the years of data. Pandas recommends against doing it this way, though, because of the costs of memory reallocation on each call." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "\n" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Here's a more efficient way: create individual data frames for each year, then call `concat` to combine them." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now let's get the total population in the Midwest for 2015. 
" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (system-wide)", "language": "python", "metadata": { "cocalc": { "description": "Python 3 programming language", "priority": 100, "url": "https://www.python.org/" } }, "name": "python3", "resource_dir": "/ext/jupyter/kernels/python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.10" } }, "nbformat": 4, "nbformat_minor": 4 }