{ "cells": [ { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Getting Started\n", "\n", "Execute the 2 cells below to import our usual data science libraries, and then load the final version of the access log `DataFrame` from our access log processing exercise." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "import pandas as pd\n", "import numpy as np" ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "logs = pd.read_pickle('homework5.pkl')" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "## Part 1 - Decoding the User Agent Column" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Later in this notebook you'll be doing some analysis of requests by **user agent**, which is a more technical term for browser. To prepare for this, you'll standardize the values in the `User Agent` column to a restricted set of values." ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "First, use the `fillna` method to replace all missing values in the `User Agent` column with the string **Unknown**. " ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "logs['User Agent'].fillna(\"Unknown\", inplace=True)\n", "logs['User Agent'].head()" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Now, create a column in `logs` named `User Agent Lower` which is the result of converting the values in the `User Agent` column to lower case. See pages 220-222 (end of Section 7.3) to see how to do this efficiently. **You should not write a for loop** to accomplish this task.\n", "\n", "Display the first 10 values in the `User Agent Lower` column only to verify your code worked correctly." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "logs[\"User Agent Lower\"] = logs[\"User Agent\"].str.lower()\n", "logs[\"User Agent Lower\"].head(n=10)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "Next, write one or more statements in the cell below that add a column named `Platform` to `logs`. The contents of `Platform` will initially be the value **Other**.\n", "\n", "After creating `Platform`, display the contents of the `User Agent` and `Platform` columns for the first 10 rows of `logs` to verify it worked correctly." ] }, { "cell_type": "code", "execution_count": 0, "metadata": { "collapsed": false }, "outputs": [ ], "source": [ "logs['Platform'] = \"Other\"\n", "logs[['User Agent', 'Platform']].head(n=10)" ] }, { "cell_type": "markdown", "metadata": { "collapsed": false }, "source": [ "In the cell below, write code that looks for the following strings in each of the values in `User Agent Lower`, setting the value in the `Platform` column to the associated value.\n", "
User Agent Lower Contains | \n", "Set Platform to | \n", "
---|---|
android | \n", "Android | \n", "
iphone | \n", "iPhone | \n", "
ipad | \n", "iPad | \n", "
macintosh | \n", "Mac | \n", "
windows | \n", "Windows | \n", "
bot | \n", "Web crawler | \n", "
spider | \n", "Web crawler | \n", "