GitHub Repository: probml/pyprobml
Path: blob/master/notebooks/tutorials/colab_intro.ipynb
Kernel: Python 3

Introduction to colab

Kevin Murphy, August 2021.

Colab is Google's version of Jupyter notebooks, but has the following advantages:

  • It runs in the cloud, not locally, so you can use it from a cheap laptop, such as a Chromebook.

  • The notebook is saved in your Google Drive, so you can share it with someone else and work on it collaboratively.

  • It has nearly all of the packages you need for doing ML pre-installed.

  • It gives you free access to a GPU or TPU.

  • It has a file editor, so you can separate your code from the output of your code, as with other IDEs, such as JupyterLab.

  • It has various other useful features, such as collapsible sections (cf. code folding), and GUI widgets for specifying the parameters of your functions, which is useful for non-programmers. (You can automatically execute parameterized notebooks with different parameters using papermill.)
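
The GUI widgets mentioned in the last bullet are driven by specially formatted comments. Below is a minimal sketch, assuming Colab's `#@param` form annotations; the variable names and values are purely illustrative. Outside Colab the annotations are ordinary comments, so the cell still runs unchanged.

```python
# Colab renders a form field for each "#@param" annotation; elsewhere these
# are plain comments. All names and values below are illustrative.
learning_rate = 0.01  #@param {type:"number"}
num_epochs = 10  #@param {type:"slider", min:1, max:100, step:1}
optimizer = "adam"  #@param ["adam", "sgd", "rmsprop"]
use_gpu = True  #@param {type:"boolean"}

# Collect the form values so downstream code reads them from one place.
config = dict(lr=learning_rate, epochs=num_epochs, opt=optimizer, gpu=use_gpu)
print(config)
```

Editing a form field in Colab rewrites the value on the corresponding line, so the code and the widget stay in sync.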

More details can be found in the official introduction. Below we describe a few more tips and tricks, focusing on methods that I have found useful when developing the book. (More advanced tricks can be found in this blog post and this blog post.)

IS_COLAB = "google.colab" in str(get_ipython())
print(IS_COLAB)
True
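
The check above relies on `get_ipython()`, which is only defined inside an IPython session. As a hedged alternative for scripts that may also run under plain Python, the sketch below looks for the `google.colab` package among the already loaded modules. The helper name `running_in_colab` is hypothetical, and the approach assumes that the Colab kernel has imported `google.colab` before user code runs.

```python
import sys


def running_in_colab() -> bool:
    """Return True if the google.colab package has already been loaded.

    Assumption: Colab kernels import google.colab at startup, so its
    presence in sys.modules signals a Colab runtime.
    """
    return "google.colab" in sys.modules


print(running_in_colab())
```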
# Standard Python libraries
from __future__ import absolute_import, division, print_function, unicode_literals

import os
import time
import glob
from typing import Any, Callable, Dict, Iterator, Mapping, Optional, Sequence, Tuple

import numpy as np
import matplotlib.pyplot as plt

How to import and use standard libraries

Colab comes with most of the packages we need pre-installed. You can see them all using this command.

!pip list -v
Package                       Version         Location                               Installer
----------------------------- --------------- -------------------------------------- ---------
absl-py                       0.10.0          /usr/local/lib/python3.6/dist-packages pip
alabaster                     0.7.12          /usr/local/lib/python3.6/dist-packages pip
albumentations                0.1.12          /usr/local/lib/python3.6/dist-packages pip
altair                        4.1.0           /usr/local/lib/python3.6/dist-packages pip
argon2-cffi                   20.1.0          /usr/local/lib/python3.6/dist-packages pip
...
yellowbrick                   0.9.1           /usr/local/lib/python3.6/dist-packages pip
zict                          2.0.0           /usr/local/lib/python3.6/dist-packages pip
zipp                          3.4.0           /usr/local/lib/python3.6/dist-packages pip
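
When only a few versions matter, scanning the full listing is tedious. The sketch below queries individual packages from Python instead, assuming Python 3.8 or later (where `importlib.metadata` is in the standard library). The helper name `installed_version` and the package names in the loop are illustrative.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Illustrative names; the last one is deliberately not a real package.
for name in ["numpy", "jax", "torch", "not-a-real-package-xyz"]:
    print(f"{name:25s} {installed_version(name)}")
```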

To install a new package called 'foo', use the following (see this page for details):

!pip install foo
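
Installing a package that is already present wastes time on every notebook restart. One possible guard, sketched below, checks whether a module can be found before installing it. The helper name `is_importable` is hypothetical, and 'foo' is the placeholder package from the text. Note that the importable module name can differ from the pip name (for example, `sklearn` versus `scikit-learn`).

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# 'foo' is the placeholder package name used in the text.
if not is_importable("foo"):
    print("foo is missing; run `!pip install foo` in a notebook cell")
```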

Numpy

import numpy as np

np.set_printoptions(precision=3)
A = np.random.randn(2, 3)
print(A)
[[ 0.958 -0.542  0.332]
 [-0.613  1.633 -2.432]]

Pandas

import pandas as pd

# Use the full option name; the bare "precision" pattern is ambiguous in recent pandas.
pd.set_option("display.precision", 2)  # 2 decimal places
pd.set_option("display.max_rows", 20)
pd.set_option("display.max_columns", 30)
pd.set_option("display.width", 100)  # wide windows

url = "https://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data"
column_names = ["MPG", "Cylinders", "Displacement", "Horsepower", "Weight", "Acceleration", "Year", "Origin", "Name"]
# A raw string avoids an invalid-escape warning for the regex separator.
df = pd.read_csv(url, names=column_names, sep=r"\s+", na_values="?")
df.head()

Sklearn

import sklearn
from sklearn.datasets import load_iris

iris = load_iris()
# Extract numpy arrays
X = iris.data
y = iris.target

import matplotlib.pyplot as plt

plt.scatter(X[:, 0], X[:, 1])
<matplotlib.collections.PathCollection at 0x7f2d91bb3c18>
Image in a Jupyter notebook

JAX

# JAX (https://github.com/google/jax)
import jax
import jax.numpy as jnp

A = jnp.zeros((3, 3))

# Check if JAX is using GPU (jax.default_backend() is the public API for this)
print("jax backend {}".format(jax.default_backend()))
jax backend gpu

Tensorflow

import tensorflow as tf
from tensorflow import keras

# Compare the major version as an integer: comparing version strings is
# lexicographic and would wrongly reject, e.g., "10.0".
assert int(tf.__version__.split(".")[0]) >= 2

print("tf version {}".format(tf.__version__))
print([d for d in tf.config.list_physical_devices()])
if not tf.config.list_physical_devices("GPU"):
    print("No GPU was detected. DNNs can be very slow without a GPU.")
    if IS_COLAB:
        print("Go to Runtime > Change runtime type and select a GPU hardware accelerator.")
tf version 2.4.0
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU'), PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
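
Version checks are easy to get wrong, because Python compares strings character by character rather than numerically. The sketch below shows the pitfall and one possible workaround that parses the leading numeric components into a tuple. The helper name `version_tuple` is hypothetical; for full version semantics (pre-releases, local tags) a dedicated parser such as `packaging.version` is a more complete option.

```python
import re


def version_tuple(version: str, parts: int = 2) -> tuple:
    """Parse the first `parts` numeric components of a version string."""
    numbers = []
    for piece in version.split(".")[:parts]:
        match = re.match(r"\d+", piece)  # leading digits only, e.g. "0rc1" -> 0
        numbers.append(int(match.group()) if match else 0)
    return tuple(numbers)


# Lexicographic comparison: "1" sorts before "2", so this prints False.
print("10.0" >= "2.0")

# Numeric comparison on parsed tuples prints True.
print(version_tuple("10.0") >= (2, 0))

# Local version tags such as "+cu101" sit beyond the parsed components.
print(version_tuple("1.7.0+cu101"))
```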

PyTorch

import torch
import torchvision

print("torch version {}".format(torch.__version__))
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
else:
    print("Torch cannot find GPU")
torch version 1.7.0+cu101
Tesla T4

Plotting

Colab has excellent support for plotting. We give some examples below.

Static plots

Colab lets you make static plots using matplotlib, as shown below. Note that plots are displayed inline by default, so

%matplotlib inline

is not needed.

import matplotlib.pyplot as plt

plt.figure()
plt.plot(range(10))
plt.title("my plot")
plt.xlabel("x axis")
plt.savefig("myplot.png")
Image in a Jupyter notebook

Seaborn

Seaborn is a library that makes matplotlib results look prettier. We can also update font size for plots, to make them more suitable for inclusion in papers.

import matplotlib.pyplot as plt
import seaborn as sns

sns.set()
sns.set_style("whitegrid")

# Font sizes
SIZE_SMALL = 14
SIZE_MEDIUM = 18
SIZE_LARGE = 24

# https://stackoverflow.com/a/39566040
plt.rc("font", size=SIZE_SMALL)  # controls default text sizes
plt.rc("axes", titlesize=SIZE_SMALL)  # fontsize of the axes title
plt.rc("axes", labelsize=SIZE_SMALL)  # fontsize of the x and y labels
plt.rc("xtick", labelsize=SIZE_SMALL)  # fontsize of the x tick labels
plt.rc("ytick", labelsize=SIZE_SMALL)  # fontsize of the y tick labels
plt.rc("legend", fontsize=SIZE_SMALL)  # legend fontsize
plt.rc("figure", titlesize=SIZE_LARGE)  # fontsize of the figure title

plt.figure()
plt.plot(range(10))
plt.title("my plot")
plt.xlabel("x axis")
plt.savefig("myplot.png")
Image in a Jupyter notebook

Interactive plots

Colab also lets you create interactive plots using various JavaScript libraries; see here for details.

Below we illustrate how to use the bokeh library to create an interactive plot of a pandas time series, where if you mouse over the plot, it shows the corresponding (x,y) coordinates. (Another option is plotly.)

import pandas as pd
import numpy as np
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.models import ColumnDataSource, HoverTool

# Call once to configure Bokeh to display plots inline in the notebook.
output_notebook()

np.random.seed(0)
dates = pd.date_range(start="2018-04-24", end="2018-08-27")
N = len(dates)
vals = np.random.standard_t(1, size=N)
dd = pd.DataFrame({"vals": vals, "dates": dates}, index=dates)
dd["days"] = dd.dates.dt.strftime("%Y-%m-%d")
source = ColumnDataSource(dd)
hover = HoverTool(
    tooltips=[("Date", "@days"), ("vals", "@vals")],
)
p = figure(x_axis_type="datetime")
p.line(x="dates", y="vals", source=source)
p.add_tools(hover)
show(p)

We can also make plots that you can pan and zoom into.

N = 4000
np.random.seed(0)
x = np.random.random(size=N) * 100
y = np.random.random(size=N) * 100
radii = np.random.random(size=N) * 1.5
colors = [
    "#%02x%02x%02x" % (r, g, 150)
    for r, g in zip(np.floor(50 + 2 * x).astype(int), np.floor(30 + 2 * y).astype(int))
]
p = figure()
p.circle(x, y, radius=radii, fill_color=colors, fill_alpha=0.6, line_color=None)
show(p)

Viewing an image file

You can use either PIL or OpenCV to display (and manipulate) images. According to this notebook, OpenCV is faster, but for a small number of images the difference doesn't really matter.

from PIL import Image
import requests
from io import BytesIO

# url = "https://github.com/probml/probml-notebooks/blob/master/images/cat_dog.jpg?raw=true"
url = "https://raw.githubusercontent.com/probml/probml-notebooks/main/images/cat_dog.jpg"
response = requests.get(url)
img = Image.open(BytesIO(response.content))
print(type(img))
display(img)
<class 'PIL.JpegImagePlugin.JpegImageFile'>
Image in a Jupyter notebook
#!wget https://github.com/probml/probml-notebooks/blob/master/images/cat_dog.jpg?raw=true -q -O cat_dog.jpg
!wget https://raw.githubusercontent.com/probml/probml-notebooks/main/images/cat_dog.jpg -q -O cat_dog.jpg
!ls -l
total 124
-rw-r--r-- 1 root root 121477 Aug  3 17:06 cat_dog.jpg
drwxr-xr-x 1 root root   4096 Jul 16 13:20 sample_data
# Note: this rebinds the name Image, which was previously imported from PIL.
from IPython.display import Image

fname = "cat_dog.jpg"
Image(fname)
Image in a Jupyter notebook
from google.colab.patches import cv2_imshow
import cv2


def show_image(img_path, size=None, ratio=None):
    img = cv2.imread(img_path, cv2.IMREAD_UNCHANGED)
    cv2_imshow(img)


show_image("cat_dog.jpg")
Image in a Jupyter notebook
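
One practical difference between the two libraries is channel order: OpenCV's `cv2.imread` returns colour images as BGR arrays, whereas PIL and matplotlib expect RGB. The sketch below shows the conversion on a tiny hand-made array (an illustrative stand-in for a real image), using only NumPy so that it runs without OpenCV installed.

```python
import numpy as np

# Hypothetical 1x2 image in OpenCV's BGR order: one pure-blue and one pure-red pixel.
img_bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

# Reversing the last axis swaps BGR to RGB, the order PIL and matplotlib expect.
img_rgb = img_bgr[..., ::-1]

print(img_rgb[0, 0])  # the blue pixel, now expressed in RGB order
print(img_rgb[0, 1])  # the red pixel, now expressed in RGB order
```

With OpenCV available, `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)` performs the same conversion for 3-channel images.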

Visualizing arrays

If you use imshow, be careful of aliasing which can occur for certain figure sizes.

np.random.seed(0)
fig, axs = plt.subplots(1, 3, figsize=(8, 8))
for t in range(3):
    X = np.random.binomial(1, 0.5, (128, 128))
    axs[t].imshow(X, cmap="Accent")
Image in a Jupyter notebook

You can solve this by specifying interpolation="nearest", which maps each screen pixel to a single array entry instead of blending neighbouring values:

np.random.seed(0)
fig, axs = plt.subplots(1, 3, figsize=(8, 8))
for t in range(3):
    X = np.random.binomial(1, 0.5, (128, 128))
    axs[t].imshow(X, cmap="Accent", interpolation="nearest")
Image in a Jupyter notebook

Alternatively, you can call matshow, which wraps imshow with interpolation=nearest (it also puts the origin at the upper left and moves the x-axis ticks to the top):

np.random.seed(0) fig, axs = plt.subplots(1, 3, figsize=(8, 8)) for t in range(3): X = np.random.binomial(1, 0.5, (128, 128)) axs[t].matshow(X, cmap="Accent")
Image in a Jupyter notebook

Graphviz

You can use graphviz to lay out the nodes of a graph and draw the structure.

!apt-get -y install python-pydot !apt-get -y install python-pydot-ng !apt-get -y install graphviz
Reading package lists... Done Building dependency tree Reading state information... Done The following additional packages will be installed: python-pyparsing Suggested packages: python-pyparsing-doc The following NEW packages will be installed: python-pydot python-pyparsing 0 upgraded, 2 newly installed, 0 to remove and 13 not upgraded. Need to get 71.7 kB of archives. After this operation, 347 kB of additional disk space will be used. Get:1 http://archive.ubuntu.com/ubuntu bionic/main amd64 python-pyparsing all 2.2.0+dfsg1-2 [52.1 kB] Get:2 http://archive.ubuntu.com/ubuntu bionic/universe amd64 python-pydot all 1.2.3-1 [19.6 kB] Fetched 71.7 kB in 0s (891 kB/s) Selecting previously unselected package python-pyparsing. (Reading database ... 146374 files and directories currently installed.) Preparing to unpack .../python-pyparsing_2.2.0+dfsg1-2_all.deb ... Unpacking python-pyparsing (2.2.0+dfsg1-2) ... Selecting previously unselected package python-pydot. Preparing to unpack .../python-pydot_1.2.3-1_all.deb ... Unpacking python-pydot (1.2.3-1) ... Setting up python-pyparsing (2.2.0+dfsg1-2) ... Setting up python-pydot (1.2.3-1) ... Reading package lists... Done Building dependency tree Reading state information... Done The following NEW packages will be installed: python-pydot-ng 0 upgraded, 1 newly installed, 0 to remove and 13 not upgraded. Need to get 19.8 kB of archives. After this operation, 96.3 kB of additional disk space will be used. Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 python-pydot-ng all 1.0.0-3 [19.8 kB] Fetched 19.8 kB in 0s (541 kB/s) Selecting previously unselected package python-pydot-ng. (Reading database ... 146392 files and directories currently installed.) Preparing to unpack .../python-pydot-ng_1.0.0-3_all.deb ... Unpacking python-pydot-ng (1.0.0-3) ... Setting up python-pydot-ng (1.0.0-3) ... Reading package lists... Done Building dependency tree Reading state information... 
Done graphviz is already the newest version (2.40.1-2). 0 upgraded, 0 newly installed, 0 to remove and 13 not upgraded.
from graphviz import Digraph dot = Digraph(comment="Bayes net") print(dot) dot.node("C", "Cloudy") dot.node("R", "Rain") dot.node("S", "Sprinkler") dot.node("W", "Wet grass") dot.edge("C", "R") dot.edge("C", "S") dot.edge("R", "W") dot.edge("S", "W") print(dot.source) dot.render("test-output/graph.jpg", view=True) dot
// Bayes net digraph { } // Bayes net digraph { C [label=Cloudy] R [label=Rain] S [label=Sprinkler] W [label="Wet grass"] C -> R C -> S R -> W S -> W }
Image in a Jupyter notebook

Progress bar

from tqdm import tqdm for i in tqdm(range(20)): x = np.random.randn(1000, 1000)
100%|██████████| 20/20 [00:00<00:00, 23.02it/s]

Accessing local files

Clicking on the file folder icon on the left-hand side of colab lets you browse local files. Right-clicking on a filename lets you download it to your local machine. Double-clicking on a file will open it in the file viewer/editor, which appears on the right-hand side.

The result should look something like this:

You can also use standard unix commands to manipulate files, as we show below.

!pwd
/content
!ls
myplot.png sample_data
!echo 'foo bar' > foo.txt !cat foo.txt
foo bar

However, !cd does not work. You need to use the magic %cd.

!pwd !mkdir dummy %cd dummy !ls %cd ..
/content /content/dummy /content
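The reason !cd has no lasting effect is, as far as I understand it, that each ! command runs in its own child shell, which exits as soon as the command finishes. The sketch below reproduces the same behaviour in plain Python: subprocess plays the role of !cd and os.chdir plays the role of %cd. The choice of the temp directory as the target is arbitrary.

```python
import os
import subprocess
import tempfile

start_dir = os.getcwd()
target_dir = tempfile.gettempdir()  # any existing directory works

# Like `!cd`: the `cd` happens in a child shell that exits immediately,
# so the parent (kernel) process keeps its working directory.
subprocess.run(f'cd "{target_dir}"', shell=True, check=True)
dir_after_shell_cd = os.getcwd()

# Like `%cd`: os.chdir changes the directory of the current process.
os.chdir(target_dir)
dir_after_chdir = os.getcwd()

os.chdir(start_dir)  # restore the original directory

print("unchanged after shell cd:", dir_after_shell_cd == start_dir)
print("changed after os.chdir:", os.path.samefile(dir_after_chdir, target_dir))
```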

To make a new (local) file in colab's editor, first create the file with the operating system, and then view it using colab.

from google.colab import files file = "bar.py" !touch $file files.view(file)
<IPython.core.display.Javascript object>

If you make changes to a file containing code, the new version of the file will not be noticed unless you use the magic below.

%load_ext autoreload %autoreload 2
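To see why this is needed, recall that Python caches a module after the first import, so later edits to the file are ignored until the module is reloaded. The sketch below shows the manual equivalent using importlib.reload; the module name scratch_mod and its contents are made up for illustration, and the autoreload extension should make this explicit step unnecessary.

```python
import importlib
import pathlib
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc files in this demo

# `scratch_mod` is a hypothetical module name used only for this sketch.
module_dir = pathlib.Path(tempfile.mkdtemp())
module_file = module_dir / "scratch_mod.py"
module_file.write_text("VALUE = 1\n")
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

import scratch_mod

value_at_import = scratch_mod.VALUE

# Edit the file on disk, as you would in the colab file editor.
module_file.write_text("VALUE = 22\n")
value_before_reload = scratch_mod.VALUE  # the cached module is unchanged

importlib.reload(scratch_mod)  # re-executes the edited source
value_after_reload = scratch_mod.VALUE

print(value_at_import, value_before_reload, value_after_reload)
```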

Syncing with Google drive

Files that you generate in, or upload to, colab are ephemeral, since colab is a temporary environment with an idle timeout of 90 minutes and an absolute timeout of 12 hours (24 hours for Colab Pro). To save any files permanently, you need to mount your Google Drive folder, as we show below. (Executing this command will open a new window in your browser; you need to cut and paste the password that is shown into the prompt box.)

from google.colab import drive drive.mount('/content/gdrive') !pwd
Mounted at /content/gdrive /content
with open("/content/gdrive/MyDrive/foo.txt", "w") as f: f.write("Hello Google Drive!") !cat /content/gdrive/MyDrive/foo.txt
Hello Google Drive!
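Once the drive is mounted, it behaves like an ordinary directory, so standard file operations work on it. Here is a minimal sketch of a helper that copies a result file into a Drive subfolder. The function name backup_to_drive and the subfolder name colab_backup are my own choices, and the default path assumes the mount point used above.

```python
import shutil
from pathlib import Path


def backup_to_drive(src, drive_root="/content/gdrive/MyDrive", subdir="colab_backup"):
    """Copy `src` into `drive_root/subdir` and return the path of the copy."""
    drive_root = Path(drive_root)
    if not drive_root.is_dir():
        # The drive is probably not mounted (or we are not running in colab).
        raise FileNotFoundError(f"{drive_root} not found; mount Google Drive first")
    dest_dir = drive_root / subdir
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(src, dest_dir))
```

For example, backup_to_drive("foo.txt") would copy foo.txt from the current directory to /content/gdrive/MyDrive/colab_backup/foo.txt.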

To ensure that local changes are detected by colab, use this piece of magic.

%load_ext autoreload %autoreload 2

Uploading data to colab from your local machine

from google.colab import files # GUI lets you select the file # the return value is a dict, mapping filename to bytes uploaded = files.upload()
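Since the return value maps each filename to the raw bytes of that file, you can process the contents directly in memory. The sketch below uses a made-up stand-in dict rather than a real call to files.upload(), so that it runs anywhere; summarize_uploads is a hypothetical helper, not part of the colab API.

```python
def summarize_uploads(uploaded, encoding="utf-8"):
    """Return {filename: (size_in_bytes, first_line_of_text)}."""
    summary = {}
    for name, data in uploaded.items():
        text = data.decode(encoding, errors="replace")
        first_line = text.splitlines()[0] if text else ""
        summary[name] = (len(data), first_line)
    return summary


# Stand-in for the dict returned by files.upload(); the filename and
# contents are invented for illustration.
fake_uploaded = {"notes.txt": b"foo bar\nsecond line\n"}
print(summarize_uploads(fake_uploaded))
```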

Downloading data from colab to your local machine

from google.colab import files files.download("checkpoints/gan-mlp-mnist-epoch=02.ckpt")

Loading data from the web into colab

You can use wget:

!rm -f timemachine.*
#!wget https://github.com/probml/pyprobml/blob/master/data/timemachine.txt #!wget https://github.com/probml/pyprobml/blob/master/data/timemachine.txt !wget https://raw.githubusercontent.com/probml/probml-data/main/data/timemachine.txt
--2021-07-19 17:42:46-- https://raw.githubusercontent.com/probml/probml-data/main/data/timemachine.txt Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 178887 (175K) [text/plain] Saving to: ‘timemachine.txt’ timemachine.txt 100%[===================>] 174.69K --.-KB/s in 0.03s 2021-07-19 17:42:46 (6.65 MB/s) - ‘timemachine.txt’ saved [178887/178887]
!head timemachine.txt
The Time Machine, by H. G. Wells [1898] I The Time Traveller (for so it will be convenient to speak of him) was expounding a recondite matter to us. His grey eyes shone and twinkled, and his usually pale face was flushed and animated. The fire burned brightly, and the soft radiance of the incandescent lights in the lilies of silver caught the bubbles that flashed and passed in our glasses. Our chairs, being his patents, embraced and caressed us rather than submitted to be sat upon, and there was that luxurious after-dinner atmosphere when thought roams gracefully free of the trammels of precision. And he put it to us in this way--marking the points with a lean forefinger--as we sat and lazily admired his earnestness over this new paradox (as we thought it) and his fecundity. 'You must follow me carefully. I shall have to controvert one or two ideas that are almost universally accepted. The geometry, for instance, they taught you at school is founded on a misconception.' 'Is not that rather a large thing to expect us to begin upon?' said Filby, an argumentative person with red hair.
datadir = "." import re fname = os.path.join(datadir, "timemachine.txt") with open(fname, "r") as f: lines = f.readlines() sentences = [re.sub("[^A-Za-z]+", " ", st).lower().split() for st in lines] for i in range(5): words = sentences[i] print(words)
['the', 'time', 'machine', 'by', 'h', 'g', 'wells'] [] ['i'] [] ['the', 'time', 'traveller', 'for', 'so', 'it', 'will', 'be', 'convenient', 'to', 'speak', 'of', 'him', 'was', 'expounding', 'a', 'recondite', 'matter', 'to', 'us', 'his', 'grey', 'eyes', 'shone', 'and', 'twinkled', 'and', 'his', 'usually', 'pale', 'face', 'was', 'flushed', 'and', 'animated', 'the', 'fire', 'burned', 'brightly', 'and', 'the', 'soft', 'radiance', 'of', 'the', 'incandescent', 'lights', 'in', 'the', 'lilies', 'of', 'silver', 'caught', 'the', 'bubbles', 'that', 'flashed', 'and', 'passed', 'in', 'our', 'glasses', 'our', 'chairs', 'being', 'his', 'patents', 'embraced', 'and', 'caressed', 'us', 'rather', 'than', 'submitted', 'to', 'be', 'sat', 'upon', 'and', 'there', 'was', 'that', 'luxurious', 'after', 'dinner', 'atmosphere', 'when', 'thought', 'roams', 'gracefully', 'free', 'of', 'the', 'trammels', 'of', 'precision', 'and', 'he', 'put', 'it', 'to', 'us', 'in', 'this', 'way', 'marking', 'the', 'points', 'with', 'a', 'lean', 'forefinger', 'as', 'we', 'sat', 'and', 'lazily', 'admired', 'his', 'earnestness', 'over', 'this', 'new', 'paradox', 'as', 'we', 'thought', 'it', 'and', 'his', 'fecundity']

Loading code from the web into colab

We can also download python code and run it locally.

!wget -q https://raw.githubusercontent.com/probml/pyprobml/master/scripts/pyprobml_utils.py import pyprobml_utils as pml pml.test()
welcome to python probabilistic ML library

Viewing all your notebooks

You can see the list of colab notebooks that you have saved as shown below.

import re, pathlib, shutil from pathlib import PosixPath # Get a list of all your Notebooks notebooks = [ x for x in pathlib.Path("/content/gdrive/MyDrive/Colab Notebooks").iterdir() if re.search(r"\.ipynb", x.name, flags=re.I) ] print(notebooks[:2]) # n = PosixPath('/content/gdrive/MyDrive/Colab Notebooks/covid-open-data-paper.ipynb')
[PosixPath('/content/gdrive/MyDrive/Colab Notebooks/saining-cvpr18-video-3d.ipynb'), PosixPath('/content/gdrive/MyDrive/Colab Notebooks/Copy of Getting started with pytorch.ipynb')]
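Note that the pattern \.ipynb in the cell above matches anywhere in the filename, so it would also pick up a name such as x.ipynb.bak. A slightly stricter sketch, which checks the file suffix instead, is shown below; list_notebooks is a hypothetical helper name.

```python
from pathlib import Path


def list_notebooks(folder):
    """Return the files in `folder` whose suffix is .ipynb (case-insensitive)."""
    folder = Path(folder)
    return sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".ipynb"
    )
```

With the drive mounted as above, you could call list_notebooks("/content/gdrive/MyDrive/Colab Notebooks").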

Working with github

You can open any jupyter notebook stored in github in a colab by replacing https://github.com/probml/.../intro.ipynb with https://colab.research.google.com/github/probml/.../intro.ipynb (see this blog post).
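The URL rewrite is a simple prefix substitution, which can be sketched as follows. The helper name github_to_colab_url is my own, and the example URL is the path of this notebook.

```python
GITHUB_PREFIX = "https://github.com/"
COLAB_PREFIX = "https://colab.research.google.com/github/"


def github_to_colab_url(url):
    """Rewrite a github notebook URL so that it opens in colab."""
    if not url.startswith(GITHUB_PREFIX):
        raise ValueError(f"not a github URL: {url}")
    return COLAB_PREFIX + url[len(GITHUB_PREFIX):]


url = "https://github.com/probml/pyprobml/blob/master/notebooks/tutorials/colab_intro.ipynb"
print(github_to_colab_url(url))
```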

It is possible to download code (or data) from github into a local directory on this virtual machine. It is also possible to upload local files back to github, although that is more complex. See details below.

Cloning a repo from github

You can clone a public github repo into your local colab VM, as we show below, using the repo for this book as an example. (To clone a private repo, you need to specify your password, as explained here. Alternatively you can use the ssh method we describe below.)

!rm -rf pyprobml # Remove any old local directory to ensure fresh install !git clone --depth 1 https://github.com/probml/pyprobml
!pwd
/content
!ls
cat_dog.jpg pyprobml sample_data __pycache__ pyprobml_utils.py timemachine.txt

We can run any script as shown below. (Note we first have to define the environment variable for where the figures will be stored.)

import os os.environ["PYPROBML"] = "pyprobml" %run pyprobml/scripts/activation_fun_plot.py
saving image to pyprobml/figures/activationFuns.pdf
Image in a Jupyter notebook
saving image to pyprobml/figures/activationFuns2.pdf
Image in a Jupyter notebook
saving image to pyprobml/figures/sigmoid_saturation_plot.pdf
Image in a Jupyter notebook
!ls pyprobml/figures
activationFuns2.pdf activationFuns.pdf sigmoid_saturation_plot.pdf

We can also import code, as we show below.

!ls
pyprobml sample_data
import pyprobml.scripts.pyprobml_utils as pml pml.test()
welcome to python probabilistic ML library

Pushing local files back to github

You can easily save your entire colab notebook to github by choosing 'Save a copy in github' under the File menu in the top left. But if you want to save individual files (e.g., code that you edited in the colab file editor, or a bunch of images or data files you created), the process is more complex. There are several possible methods, described here and here. Below we describe two other solutions.

Use the local terminal

Suppose you clone a repo using something like

git clone --depth 1 https://github.com/probml/pyprobml.git

This will create a folder called /content/pyprobml. You can access this in colab as usual. Now suppose you want to save your edits, e.g., to a file called foo.txt. Follow the steps below.

Open a terminal window inside colab. (Alternatively, run the commands below in a notebook cell, prefixed with !, although for some reason the final push command then fails.) Then do the following:

  • git config --global user.email "[email protected]"

  • git config --global user.name "Kevin Murphy"

  • cd /content/pyprobml

  • echo 'this is a test' > foo.txt # if haven't yet modified the file

  • git add foo.txt

  • git commit -m "message"

  • git push

Github will ask for your username and password. Instead of a password, enter your personal access token.

Deprecated method

To avoid having to type your PAT every time, you can use the method below. However, as of May 2023, this no longer seems to work.

You first need to do some setup to create SSH keys on your current colab VM (virtual machine), manually add the keys to your github account, and then copy the keys to your mounted google drive so you can reuse the same keys in the future. This only has to be done once.

After setup, you can use the git_ssh function we define below to securely execute git commands. This works by copying your SSH keys from your google drive to the current colab VM, executing the git command, and then deleting the keys from the VM for safety.

To get started, run these commands in your colab. (The commands need to be uncommented.) The cat command will display your public key in the colab window. Cut and paste this, and manually add it to your github account following these instructions.

#!ssh-keygen -t rsa -b 4096 #!ssh-keyscan -t rsa github.com >> ~/.ssh/known_hosts #!cat /root/.ssh/id_rsa.pub

Test that it worked.

!ssh -T git@github.com
Host key verification failed.

Finally, save the generated keys to your Google drive

from google.colab import drive drive.mount('/content/drive') !mkdir /content/drive/MyDrive/ssh/ !cp -r ~/.ssh/* /content/drive/MyDrive/ssh/ !ls /content/drive/MyDrive/ssh/
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True). id_rsa id_rsa.pub known_hosts

Let us check that we can see our SSH keys in our mounted google drive.

from google.colab import drive drive.mount('/content/drive') !ls /content/drive/MyDrive/ssh/
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True). id_rsa id_rsa.pub known_hosts

The following function lets you securely run a git command via SSH. It copies the keys from your Google Drive to the local VM, executes the command, then removes the keys.

#!rm -rf pyprobml_utils.py # remove any old copies of this file #!wget https://raw.githubusercontent.com/probml/pyprobml/master/scripts/pyprobml_utils.py !rm -rf colab_utils.py # remove any old copies of this file !wget https://raw.githubusercontent.com/probml/pyprobml/master/deprecated/scripts/colab_utils.py
--2023-05-26 21:45:47-- https://raw.githubusercontent.com/probml/pyprobml/master/deprecated/scripts/colab_utils.py Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.108.133, 185.199.110.133, ... Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected. HTTP request sent, awaiting response... 200 OK Length: 2267 (2.2K) [text/plain] Saving to: ‘colab_utils.py’ colab_utils.py 100%[===================>] 2.21K --.-KB/s in 0s 2023-05-26 21:45:47 (27.1 MB/s) - ‘colab_utils.py’ saved [2267/2267]
from google.colab import drive drive.mount("/content/drive") # must do this before running git_ssh import colab_utils # import script into namespace
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).

Below we clone the pyprobml repo to this colab VM using our github credentials, so we can later check stuff back in. This is just an example - you should edit the reponame, username and email variables.

!rm -rf pyprobml # remove any old copies of this directory #!git clone https://github.com/probml/pyprobml.git # clones using wrong credentials colab_utils.git_ssh("git clone https://github.com/probml/pyprobml.git", email="[email protected]", username="probml", verbose=True) # update to use your credentials
executing command via ssh: git clone [email protected]:probml/pyprobml.git Copying keys from gdrive to local VM Executing git commands Cleanup local VM
reponame = 'pyprobml' username = 'probml' email = '[email protected]' # update to use your credentials !rm -rf $reponame # remove any old copies of this directory cmd = f"git clone https://github.com/{username}/{reponame}.git" #colab_utils.git_ssh(cmd, email=email, username=username)
print(cmd)
git clone https://github.com/probml/pyprobml.git

Let's check that we can see this repo in our local drive.

!pwd !ls
/content drive __pycache__ pyprobml pyprobml_utils.py sample_data
!ls /content/$reponame/
book1 CONTRIBUTING.md figures __init__.py notebooks scripts book2 data images LICENSE.txt README.md

Now we create a dummy file inside our local copy of this repo, and push it back to the github (public) version of the repo.

# Make the dummy file in the scripts folder of repo %cd /content/$reponame !echo 'this is a test' > scripts/foo.txt # Add file to the external repo cmd = "git add scripts; git commit -m 'push from colab'; git push" colab_utils.git_ssh(cmd, email=email, username=username)
/content/pyprobml executing command via ssh: git add scripts; git commit -m 'push from colab'; git push

We can check that it worked by visiting this page on github (note the time stamp on the top right):

Finally we clean up our mess.

%cd /content/$reponame cmd = "git rm scripts/foo*.txt; git commit -m 'colab cleanup'; git push" colab_utils.git_ssh(cmd, email=email, username=username, verbose=True) %cd /content
/content/pyprobml executing command via ssh: git rm scripts/foo*.txt; git commit -m 'colab cleanup'; git push Copying keys from gdrive to local VM Executing git commands Cleanup local VM /content

Software engineering tools

Pros and Cons of notebooks

Joel Grus has argued that notebooks are bad for developing complex software, because they encourage creating monolithic notebooks instead of factoring out code into separate, well-tested files.

Jeremy Howard has responded to Joel's critiques here. In particular, the FastAI organization has created nbdev which has various tools that make notebooks more useful.

My recommended workflow is to use the notebook as a development environment, and then convert the code to a set of files that can run from the command line independently of the notebook (which is useful for parallel experiments, etc.).

In other words, develop your code in the colab in the usual way, and when it is working, factor out the core code into separate files. You can edit these files locally in the colab editor or some other file editor (see below), and then call them from a colab cell. This lets you separate your code from the output of your code, as with other IDEs, such as Jupyter lab.

To run a function defined in a local file inside colab, just import it. For example, suppose we have created the file /content/pyprobml/scripts/fit_flax.py; we can use this idiom to run its test suite:

import pyprobml.scripts.fit_flax as ff ff.test()

If you make local edits, you want to be sure that you always import the latest version of the file (not a cached version). So you need to use this piece of colab magic first:

%load_ext autoreload %autoreload 2

When the code is working, save it to github (see details above).

File editors

Colab editor

Colab has a simple file editor, illustrated below for an example file. This lets you separate your code from the output of your code, as with other IDEs, such as Jupyter lab.

You can click on a class name while holding Ctrl, and the source code will open in the file viewer. (h/t Amit Choudhary's blog.)

VScode

The default colab file editor is very primitive. Fortunately you can run VScode in your browser and connect it to the colab machine via ssh, as explained in this article and this article.

The above method can be quite 'laggy'. An alternative is to access the VM running colab directly via ssh, using these instructions. You can then run VScode locally (on your laptop) and connect to the remote machine using these instructions.

Avoiding problems with global state

One of the main drawbacks of colab is that all variables are globally visible, so you may accidentally write a function that depends on some variable in the current state of the notebook that is not passed in as an argument. Such a function may fail if used in a different context.

One solution to this is to put most of your code in files, and then have the notebook simply import the code and run it, like you would from the command line. Then you can always run the notebook from scratch, to ensure consistency.

Another solution is to use the localscope package, which can catch some of these errors.

!pip install localscope
Collecting localscope Downloading https://files.pythonhosted.org/packages/71/29/c3010c332c7175fe48060b1113e32f2831bab2202428d2cc29686685302f/localscope-0.1.3.tar.gz Building wheels for collected packages: localscope Building wheel for localscope (setup.py) ... done Created wheel for localscope: filename=localscope-0.1.3-cp36-none-any.whl size=4068 sha256=7a5d6718e16dbff82fe94e1229d233a19ef52280ff0d4fc48ef62a2ba41d5855 Stored in directory: /root/.cache/pip/wheels/89/57/33/ce153d31de05d74323324df0f45a08ea99e92300e549da5154 Successfully built localscope Installing collected packages: localscope Successfully installed localscope-0.1.3
from localscope import localscope
a = "hello world" def myfun(): print(a) # silently accesses global variable myfun()
hello world
a = "hello world" @localscope def myfun(): print(a) myfun()
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-8-549f0808922e> in <module>() 1 a = 'hello world' ----> 2 @localscope 3 def myfun(): 4 print(a) 5 /usr/local/lib/python3.6/dist-packages/localscope/__init__.py in localscope(func, predicate, allowed, allow_closure, _globals) 115 value = _globals[name] 116 if not predicate(value): --> 117 raise ValueError(f'`{name}` is not a permitted global') 118 elif instruction.opname == 'STORE_DEREF': 119 allowed.append(name) ValueError: `a` is not a permitted global
def myfun2(): return 42 @localscope def myfun3(): return myfun2()
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-9-5cfa07ed9c85> in <module>() 2 return 42 3 ----> 4 @localscope 5 def myfun3(): 6 return myfun2() /usr/local/lib/python3.6/dist-packages/localscope/__init__.py in localscope(func, predicate, allowed, allow_closure, _globals) 115 value = _globals[name] 116 if not predicate(value): --> 117 raise ValueError(f'`{name}` is not a permitted global') 118 elif instruction.opname == 'STORE_DEREF': 119 allowed.append(name) ValueError: `myfun2` is not a permitted global
@localscope.mfc # allow for global methods, functions, classes def myfun4(): return myfun2() myfun4()
42
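The tracebacks above suggest that localscope works by inspecting the bytecode of the decorated function. To give a feel for the idea, here is a rough sketch using only the standard dis module. This is not localscope's actual implementation; global_names_used is a hypothetical helper, and it ignores nested functions and closures.

```python
import builtins
import dis


def global_names_used(func):
    """Return the sorted non-builtin global names that `func` reads."""
    names = set()
    for instruction in dis.get_instructions(func):
        # LOAD_GLOBAL is emitted for names that are neither local nor enclosed.
        if instruction.opname == "LOAD_GLOBAL":
            name = instruction.argval
            if not hasattr(builtins, name):  # skip print, len, etc.
                names.add(name)
    return sorted(names)


a = "hello world"


def leaky():
    print(a)  # silently reads the global variable `a`


def clean(x):
    return len(x)  # depends only on its argument


print(global_names_used(leaky))
print(global_names_used(clean))
```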

Argparse

Often code is designed to be run from the command line, and can be configured by passing in arguments and flags. To make this work in colab, you have to use parse_known_args, as in the example below.

def main(args): print("my awesome function") print(args.arg1) print(args.arg2) import argparse parser = argparse.ArgumentParser(description="My Demo") parser.add_argument("-arg1", default=1, type=int, help="An integer to print") parser.add_argument("-arg2", "--argument2", default="foo", help="A string to print") parser.add_argument("-f", "--flag", action="store_true", help="Just a flag") # args = parser.parse_args() # error in colab args, unused = parser.parse_known_args() print(args) print(unused)
Namespace(arg1=1, argument2='foo', flag=True) ['/root/.local/share/jupyter/runtime/kernel-695757f6-572d-4770-b43b-f5bcffa78e35.json']
args.arg1 = 42 args.arg2 = "bar" main(args)
my awesome function 42 bar
args.arg1 = 49 args.arg2 = "foo" main(args)
my awesome function 49 foo
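The reason parse_args fails in a notebook is that the kernel is launched with its own command-line arguments (something like -f followed by a path to a JSON file, as seen in the unused list above), and argparse reads these from sys.argv. This is presumably also why flag=True appears in the output above: the kernel's -f matched our own -f option. An alternative sketch is to pass an explicit argument list, so sys.argv is never consulted; the option names and the stand-in kernel arguments below are chosen for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description="My Demo")
parser.add_argument("--arg1", default=1, type=int, help="An integer to print")
parser.add_argument("--name", default="foo", help="A string to print")

# Hypothetical stand-in for the extra arguments a notebook kernel receives.
kernel_argv = ["-f", "/path/to/kernel.json"]
args, unused = parser.parse_known_args(kernel_argv)
print(args, unused)

# Passing an explicit list bypasses sys.argv entirely.
defaults = parser.parse_args([])
custom = parser.parse_args(["--arg1", "42", "--name", "bar"])
print(defaults, custom)
```

Passing a list also avoids mutating the args object by hand when you want to rerun with different settings.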

YAML files

We show how to create a config file locally, and then pass it to your code.

%%writefile myconfig.yaml model_params: name: 'VanillaVAE' in_channels: 3 latent_dim: 128 exp_params: dataset: celeba data_path: "../../shared/Data/"
Overwriting myconfig.yaml
!cat myconfig.yaml
model_params: name: 'VanillaVAE' in_channels: 3 latent_dim: 128 exp_params: dataset: celeba data_path: "../../shared/Data/"
from google.colab import files file = "myconfig.yaml" #!touch $file files.view(file)
<IPython.core.display.Javascript object>
!cat myconfig.yaml
model_params: name: 'VanillaVAE' in_channels: 3 latent_dim: 128 exp_params: dataset: celeba
import yaml filename = "myconfig.yaml" with open(filename, "r") as file: config = yaml.safe_load(file) print(type(config)) print(config) print(config["model_params"]["in_channels"])
<class 'dict'> {'model_params': {'name': 'VanillaVAE', 'in_channels': 3, 'latent_dim': 128}, 'exp_params': {'dataset': 'celeba', 'data_path': '../../shared/Data/'}} 3

Hardware accelerators

By default, Colab runs on a CPU, but you can select GPU or TPU for extra speed, as we show below. To get access to more powerful machines (with faster processors, more memory, and longer idle timeouts), you can subscribe to Colab Pro. At the time of writing (Jan 2021), the cost is $10/month (USD). This is a good deal if you use GPUs a lot.

CPUs

To see what devices you have, use this command.

from tensorflow.python.client import device_lib device_lib.list_local_devices()
[name: "/device:CPU:0" device_type: "CPU" memory_limit: 268435456 locality { } incarnation: 12564556295328712695, name: "/device:GPU:0" device_type: "GPU" memory_limit: 15692777408 locality { bus_id: 1 links { } } incarnation: 6192511611231741902 physical_device_desc: "device: 0, name: Tesla P100-PCIE-16GB, pci bus id: 0000:00:04.0, compute capability: 6.0"]
!cat /proc/version
Linux version 4.19.112+ (builder@a12462ca91c8) (Chromium OS 10.0_pre377782_p20200113-r10 clang version 10.0.0 (/var/cache/chromeos-cache/distfiles/host/egit-src/llvm-project 4e8231b5cf0f5f62c7a51a857e29f5be5cb55734)) #1 SMP Thu Jul 23 08:00:38 PDT 2020
from psutil import cpu_count print("num cores", cpu_count()) !cat /proc/cpuinfo
num cores 4 processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 79 model name : Intel(R) Xeon(R) CPU @ 2.20GHz stepping : 0 microcode : 0x1 cpu MHz : 2200.000 cache size : 56320 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 0 initial apicid : 0 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa bogomips : 4400.00 clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 48 bits virtual power management: processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 79 model name : Intel(R) Xeon(R) CPU @ 2.20GHz stepping : 0 microcode : 0x1 cpu MHz : 2200.000 cache size : 56320 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 2 initial apicid : 2 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa bogomips : 4400.00 clflush size : 64 cache_alignment : 64 address 
sizes : 46 bits physical, 48 bits virtual power management: processor : 2 vendor_id : GenuineIntel cpu family : 6 model : 79 model name : Intel(R) Xeon(R) CPU @ 2.20GHz stepping : 0 microcode : 0x1 cpu MHz : 2200.000 cache size : 56320 KB physical id : 0 siblings : 4 core id : 0 cpu cores : 2 apicid : 1 initial apicid : 1 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa bogomips : 4400.00 clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 48 bits virtual power management: processor : 3 vendor_id : GenuineIntel cpu family : 6 model : 79 model name : Intel(R) Xeon(R) CPU @ 2.20GHz stepping : 0 microcode : 0x1 cpu MHz : 2200.000 cache size : 56320 KB physical id : 0 siblings : 4 core id : 1 cpu cores : 2 apicid : 3 initial apicid : 3 fpu : yes fpu_exception : yes cpuid level : 13 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx smap xsaveopt arat md_clear arch_capabilities bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs taa bogomips : 4400.00 
clflush size : 64 cache_alignment : 64 address sizes : 46 bits physical, 48 bits virtual power management:

Memory

from psutil import virtual_memory ram_gb = virtual_memory().total / 1e9 print("RAM (GB)", ram_gb)
RAM (GB) 27.393818624
!cat /proc/meminfo
MemTotal: 26751776 kB MemFree: 23653276 kB MemAvailable: 25773356 kB Buffers: 109276 kB Cached: 2230616 kB SwapCached: 0 kB Active: 756592 kB Inactive: 1980036 kB Active(anon): 363548 kB Inactive(anon): 416 kB Active(file): 393044 kB Inactive(file): 1979620 kB Unevictable: 0 kB Mlocked: 0 kB SwapTotal: 0 kB SwapFree: 0 kB Dirty: 392 kB Writeback: 0 kB AnonPages: 396824 kB Mapped: 212400 kB Shmem: 1032 kB Slab: 201364 kB SReclaimable: 140252 kB SUnreclaim: 61112 kB KernelStack: 4528 kB PageTables: 5336 kB NFS_Unstable: 0 kB Bounce: 0 kB WritebackTmp: 0 kB CommitLimit: 13375888 kB Committed_AS: 3257684 kB VmallocTotal: 34359738367 kB VmallocUsed: 0 kB VmallocChunk: 0 kB Percpu: 2096 kB AnonHugePages: 0 kB ShmemHugePages: 0 kB ShmemPmdMapped: 0 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 2048 kB Hugetlb: 0 kB DirectMap4k: 134376 kB DirectMap2M: 4059136 kB DirectMap1G: 25165824 kB

GPUs

If you select the 'Runtime' menu at top left, and then select 'Change runtime type' and then select 'GPU', you can get free access to a GPU.

To see what kind of GPU you are using, see below.

gpu_info = !nvidia-smi gpu_info = "\n".join(gpu_info) print(gpu_info)
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
!grep Model: /proc/driver/nvidia/gpus/*/information | awk '{$1="";print$0}'
Tesla P100-PCIE-16GB
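If no GPU runtime is attached, nvidia-smi is either missing or fails, as in the error message shown earlier. The following sketch wraps the call so that a notebook can degrade gracefully; gpu_summary is a hypothetical helper, and it assumes that the --query-gpu and --format options are supported by the installed nvidia-smi.

```python
import shutil
import subprocess


def gpu_summary():
    """Return a list of 'name, total memory' strings, or None if no GPU is usable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return None  # the tool is not installed on this machine
    result = subprocess.run(
        [exe, "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None  # e.g. the driver is not running
    return result.stdout.strip().splitlines()


info = gpu_summary()
print("No GPU detected" if info is None else info)
```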