Introduction to Classical Machine Learning
In this lecture, Amira covers the basics of classical machine learning. She introduces machine learning terminology and popular models, as well as the goals and objectives of machine learning, and explores the mathematical representation of machine learning models and their cost functions, along with the different areas of machine learning. She then presents the ingredients of a machine learning algorithm and explains how a model can be iteratively optimised using a method known as gradient descent. Finally, she explores the bias-variance tradeoff to show how to choose a good model.
FAQ
What are the alternatives to using gradient descent?
Gradient descent is used to optimise your cost function and has many variants, such as stochastic gradient descent. There are also alternatives such as ADAM (Adaptive Moment Estimation) or gradient-free optimisers. In practice, however, gradient descent works well and has been a key ingredient in the success of neural networks.
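As a rough sketch (not from the lecture itself), the snippet below compares a plain gradient-descent update with an Adam-style update on a made-up quadratic cost; the data, step sizes, and iteration counts are invented purely for illustration.

```python
import numpy as np

# Toy quadratic cost C(w) = ||A w - b||^2 and its gradient.
rng = np.random.default_rng(0)
A = rng.normal(size=(20, 5))
b = rng.normal(size=20)

def cost(w):
    return np.sum((A @ w - b) ** 2)

def grad(w):
    return 2 * A.T @ (A @ w - b)

# Plain gradient descent: w <- w - eta * grad(w).
w = np.zeros(5)
eta = 0.01
for _ in range(500):
    w = w - eta * grad(w)
print("gradient descent cost:", cost(w))

# Adam-style update: keeps running averages of the gradient (m) and its
# square (v) to adapt the effective step size for each parameter.
w = np.zeros(5)
m = np.zeros(5)
v = np.zeros(5)
beta1, beta2, eps, alpha = 0.9, 0.999, 1e-8, 0.1
for t in range(1, 501):
    g = grad(w)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g ** 2
    m_hat = m / (1 - beta1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)   # bias-corrected second moment
    w = w - alpha * m_hat / (np.sqrt(v_hat) + eps)
print("Adam cost:", cost(w))
```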
What is the difference between parameters and hyper-parameters?
In a linear model, the parameters are, for example, the weights that multiply your data points; they are learned from the data. Hyper-parameters are more model-specific and are set by hand: in a neural network, for example, the number of neurons or the depth of the network.
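To make the distinction concrete, here is a small illustrative sketch (my own toy example, not from the lecture): the learning rate and the number of training steps are hyper-parameters fixed by hand, while the weight vector w is the parameter that training adjusts.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hyper-parameters: fixed by hand before training, not learned from the data.
learning_rate = 0.05   # step size of gradient descent
n_steps = 1000         # how many update steps we run
# (In a neural network, the number of neurons per layer and the depth
#  would also be hyper-parameters.)

# Toy data for a linear model y ≈ X w.
X = rng.normal(size=(100, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=100)

# Parameters: the weights w that multiply the data points; these are
# what gradient descent adjusts to minimise the mean-squared-error cost.
w = np.zeros(3)
for _ in range(n_steps):
    grad = 2 * X.T @ (X @ w - y) / len(y)
    w -= learning_rate * grad

print("learned parameters w:", w)
```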
How can we overcome getting stuck in a local minima?
Using randomness or stochastic gradient descent is a great way to do this. However, the landscapes of cost functions are increasingly flat, so it becomes harder and harder.
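As an illustration of how randomness can help (a toy example of my own, not from the lecture), the sketch below runs noisy gradient descent from several random starting points on a small non-convex cost and keeps the best result, so a single bad basin does not trap the optimiser.

```python
import numpy as np

# A simple non-convex cost with several local minima.
def cost(w):
    return np.sin(3 * w) + 0.1 * w ** 2

def grad(w):
    return 3 * np.cos(3 * w) + 0.2 * w

rng = np.random.default_rng(2)
eta, n_steps = 0.01, 2000

best_w, best_c = None, np.inf
# Random restarts: run gradient descent from several random starting points.
for start in rng.uniform(-5, 5, size=10):
    w = start
    for _ in range(n_steps):
        # A little noise per step mimics the stochasticity of SGD.
        w -= eta * (grad(w) + 0.1 * rng.normal())
    if cost(w) < best_c:
        best_w, best_c = w, cost(w)

print(f"best minimum found: w = {best_w:.3f}, cost = {best_c:.3f}")
```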
How can we minimize a cost function without true labels?
This is still quite difficult. In an unsupervised clustering setting, one approach is to minimise the intra-cluster distances while maximising the distances between different clusters.
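For instance, a k-means-style within-cluster cost can be written down without any labels. The following toy sketch (an illustrative assumption, not something covered in the lecture) minimises that cost with a few Lloyd iterations on made-up, unlabelled data.

```python
import numpy as np

rng = np.random.default_rng(3)

# Unlabelled data drawn from two blobs; we never see the true labels.
X = np.vstack([rng.normal(-2, 0.5, size=(50, 2)),
               rng.normal(+2, 0.5, size=(50, 2))])

def unsupervised_cost(X, centres):
    """Sum of squared distances from each point to its nearest centre
    (the within-cluster term that k-means minimises)."""
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    return np.sum(np.min(d, axis=1) ** 2)

# A few Lloyd / k-means iterations: assign points to the nearest centre,
# then move each centre to the mean of its assigned points.
centres = X[rng.choice(len(X), size=2, replace=False)]
for _ in range(10):
    d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
    labels = np.argmin(d, axis=1)
    centres = np.array([X[labels == k].mean(axis=0) for k in range(2)])
    print("cost:", round(unsupervised_cost(X, centres), 2))
```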
Can the cost function be negative?
Yes. The sign of the cost has no special meaning: subtracting a constant from a non-negative cost, for example, shifts its values below zero without changing where the minimum is.
How can one master quantum computing without knowing the mathematics? Can you please share the courses you took at the beginning of your journey?
It is definitely possible; Amira herself came from a finance background. The Nielsen and Chuang book (Quantum Computation and Quantum Information) is a good reference. The resources and course materials covered in the summer school are, in Amira's opinion, among the best. Just keep at it and revisit the material again and again.
How can we know that it is a local minimum that we have reached and not the global minimum (during the model training)?
For certain types of models it is almost impossible to say. The cost landscape of a neural network is very complicated; studying it could be the work of a lifetime. For certain models in certain set-ups we know there is a global minimum, but in general it is very hard to tell.
At the beginning of the lecture, an image-pixel example was given. How does only one data point define a pixel, given that a pixel is not only dark or light? Are one-dimensional vectors always enough to define a feature set?
The example given in the lecture is very simplified. Pictures can be more complicated and represented by more complicated models: a colour picture, for example, is represented by RGB pixel values rather than just +1 and -1, and the representation can have more dimensions, e.g. as tensors. If you are interested in a concrete example of representing pictures as tensors, look up convolutional neural networks; there are lots of blog posts on what these objects look like and how we can model them using neural network architectures.
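As a small illustrative sketch (image sizes and pixel values are made up), here is how the simplified ±1 pixel vector from the lecture compares with an RGB image stored as a tensor:

```python
import numpy as np

rng = np.random.default_rng(4)

# The lecture's simplified view: a 28x28 black-and-white image where each
# pixel is just +1 (light) or -1 (dark), flattened into one long vector.
bw_image = rng.choice([-1, +1], size=(28, 28))
feature_vector = bw_image.reshape(-1)                # shape (784,)

# A more realistic colour image: a 3-dimensional tensor of RGB intensities,
# one 28x28 grid per colour channel.
rgb_image = rng.integers(0, 256, size=(28, 28, 3))   # shape (28, 28, 3)

print(feature_vector.shape, rgb_image.shape)
```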
Can you talk a little bit about your current research, Amira?
My research is on the capacity of statistical models: if you come to me with a model, I want to be able to tell you how well the model will generalise and how good it is, in other words how to measure the capacity of statistical models. A recent publication in Nature Computational Science, The Power of Quantum Neural Networks, introduces the effective dimension for this. A lot of my work is now on measuring this more deeply and on capacity control. Ultimately, we want to be able to say how quantum models have different capacities in different scenarios and how good a given model is.
Other resources
Read Li, H., Xu, Z., Taylor, G., Studer, C., Goldstein, T., on Visualizing the Loss Landscape of Neural Nets
Read Abbas, A., Sutter, D., Zoufal, C., Lucchi, A., Figalli, A., Woerner, S., on The Power of Quantum Neural Networks
Deep Learning by Ian Goodfellow et al.