"Guiding Future STEM Leaders through Innovative Research Training" ~ thinkingbeyond.education
MNIST Digit Classification: Comparing Classical and Quantum Approaches to Hyperparameter Tuning
Objective:
We wanted to explore a specific aspect of machine-learning models in depth. After discussing options within the team and with our mentor, Dr. Emilie Gregoire, we decided to focus on hyperparameter tuning.
Motivation:
Traditional hyperparameter tuning methods such as grid search and random search are widely used, but each has limitations. Grid search evaluates evenly spaced points and can overlook optimal values that fall between them, while random search can miss important regions of the search space entirely. To address these challenges, we explored the RX gate approach, which uses quantum circuits to help find good hyperparameters for machine learning models. The rotation gates produce a smooth, wave-like output pattern that allows effective exploration of the hyperparameter space, which is particularly useful when many hyperparameters are involved. We chose this approach because it uses the distinctive properties of quantum circuits to mitigate the weaknesses of traditional methods and offers a new way to tune hyperparameters effectively.
Research Question:
How does the optimal performance achievable by Support Vector Machines (SVMs) compare with that achievable by Multi-Layer Perceptrons (MLPs) when each model is optimized through hyperparameter tuning?
Method and Implementation:
For the SVM we used the classical method, looping over both the regularization constant C and the gamma parameter (the kernel coefficient of the Radial Basis Function (RBF) kernel used in the SVM model). The MLP hyperparameters were tuned with a quantum approach: circuits designed in PennyLane apply rotation gates to 3 qubits to generate quantum outputs. These outputs are translated into classical hyperparameters, namely the learning rate, the number of neurons, and the regularization strength, for an MLP classifier. A fitness function evaluates each hyperparameter set through its Cross-Validation accuracy score.
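The translation from quantum outputs to classical hyperparameters can be sketched as follows. This is a minimal NumPy illustration, not the project's actual PennyLane code: it uses the analytic fact that an RX(theta) rotation on |0> yields a Pauli-Z expectation value of cos(theta), and the hyperparameter ranges chosen here are illustrative assumptions, not the ones used in the project.

```python
import numpy as np

def rx_expectations(thetas):
    # For a qubit starting in |0>, an RX(theta) rotation gives <Z> = cos(theta),
    # so the 3-qubit circuit's outputs can be emulated analytically.
    return np.cos(np.asarray(thetas, dtype=float))

def to_hyperparams(expvals):
    # Map each expectation value from [-1, 1] into [0, 1], then scale into
    # (assumed, illustrative) ranges for the three MLP hyperparameters.
    u = (np.asarray(expvals, dtype=float) + 1.0) / 2.0
    lr = 10 ** (-4 + 3 * u[0])        # learning rate in [1e-4, 1e-1], log scale
    n_neurons = int(16 + 112 * u[1])  # hidden units in [16, 128]
    alpha = 10 ** (-5 + 4 * u[2])     # L2 regularization in [1e-5, 1e-1]
    return lr, n_neurons, alpha

# Example: three rotation angles produce one candidate hyperparameter set.
lr, n_neurons, alpha = to_hyperparams(rx_expectations([0.3, 1.2, 2.0]))
```

Because the expectation values vary smoothly with the rotation angles, sweeping the angles traces the smooth, wave-like pattern over the hyperparameter space described above.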
Results:
For SVM Model:
Using fixed hyperparameters:
Gamma = 0.05
C = 1.0
Accuracy: 0.9755
Variation of Learning Time and Accuracy:
With respect to Gamma:
Accuracy peaked at Gamma = 0.026
Reached a peak accuracy of 0.967
Regularization Constant (C) set as 1
With respect to C:
Accuracy peaked at C = 1.26 to 1.3
Reached a peak accuracy of 0.963 at Gamma = 0.026 (Peak Accuracy in Hyperparameter Tuning)
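The classical sweep behind these SVM results can be sketched as below. This is a simplified illustration, not the project's exact code: it uses scikit-learn's small 8x8 digits dataset as a fast stand-in for MNIST, and the gamma and C grids shown are a few representative values around those reported above.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Small stand-in for MNIST; pixel values scaled to [0, 1].
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X / 16.0, y, test_size=0.3, random_state=0)

best = (0.0, None, None)  # (accuracy, gamma, C)
for gamma in [0.01, 0.026, 0.05, 0.1]:
    for C in [0.5, 1.0, 1.3]:
        clf = SVC(kernel="rbf", gamma=gamma, C=C).fit(X_train, y_train)
        acc = clf.score(X_test, y_test)
        if acc > best[0]:
            best = (acc, gamma, C)
```

Each (gamma, C) pair is trained and scored independently, so the loop directly exposes the accuracy peaks and plateaus described in the results.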
For MLP Model:
Cross-Validation Accuracy:
Fold 1 achieved an accuracy of 94.38%.
Fold 2 achieved an accuracy of 93.44%.
Fold 3 achieved an accuracy of 94.07%.
Fitness Score:
The fitness score, i.e. the average Cross-Validation accuracy across all folds, is 93.96%.
High Accuracy:
A fitness score close to 94% demonstrates that the hyperparameters derived from the quantum circuit provide a strong starting point for the classical MLP.
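The fitness function producing these fold scores can be sketched as the mean 3-fold cross-validation accuracy of an MLP for a given hyperparameter set. This is a minimal sketch, not the project's exact implementation: the digits dataset and the small network configuration are stand-ins chosen to keep the example fast.

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)

def fitness(lr, n_neurons, alpha):
    # Score one candidate hyperparameter set (e.g. one decoded from the
    # quantum circuit's outputs) by 3-fold cross-validation accuracy.
    clf = MLPClassifier(hidden_layer_sizes=(n_neurons,),
                        learning_rate_init=lr, alpha=alpha,
                        max_iter=200, random_state=0)
    scores = cross_val_score(clf, X / 16.0, y, cv=3)  # one score per fold
    return scores.mean()

score = fitness(lr=1e-3, n_neurons=64, alpha=1e-4)
```

Averaging over folds, rather than using a single train/test split, makes the fitness score less sensitive to how the data happens to be partitioned.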
Conclusion:
For the SVM Model:
A combination of:
Gamma = 0.05 and C = 1.0
Returned the highest accuracy of 0.9755.
Gamma Parameter:
Accuracy rate:
Increased with the value of Gamma.
Plateaued and then decreased with roughly the same slope.
Peaked at Gamma = 0.026 with a peak accuracy of 0.967.
Learning time:
Initially dropped with the value of Gamma.
Increased thereafter and plateaued towards the end.
C Parameter (Regularization Constant):
Accuracy rate:
Initially increased very sharply with the value of C.
Plateaued towards the end.
Peaked at C = 1.26 to 1.3 with a peak accuracy of 0.963.
Learning time:
Initially decreased very sharply with the value of C.
Plateaued towards the end.
For the MLP Model:
The scores for the three Cross-Validation folds (0.9438, 0.9344, and 0.9407) indicate consistently high model performance.
The average Cross-Validation accuracy of 0.9396, represented by the fitness score, indicates reliable overall model performance for the hyperparameters generated by the quantum circuit.
These results suggest that the quantum circuit generated hyperparameters are not only valid but also capable of producing high accuracy results.
Future Work:
Our project showed that the RX gate approach worked better than grid search and random search for hyperparameter tuning by addressing the limitations of those methods. The next step is to expand the research by comparing our method with other sampling approaches, such as Monte Carlo with uniform, normal, and Poisson distributions, to see how well it performs.
The research poster for this project can be found in the BeyondAI Proceedings 2024.