AI: PBL Problem 2

In PBL-1, we used three methods (Perceptron, logistic regression, and SVM) to build machine learning models that classify handwritten digits in the MNIST dataset. In our experience, SVM showed good classification performance, but training an SVM took much longer than training the other methods.


Company M, which asked us to build the classifier in PBL-1, is now interested in using an SVM classifier for the digit classification problem. However, they wonder whether a faster algorithm can be implemented, one that handles a large number of training examples better than the code in scikit-learn. The company has provided us with a list of requirements for the algorithm they want:

  • The algorithm should be based on stochastic gradient descent (incl. mini-batch)
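
One way to read this requirement is as mini-batch stochastic subgradient descent on the L2-regularized hinge loss of a linear SVM. The sketch below is a minimal illustration under that assumption, for the binary case with labels in {-1, +1}. The function name `train_linear_svm_sgd`, its parameters (`lam`, `eta0`, `epochs`, `batch_size`), and the decaying step-size schedule are choices made for this example, not part of the company's specification.

```python
import numpy as np


def train_linear_svm_sgd(X, y, lam=1e-3, eta0=0.1, epochs=20, batch_size=32, seed=0):
    """Mini-batch SGD on lam/2 * ||w||^2 + mean(max(0, 1 - y * (X @ w + b))).

    X: (n, d) float array; y: (n,) array with values in {-1, +1}.
    Returns the weight vector w and the bias b.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    t = 0
    for _ in range(epochs):
        order = rng.permutation(n)  # reshuffle the examples every epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]
            t += 1
            # Assumed schedule: step size decays slowly with the update count.
            eta = eta0 / (1.0 + eta0 * lam * t)
            # Only examples with margin < 1 contribute to the hinge subgradient.
            active = yb * (Xb @ w + b) < 1
            grad_w = lam * w - (yb[active] @ Xb[active]) / len(idx)
            grad_b = -yb[active].sum() / len(idx)
            w -= eta * grad_w
            b -= eta * grad_b
    return w, b
```

Setting `batch_size=1` gives plain stochastic gradient descent. Each update touches only one mini-batch, so the cost per step does not grow with the size of the training set.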


To evaluate the prediction performance of the new algorithms, the company prepared a new dataset of 50,000 handwritten digits from the MNIST data. We are now talking about three datasets:

The company will evaluate your code by its prediction accuracy on the D3 dataset.


  • Your team's Python code should be named <team#>.py

    • Execution arguments: <team#>.py <training_image> <training_label> <test_image>

    • Your code must have all tuned hyperparameter values set within the code.

    • If you're concerned about the training runtime of your code, you can train your model with D1+D2+new 1k images, submit the trained weights, and have your code use the stored weights for testing.

  • You need to submit your code and the prediction outcome on the D3 dataset as a text file, prediction.txt. Each line of the file should contain one predicted digit. For example,

1|

0|

3|

4|

....

where | stands for '\n' as in C/C++.

  • Bonus points: write your code so that GridSearchCV can be used for hyperparameter tuning

    • Stack Overflow [link]

    • Tutorial [link]

    • Official document [link]
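
For the bonus item, GridSearchCV needs an estimator that follows the scikit-learn conventions: hyperparameters are stored unchanged in `__init__`, `fit` returns `self`, and `predict` is available. The sketch below shows one possible shape for such a class. The class name `SGDLinearSVM`, its parameters, the one-vs-rest treatment of the ten digits, and the helper `demo_grid_search` are assumptions made for this example. It uses scikit-learn's small bundled 8x8 digits data (`load_digits`) as a stand-in, not the MNIST files D1 to D3.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.datasets import load_digits
from sklearn.model_selection import GridSearchCV


class SGDLinearSVM(ClassifierMixin, BaseEstimator):
    """One-vs-rest linear SVM trained with mini-batch SGD (illustrative sketch)."""

    def __init__(self, lam=1e-3, eta0=0.1, epochs=20, batch_size=64, random_state=0):
        # Store hyperparameters as given so get_params/set_params/clone work.
        self.lam = lam
        self.eta0 = eta0
        self.epochs = epochs
        self.batch_size = batch_size
        self.random_state = random_state

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        rng = np.random.default_rng(self.random_state)
        self.classes_ = np.unique(y)
        # One column of +1/-1 targets per class (one-vs-rest).
        Y = np.where(y[:, None] == self.classes_[None, :], 1.0, -1.0)
        n, d = X.shape
        self.coef_ = np.zeros((len(self.classes_), d))
        self.intercept_ = np.zeros(len(self.classes_))
        t = 0
        for _ in range(self.epochs):
            order = rng.permutation(n)
            for start in range(0, n, self.batch_size):
                idx = order[start:start + self.batch_size]
                Xb, Yb = X[idx], Y[idx]
                t += 1
                eta = self.eta0 / (1.0 + self.eta0 * self.lam * t)
                scores = Xb @ self.coef_.T + self.intercept_
                # Hinge subgradient w.r.t. the scores, averaged over the batch.
                G = np.where(Yb * scores < 1, -Yb, 0.0) / len(idx)
                self.coef_ -= eta * (self.lam * self.coef_ + G.T @ Xb)
                self.intercept_ -= eta * G.sum(axis=0)
        return self

    def decision_function(self, X):
        return np.asarray(X, dtype=float) @ self.coef_.T + self.intercept_

    def predict(self, X):
        return self.classes_[np.argmax(self.decision_function(X), axis=1)]


def demo_grid_search():
    """Tune lam and eta0 with 3-fold cross-validation on the stand-in data."""
    digits = load_digits()
    X = digits.data / 16.0  # pixel values in this dataset range from 0 to 16
    grid = {"lam": [1e-4, 1e-2], "eta0": [0.01, 0.1]}
    search = GridSearchCV(SGDLinearSVM(), grid, cv=3)
    search.fit(X, digits.target)
    return search
```

After `search = demo_grid_search()`, the selected values are in `search.best_params_` and can be copied into the submitted code as fixed hyperparameters, which matches the requirement that tuned values live inside the code.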