doi: 10.17586/2226-1494-2020-20-6-828-834


HYPERPARAMETER OPTIMIZATION BASED ON A PRIORI AND A POSTERIORI KNOWLEDGE ABOUT CLASSIFICATION PROBLEM

V. S. Smirnova, V. V. Shalamov, V. A. Efimova, A. A. Filchenkov


Article in Russian

For citation:
Smirnova V.S., Shalamov V.V., Efimova V.A., Filchenkov A.A. Hyperparameter optimization based on a priori and a posteriori knowledge about classification problem. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, vol. 20, no. 6, pp. 828–834. (in Russian). doi: 10.17586/2226-1494-2020-20-6-828-834


Abstract
Subject of Research. The paper deals with the Bayesian method for optimizing the hyperparameters of machine learning algorithms in classification problems. A comprehensive survey is presented of how a priori and a posteriori knowledge about a classification task can be used to improve the quality of hyperparameter optimization. Method. The existing Bayesian optimization algorithm for hyperparameter tuning in classification problems was extended. We propose a modification of the target function, computed on the basis of hyperparameters optimized for similar problems, and a metric for determining the similarity of classification problems based on generated meta-features. Main Results. Experiments carried out on real-world datasets from the OpenML database confirm that, within a fixed time limit, the proposed algorithm usually achieves significantly better performance than the existing Bayesian optimization algorithm. Practical Relevance. The proposed algorithm can be used for hyperparameter optimization in any classification problem, for example, in medicine, image processing, or chemistry.
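
To make the approach concrete, the sketch below illustrates the general idea in Python: a Gaussian-process-based Bayesian optimization loop whose acquisition function is biased toward hyperparameter values that performed well on a similar, previously solved problem, with similarity computed from simple dataset meta-features. This is a minimal sketch under stated assumptions, not the authors' algorithm: the choice of meta-features, the similarity measure, the SVM objective, the prior-weighting constant, and the transferred hyperparameter value are all hypothetical.

# Minimal illustrative sketch (not the authors' algorithm): Bayesian
# optimization of one SVM hyperparameter, with the acquisition function
# biased by a value transferred from a similar dataset. All names and
# constants below are illustrative assumptions.
import numpy as np
from scipy.stats import norm
from sklearn.datasets import load_iris, load_wine
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def meta_features(X, y):
    # Hypothetical meta-features: log-size, log-dimensionality, class count.
    return np.array([np.log(X.shape[0]), np.log(X.shape[1]),
                     float(len(np.unique(y)))])

def similarity(mf_a, mf_b):
    # Similarity as inverse Euclidean distance between meta-feature vectors.
    return 1.0 / (1.0 + np.linalg.norm(mf_a - mf_b))

def objective(log_c, X, y):
    # Cross-validated accuracy of an SVM; log_c is log10 of the C parameter.
    return cross_val_score(SVC(C=10.0 ** log_c), X, y, cv=3).mean()

def expected_improvement(candidates, gp, best_y):
    # Standard EI acquisition for maximization over the candidate grid.
    mu, sigma = gp.predict(candidates, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (mu - best_y) / sigma
    return (mu - best_y) * norm.cdf(z) + sigma * norm.pdf(z)

# A priori knowledge: the best value found earlier on a similar dataset
# (hard-coded here purely for the sketch) and that dataset's meta-features.
X_prev, y_prev = load_wine(return_X_y=True)
prev_best_log_c = 0.5  # assumed outcome of an earlier optimization run

X_new, y_new = load_iris(return_X_y=True)
sim = similarity(meta_features(X_prev, y_prev), meta_features(X_new, y_new))

rng = np.random.default_rng(0)
space = np.linspace(-3.0, 3.0, 200).reshape(-1, 1)  # search over log10(C)
obs_x = list(rng.uniform(-3.0, 3.0, size=3))        # random initial design
obs_y = [objective(x, X_new, y_new) for x in obs_x]

for _ in range(10):
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6,
                                  normalize_y=True)
    gp.fit(np.array(obs_x).reshape(-1, 1), np.array(obs_y))
    ei = expected_improvement(space, gp, max(obs_y))
    # Modified acquisition: add a Gaussian bump around the transferred
    # hyperparameter, scaled by how similar the two problems are.
    prior = sim * norm.pdf(space.ravel(), loc=prev_best_log_c, scale=1.0)
    x_next = float(space[np.argmax(ei + 0.1 * prior)][0])
    obs_x.append(x_next)
    obs_y.append(objective(x_next, X_new, y_new))

best = obs_x[int(np.argmax(obs_y))]
print(f"best log10(C) = {best:.2f}, cv accuracy = {max(obs_y):.3f}")

In this sketch the transferred prior only reweights the acquisition function, so with enough evaluations the optimizer can still override a misleading prior; this matches the general spirit of warm-starting Bayesian optimization with meta-knowledge about similar tasks.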

Keywords: machine learning, classification, hyperparameter optimization, Bayesian optimization, Gaussian processes

Acknowledgements. This work was financially supported by the Government of the Russian Federation, Grant 08-08.

References
1. Zhang G.P. Neural networks for classification: a survey. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 2000, vol. 30, no. 4, pp. 451–462. doi: 10.1109/5326.897072
2. Aly M. Survey on multiclass classification methods. Technical Report, Caltech, California Institute of Technology, 2005, 9 p.
3. Liaw A., Wiener M. Classification and regression by randomForest. R News, 2002, vol. 2, no. 3, pp. 18–22.
4. Hastie T., Tibshirani R., Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer Science & Business Media, 2009, 745 p.
5. Van Der Malsburg C. Frank Rosenblatt: principles of neurodynamics: perceptrons and the theory of brain mechanisms. Brain Theory, Springer, 1986, pp. 245–248. doi: 10.1007/978-3-642-70911-1_20
6. Muravyov S.B., Efimova V.A., Shalamov V.V., Filchenkov A.A., Smetannikov I.B. Automatic hyperparameter optimization for clustering algorithms with reinforcement learning. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2019, vol. 19, no. 3, pp. 508–515. (in Russian). doi: 10.17586/2226-1494-2019-19-3-508-515
7. Efimova V.A., Filchenkov A.A., Shalyto A.A. Reinforcement-based simultaneous classification model and its hyperparameters selection. Machine Learning and Data Analysis, 2016, vol. 2, no. 2, pp. 244–254. (in Russian)
8. Yu T., Zhu H. Hyper-parameter optimization: A review of algorithms and applications. arXiv preprint, arXiv:2003.05689, 2020.
9. Chang C.-C., Lin C.-J. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology, 2011, vol. 2, no. 3, pp. 27. doi: 10.1145/1961189.1961199
10. Pedregosa F., Varoquaux G., Gramfort A., Michel V., Thirion B., Grisel O., Blondel M., Prettenhofer P., Weiss R., Dubourg V., Vanderplas J., Passos A., Cournapeau D., Brucher M., Perrot M., Duchesnay É. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 2011, vol. 12, pp. 2825–2830.
11. Bergstra J., Komer B., Eliasmith C., Yamins D., Cox D.D. Hyperopt: a Python library for model selection and hyperparameter optimization. Computational Science & Discovery, 2015, vol. 8, no. 1, pp. 014008. doi: 10.1088/1749-4699/8/1/014008
12. Maclaurin D., Duvenaud D., Adams R. Gradient-based hyperparameter optimization through reversible learning. Proc. 32nd International Conference on Machine Learning (ICML), 2015, pp. 2113–2122.
13. Bergstra J.S., Bardenet R., Bengio Y., Kégl B. Algorithms for hyper-parameter optimization. Advances in Neural Information Processing Systems: Proc. 25th Annual Conference on Neural Information Processing Systems (NIPS 2011), 2011, pp. 2546–2554.
14. Gijsbers P., Vanschoren J., Olson R.S. Layered TPOT: Speeding up tree-based pipeline optimization. arXiv preprint, arXiv:1801.06007, 2018.
15. Fortin F.-A., De Rainville F.-M., Gardner M.-A., Parizeau M., Gagné C. DEAP: Evolutionary algorithms made easy. Journal of Machine Learning Research, 2012, vol. 13, pp. 2171–2175.
16. Hazan E., Klivans A., Yuan Y. Hyperparameter optimization: A spectral approach. arXiv preprint, arXiv:1706.00764, 2017.
17. Martinez-Cantin R. BayesOpt: A Bayesian optimization library for nonlinear optimization, experimental design and bandits. Journal of Machine Learning Research, 2014, vol. 15, pp. 3735–3739.
18. Thornton C., Hutter F., Hoos H.H., Leyton-Brown K. Auto-WEKA: Combined selection and hyperparameter optimization of classification algorithms. Proc. 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2013), 2013, pp. 847–855. doi: 10.1145/2487575.2487629
19. Feurer M., Klein A., Eggensperger K., Springenberg J.T., Blum M., Hutter F. Efficient and robust automated machine learning. Advances in Neural Information Processing Systems, 2015, pp. 2962–2970.
20. Bischl B., Richter J., Bossek J., Horn D., Thomas J., Lang M. mlrMBO: A modular framework for model-based optimization of expensive black-box functions. arXiv preprint, arXiv:1703.03373, 2017.
21. Probst P., Wright M.N., Boulesteix A.-L. Hyperparameters and tuning strategies for random forest. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, 2019, vol. 9, no. 3, pp. e1301. doi: 10.1002/widm.1301
22. Hutter F., Hoos H.H., Leyton-Brown K. Sequential model-based optimization for general algorithm configuration. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2011, vol. 6683, pp. 507–523. doi: 10.1007/978-3-642-25566-3_40
23. Feurer M., Hutter F. Hyperparameter optimization. Automated Machine Learning. Springer, 2019, pp. 3–33. doi: 10.1007/978-3-030-05318-5_1
24. Springenberg J.T., Klein A., Falkner S., Hutter F. Bayesian optimization with robust Bayesian neural networks. Advances in Neural Information Processing Systems, 2016, pp. 4141–4149.
25. Snoek J., Rippel O., Swersky K., Kiros R., Satish N., Sundaram N., Patwary M.M.A., Prabhat, Adams R.P. Scalable Bayesian optimization using deep neural networks. Proc. 32nd International Conference on Machine Learning (ICML), 2015, pp. 2171–2180.
26. Klein A., Falkner S., Mansur N., Hutter F. RoBO: A flexible and robust Bayesian optimization framework in Python. Proc. 31st Conference on Neural Information Processing Systems, 2017.
27. Mann H.B., Whitney D.R. On a test of whether one of two random variables is stochastically larger than the other. Annals of Mathematical Statistics, 1947, vol. 18, no. 1, pp. 50–60.



This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License