doi: 10.17586/2226-1494-2020-20-5-667-676


METHOD FOR HYPERPARAMETER TUNING IN MACHINE LEARNING TASKS FOR STOCHASTIC OBJECTS CLASSIFICATION 
 

A. V. Timofeev


Article in Russian

For citation:
Timofeev A.V. Method for hyperparameter tuning in machine learning tasks for stochastic objects classification. Scientific and Technical Journal of Information Technologies, Mechanics and Optics, 2020, vol. 20, no. 5, pp. 667–676 (in Russian). doi: 10.17586/2226-1494-2020-20-5-667-676


Abstract
Subject of Research. The paper presents a simple and practically effective method for hyperparameter tuning in classification problems solved by machine learning. The method is applicable to any real-valued hyperparameters whose values lie within a known parametric compact. Method. A small random sample (trial grid) is generated within the parametric compact, and for each of its elements the tuning efficiency is computed according to a special criterion: a real scalar that does not depend on the classification threshold. This yields a regression data set whose regressors are the random hyperparameter sets drawn from the parametric compact and whose response values are the corresponding classification efficiency indicators. A nonparametric approximation of this regression is constructed from the formed data set. At the next stage, the minimum of the constructed approximation of the regression function over the parametric compact is found by the Nelder-Mead optimization method. The arguments of this minimum serve as an approximate solution of the problem. Main Results. Unlike traditional approaches, the proposed one is based on a nonparametric approximation of the regression function that maps a set of hyperparameters to the value of the classification efficiency index. Particular attention is paid to the choice of the classification quality criterion. This kind of approximation makes it possible to study the behavior of the performance indicator away from the trial grid values (“between” its nodes). Experiments carried out on various databases show that the proposed approach provides a significant gain in hyperparameter tuning efficiency over the baseline variants while maintaining acceptable performance even for small trial grid sizes.
The novelty of the approach lies in the simultaneous use of a nonparametric approximation of the regression function linking hyperparameter values to the corresponding values of the quality criterion, the choice of the classification quality criterion, and the method for finding the global extremum of this function. Practical Relevance. The proposed hyperparameter tuning algorithm can be used in any system built on machine learning principles, for example, in process control systems, biometric systems, and machine vision systems.
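The procedure described in the abstract can be sketched in a few steps. The sketch below is illustrative, not the author's implementation: `toy_quality` is a hypothetical stand-in for training a classifier and scoring it with a threshold-independent criterion, Nadaraya-Watson kernel regression stands in for the nonparametric approximation (the paper uses SV-regression and gradient boosting regression), and a dense random search replaces the Nelder-Mead step so the sketch stays dependency-free:

```python
import math
import random

def make_surrogate(points, values, bandwidth=0.15):
    """Nadaraya-Watson kernel regression: a nonparametric approximation
    of the map from a hyperparameter set to the quality criterion value."""
    def surrogate(h):
        num = den = 0.0
        for p, v in zip(points, values):
            d2 = sum((a - b) ** 2 for a, b in zip(h, p))
            w = math.exp(-d2 / (2.0 * bandwidth ** 2))
            num += w * v
            den += w
        return num / den
    return surrogate

def tune(quality, low, high, dim=2, n_trials=150, n_candidates=2000, seed=0):
    rng = random.Random(seed)
    # 1) random trial grid inside the parametric compact [low, high]^dim
    trials = [tuple(rng.uniform(low, high) for _ in range(dim))
              for _ in range(n_trials)]
    # 2) threshold-independent quality criterion at every trial point
    values = [quality(h) for h in trials]
    # 3) nonparametric regression over the (hyperparameters, quality) data set
    surrogate = make_surrogate(trials, values)
    # 4) minimize the surrogate over the compact; the paper applies
    #    Nelder-Mead here, a dense random search is a stdlib-only stand-in
    candidates = trials + [tuple(rng.uniform(low, high) for _ in range(dim))
                           for _ in range(n_candidates)]
    return min(candidates, key=surrogate)

# Hypothetical stand-in for "train the classifier with hyperparameters h
# and return a threshold-independent loss" (lower is better).
def toy_quality(h):
    return (h[0] - 0.3) ** 2 + (h[1] - 0.7) ** 2

best = tune(toy_quality, 0.0, 1.0)
print(best)  # a point near the true optimum (0.3, 0.7)
```

Because the surrogate is smooth between the trial grid nodes, the minimizer found this way can land between trial points, which is exactly the advantage the abstract claims over picking the best raw grid point.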

Keywords: hyperparameter tuning, machine learning, multiclass gradient boosting classifier, multiclass SVM-classifier, SV-regression, gradient boosting regression, Nelder-Mead method




This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License