TY - JOUR
T1 - Design of experiments and response surface methodology to tune machine learning hyperparameters, with a random forest case-study
AU - Lujan-Moreno, Gustavo A.
AU - Howard, Phillip R.
AU - Rojas, Omar G.
AU - Montgomery, Douglas C.
N1 - Publisher Copyright:
© 2018
PY - 2018/11/1
Y1 - 2018/11/1
N2 - Most machine learning algorithms possess hyperparameters. For example, an artificial neural network requires the determination of the number of hidden layers, nodes, and many other parameters related to the model fitting process. Despite this, there is still no clear consensus on how to tune them. The most popular methodology is an exhaustive grid search, which can be highly inefficient and sometimes infeasible. Another common solution is to change one hyperparameter at a time and measure its effect on the model's performance. However, this can also be inefficient and does not guarantee optimal results since it ignores interactions between the hyperparameters. In this paper, we propose to use the Design of Experiments (DOE) methodology (factorial designs) for screening and Response Surface Methodology (RSM) to tune a machine learning algorithm's hyperparameters. An application of our methodology is presented with a detailed discussion of the results of a random forest case-study using a publicly available dataset. Benefits include fewer training runs, better parameter selection, and a disciplined approach based on statistical theory.
KW - Design of experiments
KW - Hyperparameters
KW - Machine learning
KW - Random forest
KW - Response surface methodology
KW - Tuning
UR - http://www.scopus.com/inward/record.url?scp=85047768020&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85047768020&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2018.05.024
DO - 10.1016/j.eswa.2018.05.024
M3 - Article
AN - SCOPUS:85047768020
SN - 0957-4174
VL - 109
SP - 195
EP - 205
JO - Expert Systems With Applications
JF - Expert Systems With Applications
ER -