Improving robustness of random forest under label noise

Xu Zhou, Pak Lun Kevin Ding, Baoxin Li

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Random forest is a well-known and widely used machine learning model. In many applications where the training data arise from real-world sources, the data may contain labeling errors. Despite its strong performance, the basic random forest model does not account for potential label noise during learning, so its performance can suffer significantly when label noise is present. To address this problem, we present a new variation of random forest: a novel learning approach that leads to an improved noise-robust random forest (NRRF) model. We incorporate the noise information by introducing a global multi-class noise-tolerant loss function into the training of the classic random forest model. This new loss function was found to significantly boost the performance of random forest. We evaluated the proposed NRRF through extensive classification experiments on standard machine learning and computer vision datasets such as MNIST, Letter, and CIFAR-10. The proposed NRRF produced very promising results under a wide range of noise settings.
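
The abstract refers to evaluation "under a wide range of noise settings" without naming the noise model. As a minimal sketch only, the snippet below shows one common way such a setting can be simulated: symmetric (uniform) label noise, where each training label is flipped to a different class with a fixed probability. The function name `flip_labels`, its parameters, and the choice of symmetric noise are illustrative assumptions; this is not the authors' NRRF method or their experimental protocol.

```python
import random
from typing import List, Optional


def flip_labels(labels: List[int], num_classes: int, noise_rate: float,
                seed: Optional[int] = None) -> List[int]:
    """Return a copy of `labels` with symmetric label noise applied.

    Hypothetical helper: each label (assumed to lie in [0, num_classes))
    is replaced, with probability `noise_rate`, by a different class
    drawn uniformly from the remaining `num_classes - 1` classes.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    rng = random.Random(seed)
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Draw from the other classes so a flip always changes the label.
            other = rng.randrange(num_classes - 1)
            noisy.append(other if other < y else other + 1)
        else:
            noisy.append(y)
    return noisy


if __name__ == "__main__":
    clean = [i % 10 for i in range(1000)]  # toy labels for a 10-class task
    noisy = flip_labels(clean, num_classes=10, noise_rate=0.3, seed=0)
    changed = sum(a != b for a, b in zip(clean, noisy))
    print(f"{changed} of {len(clean)} labels flipped")
```

Training any classifier on the noisy labels and scoring it against clean test labels, for several values of `noise_rate`, gives the kind of robustness comparison the abstract describes.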

    Original language: English (US)
    Title of host publication: Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
    Publisher: Institute of Electrical and Electronics Engineers Inc.
    Pages: 950-958
    Number of pages: 9
    ISBN (Electronic): 9781728119755
    DOIs: https://doi.org/10.1109/WACV.2019.00106
    State: Published - Mar 4 2019
    Event: 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019 - Waikoloa Village, United States
    Duration: Jan 7 2019 to Jan 11 2019

    Publication series

    Name: Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019

    Conference

    Conference: 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019
    Country: United States
    City: Waikoloa Village
    Period: 1/7/19 to 1/11/19

    Fingerprint

    Labels
    Learning systems
    Labeling
    Computer vision
    Experiments

    ASJC Scopus subject areas

    • Computer Vision and Pattern Recognition
    • Computer Science Applications

    Cite this

    Zhou, X., Ding, P. L. K., & Li, B. (2019). Improving robustness of random forest under label noise. In Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019 (pp. 950-958). [8658395] (Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WACV.2019.00106

    @inproceedings{dd6d9d10fd844deda3f306c93d1c79d0,
    title = "Improving robustness of random forest under label noise",
    author = "Xu Zhou and Ding, {Pak Lun Kevin} and Baoxin Li",
    year = "2019",
    month = "3",
    day = "4",
    doi = "10.1109/WACV.2019.00106",
    language = "English (US)",
    series = "Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019",
    publisher = "Institute of Electrical and Electronics Engineers Inc.",
    pages = "950--958",
    booktitle = "Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019",

    }

    TY - GEN

    T1 - Improving robustness of random forest under label noise

    AU - Zhou, Xu

    AU - Ding, Pak Lun Kevin

    AU - Li, Baoxin

    PY - 2019/3/4

    Y1 - 2019/3/4

    UR - http://www.scopus.com/inward/record.url?scp=85063576976&partnerID=8YFLogxK

    UR - http://www.scopus.com/inward/citedby.url?scp=85063576976&partnerID=8YFLogxK

    U2 - 10.1109/WACV.2019.00106

    DO - 10.1109/WACV.2019.00106

    M3 - Conference contribution

    T3 - Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019

    SP - 950

    EP - 958

    BT - Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019

    PB - Institute of Electrical and Electronics Engineers Inc.

    ER -