Evaluating the Positive Unlabeled Learning Problem

Kristen Jaskie, Andreas Spanias

Research output: Chapter in Book/Report/Conference proceeding › Chapter

Abstract

Evaluating PU learning models poses challenges that are not present when evaluating standard supervised classification models. Because all negative labels are missing in PU datasets, standard evaluation techniques that rely on computing a full truth table (confusion matrix) cannot be used: neither true negatives nor false negatives can be counted. As a result, neither a model's precision nor its accuracy can be calculated. Even the methods used to train PU models differ, as the standard supervised train-validate-test procedure is not possible when a substantial portion of the training dataset is unlabeled. Supervised classification uses inductive learning to train a model that can then be applied to new, unlabeled data, as shown in Figure 3.1a. In PU learning, as with many semi-supervised learning methods, either inductive or transductive learning is possible. The differences between these are summarized in Figures 3.1b and 3.1c.
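The evaluation limitation above can be illustrated with a toy sketch. The snippet below is a hypothetical example (the dataset, labeling rate, and variable names are assumptions, not from the chapter): it builds a synthetic PU dataset in which only a fraction of the true positives carry a label, then shows that recall can still be estimated on the known labeled positives, while precision cannot be computed because the unlabeled samples among the predicted positives are an unknown mix of hidden positives and true negatives.

```python
import numpy as np

# Hypothetical toy data. y_true exists here only for illustration;
# in a real PU dataset the ground truth is unavailable -- we only see
# s (1 = labeled positive, 0 = unlabeled).
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)                         # hidden ground truth
s = np.where((y_true == 1) & (rng.random(1000) < 0.4), 1, 0)   # ~40% of positives labeled

# Some classifier's predictions (random here, as a stand-in).
y_pred = rng.integers(0, 2, size=1000)

# Recall CAN be estimated on the labeled positives (s == 1),
# because every labeled sample is a known true positive:
labeled = s == 1
recall_est = y_pred[labeled].mean()

# Precision CANNOT be computed: among the predicted positives, the
# unlabeled samples (s == 0) mix hidden positives with true negatives,
# so true positives and false positives cannot be separated.
predicted_pos = y_pred == 1
unlabeled_among_predicted = (s[predicted_pos] == 0).mean()

print(f"estimated recall on labeled positives: {recall_est:.2f}")
print(f"fraction of predicted positives that are unlabeled: {unlabeled_among_predicted:.2f}")
```

The recall estimate is valid only under the usual selected-completely-at-random labeling assumption; the point of the sketch is that no analogous estimate exists for precision or accuracy without negative labels.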

Original language: English (US)
Title of host publication: Synthesis Lectures on Artificial Intelligence and Machine Learning
Publisher: Springer Nature
Pages: 35-46
Number of pages: 12
DOIs
State: Published - 2022

Publication series

Name: Synthesis Lectures on Artificial Intelligence and Machine Learning
ISSN (Print): 1939-4608
ISSN (Electronic): 1939-4616

ASJC Scopus subject areas

  • Artificial Intelligence
