S2: An efficient graph based active learning algorithm with application to nonparametric classification

Gautam Dasarathy, Robert Nowak, Xiaojin Zhu

Research output: Contribution to journal (Conference article)

Abstract

This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S2 for this task. At each step, S2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S2 queries for the label of the vertex that bisects the shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S2 algorithm to the theory of nonparametric active learning. In particular, we show that S2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.
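The selection rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `oracle` callback, the `seeds` and `budget` parameters, and the choice to delete an edge once its two endpoints are known to carry opposite labels (so that the search moves on to other parts of the boundary) are all assumptions made here for illustration. Any random-sampling phase and the final label-completion step are omitted.

```python
from collections import deque


def shortest_shortest_path(adj, labels):
    """Shortest path from any +1-labeled vertex to any -1-labeled vertex, or None."""
    sources = [v for v, y in labels.items() if y == 1]
    parent = {v: None for v in sources}
    queue = deque(sources)
    # Multi-source BFS: the first -1 vertex dequeued is the closest one.
    while queue:
        u = queue.popleft()
        if labels.get(u) == -1:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w in adj[u]:
            if w not in parent:
                parent[w] = u
                queue.append(w)
    return None


def remove_cut_edges(adj, labels):
    """Assumption of this sketch: drop edges whose endpoints have opposite labels."""
    for u, yu in labels.items():
        for w in list(adj[u]):
            if w in labels and labels[w] != yu:
                adj[u].discard(w)
                adj[w].discard(u)


def s2_sketch(adj, oracle, seeds, budget):
    """Repeatedly query the midpoint of the shortest shortest path (hypothetical driver)."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    labels = {v: oracle(v) for v in seeds}
    queries = []
    while len(queries) < budget:
        remove_cut_edges(adj, labels)
        path = shortest_shortest_path(adj, labels)
        if path is None:  # no remaining path joins opposite labels
            break
        v = path[len(path) // 2]  # interior vertex that bisects the path
        labels[v] = oracle(v)
        queries.append(v)
    return labels, queries
```

For example, on a path graph with vertices 0 through 8 where vertices 0 to 5 are labeled +1 and the rest -1, seeding with the two endpoints leads this sketch to query vertices 4, 6, and 5 before stopping, which mirrors a binary search for the cut edge.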

Original language: English (US)
Journal: Journal of Machine Learning Research
Volume: 40
Issue number: 2015
State: Published - Jan 1 2015
Externally published: Yes
Event: 28th Conference on Learning Theory, COLT 2015 - Paris, France
Duration: Jul 2 2015 - Jul 6 2015

Keywords

  • Active learning on graphs
  • Nonparametric classification
  • Query complexity of finding a cut

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence

Cite this

S2: An efficient graph based active learning algorithm with application to nonparametric classification. / Dasarathy, Gautam; Nowak, Robert; Zhu, Xiaojin.

In: Journal of Machine Learning Research, Vol. 40, No. 2015, 01.01.2015.

@article{bac2d237adfa4b3498fd20e2ce828633,
title = "S2: An efficient graph based active learning algorithm with application to nonparametric classification",
abstract = "This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S2 for this task. At each step, S2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S2 queries for the label of the vertex that bisects the shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S2 algorithm to the theory of nonparametric active learning. In particular, we show that S2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.",
keywords = "Active learning on graphs, Nonparametric classification, Query complexity of finding a cut",
author = "Gautam Dasarathy and Robert Nowak and Xiaojin Zhu",
year = "2015",
month = "1",
day = "1",
language = "English (US)",
volume = "40",
journal = "Journal of Machine Learning Research",
issn = "1532-4435",
publisher = "Microtome Publishing",
number = "2015",
}

TY - JOUR
T1 - S2
T2 - An efficient graph based active learning algorithm with application to nonparametric classification
AU - Dasarathy, Gautam
AU - Nowak, Robert
AU - Zhu, Xiaojin
PY - 2015/1/1
Y1 - 2015/1/1
AB - This paper investigates the problem of active learning for binary label prediction on a graph. We introduce a simple and label-efficient algorithm called S2 for this task. At each step, S2 selects the vertex to be labeled based on the structure of the graph and all previously gathered labels. Specifically, S2 queries for the label of the vertex that bisects the shortest shortest path between any pair of oppositely labeled vertices. We present a theoretical estimate of the number of queries S2 needs in terms of a novel parametrization of the complexity of binary functions on graphs. We also present experimental results demonstrating the performance of S2 on both real and synthetic data. While other graph-based active learning algorithms have shown promise in practice, our algorithm is the first with both good performance and theoretical guarantees. Finally, we demonstrate the implications of the S2 algorithm to the theory of nonparametric active learning. In particular, we show that S2 achieves near minimax optimal excess risk for an important class of nonparametric classification problems.

KW - Active learning on graphs
KW - Nonparametric classification
KW - Query complexity of finding a cut
UR - http://www.scopus.com/inward/record.url?scp=84973394158&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84973394158&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:84973394158
VL - 40
JO - Journal of Machine Learning Research
JF - Journal of Machine Learning Research
SN - 1532-4435
IS - 2015
ER -