### Abstract

The problem of learning the structure of a high-dimensional graphical model from data has received considerable attention in recent years. In many applications, such as sensor networks and proteomics, it is often expensive to obtain samples from all the variables involved simultaneously. For instance, this might involve the synchronization of a large number of sensors or the tagging of a large number of proteins. To address this important issue, we initiate the study of a novel graphical model selection problem, where the goal is to optimize the total number of scalar samples obtained by allowing the collection of samples from only subsets of the variables. We propose a general paradigm for graphical model selection where feedback is used to guide the sampling to high-degree vertices, while obtaining only a few samples from the low-degree ones. We instantiate this framework with two specific active learning algorithms, one of which makes mild assumptions but is computationally expensive, while the other is computationally more efficient but requires stronger (nevertheless standard) assumptions. Whereas the sample complexity of passive algorithms is typically a function of the maximum degree of the graph, we show that the sample complexity of our algorithms is provably smaller and that it depends on a novel local complexity measure that is akin to the average degree of the graph. Finally, we demonstrate the efficacy of our framework via simulations.

Original language | English (US) |
---|---|
Pages | 1356-1364 |
Number of pages | 9 |
State | Published - Jan 1 2016 |
Externally published | Yes |
Event | 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016 - Cadiz, Spain. Duration: May 9 2016 → May 11 2016 |

### Conference

Conference | 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016 |
---|---|
Country | Spain |
City | Cadiz |
Period | 5/9/16 → 5/11/16 |

### ASJC Scopus subject areas

- Artificial Intelligence
- Statistics and Probability

### Cite this

*Active learning algorithms for graphical model selection*. 1356-1364. Paper presented at 19th International Conference on Artificial Intelligence and Statistics, AISTATS 2016, Cadiz, Spain.

**Active learning algorithms for graphical model selection.** / Dasarathy, Gautam; Singh, Aarti; Balcan, Maria F.; Park, Jong H.

Research output: Contribution to conference › Paper

TY - CONF

T1 - Active learning algorithms for graphical model selection

AU - Dasarathy, Gautam

AU - Singh, Aarti

AU - Balcan, Maria F.

AU - Park, Jong H.

PY - 2016/1/1

Y1 - 2016/1/1

UR - http://www.scopus.com/inward/record.url?scp=85054224713&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85054224713&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85054224713

SP - 1356

EP - 1364

ER -