Semi-supervised multi-task learning for predicting interactions between HIV-1 and human proteins

Yanjun Qi; Oznur Tastan; Jaime G. Carbonell; Judith Klein-Seetharaman; Jason Weston

doi:10.1093/bioinformatics/btq394

Research output: Contribution to journal › Article › peer-review

107 Scopus citations

Abstract

Motivation: Protein-protein interactions (PPIs) are critical for virtually every biological function. Recently, researchers suggested to use supervised learning for the task of classifying pairs of proteins as interacting or not. However, its performance is largely restricted by the availability of truly interacting proteins (labeled). Meanwhile, there exists a considerable amount of protein pairs where an association appears between two partners, but not enough experimental evidence to support it as a direct interaction (partially labeled).

Results: We propose a semi-supervised multi-task framework for predicting PPIs from not only labeled, but also partially labeled reference sets. The basic idea is to perform multi-task learning on a supervised classification task and a semi-supervised auxiliary task. The supervised classifier trains a multi-layer perceptron network for PPI predictions from labeled examples. The semi-supervised auxiliary task shares network layers of the supervised classifier and trains with partially labeled examples. Semi-supervision could be utilized in multiple ways. We tried three approaches in this article, (i) classification (to distinguish partial positives with negatives); (ii) ranking (to rate partial positive more likely than negatives); (iii) embedding (to make data clusters get similar labels). We applied this framework to improve the identification of interacting pairs between HIV-1 and human proteins. Our method improved upon the state-of-the-art method for this task indicating the benefits of semi-supervised multi-task learning using auxiliary information.
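The shared-layer idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the single tanh hidden layer, the logistic loss for the supervised task, the margin-based loss for the ranking variant of the auxiliary task, the `aux_weight` trade-off, and every function name below are assumptions chosen for illustration only.

```python
import math

def shared_layer(x, W, b):
    """Hidden representation used by both tasks (assumed: one tanh layer)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + bi)
            for row, bi in zip(W, b)]

def head(h, v, c):
    """Task-specific linear output on top of the shared representation."""
    return sum(vi * hi for vi, hi in zip(v, h)) + c

def logistic_loss(score, label):
    """Supervised loss; label is +1 (interacting) or -1 (not interacting)."""
    return math.log1p(math.exp(-label * score))

def ranking_loss(score_partial_pos, score_neg, margin=1.0):
    """Auxiliary loss: a partial positive should outscore a negative by a margin."""
    return max(0.0, margin - score_partial_pos + score_neg)

def joint_loss(params, labeled, partial_pairs, aux_weight=0.5):
    """Supervised loss on labeled pairs plus weighted auxiliary ranking loss.

    Both terms pass through the same W and b, so training on the partially
    labeled pairs also shapes the representation used by the main classifier.
    """
    W, b, v_main, c_main, v_aux, c_aux = params
    main = sum(
        logistic_loss(head(shared_layer(x, W, b), v_main, c_main), y)
        for x, y in labeled
    ) / len(labeled)
    aux = sum(
        ranking_loss(
            head(shared_layer(x_pos, W, b), v_aux, c_aux),
            head(shared_layer(x_neg, W, b), v_aux, c_aux),
        )
        for x_pos, x_neg in partial_pairs
    ) / len(partial_pairs)
    return main + aux_weight * aux

# Toy usage with made-up 2-dimensional feature vectors for protein pairs.
params = ([[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0], [0.0, 0.0], 0.0, [0.0, 0.0], 0.0)
labeled = [([1.0, 0.5], 1), ([0.2, -0.3], -1)]
partial_pairs = [([0.9, 0.1], [0.1, -0.4])]
loss = joint_loss(params, labeled, partial_pairs)
```

In a real training loop the parameters would be updated by gradient descent on this joint objective; the sketch only evaluates the loss to show how the two tasks share the first layer.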

Original language	English (US)
Article number	btq394
Pages (from-to)	i645-i652
Journal	Bioinformatics
Volume	26
Issue number	18
DOIs	https://doi.org/10.1093/bioinformatics/btq394
State	Published - Sep 4 2010
Externally published	Yes

ASJC Scopus subject areas

Statistics and Probability
Biochemistry
Molecular Biology
Computer Science Applications
Computational Theory and Mathematics
Computational Mathematics

Cite this

@article{b31a623b33f44147970bccb8a885ff23,
title = "Semi-supervised multi-task learning for predicting interactions between HIV-1 and human proteins",
abstract = "Motivation: Protein-protein interactions (PPIs) are critical for virtually every biological function. Recently, researchers suggested to use supervised learning for the task of classifying pairs of proteins as interacting or not. However, its performance is largely restricted by the availability of truly interacting proteins (labeled). Meanwhile, there exists a considerable amount of protein pairs where an association appears between two partners, but not enough experimental evidence to support it as a direct interaction (partially labeled). Results: We propose a semi-supervised multi-task framework for predicting PPIs from not only labeled, but also partially labeled reference sets. The basic idea is to perform multi-task learning on a supervised classification task and a semi-supervised auxiliary task. The supervised classifier trains a multi-layer perceptron network for PPI predictions from labeled examples. The semi-supervised auxiliary task shares network layers of the supervised classifier and trains with partially labeled examples. Semi-supervision could be utilized in multiple ways. We tried three approaches in this article, (i) classification (to distinguish partial positives with negatives); (ii) ranking (to rate partial positive more likely than negatives); (iii) embedding (to make data clusters get similar labels). We applied this framework to improve the identification of interacting pairs between HIV-1 and human proteins. Our method improved upon the state-of-the-art method for this task indicating the benefits of semi-supervised multi-task learning using auxiliary information.",

author = "Yanjun Qi and Oznur Tastan and Carbonell, {Jaime G.} and Judith Klein-Seetharaman and Jason Weston",
year = "2010",
month = sep,
day = "4",
doi = "10.1093/bioinformatics/btq394",
language = "English (US)",
volume = "26",
pages = "i645--i652",
journal = "Bioinformatics",
issn = "1367-4803",
publisher = "Oxford University Press",
number = "18",
}

TY - JOUR
T1 - Semi-supervised multi-task learning for predicting interactions between HIV-1 and human proteins
AU - Qi, Yanjun
AU - Tastan, Oznur
AU - Carbonell, Jaime G.
AU - Klein-Seetharaman, Judith
AU - Weston, Jason
PY - 2010/9/4
Y1 - 2010/9/4
AB - Motivation: Protein-protein interactions (PPIs) are critical for virtually every biological function. Recently, researchers suggested to use supervised learning for the task of classifying pairs of proteins as interacting or not. However, its performance is largely restricted by the availability of truly interacting proteins (labeled). Meanwhile, there exists a considerable amount of protein pairs where an association appears between two partners, but not enough experimental evidence to support it as a direct interaction (partially labeled). Results: We propose a semi-supervised multi-task framework for predicting PPIs from not only labeled, but also partially labeled reference sets. The basic idea is to perform multi-task learning on a supervised classification task and a semi-supervised auxiliary task. The supervised classifier trains a multi-layer perceptron network for PPI predictions from labeled examples. The semi-supervised auxiliary task shares network layers of the supervised classifier and trains with partially labeled examples. Semi-supervision could be utilized in multiple ways. We tried three approaches in this article, (i) classification (to distinguish partial positives with negatives); (ii) ranking (to rate partial positive more likely than negatives); (iii) embedding (to make data clusters get similar labels). We applied this framework to improve the identification of interacting pairs between HIV-1 and human proteins. Our method improved upon the state-of-the-art method for this task indicating the benefits of semi-supervised multi-task learning using auxiliary information.

UR - http://www.scopus.com/inward/record.url?scp=77956497661&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77956497661&partnerID=8YFLogxK
U2 - 10.1093/bioinformatics/btq394
DO - 10.1093/bioinformatics/btq394
M3 - Article
C2 - 20823334
AN - SCOPUS:77956497661
SN - 1367-4803
VL - 26
SP - i645-i652
JO - Bioinformatics
JF - Bioinformatics
IS - 18
M1 - btq394
ER -
