Scaling SVM and least absolute deviations via exact data reduction

Jie Wang, Peter Wonka, Jieping Ye

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Scopus citations

Abstract

The support vector machine (SVM) is a widely used method for classification. Although many efforts have been devoted to developing efficient solvers, it remains challenging to apply SVM to large-scale problems. A nice property of SVM is that the non-support vectors have no effect on the resulting classifier. Motivated by this observation, we present fast and efficient screening rules to discard non-support vectors by analyzing the dual problem of SVM via variational inequalities (DVI). As a result, the number of data instances to be entered into the optimization can be substantially reduced. Some appealing features of our screening method are: (1) DVI is safe in the sense that the vectors discarded by DVI are guaranteed to be non-support vectors; (2) the data set needs to be scanned only once to run the screening, and its computational cost is negligible compared to that of solving the SVM problem; (3) DVI is independent of the solvers and can be integrated with any existing efficient solver. We also show that the DVI technique can be extended to detect non-support vectors in least absolute deviations regression (LAD). To the best of our knowledge, there are currently no screening methods for LAD. We have evaluated DVI on both synthetic and real data sets. Experiments indicate that DVI significantly outperforms the existing state-of-the-art screening rules for SVM, and it is very effective in discarding non-support vectors for LAD. The speedup gained by the DVI rules can be up to two orders of magnitude.
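The abstract hinges on the observation that non-support vectors can be removed without changing the trained classifier. The sketch below illustrates only that property; it does not implement the paper's DVI screening rule, and the choice of scikit-learn's SVC and a synthetic data set is an assumption made for illustration.

```python
# A minimal sketch, not the paper's DVI rule: it only demonstrates that
# non-support vectors have no effect on the resulting SVM classifier.
# scikit-learn and the synthetic data set below are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Train a linear SVM on the full data set.
full = SVC(kernel="linear", C=1.0).fit(X, y)

# Keep only the support vectors and retrain. A safe screening rule such as DVI
# aims to discard non-support vectors *before* solving, so the reduced problem
# is smaller but has the same solution.
idx = full.support_
reduced = SVC(kernel="linear", C=1.0).fit(X[idx], y[idx])

# The two classifiers coincide up to the solver's numerical tolerance.
print(np.max(np.abs(full.coef_ - reduced.coef_)))
print(np.abs(full.intercept_ - reduced.intercept_))
```

The contribution of the paper is the screening step itself, which bounds the dual variables via variational inequalities so that such a reduction can be performed safely before the optimization is run; the sketch only shows why exact data reduction is possible in principle.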

Original language: English (US)
Title of host publication: 31st International Conference on Machine Learning, ICML 2014
Publisher: International Machine Learning Society (IMLS)
Pages: 1912-1927
Number of pages: 16
ISBN (Electronic): 9781634393973
State: Published - 2014
Externally published: Yes
Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
Duration: Jun 21, 2014 - Jun 26, 2014

Publication series

Name: 31st International Conference on Machine Learning, ICML 2014
Volume: 3

Other

Other: 31st International Conference on Machine Learning, ICML 2014
Country/Territory: China
City: Beijing
Period: 6/21/14 - 6/26/14

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software
