Scaling SVM and least absolute deviations via exact data reduction

Jie Wang, Peter Wonka, Jieping Ye

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

28 Scopus citations

Abstract

The support vector machine (SVM) is a widely used method for classification. Although many efforts have been devoted to developing efficient solvers, it remains challenging to apply SVM to large-scale problems. A nice property of SVM is that the non-support vectors have no effect on the resulting classifier. Motivated by this observation, we present fast and efficient screening rules to discard non-support vectors by analyzing the dual problem of SVM via variational inequalities (DVI). As a result, the number of data instances to be entered into the optimization can be substantially reduced. Some appealing features of our screening method are: (1) DVI is safe in the sense that the vectors discarded by DVI are guaranteed to be non-support vectors; (2) the data set needs to be scanned only once to run the screening, and its computational cost is negligible compared to that of solving the SVM problem; (3) DVI is independent of the solvers and can be integrated with any existing efficient solver. We also show that the DVI technique can be extended to detect non-support vectors in least absolute deviations regression (LAD). To the best of our knowledge, there are currently no screening methods for LAD. We have evaluated DVI on both synthetic and real data sets. Experiments indicate that DVI significantly outperforms the existing state-of-the-art screening rules for SVM, and it is very effective in discarding non-support vectors for LAD. The speedup gained by the DVI rules can be up to two orders of magnitude.
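The abstract hinges on the observation that non-support vectors can be removed without changing the trained classifier. The sketch below illustrates only that property; it does not implement the paper's DVI screening rule, and the choice of scikit-learn's SVC and a synthetic data set is an assumption made for illustration.

```python
# A minimal sketch, not the paper's DVI rule: it only demonstrates that
# non-support vectors have no effect on the resulting SVM classifier.
# scikit-learn and the synthetic data set below are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Train a linear SVM on the full data set.
full = SVC(kernel="linear", C=1.0).fit(X, y)

# Keep only the support vectors and retrain. A safe screening rule such as DVI
# aims to discard non-support vectors *before* solving, so the reduced problem
# is smaller but has the same solution.
idx = full.support_
reduced = SVC(kernel="linear", C=1.0).fit(X[idx], y[idx])

# The two classifiers coincide up to the solver's numerical tolerance.
print(np.max(np.abs(full.coef_ - reduced.coef_)))
print(np.abs(full.intercept_ - reduced.intercept_))
```

The contribution of the paper is the screening step itself, which bounds the dual variables via variational inequalities so that such a reduction can be performed safely before the optimization is run; the sketch only shows why exact data reduction is possible in principle.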

Original language: English (US)
Title of host publication: 31st International Conference on Machine Learning, ICML 2014
Publisher: International Machine Learning Society (IMLS)
Pages: 1912-1927
Number of pages: 16
ISBN (Electronic): 9781634393973
State: Published - 2014
Externally published: Yes
Event: 31st International Conference on Machine Learning, ICML 2014 - Beijing, China
Duration: Jun 21, 2014 - Jun 26, 2014

Publication series

Name: 31st International Conference on Machine Learning, ICML 2014
Volume: 3

Other

Other: 31st International Conference on Machine Learning, ICML 2014
Country/Territory: China
City: Beijing
Period: 6/21/14 - 6/26/14

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Software
