9 Citations (Scopus)

Abstract

Building highly nonlinear and nonparametric models is central to several state-of-the-art machine learning systems. Kernel methods form an important class of techniques that induce a reproducing kernel Hilbert space (RKHS) for inferring nonlinear models through the construction of similarity functions from data. These methods are particularly preferred when training data are limited and when prior knowledge of the data similarities is available. Despite their usefulness, they are limited by their computational complexity and their inability to support end-to-end learning with a task-specific objective. Deep neural networks, on the other hand, have become the de facto solution for end-to-end inference in several learning paradigms. In this paper, we explore the idea of using deep architectures to perform kernel machine optimization, for both computational efficiency and end-to-end inference. To this end, we develop the deep kernel machine optimization (DKMO) framework, which creates an ensemble of dense embeddings using Nyström kernel approximations and utilizes deep learning to generate task-specific representations through the fusion of these embeddings. Intuitively, the filters of the network are trained to fuse information from an ensemble of linear subspaces in the RKHS. Furthermore, we introduce kernel dropout regularization to improve training convergence. Finally, we extend this framework to the multiple kernel case by coupling a global fusion layer with pretrained deep kernel machines for each of the constituent kernels. Using case studies with limited training data and no explicit feature sources, we demonstrate the effectiveness of our framework over conventional model inference techniques.
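
To make the pipeline described above concrete, the sketch below illustrates the two ingredients the abstract names: dense Nyström embeddings built from landmark points, and a kernel-dropout-style selection over the resulting ensemble before fusion. It is a minimal NumPy illustration written from the abstract alone, not the authors' implementation; the RBF kernel, the random landmark choice, the ensemble size, and the dropout rate are all assumed here for demonstration.

import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise RBF similarities: k(x, y) = exp(-gamma * ||x - y||^2).
    sq_dists = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

def nystrom_embedding(X, landmarks, gamma=1.0, eps=1e-8):
    # Nystrom approximation: K ~= K_nm K_mm^{-1} K_nm^T = E E^T, where
    # E = K_nm U diag(lam)^{-1/2} is a dense m-dimensional embedding.
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    K_nm = rbf_kernel(X, landmarks, gamma)
    lam, U = np.linalg.eigh(K_mm)
    lam = np.maximum(lam, eps)  # guard against tiny/negative eigenvalues
    return K_nm @ U @ np.diag(1.0 / np.sqrt(lam))

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))  # toy data: 500 samples, 20 features

# Ensemble of embeddings from different random landmark subsets; the deep
# network described in the abstract is trained to fuse these representations.
ensemble = []
for _ in range(5):
    idx = rng.choice(len(X), size=50, replace=False)
    ensemble.append(nystrom_embedding(X, X[idx]))

# Kernel-dropout-style regularization (one plausible reading of the abstract):
# randomly omit whole embeddings from the ensemble before the fusion step.
keep = rng.random(len(ensemble)) > 0.3
if not keep.any():
    keep[0] = True  # always retain at least one embedding
fused_input = np.hstack([E for E, k in zip(ensemble, keep) if k])

Each embedding acts as an explicit feature map, so inner products between its rows approximate the corresponding kernel values; a network can therefore consume these dense matrices instead of the full n-by-n kernel matrix, which is the source of the computational savings the abstract refers to.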

Original language: English (US)
Journal: IEEE Transactions on Neural Networks and Learning Systems
DOI: 10.1109/TNNLS.2018.2804895
State: Accepted/In press - Mar 6, 2018

Fingerprint

Hilbert spaces
Learning systems
Computational efficiency
Computational complexity
Deep learning
Deep neural networks

Keywords

  • Computational modeling
  • Computer architecture
  • Data models
  • Deep neural networks (DNNs)
  • Kernel
  • kernel methods
  • Machine learning
  • multiple kernel learning (MKL)
  • Nyström approximation
  • Optimization
  • Task analysis

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Computer Networks and Communications
  • Artificial Intelligence

Cite this

Optimizing Kernel Machines Using Deep Learning. / Song, Huan; Thiagarajan, Jayaraman J.; Sattigeri, Prasanna; Spanias, Andreas.

In: IEEE Transactions on Neural Networks and Learning Systems, 06.03.2018.

Research output: Contribution to journal › Article

@article{a7e3981611e64e4395a531de95469c7e,
title = "Optimizing Kernel Machines Using Deep Learning",
keywords = "Computational modeling, Computer architecture, Data models, Deep neural networks (DNNs), Kernel, kernel methods, Machine learning, multiple kernel learning (MKL), Nystr{\"o}m approximation, Optimization, Task analysis",
author = "Huan Song and Thiagarajan, {Jayaraman J.} and Prasanna Sattigeri and Andreas Spanias",
year = "2018",
month = "3",
day = "6",
doi = "10.1109/TNNLS.2018.2804895",
language = "English (US)",
journal = "IEEE Transactions on Neural Networks and Learning Systems",
issn = "2162-237X",
publisher = "IEEE Computational Intelligence Society",

}

TY - JOUR

T1 - Optimizing Kernel Machines Using Deep Learning

AU - Song, Huan

AU - Thiagarajan, Jayaraman J.

AU - Sattigeri, Prasanna

AU - Spanias, Andreas

PY - 2018/3/6

Y1 - 2018/3/6

KW - Computational modeling

KW - Computer architecture

KW - Data models

KW - Deep neural networks (DNNs)

KW - Kernel

KW - kernel methods

KW - Machine learning

KW - multiple kernel learning (MKL)

KW - Nyström approximation

KW - Optimization

KW - Task analysis

UR - http://www.scopus.com/inward/record.url?scp=85043391928&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85043391928&partnerID=8YFLogxK

U2 - 10.1109/TNNLS.2018.2804895

DO - 10.1109/TNNLS.2018.2804895

M3 - Article

JO - IEEE Transactions on Neural Networks and Learning Systems

JF - IEEE Transactions on Neural Networks and Learning Systems

SN - 2162-237X

ER -