Automated classification of protein crystallization images using support vector machines with scale-invariant texture and Gabor features

Shen Pan, Gidon Shavit, Marta Penas-Centeno, Dong Hui Xu, Linda Shapiro, Richard Ladner, Eve Riskin, Wim Hol, Deirdre Meldrum

Research output: Contribution to journal > Article

25 Citations (Scopus)

Abstract

Protein crystallography laboratories are performing an increasing number of experiments to obtain crystals of good diffraction quality. Better automation has enabled researchers to prepare and run more experiments in a shorter time. However, the problem of identifying which experiments are successful remains difficult. In fact, most of this work is still performed manually by humans. Automating this task is therefore an important goal. As part of a project to develop a new and automated high-throughput capillary-based protein crystallography instrument, a new image-classification subsystem has been developed to greatly reduce the number of images that require human viewing. This system must have low rates of false negatives (missed crystals), possibly at the cost of raising the number of false positives. The image-classification system employs a support vector machine (SVM) learning algorithm to classify the blocks making up each image. A new algorithm to find the area within the image that contains the drop is employed. The SVM uses numerical features, based on texture and the Gabor wavelet decomposition, that are calculated for each block. If a block within an image is classified as containing a crystal, then the entire image is classified as containing a crystal. In a study of 375 images, 87 of which contained crystals, a false-negative rate of less than 4% with a false-positive rate of about 40% was consistently achieved.
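The image-level decision rule and the two error rates quoted in the abstract can be illustrated with a minimal Python sketch. It assumes the per-block SVM outputs are already available as booleans. The function names `classify_image` and `error_rates` are hypothetical, and defining each rate per class (missed crystal images over all crystal images, flagged non-crystal images over all non-crystal images) is an assumption, since the abstract does not spell out the denominators.

```python
from typing import Sequence, Tuple


def classify_image(block_labels: Sequence[bool]) -> bool:
    """Rule from the abstract: an image is a crystal image if any block is."""
    return any(block_labels)


def error_rates(predicted: Sequence[bool], actual: Sequence[bool]) -> Tuple[float, float]:
    """Return (false_negative_rate, false_positive_rate).

    Assumed definitions (not stated in the abstract):
      false-negative rate = missed crystal images / all crystal images
      false-positive rate = flagged non-crystal images / all non-crystal images
    """
    fn = sum(1 for p, a in zip(predicted, actual) if a and not p)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    positives = sum(1 for a in actual if a)
    negatives = len(actual) - positives
    return fn / positives, fp / negatives


# Toy data: hypothetical per-block SVM outputs for four images.
blocks_per_image = [
    [False, False, True],   # one block flagged, so the image is flagged
    [False, False, False],  # no block flagged
    [False, True, False],   # one block flagged
    [False, False, False],  # no block flagged
]
actual = [True, True, False, False]  # ground truth: does the image hold a crystal?

predicted = [classify_image(blocks) for blocks in blocks_per_image]
fnr, fpr = error_rates(predicted, actual)
print(predicted, fnr, fpr)
```

Because a single positive block flags the whole image, the rule favors catching crystals over avoiding false alarms, which matches the stated design goal. Under the per-class definitions assumed here, the reported study (87 crystal images, hence 288 without) would correspond to at most 3 missed crystal images for a false-negative rate under 4%, and to roughly 115 flagged non-crystal images for a false-positive rate of about 40%.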

Original language: English (US)
Pages (from-to): 271-279
Number of pages: 9
Journal: Acta Crystallographica Section D: Biological Crystallography
Volume: 62
Issue number: 3
DOI: 10.1107/S0907444905041648
State: Published - Mar 2006
Externally published: Yes

Fingerprint

Crystallography
Crystallization
Support vector machines
Textures
Proteins
Crystals
Automation
Image classification
Research Personnel
Wavelet decomposition
Experiments
Machine learning
Learning algorithms
Learning systems

ASJC Scopus subject areas

  • Clinical Biochemistry
  • Biochemistry, Genetics and Molecular Biology(all)
  • Biochemistry
  • Biophysics
  • Condensed Matter Physics
  • Structural Biology

Cite this

Automated classification of protein crystallization images using support vector machines with scale-invariant texture and Gabor features. / Pan, Shen; Shavit, Gidon; Penas-Centeno, Marta; Xu, Dong Hui; Shapiro, Linda; Ladner, Richard; Riskin, Eve; Hol, Wim; Meldrum, Deirdre.

In: Acta Crystallographica Section D: Biological Crystallography, Vol. 62, No. 3, 03.2006, p. 271-279.

@article{d117a5ae2d6740b496216d8ab51b8081,
title = "Automated classification of protein crystallization images using support vector machines with scale-invariant texture and Gabor features",
abstract = "Protein crystallography laboratories are performing an increasing number of experiments to obtain crystals of good diffraction quality. Better automation has enabled researchers to prepare and run more experiments in a shorter time. However, the problem of identifying which experiments are successful remains difficult. In fact, most of this work is still performed manually by humans. Automating this task is therefore an important goal. As part of a project to develop a new and automated high-throughput capillary-based protein crystallography instrument, a new image-classification subsystem has been developed to greatly reduce the number of images that require human viewing. This system must have low rates of false negatives (missed crystals), possibly at the cost of raising the number of false positives. The image-classification system employs a support vector machine (SVM) learning algorithm to classify the blocks making up each image. A new algorithm to find the area within the image that contains the drop is employed. The SVM uses numerical features, based on texture and the Gabor wavelet decomposition, that are calculated for each block. If a block within an image is classified as containing a crystal, then the entire image is classified as containing a crystal. In a study of 375 images, 87 of which contained crystals, a false-negative rate of less than 4{\%} with a false-positive rate of about 40{\%} was consistently achieved.",
author = "Shen Pan and Gidon Shavit and Marta Penas-Centeno and Xu, {Dong Hui} and Linda Shapiro and Richard Ladner and Eve Riskin and Wim Hol and Deirdre Meldrum",
year = "2006",
month = "3",
doi = "10.1107/S0907444905041648",
language = "English (US)",
volume = "62",
pages = "271--279",
journal = "Acta Crystallographica Section D: Structural Biology",
issn = "0907-4449",
publisher = "John Wiley and Sons Inc.",
number = "3",
}

TY - JOUR
T1 - Automated classification of protein crystallization images using support vector machines with scale-invariant texture and Gabor features
AU - Pan, Shen
AU - Shavit, Gidon
AU - Penas-Centeno, Marta
AU - Xu, Dong Hui
AU - Shapiro, Linda
AU - Ladner, Richard
AU - Riskin, Eve
AU - Hol, Wim
AU - Meldrum, Deirdre
PY - 2006/3
Y1 - 2006/3
AB - Protein crystallography laboratories are performing an increasing number of experiments to obtain crystals of good diffraction quality. Better automation has enabled researchers to prepare and run more experiments in a shorter time. However, the problem of identifying which experiments are successful remains difficult. In fact, most of this work is still performed manually by humans. Automating this task is therefore an important goal. As part of a project to develop a new and automated high-throughput capillary-based protein crystallography instrument, a new image-classification subsystem has been developed to greatly reduce the number of images that require human viewing. This system must have low rates of false negatives (missed crystals), possibly at the cost of raising the number of false positives. The image-classification system employs a support vector machine (SVM) learning algorithm to classify the blocks making up each image. A new algorithm to find the area within the image that contains the drop is employed. The SVM uses numerical features, based on texture and the Gabor wavelet decomposition, that are calculated for each block. If a block within an image is classified as containing a crystal, then the entire image is classified as containing a crystal. In a study of 375 images, 87 of which contained crystals, a false-negative rate of less than 4% with a false-positive rate of about 40% was consistently achieved.

UR - http://www.scopus.com/inward/record.url?scp=33646562492&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33646562492&partnerID=8YFLogxK
U2 - 10.1107/S0907444905041648
DO - 10.1107/S0907444905041648
M3 - Article
VL - 62
SP - 271
EP - 279
JO - Acta Crystallographica Section D: Structural Biology
JF - Acta Crystallographica Section D: Structural Biology
SN - 0907-4449
IS - 3
ER -