Improving classification with forced labeling of other related classes: Application to prediction of upstaged ductal carcinoma in situ using mammographic features

Rui Hou; Bibo Shi; Lars J. Grimm; Maciej A. Mazurowski; Jeffrey R. Marks; Lorraine M. King; Carlo Maley; E. Shelley Hwang; Joseph Y. Lo

doi:10.1117/12.2293809

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Predicting whether ductal carcinoma in situ (DCIS) identified at core biopsy contains occult invasive disease is an important task, since these "upstaged" cases affect further treatment planning. A prediction model that better separates pure DCIS from upstaged DCIS can therefore help avoid overtreatment and overdiagnosis. In this work, we propose to improve this classification performance with the aid of two other related classes: Atypical Ductal Hyperplasia (ADH) and Invasive Ductal Carcinoma (IDC). Our data set contains mammograms for 230 cases: 66 are ADH cases; 99 are biopsy-proven DCIS cases, of which 25 were found to contain invasive disease at the time of definitive surgery; and the remaining 65 were diagnosed with IDC at core biopsy. Our hypothesis is that knowledge can be transferred from training on the easier and more readily available contrast between benign but suspicious ADH and IDC that is already apparent at initial biopsy. Thus, adding both ADH and IDC cases to the classifier's training data should improve its ability to distinguish upstaged DCIS from pure DCIS. We extracted 113 mammographic features based on a radiologist's annotation of clusters. Our method then added both ADH and IDC cases during training, where ADH cases were "force labeled," that is, treated by the classifier as pure DCIS (negative) cases, and IDC cases were labeled as upstaged DCIS (positive) cases. A logistic regression classifier was built on this designed training data set to predict whether biopsy-proven DCIS cases contain invasive cancer. Performance was assessed by repeated 5-fold cross-validation and Receiver Operating Characteristic (ROC) curve analysis. Training on the DCIS data set alone gave an average area under the ROC curve (AUC) of 0.607 (95% CI, 0.479-0.721); adding both ADH and IDC cases for training improved the AUC to 0.691 (95% CI, 0.581-0.801).
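The forced-labeling setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature vectors are synthetic Gaussian stand-ins, only 5 features are used instead of 113 to keep it fast, and the logistic regression is a plain gradient-descent version written for this example. The helper names (`make_cases`, `fit_logistic`, `repeated_cv_auc`) are hypothetical. The sketch also assumes that cross-validation folds are drawn from DCIS cases only, with ADH and IDC cases added to the training folds and never to the held-out fold; the abstract does not spell this out.

```python
import math
import random

def make_cases(n, shift, n_features, rng):
    """Synthetic stand-in feature vectors (NOT the paper's mammographic data)."""
    return [[rng.gauss(shift, 1.0) for _ in range(n_features)] for _ in range(n)]

def predict(w, b, x):
    """Logistic model probability; z is clamped to avoid overflow in exp."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def fit_logistic(X, y, lr=0.1, epochs=100, l2=0.01):
    """Batch gradient descent for L2-regularized logistic regression."""
    d, n = len(X[0]), len(X)
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * (gwj / n + l2 * wj) for wj, gwj in zip(w, gw)]
        b -= lr * gb / n
    return w, b

def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def stratified_folds(labels, k, rng):
    """Split indices into k folds, keeping both classes in every fold."""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        idx = [i for i, l in enumerate(labels) if l == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def repeated_cv_auc(dcis_X, dcis_y, extra_X, extra_y, k=5, repeats=2, seed=0):
    """Repeated k-fold CV over DCIS cases; extra cases join training only."""
    rng = random.Random(seed)
    aucs = []
    for _ in range(repeats):
        for fold in stratified_folds(dcis_y, k, rng):
            held_out = set(fold)
            train_X = [x for i, x in enumerate(dcis_X) if i not in held_out] + extra_X
            train_y = [l for i, l in enumerate(dcis_y) if i not in held_out] + extra_y
            w, b = fit_logistic(train_X, train_y)
            scores = [predict(w, b, dcis_X[i]) for i in fold]
            aucs.append(auc(scores, [dcis_y[i] for i in fold]))
    return sum(aucs) / len(aucs)

rng = random.Random(42)
N_FEATURES = 5  # assumption for speed; the abstract reports 113 features

# Case counts follow the abstract: 99 DCIS (74 pure, 25 upstaged), 66 ADH, 65 IDC.
dcis_X = make_cases(74, 0.0, N_FEATURES, rng) + make_cases(25, 0.4, N_FEATURES, rng)
dcis_y = [0] * 74 + [1] * 25

# Forced labeling: ADH is treated as pure DCIS (0), IDC as upstaged DCIS (1).
forced_X = make_cases(66, -0.4, N_FEATURES, rng) + make_cases(65, 0.8, N_FEATURES, rng)
forced_y = [0] * 66 + [1] * 65

baseline_auc = repeated_cv_auc(dcis_X, dcis_y, [], [])
forced_auc = repeated_cv_auc(dcis_X, dcis_y, forced_X, forced_y)
print(f"DCIS-only training AUC: {baseline_auc:.3f}")
print(f"With forced-labeled ADH and IDC: {forced_auc:.3f}")
```

Because the data are synthetic, the printed AUC values say nothing about the reported results; the sketch only shows where the force-labeled cases enter the training folds while evaluation stays restricted to held-out DCIS cases.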

Original language	English (US)
Title of host publication	Medical Imaging 2018
Subtitle of host publication	Computer-Aided Diagnosis
Editors	Kensaku Mori, Nicholas Petrick
Publisher	SPIE
ISBN (Electronic)	9781510616394
DOIs	https://doi.org/10.1117/12.2293809
State	Published - 2018
Event	Medical Imaging 2018: Computer-Aided Diagnosis - Houston, United States
Duration	Feb 12 2018 → Feb 15 2018

Publication series

Name	Progress in Biomedical Optics and Imaging - Proceedings of SPIE
Volume	10575
ISSN (Print)	1605-7422

Other

Other	Medical Imaging 2018: Computer-Aided Diagnosis
Country/Territory	United States
City	Houston
Period	2/12/18 → 2/15/18

Keywords

ADH
Breast cancer
IDC
digital mammogram
ductal carcinoma in situ

ASJC Scopus subject areas

Electronic, Optical and Magnetic Materials
Atomic and Molecular Physics, and Optics
Biomaterials
Radiology, Nuclear Medicine and Imaging

Cite this

Hou, R., Shi, B., Grimm, L. J., Mazurowski, M. A., Marks, J. R., King, L. M., Maley, C., Hwang, E. S., & Lo, J. Y. (2018). Improving classification with forced labeling of other related classes: Application to prediction of upstaged ductal carcinoma in situ using mammographic features. In K. Mori & N. Petrick (Eds.), Medical Imaging 2018: Computer-Aided Diagnosis, Article 105750R (Progress in Biomedical Optics and Imaging - Proceedings of SPIE; Vol. 10575). SPIE. https://doi.org/10.1117/12.2293809
