TY - JOUR
T1 - Predicting potential distributions of geographic events using one-class data
T2 - Concepts and methods
AU - Guo, Q.
AU - Li, W.
AU - Liu, Y.
AU - Tong, D.
N1 - Funding Information:
We thank Profs. Michael Goodchild and Ling Bian for their discussion on the concept of GOCD. We are also grateful to Prof. Charles Elkan for his insights on the positive and unlabelled algorithms. This project is supported by the National Science Foundation (BDI-0742986).
Copyright:
Copyright 2012 Elsevier B.V., All rights reserved.
PY - 2011/10
Y1 - 2011/10
N2 - One common problem with geographic data is that, for a specific geographic event, only occurrence information is available; information about the absence of the event is not available. We refer to these specific types of geospatial data as geographic one-class data (GOCD). Predicting the potential spatial distributions that a particular geographic event may occur from GOCD is difficult because traditional binary classification methods that require availability of both positive and negative training samples cannot be used. The objective of this research is to define GOCD and propose novel approaches for modelling potential spatial distributions of geographic events using GOCD. We investigate the effectiveness of one-class support vector machine (OCSVM), maximum entropy (MAXENT) and the newly proposed positive and unlabelled learning (PUL) algorithm for solving GOCD problems using a case study: species distribution modelling from synthetic data. Our experimental results indicate that generally OCSVM, MAXENT and PUL are effective in modelling the GOCD. Each method has advantages and disadvantages, but PUL seems to be the most promising method.
AB - One common problem with geographic data is that, for a specific geographic event, only occurrence information is available; information about the absence of the event is not available. We refer to these specific types of geospatial data as geographic one-class data (GOCD). Predicting the potential spatial distributions that a particular geographic event may occur from GOCD is difficult because traditional binary classification methods that require availability of both positive and negative training samples cannot be used. The objective of this research is to define GOCD and propose novel approaches for modelling potential spatial distributions of geographic events using GOCD. We investigate the effectiveness of one-class support vector machine (OCSVM), maximum entropy (MAXENT) and the newly proposed positive and unlabelled learning (PUL) algorithm for solving GOCD problems using a case study: species distribution modelling from synthetic data. Our experimental results indicate that generally OCSVM, MAXENT and PUL are effective in modelling the GOCD. Each method has advantages and disadvantages, but PUL seems to be the most promising method.
KW - Ecological niche modelling
KW - Geographic one-class data
KW - Maximum entropy
KW - One-class support vector machine
KW - Positive and unlabelled learning
UR - http://www.scopus.com/inward/record.url?scp=84862908217&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84862908217&partnerID=8YFLogxK
U2 - 10.1080/13658816.2010.546360
DO - 10.1080/13658816.2010.546360
M3 - Article
AN - SCOPUS:84862908217
SN - 1365-8816
VL - 25
SP - 1697
EP - 1715
JO - International Journal of Geographical Information Science
JF - International Journal of Geographical Information Science
IS - 10
ER -