Geospatial data mining on the web

Discovering locations of emergency service facilities

WenWen Li, Michael Goodchild, Richard L. Church, Bin Zhou

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

9 Citations (Scopus)

Abstract

Identifying location-based information on the WWW, such as street addresses of emergency service facilities, has become increasingly popular. However, current Web-mining tools such as Google's crawler are designed to index whole webpages on the Internet; they do not treat location information of finer granularity as an indexable object. As a result, searches for such information tend to have low recall. To retrieve location-based information from the ever-expanding Internet and its almost-unstructured Web data, there is a need for an effective Web-mining mechanism capable of extracting the desired spatial data from the right webpages within the right scope. In this paper, we report our efforts towards automated location-information retrieval by developing a knowledge-based Web-mining tool, CyberMiner, that adopts (1) a geospatial taxonomy to determine the starting URLs and domains for the spatial Web mining, (2) a rule-based forward and backward screening algorithm for efficient address extraction, and (3) inductive-learning-based semantic analysis to discover patterns of street addresses of interest. The retrieval of the locations of all fire stations within Los Angeles County, California, is used as a case study.
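
The abstract does not spell out the screening rules, so the following is only a minimal sketch of what a forward and backward screening pass could look like, not CyberMiner's actual algorithm. It assumes US-style addresses, anchors on a street-suffix token, scans backward for a house number, and scans forward for a ZIP code. The suffix list, the window sizes, and the name `extract_addresses` are all hypothetical choices made for illustration.

```python
import re

# Hypothetical rule set: these suffixes, patterns, and limits are illustrative
# assumptions, not the rules used by CyberMiner.
STREET_SUFFIXES = {"st", "street", "ave", "avenue", "blvd", "boulevard",
                   "rd", "road", "dr", "drive"}
ZIP_RE = re.compile(r"^\d{5}(-\d{4})?$")
HOUSE_NUMBER_RE = re.compile(r"^\d{1,6}$")
MAX_BACKWARD = 5  # tokens to scan left of the suffix for a house number
MAX_FORWARD = 6   # tokens to scan right of the suffix for a ZIP code


def extract_addresses(text):
    """Return candidate street addresses found by anchoring on a street suffix."""
    tokens = text.split()
    results = []
    for i, tok in enumerate(tokens):
        if tok.strip(".,").lower() not in STREET_SUFFIXES:
            continue
        # Backward screening: nearest house number to the left of the suffix.
        start = None
        for j in range(i - 1, max(i - 1 - MAX_BACKWARD, -1), -1):
            if HOUSE_NUMBER_RE.match(tokens[j]):
                start = j
                break
        if start is None:
            continue
        # Forward screening: nearest ZIP code to the right of the suffix.
        end = None
        for k in range(i + 1, min(i + 1 + MAX_FORWARD, len(tokens))):
            if ZIP_RE.match(tokens[k].strip(".,")):
                end = k
                break
        if end is None:
            continue
        results.append(" ".join(tokens[start:end + 1]).rstrip(".,"))
    return results


# Fictitious example text, used only to exercise the sketch.
sample = "Fire Station 1 is located at 123 Main St, Los Angeles, CA 90012. Call us."
print(extract_addresses(sample))
```

Anchoring on the suffix keeps the scan cheap, because only a small window of tokens around each anchor is inspected. A real rule set would need to handle far more address variants than this sketch does.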

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Pages: 552-563
Number of pages: 12
Volume: 7713 LNAI
DOIs: https://doi.org/10.1007/978-3-642-35527-1_46
State: Published - 2012
Event: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012 - Nanjing, China
Duration: Dec 15 2012 → Dec 18 2012

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 7713 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 8th International Conference on Advanced Data Mining and Applications, ADMA 2012
Country: China
City: Nanjing
Period: 12/15/12 → 12/18/12

Fingerprint

Emergency services
Web Mining
Emergency
Data mining
World Wide Web
Internet
Inductive Learning
Semantic Analysis
Taxonomies
Spatial Data
Knowledge-based
Taxonomy
Granularity
Information retrieval
Screening
Websites
Fires
Retrieval

Keywords

  • Emergency service facilities
  • Inductive learning
  • Information extraction
  • Information retrieval
  • Location-based services
  • Ontology
  • Web data mining

ASJC Scopus subject areas

  • Computer Science (all)
  • Theoretical Computer Science

Cite this

Li, W., Goodchild, M., Church, R. L., & Zhou, B. (2012). Geospatial data mining on the web: Discovering locations of emergency service facilities. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7713 LNAI, pp. 552-563). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 7713 LNAI). https://doi.org/10.1007/978-3-642-35527-1_46

@inproceedings{1a592bab3f484eb1ab2cd894ecdd6635,
title = "Geospatial data mining on the web: Discovering locations of emergency service facilities",
keywords = "Emergency service facilities, Inductive learning, Information extraction, Information retrieval, Location-based services, Ontology, Web data mining",
author = "WenWen Li and Michael Goodchild and Church, {Richard L.} and Bin Zhou",
year = "2012",
doi = "10.1007/978-3-642-35527-1_46",
language = "English (US)",
isbn = "9783642355264",
volume = "7713 LNAI",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "552--563",
booktitle = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

}

TY  - GEN
T1  - Geospatial data mining on the web
T2  - Discovering locations of emergency service facilities
AU  - Li, WenWen
AU  - Goodchild, Michael
AU  - Church, Richard L.
AU  - Zhou, Bin
PY  - 2012
Y1  - 2012
KW  - Emergency service facilities
KW  - Inductive learning
KW  - Information extraction
KW  - Information retrieval
KW  - Location-based services
KW  - Ontology
KW  - Web data mining
UR  - http://www.scopus.com/inward/record.url?scp=84872710346&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84872710346&partnerID=8YFLogxK
U2  - 10.1007/978-3-642-35527-1_46
DO  - 10.1007/978-3-642-35527-1_46
M3  - Conference contribution
SN  - 9783642355264
VL  - 7713 LNAI
T3  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP  - 552
EP  - 563
BT  - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
ER  - 