Collaborative Research: Information-based Subdata selection...

  • Stufken, John, (PI)

Project: Research project

Description

Extraordinary amounts of data are being produced in many branches of science. While this provides numerous opportunities for researchers to tackle more complicated research questions, the exploding data volume poses formidable challenges for both methodological and applied researchers. One essential challenge from a statistical perspective is how to obtain useful information from massive data with limited computing power. The primary goal of the proposed project is to find optimal strategies for selecting subdata that retains maximum information. The proposed procedures identify the most informative data points deterministically, without relying on random subsampling, and they are more efficient for both estimation and computation. These optimal strategies will facilitate scientific discoveries and technological breakthroughs by allowing researchers to extract the maximum amount of useful information from massive data with limited computing resources, turning the fear of overwhelming challenges caused by big data into awesome opportunities.
Status: Finished
Effective start/end date: 7/15/18 → 5/9/19

Funding

  • National Science Foundation (NSF): $59,994.00

Fingerprint

Big data