Extraordinary amounts of data are being produced in many branches of science. While this provides numerous opportunities for researchers to tackle more complicated research questions, the exploding data volume poses formidable challenges for both methodological and applied researchers. One essential challenge from a statistical perspective is how to obtain useful information from massive data with limited computing power. The primary goal of the proposed project is to find optimal strategies for selecting subdata that retain maximum information. The proposed procedures identify the most informative data points deterministically, without relying on random subsampling, and they are more efficient for both estimation and computation. These optimal strategies will facilitate scientific discoveries and technological breakthroughs by allowing researchers to extract the maximum amount of useful information from massive data with limited computing resources, turning the overwhelming challenges posed by big data into opportunities.
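The abstract describes selecting informative subdata deterministically rather than by random subsampling. One well-known instance of this general idea (a sketch for illustration, not necessarily the project's exact method) is information-based optimal subdata selection for linear regression, which keeps the rows with the most extreme values of each covariate. The function name `iboss_select` and the tail-size rule below are assumptions for this sketch.

```python
import numpy as np

def iboss_select(X, k):
    """Illustrative deterministic subdata selection (IBOSS-style sketch).

    For each covariate in turn, keep the rows with the smallest and
    largest values among the rows not yet chosen, until about k rows
    are selected. Assumes k is divisible by 2*p for simplicity.
    """
    n, p = X.shape
    r = k // (2 * p)          # rows taken per tail, per covariate
    remaining = np.arange(n)  # indices of rows still available
    chosen = []
    for j in range(p):
        col = X[remaining, j]
        order = np.argsort(col)
        # positions (within `remaining`) of the r smallest and r largest values
        take = np.concatenate([order[:r], order[-r:]])
        chosen.extend(remaining[take])
        remaining = np.delete(remaining, take)
    return np.array(chosen)

rng = np.random.default_rng(0)
X = rng.normal(size=(10000, 2))   # synthetic covariate matrix
idx = iboss_select(X, 100)        # deterministic: no random subsampling
```

Because the selected rows sit in the tails of each covariate, they carry more design information for estimating regression slopes than a uniform random subsample of the same size, while the selection itself costs only sorting time.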
Effective start/end date: 7/15/18 → 5/9/19
- National Science Foundation (NSF): $59,994.00