Model-free Subsampling Method Based on Uniform Designs

Speaker: 周永道 (Zhou Yongdao), Nankai University

Time: August 3, 2022, 14:30

Venue: Tencent Meeting (腾讯会议), meeting ID: 579-8970-5000, meeting password: 202278

Abstract: Subsampling, or subdata selection, is a useful approach in large-scale statistical learning. Most existing studies focus on model-based subsampling methods, which depend heavily on the model assumption. In this paper, we consider a model-free subsampling strategy for generating subdata from the original full data. To measure how well subdata represent the original data, we propose a criterion, the generalized empirical F-discrepancy (GEFD), and study its theoretical properties in connection with the classical generalized L2-discrepancy in the theory of uniform designs. These properties allow us to develop a low-GEFD, data-driven subsampling method based on existing uniform designs. Through simulation examples and a real case study, we show that the proposed subsampling method enjoys the model-free property and is superior to random sampling. In practice, this model-free property is more appealing than model-based subsampling, which may perform poorly when the model is misspecified, as demonstrated in our simulation studies.
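To make the general idea of design-based, model-free subsampling concrete, here is a minimal sketch. It is not the speaker's algorithm and it does not compute the GEFD criterion. It assumes a Halton sequence as a stand-in for a uniform design table, a rank-based empirical CDF transform of each column, and a greedy nearest-neighbor match. The function names `halton`, `empirical_cdf_transform`, and `design_based_subsample` are hypothetical and chosen only for illustration.

```python
import random


def halton(n, dim):
    """First n points of the Halton sequence in [0, 1)^dim (dim <= 6).

    Assumption: used here as a stand-in for a uniform design table.
    """
    primes = [2, 3, 5, 7, 11, 13][:dim]

    def radical_inverse(i, base):
        x, f = 0.0, 1.0 / base
        while i > 0:
            x += (i % base) * f
            i //= base
            f /= base
        return x

    return [[radical_inverse(i, b) for b in primes] for i in range(1, n + 1)]


def empirical_cdf_transform(data):
    """Map each column to (r - 0.5) / N, where r is the 1-based rank in that column."""
    n, dim = len(data), len(data[0])
    out = [[0.0] * dim for _ in range(n)]
    for j in range(dim):
        order = sorted(range(n), key=lambda i: data[i][j])
        for rank, i in enumerate(order):
            out[i][j] = (rank + 0.5) / n
    return out


def design_based_subsample(data, k):
    """Return k distinct row indices of data.

    For each design point, pick the nearest not-yet-used row in the
    rank-transformed space (brute force, so only suitable for small demos).
    """
    u = empirical_cdf_transform(data)
    design = halton(k, len(data[0]))
    chosen, used = [], set()
    for p in design:
        best = min(
            (i for i in range(len(u)) if i not in used),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(u[i], p)),
        )
        used.add(best)
        chosen.append(best)
    return chosen


# Toy full data: two columns with different, non-uniform distributions.
random.seed(0)
full = [[random.gauss(0, 1), random.expovariate(1.0)] for _ in range(2000)]
idx = design_based_subsample(full, 20)
subdata = [full[i] for i in idx]
```

No response variable or regression model appears anywhere in the selection step, which is the sense in which such a procedure is model-free. A brute-force search like this costs on the order of N times k distance evaluations, so a real implementation would need a faster nearest-neighbor structure.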

14:30 - 16:30, 2022-08-03, Tencent Meeting (腾讯会议)