Optimal Distributed Subsampling for Quasi-likelihood Estimator with Massive Data (Distributed Sampling Techniques in Big Data Analysis)

Published: 2021-05-31 Author: 77779193永利官网
Speaker: Ai Mingyao (艾明要) DateTime: Friday, June 4, 2021, 14:00-14:50
Brief Introduction to Speaker:

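Ai Mingyao (艾明要) is a professor and doctoral supervisor at the School of Mathematical Sciences, Peking University, where he currently serves as Director of the Statistics Teaching and Research Section. Prof. Ai is also Deputy Secretary-General of the Chinese Association for Applied Statistics (中国现场统计研究会), President of its Experimental Design Branch, Vice President of its High-Dimensional Data Statistics Branch, and Secretary-General of its Spatial Statistics Branch. Prof. Ai is an Associate Editor of the international statistics journals Statistica Sinica, Journal of Statistical Planning and Inference, Statistics and Probability Letters, and STAT (an electronic journal of ISI), and a member of the editorial board of the Chinese mathematics journal 《系统科学与数学》 (Journal of Systems Science and Mathematical Sciences). Prof. Ai's teaching and research cover the design and analysis of experiments, computer experiments, high-dimensional data analysis, and applied statistics. Prof. Ai has published more than sixty papers in leading international statistics journals, including Ann. Statist., JASA, Biometrika, Technometrics, and Statist. Sinica, and has led or participated in six General Program projects and one Key Program project of the National Natural Science Foundation of China, as well as one 973 Program project of the Ministry of Science and Technology, "Research on the Essence of the Medicinal Properties of Chinese Medicines Based on the Three Constituent Elements of Medicinal Properties" (基于药性构成三要素的中药药性实质研究), among others.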

 

Place: Lecture Hall, 2nd Floor, Building 6
Abstract: Nonuniform subsampling methods are effective in reducing the computational burden while maintaining estimation efficiency for massive data. Existing methods mostly focus on subsampling with replacement because of its high computational efficiency. If the data volume is so large that the nonuniform subsampling probabilities cannot be calculated all at once, then subsampling with replacement is infeasible. This paper solves this problem using Poisson subsampling. We first derive optimal Poisson subsampling probabilities in the context of quasi-likelihood estimation under the A- and L-optimality criteria. For a practically implementable algorithm with approximated optimal subsampling probabilities, we establish the consistency and asymptotic normality of the resultant estimators.
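
The general idea of Poisson subsampling mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the speaker's algorithm: the importance score `abs(y - pilot_mean) + eps`, the pilot-based normalizer `avg_score`, and the function name `poisson_subsample_mean` are all hypothetical stand-ins, and the A- and L-optimal probabilities from the talk are not reproduced here. The sketch only shows that each observation can be kept or dropped independently in a single pass, so the inclusion probabilities never need to be held in memory all at once, and that the kept points are reweighted by the inverse of their inclusion probabilities. The toy estimator is a weighted mean, which solves the weighted estimating equation sum((y_i - mu) / p_i) = 0 over the kept points.

```python
import random


def poisson_subsample_mean(stream, n_total, expected_size,
                           pilot_mean, avg_score, rng, eps=1.0):
    """Single-pass Poisson subsampling with inverse-probability weights.

    The score below is a hypothetical importance measure chosen for
    illustration only; it is NOT the A- or L-optimal probability.
    """
    num = 0.0   # running sum of y_i / p_i over kept points
    den = 0.0   # running sum of 1 / p_i over kept points
    kept = 0
    for y in stream:
        score = abs(y - pilot_mean) + eps
        # Inclusion probability, capped at 1. The normalizer uses a pilot
        # estimate (avg_score), so no full pass over the data is required.
        p = min(1.0, expected_size * score / (n_total * avg_score))
        # Each point is kept independently of all the others.
        if rng.random() < p:
            num += y / p
            den += 1.0 / p
            kept += 1
    if kept == 0:
        raise ValueError("no observations were selected")
    return num / den, kept


# Usage on synthetic data (all values here are illustrative assumptions).
rng = random.Random(0)
data = [rng.gauss(2.0, 1.0) for _ in range(20000)]

# A small pilot sample supplies the rough mean and the average score.
pilot = data[:500]
pilot_mean = sum(pilot) / len(pilot)
avg_score = sum(abs(y - pilot_mean) + 1.0 for y in pilot) / len(pilot)

estimate, kept = poisson_subsample_mean(
    iter(data), len(data), 2000, pilot_mean, avg_score, rng)
print(f"kept {kept} of {len(data)} points, weighted mean = {estimate:.3f}")
```

Unlike subsampling with replacement, the number of kept points is random here; `expected_size` only controls its approximate expectation, and only to the extent that the pilot estimate `avg_score` is accurate.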