J4 ›› 2013, Vol. 12 ›› Issue (10): 1-5.

• Mathematics and Computer Science •

Imputation Method Study with Missing Data in Random Experiment Design

  1. College of Mathematics and Computer, Dali University, Dali, Yunnan 671003, China
  • Received: 2013-03-12 Revised: 2013-06-24 Online: 2013-10-15 Published: 2013-10-15
  • About the author: Li Jie, teaching assistant; research interests include missing data, variable selection, dimensionality reduction, and big data analysis.
  • Funding:

    2012 Dali University Young Teachers' Research Fund project (KYQN201219)

Abstract:

Missing data are common in randomized block designs. Four methods are currently used to handle them: deleting the missing observations, mean imputation, formula imputation, and Yates's imputation. Which of the four performs best is a question worth studying, and this article compares them by simulation. First, a 4×5 randomized block design is generated at random, and the number of missing values m is set to 1, …, 6. Second, for each m, every possible combination of missing-value locations is enumerated; under each combination, the four methods are applied separately, and the standard error, R², and adjusted R² of the resulting linear regression are recorded. The simulation shows that Yates's imputation performs best of the four methods, and a real-data example supports the same conclusion.
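The simulation procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the author's code: the data-generating model (`simulate_design`, with its effect sizes and noise level), the function names, and the choice to compute the fit statistics on the completed table without adjusting degrees of freedom are all assumptions. Only mean imputation and the classical Yates missing-plot formula, x = (tT + bB − G) / ((t − 1)(b − 1)), applied iteratively when several cells are missing, are shown. The deletion method and "formula imputation" are omitted because the abstract does not define them precisely.

```python
import itertools
import math
import random

T, B = 4, 5  # treatments x blocks, the 4x5 layout named in the abstract


def simulate_design(seed=0):
    """y[i][j] = mean + treatment effect + block effect + noise (all sizes assumed)."""
    rng = random.Random(seed)
    treat = [rng.gauss(0, 2) for _ in range(T)]
    block = [rng.gauss(0, 2) for _ in range(B)]
    return [[10 + treat[i] + block[j] + rng.gauss(0, 1) for j in range(B)]
            for i in range(T)]


def mean_impute(y, missing):
    """Replace every missing cell with the mean of the observed cells."""
    obs = [y[i][j] for i in range(T) for j in range(B) if (i, j) not in missing]
    mean = sum(obs) / len(obs)
    return [[mean if (i, j) in missing else y[i][j] for j in range(B)]
            for i in range(T)]


def yates_impute(y, missing, sweeps=50):
    """Apply Yates's one-cell formula to each missing cell in turn, repeatedly."""
    filled = mean_impute(y, missing)  # starting values only
    for _ in range(sweeps):
        for i, j in missing:
            cell = filled[i][j]
            row = sum(filled[i]) - cell                       # treatment total without the cell
            col = sum(filled[r][j] for r in range(T)) - cell  # block total without the cell
            grand = sum(map(sum, filled)) - cell              # grand total without the cell
            filled[i][j] = (T * row + B * col - grand) / ((T - 1) * (B - 1))
    return filled


def fit_metrics(table):
    """Residual standard error, R^2 and adjusted R^2 of the additive two-way fit."""
    n, p = T * B, T + B - 1
    grand = sum(map(sum, table)) / n
    row_mean = [sum(r) / B for r in table]
    col_mean = [sum(table[i][j] for i in range(T)) / T for j in range(B)]
    rss = sum((table[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
              for i in range(T) for j in range(B))
    tss = sum((v - grand) ** 2 for r in table for v in r)
    r2 = 1 - rss / tss
    return math.sqrt(rss / (n - p)), r2, 1 - (1 - r2) * (n - 1) / (n - p)


def compare(y, m):
    """Average each metric over all placements of m missing cells."""
    cells = [(i, j) for i in range(T) for j in range(B)]
    sums = {"mean": [0.0] * 3, "yates": [0.0] * 3}
    count = 0
    for missing in itertools.combinations(cells, m):
        miss = set(missing)
        for name, impute in (("mean", mean_impute), ("yates", yates_impute)):
            for k, value in enumerate(fit_metrics(impute(y, miss))):
                sums[name][k] += value
        count += 1
    return {name: [s / count for s in vals] for name, vals in sums.items()}
```

Calling `compare(simulate_design(), m)` for m = 1, …, 6 would mirror the loop in the abstract. The number of placements grows quickly (38,760 for m = 6), so the larger values of m take noticeably longer in pure Python. When m is large enough to empty an entire row or column, the corresponding effect is not estimable and the iterated values are not unique.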

CLC number: