This paper presents a brief review of ten methods for imputing missing data in the binary logistic regression model. The performance of these methods under different missingness scenarios is examined using a medical dataset. The results indicate that, in general, the Expectation-Maximization (EM) and k-Nearest Neighbor (KNN) imputation methods are well suited to estimating the missing values in this model, whether the data are missing in the dependent variable only, in the independent variables only, or in both.
Key words: Expectation-maximization; Hot-deck imputation; K-nearest neighbor; Last observation carried forward; Predictive mean matching.