Item nonresponse in surveys is usually handled through some form of imputation. Regression trees and random forests provide flexible tools for obtaining a set of imputed values. In this presentation, we lay out a set of conditions on the imputation model that are sufficient for the consistency of imputed estimators based on regression trees and random forests. We also introduce a novel variance estimator that accounts for both sampling and nonresponse, and discuss the choice of hyperparameters. Finally, we present the results of a simulation study that investigates the performance of the point and variance estimators in terms of bias, efficiency, and coverage rate.
David Haziza
University of Ottawa
06/06/2023 - 12:00