Joint work with Ignacio Flores
It is generally accepted that household surveys fail to accurately portray the top tail of both income and wealth distributions. Alternative data sources, such as administrative fiscal data, are commonly used to bring the rich into the picture. However, the literature has historically treated these data sources as separate pieces of information, as there has been no broad consensus to date on how to combine them consistently. In this paper, we introduce a novel non-parametric procedure that adjusts a survey's individual weights so as to replicate the fundamental characteristic of administrative data, namely the number of people declaring given levels of income to fiscal authorities. Our method preserves the consistency of individual survey-respondent profiles, as we do not modify any self-reported characteristics. Consistency at the aggregate level is achieved via a calibration method that preserves weighted totals and averages for variables other than income (age, sex, urban population, etc.). The adjusted survey thus keeps all micro variables, which researchers can exploit to analyse dimensions of social inequality under a more representative distributive framework. Furthermore, we provide a comparative review of previous correction methods, with particular focus on rescaling methods, and of the theory and empirics of the biases they aim to correct. Each step of our procedure is illustrated by empirical applications to Brazil and Chile, covering numerous years. The complete methodology will be made publicly available as an open-source program.
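The weight adjustment described above can be illustrated with a minimal Python sketch. It is a toy example under strong assumptions, not the procedure developed in the paper: the function name `reweight_to_tax_counts`, the bracket edges, the threshold above which fiscal data are trusted, and all numbers are hypothetical. Within each top income bracket, survey weights are rescaled so that the weighted head count equals the count reported in fiscal data; weights below the threshold are scaled down uniformly so that the total population is unchanged. Self-reported incomes are never modified, and the calibration on other variables (age, sex, etc.) is omitted.

```python
import numpy as np

def reweight_to_tax_counts(income, weight, edges, tax_counts):
    """Rescale survey weights so top-bracket head counts match fiscal data.

    income, weight : 1-D arrays, one entry per survey respondent
    edges          : increasing lower bounds of the top brackets; edges[0] is
                     the (assumed) threshold above which fiscal data are
                     trusted, and the last bracket is open-ended
    tax_counts     : fiscal head count in each bracket (same length as edges)
    """
    income = np.asarray(income, dtype=float)
    weight = np.asarray(weight, dtype=float)
    new_weight = weight.copy()
    total_population = weight.sum()

    # Bracket index per respondent: -1 below the threshold, 0..K-1 above it.
    idx = np.searchsorted(edges, income, side="right") - 1

    # Above the threshold: match the fiscal head count bracket by bracket.
    for k, target in enumerate(tax_counts):
        in_bracket = idx == k
        survey_count = weight[in_bracket].sum()
        if survey_count == 0:
            raise ValueError(f"no survey respondents in bracket {k}")
        new_weight[in_bracket] *= target / survey_count

    # Below the threshold: shrink weights uniformly to keep the population total.
    below = idx < 0
    remaining = total_population - new_weight[~below].sum()
    if remaining <= 0:
        raise ValueError("fiscal counts exceed the survey population")
    new_weight[below] *= remaining / weight[below].sum()
    return new_weight

# Hypothetical data: six respondents, two top brackets [100, 300) and [300, inf).
income = [10, 20, 30, 100, 200, 500]
weight = [100, 100, 100, 10, 5, 1]
new_w = reweight_to_tax_counts(income, weight, edges=[100, 300], tax_counts=[30, 4])
```

In this toy case the survey counts 15 people in the first top bracket and 1 in the second, against 30 and 4 in the hypothetical fiscal data, so top weights rise and the weights of the three respondents below the threshold fall proportionally.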