This study evaluates the usefulness of income tax return data, which have recently become available in Japan, as a source of income statistics, focusing on their characteristics, advantages, and limitations. The large scale and accuracy of the data are major advantages, but the data exclude individuals who are not required to file tax returns as well as income exempt from taxation, which limits their ability to capture the whole distribution of income.
Approximately 23 million individuals file tax returns each year, with about 5 million additional individuals reported as dependents. Together, these groups account for only about 30 percent of the total population aged 15 and over. In particular, most wage earners and pensioners do not file returns, necessitating the use of supplementary information.
Among wage earners and pensioners who do file returns, many do so to claim deductions or credits such as the medical expense deduction or the tax credit for mortgage loan interest. Given the nature of these provisions, such filers are likely to be in distinct economic circumstances (for example, facing illness or having recently purchased a home), which indicates a potential selection bias within this group.
By contrast, for self-employed individuals, the tax return data provide comprehensive information. The number of business income filers closely aligns with the self-employed population reported in statistical surveys, and the total reported income is broadly consistent with the national accounts data. Since more than half of self-employed filers submit returns almost every year, the tax return data are highly valuable as a panel dataset for studying the self-employed.
This study also evaluates the tax return data as a source of information on top-income individuals. For the super-rich, defined as those in the top 0.1 percent of the population, coverage is almost exhaustive. However, when the focus is extended to the top 1 percent, the presence of individuals who do not file returns becomes non-negligible. Moreover, because of the Separate Withholding Taxation system for interest and dividend income, coverage of such income is less complete, which likely results in underestimating the incomes of top-income individuals.
Future challenges include the integration of tax return data with other datasets on wage earners and pensioners to construct a more comprehensive database of incomes. To understand the entire distribution of income, comparison between tax return data and existing statistical surveys is necessary.