2025, Vol. 33, pp. 507-521
Data analysis is crucial for extracting valuable information from large datasets and making strategic decisions. Effective data analysis requires various types of knowledge, and a lack of user knowledge can lead to incorrect analyses or misinterpretations. Therefore, a data analysis system that provides appropriate information is of great benefit to users without sufficient expertise. However, existing studies have not focused on providing information during data analysis, and it remains unclear what information users need. To address this challenge, this study investigates the information needs that arise during spreadsheet data analysis, with the goal of designing data analysis tools that compensate for users' lack of knowledge. We aim to understand what information users search for, when and how they search, and what web pages they read. To this end, we conducted a laboratory study in which participants analyzed data and drafted reports on their findings. The participants' behaviors were coded, and post-task interviews provided deeper insights into the information needs that arose during the analysis. Our findings include: (1) Six categories of information needs arose during spreadsheet data analysis, and the categories differed in how difficult they were to satisfy. (2) Each information need category co-occurred with significantly different user behaviors, such as browsing web pages, reading spreadsheets, and writing a report. (3) Information need categories had significant effects on search behaviors. Participants faced different types of difficulties, especially when searching for evidence to explain data trends and when searching for analysis methods. (4) Participants read web pages with significantly different readability for each information need category. In particular, when searching for analysis methods, they read web pages containing more complex terms. (5) The overlap of words between the participants' reports and the web pages they read differed significantly across information need categories. Their reports were influenced by evidence found on the web to explain data trends. Based on these findings, we discuss suggestions for improving data analysis tools.