This study developed a method for evaluating stimuli using a Q-learning model. We demonstrate that stimuli for more elaborate Q-learning models can be selected using a two-step procedure that extends the model of Fujita, Okada, and Katahira (2022b). In the two-step procedure, the suitability of the stimuli is evaluated based on Fisher information and Markov chain Monte Carlo (MCMC) estimation. In the first step, evaluation with Fisher information narrows down the candidate stimuli. In the second step, the stimuli regarded as desirable in the first step are evaluated more precisely using MCMC estimation. In simulations, the ranking of stimuli obtained with Fisher information aligned with the ranking obtained with MCMC. Fisher information can therefore accurately predict the ordering of stimuli by estimation precision, which validates the two-step procedure. Moreover, for the inverse temperature parameter, one stimulus design is superior regardless of the model. In contrast, no such stimulus design exists for the learning rate parameters. In actual experiments, it is preferable to consider the model, research method, and purpose (e.g., the parameters that researchers should focus on) and to select stimuli optimally using the two-step procedure.
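To make the first step concrete, the following is a minimal sketch of Fisher-information screening, not the authors' implementation. It assumes a two-armed bandit in which a "stimulus design" is a pair of reward probabilities, a single learning rate `alpha`, and a softmax choice rule with inverse temperature `beta`. All of these, along with the function name `fisher_info_beta` and the candidate designs, are illustrative assumptions. Under these assumptions the Q-values at each trial depend only on the past choices and rewards, so the Fisher information about `beta` is the expected sum over trials of p(1 - p)(Q_0 - Q_1)^2, where p is the probability of choosing arm 0. The sketch estimates that expectation by simulation.

```python
import math
import random


def fisher_info_beta(reward_probs, alpha=0.3, beta=2.0,
                     n_trials=100, n_sims=500, seed=0):
    """Monte Carlo estimate of the Fisher information about beta.

    Assumes a two-armed bandit, Q-values initialised at 0, one learning
    rate, and a softmax choice rule (illustrative assumptions only).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_sims):
        q = [0.0, 0.0]
        for _ in range(n_trials):
            diff = q[0] - q[1]
            # Softmax with two options reduces to a logistic function.
            p0 = 1.0 / (1.0 + math.exp(-beta * diff))
            # Conditional variance of the score d/d(beta) log P(choice).
            total += p0 * (1.0 - p0) * diff ** 2
            # Simulate the choice and the binary reward.
            action = 0 if rng.random() < p0 else 1
            reward = 1.0 if rng.random() < reward_probs[action] else 0.0
            # Q-learning update of the chosen option only.
            q[action] += alpha * (reward - q[action])
    return total / n_sims


# Hypothetical candidate designs: reward probabilities of the two arms.
candidates = {"0.8/0.2": (0.8, 0.2), "0.6/0.4": (0.6, 0.4)}
scores = {name: fisher_info_beta(p) for name, p in candidates.items()}
# Rank designs from most to least informative about beta.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

In a two-step workflow of the kind described above, only the top-ranked designs from such a screen would be passed on to the more expensive MCMC-based evaluation. The ranking can change with the assumed `alpha` and `beta`, so it is worth repeating the screen over a range of plausible parameter values.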