To construct a scale of arithmetic ability, three test forms were built and administered to students in grades three through five. Using the two-parameter item response model, the parameters of all items on the three forms were estimated simultaneously, which made it possible to compare the difficulty and discriminating power of all items on a common scale. The accuracy of ability estimation for each form, evaluated by the test information function, showed that the precision of estimation was sufficient for practical use. On the common scale, the mean estimated ability was -0.35 for grade three, 0.0 for grade four, and 1.48 for grade five, showing that the difference between grades three and four (0.35) was small compared with that between grades four and five (1.48). Possible applications of the scale are also discussed.