Japanese Journal for Research on Testing
Online ISSN : 2433-7447
Print ISSN : 1880-9618
Volume 11, Issue 1
Displaying 1-7 of 7 articles from this issue
  • In Multiple-Choice Reading Comprehension Tests in English
    Takahiro Terao, Kazuhiro Yasunaga, Hidetoki Ishii, Hiroyuki Noguchi
    2015 Volume 11 Issue 1 Pages 1-20
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    This study examines the effect of attractive distractors as a function of examinees’ proficiency. In Study 1, 16 participants were asked to explain why each distractor was incorrect, in order to determine the structure of attractive distractors, using past entrance examinations administered at private universities in Japan. In Study 2, 366 examinees took multiple-choice reading comprehension tests in English. Multinomial logistic regression analysis and analysis of residual deviance revealed that the low-proficiency group chose distractors that included negative expressions and causal relations not described in the passage, whereas the middle-proficiency group chose distractors that included negative expressions and causal relations with some description in the passage. High-proficiency examinees, in turn, chose distractors that used antonyms with some description in the passage. Implications for item writing are discussed.
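    The multinomial logistic regression behind the analysis can be sketched as follows. This is a minimal illustration, not the paper's fitted model: the choice categories (correct answer plus three distractor types named in the abstract) and all coefficients are hypothetical.

    ```python
    import numpy as np

    def choice_probs(proficiency, beta):
        """Multinomial logit: P(choice j | proficiency) via a softmax
        over per-category linear predictors (intercept + slope * x)."""
        logits = beta[:, 0] + beta[:, 1] * proficiency
        expv = np.exp(logits - logits.max())   # numerically stable softmax
        return expv / expv.sum()

    # Rows: correct, negation, unstated causality, antonym.
    # Columns: intercept, slope on proficiency. Values are illustrative only.
    beta = np.array([[0.0, 1.2], [0.5, -0.8], [0.3, -0.4], [-0.2, 0.6]])

    low = choice_probs(-1.0, beta)    # a low-proficiency examinee
    high = choice_probs(1.0, beta)    # a high-proficiency examinee
    ```

    With a negative slope on the negation-distractor category, the model reproduces the paper's qualitative finding: lower-proficiency examinees are more likely to pick that distractor type.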

  • – on the basis of the results of a survey of experts on item construction –
    Sayaka Arai
    2015 Volume 11 Issue 1 Pages 21-34
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    Tests are used for selection and qualification, and they have a large effect on individuals and society. Therefore, tests and the items used to construct them must be developed appropriately.

    The aim of this study is to identify what matters most when developing multiple-choice items, through a survey of experts involved in item construction. The study has two parts. In the first, I compared several item-writing guidelines and asked the experts what they thought about each guideline item; in the second, I asked them what they considered important for item construction. The results showed that not all guidelines need to be followed in every case; it depends on the objective of the test. The results also suggested that the important points in developing multiple-choice items are that: 1) the items properly reflect the objective of the test; 2) the items properly measure the ability that the item writers intend to measure; and 3) the items are instructive to examinees.

  • Haruhiko Mitsunaga
    2015 Volume 11 Issue 1 Pages 61-80
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    Building an item bank that most effectively facilitates the evaluation of eligibility for clinical hospital practice in nursing colleges is clearly desirable. However, the 2PL IRT model, commonly used to standardize item parameters, requires more than 300 examinees to estimate stable item parameters (Toyoda, 2012). Although Mitsunaga et al. (2014) used prior distributions to obtain stable item parameter estimates from smaller datasets, in practice the relevant prior information is not always available. In this paper, eight CBT test forms were administered to eight groups, each with fewer than 200 examinees. To obtain feasible parameter estimates from such small datasets, latent rank theory (LRT; Shojima, 2009) was applied. The results suggest that relatively accurate LRT estimates are possible without any prior distribution. This can be achieved by assembling a group of small datasets suitable for IRT analysis, in which item characteristics and examinee abilities can be evaluated by comparing item parameters.
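    The 2PL model the abstract refers to is the standard two-parameter logistic item response function; a minimal sketch with hypothetical item parameters (a: discrimination, b: difficulty — not values from the paper):

    ```python
    import math

    def p_2pl(theta, a, b):
        """2PL IRT: probability that an examinee with ability theta
        answers an item with discrimination a and difficulty b correctly."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # An examinee whose ability equals the item difficulty succeeds
    # with probability 0.5; probability rises with ability.
    prob_at_b = p_2pl(0.5, 1.2, 0.5)
    ```

    Estimating a and b stably from response data is what requires the large samples (300+ examinees) cited above; with fewer examinees the likelihood surface is too flat, which motivates the paper's switch to latent rank theory.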

  • Application to the Kyoukenshiki Standardized Achievement Test NRT
    Masayuki Suzuki, Tetsuya Toyota, Kazuhiro Yamaguchi, Yuan Sun
    2015 Volume 11 Issue 1 Pages 81-97
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    Most traditional tests, which report only a small number of content-based subscores, total scores, or T-scores, are of little use for providing diagnostic information about students’ strengths and weaknesses. In recent years, cognitive diagnostic modeling, developed to provide detailed information on the extent to which students have mastered the study content, has attracted a great deal of attention. In this paper, we applied several cognitive diagnostic models to the Kyoukenshiki standardized achievement test NRT and investigated their utility in educational practice. The results showed that we could obtain diagnostic information about students’ knowledge states that could not be obtained from the content-based subscores and total score. We also discuss the problems in applying cognitive diagnostic models and issues to be addressed in the future.
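    As one concrete instance of a cognitive diagnostic model, the well-known DINA model links an item's required attributes (its Q-matrix row) to a student's attribute-mastery pattern. This is a generic textbook sketch, not necessarily the model the paper applied; slip and guess values are illustrative.

    ```python
    import numpy as np

    def dina_prob(alpha, q, slip, guess):
        """DINA model: eta = 1 iff the student has mastered every attribute
        the item requires. P(correct) = 1 - slip if eta = 1, else guess."""
        eta = int(np.all(np.asarray(alpha) >= np.asarray(q)))
        return (1.0 - slip) if eta else guess

    # Item requires attribute 1 only (q = [1, 0]).
    master = dina_prob([1, 1], [1, 0], slip=0.1, guess=0.2)      # 0.9
    nonmaster = dina_prob([0, 1], [1, 0], slip=0.1, guess=0.2)   # 0.2
    ```

    Inverting this relation across many items is what yields the per-attribute knowledge-state diagnosis that subscores and total scores cannot provide.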

  • Eri Banno, Tomoko Watanabe
    2015 Volume 11 Issue 1 Pages 99-109
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    This article examines the placement test for a Japanese language course that was revised in 2012, using the Rasch model. Using data from the placement test used before 2012 and from the revised 2012 test, we investigated how the revision affected the test results. The participants were 487 international students enrolled in a Japanese program at a university in Japan. The results indicate that the revised test was more difficult than the old one, which was the intent of the revision. At the same time, removing the rubi (reading aids) from each kanji did not change the difficulty level of the questions. Furthermore, the results suggest that it is necessary to add more difficult items to the revised test, because it was still easy for the examinees, and to revise the multiple-choice items that did not work well.
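    A convenient property of the Rasch model for comparisons like this one is that item difficulty differences can be estimated pairwise, independently of examinee ability: among examinees who answer exactly one of two items correctly, the log of the count ratio estimates the difficulty gap. A minimal sketch (the counts are hypothetical, not the paper's data):

    ```python
    import math

    def rasch_difficulty_gap(n_ij, n_ji):
        """Pairwise Rasch estimate of b_j - b_i.

        n_ij: examinees who answered item i correctly and item j incorrectly;
        n_ji: the reverse. Under the Rasch model, ability cancels out of this
        ratio, so the log-odds directly estimates the difficulty difference."""
        return math.log(n_ij / n_ji)

    # If 30 examinees got i right but j wrong, and only 10 the reverse,
    # item j is estimated to be ln(3) logits harder than item i.
    gap = rasch_difficulty_gap(30, 10)
    ```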

  • A practical study using BJT Business Japanese Proficiency Test data as an example
    Wakana Onozuka, Kiyokata Kato, Yumiko Umeki, Akiko Echizenya, Shin-ich ...
    2015 Volume 11 Issue 1 Pages 111-129
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    The purpose of this study is to show that items that did not meet the statistical criteria for entry into the item bank can be improved, in terms of item statistics, by rewriting them with a focus on item quality. We (1) selected items that did not meet the statistical criteria from past Business Japanese Proficiency Test (BJT) administrations, (2) rewrote those items to enhance their quality, and (3) found, based on experimental data, that the revision did improve the statistical characteristics of those items.
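    The kind of item statistics such screening typically rests on can be computed as below. This is a generic classical-test-theory sketch, assuming (the abstract does not say) that the criteria involve item difficulty and item-total discrimination; the tiny response matrix is fabricated for illustration.

    ```python
    import numpy as np

    def item_stats(responses):
        """Classical item statistics for a 0/1 matrix (rows = examinees,
        columns = items): difficulty (proportion correct) and corrected
        item-total (point-biserial) discrimination, computed against the
        rest-score so the item is not correlated with itself."""
        responses = np.asarray(responses, dtype=float)
        difficulty = responses.mean(axis=0)
        total = responses.sum(axis=1)
        discrimination = np.array([
            np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
            for j in range(responses.shape[1])
        ])
        return difficulty, discrimination

    resp = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
    diff, disc = item_stats(resp)
    ```

    An item with extreme difficulty or near-zero (or negative) discrimination would fail such criteria and become a candidate for rewriting.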

  • Takatoshi Ishii, Maomi Ueno
    2015 Volume 11 Issue 1 Pages 131-149
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    ISO/IEC 23988:2007 provides a global standard on the use of IT to deliver assessments to examinees and to record and score their responses. For high-stakes tests, the standard recommends using uniform test forms, in which each form comprises a different set of items but all forms must have equivalent specifications, such as equivalent amounts of test information based on item response theory (IRT). However, assembling uniform test forms is NP-hard, because it is a combinatorial optimization problem of selecting items from an item bank. Aided by advances in information technology, test assembly has nevertheless made rapid progress in recent years. In this paper, we introduce several typical uniform test assembly methods and compare them to explain their advantages and disadvantages.
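    The flavor of the optimization problem can be sketched with a simple greedy balancing heuristic: since exact uniform assembly is NP-hard, a heuristic deals items to the form with the least accumulated test information so far. This is an illustration only, assuming 2PL items and a single ability point; it is not one of the methods surveyed in the paper.

    ```python
    import math

    def info_2pl(theta, a, b):
        """Fisher information of a 2PL item at ability theta."""
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        return a * a * p * (1.0 - p)

    def assemble_uniform_forms(items, n_forms, theta=0.0):
        """Greedy heuristic: sort items by information at theta (largest
        first), then assign each to the form whose accumulated information
        is currently smallest, so forms end up roughly uniform."""
        forms = [[] for _ in range(n_forms)]
        totals = [0.0] * n_forms
        for a, b in sorted(items, key=lambda ab: -info_2pl(theta, *ab)):
            k = totals.index(min(totals))    # currently lightest form
            forms[k].append((a, b))
            totals[k] += info_2pl(theta, a, b)
        return forms, totals

    bank = [(1.0, 0.0), (1.2, 0.5), (0.8, -0.5),
            (1.5, 0.2), (1.0, 1.0), (0.9, -1.0)]
    forms, totals = assemble_uniform_forms(bank, 2)
    ```

    Real uniform test assembly additionally enforces content constraints, item overlap limits, and information targets across a range of theta, which is what makes integer-programming and other specialized methods necessary.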
