Japanese Journal for Research on Testing
Online ISSN : 2433-7447
Print ISSN : 1880-9618
Volume 11, Issue 1
Displaying 1-7 of 7 articles from this issue
  • In Multiple-Choice Reading Comprehension Tests in English
    Takahiro Terao, Kazuhiro Yasunaga, Hidetoki Ishii, Hiroyuki Noguchi
    2015 Volume 11 Issue 1 Pages 1-20
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    This study examines the effect of attractive distractors as a function of examinees’ proficiency. In Study 1, 16 participants were asked to explain why each distractor was incorrect, in order to determine the structure of attractive distractors, using past entrance examinations administered at private universities in Japan. In Study 2, 366 examinees took multiple-choice reading comprehension tests in English. Multinomial logistic regression analysis and analysis of residual deviance revealed that the low-proficiency group chose distractors that included negative expressions and causal relations not described in the passage, whereas the middle-proficiency group chose distractors that included negative expressions and causal relations with some description in the passage. High-proficiency examinees, in turn, chose distractors that used antonyms with some description in the passage. Implications for item writing are discussed.
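    The multinomial logistic regression behind the analysis can be sketched as follows. This is a minimal illustration, not the paper's fitted model: the choice categories (correct answer plus three distractor types named in the abstract) and all coefficients are hypothetical.

    ```python
    import numpy as np

    def choice_probs(proficiency, beta):
        """Multinomial logit: P(choice j | proficiency) via a softmax
        over per-category linear predictors (intercept + slope * x)."""
        logits = beta[:, 0] + beta[:, 1] * proficiency
        expv = np.exp(logits - logits.max())   # numerically stable softmax
        return expv / expv.sum()

    # Rows: correct, negation, unstated causality, antonym.
    # Columns: intercept, slope on proficiency. Values are illustrative only.
    beta = np.array([[0.0, 1.2], [0.5, -0.8], [0.3, -0.4], [-0.2, 0.6]])

    low = choice_probs(-1.0, beta)    # a low-proficiency examinee
    high = choice_probs(1.0, beta)    # a high-proficiency examinee
    ```

    With a negative slope on the negation-distractor category, the model reproduces the paper's qualitative finding: lower-proficiency examinees are more likely to pick that distractor type.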

  • – on the basis of the results of a survey of experts on item construction –
    Sayaka Arai
    2015 Volume 11 Issue 1 Pages 21-34
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    Tests are used for selection and qualification, and they have a large effect on individuals and society. Therefore, tests and the items used to construct them must be developed appropriately.

    The aim of this study is to identify what matters most when developing multiple-choice items, through a survey of experts involved in item construction. The study has two parts. In the first, I compared several item-writing guidelines and asked the experts what they thought about each guideline item; in the second, I asked them what they considered important for item construction. The results showed that not all guidelines need to be followed in every case; it depends on the objective of the test. The results also suggested that the important points in developing multiple-choice items are that: 1) the items properly reflect the objective of the test; 2) the items properly measure the ability that the item writers intend to measure; and 3) the items are instructive to examinees.

  • Haruhiko Mitsunaga
    2015 Volume 11 Issue 1 Pages 61-80
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    Building an item bank that most effectively facilitates the evaluation of eligibility for clinical hospital practice in nursing colleges is clearly desirable. However, the 2PL IRT model, commonly used to standardize item parameters, requires more than 300 examinees to estimate stable item parameters (Toyoda, 2012). Although Mitsunaga et al. (2014) used prior distributions to obtain stable item parameter estimates from smaller datasets, in practice the relevant prior information is not always available. In this paper, eight CBT test forms were administered to eight groups, each with fewer than 200 examinees. To obtain feasible parameter estimates from such small datasets, latent rank theory (LRT; Shojima, 2009) was applied. The results suggest that relatively accurate LRT estimates are possible without any prior distribution. This can be achieved by assembling a group of small datasets suitable for IRT analysis, in which item characteristics and examinee abilities can be evaluated by comparing item parameters.
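    The 2PL model the abstract refers to is the standard two-parameter logistic item response function; a minimal sketch with hypothetical item parameters (a: discrimination, b: difficulty — not values from the paper):

    ```python
    import math

    def p_2pl(theta, a, b):
        """2PL IRT: probability that an examinee with ability theta
        answers an item with discrimination a and difficulty b correctly."""
        return 1.0 / (1.0 + math.exp(-a * (theta - b)))

    # An examinee whose ability equals the item difficulty succeeds
    # with probability 0.5; probability rises with ability.
    prob_at_b = p_2pl(0.5, 1.2, 0.5)
    ```

    Estimating a and b stably from response data is what requires the large samples (300+ examinees) cited above; with fewer examinees the likelihood surface is too flat, which motivates the paper's switch to latent rank theory.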

  • Application to the Kyoukenshiki Standardized Achievement Test NRT
    Masayuki Suzuki, Tetsuya Toyota, Kazuhiro Yamaguchi, Yuan Sun
    2015 Volume 11 Issue 1 Pages 81-97
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    Most traditional tests, which report only a small number of content-based subscores, total scores, or T-scores, are of little use for providing diagnostic information about students’ strengths and weaknesses. In recent years, cognitive diagnostic modeling, developed to provide detailed information on the extent to which students have mastered the study content, has attracted a great deal of attention. In this paper, we applied several cognitive diagnostic models to the Kyoukenshiki standardized achievement test NRT and investigated their utility in educational practice. The results showed that we could obtain diagnostic information about students’ knowledge states that could not be obtained from the content-based subscores and total score. We also discuss the problems in applying cognitive diagnostic models and issues to be addressed in the future.
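    As one concrete instance of a cognitive diagnostic model, the well-known DINA model links an item's required attributes (its Q-matrix row) to a student's attribute-mastery pattern. This is a generic textbook sketch, not necessarily the model the paper applied; slip and guess values are illustrative.

    ```python
    import numpy as np

    def dina_prob(alpha, q, slip, guess):
        """DINA model: eta = 1 iff the student has mastered every attribute
        the item requires. P(correct) = 1 - slip if eta = 1, else guess."""
        eta = int(np.all(np.asarray(alpha) >= np.asarray(q)))
        return (1.0 - slip) if eta else guess

    # Item requires attribute 1 only (q = [1, 0]).
    master = dina_prob([1, 1], [1, 0], slip=0.1, guess=0.2)      # 0.9
    nonmaster = dina_prob([0, 1], [1, 0], slip=0.1, guess=0.2)   # 0.2
    ```

    Inverting this relation across many items is what yields the per-attribute knowledge-state diagnosis that subscores and total scores cannot provide.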

  • Eri Banno, Tomoko Watanabe
    2015 Volume 11 Issue 1 Pages 99-109
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    This article examines the placement test for a Japanese language course that was revised in 2012, using the Rasch model. Using data from the placement test used before 2012 and from the revised 2012 test, we investigated how the revision affected the test results. The participants were 487 international students enrolled in a Japanese program at a university in Japan. The results indicate that the revised test was more difficult than the old one, which was the intent of the revision. At the same time, removing the rubi (reading aids) from each kanji did not change the difficulty level of the questions. Furthermore, the results suggest that it is necessary to add more difficult items to the revised test, because it was still easy for the examinees, and to revise the multiple-choice items that did not work well.
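    A convenient property of the Rasch model for comparisons like this one is that item difficulty differences can be estimated pairwise, independently of examinee ability: among examinees who answer exactly one of two items correctly, the log of the count ratio estimates the difficulty gap. A minimal sketch (the counts are hypothetical, not the paper's data):

    ```python
    import math

    def rasch_difficulty_gap(n_ij, n_ji):
        """Pairwise Rasch estimate of b_j - b_i.

        n_ij: examinees who answered item i correctly and item j incorrectly;
        n_ji: the reverse. Under the Rasch model, ability cancels out of this
        ratio, so the log-odds directly estimates the difficulty difference."""
        return math.log(n_ij / n_ji)

    # If 30 examinees got i right but j wrong, and only 10 the reverse,
    # item j is estimated to be ln(3) logits harder than item i.
    gap = rasch_difficulty_gap(30, 10)
    ```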

  • A practical study using BJT Business Japanese Proficiency Test data as an example
    Wakana Onozuka, Kiyokata Kato, Yumiko Umeki, Akiko Echizenya, Shin-ich ...
    2015 Volume 11 Issue 1 Pages 111-129
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    The purpose of this study is to show that items that did not meet the statistical criteria for entry into the item bank can be improved, in terms of item statistics, by rewriting them with a focus on item quality. We (1) selected items that did not meet the statistical criteria from past Business Japanese Proficiency Test (BJT) administrations, (2) rewrote those items to enhance their quality, and (3) found, based on experimental data, that the revision did improve the statistical characteristics of those items.
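    The kind of item statistics such screening typically rests on can be computed as below. This is a generic classical-test-theory sketch, assuming (the abstract does not say) that the criteria involve item difficulty and item-total discrimination; the tiny response matrix is fabricated for illustration.

    ```python
    import numpy as np

    def item_stats(responses):
        """Classical item statistics for a 0/1 matrix (rows = examinees,
        columns = items): difficulty (proportion correct) and corrected
        item-total (point-biserial) discrimination, computed against the
        rest-score so the item is not correlated with itself."""
        responses = np.asarray(responses, dtype=float)
        difficulty = responses.mean(axis=0)
        total = responses.sum(axis=1)
        discrimination = np.array([
            np.corrcoef(responses[:, j], total - responses[:, j])[0, 1]
            for j in range(responses.shape[1])
        ])
        return difficulty, discrimination

    resp = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
    diff, disc = item_stats(resp)
    ```

    An item with extreme difficulty or near-zero (or negative) discrimination would fail such criteria and become a candidate for rewriting.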

  • Takatoshi Ishii, Maomi Ueno
    2015 Volume 11 Issue 1 Pages 131-149
    Published: 2015
    Released on J-STAGE: February 15, 2022
    JOURNAL FREE ACCESS

    ISO/IEC 23988:2007 provides a global standard on the use of IT to deliver assessments to examinees and to record and score their responses. For high-stakes tests, the standard recommends using uniform test forms, in which each form comprises a different set of items but all forms must have equivalent specifications, such as equivalent amounts of test information based on item response theory (IRT). However, assembling uniform test forms is NP-hard, because it is a combinatorial optimization problem of selecting items from an item bank. Aided by advances in information technology, test assembly has nevertheless made rapid progress in recent years. In this paper, we introduce several typical uniform test assembly methods and compare them to explain their advantages and disadvantages.
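    The flavor of the optimization problem can be sketched with a simple greedy balancing heuristic: since exact uniform assembly is NP-hard, a heuristic deals items to the form with the least accumulated test information so far. This is an illustration only, assuming 2PL items and a single ability point; it is not one of the methods surveyed in the paper.

    ```python
    import math

    def info_2pl(theta, a, b):
        """Fisher information of a 2PL item at ability theta."""
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        return a * a * p * (1.0 - p)

    def assemble_uniform_forms(items, n_forms, theta=0.0):
        """Greedy heuristic: sort items by information at theta (largest
        first), then assign each to the form whose accumulated information
        is currently smallest, so forms end up roughly uniform."""
        forms = [[] for _ in range(n_forms)]
        totals = [0.0] * n_forms
        for a, b in sorted(items, key=lambda ab: -info_2pl(theta, *ab)):
            k = totals.index(min(totals))    # currently lightest form
            forms[k].append((a, b))
            totals[k] += info_2pl(theta, a, b)
        return forms, totals

    bank = [(1.0, 0.0), (1.2, 0.5), (0.8, -0.5),
            (1.5, 0.2), (1.0, 1.0), (0.9, -1.0)]
    forms, totals = assemble_uniform_forms(bank, 2)
    ```

    Real uniform test assembly additionally enforces content constraints, item overlap limits, and information targets across a range of theta, which is what makes integer-programming and other specialized methods necessary.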
