Japanese Journal for Research on Testing
Online ISSN : 2433-7447
Print ISSN : 1880-9618
Volume 12, Issue 1
Displaying 1-4 of 4 articles from this issue
  • Yasuo Miyazaki, Taketoshi Sugisawa
    2016 Volume 12 Issue 1 Pages 1-17
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    Reliability generalization is a meta-analytic technique used to synthesize the score reliability of an instrument across many studies. The concept is relatively new, and the methodology is not yet established; in particular, the appropriate form of transformation of the reliability coefficient is not well understood. In this paper, a simulation study was conducted to examine which transformation of the alpha coefficient works best, generating a population of reliability coefficients within the framework of mixed-effects meta-analysis models. Six forms of transformation were compared in order to find a better transformation for reliability generalization. The results implied that the log and cube root transformations performed much better than the other forms. From the viewpoint of variance stability, the log transformation is recommended, since it is a variance-stabilizing transformation while the cube root transformation is not.
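
    As a concrete illustration of the kind of transformations compared here: one common log transformation in the reliability generalization literature is Bonett's (2002) T = ln(1 - alpha), and a common cube root form is the Hakstian-Whalen (1976) T = (1 - alpha)^(1/3); whether these exact forms are the ones examined in the paper is an assumption. The minimal Python sketch below pools transformed alphas with inverse-variance weights on the log scale. It is a fixed-effect simplification of the mixed-effects framework the paper uses, and the study values are hypothetical.

        import math

        def bonett_log(alpha):
            # Log transformation of coefficient alpha: T = ln(1 - alpha).
            # Approximately variance-stabilizing, with sampling variance
            # Var(T) ~= 2k / ((k - 1) * (n - 2)) for k items and n examinees.
            return math.log(1.0 - alpha)

        def cube_root(alpha):
            # Cube-root transformation: T = (1 - alpha) ** (1/3).
            return (1.0 - alpha) ** (1.0 / 3.0)

        def pooled_alpha_log(alphas, ks, ns):
            # Inverse-variance weighted mean on the log scale,
            # back-transformed to the alpha metric.
            ts = [bonett_log(a) for a in alphas]
            vs = [2.0 * k / ((k - 1) * (n - 2)) for k, n in zip(ks, ns)]
            ws = [1.0 / v for v in vs]
            t_bar = sum(w * t for w, t in zip(ws, ts)) / sum(ws)
            return 1.0 - math.exp(t_bar)

        # Three hypothetical studies: alpha, number of items, sample size.
        print(pooled_alpha_log([0.82, 0.78, 0.88], [20, 25, 20], [150, 300, 120]))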

    Download PDF (1251K)
  • Cases on the Information Disclosure System
    Masako Wakabayashi, Kazunari Sugimitsu
    2016 Volume 12 Issue 1 Pages 19-35
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    There are diverse values to consider when deciding whether test items should be disclosed. However, previous studies have not comprehensively discussed test item disclosure in light of these diverse values. The aim of this study is to derive judgment standards for test item disclosure. We surveyed multiple cases dealing with test items under Japan's information disclosure system and investigated the details of each case comprehensively. As a result, the following viewpoints were obtained: (1) the necessity of ensuring transparency through test item disclosure, (2) the reuse of test items in future examinations, (3) the acceptability of test preparation based on past test items, (4) the acceptability of an increased burden of developing new test items, and (5) the information management of test items. This study suggests that the points mentioned above can serve as judgment standards for test item disclosure in Japanese official examinations.

    Download PDF (854K)
  • Yutaro Sakamoto
    2016 Volume 12 Issue 1 Pages 37-53
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    While it is said that evidence-based discussion about education is needed, quality assurance of tests is also important. It has been argued that the significance of measuring constructs correctly should be reaffirmed, and that previous studies are insufficient. The present study examined the validity of the Japanese TIMSS 2011 mathematics data using multidimensional IRT. In addition, it investigated what the subscales “knowing,” “reasoning,” and “applying” measure by applying a bifactor model and examining item information. As a result, 23 items were found for which the group factors had a greater impact than the general factor, revealing characteristics that unidimensional IRT cannot express. In other words, multidimensional IRT can express characteristics of the constructs this test tries to measure, and its applicability was suggested.
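
    For readers unfamiliar with the comparison being made: in a bifactor 2PL model each item loads on the general factor and on one group factor, and the Fisher information the item carries about factor f at a given ability point is proportional to the squared discrimination on that factor. The Python sketch below shows one way an item could be flagged as group-dominated in this sense; the parameter values are hypothetical and the paper's exact criterion may differ.

        import math

        def bifactor_item_information(a_gen, a_grp, theta_gen, theta_grp, d, D=1.7):
            # Fisher information of a bifactor 2PL item with respect to the
            # general and group factors at (theta_gen, theta_grp):
            # I_f = (D * a_f)^2 * P * (1 - P), i.e. the diagonal of the
            # multidimensional 2PL information matrix D^2 * a a' * P(1 - P).
            z = D * (a_gen * theta_gen + a_grp * theta_grp) + d
            p = 1.0 / (1.0 + math.exp(-z))
            pq = p * (1.0 - p)
            return (D * a_gen) ** 2 * pq, (D * a_grp) ** 2 * pq

        # Hypothetical items: (a_general, a_group, intercept). An item is
        # "group-dominated" when the group factor carries more information.
        items = [(1.2, 0.4, 0.0), (0.5, 1.1, -0.3), (0.8, 0.9, 0.5)]
        for i, (ag, agr, d) in enumerate(items, 1):
            info_gen, info_grp = bifactor_item_information(ag, agr, 0.0, 0.0, d)
            print(f"item {i}: group-dominated = {info_grp > info_gen}")

    Because P(1 - P) is common to both factors, the comparison at a single ability point reduces to comparing squared discriminations; integrating information over the ability distribution would be the more complete check.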

    Download PDF (1087K)
  • Masaki Uto, Maomi Ueno
    2016 Volume 12 Issue 1 Pages 55-75
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    Performance assessment has attracted much attention in various assessment fields, such as entrance examinations, employee evaluation, and educational assessment. Performance assessment makes it possible to assess examinees’ practical and higher-order skills, which are difficult to assess with traditional paper tests. In a typical performance assessment, examinees’ performances on multiple tasks are evaluated by multiple raters. However, it has been pointed out that the reliability of such assessment strongly depends on the characteristics of the raters and tasks. As a method to improve reliability, item response models that incorporate rater and task characteristic parameters have been proposed. Earlier studies reported that these models could improve the reliability of performance assessment because they estimate examinee ability while taking rater and task characteristics into account. When applying them to actual performance assessments, selecting the model best suited to the assessment situation is important. Therefore, this paper reviews previous item response models that incorporate rater and task characteristic parameters and explains their characteristics. Furthermore, the paper proposes an approach for selecting an optimal model for a given assessment situation, and demonstrates the effectiveness of the models through a real-data application.
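
    A widely used member of this model family is the many-facet Rasch model (Linacre, 1989), which adds a rater severity term to the usual ability and difficulty parameters. The Python sketch below implements its rating-scale form as a generic illustration with hypothetical values; it is not necessarily one of the specific models the paper reviews, nor the model it proposes.

        import math

        def mfrm_category_probs(theta, beta, rho, taus):
            # Many-facet Rasch model, rating-scale form: the log-odds of
            # category k versus k - 1 is theta - beta - rho - tau_k, where
            # theta = examinee ability, beta = task difficulty,
            # rho = rater severity, taus = category thresholds.
            logits = [0.0]
            for tau in taus:
                logits.append(logits[-1] + (theta - beta - rho - tau))
            exps = [math.exp(x) for x in logits]
            total = sum(exps)
            return [e / total for e in exps]  # P(X = 0), ..., P(X = K)

        # Hypothetical case: able examinee, hard task, severe rater, 0-3 scale.
        probs = mfrm_category_probs(theta=1.0, beta=0.5, rho=0.8, taus=[-1.0, 0.0, 1.0])
        print([round(p, 3) for p in probs])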

    Download PDF (1474K)