Japanese Journal for Research on Testing
Online ISSN : 2433-7447
Print ISSN : 1880-9618
Volume 12, Issue 1
Displaying 1-4 of 4 articles from this issue
  • Yasuo Miyazaki, Taketoshi Sugisawa
    2016 Volume 12 Issue 1 Pages 1-17
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    Reliability generalization is a meta-analytic technique used to synthesize the score reliability of an instrument across many studies. The concept is relatively new, and the methodology is not yet established; in particular, the appropriate form of transformation of the reliability coefficient is not well understood. In this paper, a simulation study was conducted to examine which transformation of the alpha coefficient works best, generating a population of reliability coefficients within the framework of mixed-effects meta-analysis models. Six forms of transformation were compared in order to find a better transformation for reliability generalization. The results implied that the log and cube root transformations performed much better than the other forms. From the viewpoint of variance stability, the log transformation is recommended, since it is a variance-stabilizing transformation while the cube root transformation is not.
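
    As a concrete illustration of the kind of transformations compared here: one common log transformation in the reliability generalization literature is Bonett's (2002) T = ln(1 - alpha), and a common cube root form is the Hakstian-Whalen (1976) T = (1 - alpha)^(1/3); whether these exact forms are the ones examined in the paper is an assumption. The minimal Python sketch below pools transformed alphas with inverse-variance weights on the log scale. It is a fixed-effect simplification of the mixed-effects framework the paper uses, and the study values are hypothetical.

        import math

        def bonett_log(alpha):
            # Log transformation of coefficient alpha: T = ln(1 - alpha).
            # Approximately variance-stabilizing, with sampling variance
            # Var(T) ~= 2k / ((k - 1) * (n - 2)) for k items and n examinees.
            return math.log(1.0 - alpha)

        def cube_root(alpha):
            # Cube-root transformation: T = (1 - alpha) ** (1/3).
            return (1.0 - alpha) ** (1.0 / 3.0)

        def pooled_alpha_log(alphas, ks, ns):
            # Inverse-variance weighted mean on the log scale,
            # back-transformed to the alpha metric.
            ts = [bonett_log(a) for a in alphas]
            vs = [2.0 * k / ((k - 1) * (n - 2)) for k, n in zip(ks, ns)]
            ws = [1.0 / v for v in vs]
            t_bar = sum(w * t for w, t in zip(ws, ts)) / sum(ws)
            return 1.0 - math.exp(t_bar)

        # Three hypothetical studies: alpha, number of items, sample size.
        print(pooled_alpha_log([0.82, 0.78, 0.88], [20, 25, 20], [150, 300, 120]))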

    Download PDF (1251K)
  • Cases on the Information Disclosure System
    Masako Wakabayashi, Kazunari Sugimitsu
    2016 Volume 12 Issue 1 Pages 19-35
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    There are diverse values to consider when deciding whether test items should be disclosed. However, previous studies have not comprehensively discussed test item disclosure in light of these diverse values. The aim of this study is to derive judgment standards for test item disclosure. We surveyed multiple cases dealing with test items under Japan's information disclosure system and investigated the details of each case comprehensively. As a result, the following viewpoints were obtained: (1) the necessity of ensuring transparency through test item disclosure, (2) the reuse of test items in future examinations, (3) the acceptability of test preparation based on past test items, (4) the acceptability of an increased burden of developing new test items, and (5) the information management of test items. This study suggests that the points mentioned above can serve as judgment standards for test item disclosure in Japanese official examinations.

    Download PDF (854K)
  • Yutaro Sakamoto
    2016 Volume 12 Issue 1 Pages 37-53
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    While it is said that evidence-based discussion about education is needed, quality assurance of tests is also important. It has been argued that the significance of measuring constructs correctly should be reaffirmed, and that previous studies are insufficient. The present study examined the validity of the Japanese TIMSS 2011 mathematics data using multidimensional IRT. In addition, it investigated what the subscales “knowing,” “reasoning,” and “applying” measure by applying a bifactor model and examining item information. As a result, 23 items were found for which the group factors had a greater impact than the general factor, revealing characteristics that unidimensional IRT cannot express. In other words, multidimensional IRT can express characteristics of the constructs this test tries to measure, and its applicability was suggested.
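
    For readers unfamiliar with the comparison being made: in a bifactor 2PL model each item loads on the general factor and on one group factor, and the Fisher information the item carries about factor f at a given ability point is proportional to the squared discrimination on that factor. The Python sketch below shows one way an item could be flagged as group-dominated in this sense; the parameter values are hypothetical and the paper's exact criterion may differ.

        import math

        def bifactor_item_information(a_gen, a_grp, theta_gen, theta_grp, d, D=1.7):
            # Fisher information of a bifactor 2PL item with respect to the
            # general and group factors at (theta_gen, theta_grp):
            # I_f = (D * a_f)^2 * P * (1 - P), i.e. the diagonal of the
            # multidimensional 2PL information matrix D^2 * a a' * P(1 - P).
            z = D * (a_gen * theta_gen + a_grp * theta_grp) + d
            p = 1.0 / (1.0 + math.exp(-z))
            pq = p * (1.0 - p)
            return (D * a_gen) ** 2 * pq, (D * a_grp) ** 2 * pq

        # Hypothetical items: (a_general, a_group, intercept). An item is
        # "group-dominated" when the group factor carries more information.
        items = [(1.2, 0.4, 0.0), (0.5, 1.1, -0.3), (0.8, 0.9, 0.5)]
        for i, (ag, agr, d) in enumerate(items, 1):
            info_gen, info_grp = bifactor_item_information(ag, agr, 0.0, 0.0, d)
            print(f"item {i}: group-dominated = {info_grp > info_gen}")

    Because P(1 - P) is common to both factors, the comparison at a single ability point reduces to comparing squared discriminations; integrating information over the ability distribution would be the more complete check.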

    Download PDF (1087K)
  • Masaki Uto, Maomi Ueno
    2016 Volume 12 Issue 1 Pages 55-75
    Published: 2016
    Released on J-STAGE: May 25, 2019
    JOURNAL OPEN ACCESS

    Performance assessment has attracted much attention in various assessment fields, such as entrance examinations, employee evaluation, and educational assessment. Performance assessment makes it possible to assess examinees’ practical and higher-order skills, which are difficult to assess with traditional paper tests. In a typical performance assessment, examinees’ performances on multiple tasks are evaluated by multiple raters. However, it has been pointed out that the reliability of such assessment strongly depends on the characteristics of the raters and tasks. As a method to improve reliability, item response models that incorporate rater and task characteristic parameters have been proposed. Earlier studies reported that these models could improve the reliability of performance assessment because they estimate examinee ability while taking rater and task characteristics into account. When applying them to actual performance assessments, selecting the model best suited to the assessment situation is important. Therefore, this paper reviews previous item response models that incorporate rater and task characteristic parameters and explains their characteristics. Furthermore, the paper proposes an approach for selecting an optimal model for a given assessment situation, and demonstrates the effectiveness of the models through a real-data application.
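
    A widely used member of this model family is the many-facet Rasch model (Linacre, 1989), which adds a rater severity term to the usual ability and difficulty parameters. The Python sketch below implements its rating-scale form as a generic illustration with hypothetical values; it is not necessarily one of the specific models the paper reviews, nor the model it proposes.

        import math

        def mfrm_category_probs(theta, beta, rho, taus):
            # Many-facet Rasch model, rating-scale form: the log-odds of
            # category k versus k - 1 is theta - beta - rho - tau_k, where
            # theta = examinee ability, beta = task difficulty,
            # rho = rater severity, taus = category thresholds.
            logits = [0.0]
            for tau in taus:
                logits.append(logits[-1] + (theta - beta - rho - tau))
            exps = [math.exp(x) for x in logits]
            total = sum(exps)
            return [e / total for e in exps]  # P(X = 0), ..., P(X = K)

        # Hypothetical case: able examinee, hard task, severe rater, 0-3 scale.
        probs = mfrm_category_probs(theta=1.0, beta=0.5, rho=0.8, taus=[-1.0, 0.0, 1.0])
        print([round(p, 3) for p in probs])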

    Download PDF (1474K)