JLTA Journal
Online ISSN : 2189-9746
Print ISSN : 2189-5341
ISSN-L : 2189-5341
Volume 21
Displaying 1-10 of 10 articles from this issue
  • 2018 Volume 21 Pages 0-
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    Download PDF (1331K)
  • John M. NORRIS
    2018 Volume 21 Pages 3-20
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    Constructed-response tasks have captured the attention of testers and educators for some time (e.g., Cureton, 1951), because they present goal-oriented, contextualized challenges that prompt examinees to deploy cognitive skills and domain-related knowledge in authentic performances. Such performances present a distinct advantage when teaching, learning, and assessment focus on what learners can do rather than merely emphasizing what they know (Wiggins, 1998). Over the past several decades, communicative performance tasks have come to play a crucial role in language assessments on a variety of levels, from classroom-based tests, to professional certifications, to large-scale language proficiency exams (Norris, 2009, 2016). However, the use of such tasks for assessment purposes remains contentious, and numerous language testing alternatives are available at potentially lower cost and degree of effort. In order to facilitate decisions about when and why to adopt task-based designs for language assessment, I first outline the relationship between assessment designs and their intended uses and consequences. I then introduce two high-stakes examples of language assessment circumstances (job certification and admissions testing) that suggest a need for task-based designs, and I review the corresponding fit of several assessments currently in use for these purposes. In relation to these purposes, I also suggest some of the positive consequences of task-based designs for language learners, teachers, and society, and I point to the dangers of using assessments that do not incorporate communicative tasks or do so inappropriately. I conclude by highlighting other circumstances that call for task-based designs, and I suggest how advances in technology may help to address associated challenges.
    Download PDF (974K)
  • Rika MANDOKORO
    2018 Volume 21 Pages 23-41
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    The present study explored how EFL readers’ text processing and comprehension, and the relationship between the two, were affected by two reading purposes: reading to respond to a multiple-choice reading comprehension test and reading to complete an essay in response to a source text. In the experiment, a total of 23 Japanese university students read two expository texts: one for responding to a multiple-choice test and another for completing an essay test. They spoke their thoughts aloud while reading. After reading and completing the tests, they performed a written recall task. The results showed that text processing and comprehension did not differ between the two reading purposes; however, the relationship between text processing and comprehension changed according to the reading purpose. The findings suggest that reading purposes tied to different test types do not affect text processing and comprehension as a whole, but they may influence the reasons EFL readers engage in particular reading processes while reading a text. Based on these findings, this study offers pedagogical implications regarding how to encourage flexible text processing and comprehension in EFL readers, and what teachers should consider when selecting and conducting reading tasks.
    Download PDF (1073K)
  • Allan NICHOLAS
    2018 Volume 21 Pages 42-64
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    Speech act assessments have typically not focused on interaction, with instruments such as discourse completion tasks failing to capture lengthy sequences of talk. Further, such assessments typically focus on current independent performance. Dynamic assessment (DA) offers an alternative, assessing both current learner performance and future potential, while also promoting development. By applying DA through dynamically-administered strategic interaction scenarios (D-SIS)―an open role-play type task―multiple turns of talk can be captured, allowing assessment of interactional competence. This study profiles a single learner to investigate the effectiveness of DA in assessing and developing interactional competence in relation to the requesting speech act. The 20-year-old, intermediate-level participant was a second-year student at a Japanese university, enrolled in an intercultural communication program. The researcher and participant engaged in an eight-week developmental-experimental DA of requesting-in-interaction. Analysis here focuses on the DA phases of the study, in which the researcher and participant collaborated on D-SIS tasks. When the participant struggled, mediation was provided in order for the task to proceed. The mediation was analyzed to assess the efficiency with which the participant both identified the object (problem) of mediation and resolved the issue. A coding scheme was used to quantify learner development, in conjunction with a qualitative analysis of transcript data. Results showed evidence of learner development with regard to request-turn directness and the pre-requesting and pre-closing aspects of conversation, with the participant requiring less mediation, and less explicit types of mediation, over time. The findings provide support for the use of DA in assessing and developing interactional competence in EFL contexts.
    Download PDF (1072K)
  • Hideki IIMURA
    2018 Volume 21 Pages 65-81
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    The multiple-choice test format is widely used in major English proficiency tests, such as EIKEN, TOEIC®, and TOEFL®. To develop effective multiple-choice test items, plausible distractors (i.e., incorrect options) are required. This study aimed to investigate the plausibility of distractors in a multiple-choice listening test and the relationship between distractor plausibility and listening ability. Using data from 46 listening test items of the TOEIC® test administered to 199 Japanese university students, this study examined the characteristics of distractors that could attract test takers. The frequencies (i.e., the number of test takers who chose each distractor) were used as dependent variables. The following five distractor characteristics were used as predictor variables: (a) overlap, (b) synonym, (c) derivative, (d) negative, and (e) specific determiner. Results of multiple regression analyses showed that overlap (i.e., a distractor including the same words or phrases used in the text) was the most influential factor in making distractors plausible. The results also indicated that distractor plausibility varied according to the listening ability levels of test takers. (A minimal sketch of such a regression setup follows this entry.)
    Download PDF (945K)
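    The regression the abstract describes can be illustrated with a minimal sketch. Everything below is hypothetical: the abstract specifies only that distractor choice frequencies were regressed on five binary characteristics, so the data, the assumption of three distractors per item (option counts vary across TOEIC® listening parts), and the coefficients are invented for illustration.

    ```python
    import numpy as np

    # Hypothetical data: 46 items x 3 distractors = 138 rows.
    # Columns code the five characteristics from the abstract as 0/1:
    # overlap, synonym, derivative, negative, specific determiner.
    rng = np.random.default_rng(0)
    X = rng.integers(0, 2, size=(138, 5)).astype(float)

    # Simulated choice frequencies (out of 199 test takers), with
    # "overlap" given the largest effect, mirroring the reported result.
    y = 10 + 25 * X[:, 0] + 5 * X[:, 1] + rng.normal(0, 5, size=138)

    # Ordinary least squares with an intercept, via np.linalg.lstsq
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(dict(zip(["intercept", "overlap", "synonym", "derivative",
                    "negative", "determiner"], coef.round(2))))
    ```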
  • Akihiro MIKAMI
    2018 Volume 21 Pages 82-101
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    The aims of this study are to evaluate the content validity of a reflection tool for EFL teachers’ professional development in Japan, called the Self-Evaluation Checklist for EFL Teachers (SECEFLT), and to provide validity evidence for interpreting and using SECEFLT scores through Kane’s (2006) argument-based approach. The SECEFLT was originally developed by Mikami (2015) to promote EFL teachers’ reflection on their professional competencies. It was revised by Mikami (2018) through a construct validation process using both exploratory and confirmatory factor analyses. To gather further validity evidence related to the content aspect of validity for the revised SECEFLT, a survey was conducted with a panel of experts including six English language teachers (all English language education majors) at teacher education departments in national universities in Japan. The experts were asked to evaluate the extent to which each item in the revised SECEFLT was relevant to the content domain it aimed to measure, as well as the overall relevance of the revised SECEFLT to that domain. The results showed that each individual item in the scale was appropriate in terms of content validity, and that the whole scale was also appropriate when judged from the individual item evaluations. The experts likewise judged the revised SECEFLT to be content-valid when asked directly whether it was appropriate overall. Based on the study results, interpretive arguments are discussed using Kane’s (2006) framework for indicators of theoretical constructs. (A sketch of one common way to quantify such expert relevance ratings follows this entry.)
    Download PDF (1372K)
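    The abstract does not name the index or rating scale the expert panel used; one common convention for quantifying item-level relevance judgments is Lynn’s (1986) item content validity index (I-CVI), the proportion of experts rating an item as relevant (e.g., 3 or 4 on a 4-point scale). A minimal sketch under those assumptions, with made-up ratings:

    ```python
    # I-CVI: proportion of experts who rate an item 3 or 4 on a 4-point
    # relevance scale (assumed convention; not stated in the abstract).
    def item_cvi(ratings, threshold=3):
        return sum(r >= threshold for r in ratings) / len(ratings)

    # Hypothetical ratings from the six-expert panel for one SECEFLT item
    expert_ratings = [4, 4, 3, 4, 2, 3]
    print(round(item_cvi(expert_ratings), 2))  # -> 0.83
    ```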
  • Akiyo HIRAI
    2018 Volume 21 Pages 102-123
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    The present study aims to clarify the effects of study abroad (SA) duration and predeparture proficiency on the second language (L2) progress of Japanese students of English. As a first step toward this goal, studies on SA of one month or less (short-term), of more than one month to less than six months (middle-term), and of six months or more (long-term) were reviewed extensively. Next, 31 studies, all of which reported SA students’ pre- and post-test scores, were selected, and effect sizes of the students’ L2 gains were computed; these were then compared, by means of a meta-analysis, across the three lengths of SA and across three proficiency levels based on the students’ pre-test scores. The results showed that the magnitude of the effect of long-term SA was more than twice as great as that of middle-term SA and more than four times as great as that of short-term SA. The second factor analyzed in this study, students’ predeparture proficiency, did not seem to be an influential predictor of L2 gains. However, further analysis revealed an interaction between the two factors: low-proficiency students tended to attend shorter-term SA programs. (A worked example of one common effect-size computation follows this entry.)
    Download PDF (1366K)
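    The abstract does not state which effect-size formula was applied to the pre- and post-test scores; a common choice for single-group pre/post designs is a standardized mean gain (Cohen’s d) using a pooled standard deviation. A minimal sketch under that assumption, with invented scores:

    ```python
    import math

    def cohens_d(m_pre, sd_pre, m_post, sd_post):
        """Standardized mean gain with a pooled SD (one common convention)."""
        sd_pooled = math.sqrt((sd_pre ** 2 + sd_post ** 2) / 2)
        return (m_post - m_pre) / sd_pooled

    # Hypothetical pre/post proficiency scores for one SA study
    print(round(cohens_d(m_pre=480, sd_pre=35, m_post=501, sd_post=38), 2))
    # -> 0.57
    ```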
  • Izumi WATANABE-KIM
    2018 Volume 21 Pages 124-140
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    With the anticipated 2020 reform of the national testing of university applicants, test validation will play an even greater role in the Japanese educational context. English language assessment, in particular, has received more attention due to the inclusion of an oral communication test. In such times of change, this paper aims to: 1) provide a brief historical overview of how the concept, issues, and practice of test validation have evolved over the last century, and 2) offer suggestions as to how Japanese universities might proceed in selecting from an array of commercially available English language tests for admission purposes. Beginning with Messick’s (1989) definition of validation, this paper touches upon key concepts of both old and new validation models, including the criterion model, the content model, the construct model, and the argument-based validation model. Research findings call for a careful, systematic, and ethical approach to fulfilling the demands of the upcoming changes, one that takes into consideration fairness to test takers as well as the potential washback of test use.
    Download PDF (1345K)
  • Keisuke KUBOTA
    2018 Volume 21 Pages 141-159
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    Peer assessment is currently popular as an alternative evaluation tool in EFL classrooms. It is also recognized as a useful method for active and collaborative learning. There is, however, the issue of rater bias (Farrokhi, Esfandiari, & Schaefer, 2012; Matsuno, 2009), which can have negative consequences in peer assessment. One way to avoid such unfairness is to design a measurement scale that works equally well for inexperienced evaluators, such as the participants in peer assessment, and for experienced raters. This study thus aims to explore the potential of an assessment instrument that can help to mitigate rater bias in peer assessment. To do this, the author adopted an empirically derived, binary-choice, boundary-defined (EBB) rating scale. The investigation used Many-Facet Rasch Measurement (MFRM) to analyze 45 sets of essay writing data scored by 5 Japanese raters (3 experienced and 2 inexperienced). Rater severity or leniency and rater bias patterns were investigated across the two types of rating scales. The results revealed that despite the use of these rating scales, the same patterns of rater bias occurred as were found in previous studies (Farrokhi et al., 2012; Matsuno, 2009). Moreover, the present study confirmed that the language ability of inexperienced raters could influence their rating tendency; in other words, an inexperienced rater may behave like an experienced evaluator if his or her language proficiency is relatively high. On the basis of these findings, this paper discusses the implications of using empirically developed rating scales in EFL classroom writing assessments. (The core MFRM model is sketched after this entry.)
    Download PDF (1882K)
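    For reference, the core many-facet Rasch model (Linacre, 1989) that MFRM software estimates adds a rater facet to the rating-scale Rasch model; the exact facet specification used in the study (e.g., whether rating criteria formed a separate facet) is not stated in the abstract. In its standard rating-scale form:

    ```latex
    % P_{nijk}: probability that examinee n receives category k (rather
    % than k-1) from rater j on item i.
    \[
      \log \frac{P_{nijk}}{P_{nij(k-1)}} = B_n - D_i - C_j - F_k
    \]
    % B_n: examinee ability;  D_i: item difficulty;
    % C_j: rater severity;    F_k: step difficulty of category k.
    ```

    The rater severity parameters (C_j), together with bias interaction terms estimated from this model, are what permit the comparison of experienced and inexperienced raters described above.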
  • 2018 Volume 21 Pages 161-180
    Published: 2018
    Released on J-STAGE: December 24, 2018
    JOURNAL OPEN ACCESS
    The Rationale for the Establishment of the Japan Language Testing Association; Constitution of the Japan Language Testing Association; Guidelines for Contributors to the JLTA Journal; Rules and Regulations of the JLTA Best Paper Award; Editors and Referees for the JLTA Journal Vol. 21 (2018)
    Download PDF (1631K)