Mathematical Linguistics
Online ISSN : 2433-0302
Print ISSN : 0453-4611
Volume 33, Issue 3
Special Issue 2021 on the "Recent Quantitative Vocabulary Studies"
Special Issue 2021 on the "Recent Quantitative Vocabulary Studies"
  • Makoto Yamazaki
    2021 Volume 33 Issue 3 Pages 113
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    Download PDF (100K)
  • Applying Menzerath–Altmann's Law to Translated Texts in Japanese
    Haruko Sanada
    Article type: Invited Paper (A) to the Special Issue 2021
    2021 Volume 33 Issue 3 Pages 114-129
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    To understand how the lengths of the linguistic elements in a sentence are balanced, we conducted a survey using the Menzerath–Altmann law (MAL), which is often used in quantitative linguistics research in Europe and Canada. Using texts from two editions of the Japanese translation of a French book, Le Petit Prince, as our study material, we constructed a dataset in which sentence length, measured in clauses, is x, and the length of the clauses constituting the sentence is y when measured in morphemes and y' when measured in characters. Regression analyses were performed on this dataset using the MAL equation. As Altmann stated, the association between sentence length and clause length is expressed by a decreasing function. The MAL holds even in translated texts. The regression curves of the two editions tended to be similar. However, some data deviated from the decreasing MAL function, specifically cases with additions, long-winded explanations, parenthetical expressions, or words expressing thoughts without quotation marks. These cases can be regarded as intended to produce an expressive effect by breaking the balance of length among linguistic elements, which is typical of Japanese expression.
    Download PDF (919K)
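As a rough illustration of the regression described in the abstract above, the sketch below fits the truncated form of the Menzerath–Altmann equation, y = a * x^b, by least squares on log-transformed values. The abstract does not state which form of the MAL equation was used, so the two-parameter form, the function name `fit_mal`, and the synthetic data points are all assumptions made for illustration; they are not the author's data or method.

```python
import math

def fit_mal(xs, ys):
    """Fit y = a * x**b by ordinary least squares on (ln x, ln y).

    Assumption: the truncated two-parameter MAL form. The paper may use
    a fuller form with an additional exponential term.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    b = sxy / sxx                      # slope in log-log space
    a = math.exp(mean_y - b * mean_x)  # intercept, back-transformed
    return a, b

# Synthetic points generated from a = 12, b = -0.3 (not data from the paper).
xs = [1, 2, 3, 4, 5, 6]             # sentence length in clauses
ys = [12 * x ** -0.3 for x in xs]   # mean clause length in morphemes

a, b = fit_mal(xs, ys)
print(f"a = {a:.3f}, b = {b:.3f}")  # a negative b means a decreasing function
```

A negative fitted b corresponds to the decreasing relationship the abstract reports: the more clauses a sentence has, the shorter its clauses tend to be.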
  • A Welfare Linguistic Analysis of Data from National Surveys
    Aimi Kuya
    Article type: Invited Paper (A) to the Special Issue 2021
    2021 Volume 33 Issue 3 Pages 130-145
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    The present research, conducted from the viewpoint of welfare linguistics, discusses how loanwords from English should be adopted for public communication in Japan. It explores a new approach to interpreting data from national surveys on loanwords from English using logistic regression analysis. The analysis reveals that there is often a discrepancy between the general public and regional government workers in the level of preference for the adoption of homophonic translation (a phonetic equivalent of the English word in katakana). It also reveals that the general public’s level of comprehension, or literacy, of a certain homophonic translation cannot be equated with their level of preference for the adoption of that word. These findings suggest that better public communication could result if those who publish announcements, e.g., regional government workers, take these issues into consideration.
    Download PDF (1091K)
  • Mizuho Imada
    Article type: Paper (A) to the Special Issue 2021
    2021 Volume 33 Issue 3 Pages 146-161
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    We examined the reliability of 24 different lexical diversity indices using children's compositions. Ideally, a lexical diversity index is uncorrelated with text length, but in children's writing the absence of correlation does not ensure the reliability of the index, because children's language ability is correlated with both text length and lexical diversity. Therefore, we measured the indices cumulatively within the same composition and evaluated how small their variation was using LMM. In addition, we attempted to modify three types of indices by using LMM to determine their parameters. The results showed that the Simpson indices, the iterative measurement methods, and the LMM-corrected indices performed relatively well. We also examined the relationship of these indices to the author's grade level and the teacher's evaluation of the writing; lexical diversity increased with grade level, and the teacher's evaluation did not place much importance on lexical diversity.
    Download PDF (1638K)
  • An Emphasis on the Sampling Methods for Quantitative Lexicology
    Michimasa Kanno, Sonomi Kikuchi
    Article type: Paper (A) to the Special Issue 2021
    2021 Volume 33 Issue 3 Pages 162-177
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    There has been much research on the proportions of parts of speech in the field of quantitative lexicology. However, the question of how much variation arises from different calculation methods has not been considered, even though the proportions of parts of speech depend on them. Using waka poetry data, this study empirically examined the differences in proportions across calculation methods in terms of three issues: word unit, variant texts, and statistical sampling. The examination showed that the proportions determined by two different word units differed considerably, while the proportions calculated using two variant texts of the same waka collection did not change substantially. Moreover, compared to the proportions in the complete survey, the proportions in the sample survey contained statistical errors, but the differences did not change the overall conclusion. Furthermore, when the proportions for the entire waka collection were estimated from the sample proportions, sampling pieces of waka by cluster sampling yielded unexpectedly better statistical precision than sampling words by simple random sampling.
    Download PDF (2062K)
  • Their Representative Senses and Typical Usages
    Sachi Kato, Masayuki Asahara
    Article type: Invited Paper (B) to the Special Issue 2021
    2021 Volume 33 Issue 3 Pages 178-193
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    We performed large-scale impression rating experiments to investigate the evaluation of the senses of polysemous words in example sentences. For the impression rating, we used 5,125 example sentences for 530 words in the IPA Lexicon of Basic Japanese Verbs, Adjectives, and Nouns (IPAL). In addition, we annotated the sense tags of Word List by Semantic Principles and their representative senses given by Yamazaki and Kashino (2017). A contrastive survey between the representative senses and impression ratings was performed. Further, linear regression analysis was performed to estimate the degree of representativeness and extract their typical usages. We found that differences in contextual words, including case particles, affect the recognition of word senses.
    Download PDF (870K)
  • Naoki Nakamata, Yukiko Koguchi, Madoka Konishi, Hajime Tateishi, Hitos ...
    Article type: Resource to the Special Issue 2021
    2021 Volume 33 Issue 3 Pages 194-204
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    This article describes how to create and use the “Topic-Vocabulary Table for Japanese Language Education,” which was developed from a conversation corpus. To create this table, several workers manually checked the “Nagoya University Conversation Corpus” and divided it into subcorpora by topic. Then, log-likelihood ratios were calculated for each subcorpus. Finally, an Excel table was created with 97 topics arranged horizontally and 3,324 words arranged vertically. This table can be used for Japanese language education in two directions: from topic to word, and from word to topic. In the former, users can learn function words and language behaviors in addition to words frequently used in the topic. In the latter, users can notice that each synonym tends to be used in different topics, and that function words are biased toward particular topics.
    Download PDF (783K)
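The abstract above does not specify how the log-likelihood ratios were computed. As a sketch, assuming the two-corpus log-likelihood (G2) statistic commonly used for keyword extraction in corpus linguistics, a word's score for one topic subcorpus against the rest of the corpus could be computed as follows. The function name `log_likelihood` and all counts are hypothetical.

```python
import math

def log_likelihood(freq_sub, size_sub, freq_rest, size_rest):
    """Two-corpus log-likelihood (G2) for one word.

    freq_sub / size_sub:   word count and total tokens in the topic subcorpus
    freq_rest / size_rest: word count and total tokens in the rest of the corpus
    """
    total_freq = freq_sub + freq_rest
    total_size = size_sub + size_rest
    # Expected counts if the word were spread evenly across both parts.
    exp_sub = size_sub * total_freq / total_size
    exp_rest = size_rest * total_freq / total_size
    g2 = 0.0
    for observed, expected in ((freq_sub, exp_sub), (freq_rest, exp_rest)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

# Hypothetical counts: a word occurring 30 times in a 10,000-token topic
# subcorpus and 20 times in the remaining 90,000 tokens.
score = log_likelihood(30, 10_000, 20, 90_000)

# The same relative frequency in both parts gives a score of zero.
even = log_likelihood(10, 10_000, 90, 90_000)
print(score, even)
```

The statistic is unsigned, so a high score only says that the word's frequency differs between the two parts; whether the word is over- or under-represented in the topic has to be read from the relative frequencies.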
General Topics
  • Naoki Nakamata
    Article type: Tutorial
    2021 Volume 33 Issue 3 Pages 205-213
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    Though percentages are widely used to express ratios of numerical data, occasionally, they are inappropriately used. This article summarizes the significance of percentages and important points to bear in mind when using them to express ratios. First, when using percentages in a breakdown of data, it is important to bear in mind that percentages compress information on the numbers of instances when the size of the dataset is greater than 100. Thus it is inappropriate to use percentages with very small datasets or to give more decimal places than precision warrants. When using ratios to compare data for different populations, it is important to be aware of what the ratios are for. Also, a difference in values expressed as a percentage should not be expressed by a percentage itself, but by percentage points. Finally, this article introduces units of ratios other than percentages, including per million words (PMW), which is widely used in corpus linguistics.
    Download PDF (863K)
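The distinction between percent and percentage points, and the per-million-words unit mentioned in the abstract above, can be illustrated with a short sketch. The function names and all figures below are hypothetical and are not taken from the article.

```python
def percentage_point_diff(p_old, p_new):
    """Difference between two percentages, in percentage points."""
    return p_new - p_old

def relative_change_percent(p_old, p_new):
    """Relative change from p_old to p_new, in percent of p_old."""
    return (p_new - p_old) / p_old * 100

def per_million_words(count, corpus_size):
    """Normalized frequency: occurrences per million words (PMW)."""
    return count * 1_000_000 / corpus_size

# A rate rising from 20% to 30% is a 10-percentage-point increase,
# which is a 50% increase relative to the starting value.
pp = percentage_point_diff(20.0, 30.0)
rel = relative_change_percent(20.0, 30.0)

# Hypothetical corpus: 150 occurrences in 2,500,000 words.
pmw = per_million_words(150, 2_500_000)
print(pp, rel, pmw)
```

Reporting the first figure as "a 10% increase" would be ambiguous, which is the kind of misuse the tutorial warns against.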
  • International Quantitative Linguistics Conference held online from September 9 to 11, 2021
    Haruko Sanada
    Article type: Conference Report
    2021 Volume 33 Issue 3 Pages 214-218
    Published: December 20, 2021
    Released on J-STAGE: December 20, 2022
    JOURNAL OPEN ACCESS
    The International Quantitative Linguistics Conference 2021 (QUALICO 2021) was held online from September 9 to 11, 2021. The conference was supported by the International Quantitative Linguistics Association (IQLA), the National Institute for Japanese Language and Linguistics (NINJAL), and the Center for Corpus Development, NINJAL. It was originally planned for September 2020 and was postponed for a year because of COVID-19. It was the first QUALICO hosted in Asia and also the first held online. Two keynote lecturers were invited, and 39 papers were presented as talks and 20 in a poster session. Of 117 participants, 71 were presenters and 44 were students. Participants joined from 26 countries and regions, including EU countries, Russia, Canada, the U.S.A., the Republic of South Africa, China, Taiwan, and Japan. Many papers focused on classical or fundamental topics such as linguistic laws (e.g., Zipf's law and the Menzerath–Altmann law), Synergetic Linguistics, and valency theory. Other papers focused on applied topics such as authorship attribution, comparative language studies, and studies using corpora. The IQLA Council Business Meeting was also held, and new board members were selected. The next conference is planned to be held in Europe in 2023 or 2024. A call for papers for the next conference will be announced on the IQLA website.
    Download PDF (844K)