The Transactions of Human Interface Society (ヒューマンインタフェース学会論文誌)
Online ISSN : 2186-8271
Print ISSN : 1344-7262
ISSN-L : 1344-7262
Regular Paper
Extraction of Conceptual Differences between Words in Translation Pairs
西村 一球, 村上 陽平, Pituxcoosuvarn Mondheera

2025, Vol. 27, No. 2, pp. 125-132

Translated Abstract

A word in one language and its translation in another do not necessarily represent the same concept, because meanings and cultural contexts are asymmetric across languages; this is especially true of polysemous words. As machine translation has become more accurate in recent years, it is increasingly used to support multilingual communication. However, such conceptual differences can cause misunderstandings in that communication. We therefore propose a method for extracting conceptual differences in translation pairs, which quantifies the concepts represented by words using concept dictionaries. Specifically, the method uses WordNet and Multilingual-WordNet, a multilingual version of WordNet. The concept of each word in Japanese, Chinese, and Indonesian is quantified in terms of synsets, the smallest unit of concept in WordNet. This makes it possible to extract conceptual differences between words whose concepts overlap across these languages. As a result, the method identifies 27,005 of 104,626 Japanese-Chinese word pairs, 60,581 of 173,233 Japanese-Indonesian word pairs, and 14,175 of 42,468 Chinese-Indonesian word pairs in WordNet as conceptually different.
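The idea of comparing words through their synsets can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the difference is quantified, so the Jaccard distance used here is an assumption, and the lexicons, word labels (`a1`, `b1`, ...) and synset IDs (`syn1`, ...) are hypothetical toy data rather than entries from WordNet. Each word is represented by the set of synsets it belongs to; a cross-language pair is treated as a translation pair when the two sets overlap, and as conceptually different when the sets overlap but are not identical.

```python
from itertools import product

# Hypothetical toy lexicons: word label -> set of synset IDs.
# These are illustrative placeholders, not real WordNet data.
LEX_A = {"a1": {"syn1", "syn2"}, "a2": {"syn3"}, "a3": {"syn4"}}
LEX_B = {"b1": {"syn1"}, "b2": {"syn3"}, "b3": {"syn5"}}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|; assumed here as the difference measure."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def conceptually_different_pairs(lex_a, lex_b):
    """List word pairs whose synset sets overlap but are not identical."""
    pairs = []
    for (word_a, syns_a), (word_b, syns_b) in product(lex_a.items(), lex_b.items()):
        shares_concept = bool(syns_a & syns_b)  # at least one common synset
        if shares_concept and syns_a != syns_b:
            pairs.append((word_a, word_b, jaccard_distance(syns_a, syns_b)))
    return sorted(pairs)


if __name__ == "__main__":
    for word_a, word_b, distance in conceptually_different_pairs(LEX_A, LEX_B):
        print(word_a, word_b, distance)
```

In this toy data, `a1` and `b1` share `syn1` but `a1` also covers `syn2`, so the pair is reported; `a2` and `b2` have identical synset sets and are not reported; `a3` and `b3` share no synset and are not considered a translation pair at all.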

© Human Interface Society