Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 4Xin2-06

J-NER: Benchmark Dataset Considering Extended Named Entity in Named Entity Recognition for Large Language Models
*Yusuke SHIBUYA, Hiroto SHIBUYA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

An important aspect of understanding a language model is ascertaining whether it can recognize the structure of sentences and the connections within them. Named entities, such as place names and person names, are among the main components of language, so how language models recognize named entities is an important research theme for understanding them. Named entity recognition is equally important for large language models, yet compared with general language models there is still room for research in areas such as the development of datasets for named entity recognition. In this study, we therefore create a new benchmark dataset, "J-NER", which includes named entities from the training data of large language models as well as extended named entities. Using this dataset, we evaluate the large language models Gemini Pro, GPT-3.5, and ELYZA and find variation in their accuracy and F1 scores. This suggests that J-NER is effective for measuring the named entity recognition ability of large language models, and we expect that using J-NER will yield deeper insights into that ability.
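
As an illustration of the F1 score mentioned in the abstract, the following is a minimal sketch of entity-level precision, recall, and F1 under exact matching. The abstract does not describe J-NER's data format or scoring protocol, so the `(surface, type)` representation, the function name `entity_f1`, the type labels, and the example entities are all assumptions made for illustration, not the authors' method.

```python
from typing import Set, Tuple

# Assumed representation: an entity is a (surface text, entity type) pair.
Entity = Tuple[str, str]


def entity_f1(gold: Set[Entity], predicted: Set[Entity]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) under exact matching of text and type."""
    # A prediction counts as correct only if both text and type match a gold entity.
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold annotations and model output for one sentence.
gold = {("東京", "Location"), ("夏目漱石", "Person"), ("トヨタ", "Organization")}
predicted = {("東京", "Location"), ("夏目漱石", "Organization")}

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this hypothetical case one of the two predictions is correct and one of the three gold entities is found, so precision is 1/2 and recall is 1/3. The mistyped entity counts against both, which is why a per-model F1 score can separate models that find the right spans from models that also assign the right types.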

© 2024 The Japanese Society for Artificial Intelligence