Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
38th (2024)
Session ID : 4A1-GS-6-05

Search Query Expansion Method for Patent Documents Combining Large Language Models and Thesaurus
*Kaede MORI, Hirofumi NONAKA, Asahi HENTONA, Seiya KAWANO, Koichiro YOSHINO, Koji MARUSAKI, Shotaro KATAOKA
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

Patent retrieval is the process of searching patent databases for information on technologies, inventions, inventors, and applicants. Because a court finding of patent infringement can result in substantial damages or licensing fees, thorough prior art searches are crucial. However, patent documents use distinctive vocabulary and their number is vast, which makes the search process costly. Several methods aim at exhaustive search by expanding search queries, but they generally struggle with complex vocabulary that appears in only a small number of patents. This study therefore proposes a query expansion method that combines thesauruses and large language models (LLMs). As a foundational analysis of the method, it examines the output tendencies of LLMs and the independence and co-occurrence rates of the new words generated by existing thesauruses and by LLMs. The analysis found that new words generated by LLMs had low co-occurrence with those from existing thesauruses. The success of LLMs in generating new vocabulary suggests the potential for comprehensive patent searches that accommodate the distinctive vocabulary and complex expressions of patent documents.
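
The abstract does not state how the co-occurrence rate between thesaurus terms and LLM-generated terms is computed. As a rough illustration only, one possible reading is the share of LLM-generated terms that an existing thesaurus also lists. The sketch below assumes that reading; the function names (`normalize`, `overlap_rate`, `combined_expansion`) and the example terms are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Set


def normalize(terms: Iterable[str]) -> Set[str]:
    """Lower-case and strip terms so trivially different spellings match."""
    return {t.strip().lower() for t in terms if t.strip()}


def overlap_rate(thesaurus_terms: Iterable[str], llm_terms: Iterable[str]) -> float:
    """Fraction of LLM-generated terms that the thesaurus also lists.

    This definition is an assumption made for illustration; the paper
    may measure co-occurrence differently.
    """
    thesaurus = normalize(thesaurus_terms)
    llm = normalize(llm_terms)
    if not llm:
        return 0.0
    return len(llm & thesaurus) / len(llm)


def combined_expansion(
    query: str, thesaurus_terms: Iterable[str], llm_terms: Iterable[str]
) -> List[str]:
    """Return the query, then thesaurus terms, then LLM-only terms, deduplicated."""
    seen: Set[str] = set()
    expanded: List[str] = []
    for term in [query, *thesaurus_terms, *llm_terms]:
        key = term.strip().lower()
        if key and key not in seen:
            seen.add(key)
            expanded.append(term.strip())
    return expanded


# Hypothetical example data, not results from the study.
thesaurus_terms = ["bolt", "screw", "rivet"]
llm_terms = ["Screw", "clamping member", "securing means", "locking element"]

rate = overlap_rate(thesaurus_terms, llm_terms)
expanded_query = combined_expansion("fastener", thesaurus_terms, llm_terms)
```

Under this reading, a low rate means the LLM mostly contributes terms the thesaurus lacks, which is the situation in which combining the two sources widens the query the most.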

© 2024 The Japanese Society for Artificial Intelligence