Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
33rd (2019)
Session ID : 4Rin1-23

Product Name Extraction from Product Entries on Electronic Commerce Pages
*Peinan ZHANG
CONFERENCE PROCEEDINGS FREE ACCESS

Abstract

We propose the task of identifying the product name in the title of an e-commerce (EC) page. On EC sites, sellers design their listings to increase the visibility of their products in search results, and a common technique is to add extra information to the product page title. However, adding many keywords can make a title so complicated that buyers have difficulty distinguishing the product name within it. Extracting product names is therefore important, but it poses challenges, especially when titles are in Japanese: (1) most titles do not follow standard grammatical structure, and (2) diverse characters, such as kanji, kana, alphanumerics, and symbols, often appear in a single title. These properties make it hard for models to identify word boundaries and lead to incorrect learning. In this work, we create a corpus and evaluate several conventional approaches as a basic analysis. The results show that the task remains challenging: an existing named entity recognition approach that performs very well on some open datasets achieves an F1 score of only 23.0 on our dataset.
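To make challenge (2) concrete, the following minimal sketch splits a title into runs of the same coarse character class. It is an illustration only, not the method evaluated in the paper: the function names `char_class` and `script_runs`, the five class labels, and the sample title are all assumptions introduced here, and the classification relies solely on Python's standard `unicodedata` names.

```python
import unicodedata
from itertools import groupby


def char_class(ch: str) -> str:
    """Return a coarse, hypothetical script class for one character."""
    if ch.isspace():
        return "space"
    name = unicodedata.name(ch, "")
    if name.startswith("CJK UNIFIED IDEOGRAPH"):
        return "kanji"
    # Covers hiragana, katakana, and the half-width katakana forms.
    if "HIRAGANA" in name or "KATAKANA" in name:
        return "kana"
    if ch.isalnum():
        return "alnum"
    return "symbol"


def script_runs(title: str) -> list[tuple[str, str]]:
    """Group consecutive characters of the same class into (text, class) runs."""
    return [("".join(group), cls) for cls, group in groupby(title, key=char_class)]


if __name__ == "__main__":
    # Made-up title in the style of an EC listing (not taken from the corpus).
    for text, cls in script_runs("【送料無料】緑茶 ABC-123"):
        print(f"{cls:6s} {text!r}")
```

Even this short made-up title breaks into eight runs across four character classes, which hints at why boundary detection in such titles is difficult. The classes are deliberately coarse; a real system would likely need finer distinctions.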

© 2019 The Japanese Society for Artificial Intelligence