2019 Volume 29 Issue 3 Pages 238-246
Chemical substance names are written in many different ways, and the notation used depends on the author. This variation hinders the sharing of chemical knowledge, so automatic extraction of chemical substance names is useful for information sharing. To find a method for extracting chemical substance names from Japanese documents, we created a corpus of patent documents tagged with chemical substance names. We studied how to segment sentences into words and recognized chemical substance names by concatenating the segmented words using part-of-speech information. We also studied how to select chemical substance names from the concatenated word sequences, and compared the selection of chemical substance names with that of functional group names, which resemble chemical substance names.
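The concatenation step can be illustrated with a minimal sketch. It is not the authors' implementation: the part-of-speech labels, the set of joinable tags (`JOINABLE_POS`), the function name `concatenate_candidates`, and the hand-tagged example sentence are all assumptions made for illustration. In practice the tokens would come from a Japanese morphological analyzer.

```python
from typing import List, Tuple

# A token is a (surface form, part-of-speech tag) pair.
# The tag names below are hypothetical, not those of any specific analyzer.
Token = Tuple[str, str]

# Assumption: adjacent tokens with these tags may belong to one name.
JOINABLE_POS = {"noun", "number", "symbol", "prefix"}


def concatenate_candidates(tokens: List[Token]) -> List[str]:
    """Join each maximal run of joinable tokens into one candidate string."""
    candidates: List[str] = []
    current: List[str] = []
    for surface, pos in tokens:
        if pos in JOINABLE_POS:
            current.append(surface)
        elif current:
            # A non-joinable token ends the current run.
            candidates.append("".join(current))
            current = []
    if current:
        candidates.append("".join(current))
    return candidates


# Hand-tagged example: "2-クロロエタノールを水に溶解した"
tokens: List[Token] = [
    ("2", "number"), ("-", "symbol"), ("クロロ", "noun"), ("エタノール", "noun"),
    ("を", "particle"), ("水", "noun"), ("に", "particle"),
    ("溶解", "noun"), ("し", "verb"), ("た", "auxiliary"),
]

print(concatenate_candidates(tokens))
```

In this sketch the candidates include "溶解" (dissolution), which is not a chemical substance name. This is why a separate selection step over the concatenated candidates is needed.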