Abstract
This paper proposes a method for describing syntactic rules in order to analyze sentences in film scripts and to manage the stored rules. Because film scripts consist of spoken language, whose characteristics differ from those of written language, their sentences often contain ellipses of various elements and inversions. Parsing such sentences correctly requires rules tailored to these characteristics. To address this problem, we developed a parsing method that allows regular expressions to be used without restriction in the description of syntactic rules. With regular expressions, a syntactic rule can contain an expression such as "(auxiliary-verb | ending-particle)". Because such an expression applies whether or not an ellipsis occurs, the method handles ellipses without difficulty. Because the expression also matches many combinations of parts of speech, the number of rules is expected to decrease dramatically. Based on this idea, we developed a system with approximately 3,000 syntactic rules, which were developed in advance and checked by hand on the basis of about 40,000 sentences from 21 film scripts. The rules were described in detail so as to cover various styles of speech. To verify the effectiveness of the syntactic rules, we ran an experiment on a new set of film scripts. As a result, we obtained an analysis system with high accuracy. We also found that adopting regular expressions reduced the cost of maintaining the syntactic rules, because a single rule could express about 10 of the part-of-speech lists that were analyzed.
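As an illustration of the idea only, the following minimal sketch shows how one rule written as a regular expression over part-of-speech tags can stand in for several flat rules. It is not the system described in this paper: the tag set, the rule pattern, and the function names `rule_matches` and `covered_sequences` are all assumptions made for the example, and the tag sequence is simply joined with spaces so that Python's standard `re` module can match it.

```python
import re
from itertools import product

# Hypothetical tag set, chosen only for this example.
TAGS = ["noun", "verb", "auxiliary-verb", "ending-particle"]

# Hypothetical rule: a verb, optionally followed by either an auxiliary verb
# or an ending particle. The optional group is what lets the same rule apply
# whether or not the trailing element has been elided.
PREDICATE = re.compile(r"verb( (auxiliary-verb|ending-particle))?")


def rule_matches(pattern, tags):
    """Return True if the whole tag sequence is accepted by the rule."""
    return pattern.fullmatch(" ".join(tags)) is not None


def covered_sequences(pattern, tagset, max_len):
    """Enumerate every tag sequence up to max_len that the rule accepts."""
    covered = []
    for length in range(1, max_len + 1):
        for seq in product(tagset, repeat=length):
            if rule_matches(pattern, seq):
                covered.append(list(seq))
    return covered


if __name__ == "__main__":
    for seq in covered_sequences(PREDICATE, TAGS, 3):
        print(" ".join(seq))
```

Without the regular expression, each accepted sequence in this toy setting would need its own flat rule, so the enumeration gives a rough sense of how one pattern replaces several part-of-speech lists.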