2016 Volume 15 Issue 2 Pages 83-90
This paper proposes a new system to estimate the emotional value content of short sentences. The proposed system uses the co-occurrence strength between adjectives and nouns, based on similarity measurements and semantic relationships, to find semantic associations between adjectives and an input sentence. First, keywords extracted from the input sentence are used to query adjectives from the Google N-gram corpus using keyword-based templates. The dataset for the association measurement step is then collected using templates created from each keyword. Co-occurrence frequencies of the adjectives and keywords are obtained; to make this step more effective, patterns that express the semantic relationships between them are also considered. The semantic similarity scores, computed by several modified computational measures, and the pattern frequencies are used as training features, both to classify adjectives into two classes (association and non-association) and to obtain association scores. For each keyword, the list of adjective-keyword pairs is then sorted in decreasing order of association score. Finally, a rank aggregation method, Borda's method, which generates an acceptable ranking from a given set of rankings, is applied, and the top n_a adjectives (n_a = 5 in this paper) are chosen according to the estimated values. The main contribution of this work is an effective method for selecting adjectives for the input sentence of an impression estimation system. We evaluated our approach on two tasks: the quality of the association measurement and the efficiency of the proposed method. The evaluation of association classification on 4,500 word pairs shows an average accuracy of 87.0%. For the performance of the proposed method, we carried out subjective experiments and obtained fairly good results.
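The final rank aggregation step can be illustrated with a minimal sketch of a Borda count. The abstract does not state which Borda variant or tie-breaking rule is used, so the scoring below (an adjective at 0-based position i in a list of length n receives n - i points, adjectives absent from a list receive none, and ties are broken alphabetically) is an assumption, as are the function name `borda_aggregate` and the example adjectives.

```python
from collections import defaultdict


def borda_aggregate(rankings, top_n=5):
    """Merge several ranked adjective lists into one list via a Borda count.

    Assumption: each inner list is one keyword's adjectives, already sorted
    by decreasing association score. This is an illustrative variant, not
    necessarily the exact scoring used in the paper.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, adjective in enumerate(ranking):
            # Higher-ranked adjectives earn more points: n, n-1, ..., 1.
            scores[adjective] += n - position
    # Sort by total points (descending), then alphabetically to break ties.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [adjective for adjective, _ in ordered[:top_n]]


# Hypothetical per-keyword rankings for a three-keyword sentence.
rankings = [
    ["warm", "bright", "quiet", "cold"],
    ["bright", "warm", "cold", "quiet"],
    ["bright", "quiet", "warm", "cold"],
]
print(borda_aggregate(rankings, top_n=2))
```

With these made-up lists, "bright" collects 11 points and "warm" 9, so they are the two adjectives selected. Setting `top_n=5` would correspond to the n_a = 5 used in the paper.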