2021 Volume 28 Issue 2 Pages 350-379
In recent years, the amount of information on the Internet has increased exponentially. Consequently, automatic article summarisation technology will be indispensable. In this study, we propose a data augmentation method for an automatic summarisation system. The proposed method removes the least important sentence from an article. We used a topic model to determine the importance of sentences in articles. The Luhn and LexRank methods were used as comparative methods for determining sentence importance. Additionally, we used Easy Data Augmentation (EDA) techniques as a comparison method. EDA is a data augmentation method applied to document classification. A comparative experiment was performed using input datasets of 28,000, 57,000, and 287,226 articles. The Luhn and LexRank methods always produced the worst results, while EDA sometimes performed worse than the baseline method without data augmentation. The proposed method performed the best in all cases.
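
The removal step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the study scores sentences with a topic model, whereas this sketch assumes a simple word-frequency score as a stand-in so that it runs without external libraries. The function names (`split_sentences`, `score_sentences`, `augment_by_removal`) and the naive sentence splitter are hypothetical choices made for illustration only.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def score_sentences(sentences):
    # Stand-in importance score (NOT the topic model used in the study):
    # the mean article-level frequency of the words in each sentence.
    tokenized = [re.findall(r"[a-z0-9]+", s.lower()) for s in sentences]
    freq = Counter(word for tokens in tokenized for word in tokens)
    return [
        sum(freq[word] for word in tokens) / len(tokens) if tokens else 0.0
        for tokens in tokenized
    ]


def augment_by_removal(article):
    # Return a copy of the article with its lowest-scoring sentence removed.
    sentences = split_sentences(article)
    if len(sentences) < 2:
        # Nothing sensible to remove from a one-sentence article.
        return article
    scores = score_sentences(sentences)
    drop = min(range(len(sentences)), key=scores.__getitem__)
    return " ".join(s for i, s in enumerate(sentences) if i != drop)


if __name__ == "__main__":
    original = "Cats chase mice. Cats eat mice. Dogs bark."
    print(augment_by_removal(original))
```

Under these assumptions, each article yields one additional, slightly shorter training input. Swapping `score_sentences` for a topic-model, Luhn, or LexRank scorer would change only which sentence is judged least important, not the removal step itself.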