As an international language, English has become increasingly important for non-native speakers. Writers should therefore consider the needs of non-native speakers, i.e., write English in a way that a non-native audience can readily understand. In this paper, we investigate the position of six discourse markers in texts whose target audience is intermediate non-native speakers of English. The six discourse markers are: because and since, which represent the “reason” relation; if and when, which represent the “condition” relation; and although and while, which represent the “concession” / “contrast” relation. First, we created a corpus (200,000 words) of texts in the domain of natural and pure science whose target audience is intermediate non-native speakers. We selected 1072 examples of the six discourse markers from the corpus and annotated them. Second, we applied the machine learning program C4.5 to induce classification models of the position of the discourse markers. We then used a Support Vector Machine (SVM) to verify the experimental results obtained with C4.5. To our knowledge, this study is the first to explore the position of discourse markers in texts whose target audience is intermediate non-native speakers. The experimental results can be applied to text generation and homepage creation for intermediate non-native speakers of English.
We propose an application-independent Sinhala character input method called Sri Shell, with a principled key assignment based on the phonetic transcription of Sinhala characters. A good character input method should fulfill two criteria: efficiency and user-friendliness. We introduce several measures to quantify the efficiency and user-friendliness of Sinhala character input methods. Experimental results demonstrate the efficiency and user-friendliness of the proposed method.