Abstract
Speech recognition software is widely adopted for its efficiency and convenience, yet its accuracy remains inconsistent across speaker groups. While prior studies report strong performance with native English speakers, few have examined its effectiveness with indigenized English accents. This study investigates the intelligibility of English spoken by Indonesian learners using two automatic speech recognition (ASR) platforms: Google Voice Typing and Macintosh Dictation. A total of 27 Indonesian EFL speakers read 27 English words, covering both monophthongs and diphthongs, embedded in carrier sentences. The same word list was also recorded by native American English speakers for comparison. Each utterance was transcribed by both ASR platforms. Because the study employed an Isolated Word Recognition (IWR) approach, performance was measured as recognition accuracy. Differences in recognition accuracy between the two ASR platforms were analyzed using the Wilcoxon signed-rank test, while gender differences were examined using the Mann–Whitney U test. The results showed that Google Voice Typing consistently outperformed Macintosh Dictation in transcribing Indonesian EFL speech, both across all vowels combined and for each vowel type. Both systems recognized diphthongs more accurately than monophthongs. Native English speech was transcribed more accurately than Indonesian speech on both platforms. Additionally, both male and female participants achieved significantly higher recognition accuracy with Google Voice Typing than with Macintosh Dictation, although no significant gender-based differences were observed within either ASR system. These findings suggest that Indonesian-accented English presents intelligibility challenges for current ASR technologies, highlighting the need for more inclusive speech recognition systems that can support greater linguistic diversity.
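As an illustration of the kind of analysis described above, the following minimal Python sketch scores isolated-word recognition accuracy and computes the Wilcoxon signed-rank statistic for paired per-speaker scores from two platforms. It is not the authors' pipeline. The function names, the exact-match scoring rule, and the sample data are all assumptions made for illustration, since the abstract does not specify how transcripts were scored.

```python
from typing import Sequence


def recognition_accuracy(targets: Sequence[str], transcripts: Sequence[str]) -> float:
    """Fraction of target words whose ASR transcript matches exactly.

    Assumption: a match is a case-insensitive exact string match.
    """
    if len(targets) != len(transcripts):
        raise ValueError("targets and transcripts must have the same length")
    hits = sum(
        t.strip().lower() == h.strip().lower()
        for t, h in zip(targets, transcripts)
    )
    return hits / len(targets)


def wilcoxon_signed_rank(x: Sequence[float], y: Sequence[float]) -> float:
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired samples.

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they occupy.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return min(w_plus, w_minus)


# Hypothetical data: words recognized correctly (out of 27) for four
# speakers on two platforms. These numbers are invented for the example.
platform_a = [20, 18, 22, 15]
platform_b = [14, 16, 23, 15]
w_stat = wilcoxon_signed_rank(platform_a, platform_b)
```

A p-value would then be read from the signed-rank distribution for the number of non-zero differences. In practice a statistics library such as SciPy would typically handle both this test and the Mann–Whitney U comparison between gender groups.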