2020 Volume 27 Issue 2 Pages 383-409
Time is an important concept in human cognition and is fundamental to a wide range of reasoning tasks in the clinical domain. Results of the Clinical TempEval 2016 challenge, a set of shared tasks that evaluate temporal information extraction systems in the clinical domain, indicate that current state-of-the-art systems perform well at identifying events and time expressions but poorly at temporal relation extraction. This study aims to identify and analyze the reasons for this uneven performance. It adapts a general-domain, tree-based bidirectional long short-term memory recurrent neural network model for semantic relation extraction to the task of temporal relation extraction in the clinical domain, and tests the system in binary and multi-class classification settings, experimenting with general and in-domain word embeddings. Its results outperform the best Clinical TempEval 2016 system and the current state-of-the-art model. However, there is still a significant gap between system and human performance. Consequently, this study delivers an in-depth analysis of the results, identifying a high incidence of nouns as events and class overlap as major challenges in this task.