Abstract
The main challenge in hierarchical multi-label document classification is the means by which hierarchically organized labels are leveraged. In this paper, we propose to exploit dependencies among multiple labels to be output, which has not been considered in previous studies. To accomplish this, we first formalize the task as a structured prediction problem and propose (1) a global model that jointly outputs multiple labels and (2) a decoding algorithm that finds an exact solution with dynamic programming. We then introduce features that capture inter-label dependencies. Experiments show that these features improve performance while reducing the model size.