2025 Volume E108.D Issue 4 Pages 392-402
Autoregressive probability estimation of data sequences is a fundamental task in deep neural networks and has been widely used in applications such as data compression and generation. Because causality constrains it to a sequential, iterative process, its inference throughput is inherently low. One way to achieve high throughput is to multiplex many sequences on a GPU. To maximize inference throughput within the limited resources of a GPU, the computational cost that grows with deeper layers must be avoided, and the memory consumed at higher degrees of multiplexing must be reduced. In this paper, we propose Scale Causal Blocks (SCBs), basic building blocks for deep neural networks that aim to significantly reduce computational and memory costs compared with conventional techniques. Evaluation results show that the proposed method is one order of magnitude faster than a conventional computationally optimized Transformer-based method while maintaining comparable accuracy, and also exhibits better learning convergence.
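The sequential bottleneck the abstract refers to can be seen in a minimal sketch of autoregressive decoding. The sketch below is illustrative only: `TinyCausalModel` is a hypothetical stand-in (a cumulative-sum toy, not the paper's Scale Causal Blocks), and all shapes and names are assumptions. It shows why the time loop cannot be parallelized, and why batching (multiplexing) sequences on a GPU trades memory for throughput.

```python
import torch
import torch.nn as nn

# Hypothetical toy model, NOT the paper's SCB architecture: embeds tokens
# and predicts the next-token distribution with a causal cumulative sum.
class TinyCausalModel(nn.Module):
    def __init__(self, vocab=256, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.proj = nn.Linear(dim, vocab)

    def forward(self, x):                     # x: (batch, time)
        h = self.embed(x).cumsum(dim=1)       # causal: step t sees only x_<=t
        return self.proj(h)                   # logits: (batch, time, vocab)

def autoregressive_sample(model, x0, steps):
    """Sequential decoding loop: the sample at step t is an input at step
    t+1, so the time dimension cannot be parallelized -- the bottleneck
    the abstract describes. The batch dimension (GPU multiplexing) is the
    remaining source of throughput, at the cost of memory."""
    seq = x0                                  # (batch, 1) initial tokens
    for _ in range(steps):
        logits = model(seq)[:, -1]            # p(x_{t+1} | x_<=t)
        nxt = torch.distributions.Categorical(logits=logits).sample()
        seq = torch.cat([seq, nxt.unsqueeze(1)], dim=1)
    return seq

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyCausalModel()
    out = autoregressive_sample(model, torch.zeros(8, 1, dtype=torch.long), 16)
    print(out.shape)                          # torch.Size([8, 17])
```

Note how each iteration re-evaluates the model on the whole prefix; deeper models make every one of these serial steps more expensive, which is the cost the proposed SCBs target.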