Proceedings of the Annual Conference of JSAI
Online ISSN : 2758-7347
35th (2021)
Session ID : 3J4-GS-6c-01
Weight and Activation Ternarization in BERT
*Soichiro KAKU, Kyosuke NISHIDA, Sen YOSHIDA
Abstract

Quantization techniques, which approximate floating-point values with a small number of bits, have attracted attention as a way to reduce the model size and increase the inference speed of pre-trained language models such as BERT. However, activations (the inputs to each layer) are usually quantized to 8 bits, and it is empirically known to be difficult to maintain accuracy with fewer than 8 bits. In this study, we identify outliers in BERT's intermediate representations as the cause of this difficulty and propose a ternarization method that can handle outliers in the activations of each layer of pre-trained BERT. Experimental results show that the model with ternarized weights and activations outperforms the previous method on language modeling and downstream tasks.
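To illustrate the general idea only (the abstract does not detail the authors' exact algorithm), the sketch below shows a simple threshold-based ternarization in PyTorch that clips outlier magnitudes to a quantile before choosing the scale, so that a few extreme activation values do not inflate it. The function name, the clipping quantile, and the 0.7·mean(|x|) threshold are illustrative assumptions, not the proposed method.

```python
import torch

def ternarize_with_clipping(x: torch.Tensor, clip_pct: float = 0.99) -> torch.Tensor:
    """Map a tensor to {-a, 0, +a} (hypothetical sketch, not the paper's method)."""
    # Suppress outliers: clip magnitudes at the clip_pct quantile so that a few
    # extreme values do not dominate the ternarization scale.
    limit = torch.quantile(x.abs().flatten(), clip_pct).item()
    x_clipped = x.clamp(-limit, limit)

    # Threshold heuristic (TWN-style): entries with |x| below delta become 0.
    delta = 0.7 * x_clipped.abs().mean()
    mask = (x_clipped.abs() > delta).float()

    # Scale a = mean magnitude of the surviving (nonzero) entries.
    a = (x_clipped.abs() * mask).sum() / mask.sum().clamp(min=1.0)
    return a * torch.sign(x_clipped) * mask

# Example: ternarize a batch of hidden activations containing an outlier.
acts = torch.randn(4, 768)
acts[0, 0] = 50.0  # artificial outlier
print(ternarize_with_clipping(acts).unique())  # three values: -a, 0, +a
```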

© 2021 The Japanese Society for Artificial Intelligence