Host : The Japanese Society for Artificial Intelligence
Name : The 36th Annual Conference of the Japanese Society for Artificial Intelligence
Number : 36
Location : [in Japanese]
Date : June 14, 2022 - June 17, 2022
Extracting information from images of cards such as driver’s licenses or credit cards is a computer vision task in widespread demand. Photos taken with a smartphone are often captured from an arbitrary position and angle chosen by the user. To recognize the card’s text with OCR, it is first necessary to localize the card within the image, transform it to a rectangle, and then rotate it to the correct orientation. Deep learning-based methods can perform these localization and rotation tasks with high accuracy. However, handling the two tasks with two separate models increases processing time. In this work, we propose a single object detection model that performs both the localization and rotation tasks, allowing cards to be processed quickly without sacrificing accuracy.
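As an illustration of why localization and rotation can share one step, the sketch below assumes a detector that returns the card's four corners plus an orientation class. This output format is an assumption made for the example, not the model described in the abstract. The helper names (`homography`, `card_homography`, `apply_h`), the `quarter_turns` parameter, and the 856 x 540 output size (roughly the ID-1 card aspect ratio) are likewise hypothetical. With such an output, rectifying the card and rotating it upright reduce to a single perspective transform.

```python
import numpy as np


def homography(src, dst):
    """Solve for the 3x3 matrix H that maps four src points onto four dst points."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    a = np.asarray(rows, dtype=float)
    # The solution is the null-space vector of the 8x9 system,
    # i.e. the last right singular vector.
    _, _, vt = np.linalg.svd(a)
    h = vt[-1].reshape(3, 3)
    return h / h[2, 2]


def card_homography(corners, quarter_turns, width=856, height=540):
    """Build one transform that both rectifies the card and rotates it upright.

    corners: four image points listed clockwise as they appear in the photo.
    quarter_turns: hypothetical orientation class (0..3); corners[quarter_turns]
        is assumed to be the card's true top-left corner.
    """
    corners = np.asarray(corners, dtype=float)
    # Reorder so the true top-left corner comes first; the clockwise order is kept.
    ordered = np.roll(corners, -quarter_turns, axis=0)
    target = np.array(
        [[0, 0], [width, 0], [width, height], [0, height]], dtype=float
    )
    return homography(ordered, target)


def apply_h(h, point):
    """Map a single (x, y) point through the homography h."""
    x, y, w = h @ np.array([point[0], point[1], 1.0])
    return np.array([x / w, y / w])


# Made-up detection: a skewed card whose true top-left is the second listed corner.
detected = [(100, 50), (400, 80), (380, 300), (90, 260)]
H = card_homography(detected, quarter_turns=1)
```

The resulting matrix could then be passed to an image warping routine, for example OpenCV's `cv2.warpPerspective(image, H, (856, 540))`, to produce an upright, rectangular crop ready for OCR.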