2021 Volume E104.A Issue 1 Pages 213-225
At the FSE conference of ToSC 2018, Kranz et al. presented their results on shortest linear programs for the linear layers of several well known block ciphers in literature. Shortest linear programs are essentially the minimum number of 2-input xor gates required to completely describe a linear system of equations. In the above paper the authors showed that the commonly used metrics like d-xor/s-xor count that are used to judge the “lightweightedness” do not represent the minimum number of xor gates required to describe a given MDS matrix. In fact they used heuristic based algorithms of Boyar-Peralta and Paar to find implementations of MDS matrices with even fewer xor gates than was previously known. They proved that the AES mixcolumn matrix can be implemented with as little as 97 xor gates. In this paper we show that the values reported in the above paper are not optimal. By suitably including random bits in the instances of the above algorithms we can achieve implementations of almost all matrices with lesser number of gates than were reported in the above paper. As a result we report an implementation of the AES mixcolumn matrix that uses only 95 xor gates. In FSE conference of ToSC 2019, Li et al. had tweaked the Boyar-Peralta algorithm to get low depth implementations of many matrices. We show that by introducing randomness in the tweaked algorithm, it is again possible to get low depth implementations with lesser number of gates than the above paper. As a result, we report a depth implementation of the AES mixcolumn matrix that uses only 103 xor gates, which is 2 gates less than the previous implementation. In the second part of the paper, we observe that most standard cell libraries contain both 2 and 3-input xor gates, with the silicon area of the 3-input xor gate being smaller than the sum of the areas of two 2-input xor gates. Hence when linear circuits are synthesized by logic compilers (with specific instructions to optimize for area), most of them would return a solution circuit containing both 2 and 3-input xor gates. Thus from a practical point of view, reducing circuit size in presence of these gates is no longer equivalent to solving the shortest linear program. In this paper we show that by adopting a graph based heuristic it is possible to convert a circuit constructed with 2-input xor gates to another functionally equivalent circuit that utilizes both 2 and 3-input xor gates and occupies less hardware area. As a result we obtain more lightweight implementations of all the matrices listed in the ToSC paper.