2023, Vol. 31, pp. 332-343
Logical Natural Language Generation, i.e., generating textual descriptions that can be logically entailed by a table, remains challenging due to the low fidelity of the generated text. Prior work has addressed this problem by annotating intermediate logical programs to control the content and semantics of generation, framing the task as table-aware logical-form-to-text (Logic2text) generation. However, although table instances are abundant in the real world, pairing logical forms with textual descriptions requires costly human labor, which limits the size of the parallel data. To mitigate this, we propose topic-conditioned data augmentation (TopicDA), which employs controlled sequence-to-sequence generation as auxiliary tasks to augment logical forms and textual descriptions directly from tables. We further introduce logical form generation (LG), a dual task of Logic2text that requires generating a valid logical form from a textual description of a table. We then propose a semi-supervised learning approach that jointly trains a Logic2text model and an LG model on both labeled and augmented data. Experimental results on both Logic2text and LG demonstrate that our approach effectively utilizes the augmented data and outperforms supervised baselines by a substantial margin.
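The joint training described in the abstract can be sketched as a combined objective over the two dual directions, Logic2text (logical form to text) and LG (text to logical form), on both labeled and augmented pairs. This is a minimal illustrative sketch, not the paper's implementation: the function name, arguments, and the down-weighting coefficient `lambda_aug` are all assumptions.

```python
def joint_loss(l2t_loss_labeled, lg_loss_labeled,
               l2t_loss_aug, lg_loss_aug, lambda_aug=0.5):
    """Hypothetical semi-supervised joint objective.

    Sums the supervised losses of the Logic2text and LG models on
    human-labeled pairs, and adds the losses on TopicDA-augmented
    pairs scaled by lambda_aug (assumed hyperparameter), since
    augmented pairs are typically noisier than labeled ones.
    """
    supervised = l2t_loss_labeled + lg_loss_labeled
    augmented = l2t_loss_aug + lg_loss_aug
    return supervised + lambda_aug * augmented
```

With `lambda_aug=0` this reduces to a purely supervised baseline on labeled data; raising it increases the influence of the table-derived augmented pairs.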