Abstract
Deterministic oracle, used in standard software testing, is not available for statistical machine learning computer programs. Alternatively, metamorphic testing, pseudo oracle in view of data diversity, is adopted. Statistical machine learning, however, works on a dataset consisting of many data. Their distributions in the dataset introduce the diversity to testing. This paper proposes a new metamorphic testing method, in which generating follow-up test cases takes into account the dataset diversity. The method was applied to two well-known machine learning of support vector machines and neural networks.