Abstract
This paper proposes a method for estimating which domain names are malicious from a large-scale DNS query-response dataset. The key idea is to leverage a DNS graph, a bipartite graph consisting of domain names and their corresponding IP addresses. We apply Probabilistic Threat Propagation (PTP), seeded with a predefined set of benign and malicious nodes, to a DNS graph obtained from DNS queries observed at a backbone link. In an ROC analysis, our proposed method (EPTP) outperformed the original PTP method by 9% and a traditional N-gram-based method by 40%. Finally, using EPTP, we estimated 2,170 new domain names to be malicious.