This paper presents a mathematical analysis of learning weights in a similarity function. Although there are many theoretical analyses of case-based reasoning systems [Aha 91, Albert 91, Janke 93, Langley 93], none has yet theoretically analyzed methods for producing a proper similarity function in accordance with the tendency of the cases, even though many such methods have been proposed and empirically analyzed [Aha 89, Callan 91, Cardie 93, Stanfill 86]. As a first step, we provide a PAC learning framework for weights with two kinds of distance information: qualitative distance information and relative distance information. Qualitative distance information indicates whether case A is similar to case B, and relative distance information indicates whether case A is more similar to case B than to case C. We give a mathematical analysis of learning weights from this information. In this setting, we show that we can efficiently learn weights whose error rate is less than ε with probability greater than 1-δ, such that the size of the distance information is polynomially bounded in the dimension n and the inverses of ε and δ, and the running time is polynomially bounded in the size of the distance information.
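To make the two kinds of distance information concrete, the following is a minimal sketch, not the paper's algorithm. It assumes a weighted squared-difference distance and a fixed similarity threshold, neither of which is stated in the abstract, and all function names are hypothetical. It only checks how well a given weight vector agrees with a sample of distance information; the empirical error it reports is an estimate on that sample, not the distribution-level error rate ε of the PAC guarantee.

```python
from typing import List, Sequence, Tuple

Case = Sequence[float]

def weighted_distance(w: Sequence[float], a: Case, b: Case) -> float:
    # Assumed form: sum_i w_i * (a_i - b_i)^2. The abstract does not fix the distance.
    return sum(wi * (ai - bi) ** 2 for wi, ai, bi in zip(w, a, b))

def qualitative_ok(w, a: Case, b: Case, similar: bool, threshold: float = 1.0) -> bool:
    # Qualitative information: "A is similar to B" (True) or "A is not similar to B" (False).
    # The threshold is an assumption made for this illustration.
    return (weighted_distance(w, a, b) <= threshold) == similar

def relative_ok(w, a: Case, b: Case, c: Case) -> bool:
    # Relative information: "A is more similar to B than to C".
    return weighted_distance(w, a, b) < weighted_distance(w, a, c)

def empirical_error(
    w: Sequence[float],
    qualitative: List[Tuple[Case, Case, bool]],
    relative: List[Tuple[Case, Case, Case]],
    threshold: float = 1.0,
) -> float:
    # Fraction of the sampled distance information that the weights w violate.
    checks = [qualitative_ok(w, a, b, s, threshold) for a, b, s in qualitative]
    checks += [relative_ok(w, a, b, c) for a, b, c in relative]
    return 1.0 - sum(checks) / len(checks)

# Toy data in dimension n = 2 (made up for illustration).
A, B, C = (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)
qualitative_info = [(A, B, True), (A, C, False)]  # A ~ B, A not ~ C
relative_info = [(A, B, C)]                       # A is closer to B than to C

good_weights = (0.5, 2.0)  # down-weights the first attribute
bad_weights = (2.0, 0.5)   # down-weights the second attribute

good_error = empirical_error(good_weights, qualitative_info, relative_info)
bad_error = empirical_error(bad_weights, qualitative_info, relative_info)
```

Under the assumed distance, each piece of relative information is a linear inequality in the weights, and so is each piece of qualitative information once the threshold is fixed, which is one reason a consistent weight vector can plausibly be searched for efficiently. The sketch deliberately stops at checking consistency and does not attempt the learning step.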