Weighted k-Nearest-Neighbor Techniques for High Throughput Screening Data

The k-nearest neighbors (knn) is a simple but effective method of classification. In this paper we present an extended version of this technique for chemical compounds used in High Throughput Screening, where the distances of the nearest neighbors can be taken into account. Our algorithm uses kernel weight functions as guidance for the process of defining activity in screening data. Proposed kernel weight function aims to combine properties of graphical structure and molecule descriptors of screening compounds. We apply the modified knn method on several experimental data from biological screens. The experimental results confirm the effectiveness of the proposed method.