A Software Framework for Predicting Oil-Palm Yield from Climate Data

Intelligent systems based on machine learning techniques, such as classification, clustering, are gaining wide spread popularity in real world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data for finding spatio-temporal patterns in climate data using kernel methods which offer strength to deal with complex data. This work gets inspiration from the notion that a non-linear data transformation into some high dimensional feature space increases the possibility of linear separability of the patterns in the transformed space. Therefore, it simplifies exploration of the associated structure in the data. Kernel methods implicitly perform a non-linear mapping of the input data into a high dimensional feature space by replacing the inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm incorporating spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in the spatial data, for effective and efficient data analysis by exploring patterns and structures in the data, and thus can be used for predicting oil-palm yield by analyzing various factors affecting the yield.