A Risk Assessment Tool for the Contamination of Aflatoxins on Dried Figs based on Machine Learning Algorithms

Aflatoxins are highly poisonous and carcinogenic compounds produced by species of the genus Aspergillus spp. that can infect a variety of agricultural foods, including dried figs. Biological and environmental factors, such as population, pathogenicity and aflatoxinogenic capacity of the strains, topography, soil and climate parameters of the fig orchards are believed to have a strong effect on aflatoxin levels. Existing methods for aflatoxin detection and measurement, such as high-performance liquid chromatography (HPLC), and enzyme-linked immunosorbent assay (ELISA), can provide accurate results, but the procedures are usually time-consuming, sample-destructive and expensive. Predicting aflatoxin levels prior to crop harvest is useful for minimizing the health and financial impact of a contaminated crop. Consequently, there is interest in developing a tool that predicts aflatoxin levels based on topography and soil analysis data of fig orchards. This paper describes the development of a risk assessment tool for the contamination of aflatoxin on dried figs, based on the location and altitude of the fig orchards, the population of the fungus Aspergillus spp. in the soil, and soil parameters such as pH, saturation percentage (SP), electrical conductivity (EC), organic matter, particle size analysis (sand, silt, clay), concentration of the exchangeable cations (Ca, Mg, K, Na), extractable P and trace of elements (B, Fe, Mn, Zn and Cu), by employing machine learning methods. In particular, our proposed method integrates three machine learning techniques i.e., dimensionality reduction on the original dataset (Principal Component Analysis), metric learning (Mahalanobis Metric for Clustering) and K-nearest Neighbors learning algorithm (KNN), into an enhanced model, with mean performance equal to 85% by terms of the Pearson Correlation Coefficient (PCC) between observed and predicted values.