Escherichia coli (E. coli) is a common bacterium that lives in the intestines of animals and humans, and it is often used to identify fecal contamination within the environment. E. coli can also easily develop resistance to antibiotics, making it an ideal organism for testing antimicrobial resistance—especially in certain agricultural environments where fecal material is used as manure or wastewater is reused.
Traditional laboratory methods for analyzing antimicrobial resistance are often time-consuming and labor-intensive, making them impractical for large-scale monitoring. As a result, researchers are exploring faster approaches using whole-genome sequencing (WGS) and predictive modeling.
Marco Christopher Lopez and Dr. Pierangeli Vital of the University of the Philippines-Diliman College of Science’s Natural Sciences Research Institute (UPD-CS NSRI), along with Dr. Joseph Ryan Lansangan of the UPD School of Statistics, tested various artificial intelligence (AI) prediction models to determine the antimicrobial resistance of E. coli using genetic data and laboratory test results from the National Center for Biotechnology Information (NCBI) database.
“We selected the models based on their strengths in handling biological and imbalanced data,” Vital explained. “These models were chosen to compare performance across different learning strategies and to identify which is most suitable for predicting antibiotic resistance.”
The AI models used were Random Forest (RF), which is well-suited for high-dimensional data; Support Vector Machine (SVM), which excels in classification tasks, particularly when dealing with complex decision boundaries; and two boosting-based ensemble methods, Adaptive Boosting (AB) and Extreme Gradient Boosting (XGB), which enhance accuracy by focusing on hard-to-classify samples.
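To illustrate what "focusing on hard-to-classify samples" can mean in practice, here is a minimal sketch of one round of the textbook discrete AdaBoost weight update. It is not the researchers' code: the function name `adaboost_reweight` and the ten toy isolates are hypothetical, and labels are assumed to be +1 (resistant) or -1 (susceptible). Gradient boosting methods such as XGB pursue a similar goal differently, by fitting each new tree to the gradient of the loss rather than by explicitly reweighting samples.

```python
import math


def adaboost_reweight(labels, predictions, weights):
    """Run one round of the discrete AdaBoost weight update.

    labels, predictions: sequences of +1 / -1 values.
    weights: current sample weights, assumed to sum to 1.
    Returns (alpha, new_weights), where alpha is the weak learner's vote.
    """
    # Weighted error: total weight of the samples the weak learner got wrong.
    error = sum(w for y, h, w in zip(labels, predictions, weights) if y != h)
    if not 0 < error < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")

    # Learners with lower error get a larger say in the final vote.
    alpha = 0.5 * math.log((1 - error) / error)

    # Misclassified samples (y * h == -1) are scaled up, correct ones down.
    updated = [w * math.exp(-alpha * y * h)
               for y, h, w in zip(labels, predictions, weights)]
    total = sum(updated)
    return alpha, [w / total for w in updated]


# Hypothetical toy data: 3 resistant (+1) and 7 susceptible (-1) isolates.
labels = [1, 1, 1, -1, -1, -1, -1, -1, -1, -1]
# A weak learner that misses two of the three resistant isolates.
predictions = [1, -1, -1, -1, -1, -1, -1, -1, -1, -1]
weights = [0.1] * 10

alpha, new_weights = adaboost_reweight(labels, predictions, weights)
print(round(alpha, 4))
print([round(w, 4) for w in new_weights])
```

After the update, the two missed isolates together carry half of the total weight, so the next weak learner is pushed to get them right.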
The models predicted resistance to streptomycin and tetracycline most accurately, reliably distinguishing resistant strains from susceptible ones. Ciprofloxacin, by contrast, was the most challenging to predict: only 4% of the samples in the data were resistant, which made resistance difficult to identify and led to poor sensitivity (the share of truly resistant samples that a model correctly flags). Among the models, AB and XGB consistently delivered good results, even when tested on imbalanced antimicrobial resistance data.
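The effect of a 4% resistant share on sensitivity can be shown with a small worked example. The sketch below uses made-up data, not the study's: 100 hypothetical isolates, 4 of them resistant, and a trivial model that labels every isolate susceptible. The function name `resistance_metrics` and the 1/0 label encoding are assumptions for illustration.

```python
def resistance_metrics(actual, predicted):
    """Compute accuracy, sensitivity and specificity.

    actual, predicted: sequences of 1 (resistant) or 0 (susceptible).
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # resistant, flagged
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # resistant, missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # susceptible, cleared
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # susceptible, flagged

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: share of truly resistant isolates that were flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of truly susceptible isolates that were cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical data: 4 resistant and 96 susceptible isolates (4% resistant).
actual = [1] * 4 + [0] * 96
# A trivial "model" that calls every isolate susceptible.
always_susceptible = [0] * 100

print(resistance_metrics(actual, always_susceptible))
```

This trivial model scores 0.96 accuracy while its sensitivity is 0.0, because it misses every resistant isolate. That is why accuracy alone can be misleading on imbalanced data and why sensitivity is reported alongside it.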
“We think that this strategy has great potential for real-time monitoring of antimicrobial resistance, particularly in agriculture,” Vital said, emphasizing the potential use of AI prediction models in the sector. “As DNA sequencing becomes faster and cheaper, prediction models such as ours can pick up resistant bacteria early, before they lead to outbreaks. This can facilitate better decision-making in food safety, agriculture, and public health programs.”
The researchers recommend including more diverse sample types and data sources—such as metagenomic data, which is DNA from all microbes in a sample—to better understand and predict how bacteria develop resistance.
Vital also highlighted the value of collaboration between fields, as when microbiologists and statisticians worked together in this study. “More so, the integration of (micro)biological concepts to statistics and predictive modelling to have an impactful result/outcome to the community, in this instance, agricultural food safety,” she said.
The study, titled “Prediction models for antimicrobial resistance of Escherichia coli in an agricultural setting around Metro Manila, Philippines,” was published in the Malaysian Journal of Microbiology, an open-access, peer-reviewed journal that serves as a platform for scientific communication among researchers and academics working with microbes and microbial products. The study was funded by NSRI and the Department of Science and Technology’s Grant to Outstanding Achievements in Science and Technology through the National Academy of Science and Technology. (Story courtesy of Eunice Jean Patron/UPD-CS Science Communications)
Top image shows a disk diffusion assay plate (Photo by Dr. Pierangeli Vital)

