Optimized machine learning-collaborative filtering model for mastitis prediction in dairy cows
Abstract
Mastitis is a major disease affecting dairy cow health and milk production. This study established an integrated machine learning (ML) model that combines herd- and individual-level data to achieve efficient and balanced prediction of clinical mastitis. Data were collected from 5284 lactating Holstein cows on two farms in southern and northern China. Five feature processing methods were evaluated with four ML algorithms. The feature processing methods were recursive feature elimination (RFE), contrastive learning (CL), slopes and intercept, milk-conductivity ratio, and differences; the algorithms were support vector machine (SVM), random forest (RF), XGBoost, and backpropagation neural network (BPNN). Among these combinations, the XGBoost model with the milk-conductivity ratio feature performed best, with a sensitivity of 0.81 and a specificity of 0.75. To reduce the remaining imbalance between sensitivity and specificity, collaborative filtering (CF) was introduced into the XGBoost model to incorporate both herd and individual cow information. The resulting XGBoost–CF model raised sensitivity to 0.83 and specificity to 0.87, improving its ability to identify both healthy and diseased cows. This integrated ML–CF framework provides an effective strategy for early mastitis prediction and offers practical support for intelligent dairy herd management and precision livestock farming.
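For readers less familiar with the two metrics reported above, the following minimal sketch shows how sensitivity and specificity can be computed from binary labels. It is an illustration only, not the study's evaluation code: the function name `sensitivity_specificity` and the coding of 1 for a mastitis case and 0 for a healthy cow are assumptions made for this example.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Assumed coding (hypothetical, for illustration): 1 = mastitis, 0 = healthy.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # diseased, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # diseased, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, flagged

    # Sensitivity: share of diseased cows that are correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    # Specificity: share of healthy cows that are correctly identified.
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity
```

A model is balanced in the sense used here when both values are high at once, rather than one being gained at the expense of the other.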