Abstract #687

Using routinely recorded herd data to predict and benchmark herd and cow health status.
Kristen L. Parker Gaddis*¹, John B. Cole², John S. Clay³, Christian Maltecca⁴; ¹Department of Animal Sciences, University of Florida, Gainesville, FL, ²Animal Genomics and Improvement Laboratory, ARS, USDA, Beltsville, MD, ³Dairy Records Management Systems, Raleigh, NC, ⁴Department of Animal Science, North Carolina State University, Raleigh, NC.

Genetic improvement of dairy cattle health using producer-recorded data is feasible, but estimates of heritability are low, indicating that genetic progress will be slow. Improvement of health traits may also be possible by incorporating environmental and managerial aspects into herd health programs. The objective of this study was to use the more than 1,100 herd characteristics that are regularly recorded on farm test days to benchmark herd and cow health status. Herd characteristics were combined with producer-recorded health event data. Parametric and non-parametric models were used to predict and benchmark health status: stepwise logistic regression, support vector machines, and random forests. At both the herd and individual-cow levels, random forest models attained the highest accuracy for predicting health status in all health event categories when evaluated by 10-fold cross-validation. At the herd level, accuracy of prediction (SD) ranged from 0.59 (0.04) to 0.61 (0.04) for logistic regression models, 0.55 (0.02) to 0.61 (0.04) for support vector machine models, and 0.61 (0.04) to 0.63 (0.04) for random forest models. At the cow level, accuracy of prediction (SD) ranged from 0.69 (0.002) to 0.77 (0.01) for support vector machine models and 0.87 (0.06) to 0.93 (0.001) for random forest models. These results indicate that machine-learning algorithms, specifically random forests, can be used to accurately identify herds and cows likely to experience a health event of interest. It was concluded that accurate prediction and benchmarking of health status using routinely collected herd data is feasible. Non-parametric models were better able to handle the large, complex data than traditional models. Further development and incorporation of predictive models into herd management programs will help to continue improvement of dairy herd health.
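As an illustration of how an "accuracy (SD)" figure can be obtained from 10-fold cross-validation, the following is a minimal sketch and not the authors' pipeline. It assumes the SD is taken across the 10 per-fold accuracies, which the abstract does not state. The data are synthetic, all function names are hypothetical, and a deliberately simple nearest-centroid classifier stands in for the logistic regression, support vector machine, and random forest models named above.

```python
import random
import statistics


def make_synthetic_herds(n=200, n_features=5, seed=0):
    """Hypothetical data: feature vectors plus a 0/1 health-event label."""
    rng = random.Random(seed)
    rows, labels = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Herds with an event are shifted slightly: separable but noisy.
        rows.append([rng.gauss(0.8 * label, 1.0) for _ in range(n_features)])
        labels.append(label)
    return rows, labels


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_centroids(rows, labels):
    """Mean feature vector per class (a stand-in for a real model)."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Assign the class whose centroid is nearest (squared Euclidean)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda cls: dist(centroids[cls]))


def cross_validated_accuracy(rows, labels, k=10, seed=0):
    """Return (mean, SD) of accuracy over k held-out folds."""
    accuracies = []
    for fold in k_fold_indices(len(rows), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(rows)) if i not in held_out]
        model = fit_centroids([rows[i] for i in train],
                              [labels[i] for i in train])
        correct = sum(predict(model, rows[i]) == labels[i] for i in fold)
        accuracies.append(correct / len(fold))
    return statistics.mean(accuracies), statistics.stdev(accuracies)


rows, labels = make_synthetic_herds()
mean_acc, sd_acc = cross_validated_accuracy(rows, labels)
print(f"accuracy (SD): {mean_acc:.2f} ({sd_acc:.2f})")
```

Each observation is held out exactly once, so every prediction is scored by a model that never saw it during fitting. Replacing `fit_centroids` and `predict` with another classifier leaves the evaluation loop unchanged, which is what makes accuracies from different model families comparable.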

Key Words: health, machine learning, prediction