Abstract #545

Imputation using whole-genome sequence data in Brown Swiss and Original Braunvieh.
Christine F. Baes*1,2, Beat Bapst2, Franz R. Seefried2, Heidi Signer-Hasler1, Christine Flury1, Dorian Garrick3, Christian Stricker4, Birgit Gredler2, 1Bern University of Applied Sciences, Zollikofen, Bern, Switzerland, 2Qualitas AG, Zug, Zug, Switzerland, 3Iowa State University, Ames, IA, 4agn Genetics, Davos, Grison, Switzerland.

The distinct half-sib population structure common in dairy cattle populations permits imputation of array-based genotypes up to sequence level using sequence information of key ancestors. This approach provides a cost-effective alternative to sequencing all animals. The process works well in single breeds (e.g., Holstein); however, it is less accurate when validation and reference animals are not of the same breed or population (e.g., Brown Swiss and Original Braunvieh). The objective of this study was therefore to investigate strategies for composing reference and validation sets for imputation using Brown Swiss (BS) and Original Braunvieh (OB) animals. Whole-genome sequence (WGS) information of 70 BS, 17 BS × OB and 8 OB animals was available for analysis. WGS data were masked to mimic medium-density (50K; 54,609 SNP) and high-density (HD; 777,962 SNP) SNP arrays and then imputed back up to sequence level. Imputation was conducted using FImpute, which employs an overlapping sliding window approach to exploit haplotype similarities between validation and reference animals. Principal component analysis was used to determine which animals constituted reference and validation sets; scenarios involving various breed compositions were compared. The accuracy of imputation from 50K and HD to WGS was evaluated for each scenario by calculating the number of correctly imputed genotypes and the concordance between true and imputed genotypes. Furthermore, imputation accuracy when the reference population was composed solely of the breed to be imputed was compared with results obtained when the reference population was composed of multiple breeds. Validation of OB genotypes resulted in 64.6–69.6% (50K to WGS) and 79.5–83.8% (HD to WGS) correctly imputed genotypes. BS validation resulted in 78.0–78.9% (50K to WGS) and 85.5–86.1% (HD to WGS) correctly imputed genotypes. Validation of intermediary animals using both BS and OB as reference resulted in the highest percentage of correctly imputed genotypes (79.1–79.6% for 50K to WGS and 86.5–86.7% for HD to WGS). Results show that breed composition of reference and validation sets has a considerable effect on imputation accuracy, and that imputation quality is improved if reference animals from similar populations are included.
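
The mask-and-validate step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the 0/1/2 genotype coding, and the toy positions are all assumptions made for illustration, and the abstract does not state whether array sites were included in the reported percentages (here only the masked, non-array sites are scored).

```python
def imputed_site_indices(wgs_positions, array_positions):
    """Return indices of WGS sites that are NOT on the array.

    These are the sites hidden by masking, so they must be imputed.
    """
    on_array = set(array_positions)
    return [i for i, pos in enumerate(wgs_positions) if pos not in on_array]


def percent_correct(true_genotypes, imputed_genotypes, site_indices):
    """Percentage of genotypes at the given sites where imputed equals true."""
    if not site_indices:
        raise ValueError("no masked sites to evaluate")
    matches = sum(
        1 for i in site_indices if true_genotypes[i] == imputed_genotypes[i]
    )
    return 100.0 * matches / len(site_indices)


# Toy data for one validation animal (hypothetical values).
wgs_positions = [101, 205, 310, 422, 518, 640]  # all sequence-level sites
array_positions = [101, 422]                    # sites kept on the mock array
true_gt = [0, 1, 2, 1, 0, 2]                    # sequenced genotypes (0/1/2)
imputed_gt = [0, 1, 1, 1, 0, 2]                 # genotypes after imputation

masked = imputed_site_indices(wgs_positions, array_positions)
accuracy = percent_correct(true_gt, imputed_gt, masked)
print(masked, accuracy)  # [1, 2, 4, 5] 75.0
```

In this toy case three of the four masked sites are recovered correctly, giving 75.0%. The concordance measure mentioned in the abstract is not defined there, so it is left out of the sketch.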

Key Words: imputation, whole-genome sequence, accuracy