ORCID
0009-0002-7599-5068
Department
Biological Sciences
Year of Study
1st
Full-time or Part-time Study
Full-time
Level
Postgraduate
Presentation Type
Oral Presentation
Supervisor
Dr Deirdre Purfield
Supervisor
Dr Noirin McHugh
Supervisor
Dr Bruno Andrade
Abstract
Background:
Log R Ratio (LRR) is a genotype intensity measurement returned alongside routine genotype results and used to detect copy number variants (CNVs), a form of structural variation in the genome. Although the standard deviation (SD) of LRR is commonly applied as a quality control measure in CNV studies, little research has explored its relationship with genotype quality or its potential impact on CNV calling.
Methods:
A total of 720,152 genotypes were available from 716,234 cattle. Among these were 1,044 cattle with duplicate genotype samples, where one sample was considered the gold standard (LRR SD < 0.3, call rate ≥ 0.95). Genotype concordance was calculated for all duplicates. PennCNV was used to call CNVs, and concordance among CNV calls per duplicate animal was determined using the Jaccard index.
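As an illustration only, the two concordance measures could be sketched as follows. The abstract does not state how either was computed, so everything here is an assumption: genotypes are taken to be coded 0/1/2 with `None` for a missing call, CNVs are taken to be half-open `(start, end)` intervals on a single chromosome, and the Jaccard index is taken at the base-pair level. The function names are hypothetical and are not part of PennCNV.

```python
def genotype_concordance(calls_a, calls_b):
    """Fraction of SNPs called in both samples whose genotypes agree.

    Assumption: genotypes are coded 0/1/2 and None marks a missing call.
    """
    compared = [(a, b) for a, b in zip(calls_a, calls_b)
                if a is not None and b is not None]
    if not compared:
        return float("nan")
    return sum(a == b for a, b in compared) / len(compared)


def _covered_length(intervals):
    """Total base pairs covered by half-open (start, end) intervals."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is not None and start <= current_end:
            # Overlapping or touching: extend the current merged interval.
            current_end = max(current_end, end)
        else:
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
    if current_end is not None:
        total += current_end - current_start
    return total


def cnv_jaccard(cnvs_a, cnvs_b):
    """Base-pair Jaccard index |A ∩ B| / |A ∪ B| for two CNV call sets.

    Assumption: both call sets lie on the same chromosome.
    """
    len_a = _covered_length(cnvs_a)
    len_b = _covered_length(cnvs_b)
    union = _covered_length(list(cnvs_a) + list(cnvs_b))
    if union == 0:
        return float("nan")
    intersection = len_a + len_b - union  # inclusion-exclusion
    return intersection / union
```

Under these assumptions, two duplicates with one CNV each at `(100, 200)` and `(150, 250)` share 50 of 150 covered base pairs, giving a Jaccard index of one third.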
Results:
Across all 720,152 cattle samples, the mean (median) LRR SD was 0.21 (0.17), ranging from 0.07 to 1.98. A total of 14.62% of samples had an LRR SD > 0.3, the threshold routinely applied for CNV detection using SNP array data. Genotype concordance among the duplicates decreased as LRR SD increased, primarily due to misclassification of homozygous calls. Concordance among CNVs was highest for duplicate animals with a call rate > 0.9 and LRR SD < 0.3, but only 19.1% of these animals had a Jaccard index > 0.1.
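The 0.3 threshold reported above can be applied per sample along these lines. This is a minimal sketch, assuming each sample's LRR values are available as a plain list and that the sample (n − 1) standard deviation is used; PennCNV reports its own LRR SD, and its exact computation may differ. The names `lrr_sd` and `flag_high_lrr_sd` are hypothetical.

```python
import statistics

LRR_SD_THRESHOLD = 0.3  # threshold quoted in the abstract


def lrr_sd(lrr_values):
    """Sample standard deviation of one sample's LRR values (assumed n - 1)."""
    return statistics.stdev(lrr_values)


def flag_high_lrr_sd(lrr_by_sample, threshold=LRR_SD_THRESHOLD):
    """Map each sample ID to True when its LRR SD exceeds the threshold."""
    return {sample_id: lrr_sd(values) > threshold
            for sample_id, values in lrr_by_sample.items()}
```

For example, `flag_high_lrr_sd({"a": [0.0, 0.1, -0.1], "b": [0.5, -0.5, 0.5, -0.5]})` flags sample `"b"` but not sample `"a"`.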
Conclusion:
LRR SD is a useful additional quality control metric for genotype data. The low level of CNV concordance between duplicates highlights the need for caution when interpreting CNV results from medium-density SNP panels.
Start Date
16-6-2025 1:30 PM
End Date
16-6-2025 1:45 PM
Recommended Citation
Dunne, Adam, "Using LRR standard deviation as a genotype quality control measure and its downstream effect on CNV calling." (2025). ORBioM (Open Research BioSciences Meeting). 3.
https://sword.cit.ie/orbiom/2025/oral2/3