Supplementary Material for: Defining Blood Group Gene Reference Alleles by Long-Read Sequencing: Proof of Concept in the ACKR1 Gene Encoding the Duffy Antigens
datasetposted on 11.12.2019, 10:43 by Fichou Y., Berlivet I., Richard G., Tournamille C., Castilho L., Férec C.
Background: In the novel era of blood group genomics, (re-)defining reference gene/allele sequences of blood group genes has become an important goal to achieve, both for diagnostic and research purposes. As novel potent sequencing technologies are available, we thought to investigate the variability encountered in the three most common alleles of ACKR1, the gene encoding the clinically relevant Duffy antigens, at the haplotype level by a long-read sequencing approach. Materials and Methods: After long-range PCR amplification spanning the whole ACKR1 gene locus (∼2.5 kilobases), amplicons generated from 81 samples with known genotypes were sequenced in a single read by using the Pacific Biosciences (PacBio) single molecule, real-time (SMRT) sequencing technology. Results: High-quality sequencing reads were obtained for the 162 alleles (accuracy >0.999). Twenty-two nucleotide variations reported in databases were identified, defining 19 haplotypes: four, eight, and seven haplotypes in 46 ACKR1*01, 63 ACKR1*02, and 53 ACKR1*02N.01 alleles, respectively. Discussion: Overall, we have defined a subset of reference alleles by third-generation (long-read) sequencing. This technology, which provides a “longitudinal” overview of the loci of interest (several thousand base pairs) and is complementary to the second-generation (short-read) next-generation sequencing technology, is of critical interest for resolving novel, rare, and null alleles.