Supplementary Material for: Comparison of SureSelect and Nextera Exome Capture Performance in Single-Cell Sequencing
Dataset posted on 22.01.2019 by Huss W.J., Hu Q., Glenn S.T., Gangavarapu K.J., Wang J., Luce J.D., Quinn P.K., Brese E.A., Zhan F., Conroy J.M., Paragh G., Foster B.A., Morrison C.D., Liu S., Wei L.
Background: Advances in single-cell sequencing provide unprecedented opportunities for clinical examination of circulating tumor cells, cancer stem cells, and other rare cells responsible for disease progression and drug resistance. At the genomic level, single-cell whole exome sequencing (scWES) has started to gain popularity because of its unique potential for characterizing mutational landscapes at single-cell resolution. Currently, little is known about the performance of different exome capture kits in scWES. Nextera rapid capture (NXT; Illumina, Inc.) has been the only exome capture kit recommended for scWES by Fluidigm C1, a widely used system for single-cell preparation.

Results: In this study, we compared the performance of NXT following Fluidigm’s protocol with that of the Agilent SureSelectXT Target Enrichment System (AGL), another exome capture kit widely used for bulk sequencing. We created DNA libraries of 192 single cells isolated from spheres grown from a melanoma specimen using Fluidigm C1. Twelve high-yield cells were selected for dual-exome capture and sequencing using AGL and NXT in parallel. In mapping and coverage analysis, AGL outperformed NXT in coverage uniformity, read mapping rate, and exome capture rate, and had lower PCR duplicate rates. For germline variant calling, AGL achieved better performance in overlap with known variants in dbSNP and in transition-transversion ratios. Using calls from high-coverage bulk sequencing of blood DNA as the gold standard, AGL-based scWES demonstrated high positive predictive values and medium to high sensitivity. Lastly, we evaluated somatic mutation calling by comparing single-cell data with the matched blood sequence as control. On average, 300 mutations were identified in each cell. In 10 of 12 cells, more mutations were identified using AGL than NXT, probably owing to coverage depth. When mutations were adequately covered in both AGL and NXT data, the two methods showed very high concordance (93–100% per cell).

Conclusions: Our results suggest that AGL can also be used for scWES when there is sufficient DNA, and that it yields better data quality than the current Fluidigm protocol using NXT.
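
For readers working with the supplementary tables, the germline evaluation metrics named in the abstract (positive predictive value, sensitivity, and transition-transversion ratio) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the representation of a variant as a `(chrom, pos, ref, alt)` tuple and the function names `ppv_and_sensitivity` and `titv_ratio` are assumptions made for this example.

```python
# Illustrative sketch only; variant tuples and function names are assumptions,
# not part of the published analysis.

def ppv_and_sensitivity(cell_calls, gold_calls):
    """Compare single-cell calls with gold-standard (bulk) calls.

    Both arguments are sets of (chrom, pos, ref, alt) tuples.
    PPV         = true positives / all single-cell calls
    Sensitivity = true positives / all gold-standard calls
    """
    true_positives = len(cell_calls & gold_calls)
    ppv = true_positives / len(cell_calls) if cell_calls else float("nan")
    sensitivity = true_positives / len(gold_calls) if gold_calls else float("nan")
    return ppv, sensitivity


# Transitions are purine<->purine (A<->G) or pyrimidine<->pyrimidine (C<->T) changes.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def titv_ratio(snvs):
    """Transition-transversion ratio for a list of (ref, alt) single-base changes."""
    ti = sum(1 for ref, alt in snvs if (ref, alt) in TRANSITIONS)
    tv = len(snvs) - ti
    return ti / tv if tv else float("nan")


# Toy data, invented for illustration.
gold = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
        ("chr2", 300, "G", "T"), ("chr3", 50, "T", "C")}
cell = {("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"),
        ("chr5", 10, "A", "C")}

ppv, sens = ppv_and_sensitivity(cell, gold)
ratio = titv_ratio([(ref, alt) for _, _, ref, alt in cell])
```

With the toy sets above, two of the three single-cell calls match the gold standard, and two of the four gold-standard variants are recovered.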