Benchmarking freely available HLA typing algorithms across varying genes, coverages and typing resolutions
Identifying the specific human leukocyte antigen (HLA) allele combination of an individual is crucial in organ donation, risk assessment of autoimmune and infectious diseases and cancer immunotherapy. However, due to the high genetic polymorphism in this region, HLA typing requires specialized methods. We investigated the performance of five next-generation sequencing (NGS) based HLA typing tools with a non-restricted license namely HLA*LA, Optitype, HISAT-genotype, Kourami and STC-Seq. This evaluation was done for the five HLA loci, HLA-A, -B, -C, -DRB1 and -DQB1 using whole-exome sequencing (WES) samples from 829 individuals. The robustness of the tools to lower depth of coverage (DOC) was evaluated by subsampling and HLA typing 230 WES samples at DOC ranging from 1X to 100X. The HLA typing accuracy was measured across four typing resolutions. Among these, we present two clinically-relevant typing resolutions (P group and pseudo-sequence), which specifically focus on the peptide binding region. On average, across the five HLA loci examined, HLA*LA was found to have the highest typing accuracy. For the individual loci, HLA-A, -B and -C, Optitype’s typing accuracy was the highest and HLA*LA had the highest typing accuracy for HLA-DRB1 and -DQB1. The tools’ robustness to lower DOC data varied widely and further depended on the specific HLA locus. For all Class I loci, Optitype had a typing accuracy above 95% (according to the modification of the amino acids in the functionally relevant portion of the HLA molecule) at 50X, but increasing the DOC beyond even 100X could still improve the typing accuracy of HISAT-genotype, Kourami, and STC-seq across all five HLA loci as well as HLA*LA’s typing accuracy for HLA-DQB1. HLA typing is also used in studies of ancient DNA (aDNA), which is often based on sequencing data with lower quality and DOC. Interestingly, we found that Optitype’s typing accuracy is not notably impaired by short read length or by DNA damage, which is typical of aDNA, as long as the DOC is sufficiently high.
By: Nikolas Hallberg Thuesen, Michael Schantz Klausen, Shyam Gopalakrishnan, Thomas Trolle and Gabriel Renaud
Scroll down to download the full publication