|Common variants in the HLA-DRB1-HLA-DQA1 HLA class II region are associated with susceptibility to visceral leishmaniasis.|
|PMID: 23291585 Owner: NLM Status: MEDLINE|
|To identify susceptibility loci for visceral leishmaniasis, we undertook genome-wide association studies in two populations: 989 cases and 1,089 controls from India and 357 cases in 308 Brazilian families (1,970 individuals). The HLA-DRB1-HLA-DQA1 locus was the only region to show strong evidence of association in both populations. Replication at this region was undertaken in a second Indian population comprising 941 cases and 990 controls, and combined analysis across the three cohorts for rs9271858 at this locus showed P(combined) = 2.76 × 10⁻¹⁷ and odds ratio (OR) = 1.41, 95% confidence interval (CI) = 1.30-1.52. A conditional analysis provided evidence for multiple associations within the HLA-DRB1-HLA-DQA1 region, and a model in which risk differed between three groups of haplotypes better explained the signal and was significant in the Indian discovery and replication cohorts. In conclusion, the HLA-DRB1-HLA-DQA1 HLA class II region contributes to visceral leishmaniasis susceptibility in India and Brazil, suggesting shared genetic risk factors for visceral leishmaniasis that cross the epidemiological divides of geography and parasite species.|
|Michaela Fakiola; Amy Strange; Heather J Cordell; E Nancy Miller; Matti Pirinen; Zhan Su; Anshuman Mishra; Sanjana Mehrotra; Gloria R Monteiro; Gavin Band; Céline Bellenguez; Serge Dronov; Sarah Edkins; Colin Freeman; Eleni Giannoulatou; Emma Gray; Sarah E Hunt; Henio G Lacerda; Cordelia Langford; Richard Pearson; Núbia N Pontes; Madhukar Rai; Shri P Singh; Linda Smith; Olivia Sousa; Damjan Vukcevic; Elvira Bramon; Matthew A Brown; Juan P Casas; Aiden Corvin; Audrey Duncanson; Janusz Jankowski; Hugh S Markus; Christopher G Mathew; Colin N A Palmer; Robert Plomin; Anna Rautanen; Stephen J Sawcer; Richard C Trembath; Ananth C Viswanathan; Nicholas W Wood; Mary E Wilson; Panos Deloukas; Leena Peltonen; Frank Christiansen; Campbell Witt; Selma M B Jeronimo; Shyam Sundar; Chris C A Spencer; Jenefer M Blackwell; Peter Donnelly|
|Type: Journal Article; Research Support, N.I.H., Extramural; Research Support, Non-U.S. Gov't Date: 2013-01-06|
|Title: Nature genetics Volume: 45 ISSN: 1546-1718 ISO Abbreviation: Nat. Genet. Publication Date: 2013 Feb|
|Created Date: 2013-01-29 Completed Date: 2013-03-26 Revised Date: 2014-11-11|
Medline Journal Info:
|Nlm Unique ID: 9216904 Medline TA: Nat Genet Country: United States|
|Languages: eng Pagination: 208-13 Citation Subset: IM|
Electrophoresis, Agar Gel
Genetic Predisposition to Disease / genetics*
Genome-Wide Association Study
HLA-DQ alpha-Chains / genetics*
HLA-DRB1 Chains / genetics*
Haplotypes / genetics
Leishmaniasis, Visceral / genetics*
Oligonucleotide Array Sequence Analysis
Polymorphism, Single Nucleotide / genetics
|074196//Wellcome Trust; 074196/Z/04/Z//Wellcome Trust; 075491/Z/04/B//Wellcome Trust; 083948//Wellcome Trust; 084694//Wellcome Trust; 085475//Wellcome Trust; 085475/B/08/Z//Wellcome Trust; 085475/Z/08/Z//Wellcome Trust; 087436//Wellcome Trust; 087436/Z/10/Z//Wellcome Trust; 089698//Wellcome Trust; 090532//Wellcome Trust; 090532/Z/09/Z//Wellcome Trust; 095552//Wellcome Trust; 097364/Z/11/Z//Wellcome Trust; P50 AI-30639/AI/NIAID NIH HHS; P50 AI030639/AI/NIAID NIH HHS; P50 AI074321/AI/NIAID NIH HHS; P50AI074321/AI/NIAID NIH HHS; R01 AI048822/AI/NIAID NIH HHS; R01 AI048822/AI/NIAID NIH HHS; R01 AI076233/AI/NIAID NIH HHS; R01 AI076233/AI/NIAID NIH HHS|
|0/HLA-DQ alpha-Chains; 0/HLA-DQA1 antigen; 0/HLA-DRB1 Chains|
Journal ID (nlm-journal-id): 9216904
Journal ID (pubmed-jr-id): 2419
nihms-submitted publication date: Day: 10 Month: 5 Year: 2013
Electronic publication date: Day: 06 Month: 1 Year: 2013
Print publication date: Month: 2 Year: 2013
pmc-release publication date: Day: 01 Month: 8 Year: 2013
Volume: 45 Issue: 2
First Page: 208 Last Page: 213
|Common variants in the HLA-DRB1-HLA-DQA1 Class II region are associated with susceptibility to visceral leishmaniasis|
|The LeishGEN Consortium & The Wellcome Trust Case Control Consortium 2|
|Heather J. Cordell3|
|E. Nancy Miller1χ|
|Gloria R. Monteiro5|
|Sarah E. Hunt6|
|Henio G. Lacerda7|
|Núbia N. Pontes5|
|Matthew A. Brown10|
|Juan P. Casas11|
|Hugh S. Markus15|
|Christopher G. Mathew16|
|Colin N.A. Palmer17|
|Stephen J. Sawcer19|
|Richard C. Trembath20|
|Ananth C. Viswanathan21|
|Nicholas W. Wood22|
|Mary E. Wilson23|
|Selma M.B. Jeronimo5|
|Chris C.A. Spencer2|
|Jenefer M. Blackwell124†|
1Cambridge Institute for Medical Research, University of Cambridge School of Clinical Medicine, Addenbrooke’s Hospital, Cambridge CB2 0XY, UK
2Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN, UK
3Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne NE1 3BZ, UK
4Institute of Medical Sciences, Banaras Hindu University, Varanasi – 221 005, India
5Department of Biochemistry, Center for Biosciences, Universidade Federal do Rio Grande do Norte, Natal, RN 59078-970, Brazil
6Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, UK
7Department of Infectious Diseases, Universidade Federal do Rio Grande do Norte, Natal, RN 59078-970, Brazil
8Department of Clinical Immunology, Royal Perth Hospital, Wellington Street, Perth, Western Australia 6000, Australia
9NIHR Biomedical Research Centre for Mental Health at the South London and Maudsley NHS Foundation Trust and Institute of Psychiatry Kings College London, London SE5 8AF, UK
10University of Queensland Diamantina Institute, Princess Alexandra Hospital, University of Queensland, Brisbane, Australia
11Department of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
12Neuropsychiatric Genetics Research Group, Institute of Molecular Medicine, Trinity College Dublin, Dublin 2, Eire
13Molecular and Physiological Sciences, The Wellcome Trust, London NW1 2BE, UK
14Department of Oncology, University of Oxford, Oxford OX3 7DG, UK
15Stroke and Dementia Research group, St George’s University of London, London, SW17 0RE, UK
16Department of Medical and Molecular Genetics, King’s College London School of Medicine, Guy’s Hospital, London SE1 9RT, UK
17Medical Research Institute, University of Dundee, Ninewells Hospital and Medical School, Dundee DD1 9SY, UK
18Social, Genetic and Developmental Psychiatry Centre, King’s College London Institute of Psychiatry, Denmark Hill, London SE5 8AF, UK
19Department of Clinical Neurosciences, University of Cambridge, Addenbrooke’s Hospital, Cambridge CB2 0QQ, UK
20Oxford Centre for Diabetes, Endocrinology and Metabolism (OCDEM), Churchill Hospital, Oxford OX3 7LJ, UK
21National Institutes of Health Research (NIHR) Biomedical Centre for Ophthalmology at Moorfields Eye Hospital NHS Foundation Trust, City Road, London EC1V 2PD and UCL Institute of Ophthalmology, 11-43 Bath Street, London EC1V 9EL, UK
22Department of Molecular Neuroscience, Institute of Neurology, Queen Square, London WC1N 3BG, UK
23Department of Internal Medicine, University of Iowa and the VA Medical Center, Iowa City, IA 52242, USA
24Telethon Institute for Child Health Research, Centre for Child Health Research, The University of Western Australia, Subiaco, Western Australia 6008, Australia
25Department of Statistics, University of Oxford, Oxford OX1 3TG, UK
|Correspondence should be addressed to P.D. (firstname.lastname@example.org) or J.M.B. (email@example.com)
1A full list of authors and affiliations appears at the end of the paper.
2Full membership of both consortia is provided in the Supplementary Materials.
*These authors contributed equally to this work
†These authors jointly supervised this work.
AUTHOR CONTRIBUTIONS M.F., E.N.M., A.M., S.M., G.R.M., H.G.L., N.N.P., M.R., S.P.S., O.S., M.E.W., S.M.B.J., S.S. and J.M.B. oversaw cohort collections for LeishGEN. The WTCCC2 DNA, genotyping, data quality control and informatics group (S.D., S.E., E. Gray, S.E.H. and C.L.) executed GWAS sample handling, genotyping and quality control. The WTCCC2 Management Committee (P. Donnelly, J.M.B., E.B., M.A.B., J.P.C., A.C., P. Deloukas, A.D., J.J., H.S.M., C.G.M., C.N.A.P., R.P., A.R., S.J.S., R.C.T., A.C.V., L.P. and N.W.W.) monitored the execution of the GWAS. A.S., M.F., H.J.C., M.P., Z.S., G.B., C.B., C.F., E. Giannoulatou, R.P., D.V. and C.C.A.S. performed statistical analyses. L.S., F.C. and C.W. oversaw HLA typing and interpretation. A.S., M.F., H.J.C., C.C.A.S., J.M.B. and P. Donnelly contributed to writing the manuscript. All authors reviewed the final manuscript.
Leishmania are protozoan parasites that live in macrophages. They are transmitted by sand flies and cause severe and debilitating cutaneous, as well as fatal visceral disease in subtropical/tropical regions of Old and New Worlds. Leishmaniasis is classified by WHO as one of the neglected tropical diseases. It affects 12 million people and there are an estimated 1.5 million new cases annually1. Of these, 500,000 are cases of potentially fatal visceral leishmaniasis caused by the Leishmania donovani complex, 90% of which occurs in three foci in India/Bangladesh/Nepal, Sudan, and Brazil. Skin-tests and lymphocyte responses indicate that only 1 in 5-10 infected individuals develop clinical disease2-4. The importance of host genetic factors is indicated by familial clustering5 and high sibling risk ratios6. However, human genetic studies undertaken to date (reviewed7-9) provide inconsistent results. As part of the Wellcome Trust Case Control Consortium 2 (WTCCC2) study of 15 complex disorders and traits, we report the first genome-wide association study (GWAS) of visceral leishmaniasis across major foci of disease caused by L. donovani in India and L. infantum chagasi in Brazil.
Subjects for the India discovery GWAS (Supplementary Table 1, online methods) were recruited from Bihar state in northeast India. Patients and controls were matched for self-reported age, sex, religion, caste and geographic region of recruitment. The Brazilian family-based sample was collected as part of the Belém Family Study and from study sites near Natal6,10,11 (Supplementary Table 1, online methods).
All individuals were genotyped at the Wellcome Trust Sanger Institute on the custom Illumina Human660W-Quad chip (online methods). After performing stringent quality control (QC) procedures12 (online methods), the Indian discovery dataset comprised 2078 individuals (989 cases and 1089 controls) genotyped at 526,731 SNPs and the Brazilian discovery dataset comprised 1970 individuals (357 cases; 308 families) genotyped at 553,323 SNPs.
As ancestry differences and close relationships between case and control individuals can confound association studies, we used a variance components method (sometimes referred to as a mixed model), which models the pair-wise relatedness between individuals to account for population structure in the data13. This approach accounts for relatedness on different scales, from close relatives to distant ancestral structure, and it can thus be applied both to the case-control samples in India and the family data from Brazil. Similar models have been recently used in several other association studies14-18. An advantage of our implementation of the mixed model is that we are able to estimate the effect sizes on the log-odds scale. Using this method, the genomic inflation factor (λ) was 1.03 for the India analysis and 1.02 for the Brazil analysis, showing that it successfully handled any population structure or relatedness in the data. The mixed model approach was used for all analyses presented here, for both discovery and replication data and in further dissection of association signals.
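As an illustration of the genomic inflation factor reported above, the following minimal Python sketch computes λ as the ratio of the median observed 1-d.f. chi-square statistic to its expected median under the null. The function name and the toy P values are hypothetical and are not taken from the study's pipeline.

```python
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 d.f. (about 0.455),
# obtained as the square of the 75th percentile of the standard normal.
_EXPECTED_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / expected null median.

    p_values: two-sided association P values, each strictly between 0 and 1.
    """
    # Convert each two-sided P value to a 1-d.f. chi-square statistic.
    chi2 = [_STD_NORMAL.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / _EXPECTED_MEDIAN


if __name__ == "__main__":
    # Hypothetical, evenly spaced P values standing in for a well-calibrated scan.
    n = 1000
    toy_p = [(i + 0.5) / n for i in range(n)]
    print(round(genomic_inflation(toy_p), 3))
```

Values of λ close to 1, such as the 1.03 and 1.02 reported here, indicate that the test statistics are not systematically inflated.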
Previous studies attempting to identify the genetic components of visceral leishmaniasis susceptibility have typically been small, underpowered, family studies or candidate gene studies19-23. We present the results from our discovery GWAS and replication data (see below) at these previously identified loci in Table 1. None of the loci previously found to be associated with visceral leishmaniasis consistently show an association with visceral leishmaniasis in our three datasets.
Because different pathogen species are responsible for disease in India and Brazil, we first examined each GWAS separately. Our primary focus is on regions showing association in both GWAS. This is appropriate if similar host genetic factors affect susceptibility to the two different parasite species. Observing the same association in different populations, with different study designs (population-based and family-based association, respectively), also adds to the robustness of potential findings.
We combined the India and Brazil GWAS through a fixed effects meta-analysis (Fig. 1c). Two regions had a Pcombined <10⁻⁶, one of which was the HLA-DRB1-HLA-DQA1 class II region (Supplementary Table 4). Details of the regions showing association (P<10⁻⁵) in only one of the two studies are shown in Fig. 1 and Supplementary Tables 2, 3 and 4. Apart from the MHC, none of the association regions with P <10⁻⁵ in the separate India or Brazil analysis showed significant association in the combined analysis.
A second Indian cohort was used in a replication analysis of the findings from the two discovery GWAS. These samples were genotyped on the Immunochip 24, a purpose-designed Illumina chip, some content of which was chosen by WTCCC2 for deep replication of its GWAS findings. After QC (online methods), the Indian replication dataset comprised 1931 individuals (941 visceral leishmaniasis cases and 990 controls). Outside of the MHC, none of the regions that were associated at P<10⁻⁵ in either the separate Indian or Brazilian GWAS or the combined discovery meta-analysis replicated at P<0.1 in the Indian replication data (Supplementary Tables 2, 3 and 4). However, only four of the regions associated in the Brazil GWAS were available on the Immunochip.
The MHC was the only region that replicated in all three cohorts (rs9271858, Pcombined=2.76×10⁻¹⁷ and OR(95%CI)=1.41(1.30-1.52)). Interestingly, while there is a shared signal at the MHC in both discovery GWAS, which clearly replicates, there are also differences between the India and Brazil results (Figure 2, Supplementary Table 4).
In the India discovery GWAS, the two SNPs most associated with visceral leishmaniasis in the MHC were in strong LD (rs9271842 and rs9271858, r²=1 in Indian controls). Only one of these SNPs was present in the replication data (rs9271858); therefore, for convenience, we focus below on this SNP and refer to it as the top SNP. This SNP is also significant in the Brazil data (PBrazil=2.04×10⁻⁴), but conversely the top MHC signal in the Brazil data (rs9268878) was not significant in the India data (PIndia=0.71, PReplication=0.58 at rs4428528, r²=1). The top MHC SNP signal in Brazil is 163.6 kb upstream of the India signal, and is in almost complete linkage equilibrium with the top India SNP (r²=0.06 in the Brazil founders). Further replication in a Brazilian dataset is required to assess this additional MHC signal: it could be a population or parasite species-specific association, or alternatively a false-positive finding.
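The r² values quoted above measure pairwise linkage disequilibrium between two SNPs. A minimal sketch of how r² can be computed from phased two-SNP haplotypes follows; the function name and the toy haplotypes are hypothetical, and the study itself does not state which software produced these estimates.

```python
def ld_r_squared(haplotypes, allele_a, allele_b):
    """Return r^2 between two biallelic SNPs from phased haplotypes.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, one per chromosome.
    allele_a, allele_b: the allele tracked at SNP 1 and SNP 2, respectively.
    """
    n = len(haplotypes)
    # Allele and haplotype frequencies estimated by counting chromosomes.
    p_a = sum(1 for h in haplotypes if h[0] == allele_a) / n
    p_b = sum(1 for h in haplotypes if h[1] == allele_b) / n
    p_ab = sum(1 for h in haplotypes if h == (allele_a, allele_b)) / n
    # D is the deviation of the haplotype frequency from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


if __name__ == "__main__":
    # Toy data: the two SNPs always travel together, so r^2 should be 1.
    toy = [("G", "A")] * 3 + [("A", "G")] * 5
    print(ld_r_squared(toy, "G", "A"))
```

An r² of 1 means one SNP fully predicts the other in the sample, whereas a value near 0 (such as the 0.06 reported in the Brazil founders) means the two SNPs are close to independent.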
In the Indian discovery cohort, when conditioning on the top SNP from the Indian discovery GWAS (rs9271858), there was still residual association signal at the MHC; two highly correlated SNPs both had P<10⁻³ (rs9271252 and rs9271255, r²=1 in Indian controls). For convenience, because it is typed in the discovery and replication samples, we focus in what follows on one of these, rs9271255, with a P value of 8.79×10⁻⁴ in the conditional analysis. This SNP is not highly correlated with rs9271858 (r²=0.21 in Indian controls). When conditioning on both rs9271858 and rs9271255, no association signal remains with P <10⁻³. In the India replication data, when conditioning on rs9271858, rs9271255 is still significant (P=5.72×10⁻³). This conditional analysis was repeated in the Brazil discovery data; when conditioning on rs9271858, rs9271255 was significant with a P value of 1.91×10⁻⁵.
We turn now to a more detailed examination of this replicated MHC signal (at rs9271858 and rs9271255). We focus on the Indian data, for which the Indian replication cohort is from a comparable population and can therefore be used to replicate cohort specific association signals. To further explore the Indian MHC signal, we estimated phase for the SNP genotypes for all individuals in the Indian discovery data. We subsequently fitted a model to the phased haplotypes with different risk parameters for the four haplotypes defined by the phased genotypes at the two aforementioned SNPs, rs9271858 and rs9271255. After additional exploration of the model (Supplementary Results), it was apparent that combining two of the four haplotypes, to give a model with three risk classes, fitted the data as well as the model with four risk classes (Supplementary Figure 1), and also gave a better fit than the two-SNP model (Supplementary Results).
At rs9271858 and rs9271255 respectively, the haplotype with highest risk is defined by the alleles GA and the haplotype with intermediate risk by AA, with the haplotypes AG and GG being protective relative to the other two haplotypes, and having statistically indistinguishable risk. This model with three risk classes had P=1.65×10⁻¹¹ in the Indian discovery data, and where the protective haplotypes are set as the baseline with OR=1, the intermediate haplotype OR(95%CI)=1.28(1.09-1.50) and the highest risk haplotype OR(95%CI)=1.72(1.48-2.00). This analysis replicated in the Immunochip data: the model with three risk classes was significant at P=4.78×10⁻⁶, where the intermediate haplotype OR(95%CI)=1.19(1.00-1.40) and the highest risk haplotype OR(95%CI)=1.50(1.27-1.76).
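The three-class model described above can be sketched as a simple lookup from two-SNP haplotype to risk class, together with a helper that recovers the log odds ratio and its approximate standard error from a reported OR and 95% CI. This is an illustrative sketch only: the names `RISK_CLASS` and `log_or_and_se` are hypothetical, and the standard error calculation assumes a symmetric Wald-type interval on the log scale, which the text does not state explicitly.

```python
import math

# Alleles are listed in the order (rs9271858, rs9271255), as in the text.
RISK_CLASS = {
    ("G", "A"): "high",
    ("A", "A"): "intermediate",
    ("A", "G"): "protective",  # baseline class, OR = 1
    ("G", "G"): "protective",  # baseline class, OR = 1
}


def log_or_and_se(odds_ratio, ci_lower, ci_upper, z_crit=1.96):
    """Return (log OR, approximate SE) from an OR and its 95% CI.

    Assumes the interval was built as exp(beta +/- z_crit * SE).
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_upper) - math.log(ci_lower)) / (2.0 * z_crit)
    return beta, se


if __name__ == "__main__":
    # Discovery estimates quoted in the text for the two non-baseline classes.
    for label, (or_, lo, hi) in {
        "intermediate": (1.28, 1.09, 1.50),
        "high": (1.72, 1.48, 2.00),
    }.items():
        beta, se = log_or_and_se(or_, lo, hi)
        print(label, round(beta, 3), round(se, 3))
```

Working on the log-odds scale in this way is what makes estimates from different cohorts directly comparable and combinable.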
Our results establish that common polymorphisms in the HLA-DRB1-HLA-DQA1 region of the MHC are genetic risk factors for visceral leishmaniasis, and importantly appear to cross the epidemiological divides of geography and parasite species. For GWAS associations, the top SNPs are not necessarily the causal variants, but are expected to be correlated with the functional variants. There are SNPs in the MHC which are in moderate LD with rs9271858 (r² ~ 0.2) and are located up to 500 kb away (e.g. rs408359, r²=0.19 calculated in the Indian controls). This extensive linkage disequilibrium is common in the MHC, and makes it particularly difficult to assess possible functional candidates within the region of association25.
One possibility is that the association signal could be driven by functional variation in the adjacent highly polymorphic classical HLA class II genes, DRB1 and DQB1, each of which is a natural immunological candidate. The classical alleles at these loci were genotyped in a subset of individuals in both of the discovery cohorts, and phased (statistically) onto the haplotypes estimated from all the SNP data (see Supplementary Results). In this phased data set, chromosomes carrying the protective haplotype at rs9271858 and rs9271255 always carried one of the three classical DRB1 alleles *15, *16 and *01, and vice versa (Figure 3, Supplementary Table 5). We note that the DRB1*01 and *16 alleles are very rare in the Indian data (Supplementary Table 5), so we have limited power to assess their possible role. This correlation was also perfect in the 142 phased Brazil discovery sample chromosomes (Online Methods), where all of the chromosomes carrying the protective haplotype defined by rs9271858 and rs9271255 also carried one of DRB1*15, *16 or *01 (Supplementary Table 5, Supplementary Figure 2). This lends support to the possibility that the association signal could be driven by classical HLA alleles, and to the biological importance of endogenous processing and presentation of antigen from infected macrophages and dendritic cells to CD4 T cells that drive the immune response in visceral leishmaniasis26,27. Nevertheless, additional research will be needed to establish whether or not these DRB1 alleles are themselves functionally involved in protection against visceral leishmaniasis, or merely correlated with an as-yet uncharacterised functional variant. Further details of the HLA analysis are presented in the Supplementary Materials.
Previous studies of HLA associations in human leishmaniasis have provided confusing and often contradictory results generally compromised by small sample size and lack of power (reviewed9). One small study did find protection associated with DRB1*15/*16 for visceral leishmaniasis caused by Old World L. infantum28, the same parasite species that causes visceral leishmaniasis in Brazil29. Protection has also been associated with DR2 (=DRB1*15/*16) in small studies of cutaneous and mucocutaneous forms of leishmaniasis caused by L. mexicana30 and L. braziliensis31,32, respectively.
There have been relatively few successful GWAS of infectious disease susceptibility33-36. Our GWAS has identified HLA class II region polymorphisms as significant genetic risk factors for visceral leishmaniasis, with the risk associated with the high-risk haplotype being high by GWAS standards: OR(95%CI) estimated from replication data of 1.50(1.27-1.76). Importantly, genetic variation in the MHC has not been found to be associated with all infectious diseases. Our observations focus attention on determining the precise nature of HLA class II variation in determining visceral leishmaniasis susceptibility and offer the potential to contribute to development of novel strategies for disease control.
Supplementary Table 1 provides characteristics of samples. In India, discovery and replication sets comprised cases and controls from villages located within a radius of ~100 km from the city of Muzaffarpur covering the districts of Muzaffarpur, Vaishali, Samastipur, Saran, Sheohar, East Champaran and Sitamarhi in Bihar State in northeast India. Further epidemiological and demographic details relating to the study site are described elsewhere37. An annual incidence rate of 2.49 clinical visceral leishmaniasis cases/1,000 persons has been reported in the region37. L. donovani sensu stricto (zymodeme MON-2) was confirmed as the causative agent of visceral leishmaniasis in the study region, in accordance with other reports on clinical isolates from kala-azar patients in the state of Bihar38-41. The controls had no history of visceral leishmaniasis or a family history of visceral leishmaniasis among first-, second-, or third-degree relatives. Informed written consent in Hindi was obtained from all participating individuals and from parents of children under 18 years old. Approval for the study was provided by the Ethical Committee of the Institute of Medical Sciences, Banaras Hindu University, Varanasi, India.
From northeast Brazil, 64 families were collected during 1991-1994 as part of the Belém Family Study6,42. These families were ascertained from medical records of the Fundação Nacional de Saúde in the States of Pará, Maranhão and Piauí based on data from the 1983-85 and 1993-94 epidemics. A further 244 families were collected from study sites near Natal where transmission of the parasite is focal and transient, allowing identification of neighbourhoods with ongoing or recent transmission. These families were ascertained from medical records of the Fundação Nacional de Saúde in Rio Grande do Norte. All families were of equivalent socioeconomic status. The causative agent of visceral leishmaniasis was confirmed as L. infantum chagasi in a subset (~10% over 4 States) of visceral leishmaniasis patients6. Further clinical, epidemiological, family structure and demographic information relating to the families is described elsewhere6,10,21,42,43.
Ethical approval for the Belém Family Study was obtained originally from the local ethics committee at the Instituto Evandro Chagas, Belém, Pará, Brazil6,42. Approval for continued use of the Belém Family Study samples, and for collection and use of the samples from Natal, has been granted from the local Institutional Review Board at the Universidade Federal do Rio Grande do Norte (CEP-UFRN 94-2004), nationally from the Comissão Nacional de Ética em Pesquisa (CONEP: 11019), and from the Ministério da Ciência e Tecnologia for approval to ship samples out of Brazil (portaria 617; 28 September 2005). Informed written consent for sample collection was obtained from adults, and from parents of children <18 years old. The Brazilian populations studied are long-term (>200 years) admixtures of Caucasian, Negroid and Native Indian ethnic backgrounds, as confirmed in recent analysis of the Natal families11.
Individuals from both India and Brazil were classified as affected when they had been diagnosed with clinical visceral leishmaniasis that responded positively to specific anti-leishmanial treatment. Data on sub-clinical disease or asymptomatic infections were not included. Diagnosis of clinical visceral leishmaniasis was made on the basis of clinical, parasitological (splenic aspirates in India; bone marrow aspirates in Brazil) and serological criteria as described elsewhere6,10,21,42,44-46.
For India, non-invasive buccal swabs were collected and genomic DNA was prepared in the Cambridge laboratory. For Brazil, blood was collected by venepuncture from all available family members. Genomic DNA was prepared directly from blood in Natal10, or from Epstein-Barr virus-transformed B cells cultured from fresh or cryopreserved peripheral blood mononuclear cells in Cambridge21. Genomic DNA for all cases was shipped to the Sanger Institute, Cambridge. Quality was validated using the Sequenom iPLEX assay designed to genotype 4 gender SNPs and 26 SNPs present on the Illumina Beadchips. DNA concentrations were quantified using a PicoGreen assay (Invitrogen) and an aliquot was assayed by agarose gel electrophoresis. A DNA sample was considered to pass quality control if the DNA concentration was greater than or equal to 50 ng/μl, the DNA was not degraded, the gender assignment from the iPLEX assay matched that provided in the patient data manifest and genotypes were obtained for at least two thirds of the SNPs on the iPLEX.
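The four pass criteria for a DNA sample can be expressed as a single predicate. The sketch below is illustrative only: the record type `DnaSample`, its field names and the default of 30 iPLEX SNPs (4 gender SNPs plus 26 BeadChip SNPs, assuming both sets count toward the two-thirds rule) are assumptions, not details confirmed by the text.

```python
from dataclasses import dataclass


@dataclass
class DnaSample:
    """Hypothetical record of the measurements used for DNA sample QC."""
    concentration_ng_per_ul: float
    degraded: bool              # judged from the agarose gel
    iplex_gender: str           # gender inferred from the iPLEX gender SNPs
    manifest_gender: str        # gender recorded in the patient data manifest
    iplex_snps_called: int
    iplex_snps_total: int = 30  # assumption: 4 gender SNPs + 26 BeadChip SNPs


def passes_dna_qc(sample: DnaSample) -> bool:
    """Apply the four pass criteria described in the text."""
    enough_dna = sample.concentration_ng_per_ul >= 50.0
    gender_ok = sample.iplex_gender == sample.manifest_gender
    # "At least two thirds" checked with integers to avoid rounding issues.
    enough_calls = sample.iplex_snps_called * 3 >= sample.iplex_snps_total * 2
    return enough_dna and not sample.degraded and gender_ok and enough_calls


if __name__ == "__main__":
    print(passes_dna_qc(DnaSample(62.0, False, "F", "F", 28)))
```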
Indian discovery cases and controls and the Brazil families were genotyped at the Sanger Institute on the Illumina Infinium platform using the Human660-Quad, a custom chip designed by WTCCC2 and comprising Human550 supplemented with 60,000 additional probes that were intended to allow the genotyping of common CNVs from the Structural Variation Consortium47. Replication genotyping for India was carried out at the Sanger Institute using the Illumina Immunochip, a custom chip designed by WTCCC2 and the Immunochip Consortium and comprising 196,524 SNPs. For both chips, bead intensity data was processed and normalized for each sample in BeadStudio; data for successfully genotyped samples was extracted and genotypes called using Illuminus.
SNPs. SNPs were excluded if the Fisher information for the allele frequency was not close to unity (information <0.98), if the minor allele frequency (MAF) was very low (defined as <0.01%), if the missingness was >0.02, or for extreme departures from Hardy-Weinberg equilibrium (HWE P-value <1×10⁻²⁰). After applying the above filters in the Indian discovery cohort, 526,731 autosomal SNPs remained for further analysis. For the Brazilian families, the above filters left 553,323 autosomal SNPs. Of these, 521,134 SNPs passed QC in both discovery cohorts. Cluster plots of the hit SNPs described in this study are shown in Supplementary Figure 3.
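The SNP exclusion rules above amount to four threshold checks per SNP. A minimal sketch follows, assuming per-SNP summary statistics have already been computed; the `SnpStats` type and its field names are hypothetical, and MAF is expressed as a fraction, so the 0.01% threshold becomes 0.0001.

```python
from dataclasses import dataclass

# Thresholds as stated in the text.
MIN_INFO = 0.98         # Fisher information for the allele frequency
MIN_MAF = 0.0001        # 0.01% expressed as a fraction
MAX_MISSINGNESS = 0.02
MIN_HWE_P = 1e-20


@dataclass
class SnpStats:
    """Hypothetical per-SNP summary statistics."""
    info: float
    maf: float
    missingness: float
    hwe_p: float


def passes_snp_qc(snp: SnpStats) -> bool:
    """Return True if the SNP survives all four exclusion filters."""
    return (
        snp.info >= MIN_INFO
        and snp.maf >= MIN_MAF
        and snp.missingness <= MAX_MISSINGNESS
        and snp.hwe_p >= MIN_HWE_P
    )


if __name__ == "__main__":
    toy = [
        SnpStats(info=0.99, maf=0.25, missingness=0.001, hwe_p=0.4),
        SnpStats(info=0.95, maf=0.25, missingness=0.001, hwe_p=0.4),
    ]
    print([passes_snp_qc(s) for s in toy])
```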
For quality control of samples typed on the Indian and Brazilian Illumina Human660 quad data, a Bayesian clustering method was used to infer and exclude outlying individuals on the basis of call rate, heterozygosity and signal intensity12,48. To remove signal intensity outliers observed for raw intensity data, the difference between the A channel intensity and the B channel intensity was averaged over all SNPs on autosomes for each sample. A similar approach was used taking intensity measures from the A channel on the non-pseudo autosomal X chromosomes to identify outliers and infer gender for all Indian and Brazilian samples. Samples were removed if their inferred gender was discordant with the recorded gender after cross-checking with original database entries, or if less than 90% of the SNPs typed by Sequenom on entry to sample handling (see above) agreed with the genome-wide data. Duplicated individuals were also excluded. For the Indian discovery sample a total of 121/2199 individuals were removed following these quality control checks. For the Indian replication sample 111/2042 individuals were removed. For the Brazilian family study, additional quality control checks were carried out in order to verify (and if necessary adjust) the expected pedigree relationships between the individuals. Genome-wide identity by descent (IBD) sharing probabilities for all pairs of individuals were estimated using a subset of 11,177 autosomal markers using the software PLINK49. These estimates were used to assist in pedigree checking and resulted in 49 additional post quality control individuals being removed due to unresolved pedigree relationship discrepancies. In total, for the Brazilian family study, 189/2159 individuals were removed.
We carried out association analyses using a novel13 variance components method (a linear mixed model, similar to Kang et al.17). The linear mixed model explicitly accounts for correlations in individuals’ phenotypes due to their relatedness through specification of the phenotypic variance/covariance matrix in terms of parameters representing underlying genetic and environmental components of variance (estimated during the model fitting procedure) and pairwise relatedness measures (kinship coefficients), estimated prior to model fitting on the basis of the genome-wide genotype data. In this analysis we used a standard linear mixed model
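For orientation, a standard form of such a linear mixed model is sketched below. The notation is assumed here for illustration and is not taken from the source: y is the vector of phenotypes, X the matrix of fixed-effect covariates (including the SNP being tested), β the fixed effects, Φ the matrix of pairwise kinship coefficients, and σ_g² and σ_e² the genetic and environmental variance components.

```latex
\begin{aligned}
\mathbf{y} &= X\boldsymbol{\beta} + \mathbf{g} + \boldsymbol{\varepsilon}, \\
\mathbf{g} &\sim \mathcal{N}\!\left(\mathbf{0},\; 2\sigma_g^{2}\,\Phi\right), \qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0},\; \sigma_e^{2} I\right), \\
\operatorname{Var}(\mathbf{y}) &= 2\sigma_g^{2}\,\Phi + \sigma_e^{2} I .
\end{aligned}
```

Under this form, relatedness enters only through Φ, so the same model covers both close relatives and distant shared ancestry.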
Although use of this linear mixed model was originally proposed for pedigrees with known relationships50-53, recently this approach has gained popularity for use with samples of unknown or uncertain relationship13,17,18,54, including apparently unrelated samples who may nevertheless display distant levels of common ancestry. For this purpose, the pairwise kinship coefficients modelling either close or distant relatedness are estimated (prior to fitting the linear mixed model) on the basis of genome-wide genotype data, rather than being fixed at known theoretical values.
Computational considerations have led to the development of several faster approximations for constructing tests of the fixed effects of interest in the above linear mixed model15-17,53,55,56. These approximate tests have been implemented in various software packages including EMMAX, TASSEL, FaST-LMM and GenABEL. For the analyses presented here, we used our own implementation of the linear mixed model developed for a previous study13 that has the advantage, in common with the recently-developed GEMMA package57, of fitting the full (rather than an approximate) model, which in principle can lead to a small increase in power, depending on the underlying true model. Our implementation also allows a transformation between the parameters of the linear model shown above and the logistic model (which is the usual model for case/control data), allowing the convenient estimation of effects on the log odds scale and the generation of resulting odds ratio estimates. We confirmed that the Indian discovery results did not change significantly after the inclusion of the first 10 PCs in the linear mixed model (data not shown).
Combined P-values across Indian (discovery and replication) and Brazilian datasets were calculated in R using an inverse variance fixed effects meta-analysis.
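The combined analysis was run in R; as a hedged illustration of the calculation, here is a minimal Python sketch of an inverse-variance fixed-effects meta-analysis. The function name `fixed_effects_meta` and the per-cohort effect sizes below are hypothetical and are not values from this study.

```python
import math

def fixed_effects_meta(betas, ses):
    """Inverse-variance fixed-effects combination of per-cohort log odds ratios.

    betas: per-cohort effect estimates (log odds ratios).
    ses:   their standard errors.
    Returns the pooled estimate, its standard error and a two-sided P-value.
    """
    weights = [1.0 / se ** 2 for se in ses]   # weight = 1 / variance
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))      # two-sided normal P-value
    return beta, se, p

# Hypothetical inputs for two cohorts (not taken from the paper).
beta, se, p = fixed_effects_meta([0.30, 0.40], [0.10, 0.10])
odds_ratio = math.exp(beta)
ci_low = math.exp(beta - 1.96 * se)
ci_high = math.exp(beta + 1.96 * se)
```

Cohorts with smaller standard errors receive more weight, so the pooled estimate is pulled towards the most precisely estimated effect.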
In addition, for the Brazilian family study we verified our results using two complementary alternative methods implemented in the software packages FBAT58 and ROADTRIPS59. Results were comparable across the different methods (data not shown).
The HLA DRB1 and DQB1 typing of 100 selected Indian discovery individuals was performed using a sequencing-based method60. To select these individuals, SNPs in the class II region DRA-QB2 (32,514,320 to 32,840,188 NCBI Build 36) were first pruned in PLINK49 to generate a set of 78 SNPs in approximate linkage equilibrium (r2<0.8) and with MAF>0.1. LD blocks across these 78 SNPs were then visualised using Haploview61 and the top 59 haplotype tagging SNPs were selected and used to generate haplotypes for all 1866 unrelated individuals in fastPHASE62. The subset of 100 individuals selected for sequence-based HLA typing represented the 135 most common 59-SNP haplotypes in the region. Automated sequencing was carried out on ABI Prism 3730 or 3730xl Genetic Analysers (Applied Biosystems; Foster City, CA, USA) and HLA-DRB1 analysis was carried out using ASSIGNV188.8.131.52 (Conexio Genomics; Applecross, WA, Australia). Where possible, individuals were assigned DRB1 or DQB1 specificities to the 4-digit level63, i.e. to a specific HLA protein encoded on each chromosome (for example, DRB1*1501, where *1501 is a specific functional DRB1 protein). Where this was not possible, individuals were assigned to a 2-digit level allele group, e.g. HLA-DRB1*15 indicates an allele group for which all specific functional DRB1 proteins share a common ancestry.
HLA-DRB1 and DQB1 typing was further performed in a subset of 71 unrelated individuals from the Brazilian families using the same sequence-based method as described above for the Indian samples.
SNPs in the MHC region and HLA-DRB1 and -DQB1 alleles were phased using IMPUTE264 and PHASE65. The Indian discovery and Brazilian replication cohorts were treated separately throughout the phasing pipeline.
For the Indian discovery data, IMPUTE264 was used to phase all the SNPs on chromosome 6 for 1866 individuals, using HapMap3 populations as the haploid reference panel. For the Brazilian data, 5,667 SNPs in the MHC (20,733,613-41,984,313 bp) were phased. The phasing of the Brazilian data was performed in two stages, firstly with an unrelated set of 498 parents and secondly with an unrelated set of 268 offspring, totalling 766 individuals. The trio relationships were then used to check the accuracy of both SNP and HLA allele phasing.
Treating the IMPUTE2 SNP phasing as fixed, PHASE65 was then used to phase the multi-allelic HLA-DRB1 and HLA-DQB1 alleles onto the known HLA region SNP haplotypes, specifying known phase at all of the SNPs and the PHASE options -MR and -d1, and selecting only haplotypes phased at probability >0.9. The HLA alleles were given the following positions: HLA-DRB1 was set at 32,597,662 bp, between rs28756238 and rs41546317, and HLA-DQB1 was set at 32,735,635 bp, between rs28724231 and rs1063355. Both HLA-DRB1 and HLA-DQB1 alleles were phased in the Indian data, and HLA-DRB1 was phased in the Brazilian data.
After QC of the PHASE data, in the Indian discovery set there was a total of 3,732 SNP haplotypes, 154 with HLA-DRB1 alleles and 153 with HLA-DQB1 alleles. Of these, 112 were phased to 4-digit level for both DRB1 and DQB1. In the Brazilian dataset there was a total of 1,481 SNP haplotypes, 142 of which were phased with 2-digit HLA-DRB1 alleles.
Haplotype associations were investigated by applying two methods: a conditional analysis (see Results) and a Bayesian approach. Both methods reached the same result, each fitting a three-haplotype model of risk.
GENECLUSTER, which adopts a Bayesian approach66, was used to look for primary and secondary association signals at known and putative SNPs. This method analyses the genealogy of the case-control sample to find evidence for causal mutation(s). It uses the genealogy of a reference haplotype panel (in this case the HapMap3 GIH haplotypes) to approximate the genealogy of the case-control sample at positions across the region by clustering the case-control haplotypes under the leaves of the reference genealogy. Placing a disease mutation on a branch of the reference genealogy segregates the haplotypes at the leaves into two different risk groups: those case-control haplotypes that fall under the disease mutation (and therefore carry the mutation) and those that do not. For a given mutation in the tree, GENECLUSTER carries out a Bayesian test of association. The model can also be extended to two disease mutations in the genealogy. For these data, the genealogical tree was estimated on chromosome 6 at 32,678,817 bp. The log10(Bayes Factor) for the 2-mutation (three haplotype) model was 7.36, compared to log10(Bayes Factor) 5.42 for a single mutation (two haplotype) model.
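To make the comparison of the two reported Bayes factors concrete, the short Python sketch below converts their difference on the log10 scale into a ratio. It assumes, as an interpretation not stated in the text, that both Bayes factors are computed against the same null model of no association, in which case their ratio compares the two models directly.

```python
# Reported log10(Bayes Factor) values from the GENECLUSTER analysis.
log10_bf_two_mutation = 7.36   # three-haplotype model
log10_bf_one_mutation = 5.42   # two-haplotype model

# Assumption: both Bayes factors share the same null model, so the
# difference in log10 units gives the relative support between models.
log10_ratio = log10_bf_two_mutation - log10_bf_one_mutation
ratio = 10 ** log10_ratio

print(f"log10 ratio = {log10_ratio:.2f}, ratio = {ratio:.0f}")
```

Under that assumption, the data favour the two-mutation model over the single-mutation model by a factor of a little under 100.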
Following the linear mixed model conditional analysis, we phased the SNP data for all individuals in the Indian discovery data and found that the two SNPs associated in the conditional analysis (rs9271858 and rs9271255) tag the haplotypes from the GENECLUSTER analysis. The first mutation perfectly correlated with the protective haplotype and the second mutation separated the risk haplotype into two, which are well tagged by the two-SNP intermediate and risk haplotypes (r2 protective=1, intermediate=0.49, risk=0.60) (overall correlation 0.86).
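As a hedged illustration of how tagging can be quantified, the sketch below computes the squared Pearson correlation between two 0/1 indicator vectors defined over the same haplotypes: one marking membership of a SNP-defined haplotype group, the other marking membership of a genealogy-defined risk group. The function name `tagging_r2` and the toy vectors are hypothetical, and this may differ from the exact calculation used in the study.

```python
import numpy as np

def tagging_r2(snp_group, cluster_group):
    """Squared Pearson correlation between two 0/1 membership vectors."""
    a = np.asarray(snp_group, dtype=float)
    b = np.asarray(cluster_group, dtype=float)
    r = np.corrcoef(a, b)[0, 1]
    return r ** 2

# Toy example (hypothetical): six haplotypes, identical group membership.
perfect = tagging_r2([1, 1, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1])

# Toy example (hypothetical): four haplotypes, partial agreement.
partial = tagging_r2([1, 1, 0, 0], [1, 0, 0, 0])
```

A value of 1 means the SNP-defined group identifies exactly the same haplotypes as the genealogy-defined group; lower values indicate imperfect tagging.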
COMPETING FINANCIAL INTERESTS The authors declare no competing financial interests.
The principal funding for this study was provided by the Wellcome Trust, as part of the Wellcome Trust Case Control Consortium 2 project (085475/B/08/Z and 085475/Z/08/Z). We thank S. Bertrand, J. Bryant, S.L. Clark, J.S. Conquer, T. Dibling, J.C. Eldred, S. Gamble, C. Hind, M.L. Perez, C.R. Stribling, S. Taylor and A. Wilk of the Wellcome Trust Sanger Institute’s Sample and Genotyping Facilities for technical assistance. We thank D. Davison for making available his program ‘Shellfish’ for calculating principal components in large genetic datasets. CCAS is supported by a Wellcome Trust Fellowship (097364/Z/11/Z). HJC is supported by a Wellcome Senior Fellowship in Basic Biomedical Science (087436/Z/10/Z). PD is supported in part by a Royal Society Wolfson Merit Award and work was supported in part by WTCHG core grants 090532/Z/09/Z and 075491/Z/04/B. Collection of samples and epidemiological data, sample preparation, and sequence-based HLA typing was supported by grants from The Wellcome Trust (074196/Z/04/Z and 085475/Z/08/Z to JMB, SS, SMBJ and MEW) and the National Institutes of Health [Tropical Medicine Research Centre award P50AI074321 to SS in India; Tropical Medicine Research Centre award P50 AI-30639 to Dr. Edgar M. Carvalho and SMBJ in Brazil; R01 AI076233 (MEW, JMB); R01 AI048822 (MEW, SMBJ, JMB)]. Special thanks to all subjects who contributed samples, and clinicians and field staff in India and Brazil who helped with recruitment of study subjects.
|1.||Alvar J, et al. Leishmaniasis Worldwide and Global Estimates of Its Incidence. PLoS One. 2012;7:e35671. PMID: 22693548|
|2.||Bucheton B, et al. The interplay between environmental and host factors during an outbreak of visceral leishmaniasis in eastern Sudan. Microbes Infect. 2002;4:1449-1457. PMID: 12475635|
|3.||Sacks DL, Lal SL, Shrivastava SN, Blackwell JM, Neva FA. An analysis of T cell responsiveness in Indian Kala-azar. J. Immunol. 1987;138:908-913. PMID: 3100620|
|4.||Ho M, Siongok TK, Lyerly WH, Smith DH. Prevalence and disease spectrum in a new focus of visceral leishmaniasis in Kenya. Trans R. Soc. Trop. Med. Hyg. 1982;76:741-746. PMID: 6984547|
|5.||Cabello PH, Lima AM, Azevedo ES, Kriger H. Familial aggregation of Leishmania chagasi infection in northeastern Brazil. Am. J. Trop. Med. Hyg. 1995;52:364-365. PMID: 7741179|
|6.||Peacock CS, et al. Genetic epidemiology of visceral leishmaniasis in northeastern Brazil. Genet Epidemiol. 2001;20:383-396. PMID: 11255246|
|7.||Burgner D, Jamieson SE, Blackwell JM. Genetic susceptibility to infectious diseases: big is beautiful, but will bigger be even better? Lancet Infect Dis. 2006;6:653-663. PMID: 17008174|
|8.||Blackwell JM, et al. Genetics and visceral leishmaniasis: of mice and man. Parasite Immunol. 2009;31:254-266. PMID: 19388946|
|9.||Blackwell JM, Jamieson SE, Burgner D. HLA and infectious diseases. Clin. Microbiol. Rev. 2009;22:370-385. PMID: 19366919|
|10.||Jeronimo SM, et al. Genetic predisposition to self-curing infection with the protozoan Leishmania chagasi: a genomewide scan. J Infect Dis. 2007;196:1261-9. PMID: 17955446|
|11.||Ettinger NA, et al. Genetic admixture in Brazilians exposed to infection with Leishmania chagasi. Ann Hum Genet. 2009;73:304-13. PMID: 19397557|
|12.||Bellenguez C, Strange A, Freeman C, Donnelly P, Spencer CC. A robust clustering algorithm for identifying problematic samples in genome-wide association studies. Bioinformatics. 2011|
|13.||Sawcer S, et al. Genetic risk and a primary role for cell-mediated immune mechanisms in multiple sclerosis. Nature. 2011;476:214-9. PMID: 21833088|
|14.||Astle W, Balding DJ. Population Structure and Cryptic Relatedness in Genetic Association Studies. Statistical Science. 2009;24:451-471|
|15.||Lippert C, et al. FaST linear mixed models for genome-wide association studies. Nat Methods. 2011;8:833-5. PMID: 21892150|
|16.||Zhang Z, et al. Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 2010;42:355-60. PMID: 20208535|
|17.||Kang HM, et al. Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 2010;42:348-54. PMID: 20208533|
|18.||Amin N, van Duijn CM, Aulchenko YS. A genomic background based method for association analysis in related individuals. PLoS ONE. 2007;2:e1274. PMID: 18060068|
|19.||Mehrotra S, et al. Genetic and functional evaluation of the role of CXCR1 and CXCR2 in susceptibility to visceral leishmaniasis in north-east India. BMC Med Genet. 2011;12:162. PMID: 22171941|
|20.||Bucheton B, et al. Identification of a novel G245R polymorphism in the IL-2 receptor beta membrane proximal domain associated with human visceral leishmaniasis. Genes Immun. 2007;8:79-83. PMID: 17108990|
|21.||Jamieson SE, et al. Genome-wide scan for visceral leishmaniasis susceptibility genes in Brazil. Genes Immun. 2007;8:84-90. PMID: 17122780|
|22.||Fakiola M, et al. Genetic and functional evidence implicating DLL1 as the gene that influences susceptibility to visceral leishmaniasis at chromosome 6q27. J Infect Dis. 2011;204:467-77. PMID: 21742847|
|23.||Mohamed HS, et al. SLC11A1 (formerly NRAMP1) and susceptibility to visceral leishmaniasis in The Sudan. Eur J Hum Genet. 2004;12:66-74. PMID: 14523377|
|24.||Cortes A, Brown MA. Promise and pitfalls of the Immunochip. Arthritis Res Ther. 2011;13:101. PMID: 21345260|
|25.||de Bakker PI, et al. A high-resolution HLA and SNP haplotype map for disease association studies in the extended human MHC. Nat Genet. 2006;38:1166-72. PMID: 16998491|
|26.||Sundar S, Reed SG, Sharma S, Mehrotra A, Murray HW. Circulating T helper 1 (Th1) cell- and Th2 cell-associated cytokines in Indian patients with visceral leishmaniasis. Am J Trop Med Hyg. 1997;56:522-5. PMID: 9180602|
|27.||Carvalho EM, et al. Immunologic markers of clinical evolution in children recently infected with Leishmania donovani chagasi. J Infect Dis. 1992;165:535-40. PMID: 1347057|
|28.||Meddeb-Garnaoui A, et al. Association analysis of HLA-class II and class III gene polymorphisms in the susceptibility to mediterranean visceral leishmaniasis. Hum Immunol. 2001;62:509-517. PMID: 11334675|
|29.||Kuhls K, et al. Comparative microsatellite typing of new world leishmania infantum reveals low heterogeneity among populations and its recent old world origin. PLoS Negl Trop Dis. 2011;5:e1155. PMID: 21666787|
|30.||Olivo-Diaz A, et al. Role of HLA class II alleles in susceptibility to and protection from localized cutaneous leishmaniasis. Hum Immunol. 2004;65:255-61. PMID: 15041165|
|31.||Cabrera M, et al. Polymorphism in TNF genes associated with mucocutaneous leishmaniasis. J. Exp. Med. 1995;182:1259-1264. PMID: 7595196|
|32.||Petzl-Erler ML, Belich MP, Queiroz-Telles F. Association of mucosal leishmaniasis with HLA. Human Immunology. 1991;32:254-260. PMID: 1783572|
|33.||Jallow M, et al. Genome-wide and fine-resolution association analysis of malaria in West Africa. Nat Genet. 2009;41:657-665. PMID: 19465909|
|34.||Thye T, et al. Genome-wide association analyses identifies a susceptibility locus for tuberculosis on chromosome 18q11.2. Nat Genet. 2010;42:739-741. PMID: 20694014|
|35.||Khor CC, et al. Genome-wide association study identifies susceptibility loci for dengue shock syndrome at MICB and PLCE1. Nat Genet. 2011;43:1139-41. PMID: 22001756|
|36.||Zhang FR, et al. Genomewide association study of leprosy. N Engl J Med. 2009;361:2609-2618. PMID: 20018961|
|37.||Singh SP, Reddy DC, Mishra RN, Sundar S. Knowledge, attitude, and practices related to Kala-azar in a rural area of Bihar state, India. Am J Trop Med Hyg. 2006;75:505-8. PMID: 16968930|
|38.||Manna M, Majumder HK, Sundar S, Bhaduri AN. The molecular characterization of clinical isolates from Indian Kala-azar patients by MLEE and RAPD-PCR. Med Sci Monit. 2005;11:BR220-7. PMID: 15990683|
|39.||Chatterjee M, Manna M, Bhaduri AN, Sarkar D. Recent kala-azar cases in India: isozyme profiles of Leishmania parasites. Indian J Med Res. 1995;102:165-72. PMID: 8543361|
|40.||Thakur CP, Dedet JP, Narain S, Pratlong F. Leishmania species, drug unresponsiveness and visceral leishmaniasis in Bihar, India. Trans R Soc Trop Med Hyg. 2001;95:187-9. PMID: 11355558|
|41.||Sundar S, et al. Resistance to treatment in Kala-azar: speciation of isolates from northeast India. Am J Trop Med Hyg. 2001;65:193-6. PMID: 11561703|
|42.||Blackwell JM, et al. Immunogenetics of leishmanial and mycobacterial infections: the Belem Family Study. Philos Trans R Soc Lond B Biol Sci. 1997;352:1331-45. PMID: 9355125|
|43.||Jeronimo SM, et al. An emerging peri-urban pattern of infection with Leishmania chagasi, the protozoan causing visceral leishmaniasis in northeast Brazil. Scand J Infect Dis. 2004;36:443-9. PMID: 15307565|
|44.||Khalil EA, Zijlstra EE, Kager PA, El Hassan AM. Epidemiology and clinical manifestations of Leishmania donovani infection in two villages in an endemic area in eastern Sudan. Trop Med Int Health. 2002;7:35-44. PMID: 11851953|
|45.||Fakiola M, et al. Classification and regression tree and spatial analyses reveal geographic heterogeneity in genome wide linkage study of Indian visceral leishmaniasis. PLoS One. 2010;5:e15807. PMID: 21209823|
|46.||Zijlstra EE, el-Hassan AM, Ismael A, Ghalib HW. Endemic kala-azar in eastern Sudan: a longitudinal study on the incidence of clinical and subclinical infection and post-kala-azar dermal leishmaniasis. Am J Trop Med Hyg. 1994;51:826-36. PMID: 7810819|
|47.||Conrad DF, et al. Origins and functional impact of copy number variation in the human genome. Nature. 2010;464:704-12. PMID: 19812545|
|48.||Barrett JC, et al. Genome-wide association study of ulcerative colitis identifies three new susceptibility loci, including the HNF4A region. Nat Genet. 2009;41:1330-4. PMID: 19915572|
|49.||Purcell S, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81:559-75. PMID: 17701901|
|50.||Fisher RA. The correlation between relatives on the supposition of Mendelian inheritance. Trans R Soc Edin. 1918;52:399-433|
|51.||Henderson CR. Estimation of variance and covariance components. Biometrics. 1953;9:226-252|
|52.||Boerwinkle E, Chakraborty R, Sing CF. The use of measured genotype information in the analysis of quantitative phenotypes in man. I. Models and analytical methods. 1986;50:181-194|
|53.||Chen WM, Abecasis GR. Family-based association tests for genomewide association scans. Am J Hum Genet. 2007;81:913-26. PMID: 17924335|
|54.||Yu J, et al. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 2006;38:203-8. PMID: 16380716|
|55.||Aulchenko YS, de Koning DJ, Haley C. Genomewide rapid association using mixed model and regression: a fast and simple method for genomewide pedigree-based quantitative trait loci association analysis. Genetics. 2007;177:577-85. PMID: 17660554|
|56.||Svishcheva GR, Axenovich TI, Belonogova NM, van Duijn CM, Aulchenko YS. Rapid variance components-based method for whole-genome association analysis. Nat Genet. 2012;44:1166-70. PMID: 22983301|
|57.||Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 2012;44:821-4. PMID: 22706312|
|58.||Horvath S, Wei E, Xu X, Palmer LJ, Baur M. Family-based association test method: age of onset traits and covariates. Genet Epidemiol. 2001;21 Suppl 1:S403-8. PMID: 11793708|
|59.||Thornton T, McPeek MS. ROADTRIPS: case-control association testing with partially or completely unknown population and pedigree structure. Am J Hum Genet. 2010;86:172-84. PMID: 20137780|
|60.||Sayer D, et al. HLA-DRB1 DNA sequencing based typing: an approach suitable for high throughput typing including unrelated bone marrow registry donors. Tissue Antigens. 2001;57:46-54. PMID: 11169258|
|61.||Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics. 2005;21:263-5. PMID: 15297300|
|62.||Scheet P, Stephens M. A fast and flexible statistical model for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet. 2006;78:629-44. PMID: 16532393|
|63.||Marsh SG, et al. Nomenclature for factors of the HLA system, 2010. Tissue Antigens. 2010;75:291-455. PMID: 20356336|
|64.||Howie BN, Donnelly P, Marchini J. A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet. 2009;5:e1000529. PMID: 19543373|
|65.||Stephens M, Smith NJ, Donnelly P. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet. 2001;68:978-89. PMID: 11254454|
|66.||Su Z, Cardin N, Donnelly P, Marchini J. A Bayesian method for detecting and characterizing allelic heterogeneity and boosting signals in genome-wide association studies. Statistical Science. 2009;24:430-450|