Bird genes give new insights into the origins of lipid antigen presentation
Which came first: the chicken or the egg? This famous question has given the chicken much prominence as a metaphor for difficult questions about evolution. Now it seems that chickens have landed again in the middle of an interesting evolutionary question by becoming the first nonmammalian species shown to possess genes for CD1, a family of immunologically important molecules known to be present in most, if not all, mammals. Mammalian CD1 proteins are believed to play a significant role in adaptive immunity by functioning as antigen-presenting molecules for T cell responses to a unique class of self and foreign antigens (1). However, CD1 has diverged considerably from other known antigen-presenting molecules, which are encoded by the class I and II genes of the major histocompatibility complex (MHC), raising intriguing questions about the evolutionary relationship of these molecules. In this issue of PNAS, two groups report the identification of CD1 genes in the domestic chicken (Gallus domesticus) and its immediate wild ancestor, the Red Jungle Fowl (Gallus gallus), thus providing the first examples of CD1 genes in a nonmammalian species (2, 3). By extending the origin of CD1 back at least 300 million years to the time of the last common ancestor of birds and mammals, these findings support the view of CD1 as an ancient lineage of antigen-presenting molecules that, like the MHC class I and II families, was part of the early foundations of the adaptive immune system.
The CD1 system was first brought to light by cloning of the five human CD1 genes by Calabi and Milstein (4), whose pioneering studies identified clear homology and structural similarity of CD1 molecules to MHC class I and II. However, the relatively low levels of sequence homology between CD1 and the MHC-encoded molecules suggested a very distant evolutionary relationship and a possible divergence from the peptide binding and presenting function of MHC class I and II. Other important early observations included the demonstration that CD1 proteins, like all MHC class I molecules, are present on the plasma membrane of cells in a 1:1 noncovalent association with β2-microglobulin and that the human CD1 genes map to a locus on chromosome 1 and not to the MHC on chromosome 6 (5, 6). CD1 molecules were also shown to be most prominently expressed on cells that are directly involved in antigen presentation such as dendritic and B cells, giving them a restricted range of tissue and cell type expression that is reminiscent of MHC class II and distinctly different from the ubiquitously expressed MHC class I molecules. Most surprisingly, functional studies revealed that the antigens presented by CD1 are not peptides but are instead a variety of lipids such as fatty acids, phospholipids, glycolipids, and lipopeptides. Thus, three of the five human CD1 proteins, designated CD1a, CD1b, and CD1c, present various lipids from pathogenic mycobacteria, giving rise to highly specific and long-lived T cell responses (1). Another CD1 molecule, known as CD1d, presents self and foreign lipids to an unconventional T cell population known as natural killer T cells, which may represent a relatively primitive arm of the T cell system that bridges innate and adaptive immunity (7).
The unique ability of CD1 molecules to bind and present lipid rather than peptide antigens is now well understood at the molecular level through a series of X-ray crystallographic studies (8). These studies reveal that all CD1 proteins have a partially concealed hydrophobic ligand-binding groove embedded in their membrane distal domains. In general, the hydrophobic alkyl chains of CD1-presented lipid antigens are buried within the hydrophobic groove, whereas the more charged or polar moieties of the ligand protrude out of the groove's opening and contact antigen receptors on specific T cells (Fig. 1A). The mechanism by which a variety of lipids are loaded into the ligand-binding grooves of CD1 molecules is complex, only partially understood, and to a great extent dependent on the ability of CD1 proteins to traffic to intracellular compartments, especially various sites within the endocytic pathway (9). This ability to survey the endocytic system of antigen-presenting cells is another feature that CD1 shares with MHC class II and another difference from MHC class I molecules, which are mainly loaded with antigenic peptides in the endoplasmic reticulum.
Fig. 1. CD1 structure and evolution. (A) Schematic views of the three separate families of antigen-presenting molecules and their mechanisms of antigen binding. MHC class I and CD1 molecules have three extracellular domains of similar size (α1, α2, and α3), and both bind β2-microglobulin. MHC class II has a similar overall domain arrangement but is a heterodimer of two transmembrane polypeptide chains (α and β). Both MHC class I and II have relatively shallow peptide-binding grooves at their membrane distal ends. For MHC class I, both ends of the bound peptides are usually contained within the groove, limiting the peptide length to 8–10 amino acids. In contrast, MHC class II can bind much longer peptides that protrude from the open ends of its groove. The CD1 groove is larger, deeper, and much more hydrophobic. It binds the hydrophobic alkyl tails of lipids, anchoring the ligand so that its more hydrophilic portions are exposed at the groove entrance. In all three cases, ligand binding to these molecules generates stable complexes that are recognized by the antigen receptors on specific T lymphocytes. (B) Simplified phylogenetic tree illustrating evolution of MHC and CD1. A “primordial MHC” locus that lacks MHC class I and II and CD1 genes has been described in the protochordate amphioxus, and a similar locus is proposed to exist in primitive jawless fish (i.e., lamprey, hagfish). The MHC-encoded peptide antigen-presenting systems most likely emerged with the acquisition of adaptive immunity, thought to have occurred ≈500 million years ago with the introduction of the recombinase activating genes into the vertebrate genome, leading to the ability to create large numbers of T and B cell antigen receptors through somatic recombination. The presence of both MHC class I and II genes in the cartilaginous fish (sharks) and their apparent absence in the jawless fish is consistent with this scheme. The new reports of CD1 in birds now allow the conclusion that this antigen-presenting system dates back at least to the emergence of the last common ancestor of birds and mammals (≈310 million years ago, illustrated as the red portion of the tree). The question marks in the red arrow indicate the remaining uncertainty about whether CD1 may have originated even earlier, possibly at about the same time as or even before MHC class I and II.
The two Gallus genes described in this issue of PNAS encode proteins that by many criteria appear to be descendants of the same ancestral lineage as mammalian CD1 proteins (2, 3). The level of homology between the proteins encoded by the avian genes, designated chCD1–1 and chCD1–2, and mammalian CD1 proteins is relatively low (23–25% identity over the extracellular domains). However, dendrograms generated with multiple CD1 and MHC class I or II sequences clearly point to a closer relationship of the putative avian CD1 sequences to mammalian CD1 sequences than to other antigen-presenting molecules. In fact, these analyses consistently show chCD1–1 and chCD1–2 to be more closely related to mammalian CD1 than to chicken MHC class I or II. The cytoplasmic tails of chCD1–1 and chCD1–2 contain potential endosomal targeting motifs, and mRNA levels showed a restricted pattern of gene expression that was strongest in lymphoid tissues (bursa and spleen). Miller et al. (2) show that different species of birds most likely have different numbers of CD1 genes (ranging from two to four), suggesting a more limited version of the pronounced variation in size and complexity of the CD1 families observed between different mammalian species (2, 10). Like mammalian CD1 genes and in marked distinction to classical MHC class I and II genes, the two avian genes are relatively nonpolymorphic when compared between different individual chickens.
Although the evidence in favor of classification of the Gallus genes as members or close relatives of the CD1 lineage is compelling, it is not known yet whether the proteins encoded by these genes actually perform the lipid antigen presentation that has been attributed to mammalian CD1. Molecular modeling by Miller et al. (2) shows that the predicted ligand-binding groove in the chCD1–2 protein is strongly hydrophobic, consistent with a lipid-binding function. Salomonsen et al. (3) show that chCD1–2 can be expressed as a cell surface protein after transfection and that the expressed protein is recognized by a monoclonal antibody that was reported 15 years ago to recognize a β2-microglobulin-associated protein specific to avian B lymphocytes. These features are all consistent with the identification of the avian genes as CD1 (11, 12) and suggest that chCD1–2 performs some function in the avian immune system that could involve binding of hydrophobic ligands and cell interactions between T and B lymphocytes.
The discovery of CD1 in birds also provides some fascinating insights into the possible features of the primordial MHC, which is thought to have arisen with the very earliest vertebrates or protochordates (13). Based on experimental data and the publicly available draft sequence of the G. gallus genome, the chCD1–1 and chCD1–2 genes map to the MHC locus, unlike what has so far been found for mammalian CD1 genes. This striking observation suggests that CD1 or its immediate ancestor was present in an ancient version of the MHC locus that arose at least 300 million years ago at the time of the last common ancestor of the avian and mammalian lineages. The separation of CD1 and MHC loci in mammals is presumed to have occurred as a result of duplication and relocation of the ancestral MHC locus to create one or more paralogous loci on different chromosomes. Indeed, evidence exists for at least three such MHC paralogous loci in the human genome, one of which is the CD1 locus on chromosome 1 (13, 14). In this evolutionary scheme, subsequent shaping of the genome after these duplication events led to deletion of the CD1 genes from the modern MHC locus in mammals and deletion of the MHC class I and II genes or their precursors from the CD1 locus, thus creating two distinct loci with related genes. In contrast, the classical MHC and CD1 genes in birds have always remained together, and any duplications of MHC class I or II or CD1 genes in paralogous MHC loci were subsequently deleted and lost.
So which came first: the MHC class I and II or the CD1 genes? MHC class I and II genes are found in cartilaginous fish (sharks) and probably are present in all vertebrates with jaws. They appear to be absent in the primitive jawless fish (lampreys and hagfish), suggesting a fairly precise position in the phylogenetic tree at which the adaptive immune system as it is currently defined came into being (13) (Fig. 1B). So far, there is no evidence for CD1 proteins in animals whose ancestors emerged before those of birds, but perhaps this evidence will yet be found with continued mining of more genomes and expressed sequence tag libraries. If CD1 molecules can be traced back as far as the ancestors of primitive fish or earlier, then this finding would indicate that these molecules arose around the same time or possibly even before the direct ancestors of the present-day MHC-encoded peptide-presenting molecules. If so, could this finding mean that lipid recognition was actually established as a form of adaptive immune recognition before the potentially more powerful peptide recognition systems came along? Or, alternatively, was CD1 a contemporaneous or later addition to adaptive immunity that became fixed in the genomes of many vertebrates because it provides a useful complementary approach to foreign antigen recognition? Even with the new data on avian CD1, this topic remains another “chicken and egg” question that still cannot be answered definitively.






