gpcrS.org [G Protein-Coupled Receptors Sequence-Structure-Signaling Database]

GRoSS Publication Data

Structure-Based Sequence Alignment of the Transmembrane Domains of all Human GPCRs: Phylogenetic, Structural and Functional Implications
Cvicek et al. PLoS Computational Biology (2016)

Manuscript Figures

Fig. 1 TM domains of the available crystal structures. Top: Two views of the 24 inactive crystal structures from classes A, B, C, and F (aligned to beta2) show the general GPCR fold of the transmembrane (TM) bundle. Class A in green, class B in blue (CRF1, GLR), class C in orange (MGLU1, MGLU5), class F in magenta (SMO). Bottom: Same views for only the 19 inactive class A structures showing the highly conserved class A TM fold. A detailed view of the conserved hydrogen bonding networks is shown in S1 Fig.

Fig. 2 Conserved inter-helical contacts. Top left: Diagram of 40 conserved inter-helical contacts (CHICOs) present in at least 23 out of 24 studied class A structures. The contacts common to all classes are shown in purple, and contacts present only in class A in orange. Top right: List of these contacts in the Ballesteros-Weinstein (BW) numbering scheme. Bottom: Extracellular view of the same contacts in the beta2 crystal structure. The contacts in the inner and outer half of the membrane are shown on the left and right, respectively.

Fig. 3 Testing the robustness of the alignment of the Vomeronasal receptors with the other groups. The table shows similarity between TMs averaged over all pairs of sequences formed from the two groups (red denotes high similarity, blue low similarity). For most TMs the optimal choices agree with the optimal alignment to Aalpha (full table in S5 Fig); all combinations are shown only for TM5. The same table but using the GPCRtm substitution matrix [74] instead of BLOSUM62 is shown in S7 Fig. GPCRtm was developed in particular for GPCR proteins, but in this case both matrices result in the same alignment.

Fig. 4 Testing the robustness of the alignment of the Taste2 receptors with the other groups. The table shows similarity between TMs averaged over all pairs of sequences formed from the two groups (red denotes high similarity, blue low similarity). For most TMs the optimal choices agree with the optimal alignment to Aalpha (full table in S6 Fig); only TM6 shows a second possible alignment at offset +4. The same table but using the GPCRtm substitution matrix instead of BLOSUM62 is in S8 Fig. Again, both matrices result in the same alignment.
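The offset tables in Figs. 3 and 4 can be read as a scan over gapless shifts of one group against another. The following is a minimal Python sketch of that idea, not the authors' code. It assumes each sequence comes with a putative n.50 index, and it swaps in a simple identity fraction for the BLOSUM62 similarity used in the figures. The function names, the window half-width, the sign convention of the offset, and the toy sequences are all illustrative assumptions.

```python
from itertools import product
from statistics import mean

def identity(a, b):
    """Stand-in similarity: fraction of identical positions (the figures use BLOSUM62)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def window(seq, center, half_width):
    """Gapless window around `center`, or None if it runs off the sequence."""
    lo, hi = center - half_width, center + half_width + 1
    return seq[lo:hi] if lo >= 0 and hi <= len(seq) else None

def mean_similarity(group_a, group_b, offset, half_width, similarity=identity):
    """Average similarity over all sequence pairs, with group_a shifted by `offset`.

    Each group is a list of (sequence, index of the putative n.50 residue).
    """
    scores = []
    for (seq_a, c_a), (seq_b, c_b) in product(group_a, group_b):
        w_a = window(seq_a, c_a + offset, half_width)
        w_b = window(seq_b, c_b, half_width)
        if w_a is not None and w_b is not None:
            scores.append(similarity(w_a, w_b))
    # Offsets where no window fits should never be selected.
    return mean(scores) if scores else float("-inf")

# Toy sequences, for illustration only.
group_a = [("GLIVGGGGG", 4)]
group_b = [("GGGLIVGGG", 4)]
scan = {off: mean_similarity(group_a, group_b, off, 1) for off in range(-3, 4)}
best = max(scan, key=scan.get)
print(best, scan[best])  # -2 1.0
```

In this toy case the scan recovers the shift of two residues that lines up the shared motif. A second offset with a similarly high score would correspond to an ambiguous case such as TM6 of the Taste2 receptors.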

Figs. 5 and 6 Sequence alignments of TMs 1 through 7 for the 25 crystal structures. The sequences are taken from the selected PDB files. The TM helix residues are colored in the Zappos scheme, which captures the chemical nature of each residue (e.g. helix breakers, proline and glycine, are shown in purple). The loop residues are shown in grey. The BW n.50 residue (numbering displayed below the sequences) is the most conserved within class A. The consensus sequence is most similar to class A, because most sequences are from this class. The largest differences are for the last 5 sequences, which belong to classes B, C, and F. The figure was prepared using Jalview.

Fig. 7 The phylogenetic tree based only on TM similarity using the GRoSS alignment (loops were ignored). Color coding denotes the GPCR class. Proteins with known crystal structure are emphasized with a dot. The full resolution version of this figure is in S4 Fig.

Fig. 8 Native activation "hot-spot" residues (NACHOs), which are contacts that change upon receptor activation. The width of the green lines is proportional to the number of contacts common to all six structures (RHO, beta2AR, M2, and their active structures). Blue shows the contacts present only in inactive structures, and not in active structures, while red shows the opposite. The upper diagrams show contacts in the extracellular half of the membrane. We see that there is no systematic change common to the class A receptors in the conformation of the extracellular half of the TMs. This is not obvious, because there are conformational changes accompanying ligand binding. All the systematic changes, which enable G protein binding, occur in the intracellular half of the TMs. The list only contains 15 different residues in 15 different contacts. Thus, many of the residues switch partners upon activation.

Fig. 9 Magnitude of the rigid body moves of the helices necessary to map one structure to another. All TMs 1-7 from all available structure pairs were compared, and each symbol denotes the TM from which the data point comes. The coordinate system is defined in the text. The maximal observed deviation is approximately proportional to the sequence dissimilarity of the two compared TMs, and it follows the same trend within class A (blue symbols) and across the GPCR superfamily (green symbols). The red symbols, which correspond to the active-inactive structure pairs, show rigid body moves caused by receptor activation. S10 Fig has an analogous plot of residual RMSD vs. similarity for each helix after the best rigid body transformation. The RMSD shows a trend similar to that of the plots in this figure.

Manuscript Tables

Table 1
table1.pdf
Number of GPCR sequences by class. The total number of candidate human GPCR sequences that were considered is listed. The full list of Uniprot ACs is in S2 Table.

Table 2
table2.pdf
Selection of the alignment between class A and classes B, C, and F. This table shows the selection process for assigning BW .50 residues to non-class A proteins. Shifting the BW .50 residue on each helix renumbers the relative BW numbers, effectively changing the labels of contacts observed in these proteins. Consequently, the number of common contacts each structure shares with the class A structures changes for different BW residue assignments. The second rightmost column shows the cumulative number of contact occurrences among the 24 class A structures (including active conformations). The BW assignment with the highest number of contacts is selected (except for MGLU5, see text). The selected alignment is in bold.
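The selection described for Table 2 can be sketched as a relabel-and-count loop. The Python below is an illustrative sketch under stated assumptions, not the published procedure. Contacts are assumed to be stored as pairs of (TM, BW index) with the lower TM first, candidate assignments are assumed to be given as per-helix shifts of the .50 residue, and all names and numbers are made up for the example. It does not model the MGLU5 exception mentioned in the caption.

```python
from collections import Counter
from typing import Dict, List, Tuple

Residue = Tuple[int, int]          # (TM number, BW index), e.g. (3, 50) for 3.50
Contact = Tuple[Residue, Residue]  # assumed to be stored with the lower TM first

def relabel(contact: Contact, shifts: Dict[int, int]) -> Contact:
    """Relabel a contact after moving the .50 residue on some helices.

    Moving the .50 residue by +d positions lowers every BW index on that helix by d.
    """
    return tuple((tm, idx - shifts.get(tm, 0)) for tm, idx in contact)

def shared_contact_score(contacts: List[Contact], shifts: Dict[int, int],
                         class_a_counts: Counter) -> int:
    """Cumulative number of class A occurrences of the relabelled contacts."""
    return sum(class_a_counts[relabel(c, shifts)] for c in contacts)

def best_assignment(contacts, candidates, class_a_counts):
    """Pick the candidate shift set with the highest cumulative count."""
    return max(candidates,
               key=lambda s: shared_contact_score(contacts, s, class_a_counts))

# Toy data, for illustration only (not values from Table 2).
class_a_counts = Counter({((3, 50), (6, 37)): 24, ((1, 50), (2, 50)): 23})
contacts = [((3, 54), (6, 37)), ((1, 50), (2, 50))]  # provisional numbering
candidates = [{}, {3: 4}]                            # keep as is, or move 3.50 by +4

for shifts in candidates:
    print(shifts, shared_contact_score(contacts, shifts, class_a_counts))
print(best_assignment(contacts, candidates, class_a_counts))  # {3: 4}
```

A `Counter` returns zero for contacts never seen in class A, so unmatched labels simply contribute nothing to the score.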

Table 3
table3.pdf
Examples of natural variants and mutations that are associated with functional change or disease and which coincide with the NACHO residues.

Table 4
table4.pdf
Summary of SNPs annotated on Uniprot. The complete list is in S3 Table.

Supplementary Information

S1 Table
S1Table.pdf
List of studied GPCR crystal structures. When multiple structures are available, the one with the highest resolution or the one with the least deformed TM helices is used.

S2 Table
S2Table.csv
GRoSS sequence alignment for all 817 human GPCRs. S1 File has this alignment in fasta format. Since there are no gaps in the TM domains, the alignment of each protein is uniquely determined by the BW .50 residues for each TM 1 through 7. We also list the expected range of the helical TM regions, which is estimated as the average TM region in the known crystal structures from the same class. In the discussion of the bitter taste receptors (TAS2Rs), we identified two possible alignments of TM6, but only the first one is presented in this table. The second choice is obtained by decreasing the start, end, and BW50 residue of TM6 by 4.
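Because the TM domains are aligned without gaps, the BW label of a residue follows from its offset to the .50 residue of the same helix. The Python sketch below illustrates this under assumptions: the per-helix record of (start, end, BW50) positions, the function names, and the sample numbers are hypothetical and do not reflect the actual column layout of S2Table.csv.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: TM number -> (start, end, bw50) in sequence numbering.
TMRegions = Dict[int, Tuple[int, int, int]]

def bw_label(resnum: int, regions: TMRegions) -> Optional[str]:
    """Return the BW label of a residue, or None for loops and termini."""
    for tm, (start, end, bw50) in regions.items():
        if start <= resnum <= end:
            # No gaps inside a TM, so the BW index is a plain offset from n.50.
            return f"{tm}.{50 + resnum - bw50}"
    return None

def shift_tm(regions: TMRegions, tm: int, offset: int) -> TMRegions:
    """Return a copy with start, end, and BW50 of one helix shifted by `offset`."""
    shifted = dict(regions)
    start, end, bw50 = regions[tm]
    shifted[tm] = (start + offset, end + offset, bw50 + offset)
    return shifted

# Made-up positions, for illustration only.
example = {6: (230, 260, 245)}
print(bw_label(245, example))   # 6.50
print(bw_label(241, example))   # 6.46
alt = shift_tm(example, 6, -4)  # the second TAS2R choice for TM6
print(bw_label(241, alt))       # 6.50
```

With the second TM6 choice, the residue four positions earlier in the sequence takes over the 6.50 label, and every other label on that helix moves with it.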

S3 Table
S3Table.csv
GPCR natural variants annotated by Uniprot mapped to BW numbering and indicating their proximity to the NACHO and CHICO residues. The mutations are ordered according to the following score: "distance to the closest NACHO + distance to CHICO - multiplicity of the closest NACHO - multiplicity of CHICO + BLOSUM62 of the mutation".
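The ordering score is a plain sum of five terms, which can be written out as a short Python sketch. The record fields, the variant labels, and the numbers below are hypothetical. How distance and multiplicity are measured is taken as given here, and sorting in ascending order is an assumption, since the caption does not state the direction.

```python
from dataclasses import dataclass

@dataclass
class Variant:
    name: str        # hypothetical label
    dist_nacho: int  # distance to the closest NACHO
    mult_nacho: int  # multiplicity of the closest NACHO
    dist_chico: int  # distance to the closest CHICO
    mult_chico: int  # multiplicity of that CHICO
    blosum62: int    # BLOSUM62 entry for the amino acid substitution

def score(v: Variant) -> int:
    """Score as stated in the caption: distances minus multiplicities plus BLOSUM62."""
    return v.dist_nacho + v.dist_chico - v.mult_nacho - v.mult_chico + v.blosum62

# Made-up variants, for illustration only.
variants = [
    Variant("var_b", dist_nacho=5, mult_nacho=1, dist_chico=4, mult_chico=1, blosum62=2),
    Variant("var_a", dist_nacho=0, mult_nacho=3, dist_chico=1, mult_chico=2, blosum62=-4),
]
ranked = sorted(variants, key=score)  # ascending order is an assumption
for v in ranked:
    print(v.name, score(v))
# var_a -8
# var_b 9
```

Under this reading, a low score marks a non-conservative substitution at or near a residue that takes part in many conserved or activation-related contacts.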

S4 Table
S4Table.pdf
Conservation of CHICO and NACHO residues among orthologs. For orthologs of several proteins we computed the average amino acid conservation over the TM residues and over the CHICO/NACHO residues. The data show that CHICO and NACHO positions are more conserved than other TM residues in all GPCR classes. Residues present on both lists are even more conserved. Two measures of conservation provided by Jalview are used: Consensus is the percentage of orthologs sharing the human amino acid; and Conservation is a qualitative measure counting the number of conserved chemical properties. For P2Y12, we used a curated list of 77 orthologs from [100]. For other proteins, we collected predicted orthologs from the MetaPhOrs database (release 201405 [101]), aligned them with Clustal Omega, and then removed sequences with gaps in the TM regions.

S1 Fig. Detailed view of conserved motifs in class A GPCRs. The conserved residues in 24 different structures (including active) have very similar positions, which shows that the class A GPCR fold is highly conserved. The full TM bundle is shown in Fig. 1.

S2 Fig. Sequence similarity (%) of the TM bundles between crystal structures for the final sequence alignment. Two residues are similar if their BLOSUM62 entry is positive.
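The similarity criterion of S2 Fig can be sketched in a few lines of Python. This is an illustrative sketch: the function names and toy sequences are assumptions, and only the four BLOSUM62 entries needed for the example are included rather than the full matrix.

```python
# Excerpt of BLOSUM62: only the residue pairs needed for the toy example below.
BLOSUM62_EXCERPT = {
    ("L", "I"): 2,
    ("I", "V"): 3,
    ("V", "A"): 0,
    ("A", "D"): -2,
}

def lookup(matrix, a, b):
    """Symmetric lookup in a dict keyed by residue pairs."""
    return matrix[(a, b)] if (a, b) in matrix else matrix[(b, a)]

def percent_similarity(seq_a, seq_b, matrix):
    """Percentage of aligned positions whose substitution score is positive."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length (gapless)")
    similar = sum(1 for a, b in zip(seq_a, seq_b) if lookup(matrix, a, b) > 0)
    return 100.0 * similar / len(seq_a)

# Toy gapless segments, for illustration only.
print(percent_similarity("LIVA", "IVAD", BLOSUM62_EXCERPT))  # 50.0
```

Note that a score of exactly zero (V/A here) does not count as similar, while identical residues do, since the diagonal of BLOSUM62 is positive.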

S3 Fig. Backbone (atoms N, Calpha, C, O) RMSD of the TM bundles for the final sequence alignment. For a given pair of structures, there may exist a different sequence alignment that results in a lower RMSD than the listed one.

S4 Fig.
S4Fig.pdf
High-resolution phylogenetic tree (Fig. 7) based on TM similarity only. The pdf file is searchable for the UNIPROT accession numbers. Loops were ignored. Color coding denotes the GPCR class. Proteins with known crystal structure are emphasized with a dot.

S5 Fig. Testing the robustness of the alignment of the Vomeronasal receptors with the other groups. This is an extended version of Fig. 3, same caption.

S6 Fig. Testing the robustness of the alignment of the Taste2 receptors with the other groups. This is an extended version of Fig. 4, same caption.

S7 Fig. Testing the robustness of the alignment of the Vomeronasal receptors with the GPCRtm substitution matrix. Same caption as in Fig. 3.

S8 Fig. Testing the robustness of the alignment of the Taste2 receptors with the GPCRtm substitution matrix. Same caption as in Fig. 4.

S9 Fig. Diagram of interhelical contacts present in classes B, C, and F. The width of the line connecting two TMs is proportional to the number of contacts present in all structures from the given class. The list in red font shows the contacts not present in any available structure from other classes.

S10 Fig. RMSD of helices after best rigid body move. Same caption as Fig. 9.

S1 Text
S1Text.pdf
Comparison of the GRoSS alignment to the HMM-HMM alignment [77] and to the GPCRDB alignment [24, 78].

S1 File
gross-alignment.fasta
The GRoSS alignment in fasta format and annotation of the TM regions and BW residues in Jalview format for all human GPCRs. The first 29 sequences are the actual sequences from the PDB files of the crystal structures used; the rest of the sequences are from Uniprot. The N-terminus, loops, and C-terminus are not aligned. For interactive work it is useful to also highlight the TM regions and BW residues using the Jalview annotation file gross-alignment.gff.

gpcr{S}.org

G Protein-Coupled Receptors {Sequence-Structure-Signaling} Knowledgebase

Release 1.0
