Distinct Splice Variants and Pathway Enrichment in the Cell-Line Models of Aggressive Human Breast Cancer Subtypes
JOURNAL OF PROTEOME RESEARCH
2014; 13 (1): 212-227
This study was conducted as a part of the Chromosome-Centric Human Proteome Project (C-HPP) of the Human Proteome Organization. The United States team of C-HPP is focused on characterizing the protein-coding genes in chromosome 17. Despite its small size, chromosome 17 is rich in protein-coding genes; it contains many cancer-associated genes, including BRCA1, ERBB2 (Her2/neu), and TP53. The goal of this study was to examine the splice variants expressed in three ERBB2-expressing breast cancer cell-line models of hormone-receptor-negative breast cancers by integrating RNA-Seq and proteomic mass spectrometry data. The cell lines represent distinct phenotypic subtypes: SKBR3 (ERBB2+ (overexpression)/ER-/PR-; adenocarcinoma), SUM190 (ERBB2+ (overexpression)/ER-/PR-; inflammatory breast cancer), and SUM149 (ERBB2 (low expression)/ER-/PR-; inflammatory breast cancer). We identified more than one splice variant for 1167 genes expressed in at least one of the three cancer cell lines. We found multiple variants of genes that are in the signaling pathways downstream of ERBB2 along with variants specific to one cancer cell line compared with the other two cancer cell lines and with normal mammary cells. The overall transcript profiles based on read counts indicated more similarities between SKBR3 and SUM190. The top-ranking Gene Ontology and BioCarta pathways for the cell-line specific variants pointed to distinct key mechanisms including: amino sugar metabolism, caspase activity, and endocytosis in SKBR3; different aspects of metabolism, especially of lipids, in SUM190; and cell-to-cell adhesion, integrin and ERK1/ERK2 signaling, and translational control in SUM149. The analyses indicated an enrichment in the electron transport chain processes in the ERBB2-overexpressing cell line models and an association of nucleotide binding, RNA splicing, and translation processes with the IBC models, SUM190 and SUM149. Detailed experimental studies on the distinct variants identified from each of these three breast cancer cell line models may open opportunities for drug target discovery and help unveil their specific roles in cancer progression and metastasis.
View details for DOI 10.1021/pr400773v
View details for Web of Science ID 000329472700022
View details for PubMedID 24111759
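The variant counting described in the abstract above (genes with more than one splice variant, and variants specific to one cell line) can be illustrated with a minimal sketch. The gene and transcript names, the input layout, and both function names are hypothetical and chosen only for illustration; this is not the pipeline used in the study.

```python
from collections import defaultdict

# Hypothetical input: (gene, transcript_id) -> set of cell lines in which
# the variant was detected. All names here are invented for illustration.
detections = {
    ("GENE_A", "tx1"): {"SKBR3", "SUM190"},
    ("GENE_A", "tx2"): {"SUM149"},
    ("GENE_B", "tx1"): {"SKBR3"},
    ("GENE_C", "tx1"): {"SUM190"},
    ("GENE_C", "tx2"): {"SUM190", "SUM149"},
}


def genes_with_multiple_variants(detections):
    """Return genes for which more than one variant was detected in any cell line."""
    variants_per_gene = defaultdict(set)
    for (gene, transcript), cell_lines in detections.items():
        if cell_lines:  # variant detected in at least one cell line
            variants_per_gene[gene].add(transcript)
    return sorted(g for g, txs in variants_per_gene.items() if len(txs) > 1)


def cell_line_specific_variants(detections, cell_line):
    """Return (gene, transcript) pairs detected only in the given cell line."""
    return sorted(key for key, lines in detections.items() if lines == {cell_line})


multi = genes_with_multiple_variants(detections)
sum149_only = cell_line_specific_variants(detections, "SUM149")
```

With the toy data, `multi` lists GENE_A and GENE_C, and `sum149_only` contains the single variant seen only in SUM149.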
Specific Plasma Autoantibody Reactivity in Myelodysplastic Syndromes
SCIENTIFIC REPORTS
2013; 3
Whole-exome sequencing identifies tetratricopeptide repeat domain 7A (TTC7A) mutations for combined immunodeficiency with intestinal atresias.
journal of allergy and clinical immunology
2013; 132 (3): 656-664 e17
Combined immunodeficiency with multiple intestinal atresias (CID-MIA) is a rare hereditary disease characterized by intestinal obstructions and profound immune defects. We sought to determine the underlying genetic causes of CID-MIA by analyzing the exomic sequences of 5 patients and their healthy direct relatives from 5 unrelated families. We performed whole-exome sequencing on 5 patients with CID-MIA and 10 healthy direct family members belonging to 5 unrelated families with CID-MIA. We also performed targeted Sanger sequencing for the candidate gene tetratricopeptide repeat domain 7A (TTC7A) on 3 additional patients with CID-MIA. Through analysis and comparison of the exomic sequence of the subjects from these 5 families, we identified biallelic damaging mutations in the TTC7A gene, for a total of 7 distinct mutations. Targeted TTC7A gene sequencing in 3 additional unrelated patients with CID-MIA revealed biallelic deleterious mutations in 2 of them, as well as an aberrant splice product in the third patient. Staining of normal thymus showed that the TTC7A protein is expressed in thymic epithelial cells, as well as in thymocytes. Moreover, severe lymphoid depletion was observed in the thymus and peripheral lymphoid tissues from 2 patients with CID-MIA. We identified deleterious mutations of the TTC7A gene in 8 unrelated patients with CID-MIA and demonstrated that the TTC7A protein is expressed in the thymus. Our results strongly suggest that TTC7A gene defects cause CID-MIA.
View details for DOI 10.1016/j.jaci.2013.06.013
View details for PubMedID 23830146
Genome Wide Proteomics of ERBB2 and EGFR and Other Oncogenic Pathways in Inflammatory Breast Cancer
JOURNAL OF PROTEOME RESEARCH
2013; 12 (6): 2805-2817
In this study we selected three breast cancer cell lines (SKBR3, SUM149 and SUM190) with different oncogene expression levels involved in ERBB2 and EGFR signaling pathways as a model system for the evaluation of selective integration of subsets of transcriptomic and proteomic data. We assessed the oncogene status with reads per kilobase per million mapped reads (RPKM) values for ERBB2 (14.4, 400, and 300 for SUM149, SUM190, and SKBR3, respectively) and for EGFR (60.1, not detected, and 1.4 for the same 3 cell lines). We then used RNA-Seq data to identify those oncogenes with significant transcript levels in these cell lines (total 31) and interrogated the corresponding proteomics data sets for proteins with significant interaction values with these oncogenes. The number of observed interactors for each oncogene showed a significant range, e.g., 4.2% (JAK1) to 27.3% (MYC). The percentage is measured as a fraction of the total protein interactions in a given data set vs total interactors for that oncogene in STRING (Search Tool for the Retrieval of Interacting Genes/Proteins, version 9.0) and I2D (Interologous Interaction Database, version 1.95). This approach allowed us to focus on 4 main oncogenes, ERBB2, EGFR, MYC, and GRB2, for pathway analysis. We used bioinformatics sites GeneGo, PathwayCommons and NCI receptor signaling networks to identify pathways that contained the four main oncogenes and had good coverage in the transcriptomic and proteomic data sets as well as a significant number of oncogene interactors. The four pathways identified were ERBB signaling, EGFR1 signaling, integrin outside-in signaling, and validated targets of C-MYC transcriptional activation. The greater dynamic range of the RNA-Seq values allowed the use of transcript ratios to correlate observed protein values with the relative levels of the ERBB2 and EGFR transcripts in each of the four pathways. 
This provided us with potential proteomic signatures for the SUM149 and SUM190 cell lines: growth factor receptor-bound protein 7 (GRB7), Crk-like protein (CRKL), and catenin delta-1 (CTNND1) for ERBB signaling; caveolin 1 (CAV1) and plectin (PLEC) for EGFR signaling; filamin A (FLNA) and actinin alpha1 (ACTN1) (associated with high levels of EGFR transcript) for integrin signaling; branched chain amino-acid transaminase 1 (BCAT1), carbamoyl-phosphate synthetase (CAD), nucleolin (NCL) (high levels of EGFR transcript); transferrin receptor (TFRC), metadherin (MTDH) (high levels of ERBB2 transcript) for MYC signaling; S100-A2 protein (S100A2), caveolin 1 (CAV1), Serpin B5 (SERPINB5), stratifin (SFN), PYD and CARD domain containing (PYCARD), and EPH receptor A2 (EPHA2) for PI3K signaling, p53 subpathway. Future studies of inflammatory breast cancer (IBC), from which the cell lines were derived, will be used to explore the significance of these observations.
View details for DOI 10.1021/pr4001527
View details for Web of Science ID 000320298600040
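The quantities quoted in the abstract above (RPKM values, transcript ratios between cell lines, and the percentage of known interactors observed) can be made concrete with a small sketch. The RPKM formula is the standard definition; the ERBB2 and EGFR values are taken from the abstract; the function names and the treatment of "not detected" as `None` are assumptions made here for illustration, not the authors' code.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


# RPKM values reported in the abstract; None stands for "not detected".
ERBB2_RPKM = {"SUM149": 14.4, "SUM190": 400.0, "SKBR3": 300.0}
EGFR_RPKM = {"SUM149": 60.1, "SUM190": None, "SKBR3": 1.4}


def transcript_ratio(values, numerator, denominator):
    """Ratio of one cell line's RPKM to another's, or None if either is undetected."""
    a, b = values[numerator], values[denominator]
    if a is None or b is None or b == 0:
        return None
    return a / b


def interactor_coverage(observed_proteins, known_interactors):
    """Percentage of an oncogene's known interactors found in a proteomic data set."""
    known = set(known_interactors)
    if not known:
        return 0.0
    return 100.0 * len(known & set(observed_proteins)) / len(known)


erbb2_sum190_vs_sum149 = transcript_ratio(ERBB2_RPKM, "SUM190", "SUM149")
egfr_sum190_vs_sum149 = transcript_ratio(EGFR_RPKM, "SUM190", "SUM149")
```

In this sketch the ERBB2 ratio for SUM190 relative to SUM149 is 400 / 14.4, roughly 28-fold, while the EGFR ratio is left undefined because EGFR was not detected in SUM190.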
Identification of Potential Glycan Cancer Markers with Sialic Acid Attached to Sialic Acid and Up-regulated Fucosylated Galactose Structures in Epidermal Growth Factor Receptor Secreted from A431 Cell Line.
Molecular & cellular proteomics
2013; 12 (5): 1239-1249
We have used powerful HPLC-mass spectrometric approaches to characterize the secreted form of epidermal growth factor receptor (sEGFR). We demonstrated that the amino acid sequence lacked the cytoplasmic domain and was consistent with the primary sequence reported for EGFR purified from a human plasma pool. One of the sEGFR forms, attributed to the alternative RNA splicing, was also confirmed by transcriptional analysis (RNA sequencing). Two unusual types of glycan structures were observed in sEGFR as compared with membrane-bound EGFR from the A431 cell line. The unusual glycan structures were di-sialylated glycans (sialic acid attached to sialic acid) at Asn-151 and N-acetylhexosamine attached to a branched fucosylated galactose with N-acetylglucosamine moieties (HexNAc-(Fuc)Gal-GlcNAc) at Asn-420. These unusual glycans at specific sites were either present at a much lower level or were not observable in membrane-bound EGFR present in the A431 cell lysate. The observation of these di-sialylated glycan structures was consistent with the observed expression of the corresponding α-N-acetylneuraminide α-2,8-sialyltransferase 2 (ST8SiA2) and α-N-acetylneuraminide α-2,8-sialyltransferase 4 (ST8SiA4), by quantitative real time RT-PCR. The connectivity present at the branched fucosylated galactose was also confirmed by methylation of the glycans followed by analysis with sequential fragmentation in mass spectrometry. We hypothesize that the presence of such glycan structures could promote secretion via anionic or steric repulsion mechanisms and thus facilitate the observation of these glycan forms in the secreted fractions. We plan to use this model system to facilitate the search for novel glycan structures present at specific sites in sEGFR as well as other secreted oncoproteins such as Erbb2 as markers of disease progression in blood samples from cancer patients.
View details for DOI 10.1074/mcp.M112.024554
View details for PubMedID 23371026
Preparation of recombinant protein spotted arrays for proteome-wide identification of kinase targets.
Current protocols in protein science / editorial board, John E. Coligan ... [et al.]
2013; Chapter 27: Unit 27.4
Protein microarrays allow unique approaches for interrogating global protein interaction networks. Protein arrays can be divided into two categories: antibody arrays and functional protein arrays. Antibody arrays consist of various antibodies and are appropriate for profiling protein abundance and modifications. Functional full-length protein arrays employ full-length proteins with various post-translational modifications. A key advantage of the latter is rapid parallel processing of large number of proteins for studying highly controlled biochemical activities, protein-protein interactions, protein-nucleic acid interactions, and protein-small molecule interactions. This unit presents a protocol for constructing functional yeast protein microarrays for global kinase substrate identification. This approach enables the rapid determination of protein interaction networks in yeast on a proteome-wide level. The same methodology can be readily applied to higher eukaryotic systems with careful consideration of overexpression strategy.
View details for DOI 10.1002/0471140864.ps2704s72
View details for PubMedID 23546622
A Chromosome-centric Human Proteome Project (C-HPP) to Characterize the Sets of Proteins Encoded in Chromosome 17
JOURNAL OF PROTEOME RESEARCH
2013; 12 (1): 45-57
We report progress assembling the parts list for chromosome 17 and illustrate the various processes that we have developed to integrate available data from diverse genomic and proteomic knowledge bases. As primary resources, we have used GPMDB, neXtProt, PeptideAtlas, Human Protein Atlas (HPA), and GeneCards. All sites share the common resource of Ensembl for the genome modeling information. We have defined the chromosome 17 parts list with the following information: 1169 protein-coding genes, the numbers of proteins confidently identified by various experimental approaches as documented in GPMDB, neXtProt, PeptideAtlas, and HPA, examples of typical data sets obtained by RNASeq and proteomic studies of epithelial derived tumor cell lines (disease proteome) and a normal proteome (peripheral mononuclear cells), reported evidence of post-translational modifications, and examples of alternative splice variants (ASVs). We have constructed a list of the 59 "missing" proteins as well as 201 proteins that have inconclusive mass spectrometric (MS) identifications. In this report we have defined a process to establish a baseline for the incorporation of new evidence on protein identification and characterization as well as related information from transcriptome analyses. This initial list of "missing" proteins will guide the selection of appropriate samples for discovery studies as well as antibody reagents. We have also illustrated the significant diversity of protein variants (including post-translational modifications, PTMs) using regions on chromosome 17 that contain important oncogenes. We emphasize the need for mandated deposition of proteomics data in public databases, the further development of improved PTM, ASV, and single nucleotide variant (SNV) databases, and the construction of Web sites that can integrate and regularly update such information.
In addition, we describe the distribution of both clustered and scattered sets of protein families on the chromosome. Since chromosome 17 is rich in cancer-associated genes, we have focused on the clustering of cancer-associated genes in such genomic regions and have used the ERBB2 amplicon as an example of the value of a proteogenomic approach in which one integrates transcriptomic with proteomic information and captures evidence of coexpression through coordinated regulation.
View details for DOI 10.1021/pr300985j
View details for Web of Science ID 000313156300007
Personal Omics Profiling Reveals Dynamic Molecular and Medical Phenotypes
2012; 148 (6): 1293-1307
Personalized medicine is expected to benefit from combining genomic information with regular monitoring of physiological states by multiple high-throughput methods. Here, we present an integrative personal omics profile (iPOP), an analysis that combines genomic, transcriptomic, proteomic, metabolomic, and autoantibody profiles from a single individual over a 14 month period. Our iPOP analysis revealed various medical risks, including type 2 diabetes. It also uncovered extensive, dynamic changes in diverse molecular components and biological pathways across healthy and diseased conditions. Extremely high-coverage genomic and transcriptomic data, which provide the basis of our iPOP, revealed extensive heteroallelic changes during healthy and diseased states and an unexpected RNA editing mechanism. This study demonstrates that longitudinal iPOP can be used to interpret healthy and diseased states by connecting genomic information with additional dynamic omics activity.
View details for DOI 10.1016/j.cell.2012.02.009
View details for Web of Science ID 000301889500023
View details for PubMedID 22424236
Global identification of protein kinase substrates by protein microarray analysis
2009; 4 (12): 1820-1827
Herein, we describe a protocol for the global identification of in vitro substrates targeted by protein kinases using protein microarray technology. Large numbers of fusion proteins tagged at their carboxy-termini are purified in 96-well format and spotted in duplicate onto amino-silane-coated slides in a spatially addressable manner. These arrays are incubated in the presence of purified kinase and radiolabeled ATP, and then washed, dried and analyzed by autoradiography. The extent of phosphorylation of each spot is quantified and normalized, and proteins that are reproducibly phosphorylated in the presence of the active kinase relative to control slides are scored as positive substrates. This approach enables the rapid determination of kinase-substrate relationship on a proteome-wide scale, and although developed using yeast, has since been adapted to higher eukaryotic systems. Expression, purification and printing of the yeast proteome require about 3 weeks. Afterwards, each kinase assay takes approximately 3 h to perform.
View details for DOI 10.1038/nprot.2009.194
View details for Web of Science ID 000274226100011
View details for PubMedID 20010933
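The scoring step described in the abstract above (spots quantified and normalized, with reproducibly phosphorylated proteins scored relative to control slides) might be sketched as follows. The median normalization, the two-fold threshold, the protein labels, and the function names are all assumptions made here for illustration; the protocol itself defines the actual procedure.

```python
from statistics import median


def normalize_slide(slide):
    """Scale every spot by the slide-wide median intensity (one simple choice)."""
    all_spots = [value for pair in slide.values() for value in pair]
    m = median(all_spots)
    if m == 0:
        raise ValueError("slide median is zero; cannot normalize")
    return {protein: (a / m, b / m) for protein, (a, b) in slide.items()}


def score_substrates(kinase_slide, control_slide, min_fold=2.0):
    """Return proteins whose BOTH duplicate spots exceed the control by min_fold.

    Each slide maps protein -> (duplicate_1, duplicate_2) raw intensities.
    min_fold is a hypothetical threshold, not a value from the protocol.
    """
    kinase = normalize_slide(kinase_slide)
    control = normalize_slide(control_slide)
    hits = []
    for protein, duplicates in kinase.items():
        if protein not in control:
            continue  # no control measurement, so the spot cannot be scored
        control_level = max(control[protein])
        # Requiring the weaker duplicate to pass enforces reproducibility.
        if control_level > 0 and min(duplicates) / control_level >= min_fold:
            hits.append(protein)
    return sorted(hits)


# Toy intensities, invented for illustration.
kinase_slide = {"P1": (900, 1000), "P2": (100, 110), "P3": (800, 90), "P4": (100, 100)}
control_slide = {"P1": (100, 100), "P2": (100, 100), "P3": (100, 100), "P4": (100, 100)}
positives = score_substrates(kinase_slide, control_slide)
```

In the toy data only P1 is scored as a substrate: P3 has one strong spot, but its duplicate does not reproduce the signal.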
Establishment and regulation of chromatin domains: Mechanistic insights from studies of hemoglobin synthesis
PROGRESS IN NUCLEIC ACID RESEARCH AND MOLECULAR BIOLOGY, VOL 81
2006; 81: 435-471
Chromatin domain activation via GATA-1 utilization of a small subset of dispersed GATA motifs within a broad chromosomal region
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2005; 102 (47): 17065-17070
Cis elements that mediate transcription factor binding are abundant within genomes, but the rules governing occupancy of such motifs in chromatin are not understood. The transcription factor GATA-1 that regulates red blood cell development binds with high affinity to GATA motifs, and initial studies suggest that these motifs are often unavailable for occupancy in chromatin. Whereas GATA-2 regulates the differentiation of all blood cell lineages via GATA motif binding, the specificity of GATA-2 chromatin occupancy has not been studied. We found that conditionally active GATA-1 (ER-GATA-1) and GATA-2 occupy only a small subset of the conserved GATA motifs within the murine beta-globin locus. Kinetic analyses in GATA-1-null cells indicated that ER-GATA-1 preferentially occupied GATA motifs at the locus control region (LCR), in which chromatin accessibility is largely GATA-1-independent. Subsequently, ER-GATA-1 increased promoter accessibility and occupied the betamajor promoter. ER-GATA-1 increased erythroid Krüppel-like factor and SWI/SNF chromatin remodeling complex occupancy at restricted LCR sites. These studies revealed three phases of beta-globin locus activation: GATA-1-independent establishment of specific chromatin structure features, GATA-1-dependent LCR complex assembly, and GATA-1-dependent promoter complex assembly. The differential utilization of dispersed GATA motifs therefore establishes spatial/temporal regulation and underlies the multistep activation mechanism.
View details for DOI 10.1073/pnas.0506164102
View details for Web of Science ID 000233463200030
View details for PubMedID 16286657
Measurement of protein-DNA interactions in vivo by chromatin immunoprecipitation.
Methods in molecular biology (Clifton, N.J.)
2004; 284: 129-146
Elucidating mechanisms controlling nuclear processes requires an understanding of the nucleoprotein structure of genes at endogenous chromosomal loci. Traditional approaches to measuring protein-DNA interactions in vitro have often failed to provide insights into physiological mechanisms. Given that most transcription factors interact with simple DNA sequence motifs, which are abundantly distributed throughout a genome, it is essential to pinpoint the small subset of sites bound by factors in vivo. Signaling mechanisms induce the assembly and modulation of complex patterns of histone acetylation, methylation, phosphorylation, and ubiquitination, which are crucial determinants of chromatin accessibility. These seemingly complex issues can be directly addressed by a powerful methodology termed the chromatin immunoprecipitation (ChIP) assay. ChIP analysis involves covalently trapping endogenous proteins at chromatin sites, thereby yielding snapshots of protein-DNA interactions and histone modifications within living cells. The chromatin is sonicated to generate small fragments, and an immunoprecipitation is conducted with an antibody against the desired factor or histone modification. Crosslinks are reversed, and polymerase chain reaction (PCR) is used to assess whether DNA sequences are recovered immune-specifically. Chromatin-domain scanning coupled with quantitative analysis is a powerful means of dissecting mechanisms by which signaling pathways target genes within a complex genome.
View details for PubMedID 15173613
Highly restricted localization of RNA polymerase II within a locus control region of a tissue-specific chromatin domain
MOLECULAR AND CELLULAR BIOLOGY
2003; 23 (18): 6484-6493
RNA polymerase II (Pol II) can associate with regulatory elements far from promoters. For the murine beta-globin locus, Pol II binds the beta-globin locus control region (LCR) far upstream of the beta-globin promoters, independent of recruitment to and activation of the betamajor promoter. We describe here an analysis of where Pol II resides within the LCR, how it is recruited to the LCR, and the functional consequences of recruitment. High-resolution analysis of the distribution of Pol II revealed that Pol II binding within the LCR is restricted to the hypersensitive sites. Blocking elongation eliminated the synthesis of genic and extragenic transcripts and eliminated Pol II from the betamajor open reading frame. However, the elongation blockade did not redistribute Pol II at the hypersensitive sites, suggesting that Pol II is recruited to these sites. The distribution of Pol II did not strictly correlate with the distributions of histone acetylation and methylation. As Pol II associates with histone-modifying enzymes, Pol II tracking might be critical for establishing and maintaining broad histone modification patterns. However, blocking elongation did not disrupt the histone modification pattern of the beta-globin locus, indicating that Pol II tracking is not required to maintain the pattern.
View details for DOI 10.1128/MCB.23.18.6484-6493.2003
View details for Web of Science ID 000185103900013
View details for PubMedID 12944475
Dynamic regulation of histone H3 methylated at lysine 79 within a tissue-specific chromatin domain
JOURNAL OF BIOLOGICAL CHEMISTRY
2003; 278 (20): 18346-18352
Post-translational modifications of individual lysine residues of core histones can exert unique functional consequences. For example, methylation of histone H3 at lysine 79 (H3-meK79) has been implicated recently in gene silencing in Saccharomyces cerevisiae. However, the distribution and function of H3-meK79 in mammalian chromatin are not known. We found that H3-meK79 has a variable distribution within the murine beta-globin locus in adult erythroid cells, being preferentially enriched at the active betamajor gene. By contrast, acetylated H3 and H4 and H3 methylated at lysine 4 were enriched both at betamajor and at the upstream locus control region. H3-meK79 was also enriched at the active cad gene, whereas the transcriptionally inactive loci necdin and MyoD1 contained very little H3-meK79. As the pattern of H3-meK79 at the beta-globin locus differed between adult and embryonic erythroid cells, establishment and/or maintenance of H3-meK79 was developmentally dynamic. Genetic complementation analysis in null cells lacking the erythroid and megakaryocyte-specific transcription factor p45/NF-E2 showed that p45/NF-E2 preferentially establishes H3-meK79 at the betamajor promoter. These results support a model in which H3-meK79 is strongly enriched in mammalian chromatin at active genes but not uniformly throughout active chromatin domains. As H3-meK79 is highly regulated at the beta-globin locus, we propose that the murine ortholog of Disruptor of Telomeric Silencing-1-like (mDOT1L) methyltransferase, which synthesizes H3-meK79, regulates beta-globin transcription.
View details for DOI 10.1074/jbc.M300890200
View details for Web of Science ID 000182838300099
View details for PubMedID 12604594
Histone deacetylase-dependent establishment and maintenance of broad low-level histone acetylation within a tissue-specific chromatin domain
2002; 41 (51): 15152-15160
The murine beta-globin locus in adult erythroid cells is characterized by a broad pattern of erythroid-specific histone acetylation. The embryonic beta-globin genes Ey and betaH1 are located in an approximately 30 kb central subdomain characterized by low-level histone acetylation, while the fetal/adult genes betamajor and betaminor and the upstream locus control region reside in hyperacetylated chromatin. Histone deacetylase (HDAC) inhibitors induce H4 acetylation at the Ey promoter [Forsberg, E. C., Downs, K. M., Christensen, H. M., Im, H., Nuzzi, P. A., and Bresnick, E. H. (2000) Proc. Natl. Acad. Sci. U.S.A. 97, 14494-14499], indicating that HDACs maintain low-level H4 acetylation at this site. Since little is known about the establishment of broad histone modification patterns, we asked whether this mechanism applies only to the promoter or to the entire subdomain. We show that the HDAC inhibitor trichostatin A induces H4 hyperacetylation at multiple sites within the subdomain in erythroid cells. The hematopoietic factors p45/NF-E2, GATA-1, and erythroid Krüppel-like factor (EKLF), which function through cis elements of the beta-globin locus, were not required for induction of H4 hyperacetylation. Analysis of chromatin structure within the subdomain revealed low accessibility to restriction endonucleases and nearly complete CpG dinucleotide methylation. Induction of H4 hyperacetylation did not restore hallmark features of transcriptionally active chromatin. We propose that an HDAC-dependent surveillance mechanism counteracts constitutive histone acetyltransferase (HAT) access, thereby maintaining low-level H4 acetylation throughout the subdomain.
View details for DOI 10.1021/bi026786q
View details for Web of Science ID 000180015100007
View details for PubMedID 12484752
Developmentally dynamic histone acetylation pattern of a tissue-specific chromatin domain
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
2000; 97 (26): 14494-14499
We have defined the histone acetylation pattern of the endogenous murine beta-globin domain, which contains the erythroid-specific beta-globin genes. The beta-globin locus control region (LCR) and transcriptionally active promoters were enriched in acetylated histones in fetal liver relative to fetal brain, whereas the inactive promoters were hypoacetylated. In contrast, the LCR and both active and inactive promoters were hyperacetylated in yolk sac. Hypersensitive site two of the LCR was also hyperacetylated in murine embryonic stem cells, whereas beta-globin promoters were hypoacetylated. Thus, the acetylation pattern varied at different developmental stages. Histone deacetylase inhibition selectively increased acetylation at a hypoacetylated promoter in fetal liver, suggesting that active deacetylation contributes to silencing of promoters. We propose that dynamic histone acetylation and deacetylation play an important role in the developmental control of beta-globin gene expression.
View details for Web of Science ID 000165993700092
View details for PubMedID 11121052