RNA & Protein Profiling, & Protein-DNA Interaction Mapping
Authors:
Peter J. Kennelly, Kathleen M. Botham, Owen P. McGuinness, Victor W. Rodwell, P. Anthony Weil
Source:
Harper's Illustrated Biochemistry
Volume and page:
32nd edition, pp. 454-455
2025-10-30
The “-omic” revolution of the last decade has culminated in the determination of the complete nucleotide sequence of many thousands of genomes, including those of budding and fission yeasts, numerous bacteria, the fruit fly, the worm Caenorhabditis elegans, plants, the mouse, rat, chicken, monkey, and, most notably, humans. Additional genomes are being sequenced at an accelerating pace. The availability of all of this DNA sequence information, coupled with engineering advances, has led to the development of several revolutionary methodologies, most of which are based on second- and third-generation sequencing platforms. In the case of automated DNA sequencing, mRNAs are reverse transcribed into cDNAs using retroviral RNA-dependent DNA polymerases (reverse transcriptases). The resulting cDNAs are amplified and directly sequenced; this method is termed RNA-Seq. It allows for the quantitative description of the entire transcriptome. Recent reports have used RNA-Seq to describe the transcriptome of single cells and, when coupled with high-sensitivity ribosome profiling (see later) and mass spectrometry–based proteomics (see later), to confidently define gene expression profiles at the mRNA and protein levels.
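As a rough illustration of what "quantitative description" means in practice, the sketch below converts raw RNA-Seq read counts into transcripts per million (TPM), one widely used normalization. The text does not name a specific normalization, so the choice of TPM, the function name, and the gene names, counts, and lengths are all assumptions made for the example.

```python
def tpm(read_counts, lengths_bp):
    """Convert raw read counts per gene to transcripts per million (TPM).

    read_counts and lengths_bp are dicts keyed by gene name (hypothetical data).
    """
    # Step 1: reads per kilobase, which corrects for transcript length.
    rpk = {g: read_counts[g] / (lengths_bp[g] / 1000.0) for g in read_counts}
    total = sum(rpk.values())
    if total == 0:
        raise ValueError("no reads were counted for any gene")
    # Step 2: rescale so that the values sum to one million.
    return {g: value / total * 1_000_000 for g, value in rpk.items()}


# Made-up counts and transcript lengths, for illustration only.
counts = {"geneA": 500, "geneB": 1000, "geneC": 250}
lengths = {"geneA": 1000, "geneB": 4000, "geneC": 500}
values = tpm(counts, lengths)

for gene, value in sorted(values.items()):
    print(f"{gene}: {value:.0f} TPM")
```

Note that geneB has the most reads but the lowest TPM, because its reads are spread over a longer transcript.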
Recent methodologic advances (PRO-Seq, precision run-on sequencing, and NET-Seq, native elongating transcript sequencing) allow for sequencing of RNA within elongating RNA polymerase-DNA-RNA ternary complexes, thereby allowing nucleotide-level descriptions, genome-wide, of active transcription in living cells. A parallel method termed ribosome profiling allows investigators to use high-throughput DNA sequencing to determine both the identity and number of cellular mRNAs in the process of being actively translated, thereby defining the cellular proteome. Transcriptome information allows one to quantitatively predict the collection of proteins that might be expressed in a particular cell, tissue, or organ in normal and disease states based on the mRNAs present in those cells, while ribosome profiling allows for the quantitative measurement of the actual cellular proteome, particularly when coupled with newer high-sensitivity mass spectrometry to analyze the protein content and posttranslational modification (PTM) status of cellular proteomes (see later).
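One way the two kinds of data are often combined is a per-gene "translation efficiency": the ratio of ribosome footprint abundance to mRNA abundance. The text does not describe this calculation, so the sketch below is an illustrative assumption; the function name, the pseudocount, and the abundance values are hypothetical.

```python
import math


def translation_efficiency(ribo_abundance, rna_abundance, pseudocount=1.0):
    """Return log2(ribosome footprint abundance / mRNA abundance) per gene.

    A pseudocount is added to both values (an assumption of this sketch)
    so that genes with zero signal do not cause division by zero.
    """
    te = {}
    for gene, rna in rna_abundance.items():
        ribo = ribo_abundance.get(gene, 0.0)
        te[gene] = math.log2((ribo + pseudocount) / (rna + pseudocount))
    return te


# Made-up normalized abundances: equal mRNA levels, different ribosome loads.
rna = {"geneA": 99.0, "geneB": 99.0, "geneC": 99.0}
ribo = {"geneA": 399.0, "geneB": 99.0, "geneC": 24.0}
te = translation_efficiency(ribo, rna)

for gene in sorted(te):
    print(f"{gene}: log2 TE = {te[gene]:+.1f}")
```

In this toy data the three genes have identical mRNA levels, yet geneA is translated more heavily and geneC less heavily than geneB, which is the kind of difference that transcriptome data alone cannot reveal.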
Complementing the very high-throughput, genome-wide expression profiling methods described earlier is the development of methods to map the location, or occupancy, of specific proteins bound to discrete DNA sequences within living cells. This method, illustrated in Figure 1, is termed chromatin immunoprecipitation (ChIP). Proteins are cross-linked in situ in cells or tissues, the cellular chromatin is isolated and sheared, and specific cross-linked protein-DNA complexes are purified using antibodies that recognize a particular protein or protein isoform. DNA bound to this protein is recovered, amplified using PCR, and analyzed using gel electrophoresis imaging, microarray hybridization (ChIP-chip), or direct sequencing. There are two versions of the DNA sequencing readout. In the first, the immunopurified DNA is directly subjected to NGS/high-throughput DNA sequencing (ChIP-Seq). In the second, termed ChIP-Exo, the immunopurified, cross-linked protein-DNA complex is first treated with exonucleases to remove cross-linked DNA sequences that are not in intimate contact with the protein of interest. Collectively, ChIP-chip and ChIP-Seq methods allow investigators to identify the locations of a single protein genome-wide, throughout all the chromosomes. ChIP-Exo has the added advantage of allowing investigators to map in vivo protein occupancy at single-nucleotide resolution.
Finally, methods for high-sensitivity, high-throughput mass spectrometry of metabolites (metabolomics), various small molecules (lipids, lipidomics; carbohydrates, glycomics, etc.), and complex protein samples (proteomics) have been developed. Newer mass spectrometry methods allow scientists to identify thousands of proteins in complex samples extracted from very small numbers of cells (<1 g).
Such analyses can now be used to quantify the relative amounts of proteins in two samples, as well as the levels of certain PTMs, such as phosphorylation and acetylation, and, with the use of specific antibodies, to define specific protein-protein interactions. This critical information tells investigators which of the many mRNAs detected in transcriptome mapping studies are being translated into protein, generally the ultimate dictator of cellular, tissue, and organismal phenotypes.

Fig. 1. Outline of the chromatin immunoprecipitation (ChIP) technique. This method allows for the precise localization of a particular protein (or modified protein if an appropriate antibody is available; for example, antibodies against phosphorylated or acetylated histones, transcription factors, etc.) on a particular sequence element in living cells. Depending on the method used to analyze the immunopurified DNA, quantitative or semiquantitative information, at near nucleotide-level resolution, can be obtained. Protein-DNA occupancy can be scored genome-wide in two ways. The first, ChIP-chip, uses a hybridization readout. In ChIP-chip, total genomic DNA is labeled with one fluorophore and the immunopurified DNA is labeled with a spectrally distinct fluorophore. These differentially labeled DNAs are mixed and hybridized to microarray “chips” (microscope slides) that contain specific DNA fragments or, more commonly now, synthetic oligonucleotides 50 to 70 nucleotides long. These gene-specific oligonucleotides are deposited and covalently attached at predetermined, known X,Y coordinates on the slide. The labeled DNAs are hybridized, the slides are washed, and hybridization to each gene-specific oligonucleotide probe is scored using differential laser scanning and sensitive photodetection at micron resolution. The hybridization signal intensities are quantified, and the ratio of IP DNA to genomic DNA signals is used to score occupancy levels. The second method, termed ChIP-Seq, directly sequences immunopurified DNAs using NGS methods. Two variants of ChIP-Seq are shown: “standard” ChIP-Seq and ChIP-Exo. These two approaches differ in their ability to resolve and map the locations of the bound protein on genomic DNA. Standard ChIP-Seq has a resolution of 50 to 100 nt, while ChIP-Exo has near single-nt resolution. Both approaches rely on efficient bioinformatic algorithms to deal with the very large datasets that are generated.
ChIP-chip and ChIP-Seq techniques provide a (semi-) quantitative measure of in vivo protein occupancy. Though not schematized here, similar methods termed RIP (RNA immunoprecipitation) or CLIP (cross-linking protein-RNA and immunoprecipitation), which differ primarily in the method of protein-RNA cross-linking, can score the in vivo binding of specific proteins to specific RNA species (typically mRNAs, though any RNA species can be analyzed by these techniques).
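The ChIP-chip scoring step described in the caption, the ratio of IP DNA signal to genomic DNA signal at each probe, can be sketched as follows. This is a minimal illustration rather than a full analysis pipeline: real workflows also normalize the two fluorescence channels and apply statistical tests, which are omitted here. The probe names, intensities, signal floor, and the twofold threshold are hypothetical.

```python
import math


def chip_chip_log_ratios(ip_signal, genomic_signal, floor=1.0):
    """Return log2(IP signal / genomic signal) for each probe.

    Signals below `floor` are raised to `floor` (an assumption of this
    sketch) so that empty spots do not produce division by zero.
    """
    ratios = {}
    for probe, ip in ip_signal.items():
        genomic = genomic_signal[probe]
        ratios[probe] = math.log2(max(ip, floor) / max(genomic, floor))
    return ratios


def enriched_probes(log_ratios, threshold=1.0):
    """Probes whose log2 ratio meets the threshold (1.0 means twofold)."""
    return sorted(p for p, r in log_ratios.items() if r >= threshold)


# Made-up fluorescence intensities for three probes.
ip = {"probe1": 800.0, "probe2": 200.0, "probe3": 1600.0}
genomic = {"probe1": 200.0, "probe2": 200.0, "probe3": 400.0}

log_ratios = chip_chip_log_ratios(ip, genomic)
occupied = enriched_probes(log_ratios)
print(occupied)
```

Here probe1 and probe3 show fourfold enrichment in the immunopurified DNA and would be scored as occupied, while probe2 is at background.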
New genetic means for identifying protein–protein interactions and protein function have also been devised. Systematic genome-wide gene expression knockdown using siRNAs, synthetic lethal genetic interaction screens, or, most recently, CRISPR-Cas9 knockdown has been used to assess the contribution of individual genes to a variety of processes in model systems (yeast, worms, and flies) and mammalian cells (human and mouse). Specific networks of protein–protein interactions have been mapped on a genome-wide basis using high-throughput variants of the two-hybrid interaction test (Figure 2). This simple yet powerful method can be performed in bacteria, yeast, or metazoan cells, and allows for the detection of specific protein–protein interactions in living cells. Reconstruction experiments indicate that protein–protein interactions with binding affinities of Kd ~10⁻⁶ mol/L or tighter can readily be detected with this method. Together, these technologies provide powerful tools with which to dissect the intricacies of human biology.
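To give a feel for what a Kd of about 10⁻⁶ mol/L implies, the sketch below uses the standard expression for simple 1:1 equilibrium binding, fraction bound = [P] / ([P] + Kd), where [P] is the free concentration of the binding partner. The 1:1 model and the example concentrations are assumptions made for illustration; the text does not state the concentrations of the fusion proteins in a two-hybrid experiment.

```python
def fraction_bound(free_partner_molar, kd_molar):
    """Fraction of a protein bound by its partner for simple 1:1 binding.

    free_partner_molar: free concentration of the partner, in mol/L.
    kd_molar: dissociation constant, in mol/L.
    """
    if free_partner_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_partner_molar / (free_partner_molar + kd_molar)


KD = 1e-6  # mol/L, the detection limit quoted in the text

# Hypothetical partner concentrations spanning two orders of magnitude.
for conc in (1e-7, 1e-6, 1e-5):
    print(f"[P] = {conc:.0e} mol/L -> {fraction_bound(conc, KD):.2f} bound")
```

When the partner concentration equals Kd, half of the protein is bound; a tighter interaction (smaller Kd) reaches the same occupancy at a lower concentration, which is consistent with tighter interactions being easier to detect.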

Fig. 2. Overview of the two-hybrid system for identifying and characterizing protein–protein interactions. Shown are the basic components and operation of the two-hybrid system, originally devised by Fields and Song (Nature 340:245-246 [1989]) to function in the baker’s yeast system. (1) A reporter gene, either a selectable marker (ie, a gene conferring prototrophic growth on selective media) or a gene producing an enzyme for which a colony colorimetric assay exists (such as β-galactosidase), that is expressed only when a transcription factor binds upstream to a cis-linked enhancer (dark red bar). (2) A “bait” fusion protein (DBD-X) produced from a chimeric gene expressing a modular DNA-binding domain (DBD; often derived from the yeast Gal4 protein or the bacterial LexA protein, both high-affinity, high-specificity DNA-binding proteins) fused in-frame to a protein of interest, here X. In two-hybrid experiments, one is testing whether any protein can interact with protein X. Bait protein X may be fused in its entirety or, often, just a portion of protein X is expressed in-frame with the DBD. (3) A “prey” protein (Y-AD), a fusion of a specific protein, here Y, in-frame to a transcriptional activation domain (AD; often derived from either the herpes simplex virus VP16 activator protein or the yeast Gal4 protein). This system serves as a useful test of protein–protein interactions between proteins X and Y because in the absence of a functional transactivator binding to the indicated enhancer, no transcription of the reporter gene occurs (see Figure 38–16). Thus, one observes transcription only if a protein X–protein Y interaction occurs, thereby bringing a functional AD to the cis-linked transcription unit, in this case activating transcription of the reporter gene. In this scenario, protein DBD-X alone fails to activate reporter transcription because the X domain fused to the DBD does not contain an AD.
Similarly, protein Y-AD alone fails to activate reporter gene transcription because it lacks a DBD to target it to the enhancer-promoter-reporter gene. Only when both proteins are expressed in a single cell and, via DBD-X–Y-AD protein–protein interactions, regenerate a functional binary transactivator “protein” bound to the enhancer is reporter gene transcription activated, resulting in mRNA synthesis (green line from AD to reporter gene).
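The logic of the two-hybrid readout described in this caption can be summarized as a simple truth table: the reporter is transcribed only when the bait (DBD-X) and the prey (Y-AD) are both present and X binds Y. The sketch below is an idealized model made for illustration; it ignores real-world complications such as a bait that activates transcription on its own, and the function and variable names are hypothetical.

```python
from itertools import product


def reporter_transcribed(has_dbd_x, has_y_ad, x_binds_y):
    """Idealized two-hybrid readout.

    The reporter is on only if the bait (DBD-X) is present to bind the
    enhancer, the prey (Y-AD) is present to supply an activation domain,
    and X and Y interact so that the AD is recruited to the enhancer.
    """
    return has_dbd_x and has_y_ad and x_binds_y


# Enumerate every combination of the three conditions.
truth_table = {
    combo: reporter_transcribed(*combo)
    for combo in product([False, True], repeat=3)
}

for (bait, prey, interact), reporter_on in truth_table.items():
    print(f"DBD-X={bait!s:5} Y-AD={prey!s:5} X binds Y={interact!s:5} "
          f"-> reporter {'ON' if reporter_on else 'off'}")
```

Only one of the eight combinations turns the reporter on, which is why reporter expression can be read as evidence of an interaction between X and Y.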