Doctor of Philosophy, Ben Gurion University of the Negev (2011)
Andrew Fire, Postdoctoral Faculty Sponsor
In certain organisms, the number of crossover events on any single chromosome is limited ("crossover interference"), so that double crossovers occur at much lower frequencies than expected from the simple product of independent single-crossover frequencies. We report observations of interference over a large region of Caenorhabditis elegans chromosome V. Examining this region for multiple crossover events in heteroallelic configurations with limited dimorphism, we observed high levels of crossover interference in oocytes and only partial interference in spermatocytes.
View details for DOI 10.1534/g3.113.008672
View details for Web of Science ID 000332597000013
View details for PubMedID 24240780
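The comparison in the abstract above, between observed double crossovers and the product of single-crossover frequencies, is commonly summarized by the coefficient of coincidence. The sketch below is a minimal illustration under assumed inputs: the function name, the two-interval setup, and the example counts are hypothetical and are not taken from the paper.

```python
def coefficient_of_coincidence(n_total, n_single_a, n_single_b, n_double):
    """Return (expected_doubles, coincidence, interference) for two intervals.

    n_single_a / n_single_b: progeny with a crossover in interval A only / B only.
    n_double: progeny with a crossover in both intervals.
    All counts are hypothetical inputs chosen for illustration.
    """
    # Each interval's recombination frequency includes the double crossovers.
    rf_a = (n_single_a + n_double) / n_total
    rf_b = (n_single_b + n_double) / n_total
    # Under independence, doubles occur at the product of the two frequencies.
    expected = rf_a * rf_b * n_total
    if expected == 0:
        raise ValueError("expected double-crossover count is zero")
    coincidence = n_double / expected
    # Interference is 1 when no doubles are seen and 0 under independence.
    return expected, coincidence, 1.0 - coincidence


if __name__ == "__main__":
    exp, coc, interference = coefficient_of_coincidence(1000, 90, 190, 10)
    print(f"expected doubles={exp:.1f}, coincidence={coc:.2f}, "
          f"interference={interference:.2f}")
```

With these made-up counts, half as many doubles are observed as independence would predict, which corresponds to partial interference.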
The secondary structure of RNAs can be represented by graphs at various resolutions. While it has been shown that RNA secondary structures can be represented by coarse-grained tree graphs, and that meaningful topological indices can distinguish between various structures, small RNAs need to be represented by full graphs. No meaningful topological index has yet been suggested for the analysis of this type of RNA graph. Recalling that the second eigenvalue of the Laplacian matrix can be used to track topological changes in coarse-grained tree graphs, it is plausible to assume that a topological index such as the Wiener index, which represents all Laplacian eigenvalues, may provide a similar guide for full graphs. However, the Wiener index was originally defined for acyclic graphs. Nevertheless, like cyclic chemical graphs, small RNA graphs can be analyzed using elementary cuts, which enables the calculation of topological indices for small RNAs in an intuitive way. We show how to calculate the Szeged index, a structural descriptor that is suitable for cyclic graphs, for small RNA graphs by elementary cuts. We discuss potential uses of such a procedure, which considers all eigenvalues of the associated Laplacian matrices, to quantify the topology of small RNA graphs.
View details for DOI 10.1016/j.compbiolchem.2012.10.004
View details for Web of Science ID 000313772100004
View details for PubMedID 23147564
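To make the two indices named in the abstract above concrete, the following sketch computes the Wiener and Szeged indices directly from their standard definitions using breadth-first search. It does not use the elementary-cut procedure the paper describes. The edge-list representation and function names are assumptions, and the graph is assumed to be connected with each edge listed once.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path distances (in edges) from source to every vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def wiener_index(edges):
    """Sum of shortest-path distances over all unordered vertex pairs."""
    adj = build_adjacency(edges)
    total = sum(sum(bfs_distances(adj, s).values()) for s in adj)
    return total // 2  # every pair was counted from both endpoints


def szeged_index(edges):
    """Sum over edges (u, v) of n_u * n_v, where n_u counts the vertices
    strictly closer to u than to v, and n_v the reverse."""
    adj = build_adjacency(edges)
    dist = {s: bfs_distances(adj, s) for s in adj}
    total = 0
    for u, v in edges:
        n_u = sum(1 for w in adj if dist[u][w] < dist[v][w])
        n_v = sum(1 for w in adj if dist[v][w] < dist[u][w])
        total += n_u * n_v
    return total


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]           # acyclic: the indices coincide
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # cyclic: the indices differ
    print(wiener_index(path), szeged_index(path))
    print(wiener_index(cycle), szeged_index(cycle))
```

On trees the two indices agree, while on graphs with cycles they generally differ, as the four-vertex cycle in the example shows.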
RNA mutational analysis at the secondary-structure level can be useful in a wide range of biological applications. It can be used to predict an optimal site for a nucleotide mutation at the single-molecule level, as well as to analyze basic phenomena at the systems level. For the former, as more sequence modification experiments that include site-directed mutagenesis are performed to find and explore functional motifs in RNAs, a pre-processing step that helps guide the planning of the experiment becomes vital. For the latter, mutations are generally accepted as a central mechanism by which evolution occurs, and mutational analysis relating to structure should yield a better understanding of system functionality and evolution. In the past several years, the structure-based program RNAmute, which relies on RNA secondary-structure prediction, has been developed to assist in RNA mutational analysis. It has been extended from single-point mutations to treat multiple-point mutations efficiently by initially calculating all suboptimal solutions, after which only the mutations that stabilize the suboptimal solutions and destabilize the optimal one are considered as candidates for being deleterious. The RNAmute web server for mutational analysis is available at http://www.cs.bgu.ac.il/~xrnamute/XRNAmute.
View details for DOI 10.1093/nar/gkr207
View details for Web of Science ID 000292325300016
View details for PubMedID 21478166
The nucleosome DNA bendability pattern extracted from a large nucleosome DNA database of C. elegans is used to construct a full-length (116 dinucleotide positions) nucleosome DNA bendability matrix. The matrix can be used for sequence-directed mapping of nucleosomes on DNA sequences. Several alternative positions for a given nucleosome are typically predicted, separated by multiples of the nucleosome DNA period. The corresponding computer program is successfully tested on the best-known experimental examples of accurately positioned nucleosomes. The uncertainty of the computational mapping is +/-1 base. The procedure is available on a publicly accessible server and can be applied to any DNA sequence of interest.
View details for Web of Science ID 000278516000009
View details for PubMedID 20476799
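The sequence-directed mapping described in the abstract above can be pictured as sliding a dinucleotide-by-position matrix along a sequence and scoring each offset. The sketch below illustrates only that general idea. The three-position toy matrix, its values, and the function names are hypothetical; the published matrix has 116 dinucleotide positions, and its values and the exact scoring rule are not given here.

```python
def score_window(seq, start, matrix, width):
    """Sum the matrix entries for the dinucleotides in one window.

    matrix maps a dinucleotide (e.g. "AT") to a list of `width` scores,
    one per dinucleotide position; missing dinucleotides score 0.
    """
    total = 0.0
    for k in range(width):
        row = matrix.get(seq[start + k:start + k + 2])
        if row is not None:
            total += row[k]
    return total


def map_positions(seq, matrix, width):
    """Score every window; a window of `width` dinucleotides spans width + 1 bases."""
    n_windows = len(seq) - width
    return [(start, score_window(seq, start, matrix, width))
            for start in range(n_windows)]


if __name__ == "__main__":
    # Toy matrix (assumption): AT preferred at position 0, CG at position 2.
    toy_matrix = {"AT": [1.0, 0.0, 0.0], "CG": [0.0, 0.0, 1.0]}
    scores = map_positions("GGATCGTT", toy_matrix, width=3)
    best = max(scores, key=lambda item: item[1])
    print(scores)
    print("best offset:", best)
```

In a periodic matrix, offsets shifted by a multiple of the period tend to score similarly, which is consistent with the alternative positions mentioned in the abstract.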
The DNA in eukaryotic cells is packed into chromatin, which is composed of nucleosomes. Positioning of the nucleosome core particles on the sequence is a problem of great interest because of the role nucleosomes play in different cellular processes, including gene regulation. Using the sequence structure of the 10.4 base DNA repeat presented in our previous works and a database of nucleosome core DNA sequences, we have derived the complete nucleosome DNA bendability matrix of Caenorhabditis elegans. We have developed a web server named FineStr that allows users to upload genomic sequences in FASTA format and to perform single-base-resolution nucleosome mapping on them. The FineStr server is freely available for use on the web at http://www.cs.bgu.ac.il/~nucleom. The site contains a help file explaining its usage.
View details for DOI 10.1093/bioinformatics/btq030
View details for Web of Science ID 000275243500021
View details for PubMedID 20106816
Heat shock proteins (HSPs) provide a useful system for studying developmental patterns in the digenetic Leishmania parasites, since their expression is induced in the mammalian life form. Translation regulation plays a key role in the control of protein-coding genes in trypanosomatids, and is directed exclusively by elements in the 3' untranslated region (UTR). Using sequential deletions of the Leishmania Hsp83 3' UTR (888 nucleotides [nt]), we mapped a region of 150 nt that was required, but not sufficient, for preferential translation of a reporter gene at mammalian-like temperatures, suggesting that changes in RNA structure could be involved. An advanced bioinformatics package for prediction of RNA folding (UNAfold) placed the regulatory region on a highly probable structural arm that includes a polypyrimidine tract (PPT). Mutagenesis of this PPT completely abrogated preferential translation of the fused reporter gene. Furthermore, temperature elevation caused the regulatory region to melt more extensively than the same region lacking the PPT. We propose that at elevated temperatures the regulatory element in the 3' UTR is more accessible to mediators that promote its interaction with the basal translation components at the 5' end during mRNA circularization. Translation initiation of Hsp83 at all temperatures appears to proceed via scanning of the 5' UTR, since a hairpin structure abolishes expression of a fused reporter gene.
View details for DOI 10.1261/rna.1874710
View details for Web of Science ID 000273868900013
View details for PubMedID 20040590
An original signal extraction procedure is applied to a database of 146 base nucleosome core DNA sequences from C. elegans (S. M. Johnson et al. Genome Research 16, 1505-1516, 2006). The positional preferences of various dinucleotides within the 10.4 base nucleosome DNA repeat are calculated, resulting in the derivation of a nucleosome DNA bendability matrix of 16x10 elements. A simplified one-line presentation of the matrix ("consensus" repeat) is ...A(TTTCCGGAAA)T.... All 6 chromosomes of C. elegans conform to the bendability pattern. The strongest affinity to their respective positions is displayed by the dinucleotides AT and CG, separated within the repeat by 5 bases. The derived pattern provides a basis for sequence-directed mapping of nucleosome positions in the genome of C. elegans. As the first complete matrix of bendability available, the pattern may serve for iterative calculations of species-specific bendability matrices applicable to other genomic sequences.
View details for Web of Science ID 000262917000001
View details for PubMedID 19108579
Three-way junctions in folded RNAs have been investigated both experimentally and computationally. The interest in their analysis stems from the fact that they have frequently been found to possess a functional role. In recent work, three-way junctions have been categorized into families depending on the relative lengths of the segments linking the three helices. Here, based on ideas originating from computational geometry, an algorithm is proposed for detecting three-way junctions in data sets of genes that are related to a metabolic pathway of interest. In its current implementation, the algorithm relies on a moving window that performs energy minimization folding predictions, and is demonstrated on a set of genes that are involved in purine metabolism in plants. The pattern matching algorithm can be extended to other organisms and other metabolic cycles of interest in which three-way junctions have been or will be discovered to play an important role. In the test case presented herewith, the computational prediction of a three-way junction in Arabidopsis that was speculated to have an interesting functional role is verified experimentally.
View details for PubMedID 18928199
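As a rough illustration of what detecting a three-way junction involves, the sketch below scans a secondary structure in dot-bracket notation for loops where three helices meet. It is not the computational-geometry algorithm of the paper. It assumes the structure string has already been produced by a folding program (the folding step and the moving window are out of scope), and the function names are hypothetical.

```python
def pair_table(structure):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, pairs = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: extra ')'")
            j = stack.pop()
            pairs[i], pairs[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: extra '('")
    return pairs


def three_way_junctions(structure):
    """Return closing pairs (i, j) of loops that enclose exactly two helices.

    The closing helix plus two enclosed helices make three helices in total.
    """
    pairs = pair_table(structure)
    found = []
    for i, j in sorted(pairs.items()):
        if i > j:
            continue  # visit each base pair once, from its 5' side
        branches = 0
        p = i + 1
        while p < j:
            q = pairs.get(p)
            if q is not None and q > p:
                branches += 1  # a helix branches off this loop
                p = q + 1      # skip over everything inside that helix
            else:
                p += 1
        if branches == 2:
            found.append((i, j))
    return found


if __name__ == "__main__":
    print(three_way_junctions("((..((...))..((...))..))"))
    print(three_way_junctions("((...))"))
```

In a moving-window setting, a function like this would be applied to the predicted structure of each window, with positions offset by the window start.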
The discovery of natural RNA sensors that respond to a change in the environment by a conformational switch can be utilized for various biotechnological and nanobiotechnological advances. One class of RNA sensors is the riboswitch: an RNA genetic control element that is capable of sensing small molecules, responding to a deviation in ligand concentration with a structural change. Riboswitches are modularly built from smaller components. Computational methods can potentially be utilized in assembling these building block components and offering improvements in the biochemical design process. We describe a computational procedure to design RNA switches from building blocks with favorable properties. To achieve maximal throughput for genetic control purposes, future designer RNA switches can be assembled based on a computerized preprocessing buildup of the constituent domains, namely the aptamer and the expression platform in the case of a synthetic riboswitch. Conformational switching is enabled by the versatility of RNA, which can possess two highly stable states that are energetically close to each other but topologically distinct, separated by an energy barrier. Initially, computer simulations can produce a list of short sequences that switch between two conformers when triggered by point mutations or temperature. The short sequences should possess an additional desirable property: when these selected small RNA switch segments are attached to various aptamers, the ligand binding mechanism should replace the aforementioned event triggers, which will no longer be effective for crossing the energy barrier. In the assembled RNA sequence, energy minimization folding predictions should then show no difference between the folded structure of the entire sequence and the folded structure of each of its constituents.
Moreover, energy minimization methods applied to the entire sequence could aid at this preprocessing stage by exhibiting high mutational robustness to capture the stability of the formed hairpin in the expression platform. The above computer-assisted assembly procedure, together with application-specific considerations, may further be tailored for therapeutic gene regulation. Index Terms: design of RNA switches, energy minimization methods, RNA folding predictions.
View details for DOI 10.1109/TNB.2007.891894
View details for Web of Science ID 000244944600002
View details for PubMedID 17393844
Evolution of the triplet code is reconstructed on the basis of the consensus temporal order of appearance of amino acids. Several important predictions are confirmed by computational sequence analyses. The earliest amino acids, alanine and glycine, have been encoded by GCC and GGC codons, as today. They were succeeded, respectively, by A- and G-series of amino acids, encoded by pyrimidine-central and purine-central codons. The length of the earliest proteins is estimated to be 6-7 residues. The earliest mRNAs were short G+C-rich molecules. These short sequences could have formed hairpins. This is confirmed by analysis of modern prokaryotic mRNA sequences. The predominant size of detected ancient hairpins also corresponds to 6-7 amino acids, as above. Vestiges of the last common ancestor can be found in extant proteins in the form of entirely conserved short sequences of six to nine residues present in all or almost all sequenced prokaryotic proteomes (omnipresent motifs). The functions of the topmost conserved octamers are not involved in the basic elementary syntheses. This suggests an initial abiotic supply of amino acids, bases and sugars.
View details for DOI 10.1007/s11084-006-9042-5
View details for Web of Science ID 000243623600019
View details for PubMedID 17120122
From recent developments of the early evolution theory it follows that the earliest mRNAs were short (~20 nt) (G+C)-rich polynucleotides. These short sequences could form hairpins, which would be of high evolutionary advantage because of the stability and uniqueness of their conformations. Due to mutations accumulated during billions of years of evolution, the speculated earliest hairpins would largely lose their initial complementarities. Some of the original complementary base-to-base contacts, however, may have survived. Computational analysis of modern prokaryotic mRNA sequences reveals an excess of the expected short-range complementarities. The derived earliest mRNA hairpin size fully corresponds to the predicted size of ancient coding duplexes. The repertoire of the surviving hairpins traced in modern mRNA confirms the duplex structure of the earliest mRNA suggested by the early molecular evolution theory.
View details for Web of Science ID 000241066100007
View details for PubMedID 16928139