THESIS
2020
xxiv, 133 pages : illustrations (some color) ; 30 cm
Abstract
The accuracy of DNA replication, gene transcription, and protein translation is critical
to maintaining the integrity of the genetic code. Mistakes that occur during transcription can cause
genomic instability and human diseases. Here, we employed Circular Sequencing (Cir-seq) to
investigate the mutational profiles and landscape of the yeast Saccharomyces cerevisiae
W303-1A-Rad5 R535 strain, which carries a variant of the Rad5 gene that may impair
transcription-coupled DNA repair or promote transcription-replication conflicts. Three batches
of Cir-seq were prepared, with error rates of 1.99 × 10⁻⁶, 4.66 × 10⁻⁶, and 5.27 × 10⁻⁶.
Dominant G-to-U mutations were found in two batches with high coverage. G-to-U errors
cluster where the focal G is flanked by GU and UC. A list of hotspots covering all types of
mutations was obtained for error rates ranging from 10⁻³ to 10⁻⁶; most of them are associated
with ribosomal genes, which may impact proteostasis.
Background error model-coupled precision nuclear run-on circular-sequencing (EmPC-seq)
is our newly developed method for accurate sequencing of nascent RNA produced by RNA
polymerase I. We present the initial development of EmPC-seq and identification of error
hotspots and the features of the RNA polymerase I transcribed region by introducing a
background baseline. Mismatch rates are greatly reduced from (1.43 ± 0.11) × 10⁻³ (PRO-seq) to
(3.79 ± 0.81) × 10⁻⁵ (EmPC-seq), implying that EmPC-seq can successfully remove noise
generated mainly during reverse transcription. One feature of RNA polymerase I is that errors
tend to cluster at the 3’ end. EmPC-seq can be further extended to study different transcriptional
error spectra of pre-mRNA (RNA polymerase II transcribed), siRNAs, and regulatory RNA
between normal and disease cells.
Following the development of EmPC-seq, we were inspired to develop nucleotide
resolution sequencing to obtain the influenza A polymerase (FluA pol) distribution and
mutation spectrum by utilizing the PRO-seq and NET-seq principles. This method is provisionally
named Pre-terminated RNA Elongation with Circular Influenza-A-transcript Sequencing Effort
(PRECISE). We demonstrated that the incorporation of β-D-2’-deoxy-2’-α-fluoro guanosine
triphosphate (2’FdGTP) can stall influenza A polymerase during elongation, terminating the
transcript with a 2’FdGTP end, which resists RNase R degradation. FluA pol also demonstrated
the ability to protect RNA (at least 14 nt) in its active site from various nucleases in
permeabilized extracts.