Integrative methods for reference-independent genome assembly and error detection /
Author / Creator: | Bun, Christopher Chean, author. |
---|---|
Imprint: | 2016. Ann Arbor : ProQuest Dissertations & Theses, 2016 |
Description: | 1 electronic resource (140 pages) |
Language: | English |
Format: | E-Resource Dissertations |
Local Note: | School code: 0330 |
URL for this record: | http://pi.lib.uchicago.edu/1001/cat/bib/11674629 |
MARC
LEADER | 00000ntm a22000003i 4500 | ||
---|---|---|---|
001 | 11674629 | ||
005 | 20170317131747.5 | ||
006 | m o d | ||
007 | cr un|---||||| | ||
008 | 170317s2016 miu|||||om |||||||eng d | ||
003 | ICU | ||
020 | |a 9781369438536 | ||
035 | |a (MiAaPQD)AAI10239438 | ||
040 | |a MiAaPQD |b eng |c MiAaPQD |e rda | ||
100 | 1 | |a Bun, Christopher Chean, |e author. | |
245 | 1 | 0 | |a Integrative methods for reference-independent genome assembly and error detection / |c Bun, Christopher Chean. |
260 | |c 2016. | ||
264 | 1 | |a Ann Arbor : |b ProQuest Dissertations & Theses, |c 2016 | |
300 | |a 1 electronic resource (140 pages) | ||
336 | |a text |b txt |2 rdacontent | ||
337 | |a computer |b c |2 rdamedia | ||
338 | |a online resource |b cr |2 rdacarrier | ||
500 | |a Advisors: Rick Stevens. Committee members: James Davis; Ian Foster; Robert Grossman; Fangfang Xia. | ||
502 | |b Ph.D. |c University of Chicago; Physical Sciences Division; Department of Computer Science |d 2016. | ||
510 | 4 | |a Dissertation Abstracts International, |c Volume: 78-06(E), Section: B. | |
520 | |a High-throughput genetic sequencing technologies have driven the proliferation of new genomic data. From the advent of long-read Sanger sequencing to the now low-cost, short-read generation and upcoming era of single-molecule techniques, methods to address the complex genome assembly problem have evolved alongside and are introduced at an expeditious pace. These algorithms attempt to produce an accurate representation of a target genome from datasets filled with errors and ambiguities. Many of the challenges introduced, unfortunately, must be addressed through an algorithm's ad-hoc criteria and heuristics, and as a result, can output assembly hypotheses that contain significant errors. Without an inexpensive or computational approach to assess the quality of a given assembly hypothesis, researchers must make do with draft-level genome projects for downstream analysis. Solving three fundamental challenges will alleviate this issue: (i) automation and incorporation of algorithms from the dynamic landscape of genome assembly tools, (ii) developing optimal assembly algorithms best suited for various types, or mixtures, of sequencing data, and (iii) developing an approach to assess de novo genome assembly quality independent of a reference genome. | ||
520 | |a We provide several contributions towards this effort: We first introduce AssemblyRAST, a general compute orchestration framework and accompanying domain-specific language that facilitates rapid workflow design for rapid genome assembly, analysis, and method discovery. Next, we demonstrate the improvement of genome assemblies through novel integrative algorithm techniques. Finally, we devise a method for reference-independent assembly evaluation and error identification through supervised learning, along with several applications to further improve existing techniques. | ||
546 | |a English | ||
590 | |a School code: 0330 | ||
690 | |a Computer science. | ||
690 | |a Bioinformatics. | ||
710 | 2 | |a University of Chicago. |e degree granting institution. | |
720 | 1 | |a Rick Stevens |e degree supervisor. | |
856 | 4 | 0 | |u http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:10239438 |y ProQuest |
035 | |a AAI10239438 | ||
929 | |a eresource | ||
999 | f | f | |i 278c8849-a63f-5a13-9748-20cfb3dc8760 |s e3c22603-2ab4-5448-9ae9-a5a6d0e0f2bf |
928 | |t Library of Congress classification |l Online |c UC-FullText |u http://gateway.proquest.com/openurl?url_ver=Z39.88-2004&rft_val_fmt=info:ofi/fmt:kev:mtx:dissertation&res_dat=xri:pqm&rft_dat=xri:pqdiss:10239438 |z ProQuest |g ebooks |i 11097584 |