Fall 2014 – Bioinformatics and Genomic Applications
Class time: Tuesday & Thursday 1:30-3:30 pm; Classroom: TLS 171B (Bamford Room).
Instructor: Yaowu Yuan (yaowu.yuan at uconn.edu); Office: BioPharm 300A.
TA: Cera Fisher (cera.fisher at uconn.edu); Office: BioPharm 318.
Office hours: by appointment.
Course goal: This new course is designed to (1) help students develop the basic skills for practical computing in biology (with a particular focus on genomic analysis) and start solving real-world problems immediately; (2) empower students to continue training themselves in more advanced computing skills when needed in their future research; and (3) familiarize students with the process of experimental design, data collection, analysis, interpretation, and presentation (i.e., the process from project design to publication), using empirical examples.
Course content: The course has three major parts: (1) the Linux environment, the ~40 most useful commands, and shell scripts; (2) Perl programming for text manipulation (with a particular focus on empirical genomic data); and (3) R for simple statistics and graphics.
A Few Useful References:
Unix and Perl Primer for Biologists (Version 3.1.1). Keith Bradnam & Ian Korf (2012): http://korflab.ucdavis.edu/unix_and_Perl/.
Practical Computing for Biologists. Steven H. D. Haddock & Casey Dunn (2011).
Beginning Perl for Bioinformatics. James Tisdall (2001). Available from UConn library E-books.
R for Beginners. Emmanuel Paradis (2005).
R in Action: Data Analysis and Graphics with R. Robert I. Kabacoff (2011).
Week 1-1 (Aug. 26): Linux Basics I (pwd, cd, ls, mkdir, rmdir, mv, rm, touch, less, man, wc, cat, ssh, which, exit, logout)
Week 1-2 (Aug. 28): Linux Basics II (history, cp, scp, echo, chmod, grep, tr, the nano editor, shell scripts)
Week 2-1 (Sep. 02): Linux Basics III (head, tail, the pipe | , sed, cut, sort, uniq, awk, gzip, gunzip, zip, unzip, tar, curl, wget)
Week 2-2 (Sep. 04): Linux Basics IV (running batch jobs on a Linux cluster and installing new programs under a user’s account)
Week 3-1 (Sep. 09): Introduction to Perl (the first Perl program; common features of all programming languages)
Week 3-2 (Sep. 11): Scalar Variables and Basic String Manipulation
Week 4-1 (Sep. 16): File I/O, Arrays, Loops (I)
Week 4-2 (Sep. 18): File I/O, Arrays, Loops (II)
Week 5-1 (Sep. 23): Hashes
Week 5-2 (Sep. 25): Regular Expressions
Week 6-1 (Sep. 30): Introduction to Research Workflow
Week 6-2 (Oct. 02): Genome Statistics
Week 7-1 (Oct. 07): Subroutines, Modules, and Variable Scope
Week 7-2 (Oct. 09): Guest Lecture by Dr. Jill Wegrzyn: Introduction to Genomics and Common Bioinformatic Approaches
Week 8-1 (Oct. 14): Genome filtering, Interaction with the command line and other programs
Week 8-2 (Oct. 16): Directory operations, parsing MUMmer output
Week 9-1 (Oct. 21): Short read mapping
Week 9-2 (Oct. 23): SNP calling
Week 10-1 (Oct. 28): Genome comparisons with MUMmer
Week 10-2 (Oct. 30): Introduction to R
Week 11-1 (Nov. 04): SNP density and sliding windows
Week 11-2 (Nov. 06): R graphics I
Week 12-1 (Nov. 11): R graphics II
Week 12-2 (Nov. 13): Clustered SNP filtering and Running R Scripts in Perl
Week 13-1 (Nov. 18): R applications in phylogenetics
Week 13-2 (Nov. 20): Combining Linux, Perl, and R for Research I