BIOT E-100 Syllabus Fall 2015 v1.1
Fall 2015
Course Staff
Virtual office hours (internet enabled) will be scheduled after the first class meeting.
Class Meetings
Location
Prerequisites
Description
This course will explore how computer science and mathematics, supported by
information technology, have combined with modern laboratory technologies to solve
previously intractable problems in the life sciences. In the past several decades,
bioinformatics has expanded from matching sequence data to a
wide range of techniques for understanding the mechanisms of life's molecular
machinery. Since this course is an "Introduction to …", we will survey the breadth and
power of these techniques, pausing occasionally to peer "under the hood" to appreciate
the mathematical and computational approaches that enable them.
Areas of bioinformatics that will be discussed include:
• sequence alignment
• DNA sequencing and assembly
• probability and the significance of results
• gene prediction
• multiple sequence alignment
• phylogenetics
• functional genomics
• sequence, gene, and protein databases
• web-based bioinformatics tools
• impact on society and ethical considerations
Though the computational methods that address these areas will be discussed,
students will not be asked to develop or program bioinformatics algorithms. Students
will solve bioinformatics problems with written exercises, web-based queries, and small
Python programs which automate the use of web-based bioinformatics tools and
databases and display their output. Students will learn basic usage of the Python
programming language and the BioPython program library. Schedule permitting, useful
functions in the statistical language R will be shown. Basic concepts of probability will
be introduced to help understand the significance of results. Course readings will
include selections from the textbook and scientific papers which will be distributed
during the semester.
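As a rough illustration of the scale of "small Python programs" meant here, the sketch below parses FASTA-formatted text and reports length, GC content, and reverse complement for each record. It is not taken from the course materials: it uses only plain Python rather than BioPython, and the function names and the demo sequences are hypothetical, chosen purely for illustration.

```python
# Illustrative sketch only; names and sequences are made up for this example.

def parse_fasta(text):
    """Parse FASTA-formatted text into a dict of {record_id: sequence}."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first word after the '>' marker.
            current_id = line[1:].split()[0]
            records[current_id] = ""
        elif current_id is not None:
            # Sequence data may span several lines; join them.
            records[current_id] += line.upper()
    return records


def gc_content(seq):
    """Return the fraction of bases in seq that are G or C."""
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical demo data, not real biological sequences.
EXAMPLE_FASTA = """>demo_seq1 hypothetical example
ATGCGC
GGAT
>demo_seq2
TTAA
"""

if __name__ == "__main__":
    for record_id, seq in parse_fasta(EXAMPLE_FASTA).items():
        print(record_id, len(seq), f"{gc_content(seq):.2f}", reverse_complement(seq))
```

In practice, a library such as BioPython would normally handle the parsing; the hand-written parser here only keeps the example self-contained.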
Expectations
Students are not expected to know the Python language prior to this course. They
should have some experience with a modern computer language (see Prerequisites).
The Python language is very well suited to bioinformatics: it has great expressive power,
yet a very gentle learning curve. See Jump Start below.
There will be five or six assignments (generally due every two weeks), two exams, and a final
project. Graduate students will also do a Journal Club presentation. See Credit Level
and Grading below.
Section meetings are scheduled on the same day as lectures to allow students who
may have to travel a great distance to attend. We recognize that this can make for a
very long evening, so we will take a break between section and lecture for students to
grab a snack, coffee, or whatever.
Calendar
(NB: This may be modified as needed, particularly once the balance of undergrads to
grads is known and we schedule the Journal Club presentations.)
(Wednesdays)
Sep 2 Lecture 01 Intro & Foundations
Sep 9 Lecture 02 Sequence Alignment
Sep 16 Lecture 03 DNA sequencing and assembly
Sep 23 Lecture 04 FASTA and BLAST
Sep 30 Lecture 05 Multiple Alignments
Oct 7 Exam 1
Oct 14 Lecture 06 Phylogenetics
Oct 21 Lecture 07 Genome Analysis
Oct 28 Lecture 08 Protein Structures & HMMs
Nov 4 Lecture 09 HMMs & Structure cont'd
Nov 11 Lecture 10 Structure & Function
Nov 18 Exam 2
Nov 25 No Class, Thanksgiving Break
Dec 2 Lecture 12 Gene Expression
Dec 9 Lecture 13 Pathways & Systems / Final Projects
Dec 16 Final Projects
Credit Level
This course is offered for both undergraduate and graduate credit. Graduate students
will participate in a "journal club", each reading and presenting a relevant scientific
paper to the class sometime during the semester. Undergraduates are not required to
present a paper, but may do so for extra credit. Assignments may include items for
graduates that are optional for undergrads. Graduate students are expected to show
greater breadth and depth of understanding and analysis in their final project.
Grading
Undergraduate
30% Assignments
20% Exam 1
20% Exam 2
10% Class participation (in lecture, section, and online forum)
20% Final project
Graduate
30% Assignments
20% Exam 1
20% Exam 2
10% Journal club
20% Final project
Accessibility
Academic Integrity
You are responsible for understanding Harvard Extension School policies on academic
integrity (www.extension.harvard.edu/resources-policies/student-conduct/academic-integrity)
and how to use sources responsibly. Not knowing the rules, misunderstanding
the rules, running out of time, submitting "the wrong draft", or being overwhelmed with
multiple demands are not acceptable excuses. There are no excuses for failure to
uphold academic integrity. To support your learning about academic citation rules,
please visit the Harvard Extension School Tips to Avoid Plagiarism
(www.extension.harvard.edu/resources-policies/resources/tips-avoid-plagiarism), where
you'll find links to the Harvard Guide to Using Sources and two free, online, 15-minute
tutorials to test your knowledge of academic citation policy. The tutorials are anonymous
open-learning tools.
Textbooks
I have chosen one required book and several recommended books for this course.
Depending on your individual need, you may find one or more of the recommended
books particularly helpful. If you're unsure, feel free to discuss books with me before
buying.
A note about buying books: If you are buying books from the Harvard Coop, first get a
Coop membership (costs only $1 a year), and you'll receive an immediate 10% discount
on textbooks and everything else you purchase there (except for the cafe).
Required:
"Understanding Bioinformatics"
Author: Marketa Zvelebil, Jeremy Baum
ISBN: 9780815340249
Publisher: Garland Science / Taylor & Francis Group
This book is very comprehensive and exceptionally well produced. The course lectures
will generally follow the plan of this text. Note that the book is over 700 pages and
covers enough material to fill a two-semester course, so we will select only a portion of
the chapters as assigned readings.
Recommended:
Unless you are an experienced programmer, you should review both MYBDwP and
PCfB, then pick the one that feels best to you.
BDS is designed for a more advanced user, already comfortable with software
development tools, including a scripting language, the Unix command line, regular
expressions, etc., who wants to develop and apply skills particularly suited to
bioinformatics. It was just published in July, and we won’t be using it much, if at all, in
the course. I include it here for those of you who want to push further toward becoming
professional bioinformaticians.
Jump Start
Chapter 1 of our textbook, "The Nucleic Acid World", is available as a free PDF
download at
http://www.garlandscience.com/res/pdf/9780815340249_ch01.pdf
I strongly suggest you review it to make sure you're reasonably familiar with the subject
matter. I'll be glad to discuss it a bit the first night in case there are questions, but this is
really part of the prerequisites for the course.
If you want to get a little practice with Python programming, you can try these online
interactive lessons:
Codecademy Python track
http://www.codecademy.com/tracks/python