The goal of this page is to provide a roadmap for UCLA graduate students and postdoctoral researchers who want more background in Bioinformatics, so that they can find the resources suited to their current quantitative background. All areas of Computational Biology, including Bioinformatics, Statistical Genetics, and Systems Biology, are growing rapidly and are being transformed by technologies for collecting high-throughput genomic data. UCLA is committed to encouraging a broad group of graduate students to obtain computational training so that they can use these techniques in their research.
Any broad classification scheme is bound to be oversimplified and inexact. The goal here is to provide a starting point for finding the appropriate resources. These resources include both courses at UCLA and online resources.
The course offerings at UCLA can be roughly grouped into three categories based on the background each course requires. The three categories of background are:
1. “Interest” – For these courses, students just need to be interested in learning about computational biology and no background is assumed by the course. These courses are a great place to start to determine whether or not computational biology is for you.
2. “Basic” – These courses assume some basic knowledge in biology as well as some basic background in statistics, math and computation. Typically a student who completed an undergraduate biology major is qualified to complete (or enroll in) these courses, all of which focus on the application of computational methods to biological problems. For these courses, the ideal student will have some basic ability to program, in order to be able to convert data into appropriate formats, so as to apply different methods.
3. “Advanced” – These courses assume strong programming skills (at least 1 year of programming) and a solid knowledge of statistics. These courses focus on the development of computational methods. Students with backgrounds in computer science, mathematics or statistics have the necessary preparation for these courses.
There are plenty of courses available at UCLA for each of these levels of background, as well as courses and resources through which students can obtain the background needed to take them.
Which courses are appropriate for me?
The answer to this question depends on your goals. Possible goals are:
1. “I have no background in computational biology but want to see if it is useful for me” – For this goal, the best type of courses are the “Interest” courses. They give a good overview of the area without requiring the serious time commitment involved in obtaining the background necessary for the “Basic” and “Advanced” courses.
2. “I need to learn some computational biology for my research, but don’t know how to program” – For this goal, the best type of courses are the “Basic” courses. Another option, which requires a greater time commitment, is to take the necessary courses (or use other resources to learn to program) in order to take the “Advanced” courses.
3. “I want to have computational biology as a major aspect of my research and I know how to program” – For this goal, the best courses are the “Advanced” courses since they focus on methodology development.
4. “I want to learn to program, but don’t have time to take courses” – For this goal, see the options below in obtaining the background. There are online courses and programming languages like Python and Perl which require less of a time commitment to learn than the full programming sequence required for the “Advanced” courses.
How do I get the background for these courses?
Obtaining Background: Learning to Program
For individuals who have a biological sciences background, the biggest hurdle involved in computational biology courses is learning to program. Learning to program requires a significant time commitment, but also provides tremendous benefits.
Choosing a programming language
An “advanced” computational biology programming background will consist of knowledge of R (a statistical programming language), a scripting language such as Perl or Python, and a compiled language such as C, C++ or Java. A “basic” background will consist of knowledge of R and some knowledge of a scripting language. R is the easiest to learn and offers the greatest return for the time invested. R can be used to analyze many types of genomic data and is widely used in the community. Scripting languages are important because they serve as the glue between different methods when it comes to analyzing a dataset. They are useful for manipulating data, including extracting relevant information from the output of analysis methods and converting that output to the input format for other methods. They are more difficult to learn than R, but relatively simple as programming languages go. Compiled languages are the most difficult to learn and are usually only useful for researchers involved in developing computational methods.
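As an illustration of the “glue” role described above, here is a minimal Python sketch that extracts the relevant rows from one tool's output and rewrites them in the format a second tool expects. Everything specific in it is an assumption made for the example: the tab-delimited layout, the column names, the gene names and values, and the 0.05 threshold do not come from any particular analysis program.

```python
import csv
import io

# Hypothetical tab-delimited output from an analysis tool (made-up values).
TOOL_OUTPUT = """\
# gene\tp_value\tfold_change
geneA\t0.001\t2.5
geneB\t0.20\t1.1
geneC\t0.03\t0.4
"""


def extract_significant(tsv_text, threshold=0.05):
    """Return (gene, p_value) pairs whose p-value is below the threshold."""
    hits = []
    for line in tsv_text.splitlines():
        # Skip blank lines and the header/comment lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        gene, p_value, _fold_change = line.split("\t")
        if float(p_value) < threshold:
            hits.append((gene, float(p_value)))
    return hits


def to_csv(rows):
    """Format rows as comma-separated text, the input format assumed for the next tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["gene", "p_value"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(extract_significant(TOOL_OUTPUT)), end="")
```

In real use the text would be read from and written to files rather than held in a string, but the pattern of reading, filtering, and reformatting stays the same.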
Learning R
The easiest way to learn R is through online resources including tutorials and video lectures. A good starting point is the official R tutorial available at http://cran.r-project.org/doc/manuals/R-intro.pdf. There are many good books and online resources for learning R.
Learning a scripting language
There are many online resources for learning scripting languages. A good starting point for Python is (http://wiki.python.org/moin/BeginnersGuide/NonProgrammers). A good starting point for learning Perl is (http://perl-tutorial.org/). Another strategy is to buy a book on Python or Perl (several are for sale at the UCLA bookstore) and go through the book on your own.
Learning a compiled language
UCLA offers several programming courses that focus on compiled languages. These include Program in Computing (PIC) 10A, 10B, and 10C, which teach the C++ programming language, and PIC 20A and 20B, which teach Java (enrollment requires prior completion of PIC 10A). There are also several online courses that students can take. These include many options from iTunesU such as Introduction to Computer Science from Harvard University (http://itunes.apple.com/us/course/intro-to-computer-science/id529181544) and Programming Methodology from Stanford (http://itunes.apple.com/us/itunes-u/programming-methodology/id384232896).
Obtaining Background: Math and Statistics
There are several UCLA courses available to give students a background in statistics. For students with some quantitative background, a mathematical probability and statistics sequence such as Statistics 100A and 100B or Biostatistics 110A and 110B is strongly recommended. An online version of a comparable course is available from Harvard (http://itunes.apple.com/us/course/statistics-110-probability/id502492375). For students with less background, regular UCLA course options include Biostatistics 100A and 100B. An online version of a comparable course is available from HEC Paris (http://itunes.apple.com/us/course/statistics/id542118413). Even more basic courses are Statistics 10 (or 11-14), which provide a very basic introduction to statistics. A comparable online course is available from HACC (http://itunes.apple.com/us/course/introduction-to-statistics/id495049542). In terms of math background, the Math 30 series is recommended, in particular Math 33A Linear Algebra. A linear algebra course is available online from MIT (http://itunes.apple.com/us/itunes-u/linear-algebra/id354869137).
Obtaining Background: Advanced Computational Skills
Several advanced computational courses are recommended for students with an interest in computational biology. These include Computer Science 180 or Math 182 Algorithms.
Other resources at UCLA for obtaining computational biology training include:
1. The UCLA Bioinformatics Ph.D. Program (http://www.bioinformatics.ucla.edu/first-year-curriculum) is an integrated doctoral training program for students interested in working at the interface of computer science, biology, and mathematics to address the fundamental challenges of contemporary genomic-scale research. The interdisciplinary Ph.D. program consists of an integrated one-year core curriculum, research rotations, over 50 elective courses, and faculty mentors spanning biology, mathematics, engineering, and medicine.
2. The UCLA Bioinformatics Undergraduate Minor (http://www.bioinformatics.ucla.edu/minor) provides training in computational biology and can be incorporated into the degree program of any undergraduate student at UCLA.
3. The UCLA Genome Analysis Training Program (http://www.genetics.ucla.edu/GATG/) is funded by an NIH grant and supports UCLA pre-doctoral students whose goal is to conduct research in genomics. The program is designed to ensure that students obtain an adequate biological, computational and statistical foundation to succeed in this important new interdisciplinary field. Each year, the Genomic Analysis Training Program provides its trainees with stipends and funding for academic fees. The grant also provides support for travel and research expenses.
4. The UCLA Systems and Integrative Biology Training Program (http://dragon.nuc.ucla.edu/sibtp/) supports pre-doctoral students who seek balanced and rigorous training in mathematics and biology/medicine, with a special emphasis on research in mathematical modeling in biology or medicine. The early training includes formal course work, research rotations with participating faculty, and attendance and participation in a seminar series.
UCLA Courses:
“Interest” courses at UCLA:
Computational and Systems Biology 184 – Introduction to Computational and Systems Biology
Human Genetics C144/C244 – Genomic Technology
Human Genetics 210 – Topics in Genomics
“Basic” courses at UCLA:
Human Genetics M207B – Applied Genetics Modeling
Human Genetics 236A – Advanced Human Genetics A: Molecular Aspects
Human Genetics 236B – Advanced Human Genetics B: Statistical Aspects
Human Genetics CM122/CM222 – Mouse Molecular Genetics
Ecology and Evolution 135 – Population Genetics
Molecular Cellular and Developmental Biology 172 – Genomics and Bioinformatics
Physiological Sciences 125 – Molecular Systems Biology
Chemistry 100 – Genomics & Computational Biology
Biomath 204 – Biomedical Data Analysis
“Advanced” courses at UCLA:
Human Genetics M207A – Theoretical Genetic Modeling
Human Genetics CM124/CM224 – Computational Genetics
Computer Science 121/221 – Introduction to Bioinformatics
Computer Science 122/222 – Algorithms in Bioinformatics and Systems Biology
Computational and Systems Biology 186/286 – Computational Systems Biology: Modeling and Simulation of Biological Systems
Statistics M254 – Statistical Methods in Computational Biology
Biomath 211 – Mathematical and Statistical Phylogenetics