About This Training
R for Biologists is a workshop created by the former National Center for Genome Analysis Support (NCGAS). It helps biologists get acquainted with R, which will, in turn, help them with their analysis. Supercomputing for Everyone Series (SC4E): Basic UNIX/LINUX Skills is recommended before taking this course.
The NCGAS was funded by the National Science Foundation under Grant Nos. DBI-1062432 (2011), ABI-1458641 (2015), and ABI-1759906 (2018) to Indiana University.
Length: 2 weeks
Certificate: Yes
Audience: IU students, faculty, staff, and non-IU collaborators with an IU guest account
Goal
The goal of the workshop is to help biologists get acquainted with R, which will, in turn, help them with their analysis. The workshop includes three sessions designed to span three weeks. There are no skill prerequisites for the workshop, but some familiarity with Unix is helpful. The workshop runs in RStudio, provided through a preconfigured virtual machine hosted on Jetstream. You can also do the activities on your home computer if you install R yourself.
Contact
Research Technologies at cesg@iu.edu
Additional Information
To enroll in the course:
- Click “Enroll”
- Log in to Expand if you have not already done so. (You will either use your IU credentials to log in, or take steps to create a guest login if you do not have IU credentials.)
- Click “Enroll in Course”
- Select “Go to Your Dashboard”
- Locate the proper course under “In Progress”, then select “Begin Course”
Course Modules
Chapter 1 - Using and Manipulating Basic R Data Types
- How to Get into R
- R is a Language
- Getting a Bit More Complicated: Vectors
- Vectors of Vectors: Matrices
- Data Frames
- What does the following mean?
- What About Really Complex Data Types
- Get Help
Chapter 2 - R Lab 1: DNA Words
- Install your Packages
- Reading sequence data into R
- QUIZ: Length of a DNA sequence
- QUIZ: Base composition of a DNA sequence
- QUIZ: GC Content of DNA
- QUIZ: DNA words
- Over-represented and under-represented DNA words
- QUIZ: Over-represented and under-represented DNA words
Chapter 3 - Graphing and Making Maps with Your Data
- Graphing Basics
- Mapping Libraries in R
- Mapping Points
- Mapping with Objects
- Using Real Data
- Using Google Satellite Maps
Chapter 4 - R Lab 2: Ordination in R
- Introduction to PCA
- Eigenvalues and Eigenvectors
- A Simple PCA
- QUIZ: A Simple PCA
- QUIZ: Compute the Principal Components
- Compute the Principal Components
- QUIZ: Plotting PCA
- QUIZ: Interpreting the Results
- Graphical Parameters with ggbiplot
- Adding a New Sample
- QUIZ: Project a New Sample onto the Original PCA
- Project a New Sample onto the Original PCA
- A Note on Functions
- Principal Coordinate Analysis
- QUIZ: Distance Calculations
- QUIZ: Computing the Components
- Graphing the PCoA
- QUIZ: Graphing the PCoA
- More on Vegan
- Wrap Up and Back to the Biology
Chapter 5 - Writing Custom Scripts
- Saving and Loading a Script in R
- Best Practices in R - What is a Function?
- Write a function in R
- Blast Summary Function
- Blast Summary Function - Loops
- Blast Summary Function - Some Graphing Functions
- Blast Summary Function - Write to a File
- Blast Summary Function - Final Cleanup
Chapter 6 - R Lab 3: Building a Sliding Window Analysis of GC Content
- Dengue Virus Genome Sequence
- QUIZ: Subsetting Vectors
- Pseudocode
- QUIZ: gcByRange Checkpoint I
- Fill in the code I
- QUIZ: gcByRange Checkpoint II
- Fill in the code II
- Add a Plot I
- QUIZ: gcByRange Checkpoint III
- Add a Plot II
- Just for fun: Overlapping windows
Appendix: R Command Script
- R Commands Script (Chapter 1 & 2)
- R Commands Script (Chapter 3 & 4)
- R Commands Script (Chapter 5 & 6)