Introduction to Data Science with R
Installing software
Before we go any further, you should have working installations of the following software:
Later in the course we will use git and GitHub for version control and collaborative work.
- Install git here
- Register for an account at GitHub
- Install GitHub CLI
You may also want to install an interface to git, such as GitHub Desktop.
How to organize your files - RStudio projects
- A big issue in doing science using computers is how to organize your files.
- We will start by using RStudio projects. A project will help you keep track of files in one place.
- What is a project?
  - A single report (source files, data, figures, etc.)
  - A book or collection of reports (source files, figures, data, etc.)
  - A website/blog/course notes (…)
Create a project
- Find the project menu in RStudio (upper right corner)
- Select New Project, New Directory, New Project
- Select a suitable name!
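Once a project exists, it can help to give it a few subfolders so that source files, data, and figures stay apart. The sketch below shows one way to do that from the R console. The folder names (`data`, `R`, `figures`) and the project name `example-project` are assumptions made for illustration, and the project is created in a temporary directory so that nothing on your computer is changed.

```r
# Hypothetical project root, placed in a temporary directory for this demo.
# In practice, RStudio creates the project folder for you.
project_dir <- file.path(tempdir(), "example-project")

# Hypothetical subfolders for data, scripts, and figures.
subfolders <- c("data", "R", "figures")

# Create each subfolder; recursive = TRUE also creates the project root.
for (folder in subfolders) {
  dir.create(file.path(project_dir, folder),
             recursive = TRUE, showWarnings = FALSE)
}

# List the folders that now exist directly inside the project.
created <- list.dirs(project_dir, full.names = FALSE, recursive = FALSE)
print(created)
```

Inside a real project you would drop `project_dir` and use paths relative to the project folder, for example `file.path("data", "my-file.csv")` (a hypothetical file name).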
Naming projects
- A project should contain everything you need for a certain task (like writing a book, a project report, etc.)
- The name of the project should reflect this. Files inside the project can have more general names (like figure-1.R, report.qmd).
- “The hardest thing in Data Science is naming things” (not the exact quote, but close enough)
Basic R in a script
Start up a new R script (File > New File > R Script).
An R script is a text file with a specific extension, `.R`.
An R script can be written as a program for R to evaluate.
We will use the R script to talk about basic R:
Objects and assignments
Data types and vectors
Data frames and lists
Functions and packages
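As a rough preview of these four topics, the script below gives one small example of each. All object names and values (`heights`, `participants`, `mean_height`, and so on) are made up for illustration.

```r
# Objects and assignments: `<-` stores a value under a name.
x <- 5
y <- x * 2

# Data types and vectors: c() combines values of the same type.
heights <- c(172, 180, 165)            # numeric vector
first_names <- c("Ada", "Bo", "Cy")    # character vector
is_tall <- heights > 170               # logical vector

# Data frames hold equal-length columns; lists can hold anything.
participants <- data.frame(name = first_names, height = heights)
mixed <- list(n = nrow(participants), tall = is_tall, label = "demo")

# Functions: define your own with function().
mean_height <- function(df) {
  mean(df$height)
}
avg <- mean_height(participants)

# Packages: library() loads an installed package.
# The stats package ships with R, so this call is safe to run.
library(stats)
med <- median(participants$height)

print(participants)
print(avg)
print(med)
```

Running the script line by line in RStudio and watching the Environment pane is a simple way to see what each assignment creates.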
Combine code and text in quarto files
Markdown
Markdown is the basic syntax for editing text in Quarto (`qmd`) files. The syntax lets you format the text in a plain text editor.
- Headings
- Links & Images
- Lists
- Footnotes
- Tables
- Equations
- Page Breaks
- Callout Blocks
Code chunks
- `echo`
- `warning`
- `message`
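The three options above control what a code chunk contributes to the rendered document: `echo` decides whether the code itself is shown, while `warning` and `message` decide whether warnings and messages produced by the code are shown. A minimal sketch of a chunk body follows. In a `qmd` file the chunk would be opened with `{r}` after the backticks, and the options are written as `#|` comment lines at the top of the chunk. The choice of the built-in `cars` data set is only an illustration.

```r
#| echo: false
#| warning: false
#| message: false

# With the options above, the rendered document shows only the printed
# summary: the code, warnings, and messages are hidden.
speed_summary <- summary(cars$speed)
print(speed_summary)
```

Because the `#|` lines are ordinary R comments, the same code also runs unchanged in a plain R script.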