R Programming Notes
Welcome to your comprehensive guide to R. This modern, structured handbook is designed to take you from writing your first line of code to executing advanced data engineering and statistical workflows.
Introduction to the R Ecosystem
R is more than just a programming language; it is a highly specialized environment designed for data analysis, statistical computing, and stunning graphics.
Why Use R?
- Built for Data: Unlike general-purpose languages, R ships with core data structures (vectors, factors, data frames) that are tailor-made for statistical modeling.
- The Tidyverse Ecosystem: A powerful, cohesive collection of packages designed explicitly for data science.
- Academic & Production Standard: Widely used in academia and industry for reproducible research, statistical benchmarking, and reporting.
Core Ecosystem Pillars
- tidyverse: The definitive toolkit for data manipulation and visualization (dplyr, ggplot2, tidyr, readr).
- data.table: High-performance, memory-efficient data manipulation for massive datasets.
- shiny: The go-to framework for building interactive web applications directly in R.
- tidymodels / caret: Modern, unified frameworks for machine learning pipelines.
Installation & Environment Setup
- Install R: Download the base engine from the Comprehensive R Archive Network (CRAN).
- Install RStudio: Download the industry-standard IDE from Posit.
Essential Console Commands
Get familiar with your workspace using these quick foundational commands:
# Check your active R version
version
# Find out where R is looking for files (Working Directory)
getwd()
# Change your working directory to a specific path
setwd("~/my_project")
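As a minimal sketch of how these commands fit together, the snippet below switches into a temporary folder and then restores the original working directory. It relies on setwd() invisibly returning the directory you just left. The tempdir() target and the file name "results.csv" are assumptions chosen only so the example runs on any machine.

```r
# Remember where we started
start_dir <- getwd()

# setwd() invisibly returns the directory we just left
previous_dir <- setwd(tempdir())

# Build paths with file.path() instead of pasting separators by hand.
# "results.csv" is a hypothetical file name; nothing is written to disk.
output_path <- file.path(getwd(), "results.csv")

# Return to the original working directory
setwd(previous_dir)
```

Restoring the directory at the end keeps a script from silently changing where later code reads and writes files.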
Managing Packages
Packages expand R's capabilities. You only need to install a package once, but you must load it every time you start a new R session.
# Download from CRAN
install.packages("tidyverse")
# Load into your current session
library(tidyverse)
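Because installation is a one-time step, scripts often guard it so the download only happens when the package is missing. The helper below is one possible sketch: ensure_package is a hypothetical name, not part of R, and it is demonstrated with "stats" (which ships with R) so that nothing is downloaded.

```r
# Hypothetical helper: install a package only if it is not already available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" is bundled with R, so this call never triggers a download
is_available <- ensure_package("stats")

# Loading is still a separate, per-session step
library(stats)
```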
Variables & Basic Data Types
In R, we use the arrow operator (<-) for assignment. While = also works at the top level, <- is the idiomatic standard and keeps assignment visually distinct from named function arguments, which use =.
# Assigning values
age <- 42
user_name <- "Alice"
is_active <- TRUE
# Checking data types
class(age) # Returns: "numeric" (stored as a double by default)
class(user_name) # Returns: "character"
class(is_active) # Returns: "logical"
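To make the "stored as a double by default" remark concrete, here is a short sketch comparing class() with typeof() and showing explicit conversions. The variable visits is an assumption added for illustration.

```r
age <- 42        # no suffix: stored as a double
visits <- 42L    # L suffix: stored as an integer

class(age)       # "numeric"
typeof(age)      # "double"
class(visits)    # "integer"
typeof(visits)   # "integer"

# Both count as numeric for arithmetic purposes
is.numeric(age)      # TRUE
is.numeric(visits)   # TRUE

# Explicit conversion between types
age_text <- as.character(age)   # "42"
pi_ish <- as.numeric("3.14")    # 3.14
flag <- as.integer(TRUE)        # 1
```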
The 5 Core Data Types
- Numeric: Decimals/doubles (42.5) and explicit integers (42L, which report their class as "integer").
- Character: Text strings ("Hello R").
- Logical: Boolean flags (TRUE or FALSE).
- Factor: Categorical data with pre-defined levels (e.g., factor(c("Low", "Medium", "High"))).
- Dates/POSIXct: Calendar dates (Date) and date-times with time zones (POSIXct).
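Factors and dates behave differently from plain strings, which a small example can illustrate. This is a sketch under assumed values: the priority labels, the dates, and the UTC time zone are made up for demonstration.

```r
# Factor with an explicit level order (otherwise levels sort alphabetically)
priority <- factor(
  c("Low", "High", "Medium", "Low"),
  levels = c("Low", "Medium", "High")
)
levels(priority)        # "Low" "Medium" "High"
as.integer(priority)    # 1 3 2 1 (position of each value in the levels)

# Date: arithmetic works in whole days
start_date <- as.Date("2024-01-15")
due_date <- start_date + 30
format(due_date)        # "2024-02-14"

# POSIXct: a date-time pinned to a time zone
meeting <- as.POSIXct("2024-01-15 09:30:00", tz = "UTC")
class(meeting)          # "POSIXct" "POSIXt"
format(meeting, "%H:%M")  # "09:30"
```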
Operators & Expressions
# 1. Arithmetic Operators
5 + 3 # Addition
10 / 2 # Division
3 ^ 2 # Exponentiation (3 squared)
11 %% 3 # Modulo (Remainder of division = 2)
# 2. Comparison Operators
5 > 3 # TRUE
age == 42 # TRUE
# 3. Logical Operators
TRUE & FALSE # Element-wise AND -> FALSE
TRUE | FALSE # Element-wise OR -> TRUE
!TRUE # NOT -> FALSE
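The operators above are described as element-wise, which matters most when they are applied to whole vectors. The sketch below uses an assumed scores vector to show that behavior, along with the scalar && operator, integer division, and a safer way to compare decimals.

```r
scores <- c(55, 72, 91)

# Comparisons and & / | work on every element at once
passed <- scores >= 60                     # FALSE TRUE TRUE
mid_band <- scores >= 60 & scores < 90     # FALSE TRUE FALSE
n_passed <- sum(passed)                    # 2 (TRUE counts as 1)

# && expects single values and short-circuits; typical inside if ()
first_passed <- length(scores) > 0 && scores[1] >= 60   # FALSE

# Integer division complements the modulo operator
quotient <- 11 %/% 3                       # 3

# Floating-point comparison: == can surprise, all.equal() tolerates rounding
exact_match <- (0.1 + 0.2 == 0.3)                  # FALSE
near_match <- isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE
```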
Control Flow
Control flow lets your code make decisions based on data conditions.
Conditional Logic (If-Else)
score <- 85
if (score >= 90) {
  print("Grade: A")
} else if (score >= 75) {
  print("Grade: B")
} else {
  print("Grade: C")
}
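An if statement tests a single value, so grading a whole vector needs either a function applied to each element or a vectorized alternative. The following is one way to sketch both; grade_for and the sample scores are hypothetical names and values introduced for illustration.

```r
# Hypothetical helper wrapping the same thresholds in a reusable function
grade_for <- function(score) {
  if (score >= 90) {
    "A"
  } else if (score >= 75) {
    "B"
  } else {
    "C"
  }
}

scores <- c(95, 85, 60)

# Apply the scalar function to each element
grades <- vapply(scores, grade_for, character(1))   # "A" "B" "C"

# Vectorized alternative using nested ifelse()
grades_vec <- ifelse(scores >= 90, "A",
                     ifelse(scores >= 75, "B", "C"))
```

The vapply() form reuses the readable if-else logic, while ifelse() avoids the per-element function call on larger vectors.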
Pattern Matching (Switch)
Use switch as a cleaner alternative to multi-layered if-else blocks when comparing one value against a fixed set of strings. A final unnamed argument acts as the default.
day <- "Monday"
# Named "greeting" rather than "message" to avoid masking base R's message() function
greeting <- switch(day,
  "Monday" = "Back to the grind!",
  "Friday" = "Weekend is almost here!",
  "Mid-week blues..." # Default fallback value
)
print(greeting)
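Two further switch behaviors are worth a quick sketch: several labels can share one result by leaving earlier ones empty (fall-through), and a switch with no default returns NULL when nothing matches. The day_type function is a hypothetical example, and it treats every unmatched string as a weekday.

```r
# Hypothetical helper: empty arguments fall through to the next value
day_type <- function(day) {
  switch(day,
    "Saturday" = ,
    "Sunday"   = "weekend",
    "weekday"  # Default: anything else, including unexpected input
  )
}

sat <- day_type("Saturday")   # "weekend"
tue <- day_type("Tuesday")    # "weekday"

# Without a default, an unmatched value yields NULL
no_match <- switch("Tuesday", "Monday" = "start of week")
is.null(no_match)             # TRUE
```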