Introduction
Why R?
R is one of the most widely used languages for statistics and data science. It was designed for data analysis, visualization, and statistical computing. Here is what makes it stand out:
- Built for data -- Vectors, data frames, and matrices are first-class citizens. Data manipulation is natural and expressive.
- Unmatched statistics -- From t-tests to Bayesian inference, most statistical methods are either built in or available through packages.
- Visualization -- R's plotting capabilities, especially ggplot2, produce publication-quality graphics with minimal code.
- CRAN -- The Comprehensive R Archive Network hosts over 20,000 packages covering a wide range of data analysis domains.
- Interactive analysis -- R excels at exploratory data analysis with its REPL and notebook-style workflows.
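To make the "built for data" point concrete, here is a minimal sketch using only base R. The variable names and values (heights_cm, people, and so on) are made up for illustration and are not part of the course material.

```r
# Vectors are built in: arithmetic applies element-wise, no loop needed.
heights_cm <- c(150, 160, 170, 180)   # made-up values for illustration
heights_m <- heights_cm / 100

# A data frame holds named columns of equal length.
people <- data.frame(
  name = c("Ana", "Ben", "Cy", "Dee"),  # hypothetical names
  height_cm = heights_cm
)

# Filter rows with a logical condition, then summarize a column.
tall <- people[people$height_cm > 155, ]
mean_tall <- mean(tall$height_cm)

print(heights_m)
print(tall)
print(mean_tall)
```

Typing these lines one at a time into the R console is a reasonable way to try the interactive style mentioned above: each expression is evaluated as soon as it is entered.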
The Story
R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in 1993. The name "R" is a play on the creators' first names and a nod to S, the language that inspired it.
S was developed at Bell Labs by John Chambers and colleagues in the 1970s as a language for statistical computing. R began as a free, open-source implementation of the S language and eventually surpassed it in popularity. The first stable release, R 1.0.0, came in February 2000.
Today, R is maintained by the R Core Team and has one of the most active open-source communities in data science.
Who Uses R
R is widely used in many fields:
- Academia -- the dominant language for statistical research and publications.
- Pharmaceutical industry -- used for clinical trial analysis and FDA submissions.
- Finance -- risk modeling, time series analysis, and quantitative trading.
- Tech companies -- Google, Facebook, Microsoft, and Twitter have used R for data analysis.
The tidyverse ecosystem, led by Hadley Wickham, has made R accessible to a much broader audience with packages like dplyr, ggplot2, and tidyr.
What You Will Learn
This course contains 16 lessons organized into 7 chapters:
- Foundations -- Printing output, variables, types, and arithmetic.
- Vectors -- Creating vectors, vectorized operations, indexing, and filtering.
- Control Flow -- Conditionals and loops.
- Functions -- Defining functions, default arguments, closures, and higher-order functions.
- Data Structures -- Lists, matrices, and data frames.
- Data Manipulation -- Apply functions and data frame operations.
- Strings -- String manipulation with paste, gsub, and sprintf.
Each lesson explains a concept, demonstrates it with code examples, and gives you an exercise to practice.
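As a rough preview of the kind of code those lessons work with, here is a short sketch that touches several chapters at once. It uses only base R, and the function name square_plus and the sample values are hypothetical, chosen for illustration rather than taken from the lessons.

```r
# A function with a default argument (Functions chapter).
# square_plus is a hypothetical name used only for this preview.
square_plus <- function(x, offset = 1) {
  x^2 + offset
}

# A vector and a vectorized call (Vectors chapter).
values <- c(1, 2, 3)
results <- square_plus(values)   # offset falls back to its default of 1

# A conditional inside a loop (Control Flow chapter).
for (r in results) {
  if (r > 4) {
    label <- "large"
  } else {
    label <- "small"
  }
  # sprintf builds a formatted string (Strings chapter).
  print(sprintf("%d is %s", as.integer(r), label))
}

# An apply-family function (Data Manipulation chapter).
doubled <- sapply(values, function(v) v * 2)
print(doubled)
```

None of this needs to make full sense yet; each piece is introduced step by step in its own chapter.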
Let's get started.