Introduction

Why R?

R is a language built for statistics and data science. It was designed from the ground up for data analysis, visualization, and statistical computing. Here is what makes it stand out:

  • Built for data -- Vectors, data frames, and matrices are first-class citizens. Data manipulation is natural and expressive.
  • Unmatched statistics -- From t-tests to Bayesian inference, most statistical methods are either built in or available through packages.
  • Visualization -- R's plotting capabilities, especially ggplot2, produce publication-quality graphics with minimal code.
  • CRAN -- The Comprehensive R Archive Network hosts over 20,000 packages covering a wide range of data analysis domains.
  • Interactive analysis -- R excels at exploratory data analysis with its REPL and notebook-style workflows.
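
To make the first two points concrete, here is a minimal sketch using only base R. The data values and the variable names (heights, group, people) are made up for illustration; they are not taken from the course material.

```r
# Hypothetical sample data: heights in cm for two made-up groups.
heights <- c(170, 165, 180, 175, 160, 172)
group <- c("a", "b", "a", "b", "a", "b")

# Vectorized arithmetic: the whole vector is converted at once, no loop needed.
heights_m <- heights / 100

# A data frame keeps related columns together, one row per observation.
people <- data.frame(group = group, height_cm = heights)

# Logical indexing keeps only the rows that match a condition.
tall <- people[people$height_cm > 170, ]

# Group means with base R's tapply.
means <- tapply(people$height_cm, people$group, mean)

# Welch two-sample t-test from the built-in stats package.
result <- t.test(height_cm ~ group, data = people)

print(tall)
print(means)
print(result$p.value)
```

Nothing here needs an extra package: data frames, tapply, and t.test all ship with a standard R installation.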

The Story

R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in 1993. The name "R" is partly a play on the first letter of the creators' first names and partly a nod to S, the language that inspired it.

S was developed at Bell Labs by John Chambers and colleagues in the 1970s as a language for statistical computing. R reimplemented S as free, open-source software and went on to become far more widely used. The first stable release, R 1.0.0, came in February 2000.

Today, R is maintained by the R Core Team and has one of the most active open-source communities in data science.

Who Uses R

R is widely used in many fields:

  • Academia -- a dominant language for statistical research and publications.
  • Pharmaceutical industry -- used for clinical trial analysis and FDA submissions.
  • Finance -- risk modeling, time series analysis, and quantitative trading.
  • Tech companies -- Google, Facebook, Microsoft, and Twitter have all used R for data analysis.

The tidyverse ecosystem, led by Hadley Wickham, has made R accessible to a much broader audience with packages like dplyr, ggplot2, and tidyr.

What You Will Learn

This course contains 16 lessons organized into 7 chapters:

  1. Foundations -- Printing output, variables, types, and arithmetic.
  2. Vectors -- Creating vectors, vectorized operations, indexing, and filtering.
  3. Control Flow -- Conditionals and loops.
  4. Functions -- Defining functions, default arguments, closures, and higher-order functions.
  5. Data Structures -- Lists, matrices, and data frames.
  6. Data Manipulation -- Apply functions and data frame operations.
  7. Strings -- String manipulation with paste, gsub, and sprintf.
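
As a rough preview of several of these topics, the sketch below touches functions with default arguments, a higher-order function, a loop with a conditional, and string helpers. The function greet and all values are hypothetical, chosen only for illustration; the actual lessons may use different examples.

```r
# Function with a default argument (hypothetical helper, not from the course).
greet <- function(name, greeting = "Hello") {
  paste0(greeting, ", ", name, "!")
}

# Higher-order function: sapply calls greet once per element.
names_vec <- c("Ada", "Grace")
greetings <- sapply(names_vec, greet, USE.NAMES = FALSE)

# Control flow: a for loop with a conditional collects the even numbers.
evens <- c()
for (i in 1:6) {
  if (i %% 2 == 0) evens <- c(evens, i)
}

# String helpers: gsub replaces text, sprintf formats numbers.
slug <- gsub(" ", "_", "data frame basics")
label <- sprintf("%.1f%%", 12.345)

print(greetings)
print(evens)
print(slug)
print(label)
```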

Each lesson explains a concept, demonstrates it with code examples, and gives you an exercise to practice.

Let's get started.
