Introduction

Why Genomics?

Nearly every cell in an organism carries a complete copy of its genome, roughly 3 billion DNA base pairs in humans. That sequence is often called the source code of life: it encodes every protein, helps regulate when and where each gene turns on, and influences susceptibility to disease.

For most of history, biologists had to study genes one at a time. Today, sequencing machines can read an entire genome in hours. The bottleneck is no longer generating the data — it is understanding it.

Most of the genome does not code for proteins. The vast regulatory landscape (promoters, enhancers, silencers, splice signals) controls which genes are active in which cell. Single-letter changes in these regions can contribute to cancer, developmental disorders, or heart disease. But the functional rules encoded in DNA are far too complex to write by hand.

AlphaGenome, released by Google DeepMind in 2025, is an AI model that reads up to one million base pairs of DNA and simultaneously predicts thousands of molecular properties: gene expression levels across tissues, RNA splice patterns, chromatin accessibility, transcription factor binding, histone marks, and more. Its authors report that it matches or exceeds the best existing models on 25 of 26 variant effect benchmarks.

How This Course Works

This course teaches you the molecular biology and bioinformatics concepts behind AlphaGenome by implementing them yourself in Python. No biology background required — just Python.

The course moves through four stages, each building on the previous one:

First you will understand what DNA is and how it encodes information.
Then you will learn how genes are read: transcription, translation, and splicing.
Then you will explore gene regulation: how the genome controls which genes turn on.
Finally you will see how all of these concepts map directly onto AlphaGenome's architecture.

By the final lesson, you will have implemented — in miniature — the core operation that AlphaGenome performs: variant effect prediction.
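
As a rough preview of what "variant effect prediction" means, the sketch below scores a reference sequence and a mutated copy with the same function and reports the difference. The scorer here is a hypothetical stand-in (it just counts a TATA motif), not AlphaGenome or any part of its API, and the names toy_score, apply_snv, and variant_effect are made up for illustration.

```python
def toy_score(seq: str) -> int:
    """Hypothetical stand-in for a model: count (overlapping) TATA motifs."""
    motif = "TATA"
    return sum(
        1
        for i in range(len(seq) - len(motif) + 1)
        if seq[i:i + len(motif)] == motif
    )


def apply_snv(seq: str, pos: int, alt: str) -> str:
    """Return a copy of seq with the base at 0-based position pos set to alt."""
    if not 0 <= pos < len(seq):
        raise IndexError("position outside sequence")
    return seq[:pos] + alt + seq[pos + 1:]


def variant_effect(seq: str, pos: int, alt: str) -> int:
    """Effect of a single-letter change: score(alternate) - score(reference)."""
    return toy_score(apply_snv(seq, pos, alt)) - toy_score(seq)


reference = "GGCTATAAGGC"
# Changing position 4 from A to G turns TATA into TGTA and removes the motif.
print(variant_effect(reference, 4, "G"))  # -1
```

A real model replaces toy_score with a learned function that outputs many predictions at once, but the compare-reference-to-alternate pattern is the same idea the final lesson builds toward.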

What You Will Learn

This course contains 15 lessons organized into 4 chapters:

The DNA Alphabet -- Sequences, base complement, GC content, and codons.
Reading the Genome -- Open reading frames, transcription, translation, and splice sites.
Gene Regulation -- Motif finding, CpG islands, and single nucleotide variants.
Genomics Meets AI -- One-hot encoding, k-mer features, regulatory scoring, and variant effect prediction.
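
To give a feel for the kind of code the lessons involve, here is a minimal sketch of three of the ideas listed above: GC content, the complement (shown here as the reverse complement), and one-hot encoding. The function names and the A, C, G, T column order are assumptions chosen for this illustration, not necessarily what the lessons or AlphaGenome use.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Complement each base and reverse the order, giving the opposite strand."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def one_hot(seq: str) -> list[list[int]]:
    """Encode each base as a 4-element vector in A, C, G, T order (assumed)."""
    order = "ACGT"
    return [[int(base == letter) for letter in order] for base in seq]


dna = "ATGC"
print(gc_content(dna))          # 0.5
print(reverse_complement(dna))  # GCAT
print(one_hot("AC"))            # [[1, 0, 0, 0], [0, 1, 0, 0]]
```

This sketch assumes uppercase input containing only A, C, G, and T; real sequence data also contains ambiguous bases such as N, which would need explicit handling.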

Let's get started.