Lesson 4 of 15
Codons
The Genetic Code
DNA is read in groups of three bases called codons. Each codon specifies one amino acid (or a start/stop signal). This is the genetic code.
There are 4³ = 64 possible codons but only 20 amino acids, so the code is redundant — multiple codons can encode the same amino acid.
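The count of 64 can be checked by enumerating every ordered triple of the four DNA bases. The sketch below does that with the standard library's `itertools.product`. The names `BASES`, `all_codons`, and `leucine_codons` are illustrative and are not part of the lesson's task. Leucine is used as one example of redundancy in the standard genetic code.

```python
from itertools import product

BASES = "ACGT"

# Every ordered triple of bases is a codon: 4 * 4 * 4 = 64.
all_codons = ["".join(triple) for triple in product(BASES, repeat=3)]
print(len(all_codons))  # 64

# Redundancy, illustrated with one amino acid: in the standard genetic
# code, these six codons all encode Leucine (L).
leucine_codons = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]
print(all(c in all_codons for c in leucine_codons))  # True
```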
Two kinds of codon are special:
- ATG: the start codon. It marks where translation of the protein begins and codes for Methionine (M).
- TAA, TAG, TGA: the stop codons. They signal the end of the protein.
def split_into_codons(seq):
    # Step through the sequence three bases at a time. A trailing
    # partial codon (1 or 2 leftover bases) is dropped.
    return [seq[i:i+3] for i in range(0, len(seq) - 2, 3)]

def is_start_codon(codon):
    return codon == "ATG"

def is_stop_codon(codon):
    return codon in ("TAA", "TAG", "TGA")

seq = "ATGCGATAA"
codons = split_into_codons(seq)
print(codons)                      # ['ATG', 'CGA', 'TAA']
print(is_start_codon(codons[0]))   # True
print(is_stop_codon(codons[-1]))   # True
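Start and stop codons can be combined to pull a coding region out of a longer sequence. The following is a minimal sketch, not part of the lesson's task: `find_coding_region` and `STOP_CODONS` are hypothetical names, and the approach assumes the first ATG in the sequence is the intended start, which is a simplification.

```python
STOP_CODONS = ("TAA", "TAG", "TGA")

def find_coding_region(seq):
    """Return the codons from the first ATG up to and including the
    first in-frame stop codon.

    Returns [] if there is no ATG. If no stop codon is reached, returns
    every complete codon from the ATG to the end of the sequence.
    """
    start = seq.find("ATG")
    if start == -1:
        return []
    codons = []
    # Read in frame with the ATG, three bases at a time.
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i+3]
        codons.append(codon)
        if codon in STOP_CODONS:
            break
    return codons

print(find_coding_region("CCATGCGATAAGG"))  # ['ATG', 'CGA', 'TAA']
```

The two leading bases (CC) and the two trailing bases (GG) fall outside the region between the start and stop codons, so they are left out of the result.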
AlphaGenome predicts transcription start sites — exactly where in the genome a gene begins to be read. Understanding codons is the first step to understanding those predictions.
Your Task
Implement split_into_codons(seq), is_start_codon(codon), and is_stop_codon(codon).