Lesson 1 of 15
Tokenizing Numbers and Operators
What is a Token?
Before we can interpret code, we need to break raw text into meaningful chunks called tokens. This process is called lexical analysis (or "lexing").
A token is a small object describing a piece of syntax:
{ type: "NUMBER", value: "42" }
{ type: "PLUS", value: "+" }
Token Types for Arithmetic
For simple arithmetic like 3 + 42, we need:
| Type | Matches |
|---|---|
| NUMBER | 0, 7, 42, 100 |
| PLUS | + |
| MINUS | - |
| EOF | end of input |
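To make the table concrete, here is a minimal sketch of the tokens we would expect for the input 3 + 42. The names TokenType, makeToken, and expectedTokens are illustrative assumptions, not part of the lesson's required API.

```javascript
// Hypothetical constant collecting the token type names from the table.
const TokenType = Object.freeze({
  NUMBER: "NUMBER",
  PLUS: "PLUS",
  MINUS: "MINUS",
  EOF: "EOF",
});

// Hypothetical helper that builds a token in the { type, value } shape.
function makeToken(type, value) {
  return { type, value };
}

// Hand-written token list for the input "3 + 42".
// Whitespace is skipped, so it produces no token of its own.
const expectedTokens = [
  makeToken(TokenType.NUMBER, "3"),
  makeToken(TokenType.PLUS, "+"),
  makeToken(TokenType.NUMBER, "42"),
  makeToken(TokenType.EOF, ""),
];

console.log(expectedTokens.length); // 4
```

Note that token values stay as strings here, matching the { type: "NUMBER", value: "42" } shape shown earlier.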
A Simple Tokenizer
We can write a function that scans one character at a time:
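function tokenize(input) {
  const tokens = [];
  let i = 0;
  while (i < input.length) {
    if (input[i] >= "0" && input[i] <= "9") {
      // Collect consecutive digits into a single NUMBER token.
      let num = "";
      while (i < input.length && input[i] >= "0" && input[i] <= "9") {
        num += input[i];
        i++;
      }
      tokens.push({ type: "NUMBER", value: num });
    } else if (input[i] === "+") {
      tokens.push({ type: "PLUS", value: "+" });
      i++;
    } else {
      // ... handle other characters here.
      // Until they are handled, fail loudly rather than loop forever,
      // because i would otherwise never advance past this character.
      throw new Error(`Unexpected character "${input[i]}" at position ${i}`);
    }
  }
  // Mark the end of input.
  tokens.push({ type: "EOF", value: "" });
  return tokens;
}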
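The inner digit loop is the part that turns several characters into one token. As a rough sketch, it could be pulled out into a small helper. The names isDigit and readNumber are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper: true when ch is a single ASCII digit character.
function isDigit(ch) {
  return ch >= "0" && ch <= "9";
}

// Hypothetical helper: reads consecutive digits beginning at index start.
// Returns the digit text and the index of the first character after it.
function readNumber(input, start) {
  let end = start;
  while (end < input.length && isDigit(input[end])) {
    end++;
  }
  return { text: input.slice(start, end), next: end };
}

console.log(readNumber("3+42", 2)); // { text: '42', next: 4 }
console.log(readNumber("100-7", 0)); // { text: '100', next: 3 }
```

Returning the next index alongside the text lets the caller resume scanning right after the number, which is the same job the shared counter i does in the loop-based version.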
Your Task
Write a tokenize(input) function that handles:
- Multi-digit numbers (NUMBER)
- + (PLUS) and - (MINUS)
- Whitespace (skip it)
- Appends an EOF token at the end
Return an array of { type, value } objects. Format each token as "TYPE:VALUE" and print them separated by spaces.
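The output step might look like the following sketch. The formatTokens helper and the hand-written sampleTokens array are assumptions for illustration; in your solution the array would come from tokenize(input).

```javascript
// Hypothetical helper: renders each token as "TYPE:VALUE", joined by spaces.
function formatTokens(tokens) {
  return tokens.map((token) => `${token.type}:${token.value}`).join(" ");
}

// Hand-written tokens standing in for the result of tokenizing "3 + 42".
const sampleTokens = [
  { type: "NUMBER", value: "3" },
  { type: "PLUS", value: "+" },
  { type: "NUMBER", value: "42" },
  { type: "EOF", value: "" },
];

console.log(formatTokens(sampleTokens)); // NUMBER:3 PLUS:+ NUMBER:42 EOF:
```

Because the EOF token has an empty value, it prints as EOF: with nothing after the colon.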