Lesson 1 of 15

Tokenizing Numbers and Operators

What is a Token?

Before we can interpret code, we need to break raw text into meaningful chunks called tokens. This process is called lexical analysis (or "lexing").

A token is a small object describing a piece of syntax:

{ type: "NUMBER", value: "42" }
{ type: "PLUS", value: "+" }
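
As a minimal sketch, tokens like these can be built with a small helper. The name makeToken is hypothetical and not part of the lesson; the point is only that each token pairs a type with the exact source text, kept as a string.

```javascript
// Hypothetical helper (not from the lesson): builds a token object.
function makeToken(type, value) {
    return { type, value };
}

const number = makeToken("NUMBER", "42");
const plus = makeToken("PLUS", "+");

// The value stays a string; converting "42" to the number 42 is
// assumed here to be left to a later stage.
console.log(number.type, typeof number.value); // prints NUMBER string
console.log(plus.type, plus.value);            // prints PLUS +
```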

Token Types for Arithmetic

For simple arithmetic like 3 + 42, we need:

Type      Matches
NUMBER    0, 7, 42, 100
PLUS      +
MINUS     -
EOF       end of input
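
One way to keep these type names consistent in code is to define them once. The sketch below is only an assumption about how that could look; TokenType and OPERATOR_TYPES are illustrative names, not part of the lesson.

```javascript
// Illustrative constants (names are assumptions, not from the lesson).
const TokenType = Object.freeze({
    NUMBER: "NUMBER",
    PLUS: "PLUS",
    MINUS: "MINUS",
    EOF: "EOF",
});

// Maps each single-character operator to its token type.
const OPERATOR_TYPES = new Map([
    ["+", TokenType.PLUS],
    ["-", TokenType.MINUS],
]);

console.log(OPERATOR_TYPES.get("-")); // prints MINUS
console.log(OPERATOR_TYPES.has("*")); // prints false
```

Looking operators up in a map keeps the scanning loop short if more operators are added later.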

A Simple Tokenizer

We can write a function that scans one character at a time:

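function tokenize(input) {
    const tokens = [];
    let i = 0;
    while (i < input.length) {
        if (input[i] >= "0" && input[i] <= "9") {
            // Collect consecutive digits into one NUMBER token.
            let num = "";
            while (i < input.length && input[i] >= "0" && input[i] <= "9") {
                num += input[i];
                i++;
            }
            tokens.push({ type: "NUMBER", value: num });
        } else if (input[i] === "+") {
            tokens.push({ type: "PLUS", value: "+" });
            i++;
        } else {
            // ... handle other characters here.
            // Until they are handled, fail loudly: leaving i unchanged
            // would make the while loop run forever.
            throw new Error("Unexpected character: " + input[i]);
        }
    }
    // Mark the end of input so later stages know when to stop.
    tokens.push({ type: "EOF", value: "" });
    return tokens;
}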
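
The inner while loop is the part that turns several digit characters into a single NUMBER token. As a sketch, the same scan can be isolated in a helper; isDigit and readNumber are hypothetical names introduced here for illustration only.

```javascript
// Hypothetical helpers illustrating the digit scan from tokenize.
function isDigit(ch) {
    return ch >= "0" && ch <= "9";
}

// Reads consecutive digits starting at index `start`.
// Returns the digit text and the index of the first unread character.
function readNumber(input, start) {
    let i = start;
    while (i < input.length && isDigit(input[i])) {
        i++;
    }
    return { text: input.slice(start, i), next: i };
}

console.log(readNumber("42+7", 0)); // { text: '42', next: 2 }
console.log(readNumber("42+7", 3)); // { text: '7', next: 4 }
```

Returning next lets the caller resume scanning exactly where the number ended, which is why the number branch in tokenize needs no extra i++ after pushing the token.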

Your Task

Write a tokenize(input) function that:

  • Reads multi-digit numbers (NUMBER)
  • Recognizes + (PLUS) and - (MINUS)
  • Skips whitespace
  • Appends an EOF token at the end

The function should return an array of { type, value } objects. To display the result, format each token as "TYPE:VALUE" and print the formatted tokens separated by spaces.
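
The output format can be sketched as follows. formatTokens is a hypothetical name, and the token array is written by hand so the example does not depend on any particular tokenize implementation. It assumes the EOF token's empty value is printed as nothing after the colon.

```javascript
// Hypothetical formatter: joins tokens as "TYPE:VALUE" separated by spaces.
function formatTokens(tokens) {
    return tokens.map((t) => `${t.type}:${t.value}`).join(" ");
}

// Hand-written tokens for the input "3 + 42".
const sample = [
    { type: "NUMBER", value: "3" },
    { type: "PLUS", value: "+" },
    { type: "NUMBER", value: "42" },
    { type: "EOF", value: "" }, // assumption: empty value prints as "EOF:"
];

console.log(formatTokens(sample)); // prints NUMBER:3 PLUS:+ NUMBER:42 EOF:
```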
