042025 · Full-stack · Infra

CodeCraft

CodeCraft: an Educative + CodeCrafters hybrid with Docker-sandboxed, multi-language code grading.

Next.js 16 · TypeScript · Prisma · Docker · Monaco · Caddy

855 grading test cases, across 23 challenges · 304 stages

Sandboxed languages: C++ · Go · Python · TypeScript · Rust

~50.6k lines of TypeScript, 36 Next.js API route handlers

123/123 tests passing, CI green on main

The problem

Learning systems engineering by reading is a dead end — you only internalize Redis, Git or a DNS server by building one. CodeCraft combines structured multi-lesson courses with hands-on 'build your own X' challenges that have staged progression, automated grading and instant feedback. The hard requirement underneath: run arbitrary learner-submitted code on the server, in five languages, without letting it touch the network, the host, or another submission.

The approach

The platform is a Next.js 16 App Router monolith (React 19, TypeScript 5, Tailwind v4) with Prisma over SQLite for persistence, in-browser editing via Monaco, and GitHub OAuth through NextAuth. Content is filesystem-defined: 23 build-your-own challenges each carry a definition.json (304 stages, 855 grading test cases total) and 7 structured courses carry a course.json (59 modules, 617 lessons). Grading executes every submission in a throwaway Docker container; an Anthropic-SDK-backed 'What's wrong with my code?' panel sits next to the editor.
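As a rough illustration of how filesystem-defined content can yield the totals quoted above, the sketch below tallies challenges, stages and test cases from already-parsed definition.json objects. The `ChallengeDefinition` shape (a `stages` array whose entries each carry a `tests` array) and the sample slugs are assumptions made for illustration; the real schema is not shown here.

```typescript
// Hypothetical schema: the actual definition.json layout may differ.
interface StageDefinition {
  name: string;
  tests: unknown[]; // assumed: one entry per grading test case
}

interface ChallengeDefinition {
  slug: string;
  stages: StageDefinition[];
}

interface ContentTotals {
  challenges: number;
  stages: number;
  testCases: number;
}

// Sum challenges, stages and test cases across parsed definitions.
function tallyContent(definitions: ChallengeDefinition[]): ContentTotals {
  let stages = 0;
  let testCases = 0;
  for (const def of definitions) {
    stages += def.stages.length;
    for (const stage of def.stages) {
      testCases += stage.tests.length;
    }
  }
  return { challenges: definitions.length, stages, testCases };
}

// Toy data standing in for JSON.parse() of two definition.json files.
const sampleDefinitions: ChallengeDefinition[] = [
  {
    slug: "sample-redis",
    stages: [
      { name: "ping", tests: [{}, {}] },
      { name: "echo", tests: [{}] },
    ],
  },
  {
    slug: "sample-git",
    stages: [{ name: "init", tests: [{}, {}, {}] }],
  },
];

const totals = tallyContent(sampleDefinitions);
console.log(totals); // { challenges: 2, stages: 3, testCases: 6 }
```

Run against the real content tree, a tally of this kind would be expected to report the 23 challenges, 304 stages and 855 test cases cited above.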

Key decisions

The load-bearing tradeoff was isolation over convenience for code execution: each run spins a fresh container with --rm, --network=none, a memory cap, --cpus=1, --pids-limit=64, a read-only filesystem and a 10s timeout, across four purpose-built images. That is slower and more operationally complex than an in-process interpreter, but it's the only honest way to run untrusted code from the internet. Persistence stayed on SQLite via Prisma rather than a networked DB — a deliberate single-node simplification, with the Docker socket mounted so the app itself launches the sandbox containers.
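The flags listed above map directly onto `docker run` arguments. The sketch below builds that argument list as a pure function; the `SandboxLimits` type, the image name `codecraft-sandbox-python`, the `/code/main.py` path and the 256 MB memory value are all hypothetical, since the source only says "a memory cap" and does not name its images.

```typescript
// Hypothetical shape; field names are illustrative, not CodeCraft's own.
interface SandboxLimits {
  memoryMb: number; // assumption: the actual cap is not stated
  cpus: number;
  pidsLimit: number;
}

// Build the argv for `docker run` with the isolation flags described above.
function buildDockerRunArgs(
  image: string,
  command: string[],
  limits: SandboxLimits,
): string[] {
  return [
    "run",
    "--rm",                              // delete the container on exit
    "--network=none",                    // no network access
    `--memory=${limits.memoryMb}m`,      // memory cap
    `--cpus=${limits.cpus}`,             // CPU quota
    `--pids-limit=${limits.pidsLimit}`,  // bounds process count (fork bombs)
    "--read-only",                       // read-only root filesystem
    image,
    ...command,
  ];
}

// Example with assumed image name, path and memory value.
const sandboxArgs = buildDockerRunArgs(
  "codecraft-sandbox-python",
  ["python", "/code/main.py"],
  { memoryMb: 256, cpus: 1, pidsLimit: 64 },
);
console.log(["docker", ...sandboxArgs].join(" "));
```

The 10s timeout is not covered by these flags; in a sketch like this it would be enforced by the caller, for example with a timer that stops the container when the limit elapses.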

What broke

Two honest notes. First, the README prose had drifted from ground truth: it claimed '19 challenges' while the filesystem holds 23 definition.json files, so every number here comes from parsing the files on disk, not the README. Second, CI was red before it was green: on the day of the fix, earlier runs failed before the suite was brought to passing.
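One way to catch this kind of README drift automatically is to compare the number the prose claims with the number of definition files actually found. The following is a minimal sketch under assumptions: the function names, the regex for the claim, and the `challenges/.../definition.json` paths are hypothetical, and the project may not run any such check.

```typescript
// Extract the number from a phrase like "19 challenges" in README text.
// Returns null when no such claim is present.
function claimedChallengeCount(readme: string): number | null {
  const match = /(\d+)\s+challenges\b/i.exec(readme);
  return match ? Number(match[1]) : null;
}

interface DriftReport {
  claimed: number | null;
  actual: number;
  drifted: boolean;
}

// Compare the README claim against the definition files found on disk.
function checkDrift(readme: string, definitionPaths: string[]): DriftReport {
  const claimed = claimedChallengeCount(readme);
  const actual = definitionPaths.length;
  return { claimed, actual, drifted: claimed !== null && claimed !== actual };
}

// Toy inputs: 23 hypothetical paths standing in for a directory walk.
const definitionPaths = Array.from(
  { length: 23 },
  (_, i) => `challenges/challenge-${i + 1}/definition.json`,
);
const driftReport = checkDrift("CodeCraft ships 19 challenges.", definitionPaths);
console.log(driftReport); // { claimed: 19, actual: 23, drifted: true }
```

Wired into CI, a check like this would fail the build whenever the prose and the filesystem disagree.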

Outcome

The codebase is substantial and real: ~50,591 lines of TS/TSX, 36 Next.js API route handlers, 123 commits, with a Prisma schema modelling the full learner platform (progress, submissions, streaks, certificates). Its own test suite runs and passes — 123 tests across 9 files via vitest — kept honest by GitHub Actions CI (typecheck, build, tests), latest run on main green. Production packaging exists as a docker-compose stack: the Next.js app, a one-shot sandbox-image builder, a Caddy auto-TLS reverse proxy with HSTS, and a watchtower watchdog.

Submission & grading flow

Browser: Monaco editor (in-browser IDE) · AI help panel (Anthropic SDK)

Server (docker-compose): Caddy (auto-TLS · HSTS) · Next.js 16 (App Router · 36 API routes)

Isolation: Sandbox runner (fresh container / run) · Prisma → SQLite (progress · submissions) · Ephemeral container (--network=none · --read-only · 10s) · Verdict (pass/fail vs 855 cases)

Untrusted code runs in a throwaway, network-isolated container per submission, across gcc · go · python · node images. Verified counts from the filesystem, not the README.
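The final verdict step in the flow above can be pictured as folding per-case results into a single pass/fail. The sketch below assumes grading compares trimmed stdout against an expected string and treats a timeout as a failure; that comparison rule and the `CaseResult` and `Verdict` types are assumptions, not the documented grader.

```typescript
// Hypothetical per-test-case result collected from one sandbox run.
interface CaseResult {
  name: string;
  expected: string;
  actual: string;
  timedOut: boolean;
}

interface Verdict {
  passed: boolean;
  passedCount: number;
  total: number;
  firstFailure: string | null;
}

// A stage passes only if it has cases and every case matches its expected output.
function judgeStage(results: CaseResult[]): Verdict {
  let passedCount = 0;
  let firstFailure: string | null = null;
  for (const r of results) {
    const ok = !r.timedOut && r.actual.trim() === r.expected.trim();
    if (ok) {
      passedCount++;
    } else if (firstFailure === null) {
      firstFailure = r.name;
    }
  }
  return {
    passed: results.length > 0 && passedCount === results.length,
    passedCount,
    total: results.length,
    firstFailure,
  };
}

// Toy results: one match, one mismatch, one timeout.
const stageVerdict = judgeStage([
  { name: "ping", expected: "PONG", actual: "PONG\n", timedOut: false },
  { name: "echo", expected: "hello", actual: "helo", timedOut: false },
  { name: "slow", expected: "done", actual: "", timedOut: true },
]);
console.log(stageVerdict);
```

Reporting the first failing case by name is one way to support the instant feedback the platform aims for.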