Andrej Karpathy, a former researcher at OpenAI and the founder of AI-native education company Eureka Labs, has launched a new experimental project that distils the inner workings of a generative pre-trained transformer into a single, minimal Python file.
The project, called microGPT, shows how a GPT-style language model can be trained and used for inference using only 243 lines of pure, dependency-free Python code—without PyTorch, TensorFlow, NumPy, or any external machine learning frameworks.
“This is the full algorithmic content of what is needed. Everything else is just for efficiency. I cannot simplify this any further,” said Karpathy in a post on X.
He shared the project on GitHub, along with an HTML file that presents the code as a single web page.
Karpathy explained that the full LLM architecture and loss function are stripped down to the most basic mathematical operations, such as addition, multiplication, exponentiation, logarithms, and the exponential function, with a tiny scalar-valued autograd engine (“micrograd”) computing gradients and Adam handling optimisation.
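To give a flavour of what a scalar-valued autograd engine looks like, the sketch below is an illustrative assumption in the spirit of micrograd, not lines from Karpathy's repository: each arithmetic operation records its inputs and a small closure that applies the chain rule when gradients are propagated backwards.

```python
# Minimal sketch (illustrative, not the project's actual code) of a
# scalar-valued autograd node in pure Python.
import math

class Value:
    def __init__(self, data, children=()):
        self.data = data            # the scalar value
        self.grad = 0.0             # dLoss/d(this node), filled by backward()
        self._children = children
        self._backward = lambda: None

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def _backward():
            self.grad += out.grad
            other.grad += out.grad
        out._backward = _backward
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def log(self):
        out = Value(math.log(self.data), (self,))
        def _backward():
            self.grad += (1.0 / self.data) * out.grad
        out._backward = _backward
        return out

    def backward(self):
        # build a topological order, then apply the chain rule in reverse
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# usage: a toy negative-log-likelihood built from two "parameters"
a, b = Value(0.4), Value(0.5)
loss = (a * b).log() * -1.0
loss.backward()
print(a.grad, b.grad)  # gradients that an SGD- or Adam-style update would consume
```

An optimiser such as Adam then uses these gradients, together with running averages of the gradient and its square, to update each parameter.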
The code that Karpathy shared also includes a simple character-level tokeniser, positional and token embeddings, multi-head self-attention with residual connections, RMS (root mean square) normalisation in place of layer normalisation, and an autoregressive sampling loop that generates text token by token after training.
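The sketches below illustrate three of those ingredients in dependency-free Python. They are assumptions written for this article rather than excerpts from the repository, and the `model` callable in the sampling loop is hypothetical.

```python
# Illustrative sketches (not the project's exact code), no NumPy or ML framework.
import math
import random

# 1) Character-level tokeniser: each distinct character maps to an integer id.
text = "hello world"
vocab = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}
encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)
print(decode(encode("hello")))  # round-trips back to "hello"

# 2) RMS normalisation: divide a vector by the root mean square of its entries
#    (no mean subtraction or learned bias in this bare-bones version).
def rmsnorm(x, eps=1e-5):
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [v / rms for v in x]

# 3) Autoregressive sampling loop: feed the growing context back into the model
#    and sample the next token from the returned probability distribution.
def sample(model, context, steps):
    for _ in range(steps):
        probs = model(context)  # hypothetical model call returning next-token probabilities
        next_id = random.choices(range(len(probs)), weights=probs)[0]
        context = context + [next_id]
    return context
```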
Users online reacted with widespread fascination and admiration, given that the project compresses an entire working GPT into just a few hundred lines of plain Python code.
Anand Iyer, Venture Partner at Lightspeed Ventures, wrote on X, “You can read it in one sitting and actually understand how LLMs work instead of treating them as black boxes.”
“When someone (Karpathy) who led Tesla’s Autopilot and helped found OpenAI says this is as simple as it gets, it means the field is maturing from research mystery to engineering clarity,” added Iyer, calling it the K&R of language models, a reference to the classic book The C Programming Language by Brian Kernighan and Dennis Ritchie, which defined the canonical, minimal expression of the C language.
Karpathy is widely known not just for his high-profile roles as a founding member of OpenAI and as Tesla’s Director of AI and Autopilot Vision, but also for his educational and technical deep dives into the fundamentals of neural networks and language models.
He authored and taught Stanford’s influential CS231n deep learning course, helped popularise the ideas of “Software 2.0” and “vibe coding”, and created hands-on projects and tutorials, including building neural network components and GPT-like models from scratch.