By the end of this chapter, you should understand:
- The difference between Python and CPython.
- What an interpreter is.
- Why multiple Python implementations exist.
- The architecture and tradeoffs of CPython.
- What PyPy, Jython, IronPython, and MicroPython are.
- Why different implementations optimize for different goals.
- Why language specifications and implementations are separate concepts.
Many developers use the terms:
- Python
- CPython
as if they mean the same thing.
They do not.
Python is a language.
CPython is an implementation of that language.
This distinction is extremely important because many different programs can implement the same language.
Understanding this separation removes a great deal of mystery about how Python works.
Think about human languages.
English is a language.
Many people can speak English.
Different people may have:
- Different accents
- Different voices
- Different speaking speeds
But they are all speaking English.
Similarly:
            Python Language
                   ↓
   ┌────────────┬────────────┬────────────┐
   │            │            │            │
CPython       PyPy        Jython     MicroPython
All follow Python's rules.
But they implement them differently.
Python is a specification.
It defines:
- Syntax
- Semantics
- Keywords
- Language rules
- Standard behavior
For example:
x = 10
if x > 5:
    print("Hello")
The language specifies what this code means.
It does not specify exactly how it should be executed.
That is the responsibility of an implementation.
An implementation is software that understands Python code and executes it.
Responsibilities include:
- Parsing source code.
- Compiling bytecode.
- Managing memory.
- Creating objects.
- Executing instructions.
The implementation determines:
How Python runs.
CPython is the reference implementation of Python.
It is:
- Written primarily in C.
- Developed by a community of core developers, with the Python Software Foundation holding the intellectual property rights.
- The most widely used implementation.
When most people install Python:
python3
they are actually installing:
CPython
The name comes from the language it is implemented in:
C
Hence:
C + Python
↓
CPython
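To check which implementation is actually running your code, the standard library exposes the implementation name alongside the language version. The snippet below is a minimal sketch; the helper name describe_runtime is made up for illustration, while platform.python_implementation(), platform.python_version(), and sys.implementation are standard library features.

```python
import platform
import sys


def describe_runtime():
    """Return (implementation name, language version) for the running interpreter."""
    name = platform.python_implementation()  # e.g. "CPython" or "PyPy"
    version = platform.python_version()      # version of the Python language supported
    return name, version


name, version = describe_runtime()
print(f"Implementation: {name}")
print(f"Language version: {version}")
print(f"sys.implementation.name: {sys.implementation.name}")
```

On a typical python3 installation the implementation is reported as CPython; under another implementation the same code should report that implementation's name instead.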
Python Source Code
↓
Tokenizer
↓
Parser
↓
AST
↓
Compiler
↓
Bytecode
↓
Python Virtual Machine
↓
Operating System
↓
Hardware
We will study these stages in later chapters.
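Several of these stages can be observed from Python itself, because the standard library exposes rough counterparts of them. The sketch below walks a tiny program through the tokenize, ast, compile, and exec steps. The source string and variable names are chosen for illustration, and the tokenize module is a library-level view of tokenization rather than the exact internal tokenizer.

```python
import ast
import io
import tokenize

source = "x = 10\nif x > 5:\n    result = 'Hello'\n"

# Tokenizer: split the source text into tokens.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Parser -> AST: build a tree that represents the program's structure.
tree = ast.parse(source)

# Compiler -> bytecode: turn the tree into a code object.
code = compile(tree, filename="<example>", mode="exec")

# Virtual machine: execute the code object in a fresh namespace.
namespace = {}
exec(code, namespace)

print(tokens[0].string)      # x
print(type(tree).__name__)   # Module
print(type(code).__name__)   # code
print(namespace["result"])   # Hello
```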
C provides:
- High performance
- Direct memory management
- Operating system access
- Portability
Advantages:
- Mature ecosystem
- Stable
- Extensible
Tradeoff:
- More complex internals
CPython handles:
- Parsing: converts source code such as x = 5 into internal structures.
- Compilation: produces bytecode.
- Memory management: creates and destroys objects, and reclaims unused memory.
- Object model: everything in Python is an object, and CPython implements this model.
- Execution: runs bytecode instructions.
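Two of these responsibilities are easy to observe from a running interpreter. The sketch below uses the standard dis module to list the bytecode instructions compiled for a small function, and checks that integers and functions are both objects. The function add_one is a made-up example, and the exact instruction names vary between CPython versions, so no specific listing is shown.

```python
import dis


def add_one(x):
    return x + 1


# Object model: integers and functions are both instances of object.
print(isinstance(5, object))        # True
print(isinstance(add_one, object))  # True

# Compilation: the function body is stored as a code object holding raw bytecode.
code = add_one.__code__
print(type(code.co_code).__name__)  # bytes

# Execution: these are the instructions the virtual machine runs for add_one.
opnames = [instruction.opname for instruction in dis.get_instructions(add_one)]
print(opnames)
```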
Why would alternative implementations exist?
Different users have different priorities:
- Speed
- Memory usage
- Embedded systems
- Java integration
- .NET integration
No single implementation optimizes for everything.
PyPy focuses on:
Performance
Unlike a default CPython build, PyPy includes:
JIT Compiler
(JIT = Just-In-Time Compiler)
Architecture:
Python Code
↓
Bytecode
↓
JIT Compiler
↓
Machine Code
Advantages:
- Often faster than CPython for long-running, pure-Python code.
Tradeoffs:
- Higher memory usage.
- Some compatibility issues, mainly with C extension modules.
Suppose a loop runs millions of times.
CPython repeatedly interprets instructions.
PyPy detects:
This code is executed frequently.
and converts it into machine code.
This reduces overhead.
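A loop like the one described might look as follows. This is only an illustrative sketch: sum_of_squares is a made-up function, and the timing it prints depends entirely on the machine and implementation, so no particular speedup is claimed. Running the same file under CPython and PyPy is one way to see the difference for yourself.

```python
import time


def sum_of_squares(n):
    """A hot loop: the same few operations repeated n times."""
    total = 0
    for i in range(n):
        total += i * i
    return total


start = time.perf_counter()
result = sum_of_squares(1_000_000)
elapsed = time.perf_counter() - start

print(f"result = {result}")
print(f"elapsed = {elapsed:.3f} seconds")
```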
Jython runs on:
JVM (Java Virtual Machine)
Architecture:
Python
↓
Java Bytecode
↓
JVM
Advantages:
- Access to Java libraries.
Tradeoffs:
- Lags behind newer Python versions (stable Jython releases target Python 2.7).
IronPython targets:
.NET CLR
(Common Language Runtime)
Advantages:
- Integration with C# and .NET.
Architecture:
Python
↓
CLR
MicroPython is designed for:
- Embedded systems
- Microcontrollers
- IoT devices
Examples:
- ESP32
- Raspberry Pi Pico
Goals:
- Small memory footprint
- Low power consumption
Tradeoff:
Not every CPython feature is available.
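Code meant to run on more than one implementation often guards against missing modules rather than assuming they exist. The helper below is a hypothetical sketch of that pattern. first_available is not a standard function, and the module names passed to it are illustrative; ujson is used only as an example of an alternative name that some MicroPython builds have provided.

```python
def first_available(*module_names):
    """Return the first module that imports successfully, or None if none do."""
    for name in module_names:
        try:
            return __import__(name)
        except ImportError:
            continue
    return None


# Prefer the standard name, fall back to an alternative if it is missing.
json_module = first_available("json", "ujson")

if json_module is None:
    print("No JSON module available on this implementation")
else:
    print(json_module.dumps({"led": 1}))
```

The same idea applies to individual features: try the capability, and provide a fallback when the implementation does not offer it.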
Stackless Python focuses on:
- Massive concurrency
- Lightweight task scheduling
Useful for:
- Games
- Simulations
GraalPython runs inside:
GraalVM
Goals:
- Interoperability
- Performance
| Implementation | Goal |
|---|---|
| CPython | General purpose |
| PyPy | Performance |
| Jython | Java ecosystem |
| IronPython | .NET ecosystem |
| MicroPython | Embedded systems |
| Stackless Python | Concurrency |
| GraalPython | Polyglot runtime |
CPython offers:
- Stability
- Compatibility
- Huge ecosystem
- Excellent documentation
- Broad library support
Most third-party libraries target CPython first.
Libraries such as:
- NumPy
- Pandas
- PyTorch
often depend heavily on CPython internals, in particular its C API.
Consider:
print("Hello")
The language defines:
What should happen.
The implementation decides:
How it happens.
This separation is fundamental.
One reason CPython is popular:
It allows C code to integrate directly.
Examples:
- NumPy
- Pandas
- SciPy
- PyTorch
Python code often delegates heavy computations to C.
This explains why Python can remain productive while still supporting high-performance libraries.
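The same delegation can be seen without any third-party library, because CPython implements many built-in functions in C. The sketch below compares a hand-written loop with the built-in sum. python_sum is a made-up helper, and the timings it prints vary by machine and implementation, so treat them as an informal illustration rather than a benchmark.

```python
import inspect
import time


def python_sum(values):
    """Add the values one by one in Python-level code."""
    total = 0
    for value in values:
        total += value
    return total


values = list(range(1_000_000))

start = time.perf_counter()
loop_result = python_sum(values)
loop_time = time.perf_counter() - start

start = time.perf_counter()
builtin_result = sum(values)
builtin_time = time.perf_counter() - start

# On CPython, sum is a built-in implemented in C rather than in Python.
print(f"sum is a built-in: {inspect.isbuiltin(sum)}")
print(f"results match: {loop_result == builtin_result}")
print(f"Python loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```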
Later, we will study:
- Reference counting
- Garbage collection
- Object model
- GIL
- Bytecode
Most of these concepts belong specifically to:
CPython
not Python itself.
This distinction prevents many misconceptions.
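Reference counting is a good example of an implementation detail that leaks into view. CPython exposes sys.getrefcount, but other implementations may not provide it at all, so the sketch below checks for it first. refcount_delta is a hypothetical helper written for this illustration.

```python
import sys


def refcount_delta():
    """Return how much the reported reference count grows after one extra
    reference is created, or None when sys.getrefcount is unavailable."""
    if not hasattr(sys, "getrefcount"):
        return None  # this implementation does not expose reference counts
    obj = []
    before = sys.getrefcount(obj)
    holder = [obj]  # the list now holds one additional reference to obj
    after = sys.getrefcount(obj)
    return after - before


delta = refcount_delta()
if delta is None:
    print("This implementation does not expose reference counts")
else:
    print(f"One extra reference changed the count by {delta}")
```

Code that relies on exact reference counts is therefore relying on CPython, not on the Python language.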
Misconception: Python and CPython are the same.
Reality:
Python is the language.
CPython is one implementation.
Misconception: Python is interpreted, period.
Reality:
Different implementations use:
- Interpretation
- JIT compilation
- Virtual machines
Misconception: All implementations behave identically internally.
Reality:
They share semantics but differ in execution.
Misconception: Python is written in Python.
Reality:
CPython is primarily written in C.
Most developers use:
CPython
Data Science:
CPython + NumPy + Pandas
Embedded Devices:
MicroPython
Java Integration:
Jython
Performance-sensitive workloads:
PyPy
Programming Language
↓
Language Specification
↓
Implementation
↓
CPython
↓
Parser
Compiler
VM
Memory
Objects
Understanding implementations prepares us to study how Python actually executes code.
- What is Python?
- What is CPython?
- Why are they different?
- Why does PyPy exist?
- What is MicroPython used for?
- Why separate language and implementation?
- Why isn't there only one Python implementation?
- Why is CPython written in C?
- Why can PyPy outperform CPython?
Explain:
- Python
- CPython
- PyPy
using spoken languages as an analogy.
Complete:
Python Language
↓
?
↓
Execution
Why is the middle layer necessary?
Suppose NumPy uses CPython internals heavily.
Would it automatically work on every Python implementation?
Why or why not?
Research:
- PyPy
- MicroPython
Identify one advantage and one tradeoff for each.
Explain why Python and CPython are not synonyms.
Draw:
Python Language
↓
CPython
PyPy
MicroPython
and explain the relationship.
In this chapter we learned:
- Python is a language specification.
- CPython is the reference implementation.
- Multiple implementations exist because different goals require different tradeoffs.
- PyPy focuses on speed.
- MicroPython targets embedded devices.
- CPython is written primarily in C.
- Many future chapters focus specifically on CPython internals.
So far we know:
Python Code
↓
CPython
↓
Execution
But what exactly happens inside CPython?
In the next chapter, we will trace the complete journey:
source.py
↓
Tokenizer
↓
Parser
↓
AST
↓
Compiler
↓
Bytecode
↓
Python Virtual Machine
↓
Output
For the first time, we will answer:
What really happens when you run a Python file?