
[experiment] python#178

Draft
AlexandreYang wants to merge 4 commits into main from alex/python


@AlexandreYang
Member

What does this PR do?

Motivation

Testing

Checklist

  • Tests added/updated
  • Documentation updated (if applicable)

AlexandreYang and others added 2 commits April 11, 2026 21:44
Adds a `python` builtin command that executes Python 3.4 source code
using the gpython pure-Go interpreter — no CPython installation required.

Usage: python [-c CODE] [-h] [SCRIPT | -] [ARG ...]

Security sandbox (enforced in builtins/internal/pyruntime/):
- os.system, os.popen, all exec/spawn/fork/write/delete functions removed
- open() replaced with read-only AllowedPaths-aware version; write/append
  modes raise PermissionError
- tempfile and glob modules neutered (functions removed)
- sys.exit() exit code propagated via closure variable before VM wraps error
- Source and file reads bounded at 1 MiB
- Context cancellation respected (goroutine + select on ctx.Done())

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@AlexandreYang
Member Author

Implementation notes: python builtin

This PR adds a python builtin command that executes Python 3.4 source code using gpython, a pure-Go Python interpreter. No CPython installation is required.


Architecture

The implementation is split across two packages:

builtins/python/python.go                  ← builtin registration, flag parsing, source dispatch
builtins/internal/pyruntime/pyruntime.go   ← gpython wrapper + security sandbox

The split is deliberate. builtins/internal/ is exempt from the builtinAllowedSymbols static-analysis check, which means the pyruntime package can freely use gpython's third-party API and the required blank import (_ "github.com/go-python/gpython/stdlib"). The outer builtins/python/python.go only sees the clean pyruntime.Run(ctx, opts) interface and is fully subject to the allowlist checker.


Usage

python [-c CODE] [-h] [SCRIPT | -] [ARG ...]
| Mode | Description |
| --- | --- |
| `python -c "print(1+1)"` | Execute inline code string |
| `python script.py` | Execute a script file (via AllowedPaths sandbox) |
| `echo "print(1)" \| python -` | Read source from stdin |
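To make the three modes concrete, here is a rough sketch of how the argument list could be mapped to a source. It is not the code in python.go: `parseArgs`, `invocation`, and `sourceKind` are hypothetical names, `-h` is left out, and treating an empty argument list as stdin is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// sourceKind says where the Python source comes from.
type sourceKind int

const (
	sourceInline sourceKind = iota // python -c CODE
	sourceStdin                    // python -
	sourceFile                     // python SCRIPT
)

// invocation is a hypothetical result type for this sketch.
type invocation struct {
	kind   sourceKind
	source string   // inline code or script path; empty for stdin
	args   []string // remaining ARG values
}

// parseArgs classifies the arguments that follow the command name.
func parseArgs(args []string) (invocation, error) {
	if len(args) == 0 {
		// Assumption: no arguments means "read from stdin".
		return invocation{kind: sourceStdin}, nil
	}
	switch args[0] {
	case "-":
		return invocation{kind: sourceStdin, args: args[1:]}, nil
	case "-c":
		if len(args) < 2 {
			return invocation{}, errors.New("python: -c requires an argument")
		}
		return invocation{kind: sourceInline, source: args[1], args: args[2:]}, nil
	default:
		return invocation{kind: sourceFile, source: args[0], args: args[1:]}, nil
	}
}

func main() {
	inv, err := parseArgs([]string{"-c", "print(1+1)", "extra"})
	fmt.Printf("%+v %v\n", inv, err)
}
```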

Security sandbox

Every gpython context is hardened before any user code runs:

1. os module sanitisation

The os module is pre-loaded via py.Import(pyCtx, "os") and then stripped of all dangerous functions before user code can touch it. Deleted names include:

  • Process exec/spawn: system, popen, execl, execle, execlp, execlpe, execv, execve, execvp, execvpe, fork, forkpty, spawnl … spawnvpe, startfile
  • Filesystem writes: remove, unlink, mkdir, makedirs, rmdir, removedirs, rename, renames, replace, link, symlink, truncate, write
  • Permission/ownership: chmod, chown, chroot
  • Environment mutation: putenv, unsetenv
  • Signal delivery: kill, killpg, _exit

Because the module is pre-loaded and cached in gpython's module store, any later import os in user code returns the already-sanitised copy.
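The stripping step amounts to deleting a fixed list of names from the module's globals before user code runs. A minimal sketch of that idea follows, using a plain Go map as a stand-in for the interpreter's module dictionary; `stripNames` is a hypothetical helper, not a function from pyruntime or gpython.

```go
package main

import (
	"fmt"
	"sort"
)

// stripNames deletes every name in blocked from a module's globals table
// and returns the names that were actually present and removed, sorted.
// The map is a stand-in for the real module dictionary (an assumption
// made for illustration).
func stripNames(globals map[string]any, blocked []string) []string {
	var removed []string
	for _, name := range blocked {
		if _, ok := globals[name]; ok {
			delete(globals, name)
			removed = append(removed, name)
		}
	}
	sort.Strings(removed)
	return removed
}

func main() {
	osGlobals := map[string]any{"getcwd": 1, "system": 2, "remove": 3, "environ": 4}
	removed := stripNames(osGlobals, []string{"system", "popen", "remove"})
	fmt.Println(removed, len(osGlobals))
}
```

Names on the block list that the module never defined (here `popen`) are simply skipped, so one list can be applied regardless of which functions a given build of the module exposes.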

2. Sandboxed open()

builtins.open is replaced with a Go-backed function that:

  • Rejects any mode containing w, a, x, or + with PermissionError
  • Routes read-only opens through callCtx.OpenFile, which enforces the AllowedPaths sandbox
  • Returns a goFile object with read(), readline(), readlines(), close(), __enter__/__exit__ — enough for with open(...) as f: patterns
  • Caps all reads at 1 MiB to prevent memory exhaustion

3. Blocked modules

tempfile and glob are neutered at context creation: their registered module implementations run Python code that raises ImportError, so import tempfile / import glob fails. (In practice, gpython's stdlib doesn't expose truly dangerous APIs for these, but neutering them is defence-in-depth.)

4. Stream redirection

sys.stdout, sys.stderr, and sys.stdin are replaced with custom Go-backed Python types (goWriter, goReader) that wrap callCtx.Stdout, callCtx.Stderr, and callCtx.Stdin. This ensures all Python print() output is captured by the shell executor's output limiter.

5. sys.exit() exit code propagation

gpython's built-in sys_exit has a bug: it returns the SystemExit exception as a Python value rather than raising it, so it never propagates to the Go caller. The fix is a custom override that stores the exit code in a closure variable (*exitCodePtr) before returning any error to stop the VM — this way the code is recovered even after gpython wraps the Go error into a SystemError exception:

var sysExitCode *int
exitCodePtr := &sysExitCode // captured by the sys.exit() closure below

// The override records the exit code in *exitCodePtr, then returns an error to stop the VM.
sysMod.Globals["exit"] = py.MustNewMethod("exit", func(...) (py.Object, error) {
    c := code
    *exitCodePtr = &c
    return nil, fmt.Errorf("sys.exit(%d)", code)
}, ...)

// After py.RunCode returns:
if sysExitCode != nil {
    return *sysExitCode   // recorded before the VM wrapped the error
}

6. Context cancellation

Run() executes gpython in a goroutine and selects on ctx.Done(). If the shell's execution timeout fires, the goroutine is abandoned. Because gpython is pure-Go and holds no OS resources (no child processes, no file descriptors beyond what the sandbox controls), abandoning the goroutine is safe.


Memory limits

| Resource | Limit | Enforcement |
| --- | --- | --- |
| Script / stdin source | 1 MiB | io.LimitReader in python.go |
| open().read() | 1 MiB | io.LimitReader in goFile.read() |
| open().readlines() | 1 MiB | io.LimitReader before bytes.SplitAfter |
| stdin .read() | 1 MiB | io.LimitReader in goReader.read() |
| Print output | executor limit (1 MiB) | enforced upstream by shell executor |

Static analysis

The implementation is registered in the static-analysis allowlists and check configs:

  • builtinPerCommandSymbols["python"] — lists the handful of stdlib symbols used directly in builtins/python/python.go (io.LimitReader, io.ReadAll, io.Reader, os.O_RDONLY, context.Context)
  • internalPerPackageSymbols["pyruntime"] — lists all stdlib symbols used in pyruntime (bufio, bytes, context, errors, fmt, io, os, strings)
  • internalAllowedSymbols — updated with the new symbols as the global ceiling
  • ExemptImport in both internalCheckConfig and internalPerPackageCheckConfig — exempts github.com/go-python/gpython/ from symbol-by-symbol checking (listing every py.* symbol would be impractical and offer no security benefit)
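As a rough mental model of how a symbol allowlist and an import exemption interact, consider the sketch below. `checkConfig` and `permits` are invented for this note and do not mirror the analyzer's actual types; the only detail taken from the list above is that imports under github.com/go-python/gpython/ skip symbol-by-symbol checking.

```go
package main

import (
	"fmt"
	"strings"
)

// checkConfig is a hypothetical, simplified allowlist configuration.
type checkConfig struct {
	allowed       map[string]bool // qualified symbols, e.g. "io.ReadAll"
	exemptImports []string        // import path prefixes that skip symbol checks
}

// permits reports whether a symbol from the given import path may be used.
func (c checkConfig) permits(importPath, symbol string) bool {
	for _, prefix := range c.exemptImports {
		if strings.HasPrefix(importPath, prefix) {
			return true // exempt package: no per-symbol check
		}
	}
	return c.allowed[symbol]
}

func main() {
	cfg := checkConfig{
		allowed:       map[string]bool{"io.LimitReader": true, "io.ReadAll": true},
		exemptImports: []string{"github.com/go-python/gpython/"},
	}
	fmt.Println(cfg.permits("io", "io.ReadAll"))
	fmt.Println(cfg.permits("os", "os.Remove"))
	fmt.Println(cfg.permits("github.com/go-python/gpython/py", "py.Import"))
}
```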

Test coverage

| Test file | Coverage |
| --- | --- |
| builtins/tests/python/python_test.go | 30+ unit tests: all flags, sys.exit propagation, file I/O, every sandbox restriction, error paths, context cancellation |
| builtins/tests/python/python_fuzz_test.go | Two fuzz targets: arbitrary source strings via -c, arbitrary file content via script file |
| tests/scenarios/cmd/python/ | 14 YAML scenario tests across basic/, sandbox/, errors/, stdin/ |

Known limitations (gpython vs CPython)

  • Python 3.4 syntax only — no f-strings, walrus operator (:=), match/case, *-unpacking in calls
  • Very limited stdlib: math, sys, os (read-only), time, string, binascii — no json, re, io, pathlib, hashlib, collections, itertools, etc.
  • str methods are limited — e.g. .upper(), .join(), .split() are not implemented in gpython
  • No subprocess, socket, threading, multiprocessing (not in gpython stdlib)

AlexandreYang and others added 2 commits April 11, 2026 22:26
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…builtins, and keywords

Adds 61 new scenario tests across 10 categories to improve coverage of
the python builtin's gpython (3.4) interpreter:

- keywords: pass, del, assert, global, nonlocal, in/not-in, is/is-not, break, continue
- comprehensions: list, filtered list, dict, set, generator expression, nested
- generators: basic yield, generator.send(), yield from, StopIteration
- lambdas: basic, sorted key, map
- builtins: len, range, enumerate, zip, map, filter, sorted, all/any, min/max,
  sum, chr/ord, bin/hex/oct, isinstance, type constructors, repr,
  print kwargs, getattr/setattr/hasattr, abs/divmod/pow
- exceptions: try/finally, try/except/finally, bare raise, raise from,
  multiple except handlers
- operators: bitwise, augmented assignment, chained comparisons, ternary, boolean short-circuit
- data_structures: tuple unpacking, extended unpacking, set operations,
  string format (%), string methods
- functions: default args, *args, **kwargs
- os_module: os.getcwd(), os.environ

Tests account for gpython v0.2.0 limitations: no str.format(), no str.lower/upper,
no len(bytes), no frozenset(), no classmethod/staticmethod, no closures (free
variable capture without nonlocal), no integer dict keys, no enumerate(start=).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>