
Add challenge 76: Bilinear Image Scaling (Medium) #199

Open
claude[bot] wants to merge 1 commit into main from add-challenge-76-bilinear-image-scaling


@claude claude bot commented Mar 1, 2026

Summary

  • Adds challenge 76: Bilinear Image Scaling (Medium difficulty)
  • Given an input image of dimensions H × W and target dimensions H_out × W_out, produce the scaled output using bilinear interpolation with the align-corners convention
  • Validated locally with a correct CUDA solution passing all functional and example tests on NVIDIA Tesla T4

Why this challenge is interesting

  • Each output pixel reads from 4 input pixels at non-aligned addresses: writes are coalesced, but reads are not guaranteed to be, so the memory access pattern is non-trivial
  • Requires floating-point coordinate computation (inverse mapping: src = i * (in_size - 1) / (out_size - 1)) with correct boundary handling
  • Tests understanding of 2D thread/block organization for image-shaped workloads
  • Not element-wise: every output depends on a weighted blend of 4 neighbours
  • Real-world operation that is widely used in GPU vision pipelines (OpenCV CUDA, cuDNN resize, etc.)
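
The inverse mapping and the 4-neighbour blend described above can be sketched on the CPU as follows. This is a minimal pure-Python illustration, not the reference implementation from this PR: the names `source_coord` and `bilinear_resize` are hypothetical, and mapping a size-1 output axis to input index 0 is an assumption made here to avoid the division by zero in the formula.

```python
def source_coord(i, in_size, out_size):
    """Map output index i to a fractional input coordinate (align-corners)."""
    if out_size == 1:
        # Assumption: a single output sample reads input index 0,
        # which sidesteps the division by (out_size - 1) == 0.
        return 0.0
    return i * (in_size - 1) / (out_size - 1)


def bilinear_resize(image, h_out, w_out):
    """Resize a 2D list of floats using bilinear interpolation."""
    h_in, w_in = len(image), len(image[0])
    out = [[0.0] * w_out for _ in range(h_out)]
    for y in range(h_out):
        sy = source_coord(y, h_in, h_out)
        y0 = min(int(sy), h_in - 1)   # upper neighbour row
        y1 = min(y0 + 1, h_in - 1)    # lower neighbour row, clamped at the edge
        wy = sy - y0                  # vertical blend weight
        for x in range(w_out):
            sx = source_coord(x, w_in, w_out)
            x0 = min(int(sx), w_in - 1)   # left neighbour column
            x1 = min(x0 + 1, w_in - 1)    # right neighbour column, clamped
            wx = sx - x0                  # horizontal blend weight
            top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
            bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
            out[y][x] = top * (1 - wy) + bottom * wy
    return out


# 2x2 -> 3x3 with the input values used by the first example in this PR.
result = bilinear_resize([[1.0, 3.0], [7.0, 9.0]], 3, 3)
print(result)  # [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
```

In a GPU version each `(y, x)` iteration would typically become one thread, which is why the writes to `out` line up with the thread grid while the four reads from `image` depend on the computed source coordinates.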

Checklist

challenge.html

  • Starts with <p> (problem description) — never <h1>
  • Has <h2> sections for: Implementation Requirements, Example(s), Constraints
  • First example matches generate_example_test() values (2×2 → 3×3 with [[1,3],[7,9]])
  • Examples use LaTeX \begin{bmatrix} for matrices
  • Constraints includes Performance is measured with H = 4,096, W = 4,096, H_out = 8,192, W_out = 8,192

challenge.py

  • class Challenge inherits ChallengeBase
  • __init__ calls super().__init__() with name, atol, rtol, num_gpus, access_tier
  • reference_impl has assertions on shape, dtype, and device
  • All 6 methods present
  • generate_functional_test returns 10 cases covering edge cases (1×1, 2×2, 3×3 zeros, 4×12 width-only), powers-of-2 (16², 64², 256²), non-powers-of-2 (30×40, 100×150), realistic (1024²)
  • generate_performance_test: 4096×4096 → 8192×8192 (320 MB total, well within 16 GB × 5)
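
The 320 MB figure for the performance test can be checked with a short calculation. This sketch assumes 4-byte (float32) elements, which is not stated explicitly above but is consistent with the quoted total; `buffer_bytes` is a hypothetical helper.

```python
BYTES_PER_ELEMENT = 4  # assumption: float32 pixels


def buffer_bytes(height, width, bytes_per_element=BYTES_PER_ELEMENT):
    """Size in bytes of a dense height x width buffer."""
    return height * width * bytes_per_element


input_bytes = buffer_bytes(4096, 4096)    # 64 MiB
output_bytes = buffer_bytes(8192, 8192)   # 256 MiB
total_mib = (input_bytes + output_bytes) / 2**20
print(total_mib)  # 320.0
```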

Starter files

  • All 6 files present: .cu, .pytorch.py, .triton.py, .jax.py, .cute.py, .mojo
  • Exactly 1 parameter description comment per file, no other comments
  • CUDA/Mojo use "device pointers" (medium challenge, no parenthetical)
  • Python frameworks use "tensors on the GPU"; JAX also has # return output tensor directly
  • Starters compile/run but do NOT produce correct output

General

  • Directory follows 76_bilinear_image_scaling convention
  • Linting passes: pre-commit run --all-files
  • Validated with run_challenge.py --action run — status: success

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>