feat(bokeh): implement dumbbell-basic #9565

Merged
MarkusNeusinger merged 6 commits into main from implementation/dumbbell-basic/bokeh
Jun 30, 2026

Conversation

@github-actions

Contributor

Implementation: dumbbell-basic - python/bokeh

Implements the python/bokeh version of dumbbell-basic.

File: plots/dumbbell-basic/implementations/python/bokeh.py

Parent Issue: #945


🤖 impl-generate workflow

github-actions Bot added 2 commits June 30, 2026 22:49
Regen from quality 88. Addressed:
- Canvas size: fixed 4800×2700 → canonical 3200×1800 with CDP viewport override
- Title format: added python language token, removed "Employee Satisfaction" prefix
- Save method: replaced export_png with Selenium + CDP Emulation.setDeviceMetricsOverride
- Typography: updated to canonical 50pt/42pt/34pt sizes for 3200×1800 canvas
- Y-axis spine: removed (p.yaxis.axis_line_color = None) for cleaner look
- Segment color-coding: segments now green for improvement, red (#AE3030) for regression
- Comment hygiene: "Okabe-Ito" → "Imprint palette"
- Added sys.path self-shadowing fix (bokeh.py naming conflict)
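
The self-shadowing fix mentioned in the last bullet can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `strip_script_dir` is hypothetical, and it assumes the conflict arises because running a script named bokeh.py puts its own directory on `sys.path`, so `import bokeh` would resolve to the script instead of the installed library.

```python
import os


def strip_script_dir(path_entries, script_file):
    """Return path_entries without the directory that contains script_file.

    Hypothetical helper: dropping the script's own directory keeps a file
    named bokeh.py from shadowing the installed bokeh package.
    """
    script_dir = os.path.dirname(os.path.abspath(script_file))
    kept = []
    for entry in path_entries:
        # An empty entry means "current working directory" to the importer.
        resolved = os.path.abspath(entry or os.getcwd())
        if resolved != script_dir:
            kept.append(entry)
    return kept


# Example with made-up paths: the script directory is dropped, the rest stays.
example = strip_script_dir(["/work/plots", "/usr/lib/python3"], "/work/plots/bokeh.py")
```

In a script this would typically run as `sys.path[:] = strip_script_dir(sys.path, __file__)` before the first `import bokeh`.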
@claude

claude Bot commented Jun 30, 2026

Contributor

AI Review - Attempt 1/3

Image Description

Light render (plot-light.png): Warm off-white background (#FAF8F1, correct). Title "dumbbell-basic · python · bokeh · anyplot.ai" in dark ink, clearly legible, filling ~75% of canvas width (expected for this title length). Eight department categories on the y-axis; x-axis shows "Satisfaction Score" 50–90. "Before" dots (brand green #009E73) and "After" dots (lavender #C475FD) at size=28 are prominent and well-separated. Connecting segments are green for improvements (7 departments) and matte red for Finance (regression of -6 pts) — a clear semantic color choice. Legend at top_left with warm elevated background, readable labels. Subtle vertical grid at 10% alpha. All text is dark-colored and clearly readable against the warm off-white surface.

Dark render (plot-dark.png): Warm near-black background (#1A1A17, correct). Title, axis labels, tick labels, and legend text are all light-colored (#F0EFE8 / #B8B7B0) — clearly readable, no dark-on-dark failures observed. Data colors are identical to the light render: brand green Before dots, lavender After dots, matte red Finance segment — only chrome (backgrounds, text, legend box) flips between themes. Brand green #009E73 remains visually distinct and readable on the dark surface. Both renders pass legibility check.

Score: 88/100

Category | Score | Max
Visual Quality | 28 | 30
Design Excellence | 12 | 20
Spec Compliance | 15 | 15
Data Quality | 15 | 15
Code Quality | 10 | 10
Library Mastery | 8 | 10
Total | 88 | 100

Visual Quality (28/30)

  • VQ-01: Text Legibility (7/8) — All font sizes explicitly set per bokeh.md (50pt title, 42pt axis, 34pt ticks/legend). Well-proportioned in both themes. Minor: 'Department' at 42pt is slightly large for a short single-word label; proportional sizing check notes short labels should not dominate.
  • VQ-02: No Overlap (6/6) — No overlap. Legend at top_left is close to the R&D Before dot but does not obscure it. Category labels well-spaced.
  • VQ-03: Element Visibility (6/6) — Dots at size=28 prominent for 8 sparse data points. Connecting segments at line_width=6 clearly visible. White dot outlines add definition.
  • VQ-04: Color Accessibility (2/2) — Imprint palette (green + lavender) with semantic red for regression. White outlines improve contrast. CVD-safe.
  • VQ-05: Layout & Canvas (3/4) — Good proportions and canvas utilization. Minor: legend at top_left is inside the data area and slightly crowds the Research & Development row at the top.
  • VQ-06: Axis Labels & Title (2/2) — 'Satisfaction Score' descriptive; unitless scores are appropriate here. 'Department' correct. Title format correct.
  • VQ-07: Palette Compliance (2/2) — Before dots #009E73 (Imprint pos 1), After dots #C475FD (Imprint pos 2). Regression segment uses #AE3030 (semantic matte red — appropriate for loss/regression). Backgrounds #FAF8F1 light / #1A1A17 dark correct. Chrome adapts correctly in both renders.

Design Excellence (12/20)

  • DE-01: Aesthetic Sophistication (5/8) — Above defaults. Color-coded connecting segments by improvement/regression direction is a genuine design choice adding semantic value. Dot outlines and minimal chrome contribute. Not yet at publication-quality level.
  • DE-02: Visual Refinement (4/6) — Good refinement: y-axis spine removed, only x-axis kept, subtle vertical grid at 10% alpha, generous explicit margins, no Y ticks. Clearly above minimal-customization.
  • DE-03: Data Storytelling (3/6) — Sorting by delta (ascending) with Finance regression at bottom creates implicit hierarchy, and the red segment highlights the lone regression. However, the story is implicit — no delta labels on connecting segments, no subtitle, viewer must read carefully to identify Finance as the outlier.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct dumbbell chart: two dots per category connected by a line, horizontal orientation, categories on y-axis.
  • SC-02: Required Features (4/4) — Two distinct dot colors, connecting line, sorted by difference, 8 categories in optimal range.
  • SC-03: Data Mapping (3/3) — Categories on y-axis, satisfaction scores on x-axis. All data visible.
  • SC-04: Title & Legend (3/3) — Title 'dumbbell-basic · python · bokeh · anyplot.ai' correct. Legend labels match data context.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Shows both improvement (7 departments) and regression (Finance, -6 pts). Sorted by delta reveals the full feature range of a dumbbell chart.
  • DQ-02: Realistic Context (5/5) — Employee satisfaction before/after policy changes — neutral, real-world HR scenario.
  • DQ-03: Appropriate Scale (4/4) — Scores 55–88, improvements up to 23 pts, one regression (-6). Plausible and internally consistent.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Linear: self-path fix → imports → constants → data → sort → plot → save.
  • CQ-02: Reproducibility (2/2) — Hardcoded deterministic data.
  • CQ-03: Clean Imports (2/2) — All imports used.
  • CQ-04: Code Elegance (2/2) — Clean, no over-engineering. Self-shadowing workaround is required for bokeh.py.
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.png and plot-{THEME}.html. Correct for interactive library.

Library Mastery (8/10)

  • LM-01: Idiomatic Usage (5/5) — ColumnDataSource throughout, HoverTool, output_file/save, theme-adaptive env-var pattern, segment() for connecting lines — all idiomatic Bokeh.
  • LM-02: Distinctive Features (3/5) — HoverTool and HTML interactive output are genuinely Bokeh-distinctive. Solid but fairly standard; could leverage shared hover across Before/After renderers or a custom hover template showing the delta value.
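
The theme-adaptive env-var pattern credited under LM-01 might look roughly like the sketch below. The hex values and the plot-{THEME} file names are taken from this review; the environment variable name `PLOT_THEME`, the helper `resolve_theme`, and the dictionary layout are assumptions made for illustration only.

```python
import os

# Chrome colors flip per theme (values quoted in the review).
CHROME = {
    "light": {"PAGE_BG": "#FAF8F1", "INK": "#1A1A17", "INK_SOFT": "#4A4A44"},
    "dark": {"PAGE_BG": "#1A1A17", "INK": "#F0EFE8", "INK_SOFT": "#B8B7B0"},
}

# Data colors stay identical in both renders.
DATA_COLORS = {"before": "#009E73", "after": "#C475FD", "regression": "#AE3030"}


def resolve_theme(env=None):
    """Pick the theme from an environment mapping and derive output names.

    PLOT_THEME is a hypothetical variable name; the real script may differ.
    """
    env = os.environ if env is None else env
    theme = env.get("PLOT_THEME", "light").lower()
    if theme not in CHROME:
        raise ValueError(f"unknown theme: {theme!r}")
    return theme, CHROME[theme], f"plot-{theme}.png", f"plot-{theme}.html"


theme, chrome, png_name, html_name = resolve_theme({"PLOT_THEME": "dark"})
```

Keeping data colors in a separate, theme-independent mapping mirrors the review's observation that only chrome changes between the light and dark renders.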

Score Caps Applied

  • None

Strengths

  • Correct dumbbell chart with horizontal orientation and categories on y-axis, fully matching the spec
  • Color-coded connecting segments (brand green for improvement, matte red for Finance regression) adds semantic meaning
  • Sorted by delta ascending (largest improvement at top) creates natural visual storytelling
  • Correct Imprint palette: Before dots #009E73, After dots #C475FD, regression segment #AE3030 (semantic)
  • Theme adaptation excellent — all chrome elements correctly adapt in both light and dark renders
  • Perfect spec, data, and code quality

Weaknesses

  • DE-03 (3/6): Finance regression is highlighted by color but the story is implicit — add delta labels on connecting segments (e.g., +16, +17, −6) or a subtitle to make the narrative explicit
  • LM-02 (3/5): Could leverage more distinctive Bokeh features — shared hover across Before/After renderers, or a custom hover template showing the computed delta
  • VQ-05 (3/4): Legend at top_left is inside the data area and slightly crowds the Research & Development row; consider bottom_left placement or anchoring the legend outside the plot area
  • DE-01 (5/8): Design is above defaults but not yet publication-quality; more refined typography treatment or visual polish would push it higher
  • VQ-01 (7/8): 'Department' y-axis label is a short word at 42pt — slightly disproportionate; consider 36pt for better proportional balance

Issues Found

  1. DE-03 LOW: Story told implicitly through sort order and color, but viewer must work to find it
    • Fix: Add delta labels on connecting segments (+N pts) OR a subtitle like "All departments improved except Finance" — delta labels are the more compact and data-forward choice
  2. LM-02 MODERATE: HoverTool is used but not maximally leveraged
    • Fix: Share hover across Before and After renderers simultaneously; add delta to the tooltip so hovering shows "Department: Engineering / Before: 62 / After: 78 / Change: +16"
  3. VQ-05 MINOR: Legend placement crowds R&D row at top
    • Fix: Move legend to bottom_left or position it outside the plot area to the right with add_layout(legend, 'right')

AI Feedback for Next Attempt

Priority 1 — Data storytelling: Add delta labels on the connecting segments (formatted as +N or −N) to make the magnitude of each change immediately readable without requiring axis estimation. This single change upgrades DE-03 from 3 to 5+.

Priority 2 — Legend placement: Move legend to bottom_left or anchor it outside the data area (add_layout with 'right' or 'below') to avoid crowding the top data row.

Priority 3 — LM-02 lift: Upgrade the HoverTool to show delta (computed as end−start) in the tooltip, and configure it as a shared hover across both Before and After renderers so hovering over either dot highlights the full dumbbell for that department.
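
As a rough sketch of Priorities 1 and 3, the delta, its signed label, the segment color, and the tooltip text can all be derived from the before/after values before any plotting. The Engineering and Finance numbers come from this review; the third row and all function names are hypothetical, and coloring a zero delta as an improvement is an assumption.

```python
IMPROVEMENT = "#009E73"  # brand green, used for improving segments
REGRESSION = "#AE3030"   # matte red, used for regressing segments

rows = [
    {"dept": "Engineering", "before": 62, "after": 78},   # from the tooltip example
    {"dept": "Finance", "before": 78, "after": 72},       # the lone regression
    {"dept": "Example Dept", "before": 60, "after": 71},  # hypothetical row
]


def signed(delta):
    """Format a delta as +N or −N (U+2212 minus sign, as in the review)."""
    if delta > 0:
        return f"+{delta}"
    if delta < 0:
        return f"\u2212{abs(delta)}"
    return "0"


def annotate(rows):
    """Add delta, label, midpoint and segment color; sort ascending by delta."""
    out = []
    for r in rows:
        delta = r["after"] - r["before"]  # end minus start
        out.append({
            **r,
            "delta": delta,
            "label": signed(delta),
            "mid": (r["before"] + r["after"]) / 2,  # x position for the label
            "seg_color": IMPROVEMENT if delta >= 0 else REGRESSION,
        })
    return sorted(out, key=lambda r: r["delta"])


def tooltip(r):
    """Plain-text version of the tooltip suggested in the review."""
    return (f"Department: {r['dept']} / Before: {r['before']} / "
            f"After: {r['after']} / Change: {r['label']}")


annotated = annotate(rows)
```

With a bottom-to-top categorical axis, the ascending sort places the regression in the bottom row and the largest improvement in the top row.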

Verdict: REJECTED

@github-actions github-actions Bot added the quality:88 (Quality score: 88/100) and ai-rejected (Quality not OK, triggers update) labels Jun 30, 2026
@github-actions github-actions Bot added the ai-attempt-1 (First repair attempt) label and removed the ai-rejected (Quality not OK, triggers update) label Jun 30, 2026
Attempt 1/3 - fixes based on AI review
@github-actions

Contributor Author

🔧 Repair Attempt 1/4

Applied fixes based on AI review feedback.

Status: Repair completed, re-triggering review...


🤖 impl-repair

@claude

claude Bot commented Jun 30, 2026

Contributor

AI Review - Attempt 2/3

Image Description

Light render (plot-light.png): Horizontal dumbbell chart on warm off-white (#FAF8F1) background. Eight department categories on the Y-axis sorted ascending by delta change (Finance −6 at bottom, R&D +23 at top). Brand green (#009E73) dots mark "Before" satisfaction scores; lavender (#C475FD) dots mark "After" scores. Connecting segments are green for improvements and red (#AE3030) for the Finance regression. Delta labels (+23, +17, +16, +16, +15, +13, +11, −6) are positioned above each segment midpoint in INK_SOFT. Title and axis labels are dark ink (#1A1A17) — clearly readable. Tick labels in muted dark (#4A4A44) — readable. Legend box at bottom-left with elevated background — overlaps the Finance row (bottom data row), partially obscuring the Finance dumbbell dots and the '−6' label. All other text legible against the off-white surface.

Dark render (plot-dark.png): Same layout on near-black (#1A1A17) background. All chrome elements flip correctly: title and axis labels use #F0EFE8, tick labels use #B8B7B0 — no dark-on-dark failures detected. Data colors are identical to the light render: green (#009E73) and lavender (#C475FD) dots, red (#AE3030) Finance segment. Delta labels visible in muted light gray (#B8B7B0). Legend at bottom-left uses elevated dark background (#242420) with light text — readable. Same bottom-left legend overlap with Finance row as light render. Brand green #009E73 reads clearly against the near-black surface.

Both paragraphs required — both renders examined. Theme-adaptive chrome is correct in both.

Score: 88/100

Category | Score | Max
Visual Quality | 25 | 30
Design Excellence | 14 | 20
Spec Compliance | 15 | 15
Data Quality | 15 | 15
Code Quality | 10 | 10
Library Mastery | 9 | 10
Total | 88 | 100

Visual Quality (25/30)

  • VQ-01: Text Legibility (7/8) — Font sizes explicitly set (50pt title, 42pt x-axis, 36pt y-axis, 34pt ticks). All text readable in both themes. Minor: y-axis 36pt vs x-axis 42pt is a slight imbalance.
  • VQ-02: No Overlap (3/6): The bottom-left legend placement puts the legend directly over the Finance row (sorted to the bottom by the ascending-delta sort), obscuring the Finance dumbbell dots and the '−6' delta label in both renders.
  • VQ-03: Element Visibility (6/6) — Dots at size=28 are prominent for 8 data points. Segments at line_width=6 with alpha=0.55 are clear. Delta labels at 26pt are readable.
  • VQ-04: Color Accessibility (2/2) — Imprint palette positions 1 & 2 with redundant x-position encoding; semantic red for regression is appropriate.
  • VQ-05: Layout & Canvas (3/4) — Canvas gate passed (3200×1800). min_border_* correctly enlarged. Legend placement overlapping bottom row is the one layout flaw.
  • VQ-06: Axis Labels & Title (2/2) — Correct title format. Axis labels 'Satisfaction Score' and 'Department' are descriptive.
  • VQ-07: Palette Compliance (2/2) — First series = #009E73 ✓; second = #C475FD ✓; red segment is semantic exception for regression ✓; backgrounds #FAF8F1 / #1A1A17 ✓; data colors identical across renders ✓.

Design Excellence (14/20)

  • DE-01: Aesthetic Sophistication (5/8) — Color-coded segments for direction of change, white dot edges for visual separation, delta labels as embedded annotations. Intentional semantic red for Finance regression.
  • DE-02: Visual Refinement (4/6) — Y-axis spine removed, outline removed, vertical-only grid at 10% alpha, legend with elevated background and muted border.
  • DE-03: Data Storytelling (5/6) — Ascending delta sort places R&D (+23) at top as the hero story. Red Finance segment creates immediate visual contrast. Delta labels make policy-change impact explicit.

Spec Compliance (15/15)

  • SC-01: Plot Type (5/5) — Correct horizontal dumbbell: categories on y-axis, values on x-axis, two dots per category connected by a line.
  • SC-02: Required Features (4/4) — Horizontal orientation ✓, distinct colors for start/end dots ✓, connecting line ✓, sorted by difference ✓.
  • SC-03: Data Mapping (3/3) — Categories on y-axis ✓, scores on x-axis ✓, x_range=(45,95) covers all data ✓.
  • SC-04: Title & Legend (3/3) — Title 'dumbbell-basic · python · bokeh · anyplot.ai' matches required format ✓. Legend differentiates 'Before / After policy changes' ✓.

Data Quality (15/15)

  • DQ-01: Feature Coverage (6/6) — Shows direction, magnitude, before/after comparison. Includes one regression (Finance −6) demonstrating both positive and negative delta handling.
  • DQ-02: Realistic Context (5/5) — Employee satisfaction scores before/after policy changes is plausible, neutral, and relatable. Scores 55–88 are realistic.
  • DQ-03: Appropriate Scale (4/4) — 8 categories (within 5–20 spec range), realistic 0–100 score scale, good axis padding.

Code Quality (10/10)

  • CQ-01: KISS Structure (3/3) — Straight procedural script, no functions or classes.
  • CQ-02: Reproducibility (2/2) — All data hardcoded; fully deterministic.
  • CQ-03: Clean Imports (2/2) — All imports are used; no dead imports.
  • CQ-04: Code Elegance (2/2) — HoverTool is real bokeh interactivity for the HTML artifact (not fake UI). Selenium screenshot pattern is correct.
  • CQ-05: Output & API (1/1) — Saves plot-{THEME}.png and plot-{THEME}.html as required. Current bokeh 3.x API.

Library Mastery (9/10)

  • LM-01: Idiomatic Usage (5/5) — ColumnDataSource for all data, segment() for lines, scatter() for dots, LabelSet for delta annotations, categorical y-axis via y_range=categories, correct min_border_* for label clearance.
  • LM-02: Distinctive Features (4/5) — HoverTool with multi-field tooltips ✓, LabelSet for positioned text ✓, interactive HTML artifact ✓, ColumnDataSource with rich field dictionaries ✓.
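
The Bokeh building blocks listed above could be wired together roughly as in the sketch below. It is not the merged bokeh.py: it uses only the two departments whose values appear in this review, a small 800×450 figure instead of the 3200×1800 canvas, illustrative column names, and a top-left legend as suggested in the feedback. It assumes Bokeh 3.x is installed.

```python
from bokeh.models import ColumnDataSource, HoverTool, LabelSet
from bokeh.plotting import figure

# Rows listed bottom-to-top (ascending delta); values quoted in the review.
depts = ["Finance", "Engineering"]
before = [78, 62]
after = [72, 78]
delta = [a - b for a, b in zip(after, before)]


def signed(d):
    """+N for gains, −N (U+2212) for losses, plain 0 otherwise."""
    return f"+{d}" if d > 0 else (f"\u2212{abs(d)}" if d < 0 else "0")


source = ColumnDataSource(data={
    "dept": depts,
    "before": before,
    "after": after,
    "mid": [(a + b) / 2 for a, b in zip(after, before)],
    "label": [signed(d) for d in delta],
    "seg_color": ["#009E73" if d >= 0 else "#AE3030" for d in delta],
})

p = figure(y_range=depts, x_range=(45, 95), width=800, height=450,
           title="dumbbell-basic · python · bokeh · anyplot.ai", tools="")

# Connecting segment, colored per row by direction of change.
p.segment(x0="before", y0="dept", x1="after", y1="dept", source=source,
          line_color="seg_color", line_width=6, line_alpha=0.55)

# Two dots per category sharing one data source.
before_r = p.scatter(x="before", y="dept", source=source, size=28,
                     fill_color="#009E73", line_color="white",
                     legend_label="Before")
after_r = p.scatter(x="after", y="dept", source=source, size=28,
                    fill_color="#C475FD", line_color="white",
                    legend_label="After")

# Delta labels above each segment midpoint.
p.add_layout(LabelSet(x="mid", y="dept", text="label", source=source,
                      y_offset=18, text_align="center"))

# One hover tool attached to both dot renderers, including the delta.
hover = HoverTool(renderers=[before_r, after_r], tooltips=[
    ("Department", "@dept"), ("Before", "@before"),
    ("After", "@after"), ("Change", "@label"),
])
p.add_tools(hover)

p.legend.location = "top_left"
p.yaxis.axis_line_color = None  # no y-axis spine
p.ygrid.grid_line_color = None  # vertical-only grid
p.xgrid.grid_line_alpha = 0.1
```

Because both scatter renderers read from the same ColumnDataSource, a single tooltip definition can show the full before/after/change record from either dot.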

Score Caps Applied

  • None — no caps triggered.

Strengths

  • Imprint palette used correctly: #009E73 (Before), #C475FD (After), #AE3030 (regression segment as semantic exception)
  • Delta labels embedded in the chart make policy-change impact immediately readable without calculation
  • Ascending delta sort creates clear visual hierarchy with R&D (+23) as the hero
  • Color-coded segments (green/red) add redundant semantic encoding for direction
  • Full theme-adaptive chrome: both renders correct, no dark-on-dark failures
  • White dot edges (line_color=PAGE_BG) give clean visual separation on both surfaces
  • HoverTool with Phase/Score/Δ fields makes the interactive HTML genuinely useful
  • Includes Finance regression (−6) demonstrating the chart handles both positive and negative changes

Weaknesses

  • The bottom-left legend overlaps the Finance row (sorted to the bottom by ascending delta), obscuring the Finance dumbbell dots (before=78, after=72) and the '−6' delta label. Fix: move the legend to the top-left, where there is clear empty space because the R&D row's start dot sits around x=65.
  • Y-axis label 'Department' at 36pt vs x-axis 'Satisfaction Score' at 42pt creates a mild axis label size imbalance. Fix: raise the y-axis label to 42pt to match (min_border_left=280 already reserves room).

Issues Found

  1. VQ-02 OVERLAP: Legend at bottom-left overlaps Finance row data
    • Fix: Move the legend to the top-left of the plot
  2. VQ-01 MINOR: Y-axis label at 36pt vs x-axis 42pt (slight imbalance)
    • Fix: Raise the y-axis label to 42pt

AI Feedback for Next Attempt

Two targeted fixes: (1) Move the legend from bottom-left to top-left; the top-left quadrant of the plot is empty (R&D's Before dot is at x≈65, well right of the x=45 axis edge, leaving room for the legend). (2) Raise the y-axis label font size from 36pt to 42pt to match the x-axis. Everything else (palette, data, storytelling, code structure, bokeh patterns, HTML export) is strong and should be preserved.

Verdict: APPROVED

@github-actions github-actions Bot added the ai-approved (Quality OK, ready for merge) label Jun 30, 2026
@MarkusNeusinger MarkusNeusinger merged commit 69940be into main Jun 30, 2026
3 checks passed
@MarkusNeusinger MarkusNeusinger deleted the implementation/dumbbell-basic/bokeh branch June 30, 2026 23:12

Labels

ai-approved (Quality OK, ready for merge), ai-attempt-1 (First repair attempt), quality:88 (Quality score: 88/100)

1 participant