Bug: answer_position.split(',') breaks evaluation for sheet names containing commas

### Description

In `evaluation/evaluation.py`, the `compare_workbooks()` function splits the `answer_position` string on commas to handle multiple sheet/range pairs:

```python
# Line 169
sheet_cell_ranges = answer_position.split(',')
```

This assumes commas appear only as delimiters between separate sheet/range entries. However, Excel sheet names can legally contain commas (e.g. `b2b, sez, de`). When they do, the `answer_position` value wraps the sheet name in single quotes, per Excel convention:

```
'b2b, sez, de'!A5:V10
```

The naive `.split(',')` ignores the quoting and also splits on the commas inside the quoted sheet name.

### Steps to reproduce

1. Run evaluation on task `130-9`, which has:
   - `answer_position`: `'b2b, sez, de'!A5:V10`
   - `answer_sheet`: `b2b, sez, de`
2. Provide a correct output file (verified by cell-by-cell comparison against the golden file: all values, formats, and structure match).
3. Evaluation returns `False` (score 0).

### Root cause

`"'b2b, sez, de'!A5:V10".split(',')` produces:

```python
["'b2b", " sez", " de'!A5:V10"]
```

The loop then processes the third fragment `" de'!A5:V10"`, splits on `!`, strips quotes, and looks for a sheet named `de` — which does not exist. Result: `False`.

The first two fragments (`'b2b` and ` sez`) have no `!`, so they fall through to the else branch on line 177 and get treated as cell ranges on the first sheet, which is also incorrect.
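The fragment handling described above can be traced with a small standalone sketch. This is not the code from `evaluation/evaluation.py`; the function name `trace_fragments` and the exact stripping logic are assumptions reconstructed from the behaviour described in this issue.

```python
def trace_fragments(answer_position, sheet_names):
    """Return (sheet, cell_range, sheet_exists) for each comma-split fragment."""
    traced = []
    for fragment in answer_position.split(","):
        if "!" in fragment:
            # Assumed handling: split on '!' and strip whitespace and quotes.
            sheet, cell_range = fragment.split("!", 1)
            sheet = sheet.strip().strip("'")
        else:
            # Assumed fallback: treat the fragment as a range on the first sheet.
            sheet, cell_range = sheet_names[0], fragment.strip()
        traced.append((sheet, cell_range, sheet in sheet_names))
    return traced


for row in trace_fragments("'b2b, sez, de'!A5:V10", ["b2b, sez, de"]):
    print(row)
```

Under these assumptions, the first two rows are bogus "ranges" (`'b2b` and `sez`) on the first sheet, and the last row points at a sheet named `de` that is not in the workbook.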

### Suggested fix

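Replace line 169 with a regex that respects single-quoted sheet names:

```python
import re
# Skip whitespace after a delimiter comma, then capture a quoted sheet name with its range, or a plain token.
sheet_cell_ranges = re.findall(r"\s*('[^']*'![^,]+|[^,]+)", answer_position)
```

This skips any whitespace after a delimiter comma, then captures either a single-quoted name followed by `!` and a range, or a plain comma-free token, so `'b2b, sez, de'!A5:V10` stays intact as a single entry. The leading `\s*` matters: without it, an entry such as `Sheet1!A1, 'a, b'!A1` would still be split, because the space before the opening quote lets the plain-token branch match first.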
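To sanity-check a quote-aware split before wiring it into `compare_workbooks()`, a small standalone sketch like the following may help. The helper name `split_answer_position` is hypothetical. The pattern is also an assumption that goes slightly beyond the suggestion above: it accepts doubled apostrophes (`''`), which is how Excel escapes an apostrophe inside a quoted sheet name, and it strips surrounding whitespace from each entry.

```python
import re

# Hypothetical helper; the pattern is a sketch, not the project's code.
# Branch 1: quoted sheet name (with '' as an escaped apostrophe), '!', range.
# Branch 2: any plain comma-free token.
_ENTRY = re.compile(r"\s*('(?:[^']|'')*'![^,]+|[^,]+)")


def split_answer_position(answer_position):
    """Split on delimiter commas while keeping quoted sheet names intact."""
    return [entry.strip() for entry in _ENTRY.findall(answer_position)]


print(split_answer_position("'b2b, sez, de'!A5:V10"))
print(split_answer_position("Sheet1!A1:B2, 'a, b'!C3:D4"))
print(split_answer_position("A1:B2,C3:D4"))
```

The first call should yield a single entry, the second two entries with the quoted name preserved, and the third the same result as a plain split, so existing comma-free tasks would not change behaviour.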

### Impact

All tasks whose sheet names contain commas will silently score 0 even when the model output is fully correct. This affects the reliability of benchmark results without any visible error or warning.
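To estimate how many tasks are affected, something like the sketch below could be run over the task metadata. The record layout (a list of dicts with `id` and `answer_position` keys) and the function name `affected_task_ids` are assumptions for illustration; only task `130-9` comes from this report, and the other sample records are made up.

```python
import re

# A quoted sheet name containing at least one comma, at the start of the
# string or right after a delimiter comma, followed by '!'.
_QUOTED_WITH_COMMA = re.compile(r"(?:^|,)\s*'(?:[^']|'')*,(?:[^']|'')*'!")


def affected_task_ids(tasks):
    """Return ids of tasks whose answer_position has a comma inside a quoted sheet name."""
    return [
        task["id"]
        for task in tasks
        if _QUOTED_WITH_COMMA.search(task["answer_position"])
    ]


# Sample records: only "130-9" is real; the rest are hypothetical.
sample_tasks = [
    {"id": "130-9", "answer_position": "'b2b, sez, de'!A5:V10"},
    {"id": "example-1", "answer_position": "'Sheet 1'!A1:B2,'Sheet 2'!C3:D4"},
    {"id": "example-2", "answer_position": "A1:B2,C3:D4"},
]
print(affected_task_ids(sample_tasks))
```

On the sample records this flags only `130-9`; quoted names without commas and plain multi-range entries are left alone.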
