
fix: Council Fix Pack - March 2026 #1841

Open: m26dvd wants to merge 16 commits into robbrad:master from m26dvd:master

Conversation

m26dvd (Contributor) commented Feb 2, 2026

Fixes
fix: #1836 - London Borough Redbridge
fix: #1831 - Harborough District Council
fix: #1844 - Harborough District Council
fix: #1846 - Powys Council
fix: #1845 - Mid Suffolk District Council
fix: #1851 - Bromley Borough Council
fix: #1853 - Wakefield City Council
fix: #1848 - Redcar and Cleveland Council
fix: #1855 - Barking & Dagenham
fix: #1858 - Cumberland Council
fix: #1861 - North East Derbyshire District Council

New Councils
fix: #1504 - Hammersmith & Fulham

Summary by CodeRabbit

  • Bug Fixes
    • Improved Redbridge collection detection.
    • Made Harborough address lookup more reliable via a stabilized session flow.
    • Hardened Powys date parsing to skip malformed entries.
    • Trimmed stray text in Mid Suffolk date parsing to avoid parse errors.
    • Improved Bromley driver initialization for better site compatibility.
  • New Features
    • Added support for Hammersmith & Fulham council bin retrieval.
  • Tests
    • Added test input for Hammersmith & Fulham.
  • Documentation
    • Updated councils docs: added/removed entries and revised guidance.

fix: robbrad#1836 - fix: London Borough Redbridge
coderabbitai bot (Contributor) commented Feb 2, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.
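For reference, the dotted setting named above corresponds to this nesting in `.coderabbit.yaml`; the threshold value shown here is illustrative, not CodeRabbit's documented default:

```yaml
# .coderabbit.yaml — sketch of the auto-pause setting; value is illustrative
reviews:
  auto_review:
    auto_pause_after_reviewed_commits: 20
```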

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Refactors a Redbridge Selenium wait, changes Harborough HTTP flow to GET→POST with a session, adds a Hammersmith & Fulham council scraper and test input, updates Powys and Mid Suffolk parsing, tweaks Bromley driver UA usage, and updates Councils.md documentation.
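The Harborough change is the main flow rework here. A minimal sketch of the cookie-seeding GET→POST pattern it describes, with hypothetical endpoint names and form fields standing in for the council's real ones:

```python
import requests

# Hypothetical endpoints; the real Harborough URLs live in the scraper itself.
SEARCH_PAGE = "https://example-council.gov.uk/bin-day"        # GET seeds session cookies
SEARCH_API = "https://example-council.gov.uk/bin-day/search"  # form-encoded POST target

session = requests.Session()
session.get(SEARCH_PAGE, timeout=30)  # establish cookies before posting

# data= sends application/x-www-form-urlencoded, matching the "form-encoded POST"
response = session.post(SEARCH_API, data={"postcode": "LE16 7XX"}, timeout=30)
response.raise_for_status()  # fail loudly on 4xx/5xx (e.g., the 502 case mentioned below)
print(response.status_code)
```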

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Redbridge tweak<br>`uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py` | Removed intermediate assignment when calling `wait.until(...)`; simplified Selenium wait invocation. |
| Harborough request flow<br>`uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py` | Replaced direct POST with a `requests.Session` GET→POST sequence to seed cookies, switched to form-encoded POST, removed prior explicit headers, kept 502 handling. |
| New council + test input<br>`uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`, `uk_bin_collection/tests/input.json` | Added CouncilClass for Hammersmith & Fulham that GETs postcode results, extracts weekday names, and maps them to next collection dates; added corresponding test entry. |
| Powys parsing update<br>`uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py` | Changed section selectors to card divs, changed date format parsing to include the weekday (`"%A %d %B %Y"`), normalized date strings (split on `" ("`), and added try/except around garden waste parsing. |
| Mid Suffolk minor parse trim<br>`uk_bin_collection/uk_bin_collection/councils/MidSuffolkDistrictCouncil.py` | Trim the date string after `" - "` before parsing to avoid parse errors when trailing descriptors exist. |
| Bromley UA tweak<br>`uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py` | Passes an explicit `user_agent` string to WebDriver creation instead of `None`. |
| Docs update<br>`wiki/Councils.md` | Added/removed multiple council entries and updated usage notes (includes Hammersmith & Fulham, Eden District, Camden, Wirral, South Norfolk adjustments). |

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant LBHF as "LBHF Website"
    participant Parser
    participant DateUtil as "Date Utils"

    Client->>LBHF: GET /recycling-and-rubbish/bin-collections?postcode=W12%200BQ
    LBHF-->>Client: HTML (nearest results / links)
    Client->>Parser: provide HTML
    Parser->>Parser: extract day/type items
    Parser->>DateUtil: map weekday name -> next collection date
    DateUtil-->>Parser: date
    Parser-->>Client: {"bins":[{"type":"...","collectionDate":"DD/MM/YYYY"}, ...]}
```
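The "map weekday name → next collection date" step is the only non-obvious part of that flow. A minimal sketch of the idea (the repo's common module provides similar `days_of_week`/`get_next_day_of_week` helpers, but their exact signatures are assumed here):

```python
from datetime import datetime, timedelta

# Weekday-name → index map, mirroring what the shared days_of_week mapping provides.
DAYS_OF_WEEK = {
    "Monday": 0, "Tuesday": 1, "Wednesday": 2, "Thursday": 3,
    "Friday": 4, "Saturday": 5, "Sunday": 6,
}

def next_collection_date(day_name: str) -> datetime:
    """Return today's date if the collection day is today, otherwise the
    next occurrence of that weekday."""
    today = datetime.now()
    offset = (DAYS_OF_WEEK[day_name] - today.weekday()) % 7
    return today + timedelta(days=offset)

print(next_collection_date("Friday").strftime("%d/%m/%Y"))
```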

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • dp247

Poem

🐰 I hopped through HTML and found a tiny clue,

Seeded cookies, chased a POST, and parsed each weekday true.
A new borough joined the warren, bins aligned in rows,
I nibbled whitespace, mapped the dates, then sorted how it goes.
Thump-thump for tidy code and one neat little proof.

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (1 warning, 2 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'fix: Council Fix Pack - March 2026' is vague and does not identify which councils are affected or what specific issues are resolved. | Consider a more descriptive title, such as 'fix: Update parsers for Redbridge, Harborough, Powys, Mid Suffolk and add Hammersmith & Fulham'. |
| Out of Scope Changes check | ❓ Inconclusive | Changes to BromleyBoroughCouncil.py (user agent modification) appear unrelated to any linked issue and may be out of scope for #1836-#1846 or #1504. | Clarify the purpose of the BromleyBoroughCouncil.py modification, or move it to a separate PR if unrelated to the stated objectives. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Linked Issues check | ✅ Passed | All linked issues are addressed: #1836 (Redbridge) via the wait.until fix, #1831 & #1844 (Harborough) via session handling and error management, #1846 (Powys) via HTML selector updates, #1845 (Mid Suffolk) via date string normalization, and #1504 (Hammersmith & Fulham) via the new council implementation. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)

🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment


Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py (2)

1-1: ⚠️ Potential issue | 🟡 Minor

Incorrect comment: file is for Redbridge, not Bromley.

The comment references "Bromley Council" but this file handles London Borough of Redbridge.

📝 Proposed fix
```diff
-# This script pulls (in one hit) the data from Bromley Council Bins Data
+# This script pulls (in one hit) the data from London Borough of Redbridge Bins Data
```

114-116: ⚠️ Potential issue | 🟠 Major

Silent error handling masks format changes.

When date parsing fails, this code sets formatted_date = "Invalid Date Format" but then does nothing with it—the data isn't appended. This silently swallows errors and returns empty results, making it hard to detect when the council site format changes.

Per project conventions, prefer explicit failures so format changes are detected immediately rather than returning empty or partial data.

🛡️ Proposed fix: Fail explicitly or log warning
```diff
                         except ValueError as e:
                             # Handle the case where the date format is invalid
-                            formatted_date = "Invalid Date Format"
+                            raise ValueError(
+                                f"Failed to parse date '{date_string}' for {collection_type}: {e}"
+                            )
```

Alternatively, if you want to continue processing other collections but still surface the issue:

```diff
                         except ValueError as e:
-                            # Handle the case where the date format is invalid
-                            formatted_date = "Invalid Date Format"
+                            # Log warning but continue processing
+                            print(f"Warning: Failed to parse date '{date_string}' for {collection_type}: {e}")
+                            return  # Skip this collection entry
```

Based on learnings: "In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors."

🤖 Fix all issues with AI agents
In `@uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py`:
- Around line 118-123: The code calls container.find("h3").get_text().strip()
which will raise AttributeError if an h3 is missing; update the loop over
container_fluid to guard the lookup: retrieve the h3 element with
container.find("h3"), check for None before calling get_text()/strip(), skip or
log the container when missing, and only call
extract_collection_data(collection, collection_type) when collection_type is
safely set; reference the variables container_fluid, container, collection_type,
and the function extract_collection_data when making the change.
- Around line 97-103: The parsing fails because date_string =
f"{collection_date} {collection_month}" lacks a year but datetime.strptime is
called with "%d %B %Y"; update the logic in LondonBoroughRedbridge (around the
creation of date_string/formatted_date) to append an inferred year: take the
current year, build a candidate date string like "{collection_date}
{collection_month} {year}", parse with "%d %B %Y", and if the resulting date is
before today (meaning the parsed month/day already passed) increment the year
and re-parse; then format to date_format and assign to formatted_date. Reference
symbols: collection_date, collection_month, date_string, formatted_date,
date_format, and datetime.strptime.
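A minimal sketch of the year-inference approach the second prompt describes, using the prompt's own `collection_date`/`collection_month` names (the roll-over re-parse is the key step; the `date_format` used in the project renders as DD/MM/YYYY per the walkthrough above):

```python
from datetime import datetime

def infer_full_date(collection_date: str, collection_month: str) -> str:
    """Parse a day + month that carries no year by assuming the current
    year, rolling over to next year if that date has already passed."""
    today = datetime.now()
    parsed = datetime.strptime(
        f"{collection_date} {collection_month} {today.year}", "%d %B %Y"
    )
    if parsed.date() < today.date():
        # The day/month already passed this year; re-parse with next year.
        parsed = datetime.strptime(
            f"{collection_date} {collection_month} {today.year + 1}", "%d %B %Y"
        )
    return parsed.strftime("%d/%m/%Y")

print(infer_full_date("3", "January"))  # rolls into next year once January has passed
```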
🧹 Nitpick comments (1)
uk_bin_collection/uk_bin_collection/councils/LondonBoroughRedbridge.py (1)

2-5: Shadowed import and unused module.

Line 2 imports the datetime module, but line 5 imports the datetime class from it, shadowing the module. Additionally, line 7 imports requests which is never used in this file.

🧹 Proposed cleanup
```diff
-# This script pulls (in one hit) the data from Bromley Council Bins Data
-import datetime
-import re
+# This script pulls (in one hit) the data from London Borough of Redbridge Bins Data
 import time
 from datetime import datetime

-import requests
 from bs4 import BeautifulSoup
```

codecov bot commented Feb 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 86.67%. Comparing base (f6cbad7) to head (36573c3).

Additional details and impacted files
```
@@           Coverage Diff           @@
##           master    #1841   +/-   ##
=======================================
  Coverage   86.67%   86.67%           
=======================================
  Files           9        9           
  Lines        1141     1141           
=======================================
  Hits          989      989           
  Misses        152      152           
```

☔ View full report in Codecov by Sentry.

coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@uk_bin_collection/tests/input.json`:
- Around line 1411-1418: The entry for "LondonBoroughHammersmithandFulham" is
missing skip_get_url and may trigger a prefetch failure; add "skip_get_url":
true to that object so the parser's parse_data step runs directly and avoids the
automatic GET to the base url (this will also cause the auto-generated wiki
entry to include -s). Update the JSON object for
LondonBoroughHammersmithandFulham to include the skip_get_url key alongside
postcode, url, wiki_command_url_override, wiki_name, wiki_note, and LAD24CD.

In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`:
- Around line 39-45: The comparison days_of_week.get(collection_day) == 0
incorrectly assumes Monday is "today"; change the logic to compare
days_of_week.get(collection_day) against the current weekday index
(datetime.now().weekday()) so that collection_day is set to today only when the
mapped index equals today's index, otherwise call
get_next_day_of_week(collection_day) and format with date_format; update
references to collection_day, days_of_week, get_next_day_of_week, and
date_format accordingly.
- Around line 20-25: Ensure the postcode is validated for presence and type
before calling .strip() and before invoking check_uprn: check that
kwargs.get("postcode") yields a non-empty string (e.g., not None) and raise or
return an explicit error if missing, then assign to user_postcode and call
user_postcode = user_postcode.strip().replace(" ", "") and only after that call
check_uprn(user_postcode) (or call check_uprn after validation but before
stripping if it expects raw input); update the logic around the user_postcode
variable and the check_uprn(...) call in LondonBoroughHammersmithandFulham.py to
avoid AttributeError when postcode is absent.

In `@wiki/Councils.md`:
- Line 112: The TOC anchor for the "Eden District (Westmorland and Furness)"
entry is invalid per MD051; update the fragment to match GitHub-style heading
IDs (lowercase, spaces → hyphens, strip parentheses/punctuation, replace & with
"and"), e.g. change the link target for "[Eden District (Westmorland and
Furness)](#...)" to "#eden-district-westmorland-and-furness" (and apply the same
normalization to the other problematic entries referenced at 173-175) or adjust
your TOC generator to produce GitHub-compatible anchors (a sketch of this normalization follows this list).
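A rough Python approximation of that GitHub anchor normalization, as referenced above (a sketch, not an exact port of github-slugger; it does not handle the numeric suffixes GitHub adds for duplicate headings):

```python
import re

def github_anchor(heading: str) -> str:
    """Approximate GitHub heading IDs: lowercase, drop punctuation
    (including parentheses and '&'), turn each space into a hyphen."""
    slug = heading.strip().lower()
    slug = re.sub(r"[^\w\s-]", "", slug)  # strip punctuation
    return re.sub(r"\s", "-", slug)       # each space becomes a hyphen

assert github_anchor("Eden District (Westmorland and Furness)") == (
    "eden-district-westmorland-and-furness"
)
```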
🧹 Nitpick comments (2)
uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py (2)

34-37: Raise explicit errors when expected DOM nodes are missing.
Right now a markup change will raise an AttributeError with little context; explicit failures make format changes easier to detect and debug.

🛠️ Suggested change
```diff
-        results = soup.find("div", {"class": "nearest-search-results"})
-        ol = results.find("ol")
-        bin_collections = ol.find_all("a")
+        results = soup.find("div", {"class": "nearest-search-results"})
+        if results is None:
+            raise ValueError("Unexpected LBHF markup: .nearest-search-results not found")
+        ol = results.find("ol")
+        if ol is None:
+            raise ValueError("Unexpected LBHF markup: results list not found")
+        bin_collections = ol.find_all("a")
+        if not bin_collections:
+            raise ValueError("Unexpected LBHF markup: no collection links found")
```
Based on learnings “In uk_bin_collection/**/*.py, when parsing council bin collection data, prefer explicit failures (raise exceptions on unexpected formats) over silent defaults or swallowed errors.”

6-6: Remove star import and use explicit imports to clarify dependencies and satisfy Ruff checks.

Star imports hide dependencies and trigger F403/F405 warnings in Ruff. Replace with explicit imports for only the symbols used.

🛠️ Suggested change
```diff
-from uk_bin_collection.uk_bin_collection.common import *
+from uk_bin_collection.uk_bin_collection.common import (
+    check_uprn,
+    date_format,
+    days_of_week,
+    get_next_day_of_week,
+)
```

Comment on lines +1411 to +1418

```json
"LondonBoroughHammersmithandFulham": {
    "postcode": "W12 0BQ",
    "url": "https://www.lbhf.gov.uk/",
    "wiki_command_url_override": "https://www.lbhf.gov.uk/",
    "wiki_name": "Hammersmith & Fulham",
    "wiki_note": "Pass only the property postcode",
    "LAD24CD": "E09000013"
},
```

⚠️ Potential issue | 🟡 Minor

Add skip_get_url: true to avoid a prefetch that can fail before parsing.
The parser performs its own request; without skip_get_url, a failed prefetch to the base URL can block the flow or add unnecessary latency. Consider adding skip_get_url: true so parse_data runs directly (the auto-generated wiki entry will then include -s).

🛠️ Suggested change
```diff
 "LondonBoroughHammersmithandFulham": {
     "postcode": "W12 0BQ",
+    "skip_get_url": true,
     "url": "https://www.lbhf.gov.uk/",
     "wiki_command_url_override": "https://www.lbhf.gov.uk/",
     "wiki_name": "Hammersmith & Fulham",
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

```json
"LondonBoroughHammersmithandFulham": {
    "postcode": "W12 0BQ",
    "skip_get_url": true,
    "url": "https://www.lbhf.gov.uk/",
    "wiki_command_url_override": "https://www.lbhf.gov.uk/",
    "wiki_name": "Hammersmith & Fulham",
    "wiki_note": "Pass only the property postcode",
    "LAD24CD": "E09000013"
},
```
🤖 Prompt for AI Agents
In `@uk_bin_collection/tests/input.json` around lines 1411 - 1418, The entry for
"LondonBoroughHammersmithandFulham" is missing skip_get_url and may trigger a
prefetch failure; add "skip_get_url": true to that object so the parser's
parse_data step runs directly and avoids the automatic GET to the base url (this
will also cause the auto-generated wiki entry to include -s). Update the JSON
object for LondonBoroughHammersmithandFulham to include the skip_get_url key
alongside postcode, url, wiki_command_url_override, wiki_name, wiki_note, and
LAD24CD.

Comment on lines 20 to 25
```python
user_postcode = kwargs.get("postcode")
check_uprn(user_postcode)
bindata = {"bins": []}

user_postcode = user_postcode.strip().replace(" ", "")
```


⚠️ Potential issue | 🟡 Minor

Validate postcode explicitly before calling .strip().
check_uprn doesn’t raise and prints a UPRN-specific message; a missing postcode currently results in an AttributeError later.

🛠️ Suggested change
```diff
-        user_postcode = kwargs.get("postcode")
-        check_uprn(user_postcode)
+        user_postcode = kwargs.get("postcode")
+        if not user_postcode:
+            raise ValueError("Postcode is required")
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

```python
user_postcode = kwargs.get("postcode")
if not user_postcode:
    raise ValueError("Postcode is required")
bindata = {"bins": []}

user_postcode = user_postcode.strip().replace(" ", "")
```
🧰 Tools
🪛 Ruff (0.14.14)

[error] 21-21: check_uprn may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`
around lines 20 - 25, Ensure the postcode is validated for presence and type
before calling .strip() and before invoking check_uprn: check that
kwargs.get("postcode") yields a non-empty string (e.g., not None) and raise or
return an explicit error if missing, then assign to user_postcode and call
user_postcode = user_postcode.strip().replace(" ", "") and only after that call
check_uprn(user_postcode) (or call check_uprn after validation but before
stripping if it expects raw input); update the logic around the user_postcode
variable and the check_uprn(...) call in LondonBoroughHammersmithandFulham.py to
avoid AttributeError when postcode is absent.

coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@uk_bin_collection/uk_bin_collection/councils/HarboroughDistrictCouncil.py`:
- Around line 26-37: The POST call incorrectly uses undefined names `URI` and
`headers` and bypasses the initialized session; change the POST to use the
session (session.post) and point it at the correct endpoint (`URI2`) and either
supply the existing headers object or remove the header param if not needed
(e.g., session.post(URI2, headers=headers, data=params, verify=False)); after
the POST call, call response.raise_for_status() to validate success. Update
references to `URI` -> `URI2` and ensure `headers` is defined in the scope of
the function before use (or omit it) so `HarboroughDistrictCouncil`'s POST uses
the session cookies established by the initial GET.

In `@wiki/Councils.md`:
- Around line 4287-4293: Update the Wirral example command and parameter list to
include the missing house-number flag (-n): in the example invocation `python
collect_data.py WirralCouncil ...` add `-n "HOUSE_NUMBER"` and update the
parameter bullet list to include `-n` - house number; also adjust the Note line
to explicitly mention both postcode and house number must be passed. This
ensures the example matches the explanatory text and the script's expected
flags.

Comment on lines +4287 to +4293
```bash
python collect_data.py WirralCouncil https://www.wirral.gov.uk -p "XXXX XXX" -w http://HOST:PORT/
```
Additional parameters:
- `-u` - UPRN
- `-p` - postcode
- `-w` - remote Selenium web driver URL (required for Home Assistant)

Note: In the `uprn` field, enter your street name and suburb separated by a comma (e.g., 'Vernon Avenue,Seacombe').
Note: Pass your postcode and house number.

⚠️ Potential issue | 🟡 Minor

Add the missing house-number parameter in the Wirral command.

The note says to pass a house number, but the command and parameter list omit -n.

🛠️ Suggested fix
```diff
-python collect_data.py WirralCouncil https://www.wirral.gov.uk -p "XXXX XXX" -w http://HOST:PORT/
+python collect_data.py WirralCouncil https://www.wirral.gov.uk -p "XXXX XXX" -n XX -w http://HOST:PORT/
-Additional parameters:
-- `-p` - postcode
-- `-w` - remote Selenium web driver URL (required for Home Assistant)
+Additional parameters:
+- `-p` - postcode
+- `-n` - house number
+- `-w` - remote Selenium web driver URL (required for Home Assistant)
```
🤖 Prompt for AI Agents
In `@wiki/Councils.md` around lines 4287 - 4293, Update the Wirral example command
and parameter list to include the missing house-number flag (-n): in the example
invocation `python collect_data.py WirralCouncil ...` add `-n "HOUSE_NUMBER"`
and update the parameter bullet list to include `-n` - house number; also adjust
the Note line to explicitly mention both postcode and house number must be
passed. This ensures the example matches the explanatory text and the script's
expected flags.

coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
uk_bin_collection/uk_bin_collection/councils/PowysCouncil.py (1)

128-145: ⚠️ Potential issue | 🟠 Major

Don’t swallow garden‑waste parsing errors; fail fast with context.

The broad except currently drops invalid dates silently, which hides site format changes and returns incomplete data. Raise a clear error when parsing fails so upstream can detect and handle breakages.

Suggested fix
```diff
-            for date in garden_waste_dates:
-                try:
-                    dict_data = {
-                        "type": "Garden Waste",
-                        "collectionDate": datetime.strptime(
-                            remove_ordinal_indicator_from_date_string(
-                                date.split(" (")[0]
-                            ),
-                            "%A %d %B %Y",
-                        ).strftime(date_format),
-                    }
-                    data["bins"].append(dict_data)
-                except:
-                    continue
+            for date in garden_waste_dates:
+                clean_date = remove_ordinal_indicator_from_date_string(
+                    date.split(" (")[0]
+                )
+                try:
+                    parsed_date = datetime.strptime(clean_date, "%A %d %B %Y")
+                except ValueError as exc:
+                    raise ValueError(
+                        f"Unexpected garden waste date format: {date}"
+                    ) from exc
+                data["bins"].append(
+                    {
+                        "type": "Garden Waste",
+                        "collectionDate": parsed_date.strftime(date_format),
+                    }
+                )
```

Based on learnings: In uk_bin_collection/**/*.py, prefer explicit failures on unexpected formats over swallowed errors.

🤖 Fix all issues with AI agents
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`:
- Around line 34-44: Check that the parsed HTML nodes exist and that each entry
has the expected " - " format: after creating soup, assert results is not None
and ol is not None (raise ValueError with a clear message like
"nearest-search-results container missing" or "ordered list missing"), then when
iterating bin_collections validate bin_collection.get_text() splits into at
least two parts before assigning to collection_day and collection_type and raise
ValueError with a descriptive message (e.g., "unexpected bin collection entry
format: '<text>'") if not; reference the variables/results and the loop
assigning collection_day and collection_type to locate where to add these checks
and failures.

Comment on lines +34 to +44
```python
soup = BeautifulSoup(response.content, features="html.parser")
results = soup.find("div", {"class": "nearest-search-results"})
ol = results.find("ol")
bin_collections = ol.find_all("a")

today = datetime.now().strftime("%A")

for bin_collection in bin_collections:
    collection_day = bin_collection.get_text().split(" - ")[0]
    collection_type = bin_collection.get_text().split(" - ")[1]
```


⚠️ Potential issue | 🟡 Minor

Fail explicitly on unexpected HTML structure or entry format.

Right now results, ol, or malformed entry text will throw less-informative errors. Make the failure explicit to surface site changes quickly.

🛠️ Suggested change
```diff
-        results = soup.find("div", {"class": "nearest-search-results"})
-        ol = results.find("ol")
-        bin_collections = ol.find_all("a")
+        results = soup.find("div", {"class": "nearest-search-results"})
+        if results is None:
+            raise ValueError("Could not find nearest search results section.")
+        ol = results.find("ol")
+        if ol is None:
+            raise ValueError("Could not find bin collections list.")
+        bin_collections = ol.find_all("a")
 ...
-            collection_day = bin_collection.get_text().split(" - ")[0]
-            collection_type = bin_collection.get_text().split(" - ")[1]
+            parts = bin_collection.get_text().split(" - ")
+            if len(parts) != 2:
+                raise ValueError(
+                    f"Unexpected bin collection entry format: {bin_collection.get_text()!r}"
+                )
+            collection_day, collection_type = parts
```

Based on learnings, in uk_bin_collection/**/*.py, prefer explicit failures on unexpected formats with clear exception types and messages.

🤖 Prompt for AI Agents
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`
around lines 34 - 44, Check that the parsed HTML nodes exist and that each entry
has the expected " - " format: after creating soup, assert results is not None
and ol is not None (raise ValueError with a clear message like
"nearest-search-results container missing" or "ordered list missing"), then when
iterating bin_collections validate bin_collection.get_text() splits into at
least two parts before assigning to collection_day and collection_type and raise
ValueError with a descriptive message (e.g., "unexpected bin collection entry
format: '<text>'") if not; reference the variables/results and the loop
assigning collection_day and collection_type to locate where to add these checks
and failures.

fix: robbrad#1845 - Mid Suffolk District Council
fix: robbrad#1851 Bromley Borough Council
coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py`:
- Around line 33-34: The hardcoded user_agent string assigned to user_agent
before calling create_webdriver is stuck at Chrome 120; update it to a current
Chrome version (e.g., Chrome/143) or, better, replace the literal with the
shared constant/pattern used elsewhere (extract to and import the UA constant
used in get_bin_data.py) so BromleyBoroughCouncil.py sets user_agent
consistently before calling create_webdriver(web_driver, headless, user_agent,
__name__); ensure the new value follows the same format as other scrapers and
update any imports if you extract the constant.

Comment on lines +33 to +34
```python
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
driver = create_webdriver(web_driver, headless, user_agent, __name__)
```

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

```bash
#!/bin/bash
# Check all hardcoded user_agent strings in council scrapers
rg -n 'user_agent\s*=' --type=py uk_bin_collection/uk_bin_collection/councils/ -A 1 | head -50
```

Repository: robbrad/UKBinCollectionData

Length of output: 5269


🏁 Script executed:

```bash
#!/bin/bash
# Check what Chrome versions are used in the codebase
rg 'Chrome/[0-9]+' --type=py uk_bin_collection/uk_bin_collection/councils/ | sort | uniq -c
```

Repository: robbrad/UKBinCollectionData

Length of output: 13582


🏁 Script executed:

```bash
#!/bin/bash
# Check if there are any shared constants or utilities for user_agent
rg -n 'user_agent|USER_AGENT' --type=py uk_bin_collection/uk_bin_collection/ | grep -E '(const|CONSTANT|=)' | head -20
```

Repository: robbrad/UKBinCollectionData

Length of output: 3004


Hardcoded user-agent references Chrome 120, which is outdated (14 months old, with Chrome 143+ in active use elsewhere in the codebase).

Chrome 120 was released in December 2023. Multiple other council scrapers in the codebase have since moved to Chrome 121–143. Consider updating to a more current version (e.g., Chrome 141–143) to stay aligned with contemporary releases, or extract this into a shared constant similar to the pattern in get_bin_data.py.

Example UA bump
```diff
-            user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
+            user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36"
```
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

```python
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36"
driver = create_webdriver(web_driver, headless, user_agent, __name__)
```
🧰 Tools
🪛 Ruff (0.15.0)

[error] 34-34: create_webdriver may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents
In `@uk_bin_collection/uk_bin_collection/councils/BromleyBoroughCouncil.py` around
lines 33 - 34, The hardcoded user_agent string assigned to user_agent before
calling create_webdriver is stuck at Chrome 120; update it to a current Chrome
version (e.g., Chrome/143) or, better, replace the literal with the shared
constant/pattern used elsewhere (extract to and import the UA constant used in
get_bin_data.py) so BromleyBoroughCouncil.py sets user_agent consistently before
calling create_webdriver(web_driver, headless, user_agent, __name__); ensure the
new value follows the same format as other scrapers and update any imports if
you extract the constant.
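The constant-extraction alternative the prompt suggests could look like the sketch below; the module name and import path are hypothetical, and per the prompt the existing UA string lives in get_bin_data.py:

```python
# uk_bin_collection/uk_bin_collection/user_agent.py (hypothetical module)
DEFAULT_USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/143.0.0.0 Safari/537.36"
)

# BromleyBoroughCouncil.py would then import it instead of hardcoding:
# from uk_bin_collection.uk_bin_collection.user_agent import DEFAULT_USER_AGENT
# driver = create_webdriver(web_driver, headless, DEFAULT_USER_AGENT, __name__)
```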

fix: robbrad#1853 - Wakefield City Council
fix: robbrad#1848 - Redcar and Cleveland Council
fix: robbrad#1855 - Barking & Dagenham
fix: robbrad#1858 - Cumberland Council
fix: robbrad#1861 - North East Derbyshire District Council