Skip to content
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
Show all changes
25 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 8 additions & 0 deletions uk_bin_collection/tests/input.json
Original file line number Diff line number Diff line change
Expand Up @@ -1408,6 +1408,14 @@
"wiki_note": "Pass the UPRN. You can find it using [FindMyAddress](https://www.findmyaddress.co.uk/search).",
"LAD24CD": "E09000009"
},
"LondonBoroughHammersmithandFulham": {
"postcode": "W12 0BQ",
"url": "https://www.lbhf.gov.uk/",
"wiki_command_url_override": "https://www.lbhf.gov.uk/",
"wiki_name": "Hammersmith & Fulham",
"wiki_note": "Pass only the property postcode",
"LAD24CD": "E09000013"
},
Comment on lines +1420 to +1427
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add skip_get_url: true to avoid a prefetch that can fail before parsing.
The parser performs its own request; without skip_get_url, a failed prefetch to the base URL can block the flow or add unnecessary latency. Consider adding skip_get_url: true so parse_data runs directly (the auto-generated wiki entry will then include -s).

🛠️ Suggested change
 "LondonBoroughHammersmithandFulham": {
     "postcode": "W12 0BQ",
+    "skip_get_url": true,
     "url": "https://www.lbhf.gov.uk/",
     "wiki_command_url_override": "https://www.lbhf.gov.uk/",
     "wiki_name": "Hammersmith & Fulham",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"LondonBoroughHammersmithandFulham": {
"postcode": "W12 0BQ",
"url": "https://www.lbhf.gov.uk/",
"wiki_command_url_override": "https://www.lbhf.gov.uk/",
"wiki_name": "Hammersmith & Fulham",
"wiki_note": "Pass only the property postcode",
"LAD24CD": "E09000013"
},
"LondonBoroughHammersmithandFulham": {
"postcode": "W12 0BQ",
"skip_get_url": true,
"url": "https://www.lbhf.gov.uk/",
"wiki_command_url_override": "https://www.lbhf.gov.uk/",
"wiki_name": "Hammersmith & Fulham",
"wiki_note": "Pass only the property postcode",
"LAD24CD": "E09000013"
},
🤖 Prompt for AI Agents
In `@uk_bin_collection/tests/input.json` around lines 1411 - 1418, The entry for
"LondonBoroughHammersmithandFulham" is missing skip_get_url and may trigger a
prefetch failure; add "skip_get_url": true to that object so the parser's
parse_data step runs directly and avoids the automatic GET to the base url (this
will also cause the auto-generated wiki entry to include -s). Update the JSON
object for LondonBoroughHammersmithandFulham to include the skip_get_url key
alongside postcode, url, wiki_command_url_override, wiki_name, wiki_note, and
LAD24CD.

"LondonBoroughHarrow": {
"uprn": "100021298754",
"url": "https://www.harrow.gov.uk",
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -19,15 +19,20 @@ def parse_data(self, page: str, **kwargs) -> dict:
check_uprn(user_uprn)
bindata = {"bins": []}

URI = "https://harborough.fccenvironment.co.uk/detail-address"
URI1 = "https://harborough.fccenvironment.co.uk/"
URI2 = "https://harborough.fccenvironment.co.uk/detail-address"

# Make the GET request
session = requests.session()
response = session.get(
URI1, verify=False
) # Initialize session state (cookies) required by URI2
response.raise_for_status() # Validate session initialization

headers = {
"Content-Type": "application/json",
"User-Agent": "Mozilla/5.0",
"Referer": "https://harborough.fccenvironment.co.uk/",
}
params = {"Uprn": user_uprn}
response = requests.post(URI, headers=headers, json=params)

response = session.post(URI2, json=params, verify=False)
response.raise_for_status() # Raise HTTPError for bad status codes

soup = BeautifulSoup(response.content, features="html.parser")
bin_collection = soup.find(
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
from datetime import datetime

import requests
from bs4 import BeautifulSoup

from uk_bin_collection.uk_bin_collection.common import *
from uk_bin_collection.uk_bin_collection.get_bin_data import AbstractGetBinDataClass


# import the wonderful Beautiful Soup and the URL grabber
class CouncilClass(AbstractGetBinDataClass):
"""
Concrete classes have to implement all abstract operations of the
base class. They can also override some operations with a default
implementation.
"""

def parse_data(self, page: str, **kwargs) -> dict:

user_postcode = kwargs.get("postcode")
check_uprn(user_postcode)
bindata = {"bins": []}

user_postcode = user_postcode.strip().replace(" ", "")

Comment on lines +20 to +25
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Validate postcode explicitly before calling .strip().
check_uprn doesn’t raise and prints a UPRN-specific message; a missing postcode currently results in an AttributeError later.

🛠️ Suggested change
-        user_postcode = kwargs.get("postcode")
-        check_uprn(user_postcode)
+        user_postcode = kwargs.get("postcode")
+        if not user_postcode:
+            raise ValueError("Postcode is required")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
user_postcode = kwargs.get("postcode")
check_uprn(user_postcode)
bindata = {"bins": []}
user_postcode = user_postcode.strip().replace(" ", "")
user_postcode = kwargs.get("postcode")
if not user_postcode:
raise ValueError("Postcode is required")
bindata = {"bins": []}
user_postcode = user_postcode.strip().replace(" ", "")
🧰 Tools
🪛 Ruff (0.14.14)

[error] 21-21: check_uprn may be undefined, or defined from star imports

(F405)

🤖 Prompt for AI Agents
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`
around lines 20 - 25, Ensure the postcode is validated for presence and type
before calling .strip() and before invoking check_uprn: check that
kwargs.get("postcode") yields a non-empty string (e.g., not None) and raise or
return an explicit error if missing, then assign to user_postcode and call
user_postcode = user_postcode.strip().replace(" ", "") and only after that call
check_uprn(user_postcode) (or call check_uprn after validation but before
stripping if it expects raw input); update the logic around the user_postcode
variable and the check_uprn(...) call in LondonBoroughHammersmithandFulham.py to
avoid AttributeError when postcode is absent.

URI = f"https://www.lbhf.gov.uk/bin-recycling-day/results?postcode={user_postcode}"
UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/143.0.0.0 Safari/537.36"
session = requests.session()
session.headers.update({"User-Agent": UA})
# Make the GET request
response = session.get(URI)
response.raise_for_status()

soup = BeautifulSoup(response.content, features="html.parser")
results = soup.find("div", {"class": "nearest-search-results"})
ol = results.find("ol")
bin_collections = ol.find_all("a")
for bin_collection in bin_collections:
collection_day = bin_collection.get_text().split(" - ")[0]
collection_type = bin_collection.get_text().split(" - ")[1]

Comment on lines +34 to +44
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fail explicitly on unexpected HTML structure or entry format.

Right now results, ol, or malformed entry text will throw less-informative errors. Make the failure explicit to surface site changes quickly.

🛠️ Suggested change
-        results = soup.find("div", {"class": "nearest-search-results"})
-        ol = results.find("ol")
-        bin_collections = ol.find_all("a")
+        results = soup.find("div", {"class": "nearest-search-results"})
+        if results is None:
+            raise ValueError("Could not find nearest search results section.")
+        ol = results.find("ol")
+        if ol is None:
+            raise ValueError("Could not find bin collections list.")
+        bin_collections = ol.find_all("a")
...
-            collection_day = bin_collection.get_text().split(" - ")[0]
-            collection_type = bin_collection.get_text().split(" - ")[1]
+            parts = bin_collection.get_text().split(" - ")
+            if len(parts) != 2:
+                raise ValueError(
+                    f"Unexpected bin collection entry format: {bin_collection.get_text()!r}"
+                )
+            collection_day, collection_type = parts

Based on learnings, in uk_bin_collection/**/*.py, prefer explicit failures on unexpected formats with clear exception types and messages.

🤖 Prompt for AI Agents
In
`@uk_bin_collection/uk_bin_collection/councils/LondonBoroughHammersmithandFulham.py`
around lines 34 - 44, Check that the parsed HTML nodes exist and that each entry
has the expected " - " format: after creating soup, assert results is not None
and ol is not None (raise ValueError with a clear message like
"nearest-search-results container missing" or "ordered list missing"), then when
iterating bin_collections validate bin_collection.get_text() splits into at
least two parts before assigning to collection_day and collection_type and raise
ValueError with a descriptive message (e.g., "unexpected bin collection entry
format: '<text>'") if not; reference the variables/results and the loop
assigning collection_day and collection_type to locate where to add these checks
and failures.

if days_of_week.get(collection_day) == 0:
collection_day = datetime.now().strftime(date_format)
else:
collection_day = get_next_day_of_week(collection_day)
Comment thread
coderabbitai[bot] marked this conversation as resolved.

dict_data = {
"type": collection_type,
"collectionDate": collection_day,
}
bindata["bins"].append(dict_data)

bindata["bins"].sort(
key=lambda x: datetime.strptime(x.get("collectionDate"), "%d/%m/%Y")
)

return bindata
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ def parse_data(self, page: str, **kwargs) -> dict:

address_link.send_keys(Keys.ENTER)

address_results = wait.until(
wait.until(
EC.presence_of_element_located(
(By.CLASS_NAME, "your-collection-schedule-container")
)
Expand All @@ -71,16 +71,16 @@ def parse_data(self, page: str, **kwargs) -> dict:
soup = BeautifulSoup(driver.page_source, features="html.parser")
data = {"bins": []}

# Get the current month and year
current_month = datetime.now().month
current_year = datetime.now().year

# Function to extract collection data
def extract_collection_data(collection_div, collection_type):
if collection_div:
date_element = (
collection_div.find(class_="recycling-collection-day-numeric")
or collection_div.find(class_="refuse-collection-day-numeric")
collection_div.find(
class_="refuse-garden-collection-day-numeric"
)
or collection_div.find(
class_="recycling-garden-collection-day-numeric"
)
or collection_div.find(class_="garden-collection-day-numeric")
)
month_element = (
Expand All @@ -94,27 +94,14 @@ def extract_collection_data(collection_div, collection_type):
collection_month = month_element.get_text(strip=True)

# Combine month, date, and year into a string
date_string = (
f"{collection_date} {collection_month} {current_year}"
)
date_string = f"{collection_date} {collection_month}"

try:
# Convert the date string to a datetime object
collection_date_obj = datetime.strptime(
formatted_date = datetime.strptime(
date_string, "%d %B %Y"
)

# Check if the month is ahead of the current month
if collection_date_obj.month >= current_month:
# If the month is ahead, use the current year
formatted_date = collection_date_obj.strftime(
date_format
)
else:
# If the month is before the current month, use the next year
formatted_date = collection_date_obj.replace(
year=current_year + 1
).strftime(date_format)
).strftime(date_format)
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated

# Create a dictionary for each collection entry
dict_data = {
"type": collection_type,
Expand All @@ -128,25 +115,12 @@ def extract_collection_data(collection_div, collection_type):
# Handle the case where the date format is invalid
formatted_date = "Invalid Date Format"

# Extract Recycling collection data
extract_collection_data(
soup.find(
class_="container-fluid RegularCollectionDay"
).find_next_sibling("div"),
"Recycling",
)

# Extract Refuse collection data
for refuse_div in soup.find_all(
class_="container-fluid RegularCollectionDay"
):
extract_collection_data(refuse_div, "Refuse")

# Extract Garden Waste collection data
extract_collection_data(
soup.find(class_="container-fluid gardenwasteCollectionDay"),
"Garden Waste",
)
container_fluid = soup.find_all(class_="container-fluid")
for container in container_fluid:
collection_type = container.find("h3").get_text().strip()
collections = container.find_all(class_="bs3-col-sm-2")
for collection in collections:
extract_collection_data(collection, collection_type)
Comment thread
coderabbitai[bot] marked this conversation as resolved.
Outdated

# Print the extracted data
except Exception as e:
Expand Down
77 changes: 46 additions & 31 deletions wiki/Councils.md
Original file line number Diff line number Diff line change
Expand Up @@ -76,14 +76,12 @@ This document is still a work in progress, don't worry if your council isn't lis
- [Chorley](#chorley)
- [Colchester](#colchester)
- [Conwy](#conwy)
- [Copeland](#copeland)
- [Cornwall](#cornwall)
- [Cotswold](#cotswold)
- [Coventry](#coventry)
- [Crawley](#crawley)
- [Croydon](#croydon)
- [Cumberland](#cumberland)
- [Cumberland](#cumberland)
- [Dacorum](#dacorum)
- [Darlington Borough Council](#darlington-borough-council)
- [Dartford](#dartford)
Expand Down Expand Up @@ -111,6 +109,7 @@ This document is still a work in progress, don't worry if your council isn't lis
- [East Staffordshire](#east-staffordshire)
- [East Suffolk](#east-suffolk)
- [Eastleigh](#eastleigh)
- [Eden District (Westmorland and Furness)](#eden-district-(westmorland-and-furness))
Comment thread
coderabbitai[bot] marked this conversation as resolved.
- [City of Edinburgh](#city-of-edinburgh)
- [Elmbridge](#elmbridge)
- [Enfield](#enfield)
Expand Down Expand Up @@ -171,7 +170,9 @@ This document is still a work in progress, don't worry if your council isn't lis
- [City of Lincoln](#city-of-lincoln)
- [Lisburn and Castlereagh](#lisburn-and-castlereagh)
- [Liverpool](#liverpool)
- [Camden](#camden)
- [Ealing](#ealing)
- [Hammersmith & Fulham](#hammersmith-&-fulham)
- [Harrow](#harrow)
- [Havering](#havering)
- [Hounslow](#hounslow)
Expand Down Expand Up @@ -628,13 +629,14 @@ Note: You will need to use [FindMyAddress](https://www.findmyaddress.co.uk/searc

### Bexley
```commandline
python collect_data.py BexleyCouncil https://waste.bexley.gov.uk/waste -s -u XXXXXXXX
python collect_data.py BexleyCouncil https://waste.bexley.gov.uk/waste -s -u XXXXXXXX -w http://HOST:PORT/
```
Additional parameters:
- `-s` - skip get URL
- `-u` - UPRN
- `-w` - remote Selenium web driver URL (required for Home Assistant)

Note: Provide your UPRN. Use [FindMyAddress](https://www.findmyaddress.co.uk/search) to locate it.
Note: Provide your UPRN. Use [FindMyAddress](https://www.findmyaddress.co.uk/search) to locate it. This parser requires a Selenium webdriver.

---

Expand Down Expand Up @@ -1156,17 +1158,6 @@ Note: Use [FindMyAddress](https://www.findmyaddress.co.uk/search) to find your U

---

### Copeland
```commandline
python collect_data.py CopelandBoroughCouncil https://www.copeland.gov.uk -u XXXXXXXX
```
Additional parameters:
- `-u` - UPRN

Note: *****This has now been replaced by Cumberland Council****

---

### Cornwall
```commandline
python collect_data.py CornwallCouncil https://www.cornwall.gov.uk/my-area/ -s -u XXXXXXXX
Expand Down Expand Up @@ -1229,18 +1220,6 @@ Note: Pass the house number and postcode in their respective parameters. This pa

---

### Cumberland
```commandline
python collect_data.py CumberlandAllerdaleCouncil https://www.allerdale.gov.uk -p "XXXX XXX" -n XX
```
Additional parameters:
- `-p` - postcode
- `-n` - house number

Note: Pass the house number and postcode in their respective parameters.

---

### Cumberland
```commandline
python collect_data.py CumberlandCouncil https://www.cumberland.gov.uk/bins-recycling-and-street-cleaning/waste-collections/bin-collection-schedule -u XXXXXXXX
Expand Down Expand Up @@ -1579,6 +1558,17 @@ Note: Pass the UPRN. You can find it using [FindMyAddress](https://www.findmyadd

---

### Eden District (Westmorland and Furness)
```commandline
python collect_data.py EdenDistrictCouncil https://my.eden.gov.uk/myeden.aspx -u XXXXXXXX
```
Additional parameters:
- `-u` - UPRN

Note: For Eden area addresses within Westmorland and Furness. Provide your UPRN. You can find your UPRN using [FindMyAddress](https://www.findmyaddress.co.uk/search). Note: This returns collection days (e.g., 'Wednesday') rather than specific dates.

---

### City of Edinburgh
```commandline
python collect_data.py EdinburghCityCouncil https://www.edinburgh.gov.uk -s -p "XXXX XXX" -n XX
Expand Down Expand Up @@ -2314,6 +2304,19 @@ Note: You will need to use [FindMyAddress](https://www.findmyaddress.co.uk/searc

---

### Camden
```commandline
python collect_data.py LondonBoroughCamdenCouncil https://environmentservices.camden.gov.uk/property -s -u XXXXXXXX -p "XXXX XXX"
```
Additional parameters:
- `-s` - skip get URL
- `-u` - UPRN
- `-p` - postcode

Note: Pass the property ID as UPRN. Find your property at https://www.camden.gov.uk/check-collection-day then use the property ID from the URL (e.g., https://environmentservices.camden.gov.uk/property/5063139).

---

### Ealing
```commandline
python collect_data.py LondonBoroughEaling https://www.ealing.gov.uk/site/custom_scripts/WasteCollectionWS/home/FindCollection -s -u XXXXXXXX
Expand All @@ -2326,6 +2329,17 @@ Note: Pass the UPRN. You can find it using [FindMyAddress](https://www.findmyadd

---

### Hammersmith & Fulham
```commandline
python collect_data.py LondonBoroughHammersmithandFulham https://www.lbhf.gov.uk/ -p "XXXX XXX"
```
Additional parameters:
- `-p` - postcode

Note: Pass only the property postcode

---

### Harrow
```commandline
python collect_data.py LondonBoroughHarrow https://www.harrow.gov.uk -u XXXXXXXX
Expand Down Expand Up @@ -3468,7 +3482,7 @@ Note: Follow the instructions [here](https://www.southlanarkshire.gov.uk/info/20

### South Norfolk
```commandline
python collect_data.py SouthNorfolkCouncil https://www.southnorfolkandbroadland.gov.uk/rubbish-recycling/south-norfolk-bin-collection-day-finder -s -u XXXXXXXX
python collect_data.py SouthNorfolkCouncil https://area.southnorfolkandbroadland.gov.uk/FindAddress -s -u XXXXXXXX
```
Additional parameters:
- `-s` - skip get URL
Expand Down Expand Up @@ -4257,12 +4271,13 @@ Note: Provide your UPRN. You can find it using [FindMyAddress](https://www.findm

### Wirral
```commandline
python collect_data.py WirralCouncil https://www.wirral.gov.uk -u XXXXXXXX
python collect_data.py WirralCouncil https://www.wirral.gov.uk -p "XXXX XXX" -w http://HOST:PORT/
```
Additional parameters:
- `-u` - UPRN
- `-p` - postcode
- `-w` - remote Selenium web driver URL (required for Home Assistant)

Note: In the `uprn` field, enter your street name and suburb separated by a comma (e.g., 'Vernon Avenue,Seacombe').
Note: Pass your postcode and house number.
Comment on lines +4298 to +4304
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add the missing house-number parameter in the Wirral command.

The note says to pass a house number, but the command and parameter list omit -n.

🛠️ Suggested fix
-python collect_data.py WirralCouncil https://www.wirral.gov.uk -p "XXXX XXX" -w http://HOST:PORT/
+python collect_data.py WirralCouncil https://www.wirral.gov.uk -p "XXXX XXX" -n XX -w http://HOST:PORT/
-Additional parameters:
-- `-p` - postcode
-- `-w` - remote Selenium web driver URL (required for Home Assistant)
+Additional parameters:
+- `-p` - postcode
+- `-n` - house number
+- `-w` - remote Selenium web driver URL (required for Home Assistant)
🤖 Prompt for AI Agents
In `@wiki/Councils.md` around lines 4287 - 4293, Update the Wirral example command
and parameter list to include the missing house-number flag (-n): in the example
invocation `python collect_data.py WirralCouncil ...` add `-n "HOUSE_NUMBER"`
and update the parameter bullet list to include `-n` - house number; also adjust
the Note line to explicitly mention both postcode and house number must be
passed. This ensures the example matches the explanatory text and the script's
expected flags.


---

Expand Down