Skip to content

77 library size exceeding pypi size limit of 100mb#81

Open
Herald-TUOS wants to merge 9 commits intodevelopfrom
77-library-size-exceeding-pypi-size-limit-of-100mb
Open

77 library size exceeding pypi size limit of 100mb#81
Herald-TUOS wants to merge 9 commits intodevelopfrom
77-library-size-exceeding-pypi-size-limit-of-100mb

Conversation

@Herald-TUOS
Copy link
Copy Markdown
Contributor

No description provided.

@Herald-TUOS Herald-TUOS self-assigned this May 5, 2026
@Herald-TUOS Herald-TUOS linked an issue May 5, 2026 that may be closed by this pull request
Copy link
Copy Markdown

@AndrewCRichards AndrewCRichards left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a few comments against parts of the code - I acknowledge I'm being a bit fussy with a couple of those, see what you think.

Comment thread geocode/utilities.py Outdated
"""
with py7zr.SevenZipFile(str(archive_path), mode="r") as archive:
targets = [name for name in archive.getnames() if name == target_filename]
archive.extract(path=tmp_dir, targets=targets)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although only a single file is being extracted, the logic here includes a loop on archive.getnames() - I'd prefer to see this coded as a try...except block, in particular I wonder if that would work if it went around archive.extract(path=tmp_dir, targets=[target_filename]) so you'd not need to create the short-lived targets variable at all. This approach would probably make it clearer what happens when target_filename doesn't exist in the archive, assuming the except block includes some helpful text in the resulting exception.
....oh, in fact I see you have coded something like this in the very next function read_csv_from_7z, so not sure why you aren't doing the same here

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in 7143ab3
I've not added a try ... execpt block, but removed the extra targets variable. I want the library to raise an error and fail if the specified file isn't in the archive

Comment thread geocode/ons_nrs.py Outdated
zip_path_2011 = self.data_dir.joinpath("nrs_2011.zip")
zip_path_2021 = self.data_dir.joinpath("nrs_2021.zip")
zip_path_2011 = self.data_dir.joinpath("nrs_2011.7z")
zip_path_2021 = self.data_dir.joinpath("nrs_2021.7z")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These variable names are OK and do provide consistency with the prior version of the code, but (e.g.) sevenzip_path_2011 would be more accurate and less likely to mislead since the new archives aren't in zip format, so I would prefer to see these renamed (but you may disagree, don't feel obliged)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with 6dce946

Comment thread geocode/utilities.py


def extract_from_7z(archive_path: Path, target_filename: str, tmp_dir: str):
"""
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the risk of being overly picky, I suggest file_to_extract is clearer than target_filename since the latter suggests the output name , not necessarily the archive filename

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved in 7143ab3

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Library size exceeding PyPi size limit of 100MB

2 participants