77 library size exceeding pypi size limit of 100mb#81
77 library size exceeding pypi size limit of 100mb#81Herald-TUOS wants to merge 9 commits intodevelopfrom
Conversation
AndrewCRichards
left a comment
There was a problem hiding this comment.
I've added a few comments against parts of the code - I acknowledge I'm being a bit fussy with a couple of those, see what you think.
| """ | ||
| with py7zr.SevenZipFile(str(archive_path), mode="r") as archive: | ||
| targets = [name for name in archive.getnames() if name == target_filename] | ||
| archive.extract(path=tmp_dir, targets=targets) |
There was a problem hiding this comment.
Although only a single file is being extracted, the logic here includes a loop on archive.getnames() - I'd prefer to see this coded as a try...except block, in particular I wonder if that would work if it went around archive.extract(path=tmp_dir, targets=[target_filename]) so you'd not need to create the short-lived targets variable at all. This approach would probably make it clearer what happens when target_filename doesn't exist in the archive, assuming the except block includes some helpful text in the resulting exception.
....oh, in fact I see you have coded something like this in the very next function read_csv_from_7z, so not sure why you aren't doing the same here
There was a problem hiding this comment.
resolved in 7143ab3
I've not added a try ... execpt block, but removed the extra targets variable. I want the library to raise an error and fail if the specified file isn't in the archive
| zip_path_2011 = self.data_dir.joinpath("nrs_2011.zip") | ||
| zip_path_2021 = self.data_dir.joinpath("nrs_2021.zip") | ||
| zip_path_2011 = self.data_dir.joinpath("nrs_2011.7z") | ||
| zip_path_2021 = self.data_dir.joinpath("nrs_2021.7z") |
There was a problem hiding this comment.
These variable names are OK and do provide consistency with the prior version of the code, but (e.g.) sevenzip_path_2011 would be more accurate and less likely to mislead since the new archives aren't in zip format, so I would prefer to see these renamed (but you may disagree, don't feel obliged)
|
|
||
|
|
||
| def extract_from_7z(archive_path: Path, target_filename: str, tmp_dir: str): | ||
| """ |
There was a problem hiding this comment.
At the risk of being overly picky, I suggest file_to_extract is clearer than target_filename since the latter suggests the output name , not necessarily the archive filename
No description provided.