Add support for Reference Fix Commits improver #2163

Open
ziadhany wants to merge 8 commits into aboutcode-org:main from ziadhany:collect-package-patch

Conversation

@ziadhany ziadhany force-pushed the collect-package-patch branch from b9b3fe1 to ae2b35a Compare February 13, 2026 23:21
@ziadhany ziadhany marked this pull request as ready for review February 13, 2026 23:46
Update the pipeline and fix the test

Signed-off-by: ziad hany <ziadhany2016@gmail.com>
@ziadhany ziadhany force-pushed the collect-package-patch branch from ae2b35a to 8bf03c1 Compare February 20, 2026 02:06
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
@ziadhany ziadhany force-pushed the collect-package-patch branch from 8bf03c1 to 374de5e Compare February 20, 2026 02:54
ziadhany commented Feb 20, 2026

Pipeline Logs:

Improving data using collect_ref_fix_commits_v2
INFO 2026-02-28 00:10:48.158093 UTC Pipeline [CollectReferencesFixCommitsPipeline] starting
INFO 2026-02-28 00:10:48.158204 UTC Step [collect_and_store_fix_commits] starting
INFO 2026-02-28 00:11:09.194192 UTC Progress: 10% (34724/347233) ETA: 189 seconds (3.1 minutes)
INFO 2026-02-28 00:11:33.339578 UTC Progress: 20% (69447/347233) ETA: 181 seconds (3.0 minutes)
INFO 2026-02-28 00:12:08.272976 UTC Progress: 30% (104170/347233) ETA: 187 seconds (3.1 minutes)
INFO 2026-02-28 00:12:24.619073 UTC Progress: 40% (138894/347233) ETA: 145 seconds (2.4 minutes)
INFO 2026-02-28 00:12:40.194570 UTC Progress: 50% (173617/347233) ETA: 112 seconds (1.9 minutes)
INFO 2026-02-28 00:12:57.404996 UTC Progress: 60% (208340/347233) ETA: 86 seconds (1.4 minutes)
INFO 2026-02-28 00:13:13.046547 UTC Progress: 70% (243064/347233) ETA: 62 seconds (1.0 minutes)
INFO 2026-02-28 00:13:29.600672 UTC Progress: 80% (277787/347233) ETA: 40 seconds
INFO 2026-02-28 00:13:48.119435 UTC Progress: 90% (312510/347233) ETA: 20 seconds
INFO 2026-02-28 00:14:04.763267 UTC Progress: 100% (347233/347233)
INFO 2026-02-28 00:14:05.313832 UTC Successfully processed pkg patch commit 51,206
INFO 2026-02-28 00:14:05.313951 UTC Step [collect_and_store_fix_commits] completed in 197 seconds (3.3 minutes)
INFO 2026-02-28 00:14:05.314000 UTC Pipeline completed in 197 seconds (3.3 minutes)

Process finished with exit code 0

from vulnerabilities.models import PackageCommitPatch
PackageCommitPatch.objects.count()
Out[3]: 37649

@ziadhany ziadhany requested a review from TG1999 February 20, 2026 02:58
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
None
"""
purl = url2purl(url)
if not purl or purl.type not in VCS_URLS_SUPPORTED_TYPES:
Contributor

What's the reason for the purl.type not in VCS_URLS_SUPPORTED_TYPES check, when we already call url2purl(url) to ensure we get a package? Also, put this in a try/except block and log failures with a logger.
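
A minimal sketch of what the requested guard might look like, not the PR's actual code. The names safe_vcs_purl and StubPurl are hypothetical, the contents of VCS_URLS_SUPPORTED_TYPES are assumed for illustration, and the parse argument stands in for url2purl so the snippet runs without packageurl installed. One possible reason to keep the type check: url2purl in packageurl-python also recognises package-registry URLs (PyPI, npm and others), so a non-empty result does not by itself mean the URL points at a VCS host.

```python
import logging
from collections import namedtuple

logger = logging.getLogger(__name__)

# Assumption: the real constant is defined in the PR; these values are illustrative only.
VCS_URLS_SUPPORTED_TYPES = {"github", "gitlab", "bitbucket"}

# Hypothetical stand-in for a parsed package URL, used only by this sketch.
StubPurl = namedtuple("StubPurl", "type name")


def safe_vcs_purl(url, parse):
    """Return the parsed purl if it belongs to a supported VCS host, else None.

    `parse` is any callable that behaves like url2purl: it returns an object
    with a `.type` attribute, returns None, or raises on malformed input.
    """
    try:
        purl = parse(url)
    except Exception as exc:
        # Reference URLs come from third-party advisories, so log and skip
        # instead of letting one bad URL stop the whole pipeline step.
        logger.warning("Could not parse reference URL %r: %s", url, exc)
        return None

    # A purl was produced, but it may describe a registry package (for
    # example type "pypi") and not a repository on a supported VCS host.
    if not purl or purl.type not in VCS_URLS_SUPPORTED_TYPES:
        return None
    return purl
```

With this shape, a parse failure and an unsupported type both end up as None for the caller, while only the failure is logged.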

@TG1999 TG1999 left a comment

Nits for your consideration

total_iterations=impacted_packages_advisories.count(), logger=self.log
)
for adv in progress.iter(impacted_packages_advisories.paginated(per_page=500)):
urls = {r.url for r in adv.references.all()} | {p.patch_url for p in adv.patches.all()}
Contributor

refs = adv.references.values_list("url", flat=True)
patches = adv.patches.values_list("patch_url", flat=True)
urls = set(refs) | set(patches)

Fetch only the needed columns?

vcs_url, commit_hash = vcs_data
package_commit_obj, _ = PackageCommitPatch.objects.get_or_create(
vcs_url=vcs_url, commit_hash=commit_hash
)
Contributor

Can we handle this in bulk, and avoid the get_or_create and the add inside the loop?
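
A rough sketch of the bulk idea, using the standard-library sqlite3 module as a stand-in for the Django ORM so that it runs on its own. The table name commit_patch, the helper store_commits_in_bulk, and the unique constraint on (vcs_url, commit_hash) are assumptions made for illustration; the real model definition is not shown in this conversation. In Django, the closest counterpart would probably be bulk_create(..., ignore_conflicts=True) over a de-duplicated batch.

```python
import sqlite3


def store_commits_in_bulk(conn, pairs):
    """Insert (vcs_url, commit_hash) pairs as one batch.

    Duplicates inside `pairs` are dropped in memory; pairs that already
    exist in the table are skipped by the UNIQUE constraint together with
    INSERT OR IGNORE. Returns the number of newly inserted rows.
    """
    unique_pairs = sorted(set(pairs))
    before = conn.total_changes
    with conn:  # a single transaction for the whole batch
        conn.executemany(
            "INSERT OR IGNORE INTO commit_patch (vcs_url, commit_hash) VALUES (?, ?)",
            unique_pairs,
        )
    return conn.total_changes - before


# Hypothetical schema: one row per distinct (vcs_url, commit_hash) pair.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE commit_patch ("
    " vcs_url TEXT NOT NULL,"
    " commit_hash TEXT NOT NULL,"
    " UNIQUE (vcs_url, commit_hash))"
)
```

Collecting the pairs for a whole page of advisories and writing them once replaces one lookup-or-insert round trip per commit with one batched statement per page. Linking commits to advisories would still need a second batched write on the relation table.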

Signed-off-by: ziad hany <ziadhany2016@gmail.com>
Signed-off-by: ziad hany <ziadhany2016@gmail.com>
Signed-off-by: ziad hany <ziadhany2016@gmail.com>