From a24ee753111041325ee673034c59837f796cc1f0 Mon Sep 17 00:00:00 2001
From: GeiserX <9169332+GeiserX@users.noreply.github.com>
Date: Tue, 10 Mar 2026 12:13:18 +0100
Subject: [PATCH 1/2] Add Wayback-Archive to web archives tools

Add Wayback-Archive, a Python tool for downloading complete websites from the Wayback Machine with full asset preservation for offline viewing.
---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index 45b7d17..340ac4d 100644
--- a/README.md
+++ b/README.md
@@ -1871,6 +1871,7 @@ Don't forget that OSINT's main strength is in automation. Read the [Netlas Cookb
 | [Web Archive Google Chrome Extension](https://github.com/husseinphp/web-archive) | Simple Chrome Extensions for getting information about current URL using http://archive.org CDX API |
 | [WAYBACK GOOGLE ANALYTICS](https://github.com/bellingcat/wayback-google-analytics) | A tool that finds all Google Analytics ID in URL (including old ones from Web Archive). |
 | [GAU](https://github.com/lc/gau) | Simple #golang tool to fetch all known website URLs from: WayBackMachine, AlienVault's Open Threat Exchange, Common Crawl, URLScan |
+| [Wayback-Archive](https://github.com/GeiserX/Wayback-Archive) | Download complete websites from the Wayback Machine with full asset preservation for offline viewing. #python GPL-3.0 |
 
 ### [](#warc)Tools for working with WARC (WebARChive) files
 

From 554433ae938a17a003b19834b2985f00e00900d1 Mon Sep 17 00:00:00 2001
From: GeiserX <9169332+GeiserX@users.noreply.github.com>
Date: Tue, 10 Mar 2026 12:23:14 +0100
Subject: [PATCH 2/2] Remove license info from Wayback-Archive entry

No other entry in the collection includes license information in its description, so remove GPL-3.0 to stay consistent.
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 340ac4d..69bd66b 100644
--- a/README.md
+++ b/README.md
@@ -1871,7 +1871,7 @@ Don't forget that OSINT's main strength is in automation. Read the [Netlas Cookb
 | [Web Archive Google Chrome Extension](https://github.com/husseinphp/web-archive) | Simple Chrome Extensions for getting information about current URL using http://archive.org CDX API |
 | [WAYBACK GOOGLE ANALYTICS](https://github.com/bellingcat/wayback-google-analytics) | A tool that finds all Google Analytics ID in URL (including old ones from Web Archive). |
 | [GAU](https://github.com/lc/gau) | Simple #golang tool to fetch all known website URLs from: WayBackMachine, AlienVault's Open Threat Exchange, Common Crawl, URLScan |
-| [Wayback-Archive](https://github.com/GeiserX/Wayback-Archive) | Download complete websites from the Wayback Machine with full asset preservation for offline viewing. #python GPL-3.0 |
+| [Wayback-Archive](https://github.com/GeiserX/Wayback-Archive) | Download complete websites from the Wayback Machine with full asset preservation for offline viewing. #python |
 
 ### [](#warc)Tools for working with WARC (WebARChive) files
 