This issue is for tracking the ongoing development of BBOT's web spider capabilities.
BBOT's web spider is already solid: it easily crawls websites and extracts URLs, JS links, etc. How it performs relative to ProjectDiscovery's Katana has not yet been benchmarked.
Here are some things we can do to build on BBOT's feature set, to make it a best-in-class web spider:
Also, consolidating and rustifying excavate, along with its custom rule integration, will let us spider at scale with the highest possible performance.
Why not a Katana module?
While a Katana module would be easy to write, it wouldn't be ideal for two main reasons:
- BBOT is already recursive, and introducing another recursive tool is likely to have unintended side effects. Examples include infinite recursion bugs, visiting the same URL multiple times, or putting heavy stress on the target.
- Many of Katana's features are already included in BBOT, including configurable web spider settings, URL extraction, and custom rules to search HTTP responses.
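As a conceptual illustration of the "custom rules to search HTTP responses" idea above, here is a minimal, self-contained Python sketch. This is not BBOT's actual implementation (excavate uses YARA rules internally); the regex and function names are assumptions made for demonstration only.

```python
import re

# Simplified stand-in for a custom extraction rule: scan an HTTP
# response body for URLs. Real excavate rules are YARA-based and far
# more robust; this only demonstrates the general technique.
URL_REGEX = re.compile(r"https?://[a-zA-Z0-9.-]+(?:/[^\s\"'<>]*)?")

def extract_urls(response_body: str) -> list[str]:
    """Return all URLs found in an HTTP response body."""
    return URL_REGEX.findall(response_body)

if __name__ == "__main__":
    html = (
        '<a href="https://evilcorp.com/login">Login</a>'
        '<script src="https://cdn.evilcorp.com/app.js"></script>'
    )
    print(extract_urls(html))
```

In a recursive scanner, each extracted URL would be emitted as a new event and fed back into the crawl queue, which is why deduplication and recursion limits matter so much.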
Therefore, the best approach is to polish BBOT's existing spider feature set to make it more effective and user-friendly.
Relevant: