enhancement: Rename WARC output #64

@laurieburchell

There is currently no option to rename the WARC file that cdxt outputs. For example, running the following command creates a file named TEST-000000.extracted.warc.gz that contains one record:

 cdxt --cc --crawl CC-MAIN-2025-43 --from 20251016192109 --limit 1 warc 'commoncrawl.org/get-started'

There should be an option to rename this file (a -n flag?). In addition, the default name should not be TEST but something neutral like OUTPUT.
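To make the request concrete, here is a rough sketch of the behaviour I have in mind. It is not taken from the cdxt code base: the `-n`/`--name` flag, the `build_parser` and `warc_filename` helpers, and the `OUTPUT` default are all hypothetical, and the filename pattern is only inferred from the `TEST-000000.extracted.warc.gz` file observed above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in parser; this is NOT cdxt's real CLI definition.
    parser = argparse.ArgumentParser(prog="cdxt-sketch")
    parser.add_argument(
        "-n", "--name",
        default="OUTPUT",  # proposed default, replacing the current TEST
        help="name used as the first part of the output WARC filename",
    )
    return parser


def warc_filename(name: str, serial: int = 0) -> str:
    # Pattern assumed from the observed file TEST-000000.extracted.warc.gz:
    # <name>-<six-digit serial>.extracted.warc.gz
    return f"{name}-{serial:06d}.extracted.warc.gz"


if __name__ == "__main__":
    args = build_parser().parse_args(["-n", "get-started"])
    print(warc_filename(args.name))  # get-started-000000.extracted.warc.gz
```

With no `-n` given, the same sketch would produce `OUTPUT-000000.extracted.warc.gz`. How the serial number and the `.extracted` part are actually generated is an assumption here and may differ in the real implementation.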

@malteos I can't assign people in this repo but did you want to work on this?
