Download resources are wasted if multiple threads download the same file #481

@jessebrennan

With the current architecture, multiple threads can end up downloading the same file at the same time.

This does not affect the correctness of the download because of the filestore layout, but it does affect efficiency. Because of "copy forward", the same files appear in both the primary and secondary bundles. If both bundles are adjacent in the same manifest, a duplicate download becomes much more likely.

One idea for a solution would be to keep a global table of all the files that are currently downloading or already downloaded. A thread can check this table and sleep if the file it wants is already listed.
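To make the global-table idea concrete, here is a minimal sketch in Python. It assumes files can be identified by a string key such as a checksum. `DownloadRegistry`, `fetch_once`, and the `download` callback are hypothetical names for illustration, not part of the existing code.

```python
import threading
from typing import Callable, Dict


class DownloadRegistry:
    """Hypothetical global table of files that are downloading or downloaded."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        # One Event per file key; the Event is set once the download attempt ends.
        self._events: Dict[str, threading.Event] = {}

    def fetch_once(self, key: str, download: Callable[[str], None]) -> bool:
        """Run download(key) in at most one thread per key at a time.

        Returns True if this thread performed the download, and False if it
        slept until another thread's attempt finished.
        """
        with self._lock:
            event = self._events.get(key)
            owner = event is None
            if owner:
                # First thread to ask for this key claims it.
                event = threading.Event()
                self._events[key] = event
        if not owner:
            # Another thread owns this key: sleep until its attempt ends.
            # Assumption: the caller then checks the filestore, because the
            # owner's attempt may have failed.
            event.wait()
            return False
        try:
            download(key)
        except BaseException:
            # Forget the failed key so that a later call can retry it.
            with self._lock:
                del self._events[key]
            raise
        finally:
            # Wake any waiting threads, whether the download succeeded or not.
            event.set()
        return True
```

With one shared `DownloadRegistry`, each worker would call `registry.fetch_once(key, download)` instead of calling `download(key)` directly. The table only deduplicates work inside a single process; it does nothing for two separate processes that share a filestore.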

Another idea would be to have a .tmp version of the file that exists until the download is complete. I have not thought through all of the implications of this design.
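One possible reading of the `.tmp` idea, sketched in Python under stated assumptions: the `.tmp` file doubles as a claim marker created with exclusive-create mode, and it is renamed to the final name once the download completes. `download_via_tmp` and the `write_body` callback are hypothetical names, and the sketch does not handle a stale `.tmp` file left behind by a crashed process.

```python
import os
from typing import BinaryIO, Callable


def download_via_tmp(dest: str, write_body: Callable[[BinaryIO], None]) -> bool:
    """Download into dest + ".tmp", then rename it to dest.

    Returns True if this call performed the download. Returns False if the
    file already exists or another thread currently holds the .tmp file.
    """
    if os.path.exists(dest):
        return False  # already downloaded
    tmp = dest + ".tmp"
    try:
        # Mode "x" fails if the file exists, so only one thread can claim it.
        f = open(tmp, "xb")
    except FileExistsError:
        return False  # another thread is downloading this file
    try:
        with f:
            # Re-check after claiming: another thread may have finished
            # between the first check and the claim.
            if os.path.exists(dest):
                os.remove(tmp)
                return False
            write_body(f)
        # Rename within the same directory; readers never see a partial dest.
        os.replace(tmp, dest)
    except BaseException:
        # Remove the partial file so that a later attempt can claim it again.
        if os.path.exists(tmp):
            os.remove(tmp)
        raise
    return True
```

A caller that gets `False` cannot tell "already downloaded" from "in progress" by the return value alone, so it would need to check for `dest` and, if it is missing, wait and retry. Unlike an in-memory table, this approach also works across processes that share the same directory.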

Labels: orange (Done by the Azul, Data Browser and Portal team)
