Missing prefix cache hit rate in Azure LLM inference trace 2023 and 2024. #53

@IteratorandIterator

Description

Both traces record only the input and output lengths of each request, not the prefix cache hit rate. As a result, tests driven by these traces cannot fully reflect real inference performance and load conditions. Could you please add the corresponding prefix cache hit rates to these traces?
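To illustrate why the missing field matters, here is a minimal sketch of how the hit rate changes the prefill work implied by the same request lengths. It assumes a simplified model in which a hit rate `h` means a fraction `h` of the prompt tokens is served from the prefix cache. The `Request` class, its field names, and the sample lengths are hypothetical and are not taken from the traces.

```python
from dataclasses import dataclass


@dataclass
class Request:
    # Hypothetical field names; the traces only provide lengths like these.
    input_tokens: int   # prompt length
    output_tokens: int  # generated length


def uncached_prefill_tokens(req: Request, hit_rate: float) -> int:
    """Prompt tokens that still need prefill compute.

    Assumption: hit_rate is the fraction of prompt tokens already present
    in the prefix cache, so only the remainder has to be recomputed.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return round(req.input_tokens * (1.0 - hit_rate))


if __name__ == "__main__":
    # Made-up request lengths, for illustration only.
    requests = [Request(4000, 200), Request(1000, 50)]
    for rate in (0.0, 0.5, 0.9):
        total = sum(uncached_prefill_tokens(r, rate) for r in requests)
        print(f"hit rate {rate:.1f}: {total} prompt tokens to prefill")
```

Under this assumption, identical input and output lengths can correspond to very different prefill loads, which is why the lengths alone are not enough to reproduce the real load.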
