Confusion in Selecting Indices #31

Thank you for your excellent work. I have encountered some confusion while trying to reproduce the results:

In `snapkv_utils.py`, the following snippet selects indices and gathers the corresponding keys and values:
[snapkv_utils.py#L62-L65](https://github.com/FasterDecoding/SnapKV/blob/82135ce2cc60f212a9ba918467f3d9c8134e163f/snapkv/monkeypatch/snapkv_utils.py#L62-L65)

```
            indices = attn_cache.topk(self.max_capacity_prompt - self.window_size, dim=-1).indices
            indices = indices.unsqueeze(-1).expand(-1, -1, -1, head_dim)
            k_past_compress = key_states[:, :, :-self.window_size, :].gather(dim = 2, index = indices)
            v_past_compress = value_states[:, :, :-self.window_size, :].gather(dim = 2, index = indices)
```

The indices here come from `topk` on the scores in `attn_cache`, so they are ordered by descending score rather than by their original position in the sequence.

My concern is whether gathering `k_past_compress` and `v_past_compress` with these score-ordered indices leaves the compressed KV cache in the wrong order. Specifically, could this disrupt the alignment between keys and values during subsequent inference steps?
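To make the question concrete, here is a minimal sketch on toy tensors. It is not the repository's code: the shapes, the random scores, and the names `num_keep`, `gather_kv`, and `attend` are all assumptions for illustration, and the index-sorted variant is a hypothetical alternative. It compares gathering with score-ordered indices against gathering with the same indices sorted back into positional order, then runs plain softmax attention over both results. The comparison only tells us something if positional information is already encoded in the cached keys (for example, rotary embeddings applied before compression), which I have not verified here.

```python
import torch

torch.manual_seed(0)

# Toy shapes, assumed for illustration only.
bsz, num_heads, prefix_len, head_dim = 1, 2, 8, 4
num_keep = 3  # stands in for max_capacity_prompt - window_size

key_states = torch.randn(bsz, num_heads, prefix_len, head_dim)
value_states = torch.randn(bsz, num_heads, prefix_len, head_dim)
attn_cache = torch.rand(bsz, num_heads, prefix_len)  # stand-in for the pooled scores

# topk returns indices ordered by descending score, not by position.
top_idx = attn_cache.topk(num_keep, dim=-1).indices

# Hypothetical variant: the same indices, sorted back into positional order.
sorted_idx = top_idx.sort(dim=-1).values


def gather_kv(idx):
    """Gather keys and values with the same index tensor, as in the snippet above."""
    idx = idx.unsqueeze(-1).expand(-1, -1, -1, head_dim)
    return key_states.gather(dim=2, index=idx), value_states.gather(dim=2, index=idx)


def attend(q, k, v):
    """Plain scaled dot-product attention for a single query step."""
    weights = torch.softmax(q @ k.transpose(-1, -2) / head_dim ** 0.5, dim=-1)
    return weights @ v


k_top, v_top = gather_kv(top_idx)
k_sorted, v_sorted = gather_kv(sorted_idx)

# Slot j of both k_top and v_top comes from the same source position,
# so each key stays paired with its own value.
pos = top_idx[0, 0, 0].item()
paired = torch.equal(k_top[0, 0, 0], key_states[0, 0, pos]) and torch.equal(
    v_top[0, 0, 0], value_states[0, 0, pos]
)

# Compare the attention output of one new query over both orderings.
q = torch.randn(bsz, num_heads, 1, head_dim)
out_top = attend(q, k_top, v_top)
out_sorted = attend(q, k_sorted, v_sorted)

print("score-ordered indices:", top_idx[0, 0].tolist())
print("position-ordered indices:", sorted_idx[0, 0].tolist())
print("keys and values paired:", paired)
print("same attention output:", torch.allclose(out_top, out_sorted, atol=1e-6))
```

Because keys and values are gathered with one shared index tensor, the pairing itself is preserved in this sketch, and softmax attention sums over key/value pairs without regard to their order. What I would like confirmed is whether anything downstream in the actual model depends on the positional order of the compressed cache.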

I would appreciate any insights or clarifications on this matter. Thank you!
