[REQUEST] Can't Set Context Length via front-end

### OS

Windows

### GPU Library

CUDA 12.x

### Python version

3.11

### Describe the bug

There's no way for the front-end to set the context length of the conversation. Either the context is set via max_seq_len in a config file or there's no context size control at all.

This is terrible for tools like Cline where you want it to be able to set a custom context when running the model and it may differ with each run.

### Reproduction steps

Running Cline with context set to any value.

### Expected behavior

Tabby should run the model with the context length ("max_seq_len") that's requested.

### Logs

_No response_

### Additional context

_No response_

### Acknowledgements

- [x] I have looked for similar issues before submitting this one.
- [x] I have read the disclaimer, and this issue is related to a code bug. If I have a question, I will use the Discord server.
- [x] I understand that the developers have lives and my issue will be answered when possible.
- [x] I understand the developers of this program are human, and I will ask my questions politely.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[REQUEST] Can't Set Context Length via front-end #400

OS

GPU Library

Python version

Describe the bug

Reproduction steps

Expected behavior

Logs

Additional context

Acknowledgements

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Uh oh!

[REQUEST] Can't Set Context Length via front-end #400

Description

OS

GPU Library

Python version

Describe the bug

Reproduction steps

Expected behavior

Logs

Additional context

Acknowledgements

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions