
[DRAFT] Replacing WEB UI for Ollama Backend #116

Draft
RSDNTWK wants to merge 4 commits into Rubiksman78:main from RSDNTWK:ollama

Conversation

@RSDNTWK
Contributor

@RSDNTWK RSDNTWK commented Dec 7, 2025

This PR is a proof of concept that replaces the text-generation-webui, SillyTavern, and Playwright components with an Ollama backend.

This code is fully functional with the submod; however, it has the following limitations:

The model is hardcoded to "llama3.2" due to its small size.
The model cannot be unloaded on exit.
There are minor warnings about the text context size, which are not really an issue.

To test this code download and install Ollama here: https://github.com/ollama/ollama/releases/latest

To download llama 3.2, open a command prompt window and type: ollama pull llama3.2

Once it is downloaded, just execute run.bat as normal. Ollama will load the model once you type your first message into MAS.

After exiting MAS, to unload the model, open a command prompt window and type: ollama stop llama3.2
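For reference, once the model is pulled, each chat turn is a single HTTP call to Ollama's local REST API (default port 11434). Below is a minimal sketch using Ollama's documented /api/generate endpoint; the helper names are ours, not the submod's actual code:

```python
# Sketch: one non-streaming generation request against a local Ollama server.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model, prompt):
    # stream=False makes Ollama return a single JSON object
    # instead of a stream of token chunks.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model, prompt):
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # The generated text is in the "response" field of the reply.
        return json.loads(resp.read())["response"]
```

Calling `generate("llama3.2", "Hello!")` with Ollama running returns the model's reply as a string; the first call triggers the model load mentioned above.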

We believe this will reduce the amount of overhead when running the submod.

@Sylphar
Contributor

Sylphar commented Dec 8, 2025

Interesting. I'll forward this to the main maintainer/creator. Perhaps a stupid question: couldn't you just use try/finally to unload the model automatically?

@Rubiksman78
Owner

This would be a nice addition, but since usage is limited to one model for now, it would be preferable to still give users the option of choosing the Web UI if they have the computing capabilities, rather than a complete replacement.

@RSDNTWK
Contributor Author

RSDNTWK commented Dec 8, 2025

Interesting. I'll forward this to the main maintainer/creator. Perhaps a stupid question: couldn't you just use try/finally to unload the model automatically?

We're experimenting with using the ollama CLI commands to unload the model, so it could work. It's just buggy for now.

This would be a nice addition, but since usage is limited to one model for now, it would be preferable to still give users the option of choosing the Web UI if they have the computing capabilities, rather than a complete replacement.

Fair enough. We are looking into adding an option to load other models too. Ollama has native Nvidia/AMD GPU support, with CPU as a fallback if needed. Ollama would be a better option overall: with the right optimisations, it has a lot less overhead on lower-end systems. Ollama can also load models from Hugging Face.

@RSDNTWK
Contributor Author

RSDNTWK commented Jan 16, 2026

New improvements have been made:

Added the ability to load custom Ollama models in the settings and save the choice to the config.json file.
Added a startup check that verifies the model specified in config.json is downloaded to Ollama, and pulls it if it is not installed.
Updated the Ollama API call to read the model saved in config.json and load it on the first message.
Added a trigger to unload the current model via the ollama stop command when the user types or calls "QUIT" in MAS. (Not ideal, but better than nothing for now.)
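The config.json handling described above could be sketched as follows. This is an illustration only: the `ollama_model` key name and the helper names are our assumptions, not the submod's actual schema.

```python
import json
import os

DEFAULT_MODEL = "llama3.2"  # the hardcoded model from the original PR

def load_model_name(path="config.json"):
    # Read the user's chosen model from config.json,
    # falling back to the default if the file or key is missing.
    if not os.path.exists(path):
        return DEFAULT_MODEL
    with open(path, "r", encoding="utf-8") as f:
        cfg = json.load(f)
    return cfg.get("ollama_model", DEFAULT_MODEL)

def pull_command_if_missing(model, installed):
    # Startup check: return the `ollama pull` command to run
    # if the configured model is not yet installed, else None.
    if model not in installed:
        return ["ollama", "pull", model]
    return None
```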

@RSDNTWK
Contributor Author

RSDNTWK commented Jan 18, 2026

Current planned addition ideas being looked into:

Add the user's name from MAS to the user prompt, or add a setting to choose a character name as an alternative option.
Add memories to allow Monika to remember specific events from the user's interactions, even if MAS/Ollama is restarted, using a chat-history logging method.
Add an option to list and delete Ollama models in the settings page by invoking the ollama list and ollama rm commands internally.
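The chat-history logging method mentioned for memories could be as simple as an append-only JSON-lines file that is reloaded on startup. A sketch under our own assumptions (file name, record schema, and helper names are all hypothetical):

```python
import json

def log_message(path, role, content):
    # Append one chat turn per line so history survives restarts
    # of both MAS and Ollama.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps({"role": role, "content": content}) + "\n")

def load_history(path, limit=20):
    # Reload only the most recent turns to seed the next
    # session's context without growing the prompt unboundedly.
    try:
        with open(path, "r", encoding="utf-8") as f:
            lines = f.readlines()
    except FileNotFoundError:
        return []
    return [json.loads(line) for line in lines[-limit:]]
```

The `limit` cap matters because the reloaded turns would count against the model's context window, which relates to the context-size warnings noted in the original PR description.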
