Hello through the centuries! Just move the `Llama` instantiation out of the loop and reuse the same instance:

```python
import llama_cpp

# Load the model once, outside the loop.
model = llama_cpp.Llama(model_path="model.gguf")  # path to your GGUF model

# Enable unlimited caching.
cache = llama_cpp.LlamaCache()
model.set_cache(cache)

# Your prompts.
prompts = ["What is the capital of France?", "Who are you?", "..."]

# Collect the outputs.
outputs = []
for p in prompts:
    outputs.append(model.create_completion(prompt=p)['choices'][0]['text'])
```

Hope it helps, even across the years!
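If the goal is several *independent* samples of the same prompt, one option is to reset the model's context between draws. This is a minimal sketch, assuming llama-cpp-python's `Llama.reset()` rewinds the internal token count as discussed below in this thread; `model` is an already-loaded `llama_cpp.Llama` instance, and `sample_independently` is a hypothetical helper name:

```python
def sample_independently(model, prompt, n, **kwargs):
    """Draw n completions for the same prompt, resetting the
    model's context between draws so each one starts fresh.

    Assumes `model` behaves like a loaded llama_cpp.Llama
    instance whose reset() discards the accumulated context.
    """
    samples = []
    for _ in range(n):
        model.reset()  # discard any previously accumulated context
        out = model.create_completion(prompt=prompt, **kwargs)
        samples.append(out['choices'][0]['text'])
    return samples
```

Whether `reset()` alone is sufficient depends on the library version, so verify against the llama-cpp-python source you are running.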
I want to sample the output of the same query multiple times, independently. My (limited) understanding is that the history of these queries changes subsequent outputs. So I guess I could re-instantiate the model for each query, but that seems a bit heavy, and I was wondering whether there is a way to simply reset the model. I see that there is a `reset` method on the llm object, but as far as I can tell it simply sets the `n_tokens` attribute to 0. Is that really enough to return the model to its initial state?