The chatbot now supports local Ollama!
This demo uses the deepseek-r1:1.5b model to generate responses. Since this is a reasoning model, I added a "thinking" display in the chatbot UI. You can choose to turn off the "thinking," which makes the responses faster, but the quality may noticeably decrease.
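For reference, here is a minimal sketch of how the chatbot could talk to a local Ollama server over its default HTTP API. The endpoint and request shape follow Ollama's standard /api/chat interface; the <think>-tag handling assumes deepseek-r1's habit of wrapping its reasoning in those tags, and the function name is just a placeholder, not the project's actual code.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # default local Ollama endpoint
MODEL = "deepseek-r1:1.5b"

def ask(prompt, show_thinking=True):
    """Send a single-turn prompt to the local Ollama server and return the answer."""
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": MODEL,
            "messages": [{"role": "user", "content": prompt}],
            "stream": False,
        },
        timeout=300,
    )
    resp.raise_for_status()
    content = resp.json()["message"]["content"]

    # deepseek-r1 typically prefixes its answer with <think>...</think>;
    # split the reasoning out so the UI can show or hide it separately.
    if "</think>" in content:
        thinking, _, answer = content.partition("</think>")
        thinking = thinking.replace("<think>", "").strip()
    else:
        thinking, answer = "", content

    if show_thinking and thinking:
        print("[thinking]", thinking)
    return answer.strip()

if __name__ == "__main__":
    print(ask("Why is the sky blue?"))
```

Setting show_thinking=False only hides the reasoning here; to actually skip it (and get the faster responses mentioned above), the reasoning output itself has to be suppressed on the model side.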
I also tried running Ollama on the Zero2W with a smaller model, smollm2:135m-instruct-q4_K_S, but after answering just one question the Zero2W became extremely sluggish; its limited hardware is simply the bottleneck.