Not sure how this was missed, but the AI Assistant is not retaining chat context within a session, so every message is treated as a completely new request. We need to send the previous messages along with each new message, and implement chat compaction so longer conversations don't blow past the token budget.
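A minimal sketch of what the fix could look like: a session-scoped history that is sent with every request, plus a simple drop-oldest compaction pass when the estimated token count exceeds a budget. The class and method names (`ChatHistory`, `build_request`), the ~4-characters-per-token estimate, and the drop-oldest strategy are all illustrative assumptions, not the assistant's actual API; a real implementation might summarize old turns with the model rather than discard them.

```python
MAX_TOKENS = 60  # tiny budget so compaction is easy to demonstrate

def estimate_tokens(text):
    # Rough heuristic (~4 characters per token); swap in a real tokenizer
    # for the production budget check.
    return max(1, len(text) // 4)

class ChatHistory:
    def __init__(self, max_tokens=MAX_TOKENS):
        self.max_tokens = max_tokens
        self.messages = []  # list of {"role": ..., "content": ...}

    def add(self, role, content):
        self.messages.append({"role": role, "content": content})
        self._compact()

    def total_tokens(self):
        return sum(estimate_tokens(m["content"]) for m in self.messages)

    def _compact(self):
        # Drop the oldest messages until we fit the budget. A smarter
        # implementation could summarize the dropped turns instead.
        while self.total_tokens() > self.max_tokens and len(self.messages) > 1:
            self.messages.pop(0)

    def build_request(self, new_user_message):
        # Every outgoing request carries the retained history plus the new
        # message, so the model sees prior context instead of treating each
        # turn as brand new.
        self.add("user", new_user_message)
        return list(self.messages)
```

Usage: `history.build_request("...")` returns the full message list to send to the model; once the budget is exceeded, the oldest turns are silently dropped so the payload stays within limits.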