Senpai Mode Provides Higher Quality With Heavier Validation #46
otakustay
announced in
Announcements
In our extensive experience using coding agents as development assistants, we have discovered the following fact:
This gap often manifests in the following ways:
The reason for these phenomena is simple: a general-purpose role has to guarantee baseline performance across every type of task, which makes it difficult to strengthen any one specific direction.
Oniichan has had the concept of "modes" from the beginning, with the ability to switch between multiple roles. Thanks to this foundational architecture, we can provide a workflow that maintains a high task completion rate and high-quality generation even on non-SOTA models.
Starting from version 3.7.0, Oniichan has a built-in Senpai mode. Senpai (Japanese “先輩”) conveys the intention of “having a senior mentor inspect the task.” It defines two roles, an actor and a reviewer, that complete tasks together.
Expressed as a flowchart, the workflow is as follows:
    flowchart TB
        Start[User Request] --> Actor
        Actor[[Actor]] --> ActorAction[Take Action]
        ActorAction --> ActorComplete{Completed?}
        ActorComplete --> |Yes| Summary{{Task Summary}}
        ActorComplete --> |No| ActorAction
        Summary --> Reviewer[[Reviewer]]
        Reviewer --> QualityCheck{Quality Check}
        QualityCheck --> QualityCheckComplete{Completed?}
        QualityCheckComplete --> |Yes| ReviewerComplete{Quality OK?}
        QualityCheckComplete --> |No| QualityCheck
        ReviewerComplete --> |Yes| CompleteTask[Complete Task]
        ReviewerComplete --> |No| Suggestion{{Suggestion}}
        Suggestion --> Actor

The key role here is the reviewer, which, as part of the enhanced verification and review process, actively performs the following behaviors:
The reviewer does not perform any code editing (creating, modifying, or deleting files), but it can read files and execute commands, which lets it focus on analyzing problems and providing suggestions.
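The split of capabilities between the two roles can be pictured as a per-role tool allowlist. The following is only a minimal sketch under stated assumptions: the tool names (`read_file`, `run_command`, `write_file`, `patch_file`, `delete_file`) and the `is_allowed` helper are hypothetical and are not taken from Oniichan's actual implementation.

```python
from enum import Enum


class Role(Enum):
    ACTOR = "actor"
    REVIEWER = "reviewer"


# Hypothetical tool names, chosen for illustration only.
INSPECT_TOOLS = {"read_file", "run_command"}
EDIT_TOOLS = {"write_file", "patch_file", "delete_file"}

# The actor may inspect and edit; the reviewer may only inspect.
ALLOWED_TOOLS = {
    Role.ACTOR: INSPECT_TOOLS | EDIT_TOOLS,
    Role.REVIEWER: INSPECT_TOOLS,
}


def is_allowed(role: Role, tool: str) -> bool:
    """Return True if the given role may call the given tool."""
    return tool in ALLOWED_TOOLS[role]
```

With this table, a reviewer request to run the unit tests passes the check, while any attempt to create, modify, or delete a file is rejected before it reaches the workspace.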
In this mode, we observe a visible improvement in task completion and quality. Below is a demonstration on an ordinary task:
oniichan-senpai-mode.mp4
As seen, when the actor first completes the task (00:42), it only deletes part of the code, but some remnants remain, and the task does not reach a high level of completion.
Subsequently, the reviewer checks and discovers a unit test failure (00:53), then provides a series of further correction suggestions (01:12), guiding the actor to further remove erroneous code, ultimately completing the entire task with all unit tests passing.
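The round trip in the demo (the actor finishes, the reviewer checks the result, and suggestions flow back to the actor) can be sketched as a simple loop. This is an illustrative sketch, not Oniichan's code: `Review`, `run_senpai`, the callback signatures, and the `max_rounds` cap are all assumptions introduced here, and the stub actor and reviewer only imitate the behavior seen in the video.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    quality_ok: bool
    suggestions: List[str] = field(default_factory=list)


# Hypothetical signatures: the actor takes the request plus outstanding
# suggestions and returns a task summary; the reviewer takes that summary
# and returns a Review.
ActorFn = Callable[[str, List[str]], str]
ReviewerFn = Callable[[str], Review]


def run_senpai(request: str, actor: ActorFn, reviewer: ReviewerFn,
               max_rounds: int = 3) -> str:
    """Alternate actor and reviewer until the reviewer accepts the result.

    max_rounds is an assumed safety cap; the flowchart itself has no limit.
    """
    suggestions: List[str] = []
    summary = ""
    for _ in range(max_rounds):
        summary = actor(request, suggestions)
        review = reviewer(summary)
        if review.quality_ok:
            return summary
        suggestions = review.suggestions
    return summary  # Cap reached: return the last summary as-is.


# Stubs that imitate the demo: the first attempt leaves remnants behind.
def demo_actor(request: str, suggestions: List[str]) -> str:
    return "removed all code" if suggestions else "removed part of the code"


def demo_reviewer(summary: str) -> Review:
    if summary == "removed all code":
        return Review(quality_ok=True)
    return Review(quality_ok=False,
                  suggestions=["unit test fails; remove the remaining code"])


result = run_senpai("remove the feature", demo_actor, demo_reviewer)
```

Here the first round is rejected because of the failing test, and the second round, driven by the reviewer's suggestion, is accepted.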
Through this composite workflow, Senpai mode running on DeepSeek V3 and similarly scaled models can achieve results comparable to top models like Claude, at nearly 1/5 of the cost.
Building on this, we believe that strengthening the collaboration between roles with different capabilities can produce sufficiently good results on low-cost models. In the future, Oniichan will gradually reduce its reliance on SOTA models like Claude, using engineering to pursue a win-win in both cost and quality.