Senpai Mode Provides Higher Quality With Heavier Validation #46

otakustay · 2025-05-28T07:52:18Z

otakustay
May 28, 2025
Maintainer

In our extensive experience using coding agents as development assistants, we have observed the following:

State-of-the-art (SOTA) models can often complete a task thoroughly and comprehensively. Less capable models, however, leave a gap in task completion.

This gap often manifests in the following ways:

A task may only be partially completed.
The execution of a task may cause other unexpected errors, such as build or unit test failures.

The reason is quite simple: a general-purpose role configuration has to deliver baseline performance across all types of tasks, which makes it difficult to strengthen any one specific aspect.

The concept of "modes" was built into Oniichan from the beginning, along with the ability to switch between multiple roles. Thanks to this foundational architecture, we can provide a workflow that maintains high task completion and high-quality generation even on non-SOTA models.

Starting from version 3.7.0, Oniichan has a built-in Senpai mode. Senpai (Japanese “先輩”) conveys the intention of “having a senior mentor inspect the task.” It defines two roles to complete tasks together:

Actor: Understands the task you propose and completes it in a standard work mode.
Reviewer: After the actor completes the task, performs quality checks with stricter verification and provides further suggestions when a check fails.

Expressed with a flowchart, this workflow is as follows:

flowchart TB
    Start[User Request] --> Actor
    Actor[[Actor]] --> ActorAction[Take Action]
    ActorAction --> ActorComplete{Completed?}
    ActorComplete --> |Yes| Summary{{Task Summary}}
    ActorComplete --> |No| ActorAction
    Summary --> Reviewer[[Reviewer]]
    Reviewer --> QualityCheck{Quality Check}
    QualityCheck --> QualityCheckComplete{Completed?}
    QualityCheckComplete --> |Yes| ReviewerComplete{Quality OK?}
    QualityCheckComplete --> |No| QualityCheck
    ReviewerComplete --> |Yes| CompleteTask[Complete Task]
    ReviewerComplete --> |No| Suggestion{{Suggestion}}
    Suggestion --> Actor
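
The loop in the flowchart can be sketched in a few lines of Python. This is only an illustration of the control flow, not Oniichan's implementation: the names `run_senpai`, `Review`, and `Outcome`, the two callback signatures, and the `max_rounds` cap are all assumptions made for the example (the flowchart itself shows no round limit), and the stub actor and reviewer stand in for real model calls.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Review:
    ok: bool              # True when the reviewer's quality check passes
    suggestion: str = ""  # feedback handed back to the actor on failure


@dataclass
class Outcome:
    summary: str    # the actor's last task summary
    rounds: int     # number of actor/reviewer rounds that ran
    approved: bool  # whether the reviewer accepted the result


# Hypothetical signatures: the actor receives the request plus any prior
# suggestion and returns a task summary; the reviewer receives the request
# and that summary and returns a Review.
Actor = Callable[[str, Optional[str]], str]
Reviewer = Callable[[str, str], Review]


def run_senpai(request: str, actor: Actor, reviewer: Reviewer,
               max_rounds: int = 3) -> Outcome:
    """Alternate actor and reviewer until the review passes or rounds run out."""
    suggestion: Optional[str] = None
    summary = ""
    for round_no in range(1, max_rounds + 1):
        summary = actor(request, suggestion)  # Actor -> Task Summary
        review = reviewer(request, summary)   # Reviewer -> Quality Check
        if review.ok:
            return Outcome(summary, round_no, True)  # Complete Task
        suggestion = review.suggestion        # Suggestion -> Actor
    return Outcome(summary, max_rounds, False)


# Stub roles standing in for LLM calls (assumed behavior, for illustration).
def stub_actor(request: str, suggestion: Optional[str]) -> str:
    return "removed all dead code" if suggestion else "removed part of the dead code"


def stub_reviewer(request: str, summary: str) -> Review:
    if "all" in summary:
        return Review(ok=True)
    return Review(ok=False, suggestion="unit test fails; remove the remaining references")


outcome = run_senpai("remove the dead code", stub_actor, stub_reviewer)
```

With these stubs, the first round is rejected and the second is approved, which mirrors the flowchart's path from Suggestion back to Actor.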

The key role here is the reviewer, which, as part of the enhanced verification and review process, actively performs the following behaviors:

Executes commands to run build, test, lint, and other tasks to ensure they are successful.
Re-examines the code changes, performing a code review that relies on the LLM's own capabilities.

The reviewer does not edit code in any way (it cannot create, modify, or delete files), but it can read files and execute commands. This lets it focus on analyzing problems and providing suggestions.
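
One way to picture the reviewer's restrictions is as a smaller tool set plus a routine that runs the check commands and collects failures. The sketch below is an assumption-laden illustration, not Oniichan's code: the tool names, `is_allowed`, and `run_checks` are hypothetical, and the "build" and "test" commands are stand-ins that invoke the Python interpreter so the example runs anywhere.

```python
import subprocess
import sys

# Hypothetical tool names; the reviewer gets no file-editing tools.
ACTOR_TOOLS = frozenset({"read_file", "write_file", "delete_file", "run_command"})
REVIEWER_TOOLS = frozenset({"read_file", "run_command"})


def is_allowed(role_tools: frozenset, tool: str) -> bool:
    """Return True if the role may call the given tool."""
    return tool in role_tools


def run_checks(commands: dict) -> list:
    """Run each check command and return (name, output) for every failure."""
    failures = []
    for name, argv in commands.items():
        proc = subprocess.run(argv, capture_output=True, text=True)
        if proc.returncode != 0:
            failures.append((name, (proc.stdout + proc.stderr).strip()))
    return failures


# Stand-in commands: a "build" that succeeds and a "test" that fails.
checks = {
    "build": [sys.executable, "-c", "pass"],
    "test": [sys.executable, "-c", "import sys; print('1 test failed'); sys.exit(1)"],
}
failures = run_checks(checks)
```

In a real project the command lists would be the project's own build, test, and lint invocations; the collected failure output is the kind of material a reviewer could turn into suggestions for the actor.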

In this mode, we observe a visible improvement in task completion and quality. Below is a demonstration on an ordinary task:

oniichan-senpai-mode.mp4

As the video shows, when the actor first reports the task as complete (00:42), it has deleted only part of the code; some remnants remain, so the task is not fully completed.

Subsequently, the reviewer checks and discovers a unit test failure (00:53), then provides a series of further correction suggestions (01:12), guiding the actor to further remove erroneous code, ultimately completing the entire task with all unit tests passing.

With this composite workflow, Senpai mode running on DeepSeek V3 and similarly scaled models can achieve results comparable to top models like Claude, at nearly 1/5 of the cost.

Building on this, we believe that stronger collaboration between roles with different capabilities can produce sufficiently good results on low-cost models. Going forward, Oniichan will gradually reduce its reliance on SOTA models like Claude and use engineering to pursue a win on both cost and quality.
