Become a sponsor to Shuu
Hi, I'm Shun0212, a student and open-source developer focused on building advanced code understanding and code search models, especially for multilingual and function-level retrieval tasks.
I’m the creator of CodeSearch-ModernBERT-Crow-Plus, a state-of-the-art model listed on the MTEB leaderboard. My tools help developers find reusable code across large repositories using natural language.
With your support, I can:
- Develop better open models for the community
- Create accessible tools like GitHub code search demos
- Explore new tasks like retrieval-augmented generation (RAG) for code
- Reduce my part-time work hours so I can focus more on research and development
More time means more innovation and higher quality for the community. Thank you for your support!
Featured work
- embeddings-benchmark/mteb
  MTEB: Massive Text Embedding Benchmark
  Python 3,248
- embeddings-benchmark/results
  Data for the MTEB leaderboard
  Python 53
- Shun0212/CodeBERTPretrained
  This repository provides a project for training a BERT model from scratch for the Python code search task using the CodeSearchNet dataset.
  Jupyter Notebook 1
- Shun0212/CodeSearch-Crow
  CodeSearch-Crow is a tool for searching for relevant code snippets in a specified GitHub repository using natural-language queries written in Japanese.
  Jupyter Notebook