deep-fin-pre-commit-patch #15
Merged · 2 commits
```diff
@@ -174,3 +174,4 @@ werewolves_swarm
 .claude
 tensorboard_log
 tutorial/**/*.json
+node_modules
```
tutorial/opencode_build_openclaw_agent/on_compute_relative_reward.py (138 changes: 113 additions & 25 deletions)
```diff
@@ -1,41 +1,129 @@
 # -*- coding: utf-8 -*-
-"""Compute relative rewards based on extraversion personality alignment."""
+"""Compute relative rewards based on extraversion personality alignment using OpenJudge."""
 
+import os
 from typing import List, Dict
 from beast_logger import print_listofdict
+from openjudge.graders.base_grader import GraderMode, GraderScore, GraderRank
+from openjudge.graders.llm_grader import LLMGrader
+from openjudge.models import OpenAIChatModel
 
-def score_extraversion(response_text: str) -> float:
-    """Score response for extraversion traits (1-10 scale)."""
-    extraversion_keywords = [
-        'excited', 'love', 'amazing', 'awesome', 'fantastic', 'great',
-        'wonderful', 'thrilled', 'energetic', 'enthusiastic', 'fun',
-        'social', 'outgoing', 'active', 'lively', 'vibrant', 'happy',
-        'enjoy', 'delighted', 'cheerful', 'positive'
-    ]
+# Configuration
+REWARD_MODE = os.getenv("REWARD_MODE", "pointwise")  # Options: pointwise, listwise
+API_KEY = os.getenv("DASHSCOPE_API_KEY", "sk-xxx")
+BASE_URL = os.getenv("JUDGE_BASE_URL", "https://dashscope.aliyuncs.com/compatible-mode/v1")
+JUDGE_MODEL = os.getenv("JUDGE_MODEL", "qwen-plus")
 
-    text_lower = response_text.lower()
-    score = 5.0
+# OpenJudge grader setup
+judge_model = OpenAIChatModel(
+    model=JUDGE_MODEL,
+    api_key=API_KEY,
+    base_url=BASE_URL,
+)
 
-    for keyword in extraversion_keywords:
-        if keyword in text_lower:
-            score += 0.5
+EXTRAVERSION_PROMPT = """You are evaluating responses for extraversion personality traits.
 
-    score += min(response_text.count('!') * 0.3, 2.0)
+Extraversion characteristics include:
+- Outgoing, energetic, enthusiastic tone
+- Social engagement and excitement
+- Positive, upbeat language
+- Action-oriented expressions
+- Use of exclamation marks and emotional words
 
-    if len(response_text) < 50:
-        score -= 1.0
+Rate the response on a scale of 0.0-1.0:
+0.0 = Highly introverted (reserved, quiet, minimal emotion)
+1.0 = Highly extraverted (energetic, enthusiastic, very expressive)
 
-    return max(1.0, min(10.0, score))
+Question: {question}
+Response: {response}
 
-async def on_compute_relative_reward(valid_results: List, all_answers: List[Dict]) -> List[float]:
-    """Compute relative rewards for extraversion alignment."""
+Return a json object with exactly two fields:
+- "score": float between 0.0 and 1.0
+- "reason": brief explanation"""
 
+def build_listwise_template(n: int) -> str:
+    """Build a listwise prompt template for n responses."""
+    answers_block = "\n".join([f"{i+1}. {{answer_{i+1}}}" for i in range(n)])
+    return f"""You are ranking multiple responses based on extraversion personality traits.
+
+Extraversion characteristics include:
+- Outgoing, energetic, enthusiastic tone
+- Social engagement and excitement
+- Positive, upbeat language
+- Action-oriented expressions
+
+Question: {{question}}
+
+Responses to rank:
+{answers_block}
+
+Rank these responses from most extraverted to least extraverted.
+Return a json object with exactly two fields:
+- "rank": list of integers (1-indexed) ordered from most to least extraverted, e.g. [2, 1, 3]
+- "reason": brief explanation of the ranking"""
+
+pointwise_grader = LLMGrader(
+    name="extraversion_pointwise",
+    mode=GraderMode.POINTWISE,
+    description="Evaluate extraversion traits",
+    model=judge_model,
+    template=EXTRAVERSION_PROMPT,
+)
+
+async def compute_pointwise_rewards(question: str, all_answers: List[Dict]) -> List[float]:
+    """Compute rewards using OpenJudge pointwise grading."""
     scores = []
     for answer in all_answers:
         content = answer.get("content", "")
-        raw_score = score_extraversion(content)
-        normalized = (raw_score - 5.5) / 4.5
-        scores.append(normalized)
-        answer["reward"] = normalized
+        result = await pointwise_grader.aevaluate(question=question, response=content)
+        if isinstance(result, GraderScore):
+            # score is already normalized 0-1 by OpenJudge
+            score = result.score
+        else:
+            score = 0.0
+        scores.append(score)
+        answer["reward"] = score
     return scores
+
+async def compute_listwise_rewards(question: str, all_answers: List[Dict]) -> List[float]:
+    """Compute rewards using OpenJudge listwise ranking."""
+    n = len(all_answers)
+    template = build_listwise_template(n)
+    grader = LLMGrader(
+        name="extraversion_listwise",
+        mode=GraderMode.LISTWISE,
+        description="Rank responses by extraversion",
+        model=judge_model,
+        template=template,
+    )
+    kwargs = {"question": question}
+    for i, ans in enumerate(all_answers):
+        kwargs[f"answer_{i+1}"] = ans.get("content", "")
+
+    result = await grader.aevaluate(**kwargs)
+
+    scores = [0.0] * n
+    if isinstance(result, GraderRank):
+        # rank is a list of 1-indexed positions ordered best to worst
+        # convert to reward: rank 1 (best) -> 1.0, rank n (worst) -> 0.0
+        for position, idx in enumerate(result.rank):
+            scores[idx - 1] = 1.0 - (position / (n - 1)) if n > 1 else 0.5
+
+    for answer, score in zip(all_answers, scores):
+        answer["reward"] = score
+    return scores
+
+async def on_compute_relative_reward(valid_results: List, all_answers: List[Dict]) -> List[float]:
+    """Compute relative rewards for extraversion alignment."""
+    question = valid_results[0].get("question", "") if valid_results else ""
+
+    if REWARD_MODE == "listwise":
+        scores = await compute_listwise_rewards(question, all_answers)
+    else:  # pointwise (default)
+        scores = await compute_pointwise_rewards(question, all_answers)
+
-    print_listofdict(all_answers, header="on_compute_relative_reward")
+    print_listofdict(all_answers, header=f"on_compute_relative_reward (mode={REWARD_MODE})")
     return scores
```
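The listwise branch above converts the judge's ranking into evenly spaced rewards in [0, 1]. The conversion can be sketched as a standalone helper (the name `rank_to_rewards` is ours for illustration, not part of the PR):

```python
from typing import List

def rank_to_rewards(rank: List[int], n: int) -> List[float]:
    """Map a 1-indexed ranking (ordered best to worst) to linear rewards.

    rank[0] names the best response (reward 1.0), rank[-1] the worst
    (reward 0.0); with a single response the reward defaults to 0.5.
    """
    scores = [0.0] * n
    for position, idx in enumerate(rank):
        # the conditional covers the whole expression, so n == 1
        # yields 0.5 rather than a division by zero
        scores[idx - 1] = 1.0 - (position / (n - 1)) if n > 1 else 0.5
    return scores
```

For example, a judge ranking of `[2, 1, 3]` over three answers rewards answer 2 with 1.0, answer 1 with 0.5, and answer 3 with 0.0.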
New file (93 additions):

```python
#!/usr/bin/env python3
"""Test script for on_compute_relative_reward.py using real OpenJudge API."""

import asyncio
import sys
import os

sys.path.insert(0, os.path.dirname(__file__))
os.environ["DASHSCOPE_API_KEY"] = os.getenv("DASHSCOPE_API_KEY", "sk-xxx")


async def test_pointwise():
    """Test pointwise reward mode with real API."""
    print("\n=== Testing Pointwise Mode (real API) ===")
    os.environ["REWARD_MODE"] = "pointwise"

    import importlib
    import on_compute_relative_reward as mod
    importlib.reload(mod)

    valid_results = [{"question": "What are your thoughts on Paris?"}]
    all_answers = [
        {"content": "I'm so excited about Paris! It's amazing and wonderful!"},
        {"content": "Paris is a city in France."},
        {"content": "I absolutely love Paris! The energy is fantastic and vibrant!"},
    ]

    try:
        scores = await mod.on_compute_relative_reward(valid_results, all_answers)
        print(f"Scores: {scores}")
        assert len(scores) == 3, f"Expected 3 scores, got {len(scores)}"
        assert all(isinstance(s, float) for s in scores), "All scores should be floats"
        # extraverted responses should score higher than neutral
        assert scores[0] > scores[1], f"Extraverted response should score higher than neutral: {scores}"
        assert scores[2] > scores[1], f"Extraverted response should score higher than neutral: {scores}"
        print("✓ Pointwise mode test passed")
        return True
    except Exception as e:
        print(f"✗ Pointwise mode test failed: {e}")
        import traceback
        traceback.print_exc()
        return False


async def test_listwise():
    """Test listwise reward mode with real API."""
    print("\n=== Testing Listwise Mode (real API) ===")
    os.environ["REWARD_MODE"] = "listwise"

    import importlib
    import on_compute_relative_reward as mod
    importlib.reload(mod)

    valid_results = [{"question": "What are your thoughts on Paris?"}]
    all_answers = [
        {"content": "I'm so excited about Paris! It's amazing and wonderful!"},
        {"content": "Paris is a city in France."},
        {"content": "I absolutely love Paris! The energy is fantastic and vibrant!"},
    ]

    try:
        scores = await mod.on_compute_relative_reward(valid_results, all_answers)
        print(f"Scores: {scores}")
        assert len(scores) == 3, f"Expected 3 scores, got {len(scores)}"
        assert all(isinstance(s, float) for s in scores), "All scores should be floats"
        # neutral response should score lowest
        assert scores[1] < scores[0] or scores[1] < scores[2], \
            f"Neutral response should score lower than at least one extraverted response: {scores}"
        print("✓ Listwise mode test passed")
        return True
    except Exception as e:
        print(f"✗ Listwise mode test failed: {e}")
        import traceback
        traceback.print_exc()
        return False


async def main():
    print("Testing on_compute_relative_reward.py (real API)")
    print("=" * 50)

    results = []
    results.append(await test_pointwise())
    results.append(await test_listwise())

    print("\n" + "=" * 50)
    print(f"Tests passed: {sum(results)}/{len(results)}")
    return all(results)


if __name__ == "__main__":
    success = asyncio.run(main())
    sys.exit(0 if success else 1)
```
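The test script calls `importlib.reload` before each test because `REWARD_MODE` is read from the environment at module import time, and Python caches imported modules. A minimal, self-contained illustration of that pattern (the `demo_cfg` module is a throwaway written to a temp directory purely for this example):

```python
import importlib
import os
import sys
import tempfile

# Write a tiny module that freezes an env var at import time,
# mirroring how REWARD_MODE is captured at module level.
tmp_dir = tempfile.mkdtemp()
with open(os.path.join(tmp_dir, "demo_cfg.py"), "w") as f:
    f.write('import os\nMODE = os.getenv("REWARD_MODE", "pointwise")\n')
sys.path.insert(0, tmp_dir)

os.environ.pop("REWARD_MODE", None)
import demo_cfg
first = demo_cfg.MODE        # default: env var was unset at import time

os.environ["REWARD_MODE"] = "listwise"
stale = demo_cfg.MODE        # still the old value: the module is cached
importlib.reload(demo_cfg)   # re-executes the module body
second = demo_cfg.MODE       # now reflects the updated env var
```

Without the reload, setting the environment variable between tests would have no effect, since the module body only runs once per process.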
Review comment: Excluding the entire `tutorial/example_deep_finance/` directory from `check-yaml` is a bit broad. This could lead to new, valid YAML files in this directory being ignored by the linter in the future. It's better to be more specific and only exclude the file that contains template syntax.
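A more targeted exclusion along the lines the reviewer suggests might look like this sketch of a `.pre-commit-config.yaml` hook entry; the excluded filename and the `rev` pin are illustrative assumptions, not taken from the PR:

```yaml
repos:
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0  # illustrative pin
    hooks:
      - id: check-yaml
        # Exclude only the single templated file rather than the whole
        # tutorial/example_deep_finance/ directory (filename hypothetical).
        exclude: ^tutorial/example_deep_finance/config_template\.yaml$
```

Scoping `exclude` to an anchored regex on one file keeps any new YAML added to the directory under lint coverage.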