Merged
3 changes: 3 additions & 0 deletions astrbot/core/config/astrbot_config.py
@@ -52,6 +52,9 @@ def __init__(

     with open(config_path, encoding="utf-8-sig") as f:
         conf_str = f.read()
+        # Handle UTF-8 BOM if present
+        if conf_str.startswith('\ufeff'):
+            conf_str = conf_str[1:]
Comment on lines +55 to +57
Contributor

medium

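In Python, when a file is opened with encoding="utf-8-sig", the decoder automatically detects and strips the UTF-8 BOM (byte order mark). The string returned by conf_str = f.read() therefore no longer contains the \ufeff character. The manual startswith check and slice are redundant; I recommend removing these three lines to keep the code clean.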
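The decoder behavior described above can be checked with a short script. This is only a minimal sketch, not code from this PR: the helper `read_with_encoding` and the sample payload are hypothetical names introduced for illustration.

```python
import json
import os
import tempfile

BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark as raw bytes


def read_with_encoding(raw: bytes, encoding: str) -> str:
    """Write raw bytes to a temp file, then read it back in text mode.

    Hypothetical helper for this illustration only.
    """
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(raw)
        path = tmp.name
    try:
        with open(path, encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


payload = BOM + b'{"key": "value"}'

text_sig = read_with_encoding(payload, "utf-8-sig")  # BOM stripped by the decoder
text_plain = read_with_encoding(payload, "utf-8")    # BOM kept as U+FEFF

print(text_sig.startswith("\ufeff"))
print(text_plain.startswith("\ufeff"))
print(json.loads(text_sig))
```

If the first check reports no leading U+FEFF on a given interpreter, the manual slice in the diff has nothing to remove on that interpreter.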

Contributor Author

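@sourcery-aibot Thanks for the suggestion! In theory, utf-8-sig should indeed handle the BOM automatically.

However, in actual testing I found:

  • JSON files saved with Windows Notepad carry a BOM
  • With encoding="utf-8-sig", loading still fails with JSONDecodeError: Unexpected UTF-8 BOM
  • After adding the manual check, the problem is resolved

This may be an issue with a specific Python version or environment. Keeping this defensive code improves robustness, so I suggest keeping it.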
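For reference, the error message quoted above is the one `json.loads` raises when its input string starts with U+FEFF. The sketch below shows two ways that can happen; both are assumptions offered for illustration, not causes confirmed anywhere in this thread: the bytes being decoded as plain "utf-8" somewhere along the way, or a file that begins with two BOMs, of which "utf-8-sig" strips only the first. `try_parse` is a hypothetical helper.

```python
import json

BOM = b"\xef\xbb\xbf"  # UTF-8 byte order mark as raw bytes
BODY = b'{"key": "value"}'


def try_parse(raw: bytes, encoding: str) -> str:
    """Decode raw bytes, parse as JSON, and return "ok" or the error message.

    Hypothetical helper for this illustration only.
    """
    try:
        json.loads(raw.decode(encoding))
        return "ok"
    except json.JSONDecodeError as exc:
        return exc.msg


single_sig = try_parse(BOM + BODY, "utf-8-sig")        # one BOM, stripped
single_plain = try_parse(BOM + BODY, "utf-8")          # one BOM, kept
double_sig = try_parse(BOM + BOM + BODY, "utf-8-sig")  # second BOM survives

print(single_sig)
print(single_plain)
print(double_sig)
```

In the double-BOM case, the manual `startswith('\ufeff')` check in the diff would remove the leftover character, which is one scenario in which the extra lines would have an effect.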

conf = json.loads(conf_str)
Comment on lines 53 to 58
Contributor

suggestion: The manual BOM stripping is redundant when using encoding="utf-8-sig".

Using encoding="utf-8-sig" already strips any leading UTF-8 BOM, so this startswith('\ufeff') check and slice never do anything in normal usage. Unless you have a documented need to handle some nonstandard BOM placement, you can safely remove this logic to keep the code simpler.

Suggested change
- with open(config_path, encoding="utf-8-sig") as f:
-     conf_str = f.read()
-     # Handle UTF-8 BOM if present
-     if conf_str.startswith('\ufeff'):
-         conf_str = conf_str[1:]
-     conf = json.loads(conf_str)
+ with open(config_path, encoding="utf-8-sig") as f:
+     conf_str = f.read()
+     conf = json.loads(conf_str)


# Check config integrity, and insert