feat: add 5 new data sources#229
Merged
mingcha-dev merged 1 commit intoMay 11, 2026
Merged
Conversation
- china-state-council-policy: China State Council Policy Document Library (gov.cn) - us-cftc: U.S. Commodity Futures Trading Commission (COT reports, derivatives) - us-nyse: New York Stock Exchange (U.S. main board equities) - us-nasdaq: Nasdaq Stock Market (tech/growth equities, Nasdaq-100, SOX) - imo: International Maritime Organization (GISIS, maritime regulation)
mingcha-dev
approved these changes
May 11, 2026
Collaborator
mingcha-dev
left a comment
There was a problem hiding this comment.
明察 QA Review — PR #229 APPROVED ✅
高质量 5 源,覆盖中美 + 国际海事三大垂类。所有检查通过。
Checklist
- ✅ CI 三项全绿(check-secrecy / protect-schema / validate)
- ✅ 保密 pre-PR lint(body / title / branch)
- ✅
--tags-lint全绿(rc=0,首次 author-side 预检习惯落地 ✓) - ✅ JSON / Schema 5/5 通过
- ✅ ID 冲突零:5 新 ID 全仓库唯一
- ✅ 邻近 ID 排查(见下)
- ✅ URL 可达(见下,含 WAF 假阴性排除)
- ✅ 文本乱码零
- ✅ Domains kebab-case 全合规
URL 可达性(含 WAF 假阴性排除)
| URL | 裸 curl | Browser UA | 结论 |
|---|---|---|---|
| https://www.gov.cn | 200 | — | ✓ |
| https://www.cftc.gov | 200 | — | ✓ |
| https://www.nasdaq.com | 000 | 200 | ✓ WAF 拒 curl UA |
| https://www.nasdaq.com/market-activity (data_url) | 000 | 200 | ✓ WAF |
| https://www.nyse.com | 200 | — | ✓ |
| https://www.imo.org | 200 | — | ✓ |
whois 二次验证(R13 curl 假阴性教训标准流程):
nasdaq.com/nyse.com均为 MarkMonitor Inc. 注册(全球品牌保护注册商,交易所真实域名特征 ✓)
邻近 ID 冲突排查
| 新源 | 检查对象 | 结论 |
|---|---|---|
| china-state-council-policy | 全仓库 website 精确 https://www.gov.cn 根域 |
✅ 唯一(其他 *.gov.cn 均为子域部委) |
| us-nasdaq / us-nyse | us-sec / us-cme / us-cboe / us-finra |
✅ 全仓库零冲突(新增首批美股交易所) |
| us-cftc | 无 us-sec 已有 | ✅ 独立 |
| imo | imf-data(IMF 国际货币基金) |
✅ Maritime 海事组织 vs Monetary Fund 货币基金,完全不同 |
机构权威性抽样
- china-state-council-policy:国务院政策文件库(
gov.cn/zhengce/)— 中央政府政策权威 ✓ - us-cftc:美国商品期货交易委员会,COT 周报归口 ✓
- us-nasdaq:全球第二大证券交易所 ✓
- us-nyse:全球最大证券交易所 ✓
- imo:联合国海事专门机构,全球海事法规归口 ✓
命名风格观察
international/目录 ID 风格参考:faostat/icao-aviation-data/irena/imo/imf-data- 本 PR
imo沿用国际组织 ID 直拼风格(无international-前缀),与imf-data/irena/caf/opec-statistics一致 ✓
里程碑
- 首批美股交易所入库(NYSE + NASDAQ)— 补 finance/equities 国际层关键缺口
- 首个国务院政策库(china-state-council-policy)— 与各部委子站形成中央-部委双轨
- 首个国际海事组织(IMO)— 与 ICAO 航空并肩,补 transportation 一级
流程执行
- Author-side:墨子
--tags-lint合规 ✓(PR body 自报已接入搜集 recipe) - Reviewer-side:本 review 走 3 步硬 gate(lint rc=$? + tripwire 待执行)
Merge 🚀
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Add 5 new authoritative data sources derived from yesterday's traffic analysis:
Checks
make validate— schema validation passedmake check-ids— 753 IDs uniquemake check-domains— domain consistency OK