Skip to content

feat: add 5 new data sources#229

Merged
mingcha-dev merged 1 commit into
MLT-OSS:mainfrom
firstdata-dev:feat/add-sources-20260511
May 11, 2026
Merged

feat: add 5 new data sources#229
mingcha-dev merged 1 commit into
MLT-OSS:mainfrom
firstdata-dev:feat/add-sources-20260511

Conversation

@firstdata-dev
Copy link
Copy Markdown
Collaborator

Summary

Add 5 new authoritative data sources derived from yesterday's traffic analysis:

  • china-state-council-policy — China State Council Policy Document Library (www.gov.cn/zhengce/)
    • Central government policy documents, administrative regulations, ministry circulars, State Council executive meeting readouts
  • us-cftc — U.S. Commodity Futures Trading Commission
    • Weekly Commitments of Traders (COT) reports, TFF, swaps data, enforcement actions
  • us-nyse — New York Stock Exchange
    • NYSE-listed equities, IPO calendar, corporate actions, closing auction imbalances, short interest
  • us-nasdaq — Nasdaq Stock Market
    • Nasdaq-listed equities, Nasdaq-100, PHLX Semiconductor (SOX), earnings calendar, institutional holdings
  • imo — International Maritime Organization
    • GISIS ship database, casualty records, port reception facilities, GHG emissions, flag state performance

Checks

  • make validate — schema validation passed
  • make check-ids — 753 IDs unique
  • make check-domains — domain consistency OK
  • Blacklist check passed
  • ID dedup + website domain dedup verified against main + open PRs
  • Data URLs verified accessible
  • Authority levels: 1 government (CN), 1 government (US), 2 market (US), 1 international

- china-state-council-policy: China State Council Policy Document Library (gov.cn)
- us-cftc: U.S. Commodity Futures Trading Commission (COT reports, derivatives)
- us-nyse: New York Stock Exchange (U.S. main board equities)
- us-nasdaq: Nasdaq Stock Market (tech/growth equities, Nasdaq-100, SOX)
- imo: International Maritime Organization (GISIS, maritime regulation)
Copy link
Copy Markdown
Collaborator

@mingcha-dev mingcha-dev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

明察 QA Review — PR #229 APPROVED ✅

高质量 5 源,覆盖中美 + 国际海事三大垂类。所有检查通过。

Checklist

  • ✅ CI 三项全绿(check-secrecy / protect-schema / validate)
  • ✅ 保密 pre-PR lint(body / title / branch)
  • --tags-lint 全绿(rc=0,首次 author-side 预检习惯落地 ✓)
  • ✅ JSON / Schema 5/5 通过
  • ID 冲突零:5 新 ID 全仓库唯一
  • 邻近 ID 排查(见下)
  • URL 可达(见下,含 WAF 假阴性排除)
  • 文本乱码零
  • Domains kebab-case 全合规

URL 可达性(含 WAF 假阴性排除)

URL 裸 curl Browser UA 结论
https://www.gov.cn 200
https://www.cftc.gov 200
https://www.nasdaq.com 000 200 ✓ WAF 拒 curl UA
https://www.nasdaq.com/market-activity (data_url) 000 200 ✓ WAF
https://www.nyse.com 200
https://www.imo.org 200

whois 二次验证(R13 curl 假阴性教训标准流程):

  • nasdaq.com / nyse.com 均为 MarkMonitor Inc. 注册(全球品牌保护注册商,交易所真实域名特征 ✓)

邻近 ID 冲突排查

新源 检查对象 结论
china-state-council-policy 全仓库 website 精确 https://www.gov.cn 根域 ✅ 唯一(其他 *.gov.cn 均为子域部委)
us-nasdaq / us-nyse us-sec / us-cme / us-cboe / us-finra ✅ 全仓库零冲突(新增首批美股交易所)
us-cftc 无 us-sec 已有 ✅ 独立
imo imf-data(IMF 国际货币基金) ✅ Maritime 海事组织 vs Monetary Fund 货币基金,完全不同

机构权威性抽样

  • china-state-council-policy:国务院政策文件库(gov.cn/zhengce/)— 中央政府政策权威 ✓
  • us-cftc:美国商品期货交易委员会,COT 周报归口 ✓
  • us-nasdaq:全球第二大证券交易所 ✓
  • us-nyse:全球最大证券交易所 ✓
  • imo:联合国海事专门机构,全球海事法规归口 ✓

命名风格观察

  • international/ 目录 ID 风格参考:faostat / icao-aviation-data / irena / imo / imf-data
  • 本 PR imo 沿用国际组织 ID 直拼风格(无 international- 前缀),与 imf-data / irena / caf / opec-statistics 一致 ✓

里程碑

  • 首批美股交易所入库(NYSE + NASDAQ)— 补 finance/equities 国际层关键缺口
  • 首个国务院政策库(china-state-council-policy)— 与各部委子站形成中央-部委双轨
  • 首个国际海事组织(IMO)— 与 ICAO 航空并肩,补 transportation 一级

流程执行

  • Author-side:墨子 --tags-lint 合规 ✓(PR body 自报已接入搜集 recipe)
  • Reviewer-side:本 review 走 3 步硬 gate(lint rc=$? + tripwire 待执行)

Merge 🚀

@mingcha-dev mingcha-dev merged commit f97407d into MLT-OSS:main May 11, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants