Commit d36a43e

Authored by crrow, claude, and jackwener
feat(pixiv): add Pixiv adapter (#403)
* feat(pixiv): add Pixiv adapter with 6 commands

  Add support for Pixiv (pixiv.net) with the following commands:
  - ranking: daily/weekly/monthly illustration rankings
  - search: search illustrations by keyword/tag
  - user: view artist profile info
  - illusts: list illustrations by artist
  - detail: view illustration details (tags, stats)
  - download: download original-quality images

  All commands use COOKIE strategy to reuse Chrome's logged-in session. YAML adapters for simple API fetches (ranking, detail, user), TypeScript for complex logic (search, illusts, download with Referer header).

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(pixiv): add unit tests and E2E auth failure tests

  - search.test.ts: auth error, result parsing, limit, empty results (4 tests)
  - illusts.test.ts: auth error, empty user, two-step fetch, limit (4 tests)
  - download.test.ts: auth error, no images, Referer header, partial failure (4 tests)
  - Add pixiv to vitest adapter project include list
  - Add 5 pixiv commands to E2E browser-auth graceful failure tests

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(pixiv): correct ranking API path and YAML arg naming

  - ranking: use /ranking.php?format=json (not /ajax/ranking which 404s)
  - ranking: fix JSON path from data.body.contents to data.contents
  - user/detail: rename hyphenated args (user-id → uid, illust-id → id) to fix YAML template evaluation (dot access doesn't support hyphens)

  All 6 commands verified working against live Pixiv API.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(pixiv): use JSON.stringify to prevent code injection in page.evaluate

  Address CodeRabbit review: all user inputs (query, userId, illustId, idsParam) passed to page.evaluate are now serialized via JSON.stringify instead of direct string interpolation, preventing code injection in browser context.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(pixiv): address code review feedback

  - ranking.yaml: add | json filter to page/limit args for defense-in-depth
  - user.yaml: guard illusts/manga/novels with typeof check for robustness
  - Extract shared createPageMock to test-utils.ts, deduplicate across 3 test files

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(pixiv): use minimal page mock and add download E2E test

  - test-utils.ts: slim down to minimal mock (goto, evaluate, getCookies) with overrides support, matching upstream's pragmatic mock style
  - Add missing download command to E2E browser-auth graceful failure tests

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(pixiv): address all remaining CodeRabbit review comments

  - detail.yaml: add url to columns to match description mentioning "URLs"
  - All adapters: differentiate HTTP errors: 401/403 → AuthRequiredError, 404 → "not found", others → generic "request failed (HTTP N)"
  - Tests: use beforeAll to cache registry lookup, avoiding repeated reads from global singleton
  - Tests: assert error type (AuthRequiredError) not just message content
  - Tests: add dedicated test cases for non-auth errors (500) and 404

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(pixiv): add adapter docs and indexes

  - Add pixiv.md documentation page under docs/adapters/browser/
  - Update docs/adapters/index.md with pixiv entry
  - Add Pixiv to sidebar in docs/.vitepress/config.mts
  - Update README.md and README.zh-CN.md adapter tables
  - Add pixiv to download support tables in both READMEs

  Completes the documentation checklist for the pixiv adapter PR.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(pixiv): address code review findings

  - Use CommandExecutionError instead of raw Error for HTTP failures
  - Add page.goto() before page.evaluate() to establish browser context
  - Fix search keyword double-encoding in URL construction
  - Fix ranking.yaml using rating_count instead of illust_bookmark_count
  - Throw on batch detail fetch failure instead of silent empty return
  - Add beforeEach mock reset in download tests
  - Add novels column to user.yaml output

  Ensures pixiv adapter follows upstream CliError conventions and handles edge cases correctly before submitting to upstream.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(pixiv): improve download description in READMEs

  - Replace technical Referer header detail with user-facing description
  - Describe what users care about: original quality and multi-page support

  Technical details belong in code comments, not user-facing docs.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* docs(pixiv): expand usage examples with all options

  - Add ranking mode examples including R18 variants
  - Add search filter examples (mode, order, pagination)
  - Organize examples by command category for readability

  Users need to know available options without reading source code.

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(pixiv): address second round of CodeRabbit review comments

  - Validate illust-id is numeric to prevent path traversal
  - Move URL parsing inside per-item try block for graceful error handling
  - Add auth error handling for batch detail request (consistent with step 1)

  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* refactor(pixiv): extract shared pixivFetch helper, add input validation & batch support

  - Create utils.ts with pixivFetch() for unified navigate + fetch + error handling
  - Refactor search.ts, illusts.ts, download.ts to use pixivFetch (DRY)
  - Add user-id/illust-id numeric validation in TS adapters
  - Add batch pagination in illusts.ts for limit > 48 (Pixiv server limit)
  - Add comment explaining Pixiv search API dual keyword requirement
  - Update tests: new invalid-ID test cases, aligned mock format with pixivFetch

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: jackwener <jakevingoo@gmail.com>
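The final entry in the commit message mentions batch pagination in illusts.ts once `limit` exceeds 48. That file is not part of this excerpt, so the following is only a rough sketch of how such batching might look. `chunkIds` and `BATCH_SIZE` are hypothetical names, and the value 48 is taken from the commit message rather than verified against Pixiv.

```typescript
// Hypothetical helper: split a list of illustration IDs into request-sized batches.
// 48 is the per-request ceiling cited in the commit message (an assumption here).
const BATCH_SIZE = 48;

function chunkIds(ids: string[], size: number = BATCH_SIZE): string[][] {
  // Reject sizes that would never advance the loop or would split unevenly.
  if (size < 1 || Math.floor(size) !== size) {
    throw new Error("size must be a positive integer");
  }
  const batches: string[][] = [];
  for (let i = 0; i < ids.length; i += size) {
    batches.push(ids.slice(i, i + size));
  }
  return batches;
}

// Example input: 100 IDs, which cannot fit into a single batch of 48.
const demoIds: string[] = [];
for (let i = 1; i <= 100; i++) {
  demoIds.push(String(i));
}
const demoBatches = chunkIds(demoIds);
```

Each batch could then be sent as one detail request, with the results concatenated in order and trimmed to the requested limit.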
1 parent 4eeed2d commit d36a43e

18 files changed

Lines changed: 911 additions & 0 deletions

README.md

Lines changed: 2 additions & 0 deletions
@@ -174,6 +174,7 @@ Run `opencli list` for the live registry.
 | **medium** | `feed` `search` `user` | Browser |
 | **sinablog** | `hot` `search` `article` `user` | Browser |
 | **substack** | `feed` `search` `publication` | Browser |
+| **pixiv** | `ranking` `search` `user` `illusts` `detail` `download` | Browser |
 | **tiktok** | `explore` `search` `profile` `user` `following` `follow` `unfollow` `like` `unlike` `comment` `save` `unsave` `live` `notifications` `friends` | Browser |

@@ -225,6 +226,7 @@ OpenCLI supports downloading images, videos, and articles from supported platfor
 | **xiaohongshu** | Images, Videos | Downloads all media from a note |
 | **bilibili** | Videos | Requires `yt-dlp` installed |
 | **twitter** | Images, Videos | Downloads from user media tab or single tweet |
+| **pixiv** | Images | Downloads original-quality illustrations, supports multi-page works |
 | **zhihu** | Articles (Markdown) | Exports articles with optional image download |
 | **weixin** | Articles (Markdown) | Exports WeChat Official Account articles |

README.zh-CN.md

Lines changed: 2 additions & 0 deletions
@@ -176,6 +176,7 @@ npm install -g @jackwener/opencli@latest
 | **medium** | `feed` `search` `user` | 浏览器 |
 | **sinablog** | `hot` `search` `article` `user` | 浏览器 |
 | **substack** | `feed` `search` `publication` | 浏览器 |
+| **pixiv** | `ranking` `search` `user` `illusts` `detail` `download` | 浏览器 |
 | **tiktok** | `explore` `search` `profile` `user` `following` `follow` `unfollow` `like` `unlike` `comment` `save` `unsave` `live` `notifications` `friends` | 浏览器 |

@@ -227,6 +228,7 @@ OpenCLI 支持从各平台下载图片、视频和文章。
 | **小红书** | 图片、视频 | 下载笔记中的所有媒体文件 |
 | **B站** | 视频 | 需要安装 `yt-dlp` |
 | **Twitter/X** | 图片、视频 | 从用户媒体页或单条推文下载 |
+| **Pixiv** | 图片 | 下载原始画质插画,支持多页作品 |
 | **知乎** | 文章(Markdown) | 导出文章,可选下载图片到本地 |
 | **微信公众号** | 文章(Markdown) | 导出微信公众号文章为 Markdown |

docs/.vitepress/config.mts

Lines changed: 1 addition & 0 deletions
@@ -73,6 +73,7 @@ export default defineConfig({
           { text: 'Douban', link: '/adapters/browser/douban' },
           { text: 'Sina Blog', link: '/adapters/browser/sinablog' },
           { text: 'Substack', link: '/adapters/browser/substack' },
+          { text: 'Pixiv', link: '/adapters/browser/pixiv' },
         ],
       },
       {

docs/adapters/browser/pixiv.md

Lines changed: 92 additions & 0 deletions
@@ -0,0 +1,92 @@
+# Pixiv
+
+**Mode**: 🔐 Browser · **Domain**: `www.pixiv.net`
+
+## Commands
+
+| Command | Description |
+|---------|-------------|
+| `opencli pixiv ranking` | Daily/weekly/monthly illustration rankings |
+| `opencli pixiv search <query>` | Search illustrations by keyword or tag |
+| `opencli pixiv user <uid>` | View artist profile info |
+| `opencli pixiv illusts <user-id>` | List illustrations by artist |
+| `opencli pixiv detail <id>` | View illustration details |
+| `opencli pixiv download <illust-id>` | Download original-quality images |
+
+## Usage Examples
+
+### Ranking
+
+```bash
+# Daily rankings (default)
+opencli pixiv ranking --limit 10
+
+# Weekly / monthly rankings
+opencli pixiv ranking --mode weekly
+opencli pixiv ranking --mode monthly
+
+# R18 rankings
+opencli pixiv ranking --mode daily_r18
+opencli pixiv ranking --mode weekly_r18
+
+# Other modes: rookie, original, male, female
+opencli pixiv ranking --mode rookie
+```
+
+### Search
+
+```bash
+# Search by keyword or tag
+opencli pixiv search "初音ミク" --limit 20
+
+# Filter by content rating
+opencli pixiv search "風景" --mode safe # Safe-for-work only
+opencli pixiv search "風景" --mode r18 # R18 only
+opencli pixiv search "風景" --mode all # All (default)
+
+# Sort by popularity
+opencli pixiv search "VOCALOID" --order popular_d
+
+# All sort options: date_d (newest), date (oldest), popular_d, popular_male_d, popular_female_d
+
+# Pagination
+opencli pixiv search "オリジナル" --page 2 --limit 30
+```
+
+### User & Illustrations
+
+```bash
+# View artist profile
+opencli pixiv user 11
+
+# List artist's illustrations (newest first)
+opencli pixiv illusts 11 --limit 10
+
+# View illustration details (tags, stats, type)
+opencli pixiv detail 12345678
+```
+
+### Download
+
+```bash
+# Download all images from an illustration
+opencli pixiv download 12345678
+
+# Download to a custom directory
+opencli pixiv download 12345678 --output ./my-images
+```
+
+### Output Formats
+
+```bash
+# JSON output
+opencli pixiv ranking -f json
+
+# Verbose mode
+opencli pixiv search "test" -v
+```
+
+## Prerequisites
+
+- Chrome running and **logged into** pixiv.net
+- [Browser Bridge extension](/guide/browser-bridge) installed

docs/adapters/index.md

Lines changed: 1 addition & 0 deletions
@@ -36,6 +36,7 @@ Run `opencli list` for the live registry.
 | **[medium](/adapters/browser/medium)** | `feed` `search` `user` | 🔐 Browser |
 | **[sinablog](/adapters/browser/sinablog)** | `hot` `search` `article` `user` | 🔐 Browser |
 | **[substack](/adapters/browser/substack)** | `feed` `search` `publication` | 🔐 Browser |
+| **[pixiv](/adapters/browser/pixiv)** | `ranking` `search` `user` `illusts` `detail` `download` | 🔐 Browser |
 | **[tiktok](/adapters/browser/tiktok)** | `explore` `search` `profile` `user` `following` `follow` `unfollow` `like` `unlike` `comment` `save` `unsave` `live` `notifications` `friends` | 🔐 Browser |

 ## Public API Adapters

src/clis/pixiv/detail.yaml

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+site: pixiv
+name: detail
+description: View illustration details (tags, stats, URLs)
+domain: www.pixiv.net
+strategy: cookie
+browser: true
+
+args:
+  id:
+    type: str
+    required: true
+    positional: true
+    description: Illustration ID
+
+pipeline:
+  - navigate: https://www.pixiv.net
+
+  - evaluate: |
+      (async () => {
+        const id = ${{ args.id | json }};
+        const res = await fetch(
+          'https://www.pixiv.net/ajax/illust/' + id,
+          { credentials: 'include' }
+        );
+        if (!res.ok) {
+          if (res.status === 401 || res.status === 403) throw new Error('Authentication required — please log in to Pixiv in Chrome');
+          if (res.status === 404) throw new Error('Illustration not found: ' + id);
+          throw new Error('Pixiv request failed (HTTP ' + res.status + ')');
+        }
+        const data = await res.json();
+        const b = data?.body;
+        if (!b) throw new Error('Illustration not found');
+        return [{
+          illust_id: b.illustId,
+          title: b.illustTitle,
+          author: b.userName,
+          user_id: b.userId,
+          type: b.illustType === 0 ? 'illust' : b.illustType === 1 ? 'manga' : b.illustType === 2 ? 'ugoira' : String(b.illustType),
+          pages: b.pageCount,
+          bookmarks: b.bookmarkCount,
+          likes: b.likeCount,
+          views: b.viewCount,
+          tags: (b.tags?.tags || []).map(t => t.tag).join(', '),
+          created: b.createDate?.split('T')[0] || '',
+          url: 'https://www.pixiv.net/artworks/' + b.illustId
+        }];
+      })()
+
+columns: [illust_id, title, author, type, pages, bookmarks, likes, views, tags, created, url]
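The `${{ args.id | json }}` expression in detail.yaml splices the argument into the evaluated script as a serialized literal. The filter's implementation is not shown here, so this sketch assumes it behaves like `JSON.stringify`, which the commit message names for the TypeScript adapters. `buildDetailScript` and `stealCookies` are hypothetical names used only to show why serialization keeps hostile input inside the string.

```typescript
// Hypothetical builder for a script string that a browser page would evaluate.
function buildDetailScript(illustId: string): string {
  // JSON.stringify emits a quoted, escaped literal, so a quote in the input
  // cannot terminate the string and start new statements.
  const idLiteral = JSON.stringify(illustId);
  return [
    "(async () => {",
    "  const id = " + idLiteral + ";",
    "  const res = await fetch('https://www.pixiv.net/ajax/illust/' + id, { credentials: 'include' });",
    "  return res.status;",
    "})()",
  ].join("\n");
}

// Input crafted to break out of a naively interpolated double-quoted string.
const hostileInput = '1"; stealCookies(); "';
const safeScript = buildDetailScript(hostileInput);
```

With plain string interpolation the same input would yield `const id = "1"; stealCookies(); "";`, which runs the injected call. The serialized form keeps the whole input as a single string value.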

src/clis/pixiv/download.test.ts

Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
+import { beforeAll, beforeEach, describe, expect, it, vi } from 'vitest';
+import type { CliCommand } from '../../registry.js';
+import { getRegistry } from '../../registry.js';
+import { AuthRequiredError, CommandExecutionError } from '../../errors.js';
+import { createPageMock } from './test-utils.js';
+
+// Mock download dependencies before importing the adapter
+const { mockHttpDownload, mockMkdirSync } = vi.hoisted(() => ({
+  mockHttpDownload: vi.fn(),
+  mockMkdirSync: vi.fn(),
+}));
+
+vi.mock('../../download/index.js', () => ({
+  formatCookieHeader: vi.fn().mockReturnValue('cookie=value'),
+  httpDownload: mockHttpDownload,
+}));
+
+vi.mock('node:fs', () => ({
+  mkdirSync: mockMkdirSync,
+}));
+
+// Now import the adapter (after mocks are set up)
+await import('./download.js');
+
+let cmd: CliCommand;
+
+beforeAll(() => {
+  cmd = getRegistry().get('pixiv/download')!;
+  expect(cmd?.func).toBeTypeOf('function');
+});
+
+describe('pixiv download', () => {
+  beforeEach(() => {
+    mockHttpDownload.mockReset();
+    mockMkdirSync.mockReset();
+  });
+
+  it('throws CommandExecutionError on invalid illust ID', async () => {
+    const page = createPageMock([]);
+
+    await expect(cmd.func!(page, { 'illust-id': 'abc', output: '/tmp/test' })).rejects.toThrow(CommandExecutionError);
+  });
+
+  it('throws AuthRequiredError on 403', async () => {
+    const page = createPageMock([{ __httpError: 403 }]);
+
+    await expect(cmd.func!(page, { 'illust-id': '12345', output: '/tmp/test' })).rejects.toThrow(AuthRequiredError);
+  });
+
+  it('throws CommandExecutionError on 404', async () => {
+    const page = createPageMock([{ __httpError: 404 }]);
+
+    await expect(cmd.func!(page, { 'illust-id': '12345', output: '/tmp/test' })).rejects.toThrow(CommandExecutionError);
+  });
+
+  it('throws CommandExecutionError on non-auth HTTP failure', async () => {
+    const page = createPageMock([{ __httpError: 500 }]);
+
+    await expect(cmd.func!(page, { 'illust-id': '12345', output: '/tmp/test' })).rejects.toThrow(CommandExecutionError);
+  });
+
+  it('returns failure when no images found', async () => {
+    const page = createPageMock([{ body: [] }]);
+
+    const result = (await cmd.func!(page, { 'illust-id': '12345', output: '/tmp/test' })) as any[];
+    expect(result).toEqual([{ index: 0, type: '-', status: 'failed', size: 'No images found' }]);
+  });
+
+  it('downloads images with Referer header', async () => {
+    mockHttpDownload.mockResolvedValue({ success: true, size: 1024000 });
+
+    const page = createPageMock([
+      {
+        body: [
+          { urls: { original: 'https://i.pximg.net/img-original/img/2025/01/01/00/00/00/12345_p0.png' } },
+          { urls: { original: 'https://i.pximg.net/img-original/img/2025/01/01/00/00/00/12345_p1.jpg' } },
+        ],
+      },
+    ]);
+
+    const result = (await cmd.func!(page, { 'illust-id': '12345', output: '/tmp/test' })) as any[];
+
+    expect(result).toHaveLength(2);
+    expect(result[0]).toMatchObject({ index: 1, type: 'image', status: 'success' });
+    expect(result[1]).toMatchObject({ index: 2, type: 'image', status: 'success' });
+
+    // Verify Referer header was passed
+    expect(mockHttpDownload).toHaveBeenCalledTimes(2);
+    const firstCallOpts = mockHttpDownload.mock.calls[0][2];
+    expect(firstCallOpts.headers).toEqual({ Referer: 'https://www.pixiv.net/' });
+  });
+
+  it('handles individual download failures gracefully', async () => {
+    mockHttpDownload
+      .mockResolvedValueOnce({ success: true, size: 512000 })
+      .mockRejectedValueOnce(new Error('Connection timeout'));
+
+    const page = createPageMock([
+      {
+        body: [
+          { urls: { original: 'https://i.pximg.net/img/12345_p0.png' } },
+          { urls: { original: 'https://i.pximg.net/img/12345_p1.png' } },
+        ],
+      },
+    ]);
+
+    const result = (await cmd.func!(page, { 'illust-id': '12345', output: '/tmp/test' })) as any[];
+
+    expect(result).toHaveLength(2);
+    expect(result[0].status).toBe('success');
+    expect(result[1].status).toBe('failed');
+    expect(result[1].size).toBe('Connection timeout');
+  });
+});
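The test cases above imply a status-to-error mapping: 403 surfaces as `AuthRequiredError`, while 404 and 500 surface as `CommandExecutionError`. The shared helper in utils.ts that performs this mapping is not included in this excerpt, so the sketch below only illustrates the classification. It returns a tagged value instead of throwing the project's error classes, and `classifyHttpError`, `PixivErrorKind`, and the auth message text are hypothetical.

```typescript
// Hypothetical classification of a failed Pixiv response by HTTP status.
type PixivErrorKind = "auth" | "not_found" | "failed";

interface ClassifiedError {
  kind: PixivErrorKind;
  message: string;
}

function classifyHttpError(status: number, notFoundMsg: string): ClassifiedError {
  // 401 and 403 are both treated as "log in first", per the commit message.
  if (status === 401 || status === 403) {
    return { kind: "auth", message: "Authentication required" };
  }
  // 404 carries a caller-supplied message so it can name the missing resource.
  if (status === 404) {
    return { kind: "not_found", message: notFoundMsg };
  }
  // Everything else falls through to a generic failure that keeps the status code.
  return { kind: "failed", message: "Pixiv request failed (HTTP " + status + ")" };
}

const forbidden = classifyHttpError(403, "Illustration not found: 12345");
const missing = classifyHttpError(404, "Illustration not found: 12345");
const serverError = classifyHttpError(500, "Illustration not found: 12345");
```

A real helper would presumably turn the `auth` kind into `AuthRequiredError` and the other two into `CommandExecutionError`, which matches what the tests assert.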

src/clis/pixiv/download.ts

Lines changed: 91 additions & 0 deletions
@@ -0,0 +1,91 @@
+/**
+ * Pixiv download — download all images from an illustration.
+ *
+ * Pixiv's CDN (i.pximg.net) requires Referer: https://www.pixiv.net/ header.
+ * Uses the /ajax/illust/{id}/pages API to get original-quality image URLs.
+ */
+
+import * as fs from 'node:fs';
+import * as path from 'node:path';
+import { cli, Strategy } from '../../registry.js';
+import { formatCookieHeader, httpDownload } from '../../download/index.js';
+import { formatBytes } from '../../download/progress.js';
+import { CommandExecutionError } from '../../errors.js';
+import { pixivFetch } from './utils.js';
+
+cli({
+  site: 'pixiv',
+  name: 'download',
+  description: 'Download illustration images from Pixiv',
+  domain: 'www.pixiv.net',
+  strategy: Strategy.COOKIE,
+  args: [
+    { name: 'illust-id', positional: true, required: true, help: 'Illustration ID' },
+    { name: 'output', default: './pixiv-downloads', help: 'Output directory' },
+  ],
+  columns: ['index', 'type', 'status', 'size'],
+
+  func: async (page, kwargs) => {
+    const illustId = String(kwargs['illust-id'] ?? '');
+    const output = String(kwargs.output ?? './pixiv-downloads');
+
+    if (!/^\d+$/.test(illustId)) {
+      throw new CommandExecutionError(`Invalid illustration ID: ${illustId}`);
+    }
+
+    // pixivFetch handles navigate + error checking; returns the response body directly
+    const pages: any[] = await pixivFetch(page, `/ajax/illust/${illustId}/pages`, {
+      notFoundMsg: `Illustration not found: ${illustId}`,
+    }) || [];
+
+    if (pages.length === 0) {
+      return [{ index: 0, type: '-', status: 'failed', size: 'No images found' }];
+    }
+
+    // Extract cookies for authenticated downloads
+    const cookies = formatCookieHeader(await page.getCookies({ domain: 'pixiv.net' }));
+
+    // Create output directory
+    const outputDir = path.join(output, illustId);
+    fs.mkdirSync(outputDir, { recursive: true });
+
+    const results = [];
+
+    for (let i = 0; i < pages.length; i++) {
+      const p = pages[i];
+      const url = p.urls?.original || p.urls?.regular || '';
+      if (!url) {
+        results.push({ index: i + 1, type: 'image', status: 'failed', size: 'No URL' });
+        continue;
+      }
+
+      try {
+        const ext = path.extname(new URL(url).pathname) || '.jpg';
+        const filename = `${illustId}_p${i}${ext}`;
+        const destPath = path.join(outputDir, filename);
+
+        const result = await httpDownload(url, destPath, {
+          cookies,
+          headers: { Referer: 'https://www.pixiv.net/' },
+          timeout: 60000,
+        });
+
+        results.push({
+          index: i + 1,
+          type: 'image',
+          status: result.success ? 'success' : 'failed',
+          size: result.success ? formatBytes(result.size) : (result.error || 'unknown error'),
+        });
+      } catch (err: any) {
+        results.push({
+          index: i + 1,
+          type: 'image',
+          status: 'failed',
+          size: err.message,
+        });
+      }
+    }
+
+    return results;
+  },
+});
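Two small pieces of download.ts carry most of its safety and naming logic: the numeric ID check, which keeps `illustId` from contributing path segments, and the per-page filename derived from the image URL. The sketch below isolates both so they can be tried on their own. `isNumericId` and `pageFilename` are hypothetical names, and the extension is extracted with string operations rather than `path.extname` to keep the example dependency-free, so edge cases may differ slightly from the adapter.

```typescript
// Hypothetical standalone versions of two steps from download.ts.

// Accept only digit-only IDs, so values like "../etc" never reach a file path.
function isNumericId(id: string): boolean {
  return /^\d+$/.test(id);
}

// Build "<illustId>_p<pageIndex><ext>", falling back to ".jpg" when the URL has no extension.
function pageFilename(illustId: string, pageIndex: number, url: string): string {
  const pathname = new URL(url).pathname;
  const base = pathname.slice(pathname.lastIndexOf("/") + 1);
  const dot = base.lastIndexOf(".");
  // A dot at position 0 marks a dotfile rather than an extension.
  const ext = dot > 0 ? base.slice(dot) : ".jpg";
  return illustId + "_p" + pageIndex + ext;
}

const pngName = pageFilename(
  "12345",
  0,
  "https://i.pximg.net/img-original/img/2025/01/01/00/00/00/12345_p0.png"
);
const fallbackName = pageFilename("12345", 1, "https://i.pximg.net/img/12345_p1");
```

Page indices in filenames start at 0, mirroring the adapter's `_p${i}` pattern, while the `index` column it reports starts at 1.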
