[BUG] write tool produces LF-only .bat/.cmd files on Windows and doesn't handle non-UTF-8 code pages

## Description

On Windows, the `write` tool produces `.bat` / `.cmd` files that fail in two ways:

1. **Line endings**: LF-only (`\n`) instead of CRLF (`\r\n`). cmd.exe expects CRLF; with LF-only files the script exits immediately with no error.

2. **Code page**: The file contains UTF-8 encoded non-ASCII text, but cmd.exe interprets the bytes using the console's active code page (e.g., 936 GBK for zh-CN, 932 Shift-JIS for ja-JP, 949 for ko-KR, 866 for ru-RU, 437 or 850 for English and Western European). Result: non-ASCII characters display as garbage.

The line ending issue affects every Windows user. The code page issue affects any system whose active console code page is not UTF-8 (65001), regardless of language.

## Root Cause

### Line endings

`write.ts` passes AI-generated content directly to `fs.writeWithDirs()` → `fs.writeFileString()` with no line ending normalization. The AI model generates LF (`\n`) by default.

`edit.ts` already handles this correctly at `packages/opencode/src/tool/edit.ts:22-33`:
```ts
function normalizeLineEndings(text: string): string {
  return text.replaceAll("\r\n", "\n")
}
function detectLineEnding(text: string): "\n" | "\r\n" {
  return text.includes("\r\n") ? "\r\n" : "\n"
}
function convertToLineEnding(text: string, ending: "\n" | "\r\n"): string {
  if (ending === "\n") return text
  return text.replaceAll("\n", "\r\n")
}
```
But these are local to `edit.ts` and are not used in `write.ts`.

### Encoding

`write.ts` (line 47) reads existing files with `TextDecoder("utf-8", { ignoreBOM: true })`. `writeFileString()` on Node.js writes UTF-8. This is correct — the write tool always handles UTF-8 properly.

The problem is that cmd.exe on Windows defaults to the active code page (e.g., code page 936 for Chinese, 932 for Japanese). A `.bat` file written as UTF-8 will have its non-ASCII bytes misinterpreted. The fix is to instruct cmd.exe to switch to UTF-8 with `chcp 65001 >nul` as the second line, but the write tool has no mechanism to ensure this.

Global scope: this is NOT a Chinese-specific issue. Windows consoles default to a non-UTF-8 code page in virtually every locale. On Japanese Windows (932), Korean Windows (949), Russian Windows (866), etc., UTF-8 `.bat` files without `chcp 65001` will all show garbled non-ASCII text.
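To make the misinterpretation concrete, here is a minimal sketch of what happens when UTF-8 bytes are read one byte per character. It uses Latin-1 as the stand-in single-byte code page only because that mapping needs no lookup table; real console code pages map the same bytes to different glyphs, but the effect is the same. `decodeAsSingleByte` is a hypothetical helper, not part of opencode.

```typescript
// Illustration only: treat every byte as one character (Latin-1 mapping).
// decodeAsSingleByte is a hypothetical helper for this example.
function decodeAsSingleByte(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("")
}

// "é" (U+00E9) is encoded in UTF-8 as two bytes: 0xC3 0xA9.
const utf8Bytes = new TextEncoder().encode("é")

// Read back byte by byte, the two bytes become two unrelated characters.
const misread = decodeAsSingleByte(utf8Bytes) // "Ã©"
```

One character on disk turns into two wrong characters on screen, which is the garbling users see in the console.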

## Proposed Fix

### Fix 1: Line endings

Move `normalizeLineEndings`, `detectLineEnding`, and `convertToLineEnding` from `edit.ts` into a shared utility (e.g., `packages/opencode/src/util/line-endings.ts`).

In `write.ts`, between reading the existing file (line 47) and writing (line 64), add:

```ts
// Preserve existing line endings; for new .bat/.cmd on Windows, use CRLF
if (exists) {
  const ending = detectLineEnding(contentOld)
  contentNew = convertToLineEnding(normalizeLineEndings(contentNew), ending)
} else if (process.platform === "win32" && /\.(bat|cmd)$/i.test(filepath)) {
  contentNew = convertToLineEnding(normalizeLineEndings(contentNew), "\r\n")
}
```

Logic:
- **Existing files**: detect and preserve the original file's line ending style. If the file uses CRLF, the new content will use CRLF too. This matches `edit.ts` behavior.
- **New `.bat`/`.cmd` files on Windows**: default to CRLF (Windows batch files require CRLF to avoid crashes).
- **All other new files**: no transformation (keep LF, which is the cross-platform git standard).
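The three rules above can be sketched as one pure function, which also makes them easy to unit test without touching the filesystem. This is a sketch under assumptions: `resolveLineEndings` and its parameters are hypothetical names, the helpers are rewritten with regex `replace` so the block stands alone, and the platform is passed in rather than read from `process.platform`.

```typescript
type LineEnding = "\n" | "\r\n"

// Equivalent to the helpers in edit.ts, repeated so the sketch is self-contained.
function normalizeLineEndings(text: string): string {
  return text.replace(/\r\n/g, "\n")
}
function detectLineEnding(text: string): LineEnding {
  return text.includes("\r\n") ? "\r\n" : "\n"
}
function convertToLineEnding(text: string, ending: LineEnding): string {
  return ending === "\n" ? text : text.replace(/\n/g, "\r\n")
}

// Hypothetical wrapper. `existing` is the old file content, or undefined for a new file.
function resolveLineEndings(
  contentNew: string,
  existing: string | undefined,
  filepath: string,
  platform: string,
): string {
  if (existing !== undefined) {
    // Existing file: keep whatever style it already uses.
    return convertToLineEnding(normalizeLineEndings(contentNew), detectLineEnding(existing))
  }
  if (platform === "win32" && /\.(bat|cmd)$/i.test(filepath)) {
    // New batch file on Windows: force CRLF.
    return convertToLineEnding(normalizeLineEndings(contentNew), "\r\n")
  }
  // Any other new file: leave the content untouched.
  return contentNew
}

const newBat = resolveLineEndings("a\nb\n", undefined, "run.bat", "win32")
const newSh = resolveLineEndings("a\nb\n", undefined, "run.sh", "win32")
const overCrlf = resolveLineEndings("a\nb\n", "x\r\ny\r\n", "notes.txt", "linux")
const overLf = resolveLineEndings("a\r\nb\r\n", "x\ny\n", "notes.txt", "win32")
```

Note that the existing-file branch is not platform-gated in this sketch, mirroring the snippet above: an existing CRLF file keeps CRLF on any OS.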

### Fix 2: Code page for `.bat`/`.cmd` on Windows

When writing `.bat`/`.cmd` files on Windows that contain non-ASCII characters, automatically insert `chcp 65001 >nul` as the second line when the file starts with `@echo off`/`@echo on`, or as the first line otherwise. Skip the insertion if the file already has a `chcp` command or the user explicitly opted out.

Implementation in `write.ts`:

```ts
// The caller must check the platform and file extension;
// this function only inspects the content.
function ensureChcpUtf8(content: string): string {
  const hasNonAscii = /[\x80-\uFFFF]/.test(content)
  if (!hasNonAscii) return content

  const lines = content.split(/\r?\n/)
  // Don't inject if any line already starts with a chcp command
  // (with or without a leading @)
  if (lines.some(l => /^\s*@?chcp\s+\d+/i.test(l))) return content

  // If the first line is @echo off/on, insert as line 2
  // Otherwise insert at the beginning (becomes line 1)
  if (lines.length > 0 && /^@echo\s+(off|on)\b/i.test(lines[0].trim())) {
    lines.splice(1, 0, "chcp 65001 >nul")
  } else {
    lines.unshift("chcp 65001 >nul")
  }
  return lines.join("\n")  // the caller will handle CRLF via Fix 1
}
```

Then, for `.bat`/`.cmd` targets on Windows, call `contentNew = ensureChcpUtf8(contentNew)` before the Fix 1 line ending conversion and before writing.

**Why this is safe:**
- Only fires for `.bat`/`.cmd` on Windows with non-ASCII text
- Skips if ANY `chcp` is already present (no double injection)
- Insertion respects `@echo off` positioning (goes on line 2, not before it)
- Line ending normalization (Fix 1) runs after this, so CRLF is still applied

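**Caveat**: the check looks for `chcp` at the start of any line, so a `chcp` inside a branch or subroutine that never runs would still suppress injection, while a `chcp` that appears only in a `rem` comment would not. This is a minor edge case that can be refined.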
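As a worked example of how Fix 2 and Fix 1 compose for a new batch file, the sketch below runs both steps in order and shows the resulting content. `prepareBatchContent` is a hypothetical name, and the sketch assumes the caller has already confirmed that the target is a new `.bat`/`.cmd` file on Windows. It also treats `@chcp` as an existing `chcp` command, which is an assumption on top of the snippet above.

```typescript
// Hypothetical end-to-end helper for NEW .bat/.cmd content on Windows.
// Platform and extension checks are assumed to happen in the caller.
function prepareBatchContent(content: string): string {
  // Work on LF-normalized lines so existing CRLF is not doubled.
  const lines = content.replace(/\r\n/g, "\n").split("\n")

  const hasNonAscii = /[^\x00-\x7F]/.test(content)
  const hasChcp = lines.some((l) => /^\s*@?chcp\s+\d+/i.test(l))

  if (hasNonAscii && !hasChcp) {
    // After a leading "@echo off"/"@echo on" if present, else at the top.
    const first = (lines[0] ?? "").trim()
    const at = /^@echo\s+(off|on)\b/i.test(first) ? 1 : 0
    lines.splice(at, 0, "chcp 65001 >nul")
  }

  // Fix 1: new batch files are written with CRLF.
  return lines.join("\r\n")
}

const withEcho = prepareBatchContent("@echo off\necho 你好\n")
// "@echo off\r\nchcp 65001 >nul\r\necho 你好\r\n"

const asciiOnly = prepareBatchContent("echo hi\n")
// "echo hi\r\n" (no chcp line, only the CRLF conversion)
```

Running the injection before the CRLF conversion means the inserted line gets the same line ending as the rest of the file.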

### Why not just use UTF-8 BOM?

Adding a UTF-8 BOM (byte order mark) to `.bat` files would also tell cmd.exe to interpret them as UTF-8 on recent Windows 10/11. However:
- On older Windows, cmd.exe does not strip the BOM: the bytes are treated as part of the first command, so the `@echo off` line is not recognized and fails
- BOM before `@echo off` violates the well-known Windows batch file convention
- BOM is an invisible character that confuses users and tools

So `chcp 65001 >nul` is the safer, more compatible approach.

## Testing

1. On any non-English Windows (zh-CN, ja-JP, ko-KR, ru-RU, etc.):
   - Write a `.bat` file with non-ASCII characters → should run without crash, Chinese/Japanese/etc. should display correctly
2. On English Windows:
   - Same test → should work (`chcp 65001` is harmless, and a no-op when the console is already on code page 65001)
3. On Unix:
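   - No behavior change for new files (the `.bat`/`.cmd` logic is guarded by `process.platform === "win32"`); overwriting an existing file preserves its line endings, as in tests 4 and 5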
4. Editing an existing CRLF file via write:
   - CRLF should be preserved (same as edit.ts behavior)
5. Editing an existing LF file via write:
   - LF should be preserved (no unwanted conversion)
