46 changes: 27 additions & 19 deletions README.md
@@ -26,39 +26,45 @@ This does require a paid ChatGPT account.

#### GUI Application

If you're on **macOS**, you can download the GUI app from the [GitHub releases](https://github.com/RayBytes/ChatMock/releases).

> **Note:** Since ChatMock isn't signed with an Apple Developer ID, you may need to run the following command in your terminal to open the app:
>
> ```bash
> xattr -dr com.apple.quarantine /Applications/ChatMock.app
> ```
>
> _[More info here.](https://github.com/deskflow/deskflow/wiki/Running-on-macOS)_

#### Command Line (Homebrew)

You can also install ChatMock as a command-line tool using [Homebrew](https://brew.sh/):

```
brew tap RayBytes/chatmock
brew install chatmock
```

### Python

If you'd rather run ChatMock as a plain Python Flask server, you can do that too.

Clone or download this repository, then `cd` into the project directory and follow the instructions below.

1. Sign in with your ChatGPT account and follow the prompts

```bash
python chatmock.py login
```

You can make sure this worked by running `python chatmock.py info`

2. After the login completes successfully, start the local server

```bash
python chatmock.py serve
```

Then use the address and port as the baseURL wherever you need it (http://127.0.0.1:8000 by default).

**Reminder:** When setting a baseURL in other applications, make sure you include /v1/ at the end of the URL if you're using this as an OpenAI-compatible endpoint (e.g. http://127.0.0.1:8000/v1)
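
The reminder above can be sketched as a small helper that appends `/v1` when it is missing. The function name `as_openai_base_url` is made up for illustration and is not part of ChatMock.

```python
def as_openai_base_url(address: str) -> str:
    """Return `address` with exactly one trailing /v1 path segment."""
    # Drop trailing slashes so "http://host:8000/" and "http://host:8000" behave the same.
    trimmed = address.rstrip("/")
    if not trimmed.endswith("/v1"):
        trimmed += "/v1"
    return trimmed


print(as_openai_base_url("http://127.0.0.1:8000"))
```

Pass the result as the base URL of whichever OpenAI-compatible client you use.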
@@ -69,7 +75,7 @@ Read [the docker instrunctions here](https://github.com/RayBytes/ChatMock/blob/m

# Examples

### Python

```python
from openai import OpenAI
@@ -101,23 +107,26 @@ curl http://127.0.0.1:8000/v1/chat/completions \

# What's supported

- Tool/Function calling
- Vision/Image understanding
- Thinking summaries (through thinking tags)
- Thinking effort

## Notes & Limits

- Requires an active, paid ChatGPT account.
- Some context length might be taken up by internal instructions (but they don't seem to degrade the model)
- Use responsibly and at your own risk. This project is not affiliated with OpenAI, and is an educational exercise.

# Supported models

- `gpt-5`
- `gpt-5.1`
- `gpt-5.2`
- `gpt-5.3`
- `gpt-5-codex`
- `gpt-5.2-codex`
- `gpt-5.3-codex`
- `gpt-5.1-codex`
- `gpt-5.1-codex-max`
- `gpt-5.1-codex-mini`
@@ -128,49 +137,48 @@ curl http://127.0.0.1:8000/v1/chat/completions \
### Thinking effort

- `--reasoning-effort` (choice of minimal,low,medium,high,xhigh)<br>
GPT-5 has a configurable amount of "effort" it can put into thinking. Higher effort may make a response take longer to return, but may give a smarter answer overall. Passing this parameter after `serve` makes the server use that reasoning effort by default, unless an API request overrides it with a different effort. Without this parameter, the default reasoning effort is `medium`.<br>
The `gpt-5.1` family (including codex) supports `low`, `medium`, and `high`, while `gpt-5.1-codex-max` adds `xhigh`. The `gpt-5.2` family (including codex) supports `low`, `medium`, `high`, and `xhigh`.
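
As a rough sketch of the precedence described above (a request's effort wins over the server default, within what the model supports), something like the following could apply. The function names, the fallback set for other models, and the choice to fall back to `medium` for an unsupported effort are assumptions made for illustration, not ChatMock's confirmed behaviour.

```python
from typing import Optional, Set

# Assumption: models not named in the paragraph above accept the full CLI choice list.
ALL_EFFORTS: Set[str] = {"minimal", "low", "medium", "high", "xhigh"}


def allowed_efforts(model: str) -> Set[str]:
    """Effort levels per model family, as listed in the README text."""
    if model.startswith("gpt-5.2"):
        return {"low", "medium", "high", "xhigh"}
    if model.startswith("gpt-5.1-codex-max"):
        return {"low", "medium", "high", "xhigh"}
    if model.startswith("gpt-5.1"):
        return {"low", "medium", "high"}
    return ALL_EFFORTS


def resolve_effort(model: str, requested: Optional[str], server_default: str = "medium") -> str:
    """Pick the request's effort if given, else the server default."""
    chosen = requested or server_default
    # Hypothetical fallback: use "medium" when the model does not support the choice.
    return chosen if chosen in allowed_efforts(model) else "medium"


print(resolve_effort("gpt-5.1-codex-max", "xhigh"))
print(resolve_effort("gpt-5.1", None, server_default="low"))
```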

### Thinking summaries

- `--reasoning-summary` (choice of auto,concise,detailed,none)<br>
Models like GPT-5 do not return raw thinking content, but instead return thinking summaries. You can customise these as well.

### OpenAI Tools

- `--enable-web-search`<br>
You can also access OpenAI tools through this project. Currently, only web search is available.
You can enable it by starting the server with this parameter, which will allow OpenAI to determine when a request requires a web search, or you can use the following parameters during a request to the API to enable web search:
<br><br>
`responses_tools`: supports `[{"type":"web_search"}]` / `{ "type": "web_search_preview" }`<br>
`responses_tool_choice`: `"auto"` or `"none"`

#### Example usage

```json
{
  "model": "gpt-5",
  "messages": [{ "role": "user", "content": "Find current METAR rules" }],
  "stream": true,
  "responses_tools": [{ "type": "web_search" }],
  "responses_tool_choice": "auto"
}
```
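
For reference, here is one way the same request body might be built and sent from Python using only the standard library. This is a sketch under assumptions: the helper names are hypothetical, `stream` is set to `false` so that a single JSON body comes back, and the local server is assumed to accept requests without an API key header.

```python
import json
import urllib.request


def build_web_search_request(prompt: str, model: str = "gpt-5") -> dict:
    """Build a chat completion body that opts in to web search."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # assumption: non-streaming responses are supported
        "responses_tools": [{"type": "web_search"}],
        "responses_tool_choice": "auto",
    }


def post_chat_completion(payload: dict, base_url: str = "http://127.0.0.1:8000/v1") -> dict:
    """POST the payload to a running local server and decode the JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


payload = build_web_search_request("Find current METAR rules")
print(json.dumps(payload, indent=2))
```

Calling `post_chat_completion(payload)` only works while `python chatmock.py serve` is running.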

### Expose reasoning models

- `--expose-reasoning-models`<br>
If your preferred app doesn’t support selecting reasoning effort, or you just want a simpler approach, this parameter exposes each reasoning level as a separate, queryable model. Each reasoning level also appears individually under ⁠/v1/models, so model pickers in your favorite chat apps will list all reasoning options as distinct models you can switch between.
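
To see which models a running server exposes, you could list `/v1/models` yourself. The sketch below assumes the endpoint returns the usual OpenAI list shape (`{"data": [{"id": ...}]}`). The helper names are hypothetical and the ids in `sample` are placeholders, since the naming scheme of the exposed models is not stated here.

```python
import json
import urllib.request


def model_ids(models_response: dict) -> list[str]:
    """Extract the sorted model ids from an OpenAI-style model list."""
    return sorted(item["id"] for item in models_response.get("data", []))


def fetch_model_ids(base_url: str = "http://127.0.0.1:8000/v1") -> list[str]:
    """Query a running local server for its model list."""
    with urllib.request.urlopen(f"{base_url}/models") as response:
        return model_ids(json.load(response))


# Placeholder payload for illustration only; real ids come from the server.
sample = {"object": "list", "data": [{"id": "example-model-low"}, {"id": "example-model-high"}]}
print(model_ids(sample))
```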

## Notes

If you wish to have the fastest responses, I'd recommend setting `--reasoning-effort` to low, and `--reasoning-summary` to none. <br>
All parameters and choices can be listed by running `python chatmock.py serve --h`<br>
The context size of this route is also larger than what you get access to in the regular ChatGPT app.<br>

When the model returns a thinking summary, the model will send back thinking tags to make it compatible with chat apps. **If you don't like this behavior, you can instead set `--reasoning-compat` to legacy, and reasoning will be set in the reasoning tag instead of being returned in the actual response text.**


## Star History

[![Star History Chart](https://api.star-history.com/svg?repos=RayBytes/ChatMock&type=Timeline)](https://www.star-history.com/#RayBytes/ChatMock&Timeline)


15 changes: 15 additions & 0 deletions chatmock/reasoning.py
@@ -11,6 +11,8 @@ def allowed_efforts_for_model(model: str | None) -> Set[str]:
    if not base:
        return DEFAULT_REASONING_EFFORTS
    normalized = base.split(":", 1)[0]
    if normalized.startswith("gpt-5.3-codex"):
        return {"low", "medium", "high", "xhigh"}
    if normalized.startswith("gpt-5.2"):
        return {"low", "medium", "high", "xhigh"}
    if normalized.startswith("gpt-5.1-codex-max"):
@@ -73,6 +75,19 @@ def apply_reasoning_to_message(
            message["reasoning"] = {"content": [{"type": "text", "text": rtxt}]}
        return message

    if compat == "copilot":
        # Send reasoning via reasoning_text field for the Copilot
        # Chat extension's thinking content parser.
        rtxt_parts: list[str] = []
        if isinstance(reasoning_summary_text, str) and reasoning_summary_text.strip():
            rtxt_parts.append(reasoning_summary_text)
        if isinstance(reasoning_full_text, str) and reasoning_full_text.strip():
            rtxt_parts.append(reasoning_full_text)
        rtxt = "\n\n".join([p for p in rtxt_parts if p])
        if rtxt:
            message["reasoning_text"] = rtxt
        return message

    if compat in ("legacy", "current"):
        if reasoning_summary_text:
            message["reasoning_summary"] = reasoning_summary_text
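
The new `copilot` branch can be exercised in isolation with a standalone sketch like the one below. It copies the joining logic from the hunk above into a hypothetical helper, `apply_copilot_reasoning`, which is not a function in `chatmock/reasoning.py`.

```python
from typing import Any, Dict, Optional


def apply_copilot_reasoning(
    message: Dict[str, Any],
    reasoning_summary_text: Optional[str],
    reasoning_full_text: Optional[str],
) -> Dict[str, Any]:
    """Attach non-empty reasoning text to `message` under `reasoning_text`."""
    parts = []
    # Skip values that are missing, not strings, or whitespace only.
    if isinstance(reasoning_summary_text, str) and reasoning_summary_text.strip():
        parts.append(reasoning_summary_text)
    if isinstance(reasoning_full_text, str) and reasoning_full_text.strip():
        parts.append(reasoning_full_text)
    rtxt = "\n\n".join(parts)
    if rtxt:
        message["reasoning_text"] = rtxt
    return message


print(apply_copilot_reasoning({"role": "assistant"}, "Summary", "Full trace"))
```

When both inputs are empty or whitespace only, the message is returned without a `reasoning_text` key.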