
Commit 56a0dfc

WIP - pricing notes
1 parent 52a80a8 commit 56a0dfc

2 files changed

Lines changed: 28 additions & 9 deletions

File tree

src/pages/docs/ai-transport/index.mdx

Lines changed: 5 additions & 9 deletions
```diff
@@ -134,17 +134,13 @@ Take a look at some example code running in-browser of the sorts of features you
 
 ## Pricing
 
-AI Transport uses Ably's [usage based billing model](/pricing) at your package rates. Your consumption costs will depend on the number of messages inbound (published to Ably) and outbound (delivered to subscribers), and how long channels or connections are active. [Contact Ably](/contact) to discuss options for Enterprise pricing and volume discounts.
+AI Transport uses Ably's [usage based billing model](/docs/platform/pricing) at your package rates. Your consumption costs will depend on the number of messages inbound (published to Ably) and outbound (delivered to subscribers), and how long channels or connections are active. [Contact Ably](https://ably.com/contact) to discuss options for Enterprise pricing and volume discounts.
 
 The cost of streaming token responses over Ably depends on:
 
-- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a code session chat can be 2000-3000 tokens and a deep reasoning response could be 50000+ tokens.
+- the number of tokens in the LLM responses that you are streaming. For example, a simple support chatbot response is around 300 tokens, a code session chat can be 2,000-3,000 tokens and a deep reasoning response could be over 50,000 tokens.
 - the rate at which your agent publishes tokens to Ably and the number of messages it uses to do so. Some LLMs output every token as a single event, while others batch multiple tokens together. Similarly, your agent may publish tokens as they are received from the LLM or perform its own processing and batching first.
-- the number of subscribers receiving the response
-- the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose
+- the number of subscribers receiving the response.
+- the [token streaming pattern](/docs/ai-transport/features/token-streaming#token-streaming-patterns) you choose.
 
-
-- message-per-response Ably will automatically
-
-- message-per-token you are in control, you can turn on server side batching to group messages together in a batching interval. Higher batching interval increases latency but reduces total number of messages, lower batching interval delivers messages quickly.
-[server-side batching](/docs/messages/batch#server-side)
+*** Link to worked example(s) ***
```
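The commit leaves the worked example as a placeholder. As a rough sketch of how the factors listed above combine, the TypeScript below estimates billable message counts for a single streamed response. The function, token counts and batch size are illustrative assumptions for this sketch, not Ably APIs or package rates.

```typescript
// Illustrative only: estimate billable message counts for one streamed LLM response.
// Both inbound (published to Ably) and outbound (delivered to subscribers) messages count.
interface StreamEstimate {
  inboundMessages: number;  // messages the agent publishes to Ably
  outboundMessages: number; // messages delivered to subscribers
  totalMessages: number;
}

function estimateMessages(
  tokensPerResponse: number, // e.g. ~300 for a simple support chatbot reply
  tokensPerPublish: number,  // how many tokens the agent batches into each publish
  subscribers: number        // how many clients receive the response
): StreamEstimate {
  const inboundMessages = Math.ceil(tokensPerResponse / tokensPerPublish);
  const outboundMessages = inboundMessages * subscribers;
  return { inboundMessages, outboundMessages, totalMessages: inboundMessages + outboundMessages };
}

// A 300-token reply published one token at a time to a single subscriber:
console.log(estimateMessages(300, 1, 1));  // { inboundMessages: 300, outboundMessages: 300, totalMessages: 600 }
// The same reply with the agent batching 25 tokens per publish:
console.log(estimateMessages(300, 25, 1)); // { inboundMessages: 12, outboundMessages: 12, totalMessages: 24 }
```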
Lines changed: 23 additions & 0 deletions
```diff
@@ -0,0 +1,23 @@
+---
+title: AI support chatbot
+meta_description: "Calculate AI Transport pricing for conversations with an AI chatbot. Example shows how using the message-per-response pattern and modifying the append rollup window can generate cost savings."
+meta_keywords: "chatbot, support chat, token streaming, token cost, AI Transport pricing, Ably AI Transport pricing, stream cost, Pub/Sub pricing, realtime data delivery, Ably Pub/Sub pricing"
+intro: "This example applies consumption-based pricing to an AI support chatbot use case, where a single agent is publishing tokens to a user over AI Transport."
+---
+
+### Assumptions
+
+The scale and features used in this calculation.
+
+### Cost summary
+
+The high-level cost breakdown for this scenario. Messages are billed both inbound (published to Ably) and outbound (delivered to subscribers).
+
+### Effect
+
+- message-per-response: Ably will automatically
+
+- message-per-token: you are in control; you can turn on [server-side batching](/docs/messages/batch#server-side) to group messages together within a batching interval. A higher batching interval reduces the total number of messages but increases latency; a lower batching interval delivers messages more quickly.
+
```
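As a rough model of the trade-off described in the message-per-token bullet above, the sketch below assumes at most one delivered batch per subscriber per batching interval; the interval values and that assumption are illustrative, not a statement of Ably's exact batching behaviour.

```typescript
// Illustrative only: how the server-side batching interval trades latency for message count
// under the message-per-token pattern. Stream duration and intervals are assumed values.
function batchedOutboundMessages(
  streamDurationMs: number,   // how long the agent streams the response
  batchingIntervalMs: number, // server-side batching interval configured for the channel
  subscribers: number         // clients receiving the response
): number {
  // Assumption: at most one outbound message per subscriber per batching interval.
  return Math.ceil(streamDurationMs / batchingIntervalMs) * subscribers;
}

// A 10-second stream to one subscriber:
console.log(batchedOutboundMessages(10_000, 100, 1)); // 100 messages, ~100 ms added latency
console.log(batchedOutboundMessages(10_000, 500, 1)); // 20 messages, ~500 ms added latency
```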
