Commit 32a202e

[SeaORM] Docs
1 parent 89b51d7 commit 32a202e

2 files changed

Lines changed: 202 additions & 0 deletions

SeaORM/docs/01-index.md

Lines changed: 2 additions & 0 deletions
@@ -144,6 +144,8 @@

 12.1 [Arrow & Parquet](12-data-science/01-arrow-parquet.md)

+12.2 [ClickHouse](12-data-science/02-clickhouse.md)
+
 13. Internal Design

 13.1 [Traits and Types](13-internal-design/01-trait-and-type.md)
Lines changed: 200 additions & 0 deletions
@@ -0,0 +1,200 @@
# ClickHouse

[`sea-clickhouse`](https://docs.rs/sea-clickhouse) is a ClickHouse client that integrates with the SeaQL ecosystem. It is a soft fork of [`clickhouse-rs`](https://github.com/ClickHouse/clickhouse-rs) that remains 100% compatible with all upstream features and is continually rebased on upstream.

Query results are decoded into `sea_query::Value`, giving you first-class support for `DateTime`, `Decimal`, `BigDecimal`, `Json`, arrays, and more without defining any schema structs. Apache Arrow is also supported: stream query results directly into `RecordBatch`es, or insert Arrow batches back into ClickHouse.

## Setup

```toml
[dependencies]
# Option 1: dynamic DataRow + SeaQuery value support
sea-clickhouse = { version = "0.14", features = ["sea-ql"] }

# Option 2: Apache Arrow support (includes sea-ql).
# Use this line instead of the one above; TOML rejects a duplicate key.
# sea-clickhouse = { version = "0.14", features = ["arrow"] }
```

## Dynamic DataRow

`fetch_rows()` decodes every column into the matching `sea_query::Value` variant without needing a schema struct:

```rust
use clickhouse::{Client, DataRow, error::Result};
use sea_query::Value;

let mut cursor = client
    .query(
        "SELECT
            1::UInt8 AS u8_col,
            3.14::Float64 AS f64_col,
            'hello'::String AS str_col,
            toDate('2026-01-15') AS date_col,
            toDateTime('2026-01-15 12:34:56') AS dt_col,
            toDecimal64(123.45, 2) AS dec64_col,
            NULL::Nullable(Int32) AS null_col,
            ['a', 'b', 'c']::Array(String) AS arr_col
        ",
    )
    .fetch_rows()?;

let row = cursor.next().await?.unwrap();
assert_eq!(row.values[0], Value::TinyUnsigned(Some(1))); // u8_col
assert_eq!(row.values[2], Value::String(Some("hello".into()))); // str_col
assert_eq!(row.values[7], Value::Json(Some(Box::new(serde_json::json!(["a", "b", "c"]))))); // arr_col
```

`try_get` converts a value to the requested Rust type at runtime, addressed either by column index or by column name:

```rust
// `cursor` here comes from a different query than the one above (not shown).
let row = cursor.next().await?.expect("expected one row");

assert_eq!(row.try_get::<f64, _>(0)?, 2.0); // by index
assert_eq!(row.try_get::<Decimal, _>("value")?, 2.into()); // by column name
```
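
The second type parameter in `try_get::<T, _>` is what lets the same method accept either an index or a name. The sketch below illustrates that pattern with the standard library only. `Cell`, `ColIdx`, `FromCell`, `Row` and `GetError` are hypothetical stand-ins invented for this illustration. They are not the types or traits `sea-clickhouse` actually uses, and its conversion rules may differ.

```rust
use std::fmt;

/// Hypothetical stand-in for a dynamically typed value (not `sea_query::Value`).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    Int(Option<i64>),
    Double(Option<f64>),
    Text(Option<String>),
}

#[derive(Debug, PartialEq)]
enum GetError {
    NoSuchColumn(String),
    Null,
    TypeMismatch,
}

/// Anything that can locate a column: a position or a name.
trait ColIdx {
    fn locate(&self, columns: &[String]) -> Option<usize>;
}

impl ColIdx for usize {
    fn locate(&self, columns: &[String]) -> Option<usize> {
        (*self < columns.len()).then_some(*self)
    }
}

impl ColIdx for &str {
    fn locate(&self, columns: &[String]) -> Option<usize> {
        columns.iter().position(|c| c.as_str() == *self)
    }
}

/// Conversion from a dynamic cell into a concrete Rust type.
trait FromCell: Sized {
    fn from_cell(cell: &Cell) -> Result<Self, GetError>;
}

impl FromCell for f64 {
    fn from_cell(cell: &Cell) -> Result<Self, GetError> {
        match cell {
            Cell::Double(Some(v)) => Ok(*v),
            Cell::Int(Some(v)) => Ok(*v as f64), // integer-to-float conversion, for illustration
            Cell::Text(Some(_)) => Err(GetError::TypeMismatch),
            _ => Err(GetError::Null),
        }
    }
}

impl FromCell for String {
    fn from_cell(cell: &Cell) -> Result<Self, GetError> {
        match cell {
            Cell::Text(Some(s)) => Ok(s.clone()),
            Cell::Int(Some(_)) | Cell::Double(Some(_)) => Err(GetError::TypeMismatch),
            _ => Err(GetError::Null),
        }
    }
}

struct Row {
    columns: Vec<String>,
    values: Vec<Cell>,
}

impl Row {
    /// `T` is the requested output type, `I` is the lookup key type.
    fn try_get<T: FromCell, I: ColIdx + fmt::Debug>(&self, idx: I) -> Result<T, GetError> {
        let pos = idx
            .locate(&self.columns)
            .ok_or_else(|| GetError::NoSuchColumn(format!("{idx:?}")))?;
        T::from_cell(&self.values[pos])
    }
}

fn sample_row() -> Row {
    Row {
        columns: vec!["value".to_string(), "name".to_string()],
        values: vec![Cell::Int(Some(2)), Cell::Text(Some("two".to_string()))],
    }
}

fn main() {
    let row = sample_row();
    println!("{:?}", row.try_get::<f64, _>(0));
    println!("{:?}", row.try_get::<String, _>("name"));
    println!("{:?}", row.try_get::<f64, _>("name"));
}
```

Writing `_` for the key type lets the compiler infer it from the argument, so the call site only has to name the output type.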

## Inserting DataRows

Build `DataRow`s with a shared column list and insert them in a single streaming request:

```rust
use std::sync::Arc;
use clickhouse::{Client, DataRow};
use sea_query::Value;

let columns: Arc<[Arc<str>]> = Arc::from(["id".into(), "name".into(), "score".into()]);

let rows: Vec<DataRow> = (0u32..5)
    .map(|i| DataRow {
        columns: columns.clone(),
        values: vec![
            Value::Unsigned(Some(i)),
            Value::String(Some("original".into())),
            Value::Double(Some(i as f64 * 1.5)),
        ],
    })
    .collect();

let mut insert = client.insert_data_row("my_table", &rows[0]).await?;
for row in &rows {
    insert.write_row(row).await?;
}
insert.end().await?;
```
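
The `Arc<[Arc<str>]>` column list is what makes sharing cheap: cloning it for each row bumps a reference count instead of copying the names. A minimal standard-library sketch of that behaviour follows. `shared_columns` is a helper invented for this illustration and is not part of `sea-clickhouse`.

```rust
use std::sync::Arc;

/// Build a shared header from plain string slices (hypothetical helper).
fn shared_columns(names: &[&str]) -> Arc<[Arc<str>]> {
    names.iter().map(|n| Arc::<str>::from(*n)).collect()
}

fn main() {
    let columns = shared_columns(&["id", "name", "score"]);

    // Five "rows", each holding its own handle to the same header.
    let per_row: Vec<Arc<[Arc<str>]>> = (0..5).map(|_| Arc::clone(&columns)).collect();

    // Every handle points at the same allocation; nothing was copied.
    assert!(per_row.iter().all(|c| Arc::ptr_eq(c, &columns)));

    // One original handle plus five clones.
    assert_eq!(Arc::strong_count(&columns), 6);

    println!("{} columns shared by {} rows", columns.len(), per_row.len());
}
```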

## Column-Oriented Batches

`next_batch(max_rows)` accumulates rows column-by-column into a `RowBatch` (one `Vec<Value>` per column), making it a natural bridge toward Apache Arrow:

```rust
let mut cursor = client
    .query("SELECT number::UInt64 AS n, number * 2 AS doubled FROM system.numbers LIMIT 1000")
    .fetch_rows()?;

while let Some(batch) = cursor.next_batch(256).await? {
    // batch.column_names[i] - column name
    // batch.column_data[i] - Vec<Value> for column i
    // batch.num_rows
}
```
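
To make the row-to-column pivot concrete, here is a rough standard-library sketch of the idea: pull up to `max_rows` rows from an iterator and push each cell into the vector for its column. `Cell`, `ColumnBatch` and this free-standing `next_batch` are hypothetical stand-ins written for illustration. The real `RowBatch` is produced by the cursor and its internals may differ.

```rust
/// Hypothetical stand-in for a dynamically typed value.
#[derive(Debug, Clone, PartialEq)]
enum Cell {
    UInt(u64),
}

/// Column-oriented batch: one Vec per column, all of length `num_rows`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct ColumnBatch {
    column_names: Vec<String>,
    column_data: Vec<Vec<Cell>>,
    num_rows: usize,
}

/// Drain up to `max_rows` rows from `rows` and pivot them into columns.
/// Returns `None` once the row source is exhausted.
fn next_batch<I>(rows: &mut I, column_names: &[String], max_rows: usize) -> Option<ColumnBatch>
where
    I: Iterator<Item = Vec<Cell>>,
{
    let mut column_data: Vec<Vec<Cell>> = column_names
        .iter()
        .map(|_| Vec::with_capacity(max_rows))
        .collect();
    let mut num_rows = 0;

    for row in rows.by_ref().take(max_rows) {
        debug_assert_eq!(row.len(), column_names.len());
        // Cell i of the row goes to column i.
        for (column, cell) in column_data.iter_mut().zip(row) {
            column.push(cell);
        }
        num_rows += 1;
    }

    if num_rows == 0 {
        None
    } else {
        Some(ColumnBatch {
            column_names: column_names.to_vec(),
            column_data,
            num_rows,
        })
    }
}

fn main() {
    let names = vec!["n".to_string(), "doubled".to_string()];
    let mut rows = (0u64..10).map(|n| vec![Cell::UInt(n), Cell::UInt(n * 2)]);

    while let Some(batch) = next_batch(&mut rows, &names, 4) {
        println!("{} rows, first column: {:?}", batch.num_rows, batch.column_data[0]);
    }
}
```

Because each column ends up as one contiguous vector of same-typed values, converting a batch into typed Arrow arrays becomes a per-column operation.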

## Apache Arrow

`next_arrow_batch(chunk_size)` streams ClickHouse results as `arrow::RecordBatch`es, ready for DataFusion, Polars, Parquet export, or any Arrow consumer:

```rust
let mut cursor = client.query("SELECT * FROM sensor_data").fetch_rows()?;

while let Some(batch) = cursor.next_arrow_batch(1000).await? {
    arrow::util::pretty::print_batches(&[batch]).unwrap();
}
```

### SeaORM to ClickHouse

Build an Arrow `RecordBatch` from SeaORM entities and insert it directly into ClickHouse:

```rust
use sea_orm::{ArrowSchema, Set};

#[sea_orm::model]
#[derive(Clone, Debug, PartialEq, DeriveEntityModel)]
#[sea_orm(table_name = "measurement", arrow_schema)]
pub struct Model {
    #[sea_orm(primary_key)]
    pub id: i32,
    pub recorded_at: ChronoDateTime,
    pub sensor_id: i32,
    pub temperature: f64,
    #[sea_orm(column_type = "Decimal(Some((38, 4)))")]
    pub voltage: Decimal,
}

let models: Vec<measurement::ActiveModel> = vec![..];
let schema = measurement::Entity::arrow_schema();
let batch = measurement::ActiveModel::to_arrow(&models, &schema)?;

let mut insert = client.insert_arrow("measurement", &batch).await?;
insert.write_batch(&batch).await?;
insert.end().await?;
```

### Arrow Schema to ClickHouse Table

`ClickHouseSchema::from_arrow` derives a full `CREATE TABLE` DDL from an Arrow schema:

```rust
use clickhouse::schema::{ClickHouseSchema, Engine};

let mut schema = ClickHouseSchema::from_arrow(&batch.schema());
schema
    .table_name("measurement")
    .engine(Engine::ReplacingMergeTree)
    .primary_key(["recorded_at", "sensor_id"]);
schema.find_column_mut("sensor_id").set_low_cardinality(true);

let ddl = schema.to_string();
client.query(&ddl).execute().await?;
```

The generated DDL:

```sql
CREATE TABLE measurement (
    id Int32,
    recorded_at DateTime64(6),
    sensor_id Int32,
    temperature Float64,
    voltage Decimal(38, 4)
) ENGINE = ReplacingMergeTree()
PRIMARY KEY (recorded_at, sensor_id)
```
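
For readers curious how a schema description turns into that statement, the following is a small sketch that renders the same DDL text through `Display`. `TableSpec` and `Column` are hypothetical types made up for this example. They are not `ClickHouseSchema`, the sketch does no identifier quoting or validation, and the crate's real output formatting may differ in whitespace.

```rust
use std::fmt;

/// Hypothetical column description: a name and a ClickHouse type string.
struct Column {
    name: &'static str,
    ty: &'static str,
}

/// Hypothetical table description used only for this illustration.
struct TableSpec {
    name: &'static str,
    columns: Vec<Column>,
    engine: &'static str,
    primary_key: Vec<&'static str>,
}

impl fmt::Display for TableSpec {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // One indented "name type" line per column, comma-separated.
        let columns: Vec<String> = self
            .columns
            .iter()
            .map(|c| format!("    {} {}", c.name, c.ty))
            .collect();

        writeln!(f, "CREATE TABLE {} (", self.name)?;
        writeln!(f, "{}", columns.join(",\n"))?;
        writeln!(f, ") ENGINE = {}()", self.engine)?;
        write!(f, "PRIMARY KEY ({})", self.primary_key.join(", "))
    }
}

fn measurement_spec() -> TableSpec {
    TableSpec {
        name: "measurement",
        columns: vec![
            Column { name: "id", ty: "Int32" },
            Column { name: "recorded_at", ty: "DateTime64(6)" },
            Column { name: "sensor_id", ty: "Int32" },
            Column { name: "temperature", ty: "Float64" },
            Column { name: "voltage", ty: "Decimal(38, 4)" },
        ],
        engine: "ReplacingMergeTree",
        primary_key: vec!["recorded_at", "sensor_id"],
    }
}

fn main() {
    println!("{}", measurement_spec());
}
```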

## Type Mapping

| ClickHouse Type | `sea_query::Value` Variant |
|---|---|
| `Bool` | `Value::Bool` |
| `Int8`–`Int64` | `Value::TinyInt`–`Value::BigInt` |
| `UInt8`–`UInt64` | `Value::TinyUnsigned`–`Value::BigUnsigned` |
| `Int128` / `Int256` / `UInt128` / `UInt256` | `Value::BigDecimal` (scale 0) |
| `Float32` / `Float64` | `Value::Float` / `Value::Double` |
| `String` | `Value::String` |
| `FixedString(n)` | `Value::Bytes` |
| `UUID` | `Value::Uuid` |
| `Date` / `Date32` | `Value::ChronoDate` |
| `DateTime` / `DateTime64` | `Value::ChronoDateTime` |
| `Decimal32` / `Decimal64` | `Value::Decimal` |
| `Decimal128` | `Value::Decimal` or `Value::BigDecimal` if scale > 28 |
| `Decimal256` | `Value::BigDecimal` |
| `Array(T)` / `Tuple(...)` / `Map(K,V)` | `Value::Json` |
| `Nullable(T)` null | Typed `None` variant |
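
The decimal rows of the table amount to a small decision rule, restated below as a function. `DecimalWidth`, `Target` and `decimal_target` are hypothetical names used only for this sketch and are not part of the crate. The threshold of 28 is taken from the table. It presumably reflects the maximum scale that a `rust_decimal::Decimal` can hold, but that reasoning is an assumption rather than something the source states.

```rust
/// ClickHouse decimal storage widths (hypothetical enum for illustration).
#[derive(Debug, Clone, Copy, PartialEq)]
enum DecimalWidth {
    D32,
    D64,
    D128,
    D256,
}

/// Which value variant a decimal column decodes into, per the table.
#[derive(Debug, PartialEq)]
enum Target {
    Decimal,
    BigDecimal,
}

/// Largest scale that still maps a Decimal128 to `Decimal` (from the table).
const MAX_DECIMAL_SCALE: u32 = 28;

fn decimal_target(width: DecimalWidth, scale: u32) -> Target {
    match width {
        DecimalWidth::D32 | DecimalWidth::D64 => Target::Decimal,
        DecimalWidth::D128 if scale > MAX_DECIMAL_SCALE => Target::BigDecimal,
        DecimalWidth::D128 => Target::Decimal,
        DecimalWidth::D256 => Target::BigDecimal,
    }
}

fn main() {
    for (width, scale) in [
        (DecimalWidth::D64, 2),
        (DecimalWidth::D128, 28),
        (DecimalWidth::D128, 29),
        (DecimalWidth::D256, 0),
    ] {
        println!("{:?} with scale {} -> {:?}", width, scale, decimal_target(width, scale));
    }
}
```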

## Full Examples

Working examples are available in the [sea-clickhouse repository](https://github.com/SeaQL/clickhouse-rs):

- [`data_rows`](https://github.com/SeaQL/clickhouse-rs/blob/main/examples/data_rows.rs): fetch rows and assert type mappings
- [`data_row_insert`](https://github.com/SeaQL/clickhouse-rs/blob/main/examples/data_row_insert.rs): insert, mutate, re-insert (ReplacingMergeTree)
- [`arrow_sensor_data`](https://github.com/SeaQL/clickhouse-rs/blob/main/examples/arrow_sensor_data.rs): sensor data processing via Arrow
- [`sea-orm-arrow-example`](https://github.com/SeaQL/clickhouse-rs/blob/main/sea-orm-arrow-example/src/main.rs): SeaORM entity to Arrow to ClickHouse
