Rewrite clickhouse operational queries documentation for clarity and improved structure #122

Open
pavlochaikivskyi wants to merge 22 commits into main from epmcdme-10519

Conversation

@pavlochaikivskyi pavlochaikivskyi commented Feb 19, 2026

Summary

Rewrote Langfuse ClickHouse operational queries documentation with improved structure, comprehensive tabbed SQL examples, and clearer organization. The new storage management guide provides better guidance for disk usage analysis, data cleanup, and retention policy monitoring.

Changes

  • Restructured operational queries into comprehensive Storage Management Guide with clear sections for disk usage, data cleanup, and TTL monitoring
  • Added tabbed SQL query examples for all table types (Langfuse tables: observations, traces, blob storage; ClickHouse system tables: query_log, opentelemetry_span_log, etc.)
  • Enhanced Deployment Prerequisites with detailed TTL configuration examples and retention policy guidance
  • Organized queries by performance characteristics (fast metadata queries vs heavy data-scanning operations) with clear performance indicators
  • Replaced fragmented operational queries documentation with unified, structured guide following consistent patterns across all table types

Testing

  • Tested locally with npm start
  • All pages render correctly
  • Images display properly
  • Internal links work
  • Sidebar navigation works

Quality Checks

  • npm run check passes (typecheck + lint + commitlint)
  • No MDX compilation errors
  • No raw angle brackets (<text> must be `<text>`)
  • Sidebar references document IDs (not filenames)
  • Images stored locally next to content (not in static/img/)
  • Commit messages follow Conventional Commits
  • No secrets or credentials in documentation

Additional Notes

@github-actions

PR Preview Ready!

Your preview documentation is ready! Visit it here:

Preview URL: http://codemie-docs-pr-previews.s3-website.eu-central-1.amazonaws.com/pr-122/


This preview will be automatically updated when you push new commits to this PR.
Preview files will be deleted when the PR is closed or merged.

:::info Database Types
ClickHouse contains two types of databases:

- **`default`** - Langfuse application database containing business data (Langfuse uses `default` as the [database name by default](https://langfuse.com/self-hosting/configuration#:~:text=CLICKHOUSE_DB,Name%20of%20the%20ClickHouse%20database%20to%20use.)):

Would it be better to use tabs to reduce the overall page length?

@@ -0,0 +1,792 @@
---
id: storage-management
title: Storage Management

Please rename this page. I'm not sure "Storage Management" is the right title. I would use something like "Data Volume Retention" or "Data Volume Maintenance".


Also, please add the "magic" emoji to indicate that this is a non-OSS feature (as in the deployment extension section).

- `scores`
- Other Langfuse tables and views

- **`system`** - ClickHouse internal database containing metadata ([Located in the `system` database](https://clickhouse.com/docs/operations/system-tables/overview)):

Please use – (en dash) in these cases instead of - (hyphen).

- `asynchronous_metric_log`, `text_log`, `part_log`
- `processors_profile_log`, `latency_log`, `session_log`
- `asynchronous_insert_log`, `error_log`
:::

I don't understand this note :)


Maybe we should put this in a separate section and link to it?
We have the same note five times.

@elluvium elluvium marked this pull request as draft February 20, 2026 18:05
@pavlochaikivskyi pavlochaikivskyi marked this pull request as ready for review February 24, 2026 12:46
sum(rows) AS rows,
formatReadableSize(sum(bytes_on_disk)) AS compressed_size
FROM system.parts
WHERE database = 'system' AND table = 'query_log' AND active

Could we modify the query and use the highlight feature?
I'm just suggesting :)
