Commit e8a1ca9

Document some of the enterprise features (#58)
1 parent b04b126 commit e8a1ca9

12 files changed

+355
-69
lines changed

.github/workflows/ci.yaml

Lines changed: 4 additions & 3 deletions
@@ -11,12 +11,13 @@ jobs:
     steps:
       - name: Checkout code
         uses: actions/checkout@v4
-      - name: Checkout another public repo
+      - name: Checkout pgdog-enterprise
         uses: actions/checkout@v4
         with:
-          repository: pgdogdev/pgdog
-          ref: main
+          repository: pgdogdev/pgdog-enterprise
+          ref: main-ent
           path: pgdog-source
+          token: ${{ secrets.PGDOG_ENTERPRISE_TOKEN }}
       - uses: actions-rs/toolchain@v1
         with:
           toolchain: stable

docs/enterprise_edition/active_queries.md

Lines changed: 0 additions & 52 deletions
This file was deleted.
Lines changed: 68 additions & 0 deletions
@@ -0,0 +1,68 @@
1+
---
2+
icon: material/console
3+
---
4+
5+
# Control plane
6+
7+
Multi-node PgDog deployments require synchronization to perform certain tasks, like atomic configuration changes, toggling [maintenance mode](../administration/maintenance_mode.md), [resharding](../features/sharding/resharding/index.md), and more. To make this work, PgDog Enterprise comes with a control plane, an application deployed alongside PgDog, to provide coordination and collect and present system telemetry.
8+
9+
## How it works
10+
11+
The control plane and PgDog processes communicate over HTTP. PgDog sends metrics and other metadata to the control plane in real time, and the control plane sends commands back to control the behavior of each PgDog process.
12+
13+
<center>
14+
<img src="/images/control_plane.png" width="90%" alt="Control plane">
15+
</center>
16+
17+
### Configuration
18+
19+
In order for PgDog to connect to the control plane, it needs to be configured with its endpoint address and an authentication token, both of which are specified in [`pgdog.toml`](../configuration/pgdog.toml/general.md):
20+
21+
```toml
22+
[control]
23+
endpoint = "https://control-plane-endpoint.cloud.pgdog.dev"
24+
token = "cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d"
25+
```
26+
27+
The authentication token is generated by the control plane and identifies each PgDog deployment. PgDog nodes which are part of the same deployment should use the same token.
28+
29+
For example, if you're using our [Helm chart](../installation.md#kubernetes), you can configure the endpoint and token in `values.yaml` as follows:
30+
31+
```yaml
32+
control:
33+
endpoint: https://control-plane-endpoint.cloud.pgdog.dev
34+
token: cff57e5c-7c4f-4ca0-b81c-c8ed22cf873d
35+
```
36+
37+
### Connection flow
38+
39+
The connection to the control plane is initiated by PgDog on startup and happens in the background. Upon connecting, PgDog sends its node identifier (randomly generated, or set in the `NODE_ID` environment variable) to register with the control plane, then starts uploading telemetry and polling for commands.
40+
41+
!!! note "Error handling"
42+
Since most PgDog functions (including sharding) are configuration-driven, the control plane connection is **not required**
43+
for PgDog to start and serve queries.
44+
45+
If any error is encountered while communicating with the control plane,
46+
PgDog will continue operating normally, while attempting to reconnect periodically.
47+
48+
49+
This architecture makes the communication link more resilient to unreliable network conditions.
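
The pattern described above (connect in the background, never block query serving, retry on failure) can be sketched as follows. This is an illustration only, not PgDog's implementation: `ControlPlaneLink`, the `connect` callable, and the fixed `retry_interval` are all assumptions made for the example.

```python
import threading
from typing import Callable


class ControlPlaneLink:
    """Background link whose failures never reach the query-serving path."""

    def __init__(self, connect: Callable[[], None], retry_interval: float = 5.0):
        self._connect = connect  # hypothetical: raises OSError when unreachable
        self._retry_interval = retry_interval
        self._stop = threading.Event()
        self.connected = threading.Event()
        self.attempts = 0

    def start(self) -> threading.Thread:
        """Start connecting in a daemon thread and return immediately."""
        thread = threading.Thread(target=self._run, daemon=True)
        thread.start()
        return thread

    def stop(self) -> None:
        self._stop.set()

    def _run(self) -> None:
        while not self._stop.is_set():
            self.attempts += 1
            try:
                self._connect()
            except OSError:
                # Swallow the error and retry later; the proxy keeps running.
                self._stop.wait(self._retry_interval)
                continue
            # A real link would now register, upload telemetry, and poll.
            self.connected.set()
            return
```

Because `start()` returns immediately, the caller can begin serving queries whether or not the control plane is reachable.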
50+
51+
### Telemetry
52+
53+
PgDog transmits the following information to the control plane:
54+
55+
| Telemetry | Description |
56+
|-|-|
57+
| [Metrics](metrics.md) | The same [metrics](../features/metrics.md) exposed by the Prometheus endpoint (and the admin database), transmitted at a much higher frequency to allow for real-time monitoring. |
58+
| [Active queries](insights/active_queries.md) | Queries that are currently executing through each PgDog node. |
59+
| [Query statistics](insights/statistics.md) | Real-time statistics on each query executed through PgDog, like duration, idle-in-transaction time, and more. |
60+
| [Errors](insights/errors.md) | Recent errors encountered by clients, e.g. query syntax issues. |
61+
| [Query plans](insights/query_plans.md) | Output of `EXPLAIN` for slow and sampled queries, collected by PgDog in the background. |
62+
| [Configuration](configuration.md) | Current PgDog settings and database schema. |
63+
64+
#### High availability
65+
66+
The control plane itself is backed by a PostgreSQL database, used for storing historical metrics, query statistics, configuration, and other metadata.
67+
68+
This allows multiple instances of the control plane to be deployed in a high-availability setup, since all actions are synchronized by PostgreSQL transactions and locks.

docs/enterprise_edition/index.md

Lines changed: 18 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2,20 +2,31 @@
22
icon: material/office-building
33
---
44

5-
# PgDog EE
5+
# PgDog Enterprise
66

7-
PgDog **E**nterprise **E**dition is a version of PgDog that contains additional features for at scale monitoring and deployment of sharded (and unsharded) PostgreSQL databases.
7+
PgDog Enterprise is a version of PgDog that contains additional features for at-scale monitoring and deployment of sharded (and unsharded) PostgreSQL databases.
88

9-
Unlike PgDog itself, PgDog EE is closed source and available upon the purchase of a license. It comes with a hosted management dashboard which provides real-time visibility into PgDog's operations.
9+
Unlike PgDog itself, PgDog Enterprise is closed source and available upon the purchase of a license. It comes with a control plane which provides real-time visibility into PgDog's operations and enterprise features.
1010

1111
## Features
1212

1313
| Feature | Description |
1414
|-|-|
15-
| [Running queries](active_queries.md) | Instant view into queries running through PgDog. |
16-
| [Query plans](query_plans.md) | Root cause slow queries in seconds with automatic PostgreSQL query plans. |
17-
| [Real-time metrics](metrics.md) | All PgDog metrics, delivered with second-precision through a dedicated link. |
18-
| Query blocker | Terminate all instances of a slow query with a button click and prevent them from running again. |
15+
| [Control plane](control_plane.md) | Synchronize and monitor multiple PgDog processes. |
16+
| [Active queries](insights/active_queries.md) | Real-time view into queries running through PgDog. |
17+
| [Query plans](insights/query_plans.md) | Root-cause slow queries and execution anomalies with real-time Postgres query plans, collected in the background. |
18+
| [Real-time metrics](metrics.md) | All PgDog metrics, delivered with second-precision through a dedicated connection. |
19+
| [Query statistics](insights/statistics.md) | Query execution statistics, like duration, idle-in-transaction time, errors, and more. |
20+
21+
## Roadmap
22+
23+
PgDog Enterprise is new and in active development. A lot of the features we want aren't built yet:
24+
25+
| Feature | Description |
26+
|-|-|
27+
| QoS | Quality of service guarantees, incl. throttling on a per-user/database/query level. |
28+
| AWS RDS integration | Deploy PgDog on top of AWS RDS, without the hassle of Kubernetes or manual configuration. |
29+
| Automatic resharding | Detect hot shards and re-shard data without operator intervention. |
1930

2031
## Get a demo
2132

Lines changed: 59 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,59 @@
1+
---
2+
icon: material/play-circle
3+
---
4+
5+
# Active queries
6+
7+
PgDog Enterprise provides a real-time view into queries currently executing on its PostgreSQL connections. This is accessible in two places:
8+
9+
1. [`SHOW ACTIVE_QUERIES`](#admin-database) admin command
10+
2. [Activity](#dashboard) view in the dashboard
11+
12+
## How it works
13+
14+
When a client sends a query to PgDog, it will first attempt to acquire a connection from the connection pool. Once acquired, it will register the query with the live query view. After the query finishes running, it's removed from the view.
15+
16+
Only queries that are currently executing through PgDog are visible. If your application doesn't connect to PgDog, its queries won't appear here.
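
The register-then-remove lifecycle described above can be modeled with a small in-memory registry. This is a sketch under stated assumptions, not PgDog's code: `ActiveQueries`, `track`, and `snapshot` are hypothetical names chosen for the example.

```python
import threading
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ActiveQuery:
    query: str
    database: str
    user: str
    started: float = field(default_factory=time.monotonic)

    @property
    def running_time_ms(self) -> int:
        return int((time.monotonic() - self.started) * 1000)


class ActiveQueries:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._ids = count(1)
        self._queries: dict[int, ActiveQuery] = {}

    @contextmanager
    def track(self, query: str, database: str, user: str):
        """Register a query once a connection is held; remove it when done."""
        with self._lock:
            query_id = next(self._ids)
            self._queries[query_id] = ActiveQuery(query, database, user)
        try:
            yield
        finally:
            # Runs on success and on error, so no stale entries remain.
            with self._lock:
                del self._queries[query_id]

    def snapshot(self) -> list[ActiveQuery]:
        """Return the queries that are executing right now."""
        with self._lock:
            return list(self._queries.values())
```

Wrapping query execution in `with view.track(...)` keeps the entry visible only while the query runs, which mirrors why finished queries never show up in the view.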
17+
18+
### Admin database
19+
20+
You can see which queries are actually running on each instance by connecting to the [admin database](../../administration/index.md) and running the `SHOW ACTIVE_QUERIES` command:
21+
22+
=== "Command"
23+
```
24+
SHOW ACTIVE_QUERIES;
25+
```
26+
27+
=== "Output"
28+
```
29+
query | protocol | database | user | running_time | plan
30+
---------------------------------------------------+----------+----------+-------+--------------+---------------------------------------------------------------
31+
SELECT * FROM users WHERE id = $1 | extended | pgdog | pgdog | 15 | Index Scan on users (cost=0.15..8.17 rows=1 width=64)
32+
SELECT pg_sleep(50) | simple | pgdog | pgdog | 1662 | Result (cost=0.00..0.01 rows=1 width=4)
33+
INSERT INTO users (id, email) VALUES ($1, $2) | extended | pgdog | pgdog | 1 | Insert on users (cost=0.00..0.01 rows=0 width=0)
34+
```
35+
36+
The following information is available in the active queries view:
37+
38+
| Column | Description |
39+
|-|-|
40+
| `query` | The SQL statement currently executing on a PostgreSQL connection. |
41+
| `protocol` | Which PostgreSQL query protocol is used. With the `simple` protocol, parameter values are part of the query text, while `extended` is used by prepared statements. |
42+
| `database` | The name of the connection pool database. |
43+
| `user` | The name of the user executing the query. |
44+
| `running_time` | How long (in ms) the query has been running. |
45+
| `plan` | The query execution plan obtained from PostgreSQL using `EXPLAIN`. |
46+
47+
### Dashboard
48+
49+
If you're running multiple instances of PgDog, active queries from all instances are aggregated and sent to the [control plane](../control_plane.md). They are then made available in the Activity tab, in real-time, with query plans automatically attached for slow queries.
50+
51+
<center>
52+
<img src="/images/ee/activity.png" width="100%" alt="Active queries in the Activity tab" class="screenshot" />
53+
</center>
54+
55+
### Parameters
56+
57+
If your application is using prepared statements (or just placeholders in queries), the parameters for these queries are not shown and will not be sent to the control plane.
58+
59+
If your application is using simple statements (parameters in query text), PgDog will normalize the queries, removing values and replacing them with parameter symbols (e.g., `$1`). This is to make sure no sensitive data leaves the database network.
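
To illustrate the idea of normalization, here is a deliberately naive sketch that swaps literals for numbered placeholders. It is an assumption-laden stand-in, not PgDog's normalizer: it uses a regular expression rather than a SQL parser, handles only single-quoted strings and plain numbers, and `normalize` is a hypothetical name.

```python
import re

# Either a single-quoted string (with '' escapes) or a standalone number.
# The lookbehind skips digits inside identifiers and existing $n placeholders.
_LITERAL = re.compile(r"'(?:[^']|'')*'|(?<![\w$])\d+(?:\.\d+)?(?!\w)")


def normalize(query: str) -> str:
    """Replace each literal with a numbered placeholder ($1, $2, ...)."""
    counter = 0

    def placeholder(_match: re.Match) -> str:
        nonlocal counter
        counter += 1
        return f"${counter}"

    return _LITERAL.sub(placeholder, query)


print(normalize("SELECT * FROM users WHERE id = 42 AND email = 'a@b.com'"))
# SELECT * FROM users WHERE id = $1 AND email = $2
```

A regex like this will miss cases a real parser handles (dollar-quoted strings, comments, typed literals), so treat it only as a picture of what "removing values" means.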
Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
1+
---
2+
icon: material/alert-circle
3+
---
4+
5+
# Errors
6+
7+
PgDog Enterprise tracks query errors returned by PostgreSQL, providing a real-time view into recently encountered issues like syntax errors, missing columns, or lock timeouts.
8+
9+
## Admin database
10+
11+
You can see recent errors by connecting to the [admin database](../../administration/index.md) and running the `SHOW ERRORS` command:
12+
13+
=== "Command"
14+
```
15+
SHOW ERRORS;
16+
```
17+
18+
=== "Output"
19+
```
20+
error | count | age | query
21+
--------------------------------+-------+------+------------------------
22+
column "sdfsdf" does not exist | 1 | 1444 | SELECT sdfsdf;
23+
syntax error at end of input | 3 | 500 | SELECT FROM users;
24+
relation "foo" does not exist | 2 | 120 | SELECT * FROM foo;
25+
```
26+
27+
The following information is available in the errors view:
28+
29+
| Column | Description |
30+
|-|-|
31+
| `error` | The error message returned by PostgreSQL. |
32+
| `count` | The number of times this error has been encountered. |
33+
| `age` | How long ago (in ms) this error was last seen. |
34+
| `query` | The last SQL statement that caused the error. |
35+
36+
## Configuration
37+
38+
Errors are collected automatically if query statistics are enabled. Old errors are periodically purged from the in-memory view; the limits are configurable in [`pgdog.toml`](../../configuration/pgdog.toml/general.md):
39+
40+
```toml
41+
[query_stats]
42+
enabled = true
43+
max_errors = 100
44+
max_error_age = 300_000 # 5 minutes
45+
```
46+
47+
By default, PgDog will keep up to 100 distinct errors for a maximum of 5 minutes. This data is periodically sent to the [control plane](../control_plane.md), so the history of seen errors is available in the web UI.
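
The two limits can be pictured as a bounded, age-limited map keyed by error message. The sketch below is illustrative only: `ErrorView` is a hypothetical class, and evicting the least recently seen error when `max_errors` is exceeded is an assumption, since the eviction order is not specified here.

```python
import time
from dataclasses import dataclass


@dataclass
class ErrorEntry:
    count: int
    last_seen: float  # clock reading, in seconds
    query: str        # last statement that caused this error


class ErrorView:
    def __init__(self, max_errors=100, max_error_age_ms=300_000, clock=time.monotonic):
        self.max_errors = max_errors
        self.max_error_age_ms = max_error_age_ms
        self._clock = clock
        self._errors: dict[str, ErrorEntry] = {}

    def record(self, error: str, query: str) -> None:
        previous = self._errors.pop(error, None)
        count = previous.count + 1 if previous else 1
        # Re-inserting moves the error to the end (most recently seen).
        self._errors[error] = ErrorEntry(count, self._clock(), query)
        self.purge()

    def purge(self) -> None:
        cutoff = self._clock() - self.max_error_age_ms / 1000
        self._errors = {e: v for e, v in self._errors.items() if v.last_seen >= cutoff}
        while len(self._errors) > self.max_errors:
            # Assumption: drop the least recently seen error first.
            del self._errors[next(iter(self._errors))]

    def rows(self) -> list[tuple[str, int, int, str]]:
        """Rows shaped like SHOW ERRORS: (error, count, age in ms, query)."""
        now = self._clock()
        return [
            (e, v.count, int((now - v.last_seen) * 1000), v.query)
            for e, v in self._errors.items()
        ]
```

Passing a custom `clock` makes the purge behavior easy to exercise without waiting five minutes.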
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
---
2+
icon: material/lightbulb-on
3+
---
4+
5+
# Query insights
6+
7+
PgDog Enterprise provides visibility into all queries that it serves, which allows it to analyze and report how those queries perform, in real-time.
8+
9+
## Telemetry
10+
11+
PgDog Enterprise collects the following telemetry:
12+
13+
| Telemetry | Frequency | Description |
14+
|-|-|-|
15+
| [Active queries](active_queries.md) | real time | Queries actively executing through the proxy. |
16+
| [Query plans](query_plans.md) | sample / threshold | Query plans (`EXPLAIN` output) are collected for slow queries and sampled queries automatically. |
17+
| [Query statistics](statistics.md) | real time | Query duration, number of rows returned, idle-in-transaction time, errors, and more. |
18+
| [Errors](errors.md) | real time | View into recently encountered query errors, like syntax errors or lock timeouts. |
19+
20+
This data is transmitted to the [control plane](../control_plane.md) in real-time, which makes it available via its web dashboard and HTTP API.

docs/enterprise_edition/query_plans.md renamed to docs/enterprise_edition/insights/query_plans.md

Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -41,6 +41,26 @@ The following information is available in this view:
4141
| `user` | The name of the user running the query. |
4242
| `age` | How long ago the plan was fetched from Postgres (in ms). |
4343

44+
### Configuration
45+
46+
Which queries are planned, and how often, is configurable in [`pgdog.toml`](../../configuration/pgdog.toml/general.md):
47+
48+
```toml
49+
[query_stats]
50+
enabled = true
51+
query_plan_threshold = 250 # 250 ms
52+
query_plans_cache = 100
53+
query_plans_sample_rate = 0.0
54+
query_plan_max_age = 15_000
55+
```
56+
57+
| Setting | Description |
58+
|-|-|
59+
| `query_plan_threshold` | Minimum query execution duration (in ms), as recorded by PgDog in [query statistics](statistics.md), that triggers plan collection. |
60+
| `query_plans_cache` | How many plans to keep in the cache to avoid planning the same queries multiple times. |
61+
| `query_plans_sample_rate` | Fraction of queries (0.0 to 1.0) to collect plans for, irrespective of their execution duration. |
62+
| `query_plan_max_age` | How long (in ms) to keep a plan in the cache before it is considered stale and a new plan is required. |
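
One way to read how these four settings interact is sketched below. This is an interpretation, not PgDog's implementation: `PlanSettings` and `PlanCache` are hypothetical names, and the exact rules (for example, whether the threshold comparison is inclusive, or which cache entry is evicted first) are assumptions.

```python
import random
import time
from dataclasses import dataclass


@dataclass
class PlanSettings:
    query_plan_threshold: int = 250       # ms
    query_plans_cache: int = 100          # max cached plans
    query_plans_sample_rate: float = 0.0  # 0.0 to 1.0
    query_plan_max_age: int = 15_000      # ms


class PlanCache:
    def __init__(self, settings: PlanSettings, clock=time.monotonic, rng=random.random):
        self.settings = settings
        self._clock = clock
        self._rng = rng
        self._fetched_at: dict[str, float] = {}

    def should_collect(self, query: str, duration_ms: float) -> bool:
        """Decide whether to run EXPLAIN for this query."""
        slow = duration_ms >= self.settings.query_plan_threshold
        sampled = self._rng() < self.settings.query_plans_sample_rate
        if not (slow or sampled):
            return False
        fetched = self._fetched_at.get(query)
        if fetched is None:
            return True  # no cached plan yet
        age_ms = (self._clock() - fetched) * 1000
        return age_ms > self.settings.query_plan_max_age  # stale plan

    def store(self, query: str) -> None:
        """Remember that a plan for this query was just fetched."""
        self._fetched_at.pop(query, None)
        self._fetched_at[query] = self._clock()
        while len(self._fetched_at) > self.settings.query_plans_cache:
            # Assumption: evict the oldest cached plan first.
            del self._fetched_at[next(iter(self._fetched_at))]
```

With the defaults shown in the configuration above, only queries taking 250 ms or longer are planned, and the same query is not planned again until its cached plan is more than 15 seconds old.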
63+
4464
### Dashboard
4565

4666
The query plans are automatically attached to running queries and sent to the Dashboard via a dedicated connection. They can be viewed in real-time in the [Activity](active_queries.md#dashboard) tab.
