diff --git a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx
index b6baf993b..4d7d50756 100644
--- a/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx
+++ b/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx
@@ -586,6 +586,41 @@ Memgraph HA uses standard Kubernetes **startup**, **readiness**, and
 - **Coordinators**: probed on the **NuRaft server**
 - **Data instances**: probed on the **Bolt server**
+## Debugging
+
+There are several ways to debug Memgraph's HA cluster in production. One of them is to send us logs from all instances when you notice an issue, which is why we
+advise users to set the log level to `TRACE` if possible. Note, however, that running with the `TRACE` log level has a performance cost, especially when logging to stderr in addition
+to files. If performance is a concern, first try setting `--also-log-to-stderr=false`, since logging only to files is cheaper. If you're still unhappy with the performance overhead
+of logging, use `--log-level=DEBUG` (a less verbose level such as `INFO` or `CRITICAL` is also fine) together with `--also-log-to-stderr=true`.
+
+If you notice your application crashing, you can collect core dumps by setting `storage.data.createCoreDumpsClaim` and `storage.coordinators.createCoreDumpsClaim`
+to `true`. This triggers the creation of an init container that runs in privileged mode as the root user and sets up everything necessary on your nodes to
+collect core dumps. You can then create a debug pod and attach the PVC containing the core dumps to it in order to copy the core dumps off the K8s nodes.
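+
+For example, a minimal `values.yaml` sketch that enables core dump collection for both roles (a sketch using only the values described above; all other chart settings omitted):
+
+```yaml
+storage:
+  data:
+    createCoreDumpsClaim: true
+  coordinators:
+    createCoreDumpsClaim: true
+```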
An example of such a debug pod:
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: debug-coredump
+spec:
+  containers:
+  - name: debug
+    image: ubuntu:22.04
+    command: ["sleep", "infinity"]
+    volumeMounts:
+    - name: coredumps
+      mountPath: /var/core/memgraph
+  volumes:
+  - name: coredumps
+    persistentVolumeClaim:
+      claimName: memgraph-data-0-core-dumps-storage-memgraph-data-0-0
+  restartPolicy: Never
+```
+
+Core dumps can also be uploaded to S3 automatically. To do that, set `coreDumpUploader.enabled` to `true` and configure the S3 bucket,
+AWS region, and credentials secret in the `coreDumpUploader` section. Note that the `createCoreDumpsClaim` flag for the relevant role (data/coordinators)
+must also be set to `true`, as the uploader sidecar mounts the same PVC used for core dump storage. Core dumps are uploaded to
+`s3://///`.
+
 ## Monitoring

@@ -743,7 +778,7 @@ and their default values.
 | `prometheus.memgraphExporter.pullFrequencySeconds` | How often will Memgraph's Prometheus exporter pull data from Memgraph instances. | `5` |
 | `prometheus.memgraphExporter.repository` | The repository where Memgraph's Prometheus exporter image is available. | `memgraph/prometheus-exporter` |
 | `prometheus.memgraphExporter.tag` | The tag of Memgraph's Prometheus exporter image. | `0.2.1` |
-| `prometheus.serviceMonitor.enabled` | If enabled, a `ServiceMonitor` object will be deployed. | `true` | 
+| `prometheus.serviceMonitor.enabled` | If enabled, a `ServiceMonitor` object will be deployed. | `true` |
 | `prometheus.serviceMonitor.kubePrometheusStackReleaseName` | The release name under which `kube-prometheus-stack` chart is installed. | `kube-prometheus-stack` |
 | `prometheus.serviceMonitor.interval` | How often will Prometheus pull data from Memgraph's Prometheus exporter. | `15s` |
 | `labels.coordinators.podLabels` | Enables you to set labels on a pod level. | `{}` |
@@ -754,6 +789,17 @@
| `extraEnv.coordinators` | Env variables that users can define and are applied to coordinators | `[]` |
 | `initContainers.data` | Init containers that users can define that will be applied to data instances. | `[]` |
 | `initContainers.coordinators` | Init containers that users can define that will be applied to coordinators. | `[]` |
+| `coreDumpUploader.enabled` | Enable the core dump S3 uploader sidecar. Requires `createCoreDumpsClaim` for the relevant role (`storage.data` or `storage.coordinators`) to be `true`. | `false` |
+| `coreDumpUploader.image.repository` | Docker image repository for the uploader sidecar. | `amazon/aws-cli` |
+| `coreDumpUploader.image.tag` | Docker image tag for the uploader sidecar. | `2.33.28` |
+| `coreDumpUploader.image.pullPolicy` | Image pull policy for the uploader sidecar. | `IfNotPresent` |
+| `coreDumpUploader.s3BucketName` | S3 bucket name where core dumps will be uploaded. | `""` |
+| `coreDumpUploader.s3Prefix` | S3 key prefix (folder) for uploaded core dumps. | `core-dumps` |
+| `coreDumpUploader.awsRegion` | AWS region of the S3 bucket. | `us-east-1` |
+| `coreDumpUploader.pollIntervalSeconds` | How often (in seconds) the sidecar checks for new core dump files. | `30` |
+| `coreDumpUploader.secretName` | Name of the K8s Secret containing AWS credentials. | `aws-s3-credentials` |
+| `coreDumpUploader.accessKeySecretKey` | Key in the K8s Secret for `AWS_ACCESS_KEY_ID`. | `AWS_ACCESS_KEY_ID` |
+| `coreDumpUploader.secretAccessKeySecretKey` | Key in the K8s Secret for `AWS_SECRET_ACCESS_KEY`. | `AWS_SECRET_ACCESS_KEY` |

 For the `data` and `coordinators` sections, each item in the list has the
diff --git a/pages/database-management/debugging.mdx b/pages/database-management/debugging.mdx
index 4bd1b9498..c6d075b74 100644
--- a/pages/database-management/debugging.mdx
+++ b/pages/database-management/debugging.mdx
@@ -578,6 +578,8 @@ To enable core dumps, create a `values.yaml` file with at least the following se
     createCoreDumpsClaim: true
 ```
+
+If you're running the Memgraph high availability
chart, you can automatically upload [core dumps to S3](/pages/clustering/high-availability/setup-ha-cluster-k8s.mdx).
+
 Setting this value to true will also enable the use of GDB inside Memgraph
 containers when using our provided [charts](https://github.com/memgraph/helm-charts).
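+
+For the high availability chart, a minimal `values.yaml` sketch for the S3 uploader might look as follows (parameter names are taken from the HA chart's `coreDumpUploader` section; the bucket and secret names are placeholders you must replace):
+
+```yaml
+coreDumpUploader:
+  enabled: true
+  s3BucketName: my-core-dump-bucket   # placeholder, use your own bucket
+  s3Prefix: core-dumps
+  awsRegion: us-east-1
+  secretName: aws-s3-credentials      # K8s Secret holding AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY
+storage:
+  data:
+    createCoreDumpsClaim: true        # the uploader sidecar mounts the same core dump PVC
+```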