Grafana Loki is configured in a YAML file which contains information on the Loki server and its individual components.
Some of these settings need to be tuned for network observability according to your cluster size, number of flows, and sampling rate.
Use the following commands to update your Loki configuration.
Update zero-click-loki/2-loki.yaml or zero-click-loki/2-loki-tls.yaml with your custom configuration.
Then replace the related config using:
```bash
oc replace --force -f zero-click-loki/2-loki.yaml
```
The pod will restart automatically.
Update loki-microservices/1-prerequisites/config.yaml with your custom configuration.
Then replace the config using:
```bash
oc replace --force -f loki-microservices/1-prerequisites/config.yaml
```
Restart all pods of the Loki instance:
```bash
oc delete pods --selector app.kubernetes.io/instance=loki -n netobserv
```
LokiStack needs to be set to the Unmanaged management state first to allow ConfigMap updates.
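For reference, this is the relevant field in the LokiStack resource; a minimal sketch assuming your LokiStack instance is named lokistack in the netobserv namespace:
```yaml
apiVersion: loki.grafana.com/v1
kind: LokiStack
metadata:
  name: lokistack    # assumption: adjust to your instance name
  namespace: netobserv
spec:
  managementState: Unmanaged   # stops the operator from reverting manual ConfigMap edits
```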
Run the following command to get the lokistack-config ConfigMap in the netobserv namespace and decode its content:
```bash
oc get configmap lokistack-config -n netobserv -o yaml | yq '.binaryData | map_values(. | @base64d)' > binaryData.txt
```
Update the binaryData.txt file accordingly.
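Once decoded, binaryData.txt should hold the Loki configuration as plain YAML. A rough sketch of what to expect; the config.yaml and runtime-config.yaml keys are assumptions that depend on your operator version:
```yaml
config.yaml: |
  # full Loki configuration; edit the relevant sections, e.g.:
  limits_config:
    ingestion_rate_mb: 4
runtime-config.yaml: |
  # per-tenant runtime overrides, if any
```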
Then run the following commands to update the lokistack-config ConfigMap in the netobserv namespace using the updated file:
```bash
BINARY_CONFIG=$(yq -o=json -I=0 'map_values(. | @base64)' binaryData.txt) && echo $BINARY_CONFIG
oc patch configmap lokistack-config -n netobserv -p '{"binaryData":'$BINARY_CONFIG'}'
```
Restart all pods of the LokiStack instance:
```bash
oc delete pods --selector app.kubernetes.io/name=lokistack -n netobserv
```
The query frontend splits larger queries into multiple smaller queries, executes them in parallel on downstream queriers, and stitches the results back together. This prevents large (e.g. multi-day) queries from causing out-of-memory issues in a single querier and helps them execute faster.
Check the Grafana official documentation for more details.
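The split interval itself is controlled by split_queries_by_interval. A minimal sketch; note that in recent Loki releases this setting lives under limits_config, while older releases expect it under query_range, so check your version:
```yaml
limits_config:
  split_queries_by_interval: 30m   # split long-range queries into 30-minute sub-queries
```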
Some queries may be limited by the query scheduler. You will need to update the following configuration:
```yaml
query_range:
  parallelise_shardable_queries: true
query_scheduler:
  max_outstanding_requests_per_tenant: 100
```
Ensure parallelise_shardable_queries is set to true and increase max_outstanding_requests_per_tenant according to your needs (default = 100). It is reasonable to set a high value here, such as 2048, but be aware that it can degrade query performance when many users query concurrently.
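For example, a tuned sketch applying these recommendations (illustrative values only):
```yaml
query_range:
  parallelise_shardable_queries: true
query_scheduler:
  max_outstanding_requests_per_tenant: 2048   # high value; may slow concurrent multi-user queries
```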
Check query_scheduler configuration for more details.
The messages containing bulks of records received by the Loki distributor and exchanged between components have a maximum size in bytes, set by the following parameters:
```yaml
server:
  grpc_server_max_recv_msg_size: 4194304
  grpc_server_max_send_msg_size: 4194304
```
By default the size is 4194304 bytes (4 MiB). It is reasonable to increase it to 8388608 (8 MiB).
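For instance, doubling both limits as suggested above:
```yaml
server:
  grpc_server_max_recv_msg_size: 8388608   # 8 MiB
  grpc_server_max_send_msg_size: 8388608   # 8 MiB
```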
While flows are collected and enriched, latency builds up between the record timestamps and the current time. This particularly applies when using Kafka on large clusters.
Loki can be configured to reject old samples using the following configuration:
```yaml
limits_config:
  reject_old_samples_max_age: 168h
```
On top of that, Loki writes logs in chunks ordered by time. If a received message is older than the most recent one, it is considered out-of-order.
To accept messages within a specific time range, use the following configuration:
```yaml
ingester:
  max_chunk_age: 2h
```
Be careful: Loki calculates the earliest time that out-of-order entries may have and still be accepted as:
```
time_of_most_recent_line - (max_chunk_age / 2)
```
For example, with max_chunk_age set to 2h and the most recent line at 10:00, entries older than 09:00 are rejected.
Check accept out-of-order writes documentation for more info.
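Putting these settings together, a minimal sketch of a configuration that tolerates late records; reject_old_samples is written out explicitly here, although to our knowledge it defaults to true in recent Loki releases:
```yaml
limits_config:
  reject_old_samples: true           # drop samples older than the max age below
  reject_old_samples_max_age: 168h   # 7 days
ingester:
  max_chunk_age: 2h                  # out-of-order entries accepted up to max_chunk_age / 2 back
```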
The number of active streams can be limited per user per ingester (unlimited by default) or per user across the cluster (default = 5000).
To update these limits, you can tweak the following values:
```yaml
limits_config:
  max_streams_per_user: 0
  max_global_streams_per_user: 5000
```
It is not recommended to disable both limits by setting them to 0. With multiple ingesters, you may instead set max_streams_per_user to 5000 and disable max_global_streams_per_user, or simply increase the max_global_streams_per_user value.
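For example, following the recommendation above when running multiple ingesters:
```yaml
limits_config:
  max_streams_per_user: 5000       # per-user limit on each ingester
  max_global_streams_per_user: 0   # disabled across the cluster
```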
Check limits_config for more details.
Ingestion is limited in terms of sample size per second (ingestion_rate_mb) and per-distributor burst size (ingestion_burst_size_mb):
```yaml
limits_config:
  ingestion_rate_mb: 4
  ingestion_burst_size_mb: 6
```
It is common to set each of these above 10 MB. You can safely increase these two values, but keep an eye on your ingester performance and your storage size.
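For instance, a sketch with illustrative values, keeping the burst size above the rate as in the defaults:
```yaml
limits_config:
  ingestion_rate_mb: 10        # per-tenant ingestion rate, in MB per second
  ingestion_burst_size_mb: 15  # per-distributor burst allowance, in MB
```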
Check limits_config for more details.
Metric queries are limited in terms of unique series count. You can increase max_query_series up to 10000 if you get a "maximum of series reached for a single query" error in the console plugin.
We do not recommend going above 10000, since it affects both query frontend stability and browser rendering performance.
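For example (illustrative value; the Loki default is much lower, 500 to our knowledge):
```yaml
limits_config:
  max_query_series: 10000   # upper bound recommended here
```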
Check limits_config for more details.