Skip to content

Conversation

@danish9039
Copy link
Contributor

@danish9039 danish9039 commented Jan 8, 2026

PR Description

This pull request introduces fixes, configuration improvements, and new deployments for the OTEL demo environment.

New Deployments

  • Spark Dependencies: Added a CronJob (spark-dependencies-cronjob-opensearch.yaml) to calculate dependencies for OpenSearch storage.
  • HotROD: Added a standalone hotrod.yaml deployment manifest to provide independent lifecycle control over the HotROD application.

Configuration Updates

  • Ingress

    • Corrected the jaeger-hotrod service port to 8080.
    • Split TLS certificates into two separate secrets (jaeger-ui-only-tls and hotrod-ui-only-tls) to avoid rate limiting and simplify certificate management.
  • Jaeger

    • Enabled create_mappings: true for OpenSearch to ensure correct index setup.
    • Updated OTLP endpoints in jaeger-values.yaml to reference the jaeger service instead of jaeger-collector.
    • Standardized imagePullPolicy to Always.
  • OpenTelemetry Demo

    • Updated the Collector exporter to send traces to the jaeger service.
    • Set OTEL_SERVICE_NAME for the frontend component to otelstore-frontend-ui.
    • Simplified OTEL_RESOURCE_ATTRIBUTES configuration.

Upgrades

  • Jaeger chart: Upgraded to version 4.2.3
  • OpenSearch storage: Upgraded to version 3.3.2
  • OpenSearch Dashboards: Upgraded image tag from 2.11.0 to 3.3.0 for compatibility with backend services

Script Changes

  • deploy-all.sh

    • Removed the deprecated clone_jaeger_v2 logic.
    • Added deployment execution for the Spark dependencies job.
    • Updated service readiness checks to validate against the primary jaeger service.

Signed-off-by: danish9039 <danishsiddiqui040@gmail.com>
Signed-off-by: danish9039 <danishsiddiqui040@gmail.com>
Signed-off-by: danish9039 <danishsiddiqui040@gmail.com>
@danish9039 danish9039 requested a review from a team as a code owner January 8, 2026 10:12
@danish9039 danish9039 requested a review from jkowall January 8, 2026 10:12
@codecov
Copy link

codecov bot commented Jan 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.53%. Comparing base (cb60fb4) to head (c898155).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #7843   +/-   ##
=======================================
  Coverage   95.53%   95.53%           
=======================================
  Files         307      307           
  Lines       15911    15911           
=======================================
  Hits        15201    15201           
  Misses        558      558           
  Partials      152      152           
Flag Coverage Δ
badger_v1 9.18% <ø> (ø)
badger_v2 1.93% <ø> (ø)
cassandra-4.x-v1-manual 13.58% <ø> (ø)
cassandra-4.x-v2-auto 1.92% <ø> (ø)
cassandra-4.x-v2-manual 1.92% <ø> (ø)
cassandra-5.x-v1-manual 13.58% <ø> (ø)
cassandra-5.x-v2-auto 1.92% <ø> (ø)
cassandra-5.x-v2-manual 1.92% <ø> (ø)
clickhouse 1.97% <ø> (ø)
elasticsearch-6.x-v1 17.54% <ø> (ø)
elasticsearch-7.x-v1 17.57% <ø> (ø)
elasticsearch-8.x-v1 17.73% <ø> (ø)
elasticsearch-8.x-v2 1.93% <ø> (ø)
elasticsearch-9.x-v2 1.93% <ø> (ø)
grpc_v1 8.84% <ø> (ø)
grpc_v2 1.93% <ø> (ø)
kafka-3.x-v2 1.93% <ø> (ø)
memory_v2 1.93% <ø> (ø)
opensearch-1.x-v1 17.62% <ø> (ø)
opensearch-2.x-v1 17.62% <ø> (ø)
opensearch-2.x-v2 1.93% <ø> (ø)
opensearch-3.x-v2 1.93% <ø> (ø)
query 1.93% <ø> (ø)
tailsampling-processor 0.55% <ø> (ø)
unittests 94.16% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@github-actions
Copy link

github-actions bot commented Jan 8, 2026

Metrics Comparison Summary

Total changes across all snapshots: 0

Detailed changes per snapshot

summary_metrics_snapshot_cassandra

📊 Metrics Diff Summary

Total Changes: 0

  • 🆕 Added: 0 metrics
  • ❌ Removed: 0 metrics
  • 🔄 Modified: 0 metrics
  • 🚫 Excluded: 53 metrics

➡️ View full metrics file

Copy link
Contributor

@jkowall jkowall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR, just one note.

name: jaeger-hotrod
port:
number: 80
number: 8080
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want this on 8080?

@jkowall jkowall requested a review from Copilot January 8, 2026 10:56
@jkowall jkowall added the changelog:dependencies Update to dependencies label Jan 8, 2026
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request upgrades the OTEL demo environment to use OpenSearch 3.x and Jaeger chart 4.x, adds a Spark dependency analysis job, introduces a standalone HotROD deployment, and fixes various configuration issues.

Key Changes:

  • Upgraded Jaeger Helm chart to version 4.2.3 and removed deprecated v2 branch cloning logic
  • Upgraded OpenSearch and OpenSearch Dashboards to version 3.3.x
  • Added Spark dependencies CronJob for trace dependency analysis and standalone HotROD deployment manifest
  • Fixed service endpoints and OTLP configurations to use the primary jaeger service instead of jaeger-collector
  • Split TLS certificates into separate secrets for better certificate management

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 13 comments.

Show a summary per file
File Description
spark-dependencies-cronjob-opensearch.yaml New CronJob for calculating Jaeger dependencies from OpenSearch storage, runs every 15 minutes
otel-demo-values.yaml Simplified OTEL_RESOURCE_ATTRIBUTES, added frontend service name override, updated collector to use jaeger service
opensearch-dashboard-values.yaml Removed hardcoded image tag to allow version management via deployment script
jaeger-values.yaml Updated OTLP endpoints to jaeger service, standardized image pull policy, configured HotROD service name
jaeger-config.yaml Enabled create_mappings for OpenSearch to ensure proper index setup
ingress/ingress-jaeger.yaml Fixed HotROD service port to 8080, split TLS into separate secrets per host
hotrod.yaml New standalone HotROD deployment with independent lifecycle management
deploy-all.sh Removed deprecated clone logic, added version variables, deployed new components, updated service checks

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +34 to +35
- name: OTEL_RESOURCE_ATTRIBUTES
value: service.name.prefix=hotrod-
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The OTEL_SERVICE_NAME is set to "hotrod-frontend" but there's also a OTEL_RESOURCE_ATTRIBUTES with "service.name.prefix=hotrod-". This configuration is contradictory - the resource attribute appears to be trying to set a service name prefix, but this is not a standard OpenTelemetry attribute. The service.name.prefix attribute doesn't exist in the OpenTelemetry specification. If the intent is to prefix the service name, this should be handled differently or removed.

Suggested change
- name: OTEL_RESOURCE_ATTRIBUTES
value: service.name.prefix=hotrod-

Copilot uses AI. Check for mistakes.
Comment on lines +248 to +249
--version "${OPENSEARCH_VERSION}" \
--set image.tag="${OPENSEARCH_VERSION}" \
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --version parameter for the Helm chart is being set to the OpenSearch version (3.3.2), but this should be the Helm chart version, not the application version. The chart version and application version are different. You should specify the appropriate chart version separately from the image tag. For example, the chart version might be something like "2.x.x" while the image tag is "3.3.2".

Copilot uses AI. Check for mistakes.
Comment on lines +257 to +258
--version "${OPENSEARCH_DASHBOARDS_VERSION}" \
--set image.tag="${OPENSEARCH_DASHBOARDS_VERSION}" \
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the OpenSearch deployment, the --version parameter for the OpenSearch Dashboards Helm chart is being set to the application version (3.3.0), but this should be the Helm chart version, not the application version. The chart version and application version are different and should be specified separately.

Copilot uses AI. Check for mistakes.
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: "http://jaeger:4318"
- name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
value: "http://jaeger:4318/v1/traces"
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HotROD deployment is missing environment variables for OTLP protocol configuration. The jaeger-values.yaml file sets OTEL_EXPORTER_OTLP_PROTOCOL to "http/protobuf" (lines 30-31), but this environment variable is not present in the standalone HotROD deployment. This inconsistency could lead to different behavior between the two deployments.

Suggested change
value: "http://jaeger:4318/v1/traces"
value: "http://jaeger:4318/v1/traces"
- name: OTEL_EXPORTER_OTLP_PROTOCOL
value: "http/protobuf"

Copilot uses AI. Check for mistakes.
memory: 1Gi
limits:
cpu: 500m
memory: 2Gi
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory limit of 2Gi might be insufficient for Spark dependency processing depending on the volume of traces. The JAVA_OPTS sets max heap to 1g, but Spark also needs off-heap memory for overhead. Consider monitoring memory usage and adjusting limits if OOMKilled errors occur. A safer configuration might be to increase the limit to 3Gi or adjust the heap settings to leave more room for off-heap memory.

Suggested change
memory: 2Gi
memory: 3Gi

Copilot uses AI. Check for mistakes.
- hosts:
- hotrod.demo.jaegertracing.io
secretName: jaeger-demo-tls
secretName: hotrod-ui-only-tls
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ingress now has two separate TLS secrets for the two hosts, but there's no validation that these secrets exist. Ensure that both jaeger-ui-only-tls and hotrod-ui-only-tls secrets are created before deploying this ingress, otherwise the TLS configuration will fail.

Suggested change
secretName: hotrod-ui-only-tls
secretName: hotrod-ui-only-tls
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: jaeger-ui-only-tls
namespace: jaeger
spec:
secretName: jaeger-ui-only-tls
dnsNames:
- jaeger.demo.jaegertracing.io
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
name: hotrod-ui-only-tls
namespace: jaeger
spec:
secretName: hotrod-ui-only-tls
dnsNames:
- hotrod.demo.jaegertracing.io
issuerRef:
name: letsencrypt-prod
kind: ClusterIssuer

Copilot uses AI. Check for mistakes.
ports:
- containerPort: 8080
---
apiVersion: v1
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a trailing space after the service type declaration on line 31 that should be removed for consistency.

Suggested change
apiVersion: v1
apiVersion: v1

Copilot uses AI. Check for mistakes.
- name: OTEL_EXPORTER_OTLP_ENDPOINT
value: "http://jaeger:4318"
- name: OTEL_EXPORTER_OTLP_TRACES_ENDPOINT
value: "http://jaeger:4318/v1/traces"
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The standalone HotROD deployment is missing the OTEL_SERVICE_NAME, OTEL_RESOURCE_ATTRIBUTES, and OTEL_LOG_LEVEL environment variables that are configured in jaeger-values.yaml (lines 32-37). This creates an inconsistency between the HotROD deployment when managed by the Jaeger Helm chart versus the standalone deployment, which could lead to different observability behavior and make it harder to trace issues.

Suggested change
value: "http://jaeger:4318/v1/traces"
value: "http://jaeger:4318/v1/traces"
- name: OTEL_SERVICE_NAME
value: "jaeger-hotrod"
- name: OTEL_RESOURCE_ATTRIBUTES
value: "service.name=jaeger-hotrod"
- name: OTEL_LOG_LEVEL
value: "info"

Copilot uses AI. Check for mistakes.
@@ -1,7 +1,5 @@

image:
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tag field has been removed from the image configuration, but no explicit version is being set. This means the deployment will use the default tag from the Helm chart values. Given that the deploy-all.sh script now sets --set image.tag="${OPENSEARCH_DASHBOARDS_VERSION}", this is correct. However, it would be clearer to include a comment explaining that the version is managed via the deployment script to avoid confusion.

Suggested change
image:
image:
# Note: image.tag is intentionally not set here.
# The deploy-all.sh script sets it via:
# --set image.tag="${OPENSEARCH_DASHBOARDS_VERSION}"

Copilot uses AI. Check for mistakes.
# Narrower service namespace + explicit environment tag
- name: OTEL_RESOURCE_ATTRIBUTES
value: service.name=$(OTEL_SERVICE_NAME),service.namespace=otel-demo,deployment.environment=oke-dev
value: service.namespace=otel-demo,deployment.environment=oke-dev
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing service.name=$(OTEL_SERVICE_NAME) from the default OTEL_RESOURCE_ATTRIBUTES will cause services to lose their service name in the resource attributes, unless each component explicitly sets OTEL_SERVICE_NAME. This could lead to services being incorrectly identified in tracing. Verify that all components that need service names either have OTEL_SERVICE_NAME set individually or that the SDK defaults are acceptable for your use case.

Suggested change
value: service.namespace=otel-demo,deployment.environment=oke-dev
value: service.name=$(OTEL_SERVICE_NAME),service.namespace=otel-demo,deployment.environment=oke-dev

Copilot uses AI. Check for mistakes.
ROLLOUT_TIMEOUT="${ROLLOUT_TIMEOUT:-600}"

# Versions
OPENSEARCH_VERSION="${OPENSEARCH_VERSION:-3.3.2}"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OPENSEARCH_VERSION="${OPENSEARCH_VERSION:-3.3.2}"
DEFAULT_OPENSEARCH_VERSION=3.3.2
OPENSEARCH_VERSION="${OPENSEARCH_VERSION:-${DEFAULT_OPENSEARCH_VERSION}}"

echo " clean - Clean install (removes existing deployment first)"
echo ""
echo "Environment Variables:"
echo " OPENSEARCH_VERSION - Version of OpenSearch (default: 3.3.2)"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
echo " OPENSEARCH_VERSION - Version of OpenSearch (default: 3.3.2)"
echo " OPENSEARCH_VERSION - Version of OpenSearch (default: ${DEFAULT_OPENSEARCH_VERSION})"

spec:
containers:
- name: jaeger-hotrod
image: docker.io/jaegertracing/example-hotrod:1.72.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
image: docker.io/jaegertracing/example-hotrod:1.72.0
image: docker.io/jaegertracing/example-hotrod:2.14.0

name: jaeger-spark-dependencies
namespace: jaeger
spec:
schedule: "*/15 * * * *"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does the job know that you're running it every 15min and that it only needs to look in the last 15min window?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants