Tracing

Tracing can be enabled by setting spark.comet.tracing.enabled=true.

With this feature enabled, each Spark executor will write a JSON event log file in Chrome's Trace Event Format. The file will be written to the executor's current working directory with the filename comet-event-trace-{pid}.json, where {pid} is the executor process ID.
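As a rough illustration of the naming scheme above, the following sketch collects trace files from a directory and recovers the executor process ID from each filename. The helper name `find_trace_files` is hypothetical and not part of Comet; the only detail taken from the text is the `comet-event-trace-{pid}.json` pattern.

```python
import re
from pathlib import Path

# Filename pattern described above: comet-event-trace-{pid}.json
TRACE_NAME = re.compile(r"^comet-event-trace-(\d+)\.json$")


def find_trace_files(directory):
    """Return {pid: path} for trace files found directly in `directory`.

    Hypothetical helper for illustration; not part of Comet.
    """
    found = {}
    for path in Path(directory).glob("comet-event-trace-*.json"):
        match = TRACE_NAME.match(path.name)
        if match:  # ignore files whose {pid} part is not numeric
            found[int(match.group(1))] = path
    return found
```

Which directory to search depends on where your cluster manager starts each executor, since the file is written to the executor's current working directory.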

Additionally, building Comet with the jemalloc feature enables tracing of native memory allocations:

make release COMET_FEATURES="jemalloc"

Example output:

{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "B", "pid": 12345, "tid": 5, "ts": 10109225730 },
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109228835 },
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "B", "pid": 12345, "tid": 5, "ts": 10109245928 },
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109248843 },
{ "name": "execute_plan", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109350935 },
{ "name": "CometExecIterator_getNextBatch", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109367116 },
{ "name": "CometExecIterator_getNextBatch", "cat": "PERF", "ph": "B", "pid": 12345, "tid": 5, "ts": 10109479156 },
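For a quick look at the data without a trace viewer, begin (`"ph": "B"`) and end (`"ph": "E"`) events can be paired to estimate how much time was spent in each named span. The sketch below is a minimal example, not part of Comet. It assumes one event per line with a trailing comma, as in the excerpt above, and tolerates optional `[` and `]` lines. The function names `parse_events` and `total_durations` are hypothetical. The Trace Event Format specifies `ts` in microseconds, so the totals are assumed to be in microseconds as well.

```python
import json
from collections import defaultdict


def parse_events(text):
    """Parse one-event-per-line trace text into a list of dicts."""
    events = []
    for line in text.splitlines():
        line = line.strip().rstrip(",")  # drop the trailing comma after each event
        if line in ("", "[", "]"):       # tolerate optional array brackets
            continue
        events.append(json.loads(line))
    return events


def total_durations(events):
    """Sum the time between matching B and E events, per event name."""
    open_spans = defaultdict(list)  # (pid, tid, name) -> stack of begin timestamps
    totals = defaultdict(int)
    for ev in events:
        key = (ev.get("pid"), ev.get("tid"), ev.get("name"))
        if ev.get("ph") == "B":
            open_spans[key].append(ev["ts"])
        elif ev.get("ph") == "E" and open_spans[key]:
            # An E event with no open B (e.g. a truncated excerpt) is skipped.
            totals[ev["name"]] += ev["ts"] - open_spans[key].pop()
    return dict(totals)


sample = """
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "B", "pid": 12345, "tid": 5, "ts": 10109225730 },
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109228835 },
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "B", "pid": 12345, "tid": 5, "ts": 10109245928 },
{ "name": "decodeShuffleBlock", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109248843 },
{ "name": "execute_plan", "cat": "PERF", "ph": "E", "pid": 12345, "tid": 5, "ts": 10109350935 },
"""

print(total_durations(parse_events(sample)))  # {'decodeShuffleBlock': 6020}
```

In this excerpt the `execute_plan` end event has no matching begin event, so it is ignored rather than counted.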

Traces can be viewed with Perfetto UI.

[Figure: example trace visualization]

Definition of Labels

| Label | Meaning |
| --- | --- |
| jvm_heap_used | JVM heap memory usage of live objects for the executor process |
| jemalloc_allocated | Native memory usage for the executor process |
| task_memory_comet_NNN | Off-heap memory allocated by Comet for query execution |
| task_memory_spark_NNN | On-heap and off-heap memory allocated by Spark |
| comet_shuffle_NNN | Off-heap memory allocated by Comet for columnar shuffle |
| shuffle_spilled_bytes | Bytes written to disk in a single shuffle spill operation |
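When post-processing a trace, it can help to group the labels above into families. The sketch below is a hypothetical helper, not part of Comet. It assumes that the `NNN` suffix is a run of decimal digits; this document does not state what the suffix identifies, so the code returns it uninterpreted.

```python
# Labels that appear without a suffix in the table above.
FIXED_LABELS = {"jvm_heap_used", "jemalloc_allocated", "shuffle_spilled_bytes"}

# Label families that carry an NNN suffix in the table above.
SUFFIXED_PREFIXES = ("task_memory_comet_", "task_memory_spark_", "comet_shuffle_")


def split_label(label):
    """Return (family, suffix) for a known label, or None if unrecognized.

    Assumption: NNN is numeric. The suffix is None for fixed labels.
    """
    if label in FIXED_LABELS:
        return (label, None)
    for prefix in SUFFIXED_PREFIXES:
        suffix = label[len(prefix):]
        if label.startswith(prefix) and suffix.isdigit():
            return (prefix + "NNN", suffix)
    return None
```

For example, `split_label("task_memory_comet_42")` yields `("task_memory_comet_NNN", "42")`, which lets all Comet task-memory series be aggregated under one key.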