Spark Stage Details

This tab displays the summary metrics of the tasks that completed within the selected Spark stage.

  • Percentile: The value below which the given percentage of tasks fall. For example, the 95th percentile of Duration is the time within which 95% of the tasks in the stage completed.
  • Duration: Time taken by the tasks to complete.
  • Executor CpuTime: Total CPU time taken by the executor to run the task (in milliseconds).
  • Executor DeserializeTime: Time taken by the executor to deserialize tasks.
  • Jvm GcTime: Time spent by the JVM in garbage collection while executing a task.
  • Result SerializationTime: Time spent serializing a task result.
  • Peak ExecutionMemory: The memory used by internal data structures during shuffles, aggregations, and joins.
  • Shuffle writeTime: Time spent writing serialized data on all executors.
  • Shuffle BytesWritten: Bytes written to the host. These bytes are read by a shuffle later when needed.
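These percentile summaries correspond to what Spark itself exposes through the taskSummary endpoint of its monitoring REST API. A minimal sketch, assuming the Spark UI is reachable at localhost:4040; the application ID, stage ID, and attempt below are placeholders:

    import requests

    # Assumptions: Spark UI at localhost:4040; the IDs below are placeholders.
    base = "http://localhost:4040/api/v1"
    app_id = "app-20240101120000-0001"   # hypothetical application ID
    stage_id, attempt = 3, 0             # hypothetical stage ID and attempt

    # taskSummary returns percentile distributions of task metrics for the stage.
    resp = requests.get(
        f"{base}/applications/{app_id}/stages/{stage_id}/{attempt}/taskSummary",
        params={"quantiles": "0.25,0.5,0.75,0.95"},
    )
    for metric, values in resp.json().items():
        print(metric, values)  # one value per requested quantile (nested for shuffle metrics)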

Tasks Analysis by Metrics

Note: To see the task analysis for the Details metrics, click any row in the Details table. The data is then displayed as a bar chart for each metric.

The bar chart represents two kinds of values:

  • The percentile value.
  • The average percentile value.
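To make the two values concrete: the percentile value is the metric value below which the stated fraction of tasks fall, and the average is plotted alongside it for comparison. A small sketch with hypothetical task durations:

    import math

    durations = [1.2, 1.5, 1.7, 2.0, 2.2, 2.8, 3.1, 3.5, 4.0, 9.6]  # hypothetical, in seconds

    def percentile(values, p):
        # Nearest-rank percentile: smallest value with at least p% of the data at or below it.
        ordered = sorted(values)
        rank = math.ceil(p / 100 * len(ordered))
        return ordered[rank - 1]

    print("75th percentile:", percentile(durations, 75))  # 3.5
    print("95th percentile:", percentile(durations, 95))  # 9.6
    print("average:", sum(durations) / len(durations))    # 3.16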

Executors

This tab displays the following metrics, aggregated per executor on each host, for the selected stage ID.

  • eid: The executor ID.
  • Host: The host on which the executor is running.
  • failedTasks: The number of failed tasks in the executor.
  • killedTasks: The number of killed (terminated) tasks in the executor.
  • SucceededTasks: The number of successfully completed tasks in the executor.
  • taskTime: The time spent on the tasks.
  • memoryBytesSpilled: The size of the deserialized data in memory at the time it is spilled.
  • Input Bytes: The number of bytes read by the executor in that stage.
  • Output Bytes: The number of bytes written by the executor in that stage.
  • shuffleRead: The amount of serialized data read on the executor.
  • ShuffleWrite: The amount of serialized data written on the executor.

Note: To view the host details, click the host name.
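Spark exposes comparable per-executor aggregates for a stage through the executorSummary field of its stage REST endpoint. A rough sketch, with the same placeholder IDs as above; the field names follow recent Spark versions, so verify them against your deployment:

    import requests

    base = "http://localhost:4040/api/v1"   # assumes a locally reachable Spark UI
    app_id = "app-20240101120000-0001"      # hypothetical application ID
    stage_id, attempt = 3, 0                # hypothetical stage ID and attempt

    stage = requests.get(f"{base}/applications/{app_id}/stages/{stage_id}/{attempt}").json()

    # executorSummary maps each executor ID to its aggregated metrics for this stage.
    for eid, agg in stage.get("executorSummary", {}).items():
        print(eid, agg.get("succeededTasks"), agg.get("failedTasks"),
              agg.get("taskTime"), agg.get("shuffleRead"), agg.get("shuffleWrite"))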

Task

This tab displays the following metrics for each task in the selected stage.

  • taskId: The ID of the task.
  • status: The status of the task, which can be one of the following: running, succeeded, failed, or unknown.
  • taskLocality: The locality level of the task, which can be one of the following: PROCESS_LOCAL, NODE_LOCAL, RACK_LOCAL, or NO_PREF.
  • host: The host on which the task runs.
  • Duration: Time elapsed in completing the task.
  • Jvm GcTime: Time spent by the JVM in garbage collection while executing the task.
  • Result SerializationTime: Time spent serializing the task result.
  • Peak ExecutionMemory: The memory used by internal data structures during shuffles, aggregations, and joins.
  • Input Read Bytes: Bytes read by the executor in that stage.
  • Shuffle Read Blocked Time: The time tasks spent blocked waiting for shuffle data to be read from remote machines.
  • Shuffle Records Read: The number of shuffle records read by the task.
  • Shuffle Remote Reads: The shuffle bytes read from remote executors.
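These per-task rows mirror Spark's taskList REST endpoint, which supports paging and sorting. For example, pulling the ten longest-running tasks of a stage (placeholder IDs as before):

    import requests

    base = "http://localhost:4040/api/v1"
    app_id = "app-20240101120000-0001"   # hypothetical application ID
    stage_id, attempt = 3, 0             # hypothetical stage ID and attempt

    # sortBy=-runtime orders tasks by decreasing runtime; length caps the page size.
    tasks = requests.get(
        f"{base}/applications/{app_id}/stages/{stage_id}/{attempt}/taskList",
        params={"sortBy": "-runtime", "length": 10},
    ).json()

    for t in tasks:
        print(t["taskId"], t["status"], t["taskLocality"], t["host"], t.get("duration"))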

Trace

The Trace tab displays the logs for the selected stage ID. You can use these logs to inspect the internal state of the jobs running in the stage you are currently viewing.

This tab also displays the log for any errors that occur in the tasks of that stage.
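How much detail lands in these logs is governed by Spark's own log level. A minimal sketch using the standard SparkContext.setLogLevel method; the application name is a placeholder, and whether the extra output surfaces in this tab depends on how your deployment collects logs:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("trace-example").getOrCreate()

    # Raise verbosity so more task-level state shows up in the driver and executor
    # logs; use "WARN" or "ERROR" instead to keep the trace focused on failures.
    spark.sparkContext.setLogLevel("INFO")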

DAG

The Directed Acyclic Graph (DAG) displays a flow diagram of the Spark job.

A DAG is a work-scheduling graph with a finite set of vertices connected by edges. In a Spark job, the vertices represent RDDs (Resilient Distributed Datasets) and the edges represent the operations applied to them. RDDs are fault tolerant by nature.

The order in which the jobs execute is specified by the directions of the edges in the graph. The graph is acyclic because it contains no loops or cycles.
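You can print the lineage behind such a DAG directly from an RDD; a minimal PySpark sketch:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("dag-example").getOrCreate()
    sc = spark.sparkContext

    # Each transformation adds a vertex to the lineage DAG; reduceByKey
    # introduces a shuffle, which becomes a stage boundary in the graph.
    words = sc.parallelize(["a b a", "b c"]).flatMap(lambda line: line.split())
    counts = words.map(lambda w: (w, 1)).reduceByKey(lambda x, y: x + y)

    # toDebugString prints the RDD lineage; indentation marks stage boundaries.
    print(counts.toDebugString().decode("utf-8"))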
