Spark Thrift Queries
Click Spark Thrift → Queries to view the list of queries, which are summarized in the following charts.
note
- To view the distribution charts, click the .
- The default time range is Last 24 hrs. To view statistics from a custom date range, click the icon and select a time frame and timezone of your choice.
Chart | Description |
---|---|
Distribution | The Distributions panel displays the summary of jobs as a Sankey diagram. By default, the chart displays the distribution by Duration. You can filter the distribution by Input Data, Output Data, Shuffle Reads, or Shuffle Writes. |
Core Wastage | The Core Wastage chart displays core wastage by the following locality types, along with the Core Used and Core Wasted values (in %). Process Local: Tasks run within the same process as the source data. Node Local: Tasks run on the same machine as the source data. Rack Local: Tasks run in the same rack as the source data. Any: Tasks run anywhere else, on neither the same node nor the same rack as the source data. No pref: Tasks have no locality preference. Idle: Tasks in this category are idle. |
VCore Usage | The number of virtual cores (vCores) used by a queue in the cluster. This chart displays the Average VCore Usage and Max VCore Usage. |
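As a rough illustration of how the Core Used and Core Wasted percentages relate, the sketch below assumes that both are shares of the same allocated core time and therefore add up to 100%. This is an assumption, not the documented formula, and the function name and core-second inputs are hypothetical.

```python
def core_usage_percentages(used_core_seconds: float,
                           allocated_core_seconds: float) -> tuple[float, float]:
    """Return (core_used_pct, core_wasted_pct).

    Assumption: wasted core time is simply the allocated core time
    that was not used, so the two percentages sum to 100.
    """
    if allocated_core_seconds <= 0:
        raise ValueError("allocated_core_seconds must be positive")
    if not 0 <= used_core_seconds <= allocated_core_seconds:
        raise ValueError("used_core_seconds must be between 0 and the allocated value")

    used_pct = 100.0 * used_core_seconds / allocated_core_seconds
    wasted_pct = 100.0 - used_pct
    return used_pct, wasted_pct
```

For example, under this assumption a query that used 30 of 120 allocated core-seconds would show 25% Core Used and 75% Core Wasted.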
The following metrics are displayed for each user.
Metric | Description |
---|---|
User | The name of the user running a query. |
Pool | The name of the fair scheduler pool the user belongs to. |
State | The final state of the query run by the user. The state can be Failed, Finished, or Compiled. |
Over Head Time | The excess time taken for a query to run (in nanoseconds). |
Start Time | The time at which the user executed the query. |
Completion Time | The time at which the query completed execution. |
# of Stages | The number of stages of the query. |
Duration | The time taken to run the query. |
Data Read | The amount of data read by the query. |
Data Written | The amount of data written by the query. |
GC Time | The time spent by the JVM on garbage collection while executing the query. |
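The time-based metrics above can be related with a little arithmetic. The sketch below converts an Over Head Time value from nanoseconds to seconds and derives a duration from the start and completion times. It assumes that Duration is the difference between Completion Time and Start Time and that timestamps are available as ISO 8601 strings; both are assumptions for illustration, and the function names are hypothetical.

```python
from datetime import datetime

NANOSECONDS_PER_SECOND = 1_000_000_000


def overhead_seconds(overhead_ns: int) -> float:
    """Convert an Over Head Time value from nanoseconds to seconds."""
    return overhead_ns / NANOSECONDS_PER_SECOND


def duration_seconds(start_time: str, completion_time: str) -> float:
    """Return the elapsed seconds between two ISO 8601 timestamps.

    Assumption: Duration = Completion Time - Start Time.
    """
    started = datetime.fromisoformat(start_time)
    completed = datetime.fromisoformat(completion_time)
    return (completed - started).total_seconds()
```

Converting to seconds makes the overhead easier to compare against Duration and GC Time when judging how much of a query's runtime was spent outside actual execution.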