The Tez Query Details page contains the following panels:
- Query Trends
- YARN Diagnostics
- MapReduce Stats
- Query Execution Metrics
- Query Plan and DAG
The summary panel displays the following information.
|User||The name of the user that executed the job.|
|State||The state of the job, which can be one of the following: Created, Initialized, Compiled, Running, Finished, Exception, or Unknown.|
|Duration||The duration of the query execution.|
|Start Time||The time at which the query execution started.|
|End Time||The time at which the query execution ended.|
|No. of Vertices||The number of vertices in the query.|
|HDFS Read||The amount of HDFS data read.|
|HDFS Written||The amount of HDFS data written to an output file format.|
|Application ID||The ID of the user's application that you are currently viewing.|
The Query Trends panel displays a chart showing the pattern of jobs running at a particular time, based on the following factors.
|Elapsed Time||The time taken to run the jobs at a particular time.|
|VCores||The number of VCores consumed to execute the query within a timeframe.|
|Memory||The amount of memory used to execute the query within a timeframe.|
Click Compare Runs to compare different runs of the query, and then select the runs that you want to compare. You can choose from up to 10 previous runs of the query. Metrics that differ between the runs are highlighted and displayed at the top of the comparison result.
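The ordering described above, with differing metrics shown first, can be illustrated with a minimal sketch. This is not the product's implementation; the function name `compare_runs` and the sample metric values are hypothetical and exist only for illustration.

```python
def compare_runs(baseline, other):
    """Return (metric, baseline_value, other_value, differs) rows, differing metrics first."""
    rows = []
    for name in sorted(set(baseline) | set(other)):
        a = baseline.get(name)
        b = other.get(name)
        rows.append((name, a, b, a != b))
    # Stable sort: rows whose values differ move to the top,
    # and alphabetical order is preserved within each group.
    rows.sort(key=lambda row: not row[3])
    return rows


# Hypothetical metric values for two runs of the same query.
run_a = {"Duration (s)": 120, "HDFS Read (MB)": 512, "No. of Vertices": 4}
run_b = {"Duration (s)": 95, "HDFS Read (MB)": 512, "No. of Vertices": 4}

for name, a, b, differs in compare_runs(run_a, run_b):
    marker = "*" if differs else " "
    print(f"{marker} {name}: {a} -> {b}")
```

A metric that is present in only one run is compared against `None`, so it is also treated as a difference.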
The Recommendations panel displays recommendations that you can use to improve the performance of the SQL query.
The Query panel displays the query along with the join details and the following details of the tables used in the query.
|Table Name||The table used in the query.|
|Filter expression||The expression used in query filtering.|
|Total Rows||The total number of rows in the table.|
|Output Rows||The number of rows returned on executing the query.|
The YARN Diagnostics panel displays the following diagnostic metrics of the YARN application.
|Start Time||The time at which the YARN application started.|
|End Time||The time at which the YARN application ended.|
|State||The state of the YARN application. The state can be one of the following: Created, Initialized, Compiled, Running, Finished, Exception, or Unknown.|
|Message||The diagnostic message in the YARN application.|
|Message Count||The number of diagnostic messages.|
The following details are displayed for jobs in a YARN container.
A row contains data for a minute of the selected duration.
|Time||The minute at which the job is executed.|
|Preempted MB||The amount of memory (in MB) held by containers that were preempted.|
|Preempted VCores||The number of VCores held by containers that were preempted.|
|Allocated MB||The amount of memory allocated to the query (in MB).|
|Avg Memory||The average amount of memory used.|
|Avg VCore||The average number of VCores used.|
|Running Containers||The number of containers running in the query.|
|Queue Usage %||The amount of queue usage (in %).|
|Cluster Usage %||The amount of cluster usage (in %).|
|State||The state of the query that uses the YARN application. The state can be one of the following: Created, Initialized, Compiled, Running, Finished, Exception, or Unknown.|
|Message||The diagnostic message.|
MapReduce Stats
This tile displays statistics for the processing of large data sets on a worker node. You can monitor the statistics of the following processes by elapsed time.
These statistics can be sorted by Duration and Start Time.
Query Execution Metrics
The Query Execution Metrics panel displays the following set of metrics.
|Metric Type||Metric Name||Description|
|Task||Committed Heap Bytes||The amount of heap memory (in bytes) committed to the JVM.|
|Task||CPU Milliseconds||The CPU time used (in ms).|
|Task||GC Time||The time spent by the JVM in garbage collection while executing the query.|
|Task||Input Records Processed||The number of input records processed.|
|Task||Merge Phase Time||The time spent in the merge phase of the query.|
|Task||Merged Map Outputs||The number of map outputs that were merged.|
|Task||Output Bytes||The number of output bytes written to a file format while executing the query at a given time.|
|Task||Output Records||The number of output records.|
|Task||Physical Memory Bytes||The amount of physical memory (in bytes) that the query uses.|
|Task||Shuffle Bytes To Disk||The number of shuffle bytes written to disk.|
|Task||Shuffle Bytes To Mem||The number of shuffle bytes written to memory.|
|Task||Shuffle Bytes||The number of bytes shuffled by the query.|
|Task||Shuffle Phase Time||The time spent in the shuffle phase of the query.|
|Task||Spilled Records||The number of spilled records.|
|Task||Virtual Memory Bytes||The amount of virtual memory (in bytes) that the query uses.|
|File System Metrics||HDFS Bytes Written||The number of bytes written to HDFS.|
|File System Metrics||File Bytes Written||The number of bytes written to the local file system.|
|File System Metrics||HDFS Bytes Read||The number of bytes read from HDFS.|
|File System Metrics||File Bytes Read||The number of bytes read from the local file system.|
|DAG Metrics||Total Launched Tasks||The number of tasks launched in the DAG.|
|DAG Metrics||Rack Local Tasks||The number of tasks that are local to the rack.|
|DAG Metrics||Data Local Tasks||The number of tasks that are local to the data.|
|DAG Metrics||# of Succeeded Tasks||The number of tasks that completed successfully.|
|DAG Metrics||Application Master Cpu Time||The CPU time used by the application master.|
|DAG Metrics||Application Master GC Time||The time spent in garbage collection by the application master.|
|HIVE Metrics||Hive Files Created||The number of Hive files that were created.|
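Raw counters such as these are often easier to read as ratios. The sketch below derives a few illustrative ratios from the DAG Metrics counters. It assumes the counters have been copied by hand into a dictionary keyed by metric name; the function name `derived_ratios`, the ratio names, and the sample values are hypothetical and are not part of the product.

```python
def derived_ratios(metrics):
    """Compute illustrative ratios from raw counters.

    Returns None for a ratio whose denominator is zero.
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    launched = metrics["Total Launched Tasks"]
    return {
        # Share of launched tasks that ran on a node holding their input data.
        "data_local_fraction": ratio(metrics["Data Local Tasks"], launched),
        # Share of launched tasks that ran in the same rack as their input data.
        "rack_local_fraction": ratio(metrics["Rack Local Tasks"], launched),
        # Share of launched tasks that completed successfully.
        "success_fraction": ratio(metrics["# of Succeeded Tasks"], launched),
    }


# Hypothetical counter values, for illustration only.
sample = {
    "Total Launched Tasks": 40,
    "Data Local Tasks": 30,
    "Rack Local Tasks": 8,
    "# of Succeeded Tasks": 38,
}

for name, value in derived_ratios(sample).items():
    print(f"{name}: {value}")
```

A low data-local fraction can indicate that tasks are reading their input over the network, which may be worth checking when a query is slower than expected.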
Query DAG and Plan
The panel displays the distribution of query logic in the form of a DAG and a physical execution plan.
The Directed Acyclic Graph (DAG) is an execution graph that displays a flow diagram of the compiled Hive SQL queries. It is a work-scheduling graph made up of a finite set of vertices connected by edges. The direction of the edges specifies the order in which the jobs in the DAG are executed. The graph is acyclic because it contains no loops or cycles.
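As a minimal sketch of how edge direction determines execution order, the following example uses Python's standard `graphlib` module (Python 3.9 or later) to compute one valid order for a small graph. The vertex names are hypothetical and are not taken from any real query; the sketch illustrates the general idea only, not how Tez schedules work.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical vertices. Each key maps to the set of vertices it depends on,
# that is, the vertices whose edges point into it.
dag = {
    "Map 1": set(),
    "Map 2": set(),
    "Reducer 3": {"Map 1", "Map 2"},
    "Reducer 4": {"Reducer 3"},
}


def execution_order(graph):
    """Return one order in which every vertex follows all of its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dag)
print(order)

# A graph that contains a cycle has no valid execution order.
try:
    execution_order({"A": {"B"}, "B": {"A"}})
except CycleError:
    print("cycle detected: not a DAG")
```

Several orders can be valid for the same graph; here the two map vertices may appear in either order, but both always precede `Reducer 3`.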
The Plan is a logical representation of how the query is executed, in which the query is broken into different logical plans.