Acceldata for Spark Management offers advanced exploratory features that let you examine various aspects of Spark query execution.
The following Spark versions are supported:
- Spark 1.x
- Spark 2.x
Spark Query Dashboard
Top K Users & Tables
This dashboard shows the distribution of jobs by various criteria, such as:
- GC time
- Driver time
- Wait time
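As a sketch of the kind of aggregation behind a "Top K" view, the function below totals a chosen metric per user and returns the heaviest users. The input shape (a list of per-application dicts with a `user` key and metric fields such as `gc_time_ms`) is an assumption for illustration, not Acceldata's actual data model.

```python
from collections import defaultdict
from heapq import nlargest

def top_k_users(apps, metric, k=10):
    """Sum `metric` per user across applications and return the top k users.

    `apps` is assumed to be a list of dicts like
    {"user": "alice", "gc_time_ms": 1200} -- a hypothetical shape.
    """
    totals = defaultdict(int)
    for app in apps:
        totals[app["user"]] += app.get(metric, 0)
    # Largest totals first: list of (user, total) pairs.
    return nlargest(k, totals.items(), key=lambda kv: kv[1])
```

The same pattern applies to any of the criteria above: swap in `driver_time_ms` or `wait_time_ms` as the metric name.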
Resource Utilization by Spark
The resources utilized by Spark during the selected calendar interval let you examine VCore and memory usage and visually identify where to focus debugging. Resource usage can be filtered by queue.
Each of these UI elements is clickable and takes you to the filtered list of queries.
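Per-queue VCore and memory usage of this kind is exposed by the YARN ResourceManager's `/ws/v1/cluster/scheduler` endpoint. The sketch below extracts per-queue usage from such a response; it assumes the CapacityScheduler response layout and is illustrative, not a description of how Acceldata itself collects the data.

```python
def queue_usage(scheduler_json):
    """Map each queue name to its resourcesUsed block (memory in MB, vCores).

    Assumes the CapacityScheduler layout of the ResourceManager's
    /ws/v1/cluster/scheduler response.
    """
    queues = scheduler_json["scheduler"]["schedulerInfo"]["queues"]["queue"]
    return {q["queueName"]: q["resourcesUsed"] for q in queues}
```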
Track and Debug Spark Queries
Queries can be identified by their YARN application ID. In addition, a comprehensive search is available that supports other parameters such as user, application type, and time of execution.
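These same filters map onto the YARN ResourceManager's `/ws/v1/cluster/apps` endpoint, which accepts `user`, `applicationTypes`, and `startedTimeBegin` query parameters. The helper below builds such a URL; the host and port are placeholders, and this illustrates the underlying API rather than Acceldata's own search implementation.

```python
from urllib.parse import urlencode

def build_apps_query(rm_host, user=None, app_type=None, started_after_ms=None):
    """Build a YARN ResourceManager application-search URL.

    Uses filters supported by /ws/v1/cluster/apps; port 8088 is the
    default ResourceManager web port and may differ on your cluster.
    """
    params = {}
    if user:
        params["user"] = user
    if app_type:
        params["applicationTypes"] = app_type
    if started_after_ms is not None:
        params["startedTimeBegin"] = started_after_ms
    return f"http://{rm_host}:8088/ws/v1/cluster/apps?{urlencode(params)}"
```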
Core Utilization Analysis
Core utilization reports task-locality levels such as:
- Process Local
- Node Local
- Rack Local
It also comments on how the cores are utilized over the course of execution.
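A locality breakdown like this can be computed from the Spark history server's REST API, whose per-task records carry a `taskLocality` field (`PROCESS_LOCAL`, `NODE_LOCAL`, `RACK_LOCAL`, and so on). The sketch below computes the fraction of tasks at each level from such records; it is a minimal illustration, not Acceldata's actual pipeline.

```python
from collections import Counter

def locality_breakdown(tasks):
    """Fraction of tasks at each locality level.

    `tasks` is assumed to be a list of task dicts with a "taskLocality"
    key, as returned by the Spark history server's taskList endpoint.
    """
    counts = Counter(t["taskLocality"] for t in tasks)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}
```

A high share of `RACK_LOCAL` tasks is a common signal that data placement or `spark.locality.wait` deserves a look.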
The split between driver and executor time indicates how well the Spark program has been written and whether the right degree of parallelism has been achieved.
From this you can determine the ideal number of executors for the program, and the effect that changing the executor count would have on overall time and utilization.
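The reasoning behind such an estimate can be sketched with a crude Amdahl-style model: driver time is treated as serial, and executor work as perfectly parallel across all cores. This is a simplifying assumption for illustration (real jobs have skew, shuffle barriers, and scheduling overhead), not Acceldata's model.

```python
def estimated_runtime(driver_time_s, executor_core_seconds, num_executors,
                      cores_per_executor=4):
    """Lower-bound runtime estimate for a given executor count.

    Assumes driver work is serial and executor work (total core-seconds)
    spreads perfectly over num_executors * cores_per_executor cores.
    """
    slots = num_executors * cores_per_executor
    return driver_time_s + executor_core_seconds / slots
```

Comparing `estimated_runtime(...)` at several executor counts shows the diminishing returns of adding executors once driver time dominates.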
Query Execution Stats
The following section shows which lines of code consume how much time, resources, I/O, and other metrics.
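One way such a per-line attribution can be derived is from the Spark history server's stage records, whose `name` field embeds the call site (for example `count at ETL.scala:42`). The sketch below sums `executorRunTime` per call site; it is a simplified illustration under that assumption, not the product's implementation.

```python
from collections import defaultdict

def time_by_call_site(stages):
    """Total executorRunTime (ms) per call site.

    `stages` is assumed to be a list of stage dicts from the Spark
    history server's /stages endpoint, where "name" embeds the call
    site, e.g. "count at ETL.scala:42".
    """
    totals = defaultdict(int)
    for s in stages:
        totals[s["name"]] += s.get("executorRunTime", 0)
    return dict(totals)
```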
Note that for the above data to be collected seamlessly, Spark hooks must be deployed on all the HiveServers in the cluster.