YARN Capacity

The YARN Capacity dashboard shows the memory used by jobs of each application type in a selected queue. To open the YARN Capacity dashboard, click YARN -> Dashboard in the left pane.

Note: The default time range is Last 24 hrs and the default time zone is CST. To view statistics for a custom date range, click the Timeframe icon and select a time frame and time zone.

Summary panel

The top panels of the YARN dashboard display the following application states and statuses.

State         Description
New           The application that is just created.
New_Saving    The application that is being saved.
Submitted     The application that is just submitted.
Accepted      The application accepted to run by the YARN scheduler.
Running       The application that is currently running.
Finished      The application that completed execution.
Failed        The application that failed.
Killed        The application that you terminated.

Status        Description
Succeeded     The number of jobs that completed successfully.
Failed        The number of jobs that failed.
Killed        The number of jobs that were forcefully or unexpectedly terminated.
Undefined     The number of jobs that have a state other than Finished.

The top right panel of the YARN dashboard displays the number of jobs running in the queue for each of the following application types.

Note: To view details of specific job types, click the number under these metrics.

Application Type    Description
YARN                The number of YARN applications submitted in the queue.
Tez                 The number of Tez applications submitted in the queue.
Spark               The number of Spark applications submitted in the queue.
MapReduce           The number of jobs running in MapReduce application in the queue.
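
The counts in the summary panels can be thought of as simple tallies over application records. The following Python sketch is illustrative only: the sample records are made up, and the field names (state, finalStatus, applicationType) are borrowed from the YARN ResourceManager REST API as an assumption about how such data might be shaped. It does not describe how the dashboard itself is implemented.

```python
from collections import Counter

# Hypothetical application records; all values are made up for illustration.
# Field names are an assumption modeled on the YARN ResourceManager REST API.
apps = [
    {"id": "app_1", "state": "FINISHED", "finalStatus": "SUCCEEDED", "applicationType": "SPARK"},
    {"id": "app_2", "state": "FINISHED", "finalStatus": "FAILED", "applicationType": "TEZ"},
    {"id": "app_3", "state": "RUNNING", "finalStatus": "UNDEFINED", "applicationType": "SPARK"},
    {"id": "app_4", "state": "KILLED", "finalStatus": "KILLED", "applicationType": "MAPREDUCE"},
    {"id": "app_5", "state": "ACCEPTED", "finalStatus": "UNDEFINED", "applicationType": "TEZ"},
]


def summarize(records):
    """Tally applications by state, final status, and application type."""
    return {
        "state": Counter(r["state"] for r in records),
        "status": Counter(r["finalStatus"] for r in records),
        "type": Counter(r["applicationType"] for r in records),
    }


summary = summarize(apps)
for panel, counts in summary.items():
    print(panel, dict(counts))
```

In this sample, the two applications that have not finished (one Running, one Accepted) are the ones counted under the Undefined status.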

Queues

In the Queues tab, you can see the root queue, the default queue, and any custom queues defined by the cluster administrator.

  • root: A predefined queue of YARN. This queue is a parent of the queues available in your cluster. This queue uses 100% of resources.
  • default: A designated queue defined by the administrator. This queue contains jobs that do not have a queue allocated.

Note: To view memory capacity allocated to or used by resources on a queue, click the queue in the Queues tab. You can then view the statistics on the YARN Capacity dashboard.

Node Labels

Node Labels displays the label name of each node partition and the metrics of the nodes that share similar characteristics. By default, nodes belong to the Default label. For every node label, you can view the queue, vcore, and memory metrics.

To see detailed statistics for a metric, click the number under that metric.

Queue Metric Distributions

The Queue Metric Distributions Sankey chart displays the flow of data values from Queues to Application Type to Jobs.

Reading a Sankey Chart

The following screenshot is an example of a Queue Metric Distributions Sankey chart displayed by Duration.

[Screenshot: Queue Metric Distributions Sankey chart by Duration]

You can gather the following information from the chart.

Note: To see the distribution in numbers, hover over the Sankey chart.

You can observe the following in Queues.

  • 62.95% of the time is spent by applications running on the default queue.
  • 36.58% of the time is spent by applications running on the spark_jobs_q queue.
  • The remaining 0.48% of the time is spent by applications running on the llap queue.

The applications in these queues are one of the following types.

  • Spark
  • MapReduce
  • Tez
  • Slider

You can observe the following based on the application types.

  • 92.4% of jobs are run by Spark applications.
  • 3.33% of jobs are run by MapReduce applications.
  • 4.04% of jobs are run by Tez applications.
  • 0.24% of jobs are run by Slider applications.

Out of the jobs submitted, you can observe the following.

  • 99.05% of jobs were completed within an average time of 6.00 milliseconds to 1.85 hours.
  • The remaining jobs were completed within an average of 1.65 days to 3.13 days.
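
The percentages read from the chart are shares of a total, grouped first by queue and then by application type. As a rough sketch of that arithmetic, and not the dashboard's actual implementation, the following Python snippet computes each group's share of total duration. The job records and the share_by helper are hypothetical, and the numbers are made up rather than taken from the screenshot.

```python
from collections import defaultdict

# Hypothetical (queue, application type, duration in seconds) records.
# Values are made up for illustration and do not match the screenshot.
jobs = [
    ("default", "Spark", 600.0),
    ("default", "Tez", 150.0),
    ("spark_jobs_q", "Spark", 200.0),
    ("llap", "Slider", 50.0),
]


def share_by(records, key_index):
    """Return each group's percentage of the total duration, rounded to 2 places."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[2]
    grand_total = sum(totals.values())
    return {key: round(100.0 * value / grand_total, 2) for key, value in totals.items()}


queue_share = share_by(jobs, 0)  # group by queue
type_share = share_by(jobs, 1)   # group by application type
print(queue_share)
print(type_share)
```

Because each share is rounded independently, the displayed percentages may add up to slightly more or less than 100%.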

Viewing Sankey chart by distribution

You can view the Sankey chart by the following distributions.

  • Duration
  • Duration Variance
  • VCore Seconds
  • Memory Seconds
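
This page does not define VCore Seconds and Memory Seconds. Assuming they follow the usual YARN meaning (the resources allocated to an application multiplied by the time they were held, with memory measured in MB), the following Python sketch shows how such totals could be computed. The container records and the resource_seconds helper are hypothetical.

```python
# Hypothetical container allocations for one application:
# (memory in MB, vcores, seconds the container was held).
containers = [
    (2048, 1, 120),
    (4096, 2, 300),
    (1024, 1, 60),
]


def resource_seconds(allocations):
    """Return (memory_seconds, vcore_seconds) summed over all containers."""
    memory_seconds = sum(mem_mb * secs for mem_mb, _, secs in allocations)
    vcore_seconds = sum(vcores * secs for _, vcores, secs in allocations)
    return memory_seconds, vcore_seconds


memory_seconds, vcore_seconds = resource_seconds(containers)
print(memory_seconds, vcore_seconds)
```

Under this assumption, a job that holds many large containers for a short time can have the same Memory Seconds as a job that holds one small container for a long time.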

Other Capacity Charts

The following table gives you an overview of the other metrics charts in Capacity.

Note: The default time range is Last 24 hrs. To view statistics from a custom date range, click the Timeframe icon and select a time frame and timezone of your choice.

Chart Name               Description
Queue Distribution       Distribution of queue capacity, in percentage, as configured in the cluster. You can see the capacity of root, default, and other custom defined queues.
Queue Usage              Percentage of resources used in a queue in a timeframe. Queue usage lets you identify when resource consumption in a queue peaks.
Memory Usage             Amount of memory used within the queue in the cluster in a particular timeframe.
VCore Usage              Number of virtual cores used within the queue in the cluster.
VCore Usage Per User     Number of virtual cores used by an individual user.
Memory Usage Per User    Amount of memory used per user.
Memory Utilization       Amount of memory utilized within the queue. You can monitor the following memory statistics.
                           • Peak mem utilization: The maximum amount of memory used by a queue in a timeframe. The peak memory lets you know when a queue is highly utilized.
                           • Avg mem utilization: The average amount of memory used by a queue in a timeframe.
                           • Max mem limit: The maximum amount of memory configured for the selected queue.
                           • Mem limit: The maximum amount of memory defined by the administrator for a queue.
No. of Containers        The number of containers allocated to run a single or multiple applications in a queue in a timeframe. The following metrics are used in monitoring the containers.
                           • Max containers allocated: The maximum number of containers allocated to a queue.
                           • Avg containers allocated: The average number of containers allocated to a queue.
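
The peak and average statistics above are aggregates over samples collected in the selected timeframe. The following Python sketch shows one way such values could be derived. The samples, the MEM_LIMIT_GB value, and the queue_stats helper are all hypothetical and do not reflect how the dashboard collects or stores its data.

```python
from statistics import mean

# Hypothetical samples for one queue:
# (timestamp, memory used in GB, containers allocated).
samples = [
    ("10:00", 12.0, 8),
    ("10:05", 30.0, 20),
    ("10:10", 18.0, 12),
    ("10:15", 24.0, 16),
]
MEM_LIMIT_GB = 32.0  # assumed queue memory limit, for illustration only


def queue_stats(points, mem_limit_gb):
    """Compute peak/average memory and max/average containers for a timeframe."""
    # The sample with the highest memory use gives the peak and when it occurred.
    peak_time, peak_mem, _ = max(points, key=lambda p: p[1])
    return {
        "peak_mem_gb": peak_mem,
        "peak_time": peak_time,
        "avg_mem_gb": mean(p[1] for p in points),
        "peak_pct_of_limit": round(100.0 * peak_mem / mem_limit_gb, 2),
        "max_containers": max(p[2] for p in points),
        "avg_containers": mean(p[2] for p in points),
    }


stats = queue_stats(samples, MEM_LIMIT_GB)
for name, value in stats.items():
    print(name, value)
```

Comparing the peak against the limit, as peak_pct_of_limit does here, is one way to judge how close a queue comes to its configured memory ceiling.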