Torch System Requirements | On Premise

Infrastructure

Supported Operating Systems

The following operating systems are supported for on-prem installation of Torch:

  • Ubuntu 16.04 (Kernel version >= 4.15)
  • Ubuntu 18.04 (Recommended)
  • Ubuntu 20.04 (Docker version >= 19.03.10)
  • CentOS 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8.1, 8.2, 8.3 (CentOS 8.x requires Containerd)
  • RHEL 7.4, 7.5, 7.6, 7.7, 7.8, 7.9, 8.1, 8.2, 8.3 (RHEL 8.x requires Containerd)
  • Amazon Linux 2
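As an illustration, the list above can be encoded as a small lookup for a pre-install check. This is only a sketch: the function names are hypothetical, and the distribution IDs (`ubuntu`, `centos`, `rhel`, `amzn`) assume the values the `ID` field in `/etc/os-release` typically carries. How the version is detected is left to the caller, since CentOS usually reports only its major version there. The Docker and Containerd conditions in the list are not checked.

```python
import re

# Releases copied from the list above; CentOS and RHEL share the same minor versions.
EL_MINORS = {"7.4", "7.5", "7.6", "7.7", "7.8", "7.9", "8.1", "8.2", "8.3"}
SUPPORTED = {
    "ubuntu": {"16.04", "18.04", "20.04"},
    "centos": EL_MINORS,
    "rhel": EL_MINORS,
    "amzn": {"2"},
}


def kernel_tuple(release):
    """Turn a kernel release such as '4.15.0-112-generic' into (4, 15)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(distro_id, version, kernel_release=None):
    """Return True if the distribution and version appear in the supported list."""
    if version not in SUPPORTED.get(distro_id, set()):
        return False
    # Ubuntu 16.04 is listed with a minimum kernel version of 4.15.
    if (distro_id, version) == ("ubuntu", "16.04"):
        return kernel_release is not None and kernel_tuple(kernel_release) >= (4, 15)
    return True
```

On a target machine, the kernel release could be taken from `platform.release()` and passed in as the third argument.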

Minimum System Requirements

The recommended hardware configuration depends on the following factors:

  • Amount of data to be processed
  • Type of Spark deployment used

Refer to https://docs.acceldata.dev/docs/torch/overview/installation#minimum-hardware-recommendation for more information.

Other requirements:

  • 200 GB of Disk Space per machine
  • TCP ports 2379, 2380, 6443, 6783, 10250, 10251 and 10252 open between cluster nodes
  • UDP ports 6783 and 6784 open between cluster nodes
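The disk requirement can be checked before installation with a few lines of Python. This sketch makes two assumptions that the text above does not settle: "200 GB" is read as decimal gigabytes, and the requirement is compared against the total size of the filesystem rather than its free space. The function name and default path are illustrative.

```python
import shutil

# 200 GB read as decimal gigabytes (assumption; the requirement does not say GB vs GiB).
REQUIRED_BYTES = 200 * 10**9


def has_required_disk(path="/", required_bytes=REQUIRED_BYTES):
    """Return True if the filesystem holding `path` is at least `required_bytes` in size."""
    return shutil.disk_usage(path).total >= required_bytes
```

To check free space instead, compare against `shutil.disk_usage(path).free`.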

Networking Requirements

The following ports should be open between the Kubernetes cluster nodes:

  • TCP ports 2379, 2380, 6443, 6783, 10250, 10251 and 10252
  • UDP ports 6783 and 6784
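A quick way to spot-check the TCP ports from one node to another is to attempt a connection on each. The sketch below uses only the standard `socket` module; the function names are illustrative. A connection succeeds only when a service is already listening on the remote port, so a "closed" result before the cluster is running does not by itself prove the firewall is blocking it. UDP ports cannot be verified this way and are not covered.

```python
import socket

# TCP ports listed above for traffic between Kubernetes cluster nodes.
CLUSTER_TCP_PORTS = (2379, 2380, 6443, 6783, 10250, 10251, 10252)


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def closed_tcp_ports(host, ports=CLUSTER_TCP_PORTS, timeout=2.0):
    """Return the subset of `ports` on `host` that could not be reached."""
    return [port for port in ports if not tcp_port_open(host, port, timeout)]
```

Calling `closed_tcp_ports` with the address of a peer node returns the ports that still need attention; an empty list means every listed TCP port accepted a connection.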

Firewall Openings for Online Installations

The following domains need to be accessible from servers performing online kURL installs. IP addresses for these services can be found in replicatedhq/ips.

Host            Description
amazonaws.com   tar.gz packages are downloaded from Amazon S3 during embedded cluster installations. The IP ranges to allowlist for accessing these can be scraped dynamically from the AWS IP Address Ranges documentation.
k8s.kurl.sh     Kubernetes cluster installation scripts and artifacts, including Bash scripts and binary executables, are served from kurl.sh. This domain is owned by Replicated, Inc., which is headquartered in Los Angeles, CA.
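AWS publishes its address ranges as a JSON file, so the S3 ranges to allowlist can be extracted programmatically. The sketch below assumes the published file at `https://ip-ranges.amazonaws.com/ip-ranges.json` and its `prefixes` entries with `ip_prefix`, `region`, and `service` fields. The function names and the optional region filter are illustrative, and only IPv4 prefixes are handled.

```python
import json
from urllib.request import urlopen

# Location of the AWS-published address ranges file.
AWS_IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"


def s3_prefixes(ip_ranges, region=None):
    """Return the sorted, de-duplicated IPv4 CIDR blocks tagged with service 'S3'."""
    return sorted({
        entry["ip_prefix"]
        for entry in ip_ranges.get("prefixes", [])
        if entry.get("service") == "S3"
        and (region is None or entry.get("region") == region)
    })


def fetch_ip_ranges(url=AWS_IP_RANGES_URL, timeout=10):
    """Download and parse the address ranges file (requires outbound access)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Because the ranges change over time, `s3_prefixes(fetch_ip_ranges())` would need to be re-run periodically and the firewall allowlist refreshed from its output.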

No outbound internet access is required for airgapped installations. An airgapped installation is one in which the Kubernetes nodes have no access to the internet.

Host Firewall Rules

The kURL install script prompts you to disable firewalld. Firewall rules can affect communication between containers on the same machine, so it is recommended to disable these rules entirely for Kubernetes. Firewall rules can be added after an install or preserved during it, but because installation parameters such as pod and service CIDRs vary with local networking conditions, no general guidance is available on default requirements.

The following ports must be open between nodes in multi-node clusters. They are required by Kubernetes and Weave Net.

Primary Nodes:

Protocol  Direction  Port Range  Purpose                 Used By
TCP       Inbound    6443        Kubernetes API server   All
TCP       Inbound    2379-2380   etcd server client API  Primary
TCP       Inbound    10250       kubelet API             Primary
TCP       Inbound    6783        Weave Net control       All
UDP       Inbound    6783-6784   Weave Net data          All

Secondary Nodes:

Protocol  Direction  Port Range  Purpose            Used By
TCP       Inbound    10250       kubelet API        Primary
TCP       Inbound    6783        Weave Net control  All
UDP       Inbound    6783-6784   Weave Net data     All
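If firewalld is kept rather than disabled, the inter-node ports in the tables above could be turned into rules mechanically. The sketch below only builds `firewall-cmd` command strings and does not execute them. The function name and data layout are illustrative, and these rules cover only the listed node-to-node ports: pod and service CIDRs still need handling specific to the environment.

```python
# (protocol, port range, purpose) rows transcribed from the tables above.
PRIMARY_RULES = [
    ("tcp", "6443", "Kubernetes API server"),
    ("tcp", "2379-2380", "etcd server client API"),
    ("tcp", "10250", "kubelet API"),
    ("tcp", "6783", "Weave Net control"),
    ("udp", "6783-6784", "Weave Net data"),
]
SECONDARY_RULES = [
    ("tcp", "10250", "kubelet API"),
    ("tcp", "6783", "Weave Net control"),
    ("udp", "6783-6784", "Weave Net data"),
]


def firewalld_commands(role):
    """Build firewall-cmd invocations for a 'primary' or 'secondary' node."""
    rules = {"primary": PRIMARY_RULES, "secondary": SECONDARY_RULES}[role]
    commands = [
        f"firewall-cmd --permanent --add-port={ports}/{protocol}"
        for protocol, ports, _purpose in rules
    ]
    # Permanent rules take effect only after firewalld reloads its configuration.
    commands.append("firewall-cmd --reload")
    return commands
```

Printing the result of `firewalld_commands("primary")` gives a reviewable list that an administrator can inspect before running anything as root.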

Application

Torch

  • Torch UI access requires ports 80 and 443 to be reachable from outside, either directly or through a VPN.
  • The Torch admin console port 8800 should be accessible from internal locations.

Torch with CDH/HDP

  • Access to the name nodes and data nodes is required from the Kubernetes nodes.
  • The Livy port (default: 8998) should be open for egress traffic from the Kubernetes nodes.
  • For each data source, open the appropriate ports based on the data source type and its deployment (on-premise or cloud).

Torch with Databricks

  • Access to AWS S3 is required from the Kubernetes nodes.
  • For each data source, open the appropriate ports based on the data source type and its deployment (on-premise or cloud).

Torch with Spark on Kubernetes

  • Access to AWS S3 is required from the Kubernetes nodes.
  • For each data source, open the appropriate ports based on the data source type and its deployment (on-premise or cloud).