- Adding nodes: If you outgrow your initial Workbench cluster installation, you can add nodes to your cluster, including GPU nodes. Once added, make these nodes available to platform users by configuring resource profiles.
- Monitoring and managing resources: To help you manage your organization’s cluster resources more efficiently, Workbench lets you monitor which sessions and deployments are running on specific nodes or were started by specific users. You can also monitor cluster resource usage, including CPU, memory, disk space, network, and GPU utilization.
- Troubleshooting and debugging: To help you gain insight into user services and troubleshoot issues, Workbench provides detailed logs and debugging information for the Kubernetes services it uses, as well as a record of all user activity. See fault tolerance in Workbench for information about what to do if a master node fails.
