gpu-workload/triton/gpu.yaml (18 lines of code) (raw):
apiVersion: v1
kind: Pod
metadata:
  name: my-gpu-pod
spec:
  nodeSelector:
    cloud.google.com/gke-accelerator: nvidia-tesla-t4
    # cloud.google.com/gke-accelerator: nvidia-tesla-a100
  containers:
  - name: my-gpu-container
    image: nvidia/cuda:11.0.3-runtime-ubuntu20.04
    command: ["/bin/bash", "-c", "--"]
    args: ["while true; do sleep 600; done;"]
    resources:
      limits:
        nvidia.com/gpu: 1
      requests:
        cpu: "18"
        memory: "18Gi"