Provide clear documentation on how users can inspect and, where necessary, customize the resource requests and limits and the key configuration flags of their cluster's API server. This transparency matters to advanced users running demanding workloads, who need to know how much headroom the control plane has.
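As an illustration of the kind of inspection such documentation might describe, the following is a minimal sketch that pulls the resource settings and command-line flags out of an API server pod spec. It assumes a kubeadm-style cluster where the API server runs as a pod with a container named `kube-apiserver`; managed platforms may expose this information differently or not at all. The `APISERVER_POD` dictionary is hypothetical sample data, and `parse_flags` and `inspect_apiserver` are illustrative helper names, not part of any Kubernetes client library.

```python
from typing import Any, Dict, List

# Hypothetical, trimmed-down pod spec for illustration only. On a real
# cluster this would come from the API server's pod manifest or the API.
APISERVER_POD: Dict[str, Any] = {
    "spec": {
        "containers": [
            {
                "name": "kube-apiserver",
                "command": [
                    "kube-apiserver",
                    "--max-requests-inflight=400",
                    "--max-mutating-requests-inflight=200",
                    "--enable-admission-plugins=NodeRestriction",
                ],
                "resources": {"requests": {"cpu": "250m"}},
            }
        ]
    }
}


def parse_flags(command: List[str]) -> Dict[str, str]:
    """Turn ['bin', '--a=1', '--b'] into {'a': '1', 'b': 'true'}."""
    flags: Dict[str, str] = {}
    for arg in command:
        if not arg.startswith("--"):
            continue  # skip the binary name and positional arguments
        name, sep, value = arg[2:].partition("=")
        # A bare flag with no '=value' is treated as a boolean set to true.
        flags[name] = value if sep else "true"
    return flags


def inspect_apiserver(
    pod: Dict[str, Any], container_name: str = "kube-apiserver"
) -> Dict[str, Any]:
    """Return the resources block and parsed flags of the named container."""
    for container in pod["spec"]["containers"]:
        if container["name"] == container_name:
            return {
                "resources": container.get("resources", {}),
                "flags": parse_flags(container.get("command", [])),
            }
    raise LookupError(f"no container named {container_name!r} in pod spec")


if __name__ == "__main__":
    info = inspect_apiserver(APISERVER_POD)
    print("resources:", info["resources"])
    for name, value in sorted(info["flags"].items()):
        print(f"--{name}={value}")
```

A container with no `resources` block yields an empty dictionary, which is itself useful to surface: it means no requests or limits are set for the API server.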
Information about robust CRD and Controller Operation:
Ensure that complex CRDs and their associated controllers, which are common in AI/ML operators (e.g. Ray, Kubeflow), install and operate reliably. Beyond standard Kubernetes resource quotas, the platform must not impose non-standard limitations that cause failures. Examples of such prohibited limitations are aggressive API server rate-limiting that throttles normal operator reconciliation, and control plane resource constraints that lead to unreliable webhook execution or slow controller reconciliation under load.
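To make the rate-limiting concern concrete, the sketch below estimates how long a burst of reconciliation requests is held back by a rate limit. It models the limit as a generic token bucket that starts full, which is a simplifying assumption; real API server flow control is more involved. The function name `throttled_delay_seconds` and the request counts and limits used are hypothetical and do not describe any particular platform.

```python
def throttled_delay_seconds(requests: int, qps: float, burst: int) -> float:
    """Time until the last of `requests` calls is admitted by a token bucket.

    The bucket is assumed to start full with `burst` tokens and to refill
    at `qps` tokens per second. The first `burst` calls pass immediately;
    each later call waits for one refill.
    """
    if qps <= 0:
        raise ValueError("qps must be positive")
    if requests < 0 or burst < 0:
        raise ValueError("requests and burst must be non-negative")
    queued = max(0, requests - burst)
    return queued / qps


if __name__ == "__main__":
    # Hypothetical operator that needs 500 API calls to reconcile its objects.
    print(throttled_delay_seconds(500, qps=5, burst=10))    # 98.0
    print(throttled_delay_seconds(500, qps=50, burst=100))  # 8.0
```

Under these illustrative numbers, a tenfold difference in the limit separates a reconciliation pass that finishes in seconds from one that takes over a minute and a half, which is the kind of slowdown the requirement is meant to rule out.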