Add resource-specific health checks for standard Kubernetes resources#160
Conversation
Implement health check system for standard K8s resources that complements the existing Ready condition checking. Standard Kubernetes resources now use resource-specific health logic before falling back to Ready conditions. Supported resources: - Deployment: checks replica counts and Available condition - StatefulSet: checks replicas and revision matching - DaemonSet: checks pod scheduling and availability - Service: checks LoadBalancer ingress assignment - Ingress: checks load balancer assignment - Secret, ConfigMap, ServiceAccount: ready if exists Uses registry pattern for extensibility. All other resources (Crossplane managed resources, custom resources) continue using Ready condition checks. Signed-off-by: Tobias Kässer <tobias.kasser@upbound.io>
58cd1f2 to
50bf50e
Compare
stevendborrelli
left a comment
There was a problem hiding this comment.
This is a great addition that addresses a pain point for composition authors.
Can we update the Readme with a running status of what is supported? I wonder if there becomes a point where this starts to bloat the base auto-ready function with features.
The list of resources in a default Kubernetes install:
# Core (core/v1)
- [ ] Pod
- [ ] Service
- [ ] Namespace
- [ ] Node
- [ ] ConfigMap
- [ ] Secret
- [ ] ServiceAccount
- [ ] Endpoints
- [ ] EndpointSlice
- [ ] PersistentVolume
- [ ] PersistentVolumeClaim
- [ ] ReplicationController
- [ ] ResourceQuota
- [ ] LimitRange
- [ ] Event
# Apps (apps/v1)
- [ ] Deployment
- [ ] StatefulSet
- [ ] DaemonSet
- [ ] ReplicaSet
# Batch (batch/v1)
- [ ] Job
- [ ] CronJob
# Autoscaling (autoscaling/v2)
- [ ] HorizontalPodAutoscaler
# Networking (networking.k8s.io/v1)
- [ ] Ingress
- [ ] IngressClass
- [ ] NetworkPolicy
# RBAC (rbac.authorization.k8s.io/v1)
- [ ] Role
- [ ] ClusterRole
- [ ] RoleBinding
- [ ] ClusterRoleBinding
# Authentication (authentication.k8s.io/v1)
- [ ] TokenReview
# Authorization (authorization.k8s.io/v1)
- [ ] SubjectAccessReview
- [ ] LocalSubjectAccessReview
- [ ] SelfSubjectAccessReview
- [ ] SelfSubjectRulesReview
# Storage (storage.k8s.io/v1)
- [ ] StorageClass
- [ ] VolumeAttachment
- [ ] CSIDriver
- [ ] CSINode
# Policy (policy/v1)
- [ ] PodDisruptionBudget
# Coordination (coordination.k8s.io/v1)
- [ ] Lease
# Admission Registration (admissionregistration.k8s.io/v1)
- [ ] MutatingWebhookConfiguration
- [ ] ValidatingWebhookConfiguration
# Scheduling (scheduling.k8s.io/v1)
- [ ] PriorityClass
# API Extensions (apiextensions.k8s.io/v1)
- [ ] CustomResourceDefinition (CRD)
# Discovery (discovery.k8s.io/v1)
- [ ] EndpointSlice (also listed in core/v1 for compatibility)| // in the observed state, it's considered ready. | ||
| func checkConfigMapHealth(obj *unstructured.Unstructured) bool { | ||
| // ConfigMaps are always ready if they exist | ||
| return true |
There was a problem hiding this comment.
A very minor nit but maybe find common health checks that are shared?
func alwaysReady(*unstructured.Unstructured) bool {
return true
}There was a problem hiding this comment.
Great idea, implemented
| @@ -0,0 +1,69 @@ | |||
| apiVersion: apiextensions.crossplane.io/v1 | |||
There was a problem hiding this comment.
Can this example be flushed out with a README, render command, XR/XRD so that it can be run locally and in-cluster?
|
|
||
| func init() { | ||
| // Register all standard Kubernetes resource health checks | ||
| registerDeploymentHealthCheck() |
There was a problem hiding this comment.
Can you order these alphabetically to make it easier to make sure that all checks are registered?
There was a problem hiding this comment.
I think registerServiceAccountSetHealthCheck() is missing. We may need some integration tests to ensure all types are being checked.
f35d588 to
806bd65
Compare
|
@stevendborrelli Thank you so much for the great feedback. I think i incorporated it all and added some more healthchecks based on the logic which is used in argocd |
Signed-off-by: Tobias Kässer <tobias.kasser@upbound.io>
806bd65 to
39fadec
Compare
| ## Files | ||
|
|
||
| - **composition-k8s.yaml**: A composition that creates three Kubernetes resources (Service, Deployment, ConfigMap) and uses function-auto-ready to detect their readiness | ||
| - **xr.yaml**: A simple composite resource (XR) instance to render the composition |
There was a problem hiding this comment.
I think the xr.yaml and functions.yaml are missing.
stevendborrelli
left a comment
There was a problem hiding this comment.
Looks good.
I think overall this PR will change the nature of function-auto-ready, which up to now has been simple like the Unix cat command. Aside from filling in more types to support, one could see future requests to support things like the custom checks in P&T.
It's a question for the maintainers.
| RegisterHealthCheck(gvk, checkDaemonSetHealth) | ||
| } | ||
|
|
||
| // checkDaemonSetHealth implements ArgoCD-style health check for DaemonSets |
There was a problem hiding this comment.
nit: if we using the same implementation for ready than ArgoCD - can we add the source here ? make it easier in the future to extend with more resources
haarchri
left a comment
There was a problem hiding this comment.
thanks - this would eliminate a lot of boilerplate in compositions
Description of your changes
Implement health check system for standard K8s resources that complements the existing Ready condition checking. Standard Kubernetes resources now use resource-specific health logic before falling back to Ready conditions.
Supported resources:
Uses registry pattern for extensibility. All other resources (Crossplane managed resources, custom resources) continue using Ready condition checks.
I have:
What i did for additional testing is deployed https://github.com/upbound/configuration-app-model with following change:
Then i deployed the example:
kubectl apply -f examples/app/example.yamland observed everything to be healthy:
I additionally rendered the example i added with the function in dev mode and saw the XR coming up as ready there as well.
To confirm everything works with only standard crossplane resource i also ran e2e tests for https://github.com/upbound/configuration-aws-network by switching out the function with my custom one:
and observed e2e tests going green:
Removed some of the logs due to limitations and brevity