A note for the community
- Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
- If you are interested in working on this issue or have submitted a pull request, please leave a comment
Problem
Hey there,
After upgrading the Datadog Agent from 7.39.2 to 7.45.0, we observed that some metrics that use the device tag stopped showing up. On further investigation, we found that the device tag had been renamed to resource.device after the upgrade. This left many dashboards empty and set off monitors in Datadog, so we had to revert the upgrade. We then looked into the source code of the Datadog Agent and Vector to find the root cause.
Here's what we think is causing this:
Starting with the 7.43.2 release of the Datadog Agent, the device tag is sent as part of the resources array: DataDog/datadog-agent#16264.
The datadog_agent source accepts the V2 API payload with the resources field, but it does not handle tags that are sent in resources rather than in the tags array. ref: vector/src/sources/datadog_agent/metrics.rs, lines 270 to 281 in 53cad38:
serie.resources.into_iter().for_each(|r| {
    // As per https://github.com/DataDog/datadog-agent/blob/a62ac9fb13e1e5060b89e731b8355b2b20a07c5b/pkg/serializer/internal/metrics/iterable_series.go#L180-L189
    // the hostname can be found in MetricSeries::resources and that is the only value stored there.
    if r.r#type.eq("host") {
        log_schema()
            .host_key()
            .and_then(|key| tags.replace(key.to_string(), r.name));
    } else {
        // But to avoid losing information if this situation changes, any other resource type/name will be saved in the tags map
        tags.replace(format!("resource.{}", r.r#type), r.name);
    }
});
Because of this block, the device entry that arrives as an element of resources is remapped to resource.device by the datadog_agent source. As a result, the metrics sent out by the datadog_metrics sink carry a resource.device tag, which is incorrect. The tag should be device.
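To make the behavior we expected concrete, here is a minimal, self-contained sketch of the mapping. It is not Vector's actual code: the `Resource` struct, the `apply_resources` function, the `host_key` parameter, and the plain `BTreeMap` used in place of Vector's tag type are all hypothetical stand-ins for illustration. The only change from the logic quoted above is an extra arm that stores a resource of type "device" under the device tag.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one entry of the V2 payload's `resources` array.
pub struct Resource {
    pub r#type: String,
    pub name: String,
}

/// Folds the resources of a series into its tag map.
/// `tags` is a plain map here (an assumption; Vector uses its own tag type),
/// and `host_key` stands in for the configured host key.
pub fn apply_resources(
    tags: &mut BTreeMap<String, String>,
    resources: Vec<Resource>,
    host_key: &str,
) {
    for r in resources {
        match r.r#type.as_str() {
            // The hostname is stored under the host key, as in the quoted code.
            "host" => {
                tags.insert(host_key.to_string(), r.name);
            }
            // Proposed behavior: keep the plain `device` tag name.
            "device" => {
                tags.insert("device".to_string(), r.name);
            }
            // Any other resource type is still kept as `resource.<type>`.
            other => {
                tags.insert(format!("resource.{}", other), r.name);
            }
        }
    }
}
```

With a mapping like this, a series carrying a resource of type "device" and name "/dev/loop0" would end up with a device tag rather than resource.device. Whether device should be special-cased like this, or handled in some more general way, is for the maintainers to decide.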
Seeking assistance in resolving this issue.
Discord thread: https://discord.com/channels/742820443487993987/1155850005391880214
cc @datsabk @jszwedko
Configuration
api:
  address: 0.0.0.0:8686
  enabled: true
  playground: false
data_dir: /data/vector
sinks:
  datadog_metrics:
    batch:
      max_bytes: 512000
    buffer:
      max_events: 10000
      type: memory
      when_full: block
    default_api_key: ${DD_API_KEY}
    inputs:
      - modify_tags_for_datadog
    type: datadog_metrics
sources:
  datadog_agent:
    address: 0.0.0.0:8282
    disable_logs: true
    disable_traces: true
    multiple_outputs: false
    store_api_key: true
    type: datadog_agent
  vector_source:
    address: 0.0.0.0:9000
    type: vector
transforms:
  filter_for_datadog:
    condition:
      source: "true"
      type: vrl
    inputs:
      - datadog_agent
      - vector_source
    type: filter
  modify_tags_for_datadog:
    inputs:
      - filter_for_datadog
    source: |-
      del(.tags."k2.version")
      del(.tags."k2.skip_checks")
      del(.tags.container_id)
      del(.tags.display_container_name)
      del(.tags."git.commit.sha")
      del(.tags.kube_ownerref_name)
      del(.tags.kube_replica_set)
      del(.tags."io.kubernetes.pod.uid")
      del(.tags.image_id)
    type: remap
Version
vector 0.30.0
Debug Output
No response
Example Data
{
  "metric": {
    "name": "disk.in_use",
    "namespace": "system",
    "tags": {
      "device_name": "loop0",
      "host": "i-03fe32ac191d77928",
      "resource.device": "/dev/loop0",
      "source_type_name": "System"
    },
    "timestamp": "2023-09-26T15:25:14Z",
    "kind": "absolute",
    "gauge": {
      "value": 0.192
    }
  }
}
Additional Context
No response
References
No response