datadog_agent source is not processing V2 API payload from Datadog agent accurately. #18690

@rpriyanshu9

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Problem

Hey there,

After upgrading the Datadog Agent from 7.39.2 to 7.45.0, we observed that some metrics that use the device tag stopped arriving. On further investigation, we found that the device tag had been renamed to resource.device after the upgrade. As a result, many dashboards went empty and monitors triggered in Datadog, and we had to revert the upgrade. We then looked into the source code of the Datadog Agent and Vector to find the root cause.

Here's what we think is causing this:

Starting with the 7.43.2 release of the Datadog Agent, the device tag is sent as part of the resources array: DataDog/datadog-agent#16264.

The datadog_agent source accepts the V2 API payload with the resources field, but it does not correctly handle tags that arrive in resources rather than in the tags array. Reference:

    serie.resources.into_iter().for_each(|r| {
        // As per https://github.com/DataDog/datadog-agent/blob/a62ac9fb13e1e5060b89e731b8355b2b20a07c5b/pkg/serializer/internal/metrics/iterable_series.go#L180-L189
        // the hostname can be found in MetricSeries::resources and that is the only value stored there.
        if r.r#type.eq("host") {
            log_schema()
                .host_key()
                .and_then(|key| tags.replace(key.to_string(), r.name));
        } else {
            // But to avoid losing information if this situation changes, any other resource type/name will be saved in the tags map
            tags.replace(format!("resource.{}", r.r#type), r.name);
        }
    });

Because of this block, the device tag that arrives as an element of resources is remapped to resource.device by the datadog_agent source. As a result, metrics sent out by the datadog_metrics sink carry a resource.device tag, which is incorrect; the tag should be device.
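To make the expected behavior concrete, here is a minimal, standalone sketch of how the mapping could treat the device resource. It is not Vector's actual code: the Resource struct, the resources_to_tags function, and the host_key parameter (standing in for log_schema().host_key()) are hypothetical names, and a plain BTreeMap stands in for Vector's tag type.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one entry of the `resources` array in a V2 series payload.
#[derive(Debug, Clone)]
pub struct Resource {
    pub r#type: String,
    pub name: String,
}

/// Copies resources into the tag map.
/// `host_key` is an assumed stand-in for `log_schema().host_key()`.
pub fn resources_to_tags(
    resources: Vec<Resource>,
    host_key: Option<&str>,
    tags: &mut BTreeMap<String, String>,
) {
    for r in resources {
        match r.r#type.as_str() {
            // The hostname goes under the configured host key, as it does today.
            "host" => {
                if let Some(key) = host_key {
                    tags.insert(key.to_string(), r.name);
                }
            }
            // Proposed change: keep the plain `device` tag name.
            "device" => {
                tags.insert("device".to_string(), r.name);
            }
            // Any other resource type keeps the existing `resource.` prefix.
            other => {
                tags.insert(format!("resource.{}", other), r.name);
            }
        }
    }
}
```

With this mapping, a payload carrying a host resource and a device resource would produce host and device tags, with no resource.device entry.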

Seeking assistance in resolving this issue.

Discord thread: https://discord.com/channels/742820443487993987/1155850005391880214

cc @datsabk @jszwedko

Configuration

    api:
      address: 0.0.0.0:8686
      enabled: true
      playground: false
    data_dir: /data/vector
    sinks:
      datadog_metrics:
        batch:
          max_bytes: 512000
        buffer:
          max_events: 10000
          type: memory
          when_full: block
        default_api_key: ${DD_API_KEY}
        inputs:
        - modify_tags_for_datadog
        type: datadog_metrics
    sources:
      datadog_agent:
        address: 0.0.0.0:8282
        disable_logs: true
        disable_traces: true
        multiple_outputs: false
        store_api_key: true
        type: datadog_agent
      vector_source:
        address: 0.0.0.0:9000
        type: vector
    transforms:
      filter_for_datadog:
        condition:
          source: "true"
          type: vrl
        inputs:
        - datadog_agent
        - vector_source
        type: filter
      modify_tags_for_datadog:
        inputs:
        - filter_for_datadog
        source: |-
          del(.tags."k2.version")
          del(.tags."k2.skip_checks")
          del(.tags.container_id)
          del(.tags.display_container_name)
          del(.tags."git.commit.sha")
          del(.tags.kube_ownerref_name)
          del(.tags.kube_replica_set)
          del(.tags."io.kubernetes.pod.uid")
          del(.tags.image_id)
        type: remap

Version

vector 0.30.0

Debug Output

No response

Example Data

{
    "metric": {
        "name": "disk.in_use",
        "namespace": "system",
        "tags": {
            "device_name": "loop0",
            "host": "i-03fe32ac191d77928",
            "resource.device": "/dev/loop0",
            "source_type_name": "System"
        },
        "timestamp": "2023-09-26T15:25:14Z",
        "kind": "absolute",
        "gauge": {
            "value": 0.192
        }
    }
}
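For comparison, the tag set we expected for this event would carry device instead of resource.device. The sketch below shows that normalization on a plain map; restore_device_tag is a hypothetical helper used only for illustration, not part of Vector.

```rust
use std::collections::BTreeMap;

/// Moves the value stored under `resource.device` to `device`.
/// If a `device` tag already exists it is kept, and the `resource.device`
/// entry is simply dropped.
pub fn restore_device_tag(tags: &mut BTreeMap<String, String>) {
    if let Some(value) = tags.remove("resource.device") {
        tags.entry("device".to_string()).or_insert(value);
    }
}
```

Applied to the tags above, this would leave device_name, host, and source_type_name unchanged and yield device set to /dev/loop0.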

Additional Context

No response

References

No response
