Vector batch bytes limits are based on in-memory sizing of events #10020
Open
Labels
domain: performance (Anything related to Vector's performance)
Description
@tobz pointed out that our current batching mechanism uses the in-memory representation of events to determine their size, which does not match their serialized size. As a result, Vector can send batches that are either above or below the configured batch size. We expect batches will typically fall below the limit, since an event's in-memory size has generally been observed to be much greater than its serialized size, so Vector sends suboptimal batches. However, if Vector generates a batch greater than the configured batch size, this can cause failed requests when the batch size was configured to match a sink API's limit.
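To illustrate the discrepancy, here is a minimal sketch (not Vector's actual `Event` type or sizing logic) comparing a naive in-memory size estimate against a serialized size. The `LogEvent` struct and the hand-rolled JSON encoding are hypothetical stand-ins kept dependency-free:

```rust
use std::mem;

// Hypothetical simplified event; Vector's real Event type is far more complex.
struct LogEvent {
    message: String,
    host: String,
}

// Estimated in-memory footprint: the struct itself plus the heap
// allocations backing the strings (capacity, not just length).
fn in_memory_size(e: &LogEvent) -> usize {
    mem::size_of::<LogEvent>() + e.message.capacity() + e.host.capacity()
}

// Serialized footprint: a hand-rolled JSON-like encoding so the sketch
// needs no external crates (real sinks use their own codecs).
fn serialized_size(e: &LogEvent) -> usize {
    format!("{{\"message\":\"{}\",\"host\":\"{}\"}}", e.message, e.host).len()
}

fn main() {
    let event = LogEvent {
        message: "connection accepted".to_string(),
        host: "web-01".to_string(),
    };
    // The two sizes diverge, so a batch limit enforced against one
    // does not bound the other.
    println!("in-memory:  {} bytes", in_memory_size(&event));
    println!("serialized: {} bytes", serialized_size(&event));
}
```

Because a `String` alone carries a pointer, length, and capacity on top of its heap data, the in-memory estimate here exceeds the serialized form; a `batch.max_bytes` enforced on in-memory size would then under-fill requests.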
References:
- Issues encountered upgrading v0.17.3 -> v0.18.0: batch defaults and maximums #10128
- vector 0.18.1 s3 sink does not use batch.max_bytes, creates small files on S3 #10535
- chore(sinks): refactor `RequestBuilder`/`RequestMetadata` to streamline splitting/building #12857
- `gcp_cloud_storage` sink ignoring batch.max_bytes #14426
- Batch sizing for aws_s3 sink does not work #14416
- Vector logs an error on packets that are too large #13175
- Instrument sink batching #9719
- Datadog Logs sink sacrifices 750KB on payload size for throughput and we'd like to avoid that sacrifice. #9202
- datadog_traces - Failed to encode Datadog traces. #14244
- data loss submitting metrics to datadog #18123