fix: config validation for elasticsearch, loki, file_out (#718, #530)#795
Conversation
Both types are fully implemented via build_sink_factory() but were incorrectly rejected at validation time as "not yet implemented". Elasticsearch and Loki now validate like other endpoint-based outputs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FileOut has no sink implementation but was accepted by validation, causing a runtime crash. Now rejected at config load like Parquet. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
WalkthroughValidation in crates/logfwd-config was changed: Possibly related PRs
Caution Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional.
❌ Failed checks (1 error, 3 warnings, 1 inconclusive)
✅ Passed checks (2 passed)
Comment |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
crates/logfwd-config/src/lib.rs (3)
982-995: 🧹 Nitpick | 🔵 TrivialAdd explicit failing cases for missing
endpointon the new outputs.This loop only proves the happy path. If a future edit drops
elasticsearchorlokifrom the endpoint-required arm, these cases still stay green. Please add direct negative tests that both types fail withoutendpoint.🧪 Suggested regression test
+ #[test] + fn validation_endpoint_required_for_elasticsearch_and_loki() { + for otype in ["elasticsearch", "loki"] { + let yaml = format!( + "input:\n type: file\n path: /tmp/x.log\noutput:\n type: {otype}\n" + ); + let msg = Config::load_str(&yaml).unwrap_err().to_string(); + assert!( + msg.contains("endpoint"), + "expected missing endpoint error for '{otype}': {msg}" + ); + } + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-config/src/lib.rs` around lines 982 - 995, The test all_output_types() only checks successful parsing when endpoints are present; add explicit negative tests that verify Config::load_str fails for the new outputs when the endpoint is missing by calling Config::load_str on YAMLs for "elasticsearch" and "loki" without an endpoint and asserting an error (e.g., using unwrap_err or assert!(matches!(...))). Update or add assertions near the all_output_types() test to include these two failing cases so regressions removing endpoint validation for elasticsearch or loki are caught.
523-542: 🛠️ Refactor suggestion | 🟠 MajorDelete the stale
FileOutarm from the second match.
FileOutis now rejected above as unimplemented, but Line 523 still treats it as a valid output that only needspath. Keeping both paths in the same function leaves conflicting behavior and makes it easy to regress back into the runtime crash this PR is fixing.♻️ Suggested cleanup
- OutputType::FileOut => { - if output.path.is_none() { - return Err(ConfigError::Validation(format!( - "pipeline '{name}' output '{label}': {} output requires 'path'", - output_type_name(&output.output_type), - ))); - } - } OutputType::Stdout | OutputType::Null => {} OutputType::TcpOut | OutputType::UdpOut => { if output.endpoint.is_none() { return Err(ConfigError::Validation(format!( "pipeline '{name}' output '{label}': {} output requires 'endpoint'", output_type_name(&output.output_type), ))); } } - // FileOut and Parquet are already rejected above. - OutputType::FileOut | OutputType::Parquet => { + // Parquet is already rejected above. + OutputType::Parquet => { unreachable!("placeholder types are rejected before this match") }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-config/src/lib.rs` around lines 523 - 542, The second match still contains an OutputType::FileOut arm that conflicts with the earlier rejection of FileOut; remove the stale OutputType::FileOut alternative from that match so only Parquet (or the intended placeholder types) are handled by the unreachable! arm. Ensure the logic that checks output.path and output.endpoint remains in the first match branches (the existing OutputType::FileOut and OutputType::TcpOut/OutputType::UdpOut arms) and keep the unreachable! arm only for OutputType::Parquet (and any real placeholders), leaving output_type_name, ConfigError::Validation, output.path and output.endpoint checks unchanged.
500-514:⚠️ Potential issue | 🟡 MinorTighten endpoint validation before reusing it for
elasticsearch/loki.
validate_endpoint_url()only checks the scheme prefix and whether anything remains after it, so malformed values likehttp:///bulkorhttp://:9200still pass this branch. That lets the newly accepted outputs clear config validation and fail later during sink construction instead of at load time.🛠️ Suggested fix
fn validate_endpoint_url(endpoint: &str) -> Result<(), String> { // Defer validation for values that still contain unexpanded env var placeholders. if endpoint.contains("${") { return Ok(()); } let rest = if let Some(r) = endpoint.strip_prefix("https://") { r } else if let Some(r) = endpoint.strip_prefix("http://") { r } else { return Err(format!( "endpoint '{endpoint}' has no recognised scheme; expected 'http://' or 'https://'" )); }; - if rest.is_empty() { + let authority = rest.split(['/', '?', '#']).next().unwrap_or_default(); + if authority.is_empty() || authority.starts_with(':') { return Err(format!( "endpoint '{endpoint}' has no host after the scheme" )); } Ok(()) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-config/src/lib.rs` around lines 500 - 514, The endpoint validation for outputs of type OutputType::Elasticsearch and OutputType::Loki (and other URL-based outputs) is too permissive because validate_endpoint_url() only checks the scheme prefix; update the validation in the match arm handling output.output_type (and reuse of validate_endpoint_url) to fully parse the endpoint (e.g., via a proper URL parser) and ensure a valid authority/host (and optional port) and that there are no empty authority components (reject values like "http:///bulk" or "http://:9200"), and if parsing/host checks fail return the same ConfigError::Validation with the existing formatted message so validation fails at load time instead of later during sink construction.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 982-995: The test all_output_types() only checks successful
parsing when endpoints are present; add explicit negative tests that verify
Config::load_str fails for the new outputs when the endpoint is missing by
calling Config::load_str on YAMLs for "elasticsearch" and "loki" without an
endpoint and asserting an error (e.g., using unwrap_err or
assert!(matches!(...))). Update or add assertions near the all_output_types()
test to include these two failing cases so regressions removing endpoint
validation for elasticsearch or loki are caught.
- Around line 523-542: The second match still contains an OutputType::FileOut
arm that conflicts with the earlier rejection of FileOut; remove the stale
OutputType::FileOut alternative from that match so only Parquet (or the intended
placeholder types) are handled by the unreachable! arm. Ensure the logic that
checks output.path and output.endpoint remains in the first match branches (the
existing OutputType::FileOut and OutputType::TcpOut/OutputType::UdpOut arms) and
keep the unreachable! arm only for OutputType::Parquet (and any real
placeholders), leaving output_type_name, ConfigError::Validation, output.path
and output.endpoint checks unchanged.
- Around line 500-514: The endpoint validation for outputs of type
OutputType::Elasticsearch and OutputType::Loki (and other URL-based outputs) is
too permissive because validate_endpoint_url() only checks the scheme prefix;
update the validation in the match arm handling output.output_type (and reuse of
validate_endpoint_url) to fully parse the endpoint (e.g., via a proper URL
parser) and ensure a valid authority/host (and optional port) and that there are
no empty authority components (reject values like "http:///bulk" or
"http://:9200"), and if parsing/host checks fail return the same
ConfigError::Validation with the existing formatted message so validation fails
at load time instead of later during sink construction.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 529e5cc7-7553-466f-a68e-f9f3c9029b07
📒 Files selected for processing (1)
crates/logfwd-config/src/lib.rs
|
@coderabbitai can you please fully review this change? |
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 47812
Overall AssessmentThe logic intent is correct: 🔴 Bug: Dead/duplicate
|
| # | Severity | Issue |
|---|---|---|
| 1 | 🔴 Bug | Duplicate FileOut arm in second match → unreachable_patterns warning; dead path check |
| 2 | 🟡 Missing tests | No coverage for elasticsearch/loki missing/invalid endpoint |
| 3 | ℹ️ Pre-existing | tcp_out/udp_out absent from all_output_types |
🧠 Learnings used
Learnt from: CR
Repo: strawgate/memagent PR: 0
File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0
Timestamp: 2026-04-03T03:35:42.884Z
Learning: Applies to logfwd-io/src/**/*.rs : New async pipeline code must have proptest coverage for: arbitrary event ordering; acks out of order; drain while in-flight
FileOut was matched twice — once for validation (new code) and once in the old catch-all. Remove the duplicate that caused the unreachable_pattern clippy error. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/logfwd-config/src/lib.rs (1)
500-545: 🛠️ Refactor suggestion | 🟠 MajorRemove dead
FileOut/Parquetarms from the second match.Because Line 491 returns early for
FileOut/Parquet, theOutputType::FileOutarm (Line 526) andOutputType::Parquetarm (Line 543) in the later match are unreachable by control flow and now misleading.Suggested patch
match output.output_type { OutputType::Otlp | OutputType::Http | OutputType::Elasticsearch | OutputType::Loki => { if output.endpoint.is_none() { return Err(ConfigError::Validation(format!( "pipeline '{name}' output '{label}': {} output requires 'endpoint'", output_type_name(&output.output_type), ))); } if let Some(ep) = &output.endpoint && let Err(msg) = validate_endpoint_url(ep) { return Err(ConfigError::Validation(format!( "pipeline '{name}' output '{label}': {msg}", ))); } if output.output_type == OutputType::Otlp && output.compression.as_deref() == Some("gzip") { return Err(ConfigError::Validation(format!( "pipeline '{name}' output '{label}': otlp output does not support 'gzip' compression yet" ))); } } - OutputType::FileOut => { - if output.path.is_none() { - return Err(ConfigError::Validation(format!( - "pipeline '{name}' output '{label}': {} output requires 'path'", - output_type_name(&output.output_type), - ))); - } - } OutputType::Stdout | OutputType::Null => {} OutputType::TcpOut | OutputType::UdpOut => { if output.endpoint.is_none() { return Err(ConfigError::Validation(format!( "pipeline '{name}' output '{label}': {} output requires 'endpoint'", output_type_name(&output.output_type), ))); } } - OutputType::Parquet => { - // Parquet output not yet implemented - } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/logfwd-config/src/lib.rs` around lines 500 - 545, The second match on output.output_type contains dead/unreachable arms for OutputType::FileOut and OutputType::Parquet; remove those two arms (the OutputType::FileOut and OutputType::Parquet match branches) from the match so the control flow no longer contains misleading unreachable cases, and if necessary replace them with a catch-all arm (e.g. _ => {}) to keep the match exhaustive; update the match that inspects output.output_type (the one handling Otlp/Http/Elasticsearch/Loki/Stdout/Null/TcpOut/UdpOut) accordingly so behavior and validation for other variants (including endpoint/path checks and the otlp gzip check) remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 991-992: Add negative-path tests for the "elasticsearch" and
"loki" output types similar to the existing otlp/http checks: in the tests
module where all_output_types exists, add tests that invoke the same
validation/parsing used by all_output_types to assert errors when the endpoint
is missing and when the endpoint has no scheme (e.g., "http://x" vs "x" vs
empty). Name the tests clearly (e.g., elasticsearch_missing_endpoint,
elasticsearch_scheme_missing, loki_missing_endpoint, loki_scheme_missing) and
assert that the validator returns an error for those cases so regressions are
caught.
In `@crates/logfwd/tests/allocation_scaling.rs`:
- Around line 51-54: Update the inline timeout comment to reflect the actual
deadline used: change the comment that currently says "after 30s safety timeout"
to state "after 60s safety timeout" so it matches the Deadline computed in the
std::thread::spawn block (the variable deadline using Duration::from_secs(60)).
---
Outside diff comments:
In `@crates/logfwd-config/src/lib.rs`:
- Around line 500-545: The second match on output.output_type contains
dead/unreachable arms for OutputType::FileOut and OutputType::Parquet; remove
those two arms (the OutputType::FileOut and OutputType::Parquet match branches)
from the match so the control flow no longer contains misleading unreachable
cases, and if necessary replace them with a catch-all arm (e.g. _ => {}) to keep
the match exhaustive; update the match that inspects output.output_type (the one
handling Otlp/Http/Elasticsearch/Loki/Stdout/Null/TcpOut/UdpOut) accordingly so
behavior and validation for other variants (including endpoint/path checks and
the otlp gzip check) remain unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: ASSERTIVE
Plan: Pro
Run ID: 07ee4e5d-5f28-4d99-abab-827169a37e95
📒 Files selected for processing (2)
crates/logfwd-config/src/lib.rscrates/logfwd/tests/allocation_scaling.rs
| ("elasticsearch", "endpoint: http://x"), | ||
| ("loki", "endpoint: http://x"), |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Add missing negative endpoint/scheme tests for elasticsearch and loki.
You added positive-path coverage for these types in all_output_types, but there is still no explicit regression test that they fail on missing endpoint or scheme-less endpoint (like existing otlp/http checks).
Suggested patch
#[test]
fn validation_missing_output_endpoint() {
- let yaml = r"
-input:
- type: file
- path: /var/log/test.log
-output:
- type: otlp
-";
- let err = Config::load_str(yaml).unwrap_err();
- let msg = err.to_string();
- assert!(
- msg.contains("endpoint"),
- "expected 'endpoint' in error: {msg}"
- );
+ for otype in ["otlp", "http", "elasticsearch", "loki"] {
+ let yaml = format!(
+ "input:\n type: file\n path: /var/log/test.log\noutput:\n type: {otype}\n"
+ );
+ let err = Config::load_str(&yaml).unwrap_err();
+ let msg = err.to_string();
+ assert!(
+ msg.contains("endpoint"),
+ "expected 'endpoint' in error for '{otype}': {msg}"
+ );
+ }
}
@@
#[test]
fn validation_endpoint_missing_scheme() {
// Scheme-less endpoints must be rejected for both otlp and http outputs.
- for otype in ["otlp", "http"] {
+ for otype in ["otlp", "http", "elasticsearch", "loki"] {
let yaml = format!(
"input:\n type: file\n path: /tmp/x.log\noutput:\n type: {otype}\n endpoint: collector:4317\n"
);
@@
#[test]
fn validation_endpoint_valid_schemes() {
// Both http:// and https:// must be accepted for otlp and http outputs.
for (otype, scheme) in [
("otlp", "http://"),
("otlp", "https://"),
("http", "http://"),
("http", "https://"),
+ ("elasticsearch", "http://"),
+ ("elasticsearch", "https://"),
+ ("loki", "http://"),
+ ("loki", "https://"),
] {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/logfwd-config/src/lib.rs` around lines 991 - 992, Add negative-path
tests for the "elasticsearch" and "loki" output types similar to the existing
otlp/http checks: in the tests module where all_output_types exists, add tests
that invoke the same validation/parsing used by all_output_types to assert
errors when the endpoint is missing and when the endpoint has no scheme (e.g.,
"http://x" vs "x" vs empty). Name the tests clearly (e.g.,
elasticsearch_missing_endpoint, elasticsearch_scheme_missing,
loki_missing_endpoint, loki_scheme_missing) and assert that the validator
returns an error for those cases so regressions are caught.
| // Cancel as soon as all expected rows are processed, or after 30s safety timeout. | ||
| // This makes the test data-driven, not time-driven. | ||
| std::thread::spawn(move || { | ||
| let deadline = std::time::Instant::now() + Duration::from_secs(30); | ||
| let deadline = std::time::Instant::now() + Duration::from_secs(60); |
There was a problem hiding this comment.
Update the timeout comment to match the new value.
Line 54 uses a 60s deadline, but Line 51 still says 30s. Please keep the comment aligned with behavior to avoid test-debug confusion.
Suggested patch
- // Cancel as soon as all expected rows are processed, or after 30s safety timeout.
+ // Cancel as soon as all expected rows are processed, or after 60s safety timeout.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| // Cancel as soon as all expected rows are processed, or after 30s safety timeout. | |
| // This makes the test data-driven, not time-driven. | |
| std::thread::spawn(move || { | |
| let deadline = std::time::Instant::now() + Duration::from_secs(30); | |
| let deadline = std::time::Instant::now() + Duration::from_secs(60); | |
| // Cancel as soon as all expected rows are processed, or after 60s safety timeout. | |
| // This makes the test data-driven, not time-driven. | |
| std::thread::spawn(move || { | |
| let deadline = std::time::Instant::now() + Duration::from_secs(60); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/logfwd/tests/allocation_scaling.rs` around lines 51 - 54, Update the
inline timeout comment to reflect the actual deadline used: change the comment
that currently says "after 30s safety timeout" to state "after 60s safety
timeout" so it matches the Deadline computed in the std::thread::spawn block
(the variable deadline using Duration::from_secs(60)).
Summary
After this change:
build_sink_factory()build_sink_factory()Test plan
cargo test -p logfwd-config— 34 tests passingcargo check --all-targets— cleanvalidation_unimplemented_output_typecoversfile_outandparquetall_output_typesconfirmselasticsearchandlokiare acceptedCloses #718, closes #530
🤖 Generated with Claude Code