
Conversation

@mirko-lazarevic
Contributor

@mirko-lazarevic mirko-lazarevic commented Dec 1, 2025

This fix ensures that when the buffer is
flushed, the record has a proper timestamp
and metadata instead of just the "log" field.


Enter [N/A] in the box if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Summary by CodeRabbit

  • Bug Fixes

    • Multiline processing now registers first-line context earlier (including pre-concatenation when starting fresh) and avoids packing metadata for records truncated during processing, preventing metadata loss.
  • Tests

    • Added regression tests to ensure full per-record metadata (time, stream, file, log) is preserved across multiline flushes, slow arrivals, and truncation/continuation boundaries.


@coderabbitai

coderabbitai bot commented Dec 1, 2025


Walkthrough

The multiline parser now registers the first-line context earlier when the group's first-line buffer is empty and avoids packing metadata for truncated content; new unit tests verify metadata preservation across flushes and truncation boundaries.

Changes

| Cohort / File(s) | Change Summary |
|---|---|
| Multiline context & metadata logic (src/multiline/flb_ml.c) | Register full_map as context earlier when stream_group->mp_sbuf.size == 0 (including ENDSWITH path) and require !truncated when deciding to pack metadata, preventing metadata packing for truncated content. |
| Multiline metadata regression tests (tests/internal/multiline.c) | Add tests for issue 10576: introduce metadata_result, flush_callback_metadata_check, append_log_with_metadata, test_issue_10576, and test_issue_truncation_10576; register tests in TEST_LIST to assert per-record stream/file metadata presence and behavior across truncation. |
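
The "register context only when the buffer is empty" rule summarized above can be illustrated with a small, self-contained sketch. This is not the code from src/multiline/flb_ml.c: the `sketch_group` struct and the `sketch_*` functions are hypothetical stand-ins, with `buffered_bytes` playing the role of `mp_sbuf.size` and `context` playing the role of the registered `full_map`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a multiline stream group (NOT the real type). */
struct sketch_group {
    size_t buffered_bytes;   /* plays the role of mp_sbuf.size */
    const char *context;     /* plays the role of the registered full_map */
};

/*
 * Register the incoming record as the group's context only when nothing
 * is buffered yet, i.e. when this line starts a fresh multiline record.
 * Returns true when the context was registered.
 */
static bool sketch_register_if_fresh(struct sketch_group *g,
                                     const char *full_map)
{
    assert(g != NULL);
    if (g->buffered_bytes == 0) {
        g->context = full_map;
        return true;
    }
    return false;
}

/* Decide on the context first, then append the line to the buffer. */
static void sketch_append(struct sketch_group *g, const char *full_map,
                          size_t line_len)
{
    sketch_register_if_fresh(g, full_map);
    g->buffered_bytes += line_len;
}

/* Flushing empties the buffer, so the next line starts fresh. */
static void sketch_flush(struct sketch_group *g)
{
    assert(g != NULL);
    g->buffered_bytes = 0;
}
```

In this model, a continuation line appended to a non-empty buffer keeps the context of the first line, while the first line after a flush replaces it.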

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Review placement of flb_ml_register_context() calls to ensure they're only invoked when mp_sbuf.size == 0.
  • Confirm the updated metadata packing guard (!truncated && processed && metadata != NULL) doesn't unintentionally drop metadata in valid flows.
  • Validate the new tests for determinism and that they correctly simulate truncation/continuation edge cases.
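
To make the second review point easier to reason about, the guard expression quoted above can be isolated into a predicate. This is only a sketch: the function name `sketch_should_pack_metadata` is hypothetical, and the parameter names simply mirror the identifiers in the quoted condition rather than the real variables in flb_ml.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical predicate mirroring the guard quoted in the review notes:
 *     !truncated && processed && metadata != NULL
 * Metadata is packed only for content that was processed, was not
 * truncated, and actually carries a metadata object.
 */
static bool sketch_should_pack_metadata(bool truncated, bool processed,
                                        const void *metadata)
{
    return !truncated && processed && metadata != NULL;
}
```

Under this predicate, a truncated record never has metadata packed regardless of the other two inputs, which is the case the reviewer is asked to check against valid flows.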

Poem

🐇 I hop through logs where fragments play,
I plant the map at each new day.
When bytes are chewed and lines extend,
I keep your tags until the end.
A rabbit guards the metadata way.

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The PR title 'ml: ensure context is registered for REGEX type' is partially related to the changeset. The changes involve context registration in multiline processing with conditional logic on empty buffers and truncation boundaries, but the title specifically references 'REGEX type' which is not explicitly mentioned in the file-level summaries and may not represent the main point of the changes. | Clarify whether 'REGEX type' is the key aspect being fixed, or consider a more descriptive title that captures the core issue (e.g., metadata preservation during multiline flush/truncation) if that is the primary change. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between fa27f8d and 79afb58.

📒 Files selected for processing (2)
  • src/multiline/flb_ml.c (2 hunks)
  • tests/internal/multiline.c (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/multiline/flb_ml.c
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: cosmo0920
Repo: fluent/fluent-bit PR: 11059
File: plugins/in_tail/tail_file.c:1618-1640
Timestamp: 2025-10-23T07:43:16.216Z
Learning: In plugins/in_tail/tail_file.c, when truncate_long_lines is enabled and the buffer is full, the early truncation path uses `lines > 0` as the validation pattern to confirm whether process_content successfully processed content. This is intentional to track occurrences of line processing rather than byte consumption, and consuming bytes based on `processed_bytes > 0` would be overkill for this validation purpose.
🧬 Code graph analysis (1)
tests/internal/multiline.c (4)
src/multiline/flb_ml.c (1)
  • flb_ml_append_object (764-863)
src/flb_config.c (1)
  • flb_config_exit (488-672)
src/multiline/flb_ml_parser.c (4)
  • flb_ml_parser_create (200-224)
  • flb_ml_parser_init (131-141)
  • flb_ml_parser_instance_create (261-312)
  • flb_ml_parser_instance_set (315-340)
src/multiline/flb_ml_stream.c (1)
  • flb_ml_stream_create (223-276)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (31)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 32bit, x86, x86-windows-static, 3.31.6)
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit (Arm64), amd64_arm64, -DCMAKE_SYSTEM_NAME=Windows -DCMA...
  • GitHub Check: pr-windows-build / call-build-windows-package (Windows 64bit, x64, x64-windows-static, 3.31.6)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_ARROW=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COMPILER_STRICT_POINTER_TYPES=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_JEMALLOC=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_MEMORY=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SIMD=Off, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_UNDEFINED=On, 3.31.6, clang, clang++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_COVERAGE=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SANITIZE_THREAD=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DSANITIZE_ADDRESS=On, 3.31.6, gcc, g++)
  • GitHub Check: run-ubuntu-unit-tests (-DFLB_SMALL=On, 3.31.6, clang, clang++)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, clang, clang++, ubuntu-24.04, clang-14)
  • GitHub Check: pr-compile-system-libs (-DFLB_PREFER_SYSTEM_LIBS=On, 3.31.6, gcc, g++, ubuntu-22.04, clang-12)
  • GitHub Check: pr-compile-centos-7
  • GitHub Check: pr-compile-without-cxx (3.31.6)
  • GitHub Check: PR - fuzzing test
🔇 Additional comments (1)
tests/internal/multiline.c (1)

1646-2090: LGTM! Excellent test coverage for metadata preservation.

The new unit tests comprehensively validate the fix for issue #10576:

  1. test_issue_10576 properly simulates slow log arrival by flushing after each line and verifies all records maintain complete metadata (stream and file fields).

  2. test_issue_truncation_10576 correctly validates metadata isolation across truncation boundaries—ensuring the second multiline group gets its own fresh metadata rather than inheriting stale values.

  3. Helper infrastructure (metadata_result struct, flush_callback_metadata_check, append_log_with_metadata) is well-designed, bounds-safe, and properly manages msgpack resources.

  4. Test assertions are thorough and include helpful diagnostic output.
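
The kind of per-record check described in the list above can be sketched without msgpack. The code below is not the helper code from tests/internal/multiline.c: `sketch_record`, `sketch_result`, and `sketch_on_flush` are hypothetical names, and a record is modeled as a plain list of key names instead of a msgpack map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_KEYS 8

/* Hypothetical flushed record, modeled as a list of key names. */
struct sketch_record {
    const char *keys[SKETCH_MAX_KEYS];
    size_t key_count;
};

/* Hypothetical counters a flush callback could accumulate. */
struct sketch_result {
    int flushed;         /* records seen by the callback */
    int with_metadata;   /* records carrying both "stream" and "file" */
};

/* Return true when the record contains the given key. */
static bool sketch_has_key(const struct sketch_record *r, const char *key)
{
    size_t i;

    assert(r != NULL && r->key_count <= SKETCH_MAX_KEYS);
    for (i = 0; i < r->key_count; i++) {
        if (strcmp(r->keys[i], key) == 0) {
            return true;
        }
    }
    return false;
}

/* Called once per flushed record; counts records with full metadata. */
static void sketch_on_flush(const struct sketch_record *r,
                            struct sketch_result *res)
{
    res->flushed++;
    if (sketch_has_key(r, "stream") && sketch_has_key(r, "file")) {
        res->with_metadata++;
    }
}
```

A regression test built on this idea would flush after every appended line and then assert that `with_metadata` equals `flushed`; a record reduced to just the "log" key makes the two counters diverge.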

All previous review feedback (timestamp length, typo, style) has been addressed in prior commits.




@mirko-lazarevic
Contributor Author

This pull request should address issue #10576.

For the Fluent Bit configuration example and the steps to reproduce the issue, see #10576.

Output after the fix:

Fluent Bit v4.2.1
* Copyright (C) 2015-2025 The Fluent Bit Authors
* Fluent Bit is a CNCF graduated project under the Fluent organization
* https://fluentbit.io

______ _                  _    ______ _ _             ___   _____
|  ___| |                | |   | ___ (_) |           /   | / __  \
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __/ /| | `' / /'
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / / /_| |   / /
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /\___  |_./ /___
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/     |_(_)_____/

             Fluent Bit v4.2 – Direct Routes Ahead
         Celebrating 10 Years of Open, Fluent Innovation!

[2025/12/01 12:30:43.528267000] [ info] [fluent bit] version=4.2.1, commit=10ebd3a354, pid=6123
[2025/12/01 12:30:43.528771000] [ info] [storage] ver=1.5.4, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2025/12/01 12:30:43.528993000] [ info] [simd    ] disabled
[2025/12/01 12:30:43.528998000] [ info] [cmetrics] version=1.0.5
[2025/12/01 12:30:43.529349000] [ info] [ctraces ] version=0.6.6
[2025/12/01 12:30:43.529578000] [ info] [input:tail:tail.0] initializing
[2025/12/01 12:30:43.529585000] [ info] [input:tail:tail.0] storage_strategy='memory' (memory only)
[2025/12/01 12:30:43.530012000] [ info] [input:tail:tail.0] multiline core started
[2025/12/01 12:30:43.530308000] [ info] [input:tail:tail.0] thread instance initialized
[2025/12/01 12:30:43.530546000] [ info] [filter:multiline:ml-detect] created emitter: emitter_for_ml-detect
[2025/12/01 12:30:43.530591000] [ info] [input:emitter:emitter_for_ml-detect] initializing
[2025/12/01 12:30:43.530596000] [ info] [input:emitter:emitter_for_ml-detect] storage_strategy='memory' (memory only)
[2025/12/01 12:30:43.530916000] [ info] [output:stdout:stdout.0] worker #0 started
[2025/12/01 12:30:43.531683000] [ info] [http_server] listen iface=0.0.0.0 tcp_port=8081
[2025/12/01 12:30:43.531917000] [ info] [sp] stream processor started
[2025/12/01 12:30:43.532206000] [ info] [engine] Shutdown Grace Period=5, Shutdown Input Grace Period=2


[2025/12/01 12:30:49.787352000] [ info] [filter:multiline:ml-detect] created new multiline stream for tail.0_kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588649.296478000, {}], {"time"=>"2025-12-01T12:30:49.296478+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:30:49 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588654.298018000, {}], {"time"=>"2025-12-01T12:30:54.298018+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:30:54 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588659.299245000, {}], {"time"=>"2025-12-01T12:30:59.299245+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"2025-12-01T11:30:59+00:00 should be ok", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588667.512873000, {}], {"time"=>"2025-12-01T12:31:07.512873+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:31:07 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]
[0] kube.Users.mila.dev.tmp.fluent-bit-10576-go-repro.output.log: [[1764588672.513383999, {}], {"time"=>"2025-12-01T12:31:12.513384+01:00", "stream"=>"stdout", "_p"=>"F", "log"=>"Mon Dec  1 11:31:12 UTC 2025 Likely to fail", "file"=>"/Users/mila/dev/tmp/fluent-bit-10576-go-repro/output.log"}]

@patrick-stephens
Collaborator

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem.

@patrick-stephens
Collaborator

The CIFuzz failure is down to something else so can be ignored: #11227

@mirko-lazarevic
Contributor Author

@patrick-stephens

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

I saw exactly the same commit message from one of the maintainers, which is why I did the same. Anyway, I removed ml:.

I'll see if I can add some unit tests, although my knowledge in this area is limited.

@mirko-lazarevic
Contributor Author

mirko-lazarevic commented Dec 2, 2025

@mirko-lazarevic maybe tweak the commit slightly as having ml: in there is redundant and confusing.

Can you add some unit tests as well? I really like to see those as next time the code is refactored/updated it will prevent a similar problem.

@patrick-stephens Done

@mirko-lazarevic mirko-lazarevic changed the title from "multiline: ensure context is registered for REGEX type" to "ml: ensure context is registered for REGEX type" Dec 10, 2025
@mirko-lazarevic
Contributor Author

Hey, the patch is good, but one commit does not fit our commit message policy:

❌ Commit 23a2e18 failed:
Subject prefix 'multiline:' does not match files changed.
Expected one of: ml:

Done.

@mirko-lazarevic
Contributor Author

@cosmo0920 @patrick-stephens I'm not 100% sure what needs to be done regarding the commit message linter. Do I need to squash my commits into a single one? Thank you

@patrick-stephens
Collaborator

The prefix should match the files being changed; it looks like tests: is expected for some of them. If you check the output from the CI you can see: https://github.com/fluent/fluent-bit/actions/runs/20093737550/job/57646949415?pr=11231
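
As a rough illustration of "the prefix should match the files being changed", the sketch below maps a changed path to a subject prefix. It is not the project's linter, whose actual rules are not shown in this thread: the two path mappings are inferred only from the linter messages and file names quoted in this conversation, and `sketch_starts_with` and `sketch_expected_prefix` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return non-zero when `text` begins with `prefix`. */
static int sketch_starts_with(const char *text, const char *prefix)
{
    assert(text != NULL && prefix != NULL);
    return strncmp(text, prefix, strlen(prefix)) == 0;
}

/*
 * Map a changed file path to the subject prefix it suggests.
 * Assumed mapping (inferred from the messages in this thread only):
 *   src/multiline/...  ->  "ml:"
 *   tests/...          ->  "tests:"
 * Returns NULL for paths this sketch does not know about.
 */
static const char *sketch_expected_prefix(const char *path)
{
    if (sketch_starts_with(path, "src/multiline/")) {
        return "ml:";
    }
    if (sketch_starts_with(path, "tests/")) {
        return "tests:";
    }
    return NULL;
}
```

With this mapping, a commit that only touches src/multiline/flb_ml.c would suggest the subject prefix `ml:`, and one that only touches tests/internal/multiline.c would suggest `tests:`. How the real linter treats commits that touch both is not covered here.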

Signed-off-by: Mirko Lazarevic <mirko.lazarevic@ibm.com>
Signed-off-by: Mirko Lazarevic <mirko.lazarevic@ibm.com>
@cosmo0920
Contributor

The commit linter still complains about one commit:
❌ Commit 01f3911 failed:
Subject prefix 'tests:' does not match files changed.
Expected one of: ml:, tests:
Commit prefix validation failed.
Error: Process completed with exit code 1.

Addresses PR comments and adds corresponding unit tests

Signed-off-by: Mirko Lazarevic <mirko.lazarevic@ibm.com>
Signed-off-by: Mirko Lazarevic <mirko.lazarevic@ibm.com>
Signed-off-by: Mirko Lazarevic <mirko.lazarevic@ibm.com>