Skip to content

[monitoring] Add per-tenant log routing support in fluent-bit#2195

Open
mattia-eleuteri wants to merge 1 commit intocozystack:mainfrom
mattia-eleuteri:monitoring/per-tenant-log-routing
Open

[monitoring] Add per-tenant log routing support in fluent-bit#2195
mattia-eleuteri wants to merge 1 commit intocozystack:mainfrom
mattia-eleuteri:monitoring/per-tenant-log-routing

Conversation

@mattia-eleuteri
Copy link
Copy Markdown
Contributor

@mattia-eleuteri mattia-eleuteri commented Mar 10, 2026

Summary

  • Adds tenantLogRouting parameter to monitoring-agents fluent-bit config
  • For each configured tenant, generates rewrite_tag filters to match logs by kubernetes_namespace_name and re-tag them as tenant.<namespace>.*
  • Generates per-tenant HTTP OUTPUT blocks to send re-tagged logs to the tenant's own VLogs instance

Problem

Currently, monitoring-agents fluent-bit sends all container logs to vlogs-generic.<target>.svc (typically tenant-root). Tenants that deploy their own monitoring stack get a VLogs instance but it receives no logs.

Workaround: Manual patch of the fluent-bit ConfigMap with rewrite_tag + per-tenant OUTPUT blocks.

Usage

fluent-bit:
  tenantLogRouting:
  - namespace: tenant-example
    vlogs: vlogs-generic.tenant-example.svc

Closes #2194 (partially)

Test plan

  • helm template monitoring-agents with tenantLogRouting populated
  • Verify rewrite_tag filter and HTTP output are generated for each tenant
  • Deploy and verify logs appear in tenant VLogs
Add per-tenant log routing support in fluent-bit monitoring-agents

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Enabled per-tenant log routing so tenant logs can be ingested and managed separately.
    • Per-tenant routes are configurable per-namespace via a monitoring label, allowing tenant-specific targets.
    • Tag rewriting applied per-namespace to route kube logs to tenant-specific ingest endpoints while preserving the global/default routing.

@dosubot dosubot bot added size:S This PR changes 10-29 lines, ignoring generated files. enhancement New feature or request labels Mar 10, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 10, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Changes update Fluent Bit configuration rendered from packages/system/monitoring-agents/values.yaml to retarget several existing outputs from vlogs-generic.*:9428 to vlinsert-generic.*:9481, and add dynamically-generated tenant-specific rewrite_tag filters and per-tenant HTTP OUTPUT blocks based on Namespace labels and deduplicated targets.

Changes

Cohort / File(s) Summary
Fluent Bit config (values.yaml)
packages/system/monitoring-agents/values.yaml
Change default HTTP output host/port from vlogs-generic.{{ .Values.global.target }}.svc:9428 to vlinsert-generic.{{ .Values.global.target }}.svc:9481. Add logic that: (1) queries Namespace resources, extracts namespace.cozystack.io/monitoring labels, deduplicates targets (excluding empty and default), (2) emits per-namespace FILTER rewrite_tag rules to retag kube.* into tenant.{target}.{namespace}.$TAG when monitor target differs from default, and (3) generates per-tenant OUTPUT blocks Match tenant.{target}.* sending JSON to vlinsert-generic.{target}.svc:9481. Existing modify filter adding tenant {{ .Values.global.target }} for Match * remains. Pay attention to templating, label lookup, and escaping in the Helm-rendered config.

Sequence Diagram(s)

sequenceDiagram
    participant Pod as App Pod
    participant FB as Fluent Bit
    participant VInsert as vlinsert-generic (tenant)

    Pod->>FB: emit logs (kube.* tags)
    FB->>FB: FILTER rewrite_tag (per-namespace) -> retag to tenant.{target}.{namespace}.<stream>
    FB->>FB: FILTER modify (Match *) -> add tenant {{ .Values.global.target }}
    FB->>VInsert: OUTPUT Match tenant.{target}.* -> HTTP/JSON POST to vlinsert-generic.{target}.svc:9481
    VInsert-->>FB: 200 OK
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I nibbled namespaces, hopped through tags,

Switched the hosts and fixed the paths,
Retagged the logs, sent each to its nest,
Tenants now listen, each log can rest,
A tiny hop, deployment laughs. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main change: adding per-tenant log routing support in fluent-bit, which is the core objective of this PR.
Linked Issues check ✅ Passed The PR implements objective 1 from issue #2194 by adding per-tenant log routing that directs logs to tenant-specific VLogs instances instead of all logs going to the parent tenant.
Out of Scope Changes check ✅ Passed All changes in this PR are directly within scope: the modifications update fluent-bit routing configuration to implement per-tenant log routing as specified in issue #2194.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/system/monitoring-agents/values.yaml (1)

382-394: Consider adding input validation for tenantLogRouting entries.

If a user provides a tenantLogRouting entry without the required namespace or vlogs fields, the template will generate an invalid fluent-bit config that could cause deployment failures with unhelpful error messages.

🛡️ Suggested validation approach

Add validation at the start of the range loop:

      {{- range .Values.tenantLogRouting }}
+     {{- if not .namespace }}
+     {{- fail "tenantLogRouting entry missing required 'namespace' field" }}
+     {{- end }}
+     {{- if not .vlogs }}
+     {{- fail "tenantLogRouting entry missing required 'vlogs' field" }}
+     {{- end }}
      [OUTPUT]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/system/monitoring-agents/values.yaml` around lines 382 - 394, The
template iterates over .Values.tenantLogRouting without checking required
fields, which can produce invalid fluent-bit OUTPUT blocks when namespace or
vlogs are missing; update the range loop over tenantLogRouting to validate each
entry by testing that .namespace and .vlogs are non-empty (e.g., using
conditional/if checks) and either skip the entry or emit a clear failure/warning
message before rendering the OUTPUT stanza for that item; reference the existing
range over .Values.tenantLogRouting and the fields .namespace and .vlogs when
implementing the checks so only entries with both fields produce the Name http /
Match / Host / header blocks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/system/monitoring-agents/values.yaml`:
- Around line 444-449: The Rule line in the Helm template that builds the
rewrite_tag filter interpolates .Values.tenantLogRouting[].namespace directly
into a regex (the "Rule $kubernetes_namespace_name ^({{ .namespace }})$
tenant.{{ .namespace }}.$TAG false"), which can break when namespaces contain
regex metacharacters; update that template to escape the namespace using Helm's
regexQuoteMeta (or equivalent) when constructing both the capture group and the
replacement token so the generated regex is literal and safe, ensuring you
reference the template block iterating over .Values.tenantLogRouting and the
rewrite_tag Rule to locate and change the interpolation.

---

Nitpick comments:
In `@packages/system/monitoring-agents/values.yaml`:
- Around line 382-394: The template iterates over .Values.tenantLogRouting
without checking required fields, which can produce invalid fluent-bit OUTPUT
blocks when namespace or vlogs are missing; update the range loop over
tenantLogRouting to validate each entry by testing that .namespace and .vlogs
are non-empty (e.g., using conditional/if checks) and either skip the entry or
emit a clear failure/warning message before rendering the OUTPUT stanza for that
item; reference the existing range over .Values.tenantLogRouting and the fields
.namespace and .vlogs when implementing the checks so only entries with both
fields produce the Name http / Match / Host / header blocks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 8ffb4485-2409-4778-bcf4-7be38544caca

📥 Commits

Reviewing files that changed from the base of the PR and between a13481b and 9b044a3.

📒 Files selected for processing (1)
  • packages/system/monitoring-agents/values.yaml

@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses the current limitation where monitoring-agents Fluent Bit sends all container logs to a generic VLogs instance, preventing tenants from receiving their own logs in their dedicated VLogs. By introducing a new tenantLogRouting configuration, it enables per-tenant log routing, allowing logs from specific namespaces to be automatically directed to their respective VLogs instances, thereby enhancing log isolation and management for multi-tenant environments.

Highlights

  • Configuration Parameter: Introduced tenantLogRouting to the monitoring-agents Fluent Bit configuration, allowing specification of tenant namespaces and their corresponding VLogs endpoints.
  • Dynamic Output Generation: Implemented logic to dynamically generate per-tenant HTTP OUTPUT blocks within Fluent Bit, directing logs to each tenant's dedicated VLogs instance based on the new configuration.
  • Log Retagging: Added dynamic generation of rewrite_tag filters that match logs by kubernetes_namespace_name and re-tag them for proper routing to the respective tenant's VLogs.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • packages/system/monitoring-agents/values.yaml
    • Introduced a new tenantLogRouting configuration block under fluent-bit to enable per-tenant log routing.
    • Added a range loop to dynamically generate [OUTPUT] blocks for each configured tenant, directing logs to their specific VLogs instance.
    • Included another range loop to dynamically generate [FILTER] blocks with rewrite_tag rules, matching logs by kubernetes_namespace_name and re-tagging them for tenant-specific routing.
Activity
  • The pull request was created to address the issue of tenants not receiving their logs in their dedicated VLogs instances.
  • Automated summaries were generated by Claude Code and CodeRabbit.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces a new Fluent Bit configuration for per-tenant log routing, allowing logs from specific namespaces to be routed to dedicated VLogs services. This involves dynamically generating HTTP output and rewrite tag filter blocks based on user-defined tenantLogRouting values. However, the current implementation has several security vulnerabilities: the .vlogs parameter, used as an HTTP output host, is user-controlled and creates a Server-Side Request Forgery (SSRF) risk; the .namespace parameter, embedded in a regular expression for tag rewriting, is user-controlled and leads to a Regular Expression Denial of Service (ReDoS) vulnerability; and the .namespace parameter, embedded in glob patterns for log matching, is user-controlled and creates a log routing injection vulnerability. Remediation requires strict validation and escaping of these user-controlled parameters, along with refactoring duplicated output configurations for better maintainability.

{{- if not .namespace }}{{ fail "tenantLogRouting: namespace is required" }}{{- end }}
[OUTPUT]
Name http
Match tenant.{{ .namespace }}.*
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The .vlogs parameter is directly used as the Host for an HTTP output, which is a user-controlled value. This creates a Server-Side Request Forgery (SSRF) vulnerability, allowing an attacker to specify an arbitrary hostname for Fluent Bit to send logs, potentially scanning internal networks or exfiltrating data. Implement strict validation and allow-listing for .vlogs to restrict it to trusted hosts. Additionally, the configuration for this [OUTPUT] block is duplicated, increasing maintenance overhead. Consider refactoring for future maintainability.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a Helm values.yaml template — .vlogs is set by the platform operator during tenant provisioning, not by end users. There is no user-controlled input path here, so SSRF is not applicable.

{{- range .Values.tenantLogRouting }}
[FILTER]
Name rewrite_tag
Match kube.*
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The .namespace parameter is directly embedded into a regular expression within the Rule field of a rewrite_tag filter. This user-controlled value creates a Regular Expression Denial of Service (ReDoS) vulnerability, allowing an attacker to inject malicious regex patterns that could consume excessive CPU and lead to a denial of service. Implement strict validation and escape special characters for .namespace. Additionally, the current regex ^({{ .namespace }})$ uses an unnecessary capturing group; simplifying it could improve clarity and avoid minor overhead.

Rule $kubernetes_namespace_name ^{{ .namespace }}$ tenant.{{ .namespace }}.$TAG false

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — regexQuoteMeta is now applied to escape the namespace value before it's used in the regex pattern.

{{- range .Values.tenantLogRouting }}
{{- if not .namespace }}{{ fail "tenantLogRouting: namespace is required" }}{{- end }}
[OUTPUT]
Name http
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

The .namespace parameter is directly embedded into the Match field of an [OUTPUT] block, which uses glob patterns. An attacker controlling this value could inject glob pattern wildcards (*, ?, []) to broaden log matching criteria.

Impact: This could lead to misrouting of logs, potential information leakage between tenants if a malicious pattern matches other tenants' logs, or a denial of service by causing Fluent Bit to process an excessive amount of logs.

Remediation: Implement strict validation for the .namespace parameter to ensure it only contains valid and expected namespace names. Avoid allowing wildcard characters or other special glob pattern characters in user-supplied namespace values.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Match field uses fluent-bit tag matching, not arbitrary glob. The namespace value is set by the platform during provisioning, not by tenant users, so glob injection is not a concern here.

@mattia-eleuteri mattia-eleuteri force-pushed the monitoring/per-tenant-log-routing branch from 6eb1707 to 70274a7 Compare March 10, 2026 15:57
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/system/monitoring-agents/values.yaml`:
- Around line 382-387: The template validates .namespace but not .vlogs, so
missing vlogs silently defaults Host to localhost; update the tenantLogRouting
loop in values.yaml to check for .vlogs (e.g., add {{- if not .vlogs }}{{ fail
"tenantLogRouting: vlogs is required for namespace <namespace>" }}{{- end }})
before rendering the [OUTPUT] block so rendering fails fast when .vlogs is
omitted; keep the existing Host {{ .vlogs }} usage and include the namespace in
the failure message to aid debugging.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 92ca9944-7a8d-42fc-9f26-3d5af186d017

📥 Commits

Reviewing files that changed from the base of the PR and between 6eb1707 and 70274a7.

📒 Files selected for processing (1)
  • packages/system/monitoring-agents/values.yaml

@mattia-eleuteri mattia-eleuteri force-pushed the monitoring/per-tenant-log-routing branch from 70274a7 to 28a6826 Compare March 11, 2026 09:38
Copy link
Copy Markdown
Member

@kvaps kvaps left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling this! The problem is real — tenant VLogs instances receiving no logs is a gap we need to close.

However, I think the manual tenantLogRouting approach has some drawbacks:

  1. No automation — someone has to manually add entries to values.yaml every time a tenant with monitoring is created/deleted. This is error-prone and doesn't scale.
  2. No sub-tenant support — nested tenants won't get their logs routed automatically.
  3. Diverges from the existing pattern — vmagent already solves the equivalent problem for metrics automatically.

How vmagent does it (for reference)

  • Each tenant namespace gets a label namespace.cozystack.io/monitoring: <monitoring-namespace> (namespace.yaml)
  • The cozystack-values Secret injects _namespace.monitoring into every tenant, pointing to the correct monitoring stack
  • The monitoring-agents HelmRelease for each Kubernetes cluster uses $targetTenant := .Values._namespace.monitoring to route both metrics and logs to the right destination automatically

For child Kubernetes clusters, fluent-bit already sends all logs to vlinsert-generic.{{ $targetTenant }}.svc — so logs already work automatically there. The gap is only on the root cluster, where all namespace logs go to tenant-root without per-tenant splitting.

Suggested approach

Instead of a manual list in monitoring-agents values, consider generating the fluent-bit routing config at the tenant chart level (packages/apps/tenant/templates/), using the same _namespace.monitoring value that vmagent already relies on. This way:

  • Tenant creation automatically sets up log routing
  • Sub-tenant inheritance works out of the box (child tenants inherit _namespace.monitoring from parent)
  • The pattern stays consistent with how metrics collection already works

For reference, PR #1684 added per-tenant vmagent with automatic wiring — the same pattern should be followed for log routing.

@mattia-eleuteri mattia-eleuteri force-pushed the monitoring/per-tenant-log-routing branch from 28a6826 to 29a4a64 Compare March 19, 2026 13:49
@dosubot dosubot bot added size:M This PR changes 30-99 lines, ignoring generated files. and removed size:S This PR changes 10-29 lines, ignoring generated files. labels Mar 19, 2026
@mattia-eleuteri
Copy link
Copy Markdown
Contributor Author

@kvaps Thanks for the detailed review and the pointer to how vmagent handles this!

I've reworked the implementation to follow the exact same automatic pattern. Here's what changed:

Approach: Instead of a manual tenantLogRouting list, fluent-bit config now uses Helm lookup to discover namespaces with the namespace.cozystack.io/monitoring label — the same label that vmagent already relies on. No manual configuration needed.

How it works:

  1. lookup "v1" "Namespace" "" "" lists all namespaces at deploy time
  2. Filters those with namespace.cozystack.io/monitoring pointing to a target different from global.target (tenant-root)
  3. Generates rewrite_tag filters per namespace → routes matching logs to tenant.<target>.<ns>.$TAG
  4. Generates deduplicated HTTP OUTPUTs per unique monitoring target → sends to vlinsert-generic.<target>.svc:9481
  5. Flux reconciliation (every 5min) picks up new tenants automatically

Also fixed: Default outputs now correctly use vlinsert-generic:9481 (VLCluster) instead of vlogs-generic:9428.

Edge cases handled:

  • Tenant without monitoring → label = tenant-root → filtered out → logs go to default
  • Sub-tenant inheriting parent monitoring → label = parent target → routed correctly
  • helm template dry-run → lookup returns empty → no dynamic routing → safe
  • Namespace deleted → lookup no longer finds it → automatic cleanup

Only packages/system/monitoring-agents/values.yaml is modified, no changes needed in the tenant chart since namespace.cozystack.io/monitoring labels are already in place.

@mattia-eleuteri mattia-eleuteri force-pushed the monitoring/per-tenant-log-routing branch from 29a4a64 to da4c3ac Compare March 19, 2026 13:52
Replace manual tenantLogRouting list with automatic discovery using
Helm lookup on namespace.cozystack.io/monitoring labels, matching
the existing vmagent pattern for metrics routing.

- Use lookup "v1" "Namespace" to discover tenant namespaces
- Generate rewrite_tag filters and HTTP outputs dynamically
- Deduplicate targets (multiple namespaces -> one output per target)
- Fix default outputs to use vlinsert-generic:9481 (VLCluster)
- Remove manual tenantLogRouting configuration

Signed-off-by: mattia-eleuteri <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@mattia-eleuteri mattia-eleuteri force-pushed the monitoring/per-tenant-log-routing branch from da4c3ac to 1e8f354 Compare March 19, 2026 13:52
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
packages/system/monitoring-agents/values.yaml (1)

442-459: Consider order of modify and rewrite_tag filters for tenant field accuracy.

The modify filter at line 442-445 adds tenant {{ .Values.global.target }} to all logs (Match *) before the rewrite_tag filters process them. This means logs routed to tenant-specific VLogs instances will still have tenant: cozy-monitoring (or whatever global.target is) rather than the actual tenant target.

If tenants should see their own tenant name in the tenant field, consider either:

  1. Moving the tenant modify filter after rewrite_tag and adjusting the match patterns, or
  2. Adding tenant-specific modify filters after rewrite_tag

This may be intentional if the tenant field is meant for platform-level identification, or could be addressed in the follow-up PR #2196 (metadata redaction).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/system/monitoring-agents/values.yaml` around lines 442 - 459, The
current global modify filter (Name modify that adds tenant {{
.Values.global.target }}) is applied before the rewrite_tag rules, causing
tenant-specific routing to keep the global tenant value; update the config so
tenant is set after tags are rewritten: either move the global "Name modify Add
tenant {{ .Values.global.target }}" block below the rewrite_tag blocks and
narrow its Match to only platform-level logs, or add per-namespace
tenant-specific modify filters immediately after the rewrite_tag Rule (matching
the rewritten tag pattern such as tenant.{{ $monTarget }}.*) and set Add tenant
to the corresponding {{ $monTarget }}; adjust Match patterns accordingly so the
modify that sets tenant runs after the rewrite_tag transformation (refer to the
existing [FILTER] blocks, the Name modify filter and the Name rewrite_tag filter
with Rule $kubernetes_namespace_name ^({{ $ns.metadata.name }})$ tenant.{{
$monTarget }}.{{ $ns.metadata.name }}.$TAG).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/system/monitoring-agents/values.yaml`:
- Around line 347-348: Add a missing ExternalName Service entry for
vlinsert-generic in the monitoring-external-services.yaml template: create an
ExternalName service resource named vlinsert-generic that mirrors the existing
vlogs-generic entry but points to vlinsert-generic.tenant-root.svc.cluster.local
and exposes port 9481 so Fluent Bit outputs (references to vlinsert-generic:9481
in values.yaml) can resolve; update the template where other services are
defined (use the vlogs-generic Service as the pattern) and ensure the same
entries are added for the other occurrences noted (lines near 358-359, 369-370,
392).

---

Nitpick comments:
In `@packages/system/monitoring-agents/values.yaml`:
- Around line 442-459: The current global modify filter (Name modify that adds
tenant {{ .Values.global.target }}) is applied before the rewrite_tag rules,
causing tenant-specific routing to keep the global tenant value; update the
config so tenant is set after tags are rewritten: either move the global "Name
modify Add tenant {{ .Values.global.target }}" block below the rewrite_tag
blocks and narrow its Match to only platform-level logs, or add per-namespace
tenant-specific modify filters immediately after the rewrite_tag Rule (matching
the rewritten tag pattern such as tenant.{{ $monTarget }}.*) and set Add tenant
to the corresponding {{ $monTarget }}; adjust Match patterns accordingly so the
modify that sets tenant runs after the rewrite_tag transformation (refer to the
existing [FILTER] blocks, the Name modify filter and the Name rewrite_tag filter
with Rule $kubernetes_namespace_name ^({{ $ns.metadata.name }})$ tenant.{{
$monTarget }}.{{ $ns.metadata.name }}.$TAG).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5220f786-1b26-476a-87f9-623825339d61

📥 Commits

Reviewing files that changed from the base of the PR and between 28a6826 and 29a4a64.

📒 Files selected for processing (1)
  • packages/system/monitoring-agents/values.yaml

Comment on lines +347 to +348
Host vlinsert-generic.{{ .Values.global.target }}.svc
port 9481
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for vlinsert-generic service definitions in the codebase
echo "=== Searching for vlinsert-generic service definitions ==="
rg -n "vlinsert-generic" --type yaml

echo ""
echo "=== Checking monitoring package templates ==="
fd -e yaml . packages/system/monitoring/templates -x grep -l "vlinsert" {} \;

Repository: cozystack/cozystack

Length of output: 610


🏁 Script executed:

#!/bin/bash
# Search for all external service definitions
echo "=== Checking for external services definitions ==="
rg -n "ExternalName" --type yaml -A 3 | head -50

echo ""
echo "=== Looking for vlogs references ==="
rg -n "vlogs" --type yaml | grep -i service

echo ""
echo "=== Checking monitoring-external-services.yaml ==="
find . -name "monitoring-external-services.yaml" -exec cat {} \;

Repository: cozystack/cozystack

Length of output: 9071


🏁 Script executed:

#!/bin/bash
# Check Victoria Logs configuration and services
echo "=== Searching for victoria logs related configs ==="
fd -name "*.yaml" | xargs grep -l "victoria" | head -20

echo ""
echo "=== Check if vlinsert is mentioned in any documentation ==="
fd -name "*.md" | xargs grep -l "vlinsert" 2>/dev/null || echo "No vlinsert in markdown files"

Repository: cozystack/cozystack

Length of output: 684


🏁 Script executed:

#!/bin/bash
# Search for victoria-logs package
echo "=== Finding victoria-logs related files ==="
find . -type d -name "*victoria*logs*" | head -20

echo ""
echo "=== Searching for any vlinsert mention (case-insensitive) ==="
rg -i "vlinsert" --type yaml

echo ""
echo "=== Check monitoring package structure ==="
ls -la packages/system/monitoring/templates/ | head -30

echo ""
echo "=== Check if monitoring package defines any services ==="
fd -e yaml . packages/system/monitoring/templates -exec basename {} \;

Repository: cozystack/cozystack

Length of output: 2170


🏁 Script executed:

#!/bin/bash
# Check vlogs configuration in monitoring package
echo "=== Content of vlogs.yaml ==="
cat packages/system/monitoring/templates/vlogs/vlogs.yaml

echo ""
echo "=== Content of vlogs grafana-datasource.yaml ==="
cat packages/system/monitoring/templates/vlogs/grafana-datasource.yaml

Repository: cozystack/cozystack

Length of output: 1146


🏁 Script executed:

#!/bin/bash
# Check monitoring-agents deployment/values
echo "=== Check monitoring-agents to understand service discovery ==="
head -100 packages/system/monitoring-agents/values.yaml

echo ""
echo "=== Search for VLogs CRD definitions ==="
rg -i "kind.*vlogs" --type yaml

echo ""
echo "=== Check if VLogs operator creates services automatically ==="
find . -name "*.yaml" -o -name "*.md" | xargs grep -l "VLogs.*service" 2>/dev/null | head -5

Repository: cozystack/cozystack

Length of output: 4352


🏁 Script executed:

#!/bin/bash
# Check VLogs CRD to understand service creation
echo "=== Searching for VLogs CRD service-related fields ==="
rg -A 5 -B 5 "vlinsert\|service" packages/system/victoria-metrics-operator/charts/victoria-metrics-operator/crd.yaml | head -100

echo ""
echo "=== Check if there's any documentation about VLogs services ==="
find . -name "*.md" -o -name "README*" | xargs grep -l "VLogs\|vlinsert" 2>/dev/null | head -5

Repository: cozystack/cozystack

Length of output: 182


🏁 Script executed:

#!/bin/bash
# Look at the full diff context around the changes
echo "=== Check git diff for this file if available ==="
git log --oneline packages/system/monitoring-agents/values.yaml 2>/dev/null | head -5

echo ""
echo "=== Check recent changes context ==="
git show HEAD:packages/system/cozystack-basics/templates/monitoring-external-services.yaml 2>/dev/null || echo "Not available"

Repository: cozystack/cozystack

Length of output: 783


🏁 Script executed:

#!/bin/bash
# Check the actual changes in monitoring-agents/values.yaml
echo "=== Context around lines 340-400 in monitoring-agents/values.yaml ==="
sed -n '340,400p' packages/system/monitoring-agents/values.yaml | cat -n

echo ""
echo "=== Check what the OUTPUT is for ==="
sed -n '330,350p' packages/system/monitoring-agents/values.yaml | cat -n

Repository: cozystack/cozystack

Length of output: 4108


🏁 Script executed:

#!/bin/bash
# Final verification: check if VLogs operator would create vlinsert services
echo "=== Search for any indication that vlinsert services are created ==="
rg -i "vlinsert" . --type yaml

echo ""
echo "=== Check if there's a pattern for how services are named in monitoring ==="
rg -i "port.*948|9428" . --type yaml | grep -v node_modules | head -20

echo ""
echo "=== Verify the monitoring namespace and how external services work ==="
grep -r "cozy-monitoring" packages/system/cozystack-basics/templates/monitoring-external-services.yaml

Repository: cozystack/cozystack

Length of output: 1308


Add vlinsert-generic ExternalName service to monitoring-external-services.yaml.

Fluent Bit outputs (lines 347-348 and others) send logs to vlinsert-generic:9481 (VictoriaLogs insert endpoint), but this service is not defined in packages/system/cozystack-basics/templates/monitoring-external-services.yaml. Without it, logs will fail to be inserted into VictoriaLogs. Add the missing ExternalName service mapping to vlinsert-generic.tenant-root.svc.cluster.local similar to the existing vlogs-generic service.

Also applies to: 358-359, 369-370, 392

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/system/monitoring-agents/values.yaml` around lines 347 - 348, Add a
missing ExternalName Service entry for vlinsert-generic in the
monitoring-external-services.yaml template: create an ExternalName service
resource named vlinsert-generic that mirrors the existing vlogs-generic entry
but points to vlinsert-generic.tenant-root.svc.cluster.local and exposes port
9481 so Fluent Bit outputs (references to vlinsert-generic:9481 in values.yaml)
can resolve; update the template where other services are defined (use the
vlogs-generic Service as the pattern) and ensure the same entries are added for
the other occurrences noted (lines near 358-359, 369-370, 392).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request size:M This PR changes 30-99 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[monitoring] Improve multi-tenant log isolation, dashboard scoping, and VM monitoring

2 participants