feat: docker-aware machine stats #7488
Pull request overview
This PR adds automatic detection of containerized environments (Docker, Kubernetes, Podman) to display accurate resource statistics when running marimo in containers with resource limits. Instead of showing misleading host-level metrics, the footer now displays container-specific memory limits when cgroup constraints are detected, with an updated label to indicate "container memory" vs "computer memory".
Key Changes:
- Added cgroup v1 and v2 detection functions to identify and read container resource limits from Linux cgroups
- Modified the /api/usage endpoint to return container statistics when limits are detected, adding an is_container boolean flag
- Updated the frontend machine stats component to display appropriate memory labels based on container context
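A minimal sketch of how the endpoint might choose between host and container statistics, as the second bullet describes. The function name `select_memory_stats` and the dictionary shapes are assumptions made for illustration, not the PR's actual code; only the `is_container` field name comes from the PR.

```python
from typing import Any, Optional


def select_memory_stats(
    host: dict[str, Any], container: Optional[dict[str, Any]]
) -> dict[str, Any]:
    """Hypothetical helper: prefer container memory stats when a limit was detected."""
    if container and "memory" in container:
        # A cgroup memory limit was found, so report the container's view.
        return {**container["memory"], "is_container": True}
    # No memory limit detected (or CPU-only limits): fall back to host stats.
    return {**host, "is_container": False}
```

Keeping the selection in a pure function like this would make the fallback behaviour easy to test without touching /sys/fs/cgroup.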
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| marimo/_utils/health.py | Added has_cgroup_limits() and get_container_resources() functions to detect and read cgroup v1/v2 resource limits |
| marimo/_server/api/endpoints/health.py | Modified /api/usage endpoint to use container resources when available and include is_container flag |
| tests/_server/api/endpoints/test_health.py | Added assertions to verify is_container field is present and boolean in API response |
| packages/openapi/src/api.ts | Added optional is_container boolean field to memory response type |
| packages/openapi/api.yaml | Added is_container boolean field to memory schema definition |
| frontend/src/components/editor/chrome/wrapper/footer-items/machine-stats.tsx | Updated memory stats tooltip to show "container memory" vs "computer memory" based on is_container flag |
Comments suppressed due to low confidence (11)
marimo/_utils/health.py:204
- File is opened but is not closed.
open("/sys/fs/cgroup/memory.max", encoding="utf-8")
marimo/_utils/health.py:213
- File is opened but is not closed.
open("/sys/fs/cgroup/cpu.max", encoding="utf-8").read().strip()
marimo/_utils/health.py:226
- File is opened but is not closed.
open(
"/sys/fs/cgroup/memory/memory.limit_in_bytes",
encoding="utf-8",
)
marimo/_utils/health.py:238
- File is opened but is not closed.
open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us", encoding="utf-8")
marimo/_utils/health.py:345
- File is opened but is not closed.
open("/sys/fs/cgroup/cpu.max", encoding="utf-8")
marimo/_utils/health.py:366
- File is opened but is not closed.
open(
"/sys/fs/cgroup/cpu/cpu.cfs_quota_us", encoding="utf-8"
)
marimo/_utils/health.py:374
- File is opened but is not closed.
open(
"/sys/fs/cgroup/cpu/cpu.cfs_period_us",
encoding="utf-8",
)
marimo/_utils/health.py:323
- File is opened but is not closed.
open(
"/sys/fs/cgroup/memory/memory.usage_in_bytes",
encoding="utf-8",
)
marimo/_utils/health.py:287
- File is opened but is not closed.
open("/sys/fs/cgroup/memory.max", encoding="utf-8")
marimo/_utils/health.py:315
- File is opened but is not closed.
open(
"/sys/fs/cgroup/memory/memory.limit_in_bytes",
encoding="utf-8",
)
marimo/_utils/health.py:292
- File is opened but is not closed.
open("/sys/fs/cgroup/memory.current", encoding="utf-8")
    quota = int(
        open(
            "/sys/fs/cgroup/cpu/cpu.cfs_quota_us", encoding="utf-8"
        )
        .read()
        .strip()
    )
    period = int(
        open(
            "/sys/fs/cgroup/cpu/cpu.cfs_period_us",
            encoding="utf-8",
        )
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
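A minimal sketch of the context-manager approach this comment recommends. The helper name `read_cgroup_value` is hypothetical and does not appear in the PR; it simply wraps each read in a `with` block so the handle is closed deterministically.

```python
from typing import Optional


def read_cgroup_value(path: str) -> Optional[str]:
    """Return the stripped contents of a cgroup file, or None if it cannot be read."""
    try:
        # The with-block closes the file even if read() raises.
        with open(path, encoding="utf-8") as f:
            return f.read().strip()
    except (FileNotFoundError, PermissionError):
        return None
```

With a helper of this shape, each `open(...).read().strip()` chain in the diff could become a single call, and the missing-file case would be handled in one place.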
    memory_max = (
        open("/sys/fs/cgroup/memory.max", encoding="utf-8")
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle. This issue occurs multiple times throughout both functions where files are opened but never closed.
    cpu_max = (
        open("/sys/fs/cgroup/cpu.max", encoding="utf-8").read().strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
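Separately from the handle leak, the cgroup v2 cpu.max file is documented as a two-value file ("$MAX $PERIOD") whose default is `max 100000`, so comparing the whole line against `"max"` would likely report a CPU limit even when none is set. A hedged sketch of a parser that checks the first field instead; `parse_cpu_max` is a hypothetical name, not part of the PR.

```python
from typing import Optional


def parse_cpu_max(line: str) -> Optional[float]:
    """Return effective cores from a cgroup v2 cpu.max line, or None if unlimited or malformed."""
    parts = line.split()
    # Expect "<quota> <period>"; a quota of "max" means no CPU limit.
    if len(parts) != 2 or parts[0] == "max":
        return None
    try:
        quota, period = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    if quota <= 0 or period <= 0:
        return None
    return quota / period
```

For example, `parse_cpu_max("200000 100000")` yields 2.0 cores, while `parse_cpu_max("max 100000")` yields `None`.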
    limit = int(
        open(
            "/sys/fs/cgroup/memory/memory.limit_in_bytes",
            encoding="utf-8",
        )
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
    def has_cgroup_limits() -> tuple[bool, bool]:
        """
        Check if cgroup resource limits are explicitly set.
        Returns:
            (has_memory_limit, has_cpu_limit): Tuple of booleans indicating
            whether memory and CPU limits are set.
        """
        has_memory = False
        has_cpu = False

        try:
            # Check cgroup v2 (modern containers)
            if os.path.exists("/sys/fs/cgroup/memory.max"):
                memory_max = (
                    open("/sys/fs/cgroup/memory.max", encoding="utf-8")
                    .read()
                    .strip()
                )
                # 'max' means unlimited, any number means limited
                has_memory = memory_max != "max"

            if os.path.exists("/sys/fs/cgroup/cpu.max"):
                cpu_max = (
                    open("/sys/fs/cgroup/cpu.max", encoding="utf-8").read().strip()
                )
                # 'max' means unlimited
                has_cpu = cpu_max != "max"

            # Fallback to cgroup v1 (legacy)
            if not has_memory and os.path.exists(
                "/sys/fs/cgroup/memory/memory.limit_in_bytes"
            ):
                limit = int(
                    open(
                        "/sys/fs/cgroup/memory/memory.limit_in_bytes",
                        encoding="utf-8",
                    )
                    .read()
                    .strip()
                )
                # Very large number (typically > 2^62) indicates unlimited
                # This is the default "unlimited" value in cgroup v1
                has_memory = limit < (1 << 62)

            if not has_cpu and os.path.exists(
                "/sys/fs/cgroup/cpu/cpu.cfs_quota_us"
            ):
                quota = int(
                    open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us", encoding="utf-8")
                    .read()
                    .strip()
                )
                # In cgroup v1, -1 means unlimited
                has_cpu = quota > 0

        except (FileNotFoundError, PermissionError, ValueError) as e:
            LOGGER.debug(f"Error checking cgroup limits: {e}")

        return has_memory, has_cpu


    def get_container_resources() -> Optional[dict[str, Any]]:
        """
        Get container resource limits if running in a resource-restricted container.
        Returns:
            Dictionary with 'memory' and/or 'cpu' keys if limits are set,
            None if not in a container or no limits are configured.
        Example return value:
            {
                'memory': {
                    'total': 2147483648,  # bytes
                    'used': 1073741824,  # bytes
                    'available': 1073741824,  # bytes
                    'percent': 50.0  # percentage
                },
                'cpu': {
                    'quota': 200000,  # microseconds
                    'period': 100000,  # microseconds
                    'cores': 2.0  # effective number of cores
                }
            }
        """
        has_memory_limit, has_cpu_limit = has_cgroup_limits()

        if not (has_memory_limit or has_cpu_limit):
            return None

        resources: dict[str, Any] = {}

        # Get memory stats if limited
        if has_memory_limit:
            try:
                # Try cgroup v2 first
                if os.path.exists("/sys/fs/cgroup/memory.max"):
                    memory_max = (
                        open("/sys/fs/cgroup/memory.max", encoding="utf-8")
                        .read()
                        .strip()
                    )
                    memory_current = (
                        open("/sys/fs/cgroup/memory.current", encoding="utf-8")
                        .read()
                        .strip()
                    )

                    if memory_max != "max":
                        total = int(memory_max)
                        used = int(memory_current)
                        available = total - used
                        percent = (used / total) * 100 if total > 0 else 0

                        resources["memory"] = {
                            "total": total,
                            "used": used,
                            "available": available,
                            "percent": percent,
                        }
                # Fallback to cgroup v1
                elif os.path.exists("/sys/fs/cgroup/memory/memory.limit_in_bytes"):
                    total = int(
                        open(
                            "/sys/fs/cgroup/memory/memory.limit_in_bytes",
                            encoding="utf-8",
                        )
                        .read()
                        .strip()
                    )
                    used = int(
                        open(
                            "/sys/fs/cgroup/memory/memory.usage_in_bytes",
                            encoding="utf-8",
                        )
                        .read()
                        .strip()
                    )
                    available = total - used
                    percent = (used / total) * 100 if total > 0 else 0

                    resources["memory"] = {
                        "total": total,
                        "used": used,
                        "available": available,
                        "percent": percent,
                    }
            except (FileNotFoundError, PermissionError, ValueError) as e:
                LOGGER.debug(f"Error reading container memory stats: {e}")

        # Get CPU stats if limited
        if has_cpu_limit:
            try:
                # cgroup v2
                if os.path.exists("/sys/fs/cgroup/cpu.max"):
                    cpu_max_line = (
                        open("/sys/fs/cgroup/cpu.max", encoding="utf-8")
                        .read()
                        .strip()
                    )
                    if cpu_max_line != "max":
                        parts = cpu_max_line.split()
                        if len(parts) == 2:
                            quota = int(parts[0])
                            period = int(parts[1])
                            cores = quota / period

                            resources["cpu"] = {
                                "quota": quota,
                                "period": period,
                                "cores": cores,
                            }
                # cgroup v1
                elif os.path.exists("/sys/fs/cgroup/cpu/cpu.cfs_quota_us"):
                    quota = int(
                        open(
                            "/sys/fs/cgroup/cpu/cpu.cfs_quota_us", encoding="utf-8"
                        )
                        .read()
                        .strip()
                    )
                    period = int(
                        open(
                            "/sys/fs/cgroup/cpu/cpu.cfs_period_us",
                            encoding="utf-8",
                        )
                        .read()
                        .strip()
                    )
                    if quota > 0:  # -1 means unlimited
                        cores = quota / period
                        resources["cpu"] = {
                            "quota": quota,
                            "period": period,
                            "cores": cores,
                        }
            except (FileNotFoundError, PermissionError, ValueError) as e:
                LOGGER.debug(f"Error reading container CPU stats: {e}")

        return resources if resources else None
Copilot
AI
Dec 15, 2025
The new cgroup detection functions has_cgroup_limits() and get_container_resources() lack unit tests. Given that other utility functions in marimo/_utils/health.py have corresponding tests in tests/_utils/test_health_utils.py, these functions should also have unit tests covering different scenarios (cgroup v1, v2, no limits, file not found, etc.).
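One way such tests could be structured, sketched under the assumption that the detection logic is refactored to take a root directory instead of hard-coding /sys/fs/cgroup. The helper `memory_limit_bytes` and the scenario runner are hypothetical and not part of the PR; the v1 "unlimited" value used below is the one commonly reported on 64-bit hosts.

```python
import tempfile
from pathlib import Path
from typing import Optional

V1_UNLIMITED_THRESHOLD = 1 << 62  # same cutoff the PR uses for cgroup v1


def memory_limit_bytes(root: Path) -> Optional[int]:
    """Hypothetical helper: read the memory limit under `root`, v2 first, then v1."""
    v2 = root / "memory.max"
    v1 = root / "memory" / "memory.limit_in_bytes"
    try:
        if v2.exists():
            raw = v2.read_text(encoding="utf-8").strip()
            return None if raw == "max" else int(raw)
        if v1.exists():
            limit = int(v1.read_text(encoding="utf-8").strip())
            return limit if limit < V1_UNLIMITED_THRESHOLD else None
    except (OSError, ValueError):
        return None
    return None


def _write(root: Path, rel: str, text: str) -> None:
    """Create a fake cgroup file at root/rel with the given contents."""
    target = root / rel
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")


def run_scenarios() -> dict[str, Optional[int]]:
    """Build a fake cgroup tree per scenario and record the detected limit."""
    scenarios = {
        "v2_limited": ("memory.max", "2147483648\n"),
        "v2_unlimited": ("memory.max", "max\n"),
        "v1_limited": ("memory/memory.limit_in_bytes", "1073741824\n"),
        "v1_unlimited": ("memory/memory.limit_in_bytes", "9223372036854771712\n"),
        "garbage": ("memory.max", "not-a-number\n"),
    }
    results: dict[str, Optional[int]] = {}
    for name, (rel, text) in scenarios.items():
        with tempfile.TemporaryDirectory() as d:
            _write(Path(d), rel, text)
            results[name] = memory_limit_bytes(Path(d))
    # An empty directory stands in for a host with no cgroup files at all.
    with tempfile.TemporaryDirectory() as d:
        results["no_files"] = memory_limit_bytes(Path(d))
    return results
```

In a pytest suite each scenario would naturally become its own test using the `tmp_path` fixture; the single runner here just keeps the sketch self-contained.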
    quota = int(
        open("/sys/fs/cgroup/cpu/cpu.cfs_quota_us", encoding="utf-8")
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
    memory_max = (
        open("/sys/fs/cgroup/memory.max", encoding="utf-8")
        .read()
        .strip()
    )
    memory_current = (
        open("/sys/fs/cgroup/memory.current", encoding="utf-8")
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
    total = int(
        open(
            "/sys/fs/cgroup/memory/memory.limit_in_bytes",
            encoding="utf-8",
        )
        .read()
        .strip()
    )
    used = int(
        open(
            "/sys/fs/cgroup/memory/memory.usage_in_bytes",
            encoding="utf-8",
        )
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
    cpu_max_line = (
        open("/sys/fs/cgroup/cpu.max", encoding="utf-8")
        .read()
        .strip()
    )
Copilot
AI
Dec 15, 2025
File handles are not being closed after reading cgroup files, which can lead to resource leaks. Each open() call should use a context manager (with statement) or explicitly close the file handle.
I think it's a great improvement :). The frontend looks good to me.
Curious whether you have tried this solution on the backend: giampaolo/psutil#1011. API docs: https://psutil.readthedocs.io/en/latest/index.html#psutil.PROCFS_PATH
    const isContainer = is_container === true;
    const roundedPercent = Math.round(percent);
    const memoryLabel = isContainer ? "container memory" : "computer memory";
nit
Suggested change:
    - const isContainer = is_container === true;
    - const roundedPercent = Math.round(percent);
    - const memoryLabel = isContainer ? "container memory" : "computer memory";
    + const roundedPercent = Math.round(percent);
    + const memoryLabel = is_container ? "container memory" : "computer memory";
This is a good find, @Light2Dark. @Ddfulton, using https://psutil.readthedocs.io/en/latest/index.html#psutil.PROCFS_PATH should hopefully clean up some code, if you are open to making the change.
📝 Summary
When running marimo in containers with resource limits (e.g., docker run --memory=16g), the machine stats footer shows host resources instead of container resources. This PR auto-detects containerized environments with cgroup limits and displays container resources instead.
Before: Shows host memory (512GB) even in a 16GB container

After: Shows actual container limits with "container memory" label

This is something we've been doing at Carolina Cloud for a while and have found very useful. Not sure whether it belongs in the main source code but figured I'd share to see what the community thinks!
🔍 Description of Changes
Backend:
Frontend:
Compatibility:
📋 Checklist