[Bug]: couch_peruser leaks mem3_cluster and init_changes_handler processes #5871

@NiloCK

Version

CouchDB version: 3.4.3 (dpkg on Ubuntu, single-node)

Describe the problem you're encountering

[Image: memory usage graph]

The memory usage shown above is from an idle server. The drop-off is a restart of CouchDB via systemctl restart couchdb.

Root cause:

init_state/0 in couch_peruser.erl starts a new mem3_cluster process via mem3_cluster:start_link/4 on every cluster_unstable cast and every update_config cast. exit_changes/1 cleans up the change feed handlers but never terminates the previous mem3_cluster process. The old mem3_cluster processes therefore stay alive and keep firing cluster_unstable events, each of which triggers another init_state/0 call, creating a feedback loop of process accumulation.

Evidence:

Erlang remote shell inspection on the running node:

  • process_count climbed from 4,097 to 4,176 in ~2 minutes on an idle system (70 HTTP requests total since last restart)
  • erlang:process_info showed 16,426 couch_peruser:init_changes_handler processes accumulating
  • couch_event_server (the event dispatcher) was the top memory consumer at 8.3 MB, growing as it tracked all registered handlers

Memory timeline (from system monitoring):

  • Post-restart: 3.2% RAM → 3.4% in 2 minutes
  • 39.7% at 6:14 PM → 87%+ at 2:24 AM (about 8 hours, +5.77 percentage points of RAM per hour)
  • Consistent across multiple restarts

_system snapshots (3.6 minutes apart):

  ┌──────────────────────┬──────────────────────────┬──────────────────────────┐
  │ Memory field (bytes) │ Reading 1 (uptime 1593s) │ Reading 2 (uptime 1811s) │
  ├──────────────────────┼──────────────────────────┼──────────────────────────┤
  │ processes            │ 63,622,424               │ 70,445,232 (+6.8 MB)     │
  ├──────────────────────┼──────────────────────────┼──────────────────────────┤
  │ binary               │ 787,440                  │ 903,448                  │
  ├──────────────────────┼──────────────────────────┼──────────────────────────┤
  │ ets                  │ 1,302,568                │ 1,308,112                │
  ├──────────────────────┼──────────────────────────┼──────────────────────────┤
  │ code                 │ 13,955,494               │ 13,955,494               │
  └──────────────────────┴──────────────────────────┴──────────────────────────┘
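
As a rough cross-check, the growth rate of the processes field can be computed directly from the two snapshots. The sketch below uses only the figures in the table. Two things are assumptions rather than facts from this report: the 2 GB of RAM is treated as 2 GiB, and the hourly figure assumes growth stays linear, which has not been established.

```python
# Figures copied from the two _system snapshots in the table above.
uptime_1_s, uptime_2_s = 1593, 1811
processes_1, processes_2 = 63_622_424, 70_445_232

elapsed_s = uptime_2_s - uptime_1_s        # seconds between the two readings
growth_bytes = processes_2 - processes_1   # growth of the "processes" field

bytes_per_second = growth_bytes / elapsed_s
# Linear extrapolation to one hour, in decimal megabytes.
mb_per_hour = bytes_per_second * 3600 / 1e6

# Assumption: "2 GB RAM" is treated as 2 GiB for the percentage estimate.
total_ram_bytes = 2 * 1024**3
percent_of_ram_per_hour = bytes_per_second * 3600 / total_ram_bytes * 100

print(f"elapsed: {elapsed_s} s ({elapsed_s / 60:.1f} min)")
print(f"growth: {growth_bytes:,} bytes")
print(f"rate: {mb_per_hour:.1f} MB/hour, "
      f"about {percent_of_ram_per_hour:.1f}% of RAM per hour")
```

Under those assumptions the extrapolated rate comes out at a little over 5% of RAM per hour, in the same ballpark as the +5.77%/hour seen in system monitoring.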

Workaround:

Disable couch_peruser by setting its enable option to "false" (PUT /_node/_local/_config/couch_peruser/enable with the JSON body "false").
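
For anyone who wants to script the workaround, here is a minimal sketch using only the Python standard library. The function names are illustrative, and the base URL and admin credentials in the usage note are placeholders, not values from this report. Basic auth is assumed; adjust if the server uses cookie or proxy auth.

```python
import base64
import json
import urllib.request


def build_disable_peruser_request(base_url, user, password):
    """Build (but do not send) the PUT that sets couch_peruser/enable to "false"."""
    url = f"{base_url}/_node/_local/_config/couch_peruser/enable"
    # The config endpoint takes the new value as a JSON string, i.e. "false".
    body = json.dumps("false").encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


def disable_peruser(base_url, user, password):
    """Send the request and return (HTTP status, response body as text)."""
    req = build_disable_peruser_request(base_url, user, password)
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.read().decode("utf-8")
```

Calling disable_peruser("http://localhost:5984", "admin", "password") against a local node, with real credentials substituted, would apply the workaround.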

Expected Behaviour

Steady state memory consumption for an idle server.

Steps to Reproduce

Not yet reproduced manually, but roughly:

  • set up a fresh CouchDB 3.4.3 instance
  • enable couch_peruser
  • create some databases and users
  • observe process count and memory on the idle server

Your Environment

  • Single-node CouchDB 3.4.3 (dpkg, Ubuntu 20.04)
  • 2 GB RAM
  • 97 databases (including ~80 userdb-* via couch_peruser)
  • Near-zero traffic
  • No active tasks, smoosh idle, no replication jobs

Additional Context

Related prior work:

PR #3851 attempted a fix by unlinking the old mem3_cluster process, but reviewer @nickva noted it was insufficient: unlinking without killing still leaves the old processes alive and sending events. nickva recommended keeping a single mem3_cluster process for the gen_server's lifetime and restarting only the change feed handlers on state resets.
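
To make the difference between the two designs concrete, here is a toy model in Python. It is not Erlang and not the couch_peruser code. All class and method names are hypothetical stand-ins, and it only counts watcher objects; it does not model event timing or the change feed handlers themselves.

```python
class Watcher:
    """Stand-in for one mem3_cluster process: alive until explicitly stopped."""

    def __init__(self):
        self.alive = True

    def stop(self):
        self.alive = False


class LeakyServer:
    """Models the reported behaviour: every state reset starts a new
    watcher and never stops the previous one."""

    def __init__(self):
        self.spawned = []  # every watcher ever started
        self.reset_state()

    def reset_state(self):
        self.watcher = Watcher()  # the old watcher is dropped but still alive
        self.spawned.append(self.watcher)

    def live_watchers(self):
        return sum(w.alive for w in self.spawned)


class SingleWatcherServer:
    """Models the recommended design: one watcher for the server's
    lifetime; a reset only restarts the change feed handlers."""

    def __init__(self):
        self.watcher = Watcher()
        self.spawned = [self.watcher]
        self.handler_generation = 0

    def reset_state(self):
        # Stand-in for stopping and restarting the change feed handlers.
        self.handler_generation += 1

    def live_watchers(self):
        return sum(w.alive for w in self.spawned)


leaky, single = LeakyServer(), SingleWatcherServer()
for _ in range(5):  # five state resets on each server
    leaky.reset_state()
    single.reset_state()

print("leaky:", leaky.live_watchers())    # 6 (the initial watcher plus one per reset)
print("single:", single.live_watchers())  # 1
```

In the leaky variant the number of live watchers grows with every reset, while the single-watcher variant stays at one regardless of how many resets occur.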
