Server Admin Log

From Wikitech

2025-08-09

  • 01:12 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 11m 23s)
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2025-08-08

  • 01:12 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 12m 11s)
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 00:19 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1047.eqiad.wmnet with OS bullseye
  • 00:03 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1047.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED

2025-08-07

  • 23:38 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1047.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 23:38 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephosd1047
  • 23:37 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephosd1047
  • 23:37 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 23:37 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudcephosd1047 - vriley@cumin1002"
  • 23:37 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudcephosd1047 - vriley@cumin1002"
  • 23:33 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 22:59 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 22:59 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 22:58 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 22:38 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1042.eqiad.wmnet with reason: host reimage
  • 22:34 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1042.eqiad.wmnet with reason: host reimage
  • 22:15 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1016.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 22:15 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 22:09 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 21:16 jgleeson: payments-wiki upgraded from 0ab5bab9 to 0a1084a8
  • 21:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2018.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 21:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1016.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 21:14 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bullseye
  • 21:01 swfrench@deploy1003: Finished scap sync-world: No-op deployment to clear chart version diffs from https://gerrit.wikimedia.org/r/1176543 (duration: 02m 45s)
  • 20:58 swfrench@deploy1003: Started scap sync-world: No-op deployment to clear chart version diffs from https://gerrit.wikimedia.org/r/1176543
  • 20:30 cjming@deploy1003: Finished scap sync-world: Backport for Update PageVisit instruments for a logged-in synth experiment (T397140) (duration: 07m 34s)
  • 20:25 cjming@deploy1003: cjming: Continuing with sync
  • 20:24 cjming@deploy1003: cjming: Backport for Update PageVisit instruments for a logged-in synth experiment (T397140) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:23 cjming@deploy1003: Started scap sync-world: Backport for Update PageVisit instruments for a logged-in synth experiment (T397140)
  • 20:10 cjming@deploy1003: Finished scap sync-world: Backport for XLab/Hooks: Only fetch experiment configs when user is registered (duration: 08m 05s)
  • 20:05 cjming@deploy1003: cjming: Continuing with sync
  • 20:04 cjming@deploy1003: cjming: Backport for XLab/Hooks: Only fetch experiment configs when user is registered synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:02 cjming@deploy1003: Started scap sync-world: Backport for XLab/Hooks: Only fetch experiment configs when user is registered
  • 20:01 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-fe2020.codfw.wmnet with OS bullseye
  • 19:55 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 19:28 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2019.codfw.wmnet with OS bullseye
  • 19:28 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 19:28 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 19:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
  • 19:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
  • 19:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2019.codfw.wmnet with reason: host reimage
  • 18:57 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2019.codfw.wmnet with reason: host reimage
  • 18:50 cjming@deploy1003: mwscript-k8s job started: extensions/MetricsPlatform/maintenance/UpdateConfigs.php --wiki aawiki # Test run for T398422
  • 18:41 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2020.codfw.wmnet with OS bullseye
  • 18:41 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2019.codfw.wmnet with OS bullseye
  • 18:24 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1015.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 18:23 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2022.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 18:16 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 18:07 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2020.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2020.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2019.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:54 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 17:32 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2019.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:29 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2017.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 17:27 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2022.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:26 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1015.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 17:11 bd808@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
  • 17:09 bd808@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
  • 17:09 bd808@deploy1003: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
  • 17:09 bd808@deploy1003: helmfile [codfw] START helmfile.d/services/developer-portal: apply
  • 17:09 bd808@deploy1003: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
  • 17:08 bd808@deploy1003: helmfile [staging] START helmfile.d/services/developer-portal: apply
  • 16:51 dancy@deploy1003: Installation of scap version "4.198.0" completed for 2 hosts
  • 16:49 dancy@deploy1003: Installing scap version "4.198.0" for 2 host(s)
  • 16:26 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host deploy2003.codfw.wmnet with OS bookworm
  • 16:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2018.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:57 zabe@deploy1003: Finished scap sync-world: update interwiki cache (duration: 07m 29s)
  • 15:56 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2018.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2017.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:55 jhancock@cumin1003: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host ms-fe2017.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2017.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 15:50 zabe@deploy1003: Started scap sync-world: update interwiki cache
  • 15:49 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2018.codfw.wmnet with OS bullseye
  • 15:49 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 15:49 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 15:46 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2017.codfw.wmnet with OS bullseye
  • 15:46 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 15:46 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 15:44 zabe@deploy1003: Finished scap sync-world: Backport for Activate tlwikisource (T388639) (duration: 07m 47s)
  • 15:38 zabe@deploy1003: zabe: Continuing with sync
  • 15:38 zabe@deploy1003: zabe: Backport for Activate tlwikisource (T388639) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:36 zabe@deploy1003: Started scap sync-world: Backport for Activate tlwikisource (T388639)
  • 15:33 zabe: Create Wikisource Tagalog # T388639
  • 15:31 zabe@deploy1003: Finished scap sync-world: Backport for Initial configuration for tlwikisource (T388639) (duration: 07m 45s)
  • 15:31 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2018.codfw.wmnet with reason: host reimage
  • 15:28 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2017.codfw.wmnet with reason: host reimage
  • 15:26 zabe@deploy1003: zabe: Continuing with sync
  • 15:26 zabe@deploy1003: zabe: Backport for Initial configuration for tlwikisource (T388639) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 15:24 zabe@deploy1003: Started scap sync-world: Backport for Initial configuration for tlwikisource (T388639)
  • 15:23 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2018.codfw.wmnet with reason: host reimage
  • 15:23 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2017.codfw.wmnet with reason: host reimage
  • 15:14 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1263.eqiad.wmnet with OS bookworm
  • 15:14 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:13 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:10 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1262.eqiad.wmnet with OS bookworm
  • 15:10 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:08 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:08 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2018.codfw.wmnet with OS bullseye
  • 15:08 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2017.codfw.wmnet with OS bullseye
  • 15:07 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host deploy2003.codfw.wmnet with OS bookworm
  • 15:07 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['deploy2003']
  • 15:06 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['deploy2003']
  • 15:05 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2019.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:05 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1261.eqiad.wmnet with OS bookworm
  • 15:05 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:04 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:02 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2019.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:02 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db1260.eqiad.wmnet with OS bookworm
  • 15:02 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 15:01 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 14:59 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe2019.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2020.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2018.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe2017.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host deploy2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2020.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2019.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:55 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1263.eqiad.wmnet with reason: host reimage
  • 14:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2018.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe2017.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:54 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host deploy2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 14:53 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe2020
  • 14:53 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe2019
  • 14:53 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe2018
  • 14:53 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-fe2017
  • 14:53 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host deploy2003
  • 14:53 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe2020
  • 14:53 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe2019
  • 14:53 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe2018
  • 14:53 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-fe2017
  • 14:53 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host deploy2003
  • 14:52 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:52 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-fe2020 to codfw - jhancock@cumin1003"
  • 14:52 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-fe2020 to codfw - jhancock@cumin1003"
  • 14:52 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1262.eqiad.wmnet with reason: host reimage
  • 14:49 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 14:48 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1261.eqiad.wmnet with reason: host reimage
  • 14:44 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1260.eqiad.wmnet with reason: host reimage
  • 14:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1263.eqiad.wmnet with reason: host reimage
  • 14:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1262.eqiad.wmnet with reason: host reimage
  • 14:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1261.eqiad.wmnet with reason: host reimage
  • 14:41 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on db1260.eqiad.wmnet with reason: host reimage
  • 14:25 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1263.eqiad.wmnet with OS bookworm
  • 14:25 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1262.eqiad.wmnet with OS bookworm
  • 14:25 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1261.eqiad.wmnet with OS bookworm
  • 14:24 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1260.eqiad.wmnet with OS bookworm
  • 14:24 zabe@deploy1003: Finished scap sync-world: Backport for Do not create a database table when a different provider is used (T397348), Do not create a database table when a different provider is used (T397348) (duration: 07m 54s)
  • 14:18 zabe@deploy1003: zabe: Continuing with sync
  • 14:18 zabe@deploy1003: zabe: Backport for Do not create a database table when a different provider is used (T397348), Do not create a database table when a different provider is used (T397348) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:16 zabe@deploy1003: Started scap sync-world: Backport for Do not create a database table when a different provider is used (T397348), Do not create a database table when a different provider is used (T397348)
  • 14:02 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1263.eqiad.wmnet with OS bookworm
  • 14:02 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db1262.eqiad.wmnet with OS bookworm
  • 14:01 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 13:49 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2248.codfw.wmnet with OS bookworm
  • 13:49 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 13:47 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 13:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 13:41 mvernon@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 13:30 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2248.codfw.wmnet with reason: host reimage
  • 13:29 mvernon@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 13:28 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:26 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2248.codfw.wmnet with reason: host reimage
  • 13:16 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Enable wgParserEnableUserLanguage for incubatorwiki (duration: 09m 37s)
  • 13:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1263.eqiad.wmnet with OS bookworm
  • 13:11 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1262.eqiad.wmnet with OS bookworm
  • 13:11 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Continuing with sync
  • 13:09 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2248.codfw.wmnet with OS bookworm
  • 13:09 lucaswerkmeister-wmde@deploy1003: jhsoby, lucaswerkmeister-wmde: Backport for Enable wgParserEnableUserLanguage for incubatorwiki synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:07 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2248.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:07 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Enable wgParserEnableUserLanguage for incubatorwiki
  • 13:03 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1262.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 13:02 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1263.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:55 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2247.codfw.wmnet with OS bookworm
  • 12:55 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 12:55 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2248.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:54 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 12:54 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2246.codfw.wmnet with OS bookworm
  • 12:54 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 12:51 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 12:49 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 12:47 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 12:47 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T399728)', diff saved to https://phabricator.wikimedia.org/P80972 and previous config saved to /var/cache/conftool/dbconfig/20250807-124728-fceratto.json
  • 12:46 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1260.eqiad.wmnet with OS bookworm
  • 12:46 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host db1261.eqiad.wmnet with OS bookworm
  • 12:45 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2245.codfw.wmnet with OS bookworm
  • 12:45 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 12:45 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 12:41 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1261.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:41 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db1260.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:37 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1263.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:37 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2247.codfw.wmnet with reason: host reimage
  • 12:36 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1263.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:32 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P80970 and previous config saved to /var/cache/conftool/dbconfig/20250807-123220-fceratto.json
  • 12:32 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2246.codfw.wmnet with reason: host reimage
  • 12:28 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2245.codfw.wmnet with reason: host reimage
  • 12:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1262.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:27 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1263.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:24 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2247.codfw.wmnet with reason: host reimage
  • 12:24 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2246.codfw.wmnet with reason: host reimage
  • 12:24 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2245.codfw.wmnet with reason: host reimage
  • 12:23 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1262.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:23 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db1263.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:17 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P80969 and previous config saved to /var/cache/conftool/dbconfig/20250807-121712-fceratto.json
  • 12:15 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1262.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:15 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1263.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:14 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1261.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:14 jclark@cumin1002: START - Cookbook sre.hosts.provision for host db1260.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:13 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:13 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for db1260-3 - jclark@cumin1002"
  • 12:13 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for db1260-3 - jclark@cumin1002"
  • 12:08 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2247.codfw.wmnet with OS bookworm
  • 12:07 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2246.codfw.wmnet with OS bookworm
  • 12:07 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2245.codfw.wmnet with OS bookworm
  • 12:05 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 12:02 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T399728)', diff saved to https://phabricator.wikimedia.org/P80967 and previous config saved to /var/cache/conftool/dbconfig/20250807-120205-fceratto.json
  • 11:56 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T399728)', diff saved to https://phabricator.wikimedia.org/P80966 and previous config saved to /var/cache/conftool/dbconfig/20250807-115646-fceratto.json
  • 11:56 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:56 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 11:56 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T399728)', diff saved to https://phabricator.wikimedia.org/P80965 and previous config saved to /var/cache/conftool/dbconfig/20250807-115606-fceratto.json
  • 11:40 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P80964 and previous config saved to /var/cache/conftool/dbconfig/20250807-114058-fceratto.json
  • 11:25 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P80963 and previous config saved to /var/cache/conftool/dbconfig/20250807-112551-fceratto.json
  • 11:21 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 11:19 claime: deploy1003:~# lvextend -L+30G /dev/vg0/srv
  • 11:10 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T399728)', diff saved to https://phabricator.wikimedia.org/P80961 and previous config saved to /var/cache/conftool/dbconfig/20250807-111043-fceratto.json
  • 11:05 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T399728)', diff saved to https://phabricator.wikimedia.org/P80960 and previous config saved to /var/cache/conftool/dbconfig/20250807-110549-fceratto.json
  • 11:05 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 11:05 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T399728)', diff saved to https://phabricator.wikimedia.org/P80959 and previous config saved to /var/cache/conftool/dbconfig/20250807-110527-fceratto.json
  • 10:50 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P80958 and previous config saved to /var/cache/conftool/dbconfig/20250807-105019-fceratto.json
  • 10:35 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P80957 and previous config saved to /var/cache/conftool/dbconfig/20250807-103512-fceratto.json
  • 10:24 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1045.eqiad.wmnet with OS bullseye
  • 10:20 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T399728)', diff saved to https://phabricator.wikimedia.org/P80956 and previous config saved to /var/cache/conftool/dbconfig/20250807-102004-fceratto.json
  • 10:16 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 10:15 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T399728)', diff saved to https://phabricator.wikimedia.org/P80955 and previous config saved to /var/cache/conftool/dbconfig/20250807-101515-fceratto.json
  • 10:15 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 10:14 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T399728)', diff saved to https://phabricator.wikimedia.org/P80954 and previous config saved to /var/cache/conftool/dbconfig/20250807-101452-fceratto.json
  • 09:59 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P80953 and previous config saved to /var/cache/conftool/dbconfig/20250807-095945-fceratto.json
  • 09:44 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P80952 and previous config saved to /var/cache/conftool/dbconfig/20250807-094437-fceratto.json
  • 09:29 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T399728)', diff saved to https://phabricator.wikimedia.org/P80951 and previous config saved to /var/cache/conftool/dbconfig/20250807-092930-fceratto.json
  • 09:24 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T399728)', diff saved to https://phabricator.wikimedia.org/P80950 and previous config saved to /var/cache/conftool/dbconfig/20250807-092433-fceratto.json
  • 09:24 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 09:24 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T399728)', diff saved to https://phabricator.wikimedia.org/P80949 and previous config saved to /var/cache/conftool/dbconfig/20250807-092410-fceratto.json
  • 09:09 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@417d4e8] (releasing): T400645 (duration: 00m 31s)
  • 09:09 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P80948 and previous config saved to /var/cache/conftool/dbconfig/20250807-090903-fceratto.json
  • 09:08 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@417d4e8] (releasing): T400645
  • 09:04 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1045.eqiad.wmnet with OS bullseye
  • 09:03 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 08:53 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P80947 and previous config saved to /var/cache/conftool/dbconfig/20250807-085355-fceratto.json
  • 08:51 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 08:45 brouberol@cumin1003: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling restart_daemons on A:datahubsearch
  • 08:43 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 08:41 hashar@deploy1003: rebuilt and synchronized wikiversions files: group2 to 1.45.0-wmf.13 refs T396374
  • 08:38 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T399728)', diff saved to https://phabricator.wikimedia.org/P80946 and previous config saved to /var/cache/conftool/dbconfig/20250807-083848-fceratto.json
  • 08:37 brouberol@cumin1003: START - Cookbook sre.opensearch.roll-restart-reboot rolling restart_daemons on A:datahubsearch
  • 08:33 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T399728)', diff saved to https://phabricator.wikimedia.org/P80945 and previous config saved to /var/cache/conftool/dbconfig/20250807-083348-fceratto.json
  • 08:33 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 08:33 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T399728)', diff saved to https://phabricator.wikimedia.org/P80944 and previous config saved to /var/cache/conftool/dbconfig/20250807-083325-fceratto.json
  • 08:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ms-be[1061-1063].eqiad.wmnet
  • 08:25 mvernon@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:25 mvernon@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[1061-1063].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin1003"
  • 08:20 mvernon@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ms-be[1061-1063].eqiad.wmnet decommissioned, removing all IPs except the asset tag one - mvernon@cumin1003"
  • 08:18 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P80943 and previous config saved to /var/cache/conftool/dbconfig/20250807-081818-fceratto.json
  • 08:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 08:17 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-main: apply
  • 08:15 mvernon@cumin1003: START - Cookbook sre.dns.netbox
  • 08:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-wmde: apply
  • 08:14 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-wmde: apply
  • 08:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-test-k8s: apply
  • 08:13 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-test-k8s: apply
  • 08:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-search: apply
  • 08:11 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-search: apply
  • 08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-research: apply
  • 08:09 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-research: apply
  • 08:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-platform-eng: apply
  • 08:08 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-platform-eng: apply
  • 08:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-analytics-product: apply
  • 08:06 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-analytics-product: apply
  • 08:04 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-analytics-test: apply
  • 08:04 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-analytics-test: apply
  • 08:03 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P80942 and previous config saved to /var/cache/conftool/dbconfig/20250807-080311-fceratto.json
  • 08:02 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1045.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 08:02 mvernon@cumin1003: START - Cookbook sre.hosts.decommission for hosts ms-be[1061-1063].eqiad.wmnet
  • 08:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-dev: apply
  • 08:01 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-dev: apply
  • 07:59 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephosd1045
  • 07:59 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephosd1045
  • 07:58 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 07:58 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudcephosd1045 - vriley@cumin1002"
  • 07:57 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudcephosd1045 - vriley@cumin1002"
  • 07:54 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 07:51 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-ml: apply
  • 07:51 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-ml: apply
  • 07:48 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T399728)', diff saved to https://phabricator.wikimedia.org/P80941 and previous config saved to /var/cache/conftool/dbconfig/20250807-074803-fceratto.json
  • 07:43 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T399728)', diff saved to https://phabricator.wikimedia.org/P80940 and previous config saved to /var/cache/conftool/dbconfig/20250807-074306-fceratto.json
  • 07:43 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 07:39 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 06:34 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1015.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 06:33 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 05:36 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2021.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 05:36 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1015.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 05:30 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 05:17 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 05:16 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 04:28 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 04:07 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.restart (exit_code=0)
  • 01:39 ejegg: fundraising civicrm upgraded from e591fe72 to ebb98a9e
  • 01:12 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 11m 24s)
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2025-08-06

  • 22:09 ryankemper@cumin2002: END (FAIL) - Cookbook sre.wdqs.restart (exit_code=99)
  • 21:57 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1014.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 21:50 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2015.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 21:15 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 21:14 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.restart (exit_code=97)
  • 21:14 ryankemper@cumin2002: START - Cookbook sre.wdqs.restart
  • 20:59 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1014.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 20:53 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2015.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 20:43 dancy@deploy1003: Installation of scap version "4.197.1" completed for 1 hosts
  • 20:42 dancy@deploy1003: Installing scap version "4.197.1" for 1 host(s)
  • 20:38 dancy@deploy1003: Installing scap version "4.197.1" for 169 host(s)
  • 20:09 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2247.codfw.wmnet with OS bookworm
  • 20:08 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2246.codfw.wmnet with OS bookworm
  • 20:07 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host db2245.codfw.wmnet with OS bookworm
  • 19:54 papaul: maintenance going on on msw1-eqiad
  • 19:30 swfrench@deploy1003: Finished scap sync-world: No-op deployment to configure PHP 8.3 image builds - T399884 (duration: 19m 22s)
  • 19:11 swfrench@deploy1003: Started scap sync-world: No-op deployment to configure PHP 8.3 image builds - T399884
  • 18:55 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2247.codfw.wmnet with OS bookworm
  • 18:55 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2246.codfw.wmnet with OS bookworm
  • 18:54 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host db2245.codfw.wmnet with OS bookworm
  • 18:52 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['db2245']
  • 18:52 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['db2245']
  • 18:17 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1013.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 18:13 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2014.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 18:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2245.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:56 swfrench@deploy1003: Finished scap sync-world: Deployment to pick up new 8.1.33-1-s3 production images - T383047 (duration: 45m 10s)
  • 17:44 swfrench@deploy1003: swfrench: Continuing with sync
  • 17:32 swfrench@deploy1003: swfrench: Deployment to pick up new 8.1.33-1-s3 production images - T383047 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:30 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2245.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 17:20 amastilovic@deploy1003: Finished deploy [analytics/refinery@2178dda] (thin): Updates to sqoop THIN [analytics/refinery@2178dda8] (duration: 01m 08s)
  • 17:19 amastilovic@deploy1003: Started deploy [analytics/refinery@2178dda] (thin): Updates to sqoop THIN [analytics/refinery@2178dda8]
  • 17:19 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1013.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 17:19 amastilovic@deploy1003: Finished deploy [analytics/refinery@2178dda]: Updates to sqoop [analytics/refinery@2178dda8] (duration: 02m 29s)
  • 17:17 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2014.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:16 amastilovic@deploy1003: Started deploy [analytics/refinery@2178dda]: Updates to sqoop [analytics/refinery@2178dda8]
  • 17:15 amastilovic@deploy1003: Finished deploy [analytics/refinery@2178dda] (hadoop-test): Updates to sqoop TEST [analytics/refinery@2178dda8] (duration: 00m 53s)
  • 17:15 amastilovic@deploy1003: Started deploy [analytics/refinery@2178dda] (hadoop-test): Updates to sqoop TEST [analytics/refinery@2178dda8]
  • 17:11 swfrench@deploy1003: Started scap sync-world: Deployment to pick up new 8.1.33-1-s3 production images - T383047
  • 17:10 swfrench-wmf: built and published php8.1 production image stack at 8.1.33-1-s3 - T383047
  • 17:03 swfrench-wmf: reprepro include php8.1_8.1.33-1+wmf11u2 in component/php81 - T383047
  • 16:49 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 16:48 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T399728)', diff saved to https://phabricator.wikimedia.org/P80935 and previous config saved to /var/cache/conftool/dbconfig/20250806-164846-fceratto.json
  • 16:37 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 16:34 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 16:33 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P80934 and previous config saved to /var/cache/conftool/dbconfig/20250806-163338-fceratto.json
  • 16:33 cjming@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:32 cjming@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 16:18 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P80931 and previous config saved to /var/cache/conftool/dbconfig/20250806-161831-fceratto.json
  • 16:03 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1251 (T399728)', diff saved to https://phabricator.wikimedia.org/P80930 and previous config saved to /var/cache/conftool/dbconfig/20250806-160323-fceratto.json
  • 15:59 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1251 (T399728)', diff saved to https://phabricator.wikimedia.org/P80929 and previous config saved to /var/cache/conftool/dbconfig/20250806-155939-fceratto.json
  • 15:59 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1251.eqiad.wmnet with reason: Maintenance
  • 15:57 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 15:55 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 15:55 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T399728)', diff saved to https://phabricator.wikimedia.org/P80928 and previous config saved to /var/cache/conftool/dbconfig/20250806-155540-fceratto.json
  • 15:48 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1012.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 15:48 dancy@deploy1003: Installation of scap version "4.197.0" completed for 169 hosts
  • 15:46 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2013.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 15:42 dancy@deploy1003: Installing scap version "4.197.0" for 169 host(s)
  • 15:40 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P80927 and previous config saved to /var/cache/conftool/dbconfig/20250806-154032-fceratto.json
  • 15:25 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P80926 and previous config saved to /var/cache/conftool/dbconfig/20250806-152524-fceratto.json
  • 15:10 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1235 (T399728)', diff saved to https://phabricator.wikimedia.org/P80925 and previous config saved to /var/cache/conftool/dbconfig/20250806-151017-fceratto.json
  • 15:06 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1235 (T399728)', diff saved to https://phabricator.wikimedia.org/P80924 and previous config saved to /var/cache/conftool/dbconfig/20250806-150631-fceratto.json
  • 15:06 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1235.eqiad.wmnet with reason: Maintenance
  • 15:06 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T399728)', diff saved to https://phabricator.wikimedia.org/P80923 and previous config saved to /var/cache/conftool/dbconfig/20250806-150609-fceratto.json
  • 14:51 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P80922 and previous config saved to /var/cache/conftool/dbconfig/20250806-145101-fceratto.json
  • 14:50 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1012.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 14:50 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2013.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 14:35 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P80921 and previous config saved to /var/cache/conftool/dbconfig/20250806-143554-fceratto.json
  • 14:20 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1234 (T399728)', diff saved to https://phabricator.wikimedia.org/P80920 and previous config saved to /var/cache/conftool/dbconfig/20250806-142046-fceratto.json
  • 14:19 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1011.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 14:18 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2012.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 14:17 jgiannelos@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
  • 14:17 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1234 (T399728)', diff saved to https://phabricator.wikimedia.org/P80919 and previous config saved to /var/cache/conftool/dbconfig/20250806-141701-fceratto.json
  • 14:16 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1234.eqiad.wmnet with reason: Maintenance
  • 14:16 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T399728)', diff saved to https://phabricator.wikimedia.org/P80918 and previous config saved to /var/cache/conftool/dbconfig/20250806-141638-fceratto.json
  • 14:16 jgiannelos@deploy1003: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
  • 14:15 jgiannelos@deploy1003: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
  • 14:14 jgiannelos@deploy1003: helmfile [codfw] START helmfile.d/services/mobileapps: apply
  • 14:14 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:14 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 14:14 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 14:14 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:14 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:13 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:13 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:12 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:08 gengh@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
  • 14:07 gengh@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
  • 14:07 gengh@deploy1003: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
  • 14:07 gengh@deploy1003: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
  • 14:06 gengh@deploy1003: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
  • 14:05 gengh@deploy1003: helmfile [staging] START helmfile.d/services/wikifunctions: apply
  • 14:01 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P80917 and previous config saved to /var/cache/conftool/dbconfig/20250806-140130-fceratto.json
  • 13:50 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov2007.codfw.wmnet with OS bookworm
  • 13:50 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 13:49 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin1003"
  • 13:46 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P80916 and previous config saved to /var/cache/conftool/dbconfig/20250806-134623-fceratto.json
  • 13:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2088.codfw.wmnet with OS bullseye
  • 13:31 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1232 (T399728)', diff saved to https://phabricator.wikimedia.org/P80915 and previous config saved to /var/cache/conftool/dbconfig/20250806-133115-fceratto.json
  • 13:29 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1091.eqiad.wmnet with OS bullseye
  • 13:27 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1232 (T399728)', diff saved to https://phabricator.wikimedia.org/P80914 and previous config saved to /var/cache/conftool/dbconfig/20250806-132725-fceratto.json
  • 13:27 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1232.eqiad.wmnet with reason: Maintenance
  • 13:27 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T399728)', diff saved to https://phabricator.wikimedia.org/P80913 and previous config saved to /var/cache/conftool/dbconfig/20250806-132703-fceratto.json
  • 13:25 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov2007.codfw.wmnet with reason: host reimage
  • 13:21 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov2007.codfw.wmnet with reason: host reimage
  • 13:20 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs1011.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 13:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 13:18 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2012.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 13:14 Lucas_WMDE: UTC afternoon backport+config window done
  • 13:14 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 13:11 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P80912 and previous config saved to /var/cache/conftool/dbconfig/20250806-131155-fceratto.json
  • 13:11 Reedy: ran `foreachwiki extensions/Nuke/maintenance/normalizeNukeTags.php` T381598
  • 13:11 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2088.codfw.wmnet with reason: host reimage
  • 13:08 mvernon@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1091.eqiad.wmnet with reason: host reimage
  • 13:05 brouberol: committing new homer config to add dse-k8s-worker101[5-9] to the bgp groups
  • 13:05 reedy@deploy1003: Finished scap sync-world: Backport for Add maintenance script to recapitalize 'Nuke' tags (T381598) (duration: 08m 18s)
  • 13:04 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dbprov2007.codfw.wmnet with OS bookworm
  • 12:59 reedy@deploy1003: chlod, reedy: Continuing with sync
  • 12:59 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:58 reedy@deploy1003: chlod, reedy: Backport for Add maintenance script to recapitalize 'Nuke' tags (T381598) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:58 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2248.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:57 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2088.codfw.wmnet with OS bullseye
  • 12:56 reedy@deploy1003: Started scap sync-world: Backport for Add maintenance script to recapitalize 'Nuke' tags (T381598)
  • 12:56 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P80911 and previous config saved to /var/cache/conftool/dbconfig/20250806-125648-fceratto.json
  • 12:56 mvernon@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1091.eqiad.wmnet with OS bullseye
  • 12:55 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2247.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host db2246.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:49 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 12:42 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic: apply
  • 12:41 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1219 (T399728)', diff saved to https://phabricator.wikimedia.org/P80910 and previous config saved to /var/cache/conftool/dbconfig/20250806-124140-fceratto.json
  • 12:41 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic: apply
  • 12:39 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host db2245.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:37 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1219 (T399728)', diff saved to https://phabricator.wikimedia.org/P80909 and previous config saved to /var/cache/conftool/dbconfig/20250806-123751-fceratto.json
  • 12:37 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2248.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:37 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1219.eqiad.wmnet with reason: Maintenance
  • 12:37 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T399728)', diff saved to https://phabricator.wikimedia.org/P80908 and previous config saved to /var/cache/conftool/dbconfig/20250806-123738-fceratto.json
  • 12:37 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2247.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:36 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2246.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:36 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host db2245.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 12:35 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2248
  • 12:35 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2247
  • 12:35 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2246
  • 12:35 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host db2245
  • 12:35 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2248
  • 12:35 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2247
  • 12:35 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2246
  • 12:35 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host db2245
  • 12:35 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 12:35 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2245 to codfw - jhancock@cumin1003"
  • 12:35 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding db2245 to codfw - jhancock@cumin1003"
  • 12:35 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:33 Reedy: run namespaceDupes.php on thwiki T401287
  • 12:32 reedy@deploy1003: Finished scap sync-world: Backport for thwiki: add WT namespace alias (T401287) (duration: 08m 56s)
  • 12:31 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 12:29 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 12:26 reedy@deploy1003: chlod, reedy: Continuing with sync
  • 12:25 reedy@deploy1003: chlod, reedy: Backport for thwiki: add WT namespace alias (T401287) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:23 reedy@deploy1003: Started scap sync-world: Backport for thwiki: add WT namespace alias (T401287)
  • 12:22 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P80907 and previous config saved to /var/cache/conftool/dbconfig/20250806-122231-fceratto.json
  • 12:18 btullis@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 12:17 btullis@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 12:13 btullis@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 12:11 btullis@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 12:07 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P80906 and previous config saved to /var/cache/conftool/dbconfig/20250806-120723-fceratto.json
  • 11:52 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1218 (T399728)', diff saved to https://phabricator.wikimedia.org/P80905 and previous config saved to /var/cache/conftool/dbconfig/20250806-115216-fceratto.json
  • 11:44 btullis@deploy1003: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: sync
  • 11:44 btullis@deploy1003: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: sync
  • 11:36 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1218 (T399728)', diff saved to https://phabricator.wikimedia.org/P80904 and previous config saved to /var/cache/conftool/dbconfig/20250806-113633-fceratto.json
  • 11:36 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1218.eqiad.wmnet with reason: Maintenance
  • 11:36 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T399728)', diff saved to https://phabricator.wikimedia.org/P80903 and previous config saved to /var/cache/conftool/dbconfig/20250806-113609-fceratto.json
  • 11:21 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P80902 and previous config saved to /var/cache/conftool/dbconfig/20250806-112102-fceratto.json
  • 11:15 btullis@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 11:14 btullis@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 11:09 btullis@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 11:05 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P80901 and previous config saved to /var/cache/conftool/dbconfig/20250806-110555-fceratto.json
  • 11:00 btullis@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 10:58 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:58 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 10:50 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1206 (T399728)', diff saved to https://phabricator.wikimedia.org/P80900 and previous config saved to /var/cache/conftool/dbconfig/20250806-105047-fceratto.json
  • 10:48 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1206 (T399728)', diff saved to https://phabricator.wikimedia.org/P80899 and previous config saved to /var/cache/conftool/dbconfig/20250806-104805-fceratto.json
  • 10:47 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1206.eqiad.wmnet with reason: Maintenance
  • 10:47 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T399728)', diff saved to https://phabricator.wikimedia.org/P80898 and previous config saved to /var/cache/conftool/dbconfig/20250806-104743-fceratto.json
  • 10:32 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P80897 and previous config saved to /var/cache/conftool/dbconfig/20250806-103235-fceratto.json
  • 10:17 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P80896 and previous config saved to /var/cache/conftool/dbconfig/20250806-101728-fceratto.json
  • 10:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
  • 10:02 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1196 (T399728)', diff saved to https://phabricator.wikimedia.org/P80895 and previous config saved to /var/cache/conftool/dbconfig/20250806-100220-fceratto.json
  • 10:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
  • 09:58 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1196 (T399728)', diff saved to https://phabricator.wikimedia.org/P80894 and previous config saved to /var/cache/conftool/dbconfig/20250806-095839-fceratto.json
  • 09:58 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 09:58 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1196.eqiad.wmnet with reason: Maintenance
  • 09:58 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T399728)', diff saved to https://phabricator.wikimedia.org/P80893 and previous config saved to /var/cache/conftool/dbconfig/20250806-095758-fceratto.json
  • 09:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
  • 09:54 hashar@deploy1003: Finished scap sync-world: Backport for ExperimentManager: Fix #getExperiment() when uninitialized (T401294) (duration: 08m 20s)
  • 09:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
  • 09:48 hashar@deploy1003: hashar: Continuing with sync
  • 09:47 hashar@deploy1003: hashar: Backport for ExperimentManager: Fix #getExperiment() when uninitialized (T401294) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:45 hashar@deploy1003: Started scap sync-world: Backport for ExperimentManager: Fix #getExperiment() when uninitialized (T401294)
  • 09:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
  • 09:42 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P80892 and previous config saved to /var/cache/conftool/dbconfig/20250806-094250-fceratto.json
  • 09:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
  • 09:38 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
  • 09:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
  • 09:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
  • 09:34 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
  • 09:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
  • 09:33 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
  • 09:31 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
  • 09:27 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P80891 and previous config saved to /var/cache/conftool/dbconfig/20250806-092743-fceratto.json
  • 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
  • 09:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
  • 09:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
  • 09:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
  • 09:15 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
  • 09:12 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1195 (T399728)', diff saved to https://phabricator.wikimedia.org/P80890 and previous config saved to /var/cache/conftool/dbconfig/20250806-091235-fceratto.json
  • 09:12 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
  • 09:08 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1195 (T399728)', diff saved to https://phabricator.wikimedia.org/P80889 and previous config saved to /var/cache/conftool/dbconfig/20250806-090856-fceratto.json
  • 09:08 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1195.eqiad.wmnet with reason: Maintenance
  • 09:08 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T399728)', diff saved to https://phabricator.wikimedia.org/P80888 and previous config saved to /var/cache/conftool/dbconfig/20250806-090833-fceratto.json
  • 09:07 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 09:07 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
  • 08:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
  • 08:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from snapshot1016 to dse-k8s-worker1019
  • 08:54 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1019
  • 08:53 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P80887 and previous config saved to /var/cache/conftool/dbconfig/20250806-085326-fceratto.json
  • 08:52 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1019
  • 08:52 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-worker1019 on all recursors
  • 08:52 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-worker1019 on all recursors
  • 08:52 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:52 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1016 to dse-k8s-worker1019 - btullis@cumin1003"
  • 08:52 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1016 to dse-k8s-worker1019 - btullis@cumin1003"
  • 08:48 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 08:45 btullis@cumin1003: START - Cookbook sre.hosts.rename from snapshot1016 to dse-k8s-worker1019
  • 08:39 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.rename (exit_code=93) from snapshot1016 to dse-k8s-worker1019
  • 08:38 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P80886 and previous config saved to /var/cache/conftool/dbconfig/20250806-083818-fceratto.json
  • 08:34 btullis@cumin1003: START - Cookbook sre.hosts.rename from snapshot1016 to dse-k8s-worker1019
  • 08:25 hashar@deploy1003: rebuilt and synchronized wikiversions files: group1 to 1.45.0-wmf.13 refs T396374
  • 08:23 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1186 (T399728)', diff saved to https://phabricator.wikimedia.org/P80885 and previous config saved to /var/cache/conftool/dbconfig/20250806-082311-fceratto.json
  • 08:19 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1186 (T399728)', diff saved to https://phabricator.wikimedia.org/P80884 and previous config saved to /var/cache/conftool/dbconfig/20250806-081929-fceratto.json
  • 08:19 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1186.eqiad.wmnet with reason: Maintenance
  • 08:19 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T399728)', diff saved to https://phabricator.wikimedia.org/P80883 and previous config saved to /var/cache/conftool/dbconfig/20250806-081906-fceratto.json
  • 08:16 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 08:15 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 08:08 reedy@deploy1003: Finished scap sync-world: Backport for thwiki: enable WikiLove (T401279) (duration: 07m 46s)
  • 08:03 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P80882 and previous config saved to /var/cache/conftool/dbconfig/20250806-080359-fceratto.json
  • 08:03 reedy@deploy1003: reedy, chlod: Continuing with sync
  • 08:03 reedy@deploy1003: reedy, chlod: Backport for thwiki: enable WikiLove (T401279) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 08:01 reedy@deploy1003: Started scap sync-world: Backport for thwiki: enable WikiLove (T401279)
  • 07:48 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P80881 and previous config saved to /var/cache/conftool/dbconfig/20250806-074851-fceratto.json
  • 07:45 Reedy: created wikilove tables on thwiki T401279
  • 07:33 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1184 (T399728)', diff saved to https://phabricator.wikimedia.org/P80880 and previous config saved to /var/cache/conftool/dbconfig/20250806-073343-fceratto.json
  • 07:24 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1184 (T399728)', diff saved to https://phabricator.wikimedia.org/P80879 and previous config saved to /var/cache/conftool/dbconfig/20250806-072448-fceratto.json
  • 07:24 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1184.eqiad.wmnet with reason: Maintenance
  • 07:24 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T399728)', diff saved to https://phabricator.wikimedia.org/P80878 and previous config saved to /var/cache/conftool/dbconfig/20250806-072425-fceratto.json
  • 07:13 kartik@deploy1003: Finished scap sync-world: Backport for Enable the Contribute menu in 9th group of Wikipedias (T397122) (duration: 09m 37s)
  • 07:09 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P80877 and previous config saved to /var/cache/conftool/dbconfig/20250806-070918-fceratto.json
  • 07:08 kartik@deploy1003: kartik: Continuing with sync
  • 07:05 kartik@deploy1003: kartik: Backport for Enable the Contribute menu in 9th group of Wikipedias (T397122) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 07:03 kartik@deploy1003: Started scap sync-world: Backport for Enable the Contribute menu in 9th group of Wikipedias (T397122)
  • 06:54 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P80876 and previous config saved to /var/cache/conftool/dbconfig/20250806-065410-fceratto.json
  • 06:39 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1169 (T399728)', diff saved to https://phabricator.wikimedia.org/P80875 and previous config saved to /var/cache/conftool/dbconfig/20250806-063903-fceratto.json
  • 06:35 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1169 (T399728)', diff saved to https://phabricator.wikimedia.org/P80874 and previous config saved to /var/cache/conftool/dbconfig/20250806-063521-fceratto.json
  • 06:35 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1169.eqiad.wmnet with reason: Maintenance
  • 03:05 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host sretest2003.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
  • 00:29 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 00:29 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T399728)', diff saved to https://phabricator.wikimedia.org/P80873 and previous config saved to /var/cache/conftool/dbconfig/20250806-002921-fceratto.json
  • 00:14 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P80872 and previous config saved to /var/cache/conftool/dbconfig/20250806-001413-fceratto.json

2025-08-05

  • 23:59 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P80871 and previous config saved to /var/cache/conftool/dbconfig/20250805-235905-fceratto.json
  • 23:55 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2011.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 23:44 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1226 (T399728)', diff saved to https://phabricator.wikimedia.org/P80870 and previous config saved to /var/cache/conftool/dbconfig/20250805-234358-fceratto.json
  • 23:39 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1226 (T399728)', diff saved to https://phabricator.wikimedia.org/P80869 and previous config saved to /var/cache/conftool/dbconfig/20250805-233907-fceratto.json
  • 23:39 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1226.eqiad.wmnet with reason: Maintenance
  • 23:38 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T399728)', diff saved to https://phabricator.wikimedia.org/P80868 and previous config saved to /var/cache/conftool/dbconfig/20250805-233843-fceratto.json
  • 23:23 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P80867 and previous config saved to /var/cache/conftool/dbconfig/20250805-232336-fceratto.json
  • 23:08 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P80866 and previous config saved to /var/cache/conftool/dbconfig/20250805-230828-fceratto.json
  • 22:55 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2011.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 22:53 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1214 (T399728)', diff saved to https://phabricator.wikimedia.org/P80865 and previous config saved to /var/cache/conftool/dbconfig/20250805-225320-fceratto.json
  • 22:52 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=wdqs-scholarly,name=eqiad
  • 22:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mpic-next: apply
  • 22:48 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1214 (T399728)', diff saved to https://phabricator.wikimedia.org/P80864 and previous config saved to /var/cache/conftool/dbconfig/20250805-224824-fceratto.json
  • 22:48 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1214.eqiad.wmnet with reason: Maintenance
  • 22:48 sfaci@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mpic-next: apply
  • 22:48 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T399728)', diff saved to https://phabricator.wikimedia.org/P80863 and previous config saved to /var/cache/conftool/dbconfig/20250805-224801-fceratto.json
  • 22:32 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P80862 and previous config saved to /var/cache/conftool/dbconfig/20250805-223253-fceratto.json
  • 22:18 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dbprov1007.eqiad.wmnet with OS bookworm
  • 22:18 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:17 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209', diff saved to https://phabricator.wikimedia.org/P80861 and previous config saved to /var/cache/conftool/dbconfig/20250805-221746-fceratto.json
  • 22:17 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1002"
  • 22:05 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 22:02 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1209 (T399728)', diff saved to https://phabricator.wikimedia.org/P80860 and previous config saved to /var/cache/conftool/dbconfig/20250805-220238-fceratto.json
  • 21:59 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dbprov1007.eqiad.wmnet with reason: host reimage
  • 21:57 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1209 (T399728)', diff saved to https://phabricator.wikimedia.org/P80859 and previous config saved to /var/cache/conftool/dbconfig/20250805-215738-fceratto.json
  • 21:57 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1209.eqiad.wmnet with reason: Maintenance
  • 21:57 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80858 and previous config saved to /var/cache/conftool/dbconfig/20250805-215715-fceratto.json
  • 21:55 jclark@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on dbprov1007.eqiad.wmnet with reason: host reimage
  • 21:46 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2010.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 21:42 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P80857 and previous config saved to /var/cache/conftool/dbconfig/20250805-214208-fceratto.json
  • 21:41 jgleeson: SmashPig upgraded from a7e897ec to 83293ee1
  • 21:37 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1007.eqiad.wmnet with OS bookworm
  • 21:31 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 21:31 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 21:31 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 21:31 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 21:31 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 21:31 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 21:31 jclark@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov1007.eqiad.wmnet with OS bookworm
  • 21:27 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P80856 and previous config saved to /var/cache/conftool/dbconfig/20250805-212701-fceratto.json
  • 21:25 brett@dns1004: END - running authdns-update
  • 21:25 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 21:25 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 21:24 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 21:24 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 21:24 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 21:24 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 21:23 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=wdqs-scholarly,name=eqiad
  • 21:22 brett@dns1004: START - running authdns-update
  • 21:18 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 21:17 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 21:17 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer scholarly_articles from wdqs1023.eqiad.wmnet -> wdqs1024.eqiad.wmnet w/ force delete existing files, repooling both afterwards
  • 21:14 ryankemper@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-reload (exit_code=97) reloading scholarly_articles on wdqs1024.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250714/ using stat1009.eqiad.wmnet)
  • 21:11 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80855 and previous config saved to /var/cache/conftool/dbconfig/20250805-211153-fceratto.json
  • 21:07 jclark@cumin1002: START - Cookbook sre.hosts.reimage for host dbprov1007.eqiad.wmnet with OS bookworm
  • 21:06 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1203 (T399728)', diff saved to https://phabricator.wikimedia.org/P80854 and previous config saved to /var/cache/conftool/dbconfig/20250805-210649-fceratto.json
  • 21:06 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1203.eqiad.wmnet with reason: Maintenance
  • 21:06 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80853 and previous config saved to /var/cache/conftool/dbconfig/20250805-210627-fceratto.json
  • 20:51 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P80852 and previous config saved to /var/cache/conftool/dbconfig/20250805-205119-fceratto.json
  • 20:46 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2010.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 20:40 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2007.codfw.wmnet with OS bookworm
  • 20:36 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P80851 and previous config saved to /var/cache/conftool/dbconfig/20250805-203612-fceratto.json
  • 20:35 ebernhardson: starting cluster mutation test on relforge*.eqiad.wmnet servers
  • 20:21 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80850 and previous config saved to /var/cache/conftool/dbconfig/20250805-202104-fceratto.json
  • 20:20 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2008.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 20:19 jclark@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:16 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1192 (T399728)', diff saved to https://phabricator.wikimedia.org/P80849 and previous config saved to /var/cache/conftool/dbconfig/20250805-201601-fceratto.json
  • 20:15 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1192.eqiad.wmnet with reason: Maintenance
  • 20:15 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80848 and previous config saved to /var/cache/conftool/dbconfig/20250805-201539-fceratto.json
  • 20:00 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P80847 and previous config saved to /var/cache/conftool/dbconfig/20250805-200031-fceratto.json
  • 19:49 jclark@cumin1002: START - Cookbook sre.hosts.provision for host dbprov1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:49 mutante: [gitlab2002:~] $ sudo systemctl start wmf_auto_restart_ssh-gitlab T401191
  • 19:47 jclark@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 19:47 jclark@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for dbprov1007 - jclark@cumin1002"
  • 19:47 jclark@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update dns for dbprov1007 - jclark@cumin1002"
  • 19:45 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P80846 and previous config saved to /var/cache/conftool/dbconfig/20250805-194524-fceratto.json
  • 19:39 jclark@cumin1002: START - Cookbook sre.dns.netbox
  • 19:30 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80845 and previous config saved to /var/cache/conftool/dbconfig/20250805-193016-fceratto.json
  • 19:24 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1178 (T399728)', diff saved to https://phabricator.wikimedia.org/P80844 and previous config saved to /var/cache/conftool/dbconfig/20250805-192410-fceratto.json
  • 19:24 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1178.eqiad.wmnet with reason: Maintenance
  • 19:23 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T399728)', diff saved to https://phabricator.wikimedia.org/P80843 and previous config saved to /var/cache/conftool/dbconfig/20250805-192347-fceratto.json
  • 19:22 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs2007.codfw.wmnet -> wdqs2008.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 19:08 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P80842 and previous config saved to /var/cache/conftool/dbconfig/20250805-190840-fceratto.json
  • 19:04 rzl: rzl@deploy1003:/srv/deployment-charts$ sudo git restore helmfile.d/dse-k8s-services/airflow-ml/values-production.yaml # discarding local changes to unblock the minutely git pull
  • 19:01 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dbprov2007.codfw.wmnet with OS bookworm
  • 19:01 krinkle@deploy1003: Finished scap sync-world: Backport for Profiler: Remove support for php-tideways_xhprof (T401152) (duration: 14m 54s)
  • 18:55 krinkle@deploy1003: krinkle: Continuing with sync
  • 18:54 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:53 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P80841 and previous config saved to /var/cache/conftool/dbconfig/20250805-185332-fceratto.json
  • 18:50 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:50 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dbprov2007
  • 18:50 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dbprov2007
  • 18:50 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 18:48 krinkle@deploy1003: krinkle: Backport for Profiler: Remove support for php-tideways_xhprof (T401152) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:47 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 18:46 krinkle@deploy1003: Started scap sync-world: Backport for Profiler: Remove support for php-tideways_xhprof (T401152)
  • 18:39 dancy@deploy1003: Finished scap sync-world: testing T398875 (duration: 02m 54s)
  • 18:38 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1177 (T399728)', diff saved to https://phabricator.wikimedia.org/P80840 and previous config saved to /var/cache/conftool/dbconfig/20250805-183824-fceratto.json
  • 18:37 dancy@deploy1003: Started scap sync-world: testing T398875
  • 18:35 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 10m 39s)
  • 18:35 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 18:35 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 18:35 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 18:35 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 18:35 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 18:35 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 18:33 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1177 (T399728)', diff saved to https://phabricator.wikimedia.org/P80839 and previous config saved to /var/cache/conftool/dbconfig/20250805-183319-fceratto.json
  • 18:33 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1177.eqiad.wmnet with reason: Maintenance
  • 18:32 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T399728)', diff saved to https://phabricator.wikimedia.org/P80838 and previous config saved to /var/cache/conftool/dbconfig/20250805-183256-fceratto.json
  • 18:27 bking@cumin2002: END (PASS) - Cookbook sre.wdqs.data-transfer (exit_code=0) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 18:25 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 18:18 dancy@deploy1003: Installation of scap version "4.196.0" completed for 2 hosts
  • 18:17 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P80837 and previous config saved to /var/cache/conftool/dbconfig/20250805-181749-fceratto.json
  • 18:16 dancy@deploy1003: Installing scap version "4.196.0" for 2 host(s)
  • 18:02 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P80836 and previous config saved to /var/cache/conftool/dbconfig/20250805-180241-fceratto.json
  • 17:47 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1172 (T399728)', diff saved to https://phabricator.wikimedia.org/P80835 and previous config saved to /var/cache/conftool/dbconfig/20250805-174734-fceratto.json
  • 17:42 ryankemper@cumin2002: START - Cookbook sre.wdqs.data-reload reloading scholarly_articles on wdqs1024.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/scholarly/20250714/ using stat1009.eqiad.wmnet)
  • 17:42 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1172 (T399728)', diff saved to https://phabricator.wikimedia.org/P80834 and previous config saved to /var/cache/conftool/dbconfig/20250805-174219-fceratto.json
  • 17:42 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1172.eqiad.wmnet with reason: Maintenance
  • 17:38 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 17:38 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T399728)', diff saved to https://phabricator.wikimedia.org/P80833 and previous config saved to /var/cache/conftool/dbconfig/20250805-173835-fceratto.json
  • 17:37 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:37 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:37 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:37 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:37 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:37 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:35 krinkle@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
  • 17:33 krinkle@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
  • 17:28 swfrench@deploy1003: Finished scap sync-world: Migrate debug and cli images to xhprof - T401152 (duration: 22m 02s)
  • 17:27 swfrench@deploy1003: swfrench: Continuing with sync
  • 17:23 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P80832 and previous config saved to /var/cache/conftool/dbconfig/20250805-172327-fceratto.json
  • 17:19 bblack@cumin1002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jobo out of all services on: 2395 hosts
  • 17:16 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:15 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:15 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:14 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:14 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:14 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:14 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:14 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:14 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:14 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:14 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:14 bking@cumin2002: END (FAIL) - Cookbook sre.wdqs.data-transfer (exit_code=99) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:14 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:14 bking@cumin2002: END (ERROR) - Cookbook sre.wdqs.data-transfer (exit_code=97) (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 17:12 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:12 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:12 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:11 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:11 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:11 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:10 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:10 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:10 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:10 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:10 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:10 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:09 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:09 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:09 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:09 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:08 swfrench@deploy1003: swfrench: Migrate debug and cli images to xhprof - T401152 synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:08 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:08 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P80831 and previous config saved to /var/cache/conftool/dbconfig/20250805-170820-fceratto.json
  • 17:08 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 17:08 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 17:07 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 17:07 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 17:07 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 17:07 swfrench@deploy1003: Started scap sync-world: Migrate debug and cli images to xhprof - T401152
  • 17:05 krinkle@deploy1003: Finished scap sync-world: Backport for Profiler: Add php-xhprof support besides php-tideways_xhprof (T401152) (duration: 11m 15s)
  • 17:02 bking@cumin2002: START - Cookbook sre.wdqs.data-transfer (T386098, transfer newly-reloaded data) xfer wikidata_main from wdqs1022.eqiad.wmnet -> wdqs2007.codfw.wmnet w/ force delete existing files, repooling both afterwards
  • 16:59 krinkle@deploy1003: krinkle: Continuing with sync
  • 16:56 bblack@cumin1002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jobo out of all services on: 2395 hosts
  • 16:55 krinkle@deploy1003: krinkle: Backport for Profiler: Add php-xhprof support besides php-tideways_xhprof (T401152) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:53 krinkle@deploy1003: Started scap sync-world: Backport for Profiler: Add php-xhprof support besides php-tideways_xhprof (T401152)
  • 16:53 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1167 (T399728)', diff saved to https://phabricator.wikimedia.org/P80830 and previous config saved to /var/cache/conftool/dbconfig/20250805-165312-fceratto.json
  • 16:49 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1167 (T399728)', diff saved to https://phabricator.wikimedia.org/P80829 and previous config saved to /var/cache/conftool/dbconfig/20250805-164902-fceratto.json
  • 16:48 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 16:48 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1167.eqiad.wmnet with reason: Maintenance
  • 16:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from snapshot1015 to dse-k8s-worker1018
  • 16:47 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
  • 16:45 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
  • 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-worker1018 on all recursors
  • 16:45 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-worker1018 on all recursors
  • 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1015 to dse-k8s-worker1018 - btullis@cumin1003"
  • 16:44 bblack@cumin1002: DONE (PASS) - Cookbook sre.idm.logout (exit_code=0) Logging Jobo out of all services on: 2396 hosts
  • 16:40 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1015 to dse-k8s-worker1018 - btullis@cumin1003"
  • 16:34 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:34 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:34 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 16:33 btullis@cumin1003: START - Cookbook sre.hosts.rename from snapshot1015 to dse-k8s-worker1018
  • 16:32 jgiannelos@deploy1003: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
  • 16:32 jgiannelos@deploy1003: helmfile [staging] START helmfile.d/services/mobileapps: apply
  • 16:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from snapshot1013 to dse-k8s-worker1017
  • 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
  • 16:27 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 16:25 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
  • 16:25 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-worker1017 on all recursors
  • 16:25 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-worker1017 on all recursors
  • 16:25 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:25 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1013 to dse-k8s-worker1017 - btullis@cumin1003"
  • 16:25 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1013 to dse-k8s-worker1017 - btullis@cumin1003"
  • 16:15 mszabo@deploy1003: Finished scap sync-world: Backport for UserInfoCard: Fix UA exclusion in stream config (duration: 11m 34s)
  • 16:10 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 16:10 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T399728)', diff saved to https://phabricator.wikimedia.org/P80828 and previous config saved to /var/cache/conftool/dbconfig/20250805-161038-fceratto.json
  • 16:09 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 16:09 btullis@cumin1003: START - Cookbook sre.hosts.rename from snapshot1013 to dse-k8s-worker1017
  • 16:08 mszabo@deploy1003: mszabo: Continuing with sync
  • 16:07 mszabo@deploy1003: mszabo: Backport for UserInfoCard: Fix UA exclusion in stream config synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 16:04 mszabo@deploy1003: Started scap sync-world: Backport for UserInfoCard: Fix UA exclusion in stream config
  • 16:01 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 15:55 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P80827 and previous config saved to /var/cache/conftool/dbconfig/20250805-155530-fceratto.json
  • 15:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from snapshot1012 to dse-k8s-worker1016
  • 15:48 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1016
  • 15:40 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P80826 and previous config saved to /var/cache/conftool/dbconfig/20250805-154023-fceratto.json
  • 15:36 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1016
  • 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-worker1016 on all recursors
  • 15:36 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-worker1016 on all recursors
  • 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1012 to dse-k8s-worker1016 - btullis@cumin1003"
  • 15:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1012 to dse-k8s-worker1016 - btullis@cumin1003"
  • 15:27 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 15:27 btullis@cumin1003: START - Cookbook sre.hosts.rename from snapshot1012 to dse-k8s-worker1016
  • 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from snapshot1011 to dse-k8s-worker1015
  • 15:26 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1015
  • 15:25 brennen@deploy1003: Finished deploy [phabricator/deployment@7b907e8]: deploy phab1004 for T401213 (duration: 00m 40s)
  • 15:25 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T399728)', diff saved to and previous config saved to /var/cache/conftool/dbconfig/20250805-152515-fceratto.json
  • 15:25 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
  • 15:25 btullis@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) dse-k8s-worker1015 on all recursors
  • 15:25 btullis@cumin1003: START - Cookbook sre.dns.wipe-cache dse-k8s-worker1015 on all recursors
  • 15:25 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:25 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1011 to dse-k8s-worker1015 - btullis@cumin1003"
  • 15:24 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming snapshot1011 to dse-k8s-worker1015 - btullis@cumin1003"
  • 15:24 brennen@deploy1003: Started deploy [phabricator/deployment@7b907e8]: deploy phab1004 for T401213
  • 15:22 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1253 (T399728)', diff saved to https://phabricator.wikimedia.org/P80825 and previous config saved to /var/cache/conftool/dbconfig/20250805-152232-fceratto.json
  • 15:22 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1253.eqiad.wmnet with reason: Maintenance
  • 15:22 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T399728)', diff saved to https://phabricator.wikimedia.org/P80824 and previous config saved to /var/cache/conftool/dbconfig/20250805-152208-fceratto.json
  • 15:22 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:21 brennen@deploy1003: Finished deploy [phabricator/deployment@7b907e8]: deploy phab2002 for T401213 (duration: 00m 41s)
  • 15:20 brennen@deploy1003: Started deploy [phabricator/deployment@7b907e8]: deploy phab2002 for T401213
  • 15:19 sukhe@cumin1003: START - Cookbook sre.dns.netbox
  • 15:18 sukhe@dns1004: END - running authdns-update
  • 15:17 sukhe@dns1004: START - running authdns-update
  • 15:17 dzahn@cumin2002: DONE (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 0:30:00 on phab.wmfusercontent.org with reason: version upgrade
  • 15:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: phab deploy
  • 15:14 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: phab deploy
  • 15:11 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 15:10 jhancock@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 15:07 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P80823 and previous config saved to /var/cache/conftool/dbconfig/20250805-150701-fceratto.json
  • 14:57 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 14:51 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P80822 and previous config saved to /var/cache/conftool/dbconfig/20250805-145153-fceratto.json
  • 14:49 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 14:49 jhancock@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
  • 14:49 jhancock@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2007 to codfw - jhancock@cumin1003"
  • 14:47 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2007 to codfw - jhancock@cumin1003"
  • 14:44 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 14:43 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 14:42 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 14:39 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 14:37 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 14:36 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T399728)', diff saved to https://phabricator.wikimedia.org/P80821 and previous config saved to /var/cache/conftool/dbconfig/20250805-143646-fceratto.json
  • 14:34 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T399728)', diff saved to https://phabricator.wikimedia.org/P80820 and previous config saved to /var/cache/conftool/dbconfig/20250805-143359-fceratto.json
  • 14:33 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 14:33 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T399728)', diff saved to https://phabricator.wikimedia.org/P80819 and previous config saved to /var/cache/conftool/dbconfig/20250805-143336-fceratto.json
  • 14:26 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:24 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 14:18 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P80818 and previous config saved to /var/cache/conftool/dbconfig/20250805-141829-fceratto.json
  • 14:18 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 14:17 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 14:17 mszabo@deploy1003: Finished scap sync-world: Backport for UserInfoCard: Cap maximum count for thanks given/received (T398354) (duration: 36m 20s)
  • 14:17 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 14:16 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 14:16 cgoubert@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:15 cgoubert@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:15 cgoubert@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:14 cgoubert@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 14:14 cgoubert@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
  • 14:13 cgoubert@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
  • 14:13 cgoubert@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:12 cgoubert@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
  • 14:12 cgoubert@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
  • 14:09 cgoubert@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
  • 14:09 cgoubert@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:08 cgoubert@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
  • 14:06 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 14:06 cgoubert@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
  • 14:06 btullis@cumin1003: START - Cookbook sre.hosts.rename from snapshot1011 to dse-k8s-worker1015
  • 14:05 mszabo@deploy1003: mszabo: Continuing with sync
  • 14:04 cgoubert@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
  • 14:03 cgoubert@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
  • 14:03 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P80817 and previous config saved to /var/cache/conftool/dbconfig/20250805-140321-fceratto.json
  • 14:02 mszabo@deploy1003: mszabo: Backport for UserInfoCard: Cap maximum count for thanks given/received (T398354) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:01 cgoubert@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
  • 13:50 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 13:48 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T399728)', diff saved to https://phabricator.wikimedia.org/P80816 and previous config saved to /var/cache/conftool/dbconfig/20250805-134814-fceratto.json
  • 13:45 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T399728)', diff saved to https://phabricator.wikimedia.org/P80815 and previous config saved to /var/cache/conftool/dbconfig/20250805-134539-fceratto.json
  • 13:45 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 13:45 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T399728)', diff saved to https://phabricator.wikimedia.org/P80814 and previous config saved to /var/cache/conftool/dbconfig/20250805-134515-fceratto.json
  • 13:45 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 13:41 mszabo@deploy1003: Started scap sync-world: Backport for UserInfoCard: Cap maximum count for thanks given/received (T398354)
  • 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2331.codfw.wmnet with OS bookworm
  • 13:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
  • 13:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
  • 13:30 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P80813 and previous config saved to /var/cache/conftool/dbconfig/20250805-133007-fceratto.json
  • 13:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2331.codfw.wmnet with reason: host reimage
  • 13:20 jdrewniak@deploy1003: Synchronized portals: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 01m 45s)
  • 13:18 jdrewniak@deploy1003: Synchronized portals/wikipedia.org/assets: Wikimedia Portals Update: Bumping portals to master (T128546) (duration: 07m 07s)
  • 13:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2331.codfw.wmnet with reason: host reimage
  • 13:15 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P80812 and previous config saved to /var/cache/conftool/dbconfig/20250805-131500-fceratto.json
  • 13:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2331.codfw.wmnet with OS bookworm
  • 13:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 13:02 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 13:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 12:59 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T399728)', diff saved to https://phabricator.wikimedia.org/P80811 and previous config saved to /var/cache/conftool/dbconfig/20250805-125952-fceratto.json
  • 12:57 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T399728)', diff saved to https://phabricator.wikimedia.org/P80810 and previous config saved to /var/cache/conftool/dbconfig/20250805-125719-fceratto.json
  • 12:57 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 12:56 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T399728)', diff saved to https://phabricator.wikimedia.org/P80809 and previous config saved to /var/cache/conftool/dbconfig/20250805-125655-fceratto.json
  • 12:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker2331.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
  • 12:41 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P80807 and previous config saved to /var/cache/conftool/dbconfig/20250805-124147-fceratto.json
  • 12:35 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:35 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
  • 12:26 gkyziridis@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 12:26 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P80806 and previous config saved to /var/cache/conftool/dbconfig/20250805-122640-fceratto.json
  • 12:26 gkyziridis@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
  • 12:11 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T399728)', diff saved to https://phabricator.wikimedia.org/P80805 and previous config saved to /var/cache/conftool/dbconfig/20250805-121132-fceratto.json
  • 12:08 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T399728)', diff saved to https://phabricator.wikimedia.org/P80803 and previous config saved to /var/cache/conftool/dbconfig/20250805-120857-fceratto.json
  • 12:08 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 12:08 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T399728)', diff saved to https://phabricator.wikimedia.org/P80802 and previous config saved to /var/cache/conftool/dbconfig/20250805-120835-fceratto.json
  • 11:53 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P80801 and previous config saved to /var/cache/conftool/dbconfig/20250805-115327-fceratto.json
  • 11:38 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P80800 and previous config saved to /var/cache/conftool/dbconfig/20250805-113820-fceratto.json
  • 11:23 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T399728)', diff saved to https://phabricator.wikimedia.org/P80799 and previous config saved to /var/cache/conftool/dbconfig/20250805-112312-fceratto.json
  • 11:20 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T399728)', diff saved to https://phabricator.wikimedia.org/P80798 and previous config saved to /var/cache/conftool/dbconfig/20250805-112036-fceratto.json
  • 11:20 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 11:20 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T399728)', diff saved to https://phabricator.wikimedia.org/P80797 and previous config saved to /var/cache/conftool/dbconfig/20250805-112014-fceratto.json
  • 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts dumpsdata1003.eqiad.wmnet
  • 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dumpsdata1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
  • 11:14 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: dumpsdata1003.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
  • 11:05 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P80796 and previous config saved to /var/cache/conftool/dbconfig/20250805-110506-fceratto.json
  • 10:56 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@62138e1] (releasing): T401180 (duration: 00m 32s)
  • 10:56 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@62138e1] (releasing): T401180
  • 10:55 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 10:50 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts dumpsdata1003.eqiad.wmnet
  • 10:49 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P80795 and previous config saved to /var/cache/conftool/dbconfig/20250805-104959-fceratto.json
  • 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts snapshot1010.eqiad.wmnet
  • 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
  • 10:47 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: snapshot1010.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
  • 10:39 xSavitar: Ran fixStuckGlobalRename.php for T400862
  • 10:36 xSavitar: Ran fixStuckGlobalRename.php for T400974
  • 10:34 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T399728)', diff saved to https://phabricator.wikimedia.org/P80794 and previous config saved to /var/cache/conftool/dbconfig/20250805-103451-fceratto.json
  • 10:32 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T399728)', diff saved to https://phabricator.wikimedia.org/P80793 and previous config saved to /var/cache/conftool/dbconfig/20250805-103213-fceratto.json
  • 10:32 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 10:31 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 10:31 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T399728)', diff saved to https://phabricator.wikimedia.org/P80792 and previous config saved to /var/cache/conftool/dbconfig/20250805-103055-fceratto.json
  • 10:23 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host gitlab2002.wikimedia.org with OS bookworm
  • 10:18 btullis@cumin1003: START - Cookbook sre.dns.netbox
  • 10:15 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P80791 and previous config saved to /var/cache/conftool/dbconfig/20250805-101548-fceratto.json
  • 10:12 hashar@deploy1003: Finished scap sync-world: Backport for In robots.txt permit access to the sitemap API (T400023 T396684) (duration: 08m 01s)
  • 10:09 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts snapshot1010.eqiad.wmnet
  • 10:06 hashar@deploy1003: tstarling, hashar: Continuing with sync
  • 10:06 hashar@deploy1003: tstarling, hashar: Backport for In robots.txt permit access to the sitemap API (T400023 T396684) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 10:04 hashar@deploy1003: Started scap sync-world: Backport for In robots.txt permit access to the sitemap API (T400023 T396684)
  • 10:00 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P80790 and previous config saved to /var/cache/conftool/dbconfig/20250805-100040-fceratto.json
  • 09:59 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gitlab2002.wikimedia.org with reason: host reimage
  • 09:55 jelto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on gitlab2002.wikimedia.org with reason: host reimage
  • 09:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f2-codfw
  • 09:51 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f2-codfw
  • 09:45 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T399728)', diff saved to https://phabricator.wikimedia.org/P80789 and previous config saved to /var/cache/conftool/dbconfig/20250805-094533-fceratto.json
  • 09:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e4-codfw
  • 09:45 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e4-codfw
  • 09:42 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T399728)', diff saved to https://phabricator.wikimedia.org/P80788 and previous config saved to /var/cache/conftool/dbconfig/20250805-094244-fceratto.json
  • 09:42 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 09:42 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T399728)', diff saved to https://phabricator.wikimedia.org/P80787 and previous config saved to /var/cache/conftool/dbconfig/20250805-094221-fceratto.json
  • 09:37 jelto@cumin1003: START - Cookbook sre.hosts.reimage for host gitlab2002.wikimedia.org with OS bookworm
  • 09:34 jelto@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host gitlab2002.wikimedia.org with OS bookworm
  • 09:33 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f4-codfw
  • 09:33 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f4-codfw
  • 09:31 hashar@deploy1003: rebuilt and synchronized wikiversions files: group0 to 1.45.0-wmf.13 refs T396374
  • 09:30 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f4-codfw
  • 09:30 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f4-codfw
  • 09:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f4-codfw
  • 09:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f4-codfw
  • 09:27 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P80786 and previous config saved to /var/cache/conftool/dbconfig/20250805-092714-fceratto.json
  • 09:20 hashar@deploy1003: Finished scap sync-world: Backport for Authorize self for Google Search Console (T400023) (duration: 17m 50s)
  • 09:12 hashar@deploy1003: tstarling, hashar: Continuing with sync
  • 09:12 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P80785 and previous config saved to /var/cache/conftool/dbconfig/20250805-091206-fceratto.json
  • 09:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e2-codfw
  • 09:07 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e2-codfw
  • 09:07 hashar@deploy1003: tstarling, hashar: Backport for Authorize self for Google Search Console (T400023) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 09:02 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e4-codfw
  • 09:02 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e4-codfw
  • 09:02 hashar@deploy1003: Started scap sync-world: Backport for Authorize self for Google Search Console (T400023)
  • 08:59 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e5-codfw
  • 08:59 mwpresync@deploy1003: Finished scap sync-world: testwikis to 1.45.0-wmf.13 refs T396374 (duration: 40m 12s)
  • 08:59 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e5-codfw
  • 08:57 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T399728)', diff saved to https://phabricator.wikimedia.org/P80784 and previous config saved to /var/cache/conftool/dbconfig/20250805-085658-fceratto.json
  • 08:54 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T399728)', diff saved to https://phabricator.wikimedia.org/P80783 and previous config saved to /var/cache/conftool/dbconfig/20250805-085424-fceratto.json
  • 08:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f4-codfw
  • 08:54 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f4-codfw
  • 08:54 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 08:54 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f4-codfw
  • 08:54 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f4-codfw
  • 08:54 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 08:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f2-codfw
  • 08:38 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-f2-codfw
  • 08:19 mwpresync@deploy1003: Started scap sync-world: testwikis to 1.45.0-wmf.13 refs T396374
  • 08:18 hashar: train: sudo systemctl start train-presync # T396374
  • 08:12 jelto@cumin1003: START - Cookbook sre.hosts.reimage for host gitlab2002.wikimedia.org with OS bookworm
  • 08:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 08:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw Nokia switches mgmt - ayounsi@cumin1003"
  • 08:08 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: codfw Nokia switches mgmt - ayounsi@cumin1003"
  • 08:04 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 07:00 dcausse: repooling wdqs1021
  • 06:36 dcausse: restarting blazegraph on wdqs1021 (stuck)
  • 06:33 dcausse: repooling wdqs1016
  • 04:27 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 04:23 eileen: civicrm upgraded from f202b616 to e591fe72
  • 04:07 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 04:06 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 04:02 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 04:01 mwpresync@deploy1003: Pruned MediaWiki: 1.45.0-wmf.10 (duration: 01m 53s)
  • 03:08 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 03:07 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 02:45 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bookworm
  • 02:41 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1042.eqiad.wmnet with OS bullseye
  • 02:41 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 02:37 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 02:24 jhancock@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dbprov2007.codfw.wmnet with OS bookworm
  • 02:23 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bullseye
  • 02:20 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 02:19 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1042.eqiad.wmnet with OS bullseye
  • 02:09 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 01:47 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1042.eqiad.wmnet with OS bullseye
  • 01:41 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 01:37 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 01:34 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephosd1043.eqiad.wmnet with OS bookworm
  • 01:34 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 01:17 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dbprov2007
  • 01:16 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dbprov2007
  • 01:16 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 01:13 jhancock@cumin1003: START - Cookbook sre.dns.netbox
  • 01:13 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1002"
  • 01:11 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 10m 57s)
  • 01:03 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host dbprov2007.codfw.wmnet with OS bookworm
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 01:00 jhancock@cumin1003: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['dbprov2007']
  • 00:59 jhancock@cumin1003: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['dbprov2007']
  • 00:53 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephosd1043.eqiad.wmnet with reason: host reimage
  • 00:51 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:47 vriley@cumin1002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephosd1043.eqiad.wmnet with reason: host reimage
  • 00:38 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:29 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:28 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bookworm
  • 00:25 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:22 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bookworm
  • 00:16 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:10 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host dbprov2007.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 00:10 jhancock@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dbprov2007
  • 00:10 jhancock@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dbprov2007
  • 00:09 jhancock@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 00:09 jhancock@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2007 to codfw - jhancock@cumin1003"
  • 00:09 jhancock@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding dbprov2007 to codfw - jhancock@cumin1003"
  • 00:08 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bookworm
  • 00:08 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bookworm
  • 00:06 jhancock@cumin1003: START - Cookbook sre.dns.netbox

2025-08-04

  • 23:42 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bookworm
  • 21:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T400854)', diff saved to https://phabricator.wikimedia.org/P80782 and previous config saved to /var/cache/conftool/dbconfig/20250804-214644-ladsgroup.json
  • 21:39 kemayo@deploy1003: Finished scap sync-world: Backport for Change search teardown focus to not use an over-broad route (T401090) (duration: 08m 08s)
  • 21:33 kemayo@deploy1003: kemayo: Continuing with sync
  • 21:32 kemayo@deploy1003: kemayo: Backport for Change search teardown focus to not use an over-broad route (T401090) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P80781 and previous config saved to /var/cache/conftool/dbconfig/20250804-213136-ladsgroup.json
  • 21:31 kemayo@deploy1003: Started scap sync-world: Backport for Change search teardown focus to not use an over-broad route (T401090)
  • 21:16 ebernhardson@deploy1003: Finished scap sync-world: Backport for Revert "cirrus: Start AB test of completion suggester fuzziness" (T397732), Clean up CirrusSearch settings on ex-wikipedia special wikis (T400062) (duration: 08m 06s)
  • 21:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P80780 and previous config saved to /var/cache/conftool/dbconfig/20250804-211628-ladsgroup.json
  • 21:14 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 21:11 ebernhardson@deploy1003: ebernhardson: Continuing with sync
  • 21:10 ebernhardson@deploy1003: ebernhardson: Backport for Revert "cirrus: Start AB test of completion suggester fuzziness" (T397732), Clean up CirrusSearch settings on ex-wikipedia special wikis (T400062) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 21:08 ebernhardson@deploy1003: Started scap sync-world: Backport for Revert "cirrus: Start AB test of completion suggester fuzziness" (T397732), Clean up CirrusSearch settings on ex-wikipedia special wikis (T400062)
  • 21:03 cjming@deploy1003: Finished scap sync-world: Backport for Clear edit count when unattaching local users for rename (T313900), fixStuckGlobalRename: Fix using actor_id from the wrong wiki (T398177), SessionManager: Add $sessionWriteReason to shutdown and when saves are triggered from the destructor (T400249) (duration: 07m 36s)
  • 21:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2222 (T400854)', diff saved to https://phabricator.wikimedia.org/P80779 and previous config saved to /var/cache/conftool/dbconfig/20250804-210119-ladsgroup.json
  • 20:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2222 (T400854)', diff saved to https://phabricator.wikimedia.org/P80778 and previous config saved to /var/cache/conftool/dbconfig/20250804-205837-ladsgroup.json
  • 20:58 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2222.codfw.wmnet with reason: Maintenance
  • 20:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T400854)', diff saved to https://phabricator.wikimedia.org/P80777 and previous config saved to /var/cache/conftool/dbconfig/20250804-205813-ladsgroup.json
  • 20:57 cjming@deploy1003: matmarex, cjming: Continuing with sync
  • 20:57 cjming@deploy1003: matmarex, cjming: Backport for Clear edit count when unattaching local users for rename (T313900), fixStuckGlobalRename: Fix using actor_id from the wrong wiki (T398177), SessionManager: Add $sessionWriteReason to shutdown and when saves are triggered from the destructor (T400249) synced to the testservers (see https://wikitech.wikimedia.org/w
  • 20:55 cjming@deploy1003: Started scap sync-world: Backport for Clear edit count when unattaching local users for rename (T313900), fixStuckGlobalRename: Fix using actor_id from the wrong wiki (T398177), SessionManager: Add $sessionWriteReason to shutdown and when saves are triggered from the destructor (T400249)
  • 20:45 ottomata: eventgate-analytics in eqiad cannot be deployed due to stuck helm STATUS: pending-upgrade. This needs to be deployed to rollback to a version that doesn't cause logspam. cc cwhite, rzl - T376026
  • 20:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P80776 and previous config saved to /var/cache/conftool/dbconfig/20250804-204305-ladsgroup.json
  • 20:39 Daimona: mwscript-k8s --comment="T397270" -f --file /srv/mediawiki/php-1.45.0-wmf.12/extensions/CampaignEvents/maintenance/countryExceptionMappings.csv -- CampaignEvents:UpdateCountriesColumn --wiki metawiki --exceptions countryExceptionMappings.csv --commit
  • 20:37 Daimona: mwscript-k8s --comment="T397270" -f --file /srv/mediawiki/php-1.45.0-wmf.12/extensions/CampaignEvents/maintenance/countryExceptionMappings.csv -- CampaignEvents:UpdateCountriesColumn --wiki officewiki --exceptions countryExceptionMappings.csv --commit
  • 20:36 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:36 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 20:35 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:35 Daimona: mwscript-k8s --comment="T397270" -f --file /srv/mediawiki/php-1.45.0-wmf.12/extensions/CampaignEvents/maintenance/countryExceptionMappings.csv -- CampaignEvents:UpdateCountriesColumn --wiki test2wiki --exceptions countryExceptionMappings.csv --commit
  • 20:34 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 20:34 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:34 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 20:34 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 20:33 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 20:33 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 20:33 Daimona: mwscript-k8s --comment="T397270" -f --file /srv/mediawiki/php-1.45.0-wmf.12/extensions/CampaignEvents/maintenance/countryExceptionMappings.csv -- CampaignEvents:UpdateCountriesColumn --wiki testwiki --exceptions countryExceptionMappings.csv --commit
  • 20:32 swfrench-wmf: reprepro include php8.3_8.3.24-1+wmf11u2 in component/php83 - T398245
  • 20:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P80771 and previous config saved to /var/cache/conftool/dbconfig/20250804-202754-ladsgroup.json
  • 20:26 Daimona: Re-run CampaignEvents country migration script in dry-run mode one last time for all wikis # T397270
  • 20:24 cjming@deploy1003: Finished scap sync-world: Backport for Add exceptions to country code migration script following test (T397270) (duration: 07m 30s)
  • 20:19 cjming@deploy1003: daimona, cjming: Continuing with sync
  • 20:19 cjming@deploy1003: daimona, cjming: Backport for Add exceptions to country code migration script following test (T397270) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:17 cjming@deploy1003: Started scap sync-world: Backport for Add exceptions to country code migration script following test (T397270)
  • 20:17 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:15 krinkle@deploy1003: Finished scap sync-world: Backport for Set wgCentralBannerRecorder to /beacon/… instead of //example.org/beacon/… (T400586) (duration: 09m 05s)
  • 20:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2221 (T400854)', diff saved to https://phabricator.wikimedia.org/P80769 and previous config saved to /var/cache/conftool/dbconfig/20250804-201246-ladsgroup.json
  • 20:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2221 (T400854)', diff saved to https://phabricator.wikimedia.org/P80768 and previous config saved to /var/cache/conftool/dbconfig/20250804-201003-ladsgroup.json
  • 20:09 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2221.codfw.wmnet with reason: Maintenance
  • 20:09 krinkle@deploy1003: krinkle: Continuing with sync
  • 20:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T400854)', diff saved to https://phabricator.wikimedia.org/P80767 and previous config saved to /var/cache/conftool/dbconfig/20250804-200938-ladsgroup.json
  • 20:07 krinkle@deploy1003: krinkle: Backport for Set wgCentralBannerRecorder to /beacon/… instead of //example.org/beacon/… (T400586) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 20:06 krinkle@deploy1003: Started scap sync-world: Backport for Set wgCentralBannerRecorder to /beacon/… instead of //example.org/beacon/… (T400586)
  • 20:05 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 20:04 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 20:02 otto@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
  • 20:01 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
  • 20:01 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
  • 20:01 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
  • 20:01 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
  • 19:59 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-main: apply
  • 19:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P80765 and previous config saved to /var/cache/conftool/dbconfig/20250804-195431-ladsgroup.json
  • 19:50 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 19:39 rzl@deploy1003: mwscript-k8s job started: Version.php --wiki=urwiki # Testing --sal for T376776
  • 19:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P80764 and previous config saved to /var/cache/conftool/dbconfig/20250804-193923-ladsgroup.json
  • 19:38 otto@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
  • 19:37 otto@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
  • 19:36 otto@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
  • 19:36 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 19:35 otto@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
  • 19:35 ottomata: deploying eventgate-analytics and eventgate-main to pick up meta.dt field logic change - T376026
  • 19:35 otto@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
  • 19:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2218 (T400854)', diff saved to https://phabricator.wikimedia.org/P80763 and previous config saved to /var/cache/conftool/dbconfig/20250804-192415-ladsgroup.json
  • 19:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2218 (T400854)', diff saved to https://phabricator.wikimedia.org/P80762 and previous config saved to /var/cache/conftool/dbconfig/20250804-192129-ladsgroup.json
  • 19:21 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2218.codfw.wmnet with reason: Maintenance
  • 19:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T400854)', diff saved to https://phabricator.wikimedia.org/P80761 and previous config saved to /var/cache/conftool/dbconfig/20250804-192107-ladsgroup.json
  • 19:20 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 19:19 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 19:17 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 19:12 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1009.eqiad.wmnet with reason: Maintenance
  • 19:12 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T399728)', diff saved to https://phabricator.wikimedia.org/P80760 and previous config saved to /var/cache/conftool/dbconfig/20250804-191213-fceratto.json
  • 19:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P80759 and previous config saved to /var/cache/conftool/dbconfig/20250804-190559-ladsgroup.json
  • 18:59 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:58 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:57 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P80758 and previous config saved to /var/cache/conftool/dbconfig/20250804-185705-fceratto.json
  • 18:55 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P80757 and previous config saved to /var/cache/conftool/dbconfig/20250804-185052-ladsgroup.json
  • 18:41 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P80756 and previous config saved to /var/cache/conftool/dbconfig/20250804-184156-fceratto.json
  • 18:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2208 (T400854)', diff saved to https://phabricator.wikimedia.org/P80755 and previous config saved to /var/cache/conftool/dbconfig/20250804-183543-ladsgroup.json
  • 18:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2208 (T400854)', diff saved to https://phabricator.wikimedia.org/P80754 and previous config saved to /var/cache/conftool/dbconfig/20250804-183259-ladsgroup.json
  • 18:32 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2208.codfw.wmnet with reason: Maintenance
  • 18:31 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2200.codfw.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2198.codfw.wmnet with reason: Maintenance
  • 18:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T400854)', diff saved to https://phabricator.wikimedia.org/P80753 and previous config saved to /var/cache/conftool/dbconfig/20250804-183033-ladsgroup.json
  • 18:26 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1231 (T399728)', diff saved to https://phabricator.wikimedia.org/P80752 and previous config saved to /var/cache/conftool/dbconfig/20250804-182649-fceratto.json
  • 18:24 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1231 (T399728)', diff saved to https://phabricator.wikimedia.org/P80751 and previous config saved to /var/cache/conftool/dbconfig/20250804-182420-fceratto.json
  • 18:24 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1231.eqiad.wmnet with reason: Maintenance
  • 18:23 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 18:23 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T399728)', diff saved to https://phabricator.wikimedia.org/P80750 and previous config saved to /var/cache/conftool/dbconfig/20250804-182309-fceratto.json
  • 18:23 vriley@cumin1002: START - Cookbook sre.hosts.reimage for host cloudcephosd1043.eqiad.wmnet with OS bullseye
  • 18:20 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:19 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P80749 and previous config saved to /var/cache/conftool/dbconfig/20250804-181526-ladsgroup.json
  • 18:08 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P80748 and previous config saved to /var/cache/conftool/dbconfig/20250804-180801-fceratto.json
  • 18:06 vriley@cumin1002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:06 swfrench@deploy1003: Finished scap sync-world: Deployment to pick up rebuilt mediawiki-httpd image (duration: 08m 33s)
  • 18:02 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 18:01 swfrench@deploy1003: swfrench: Continuing with sync
  • 18:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P80747 and previous config saved to /var/cache/conftool/dbconfig/20250804-180017-ladsgroup.json
  • 17:59 swfrench@deploy1003: swfrench: Deployment to pick up rebuilt mediawiki-httpd image synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:58 swfrench@deploy1003: Started scap sync-world: Deployment to pick up rebuilt mediawiki-httpd image
  • 17:54 dancy@deploy1003: Installation of scap version "4.195.0" completed for 2 hosts
  • 17:53 dancy@deploy1003: Installing scap version "4.195.0" for 2 host(s)
  • 17:52 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187', diff saved to https://phabricator.wikimedia.org/P80746 and previous config saved to /var/cache/conftool/dbconfig/20250804-175252-fceratto.json
  • 17:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2182 (T400854)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20250804-174505-ladsgroup.json
  • 17:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2182 (T400854)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20250804-174212-ladsgroup.json
  • 17:42 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2182.codfw.wmnet with reason: Maintenance
  • 17:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T400854)', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20250804-174145-ladsgroup.json
  • 17:37 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1187 (T399728)', diff saved to https://phabricator.wikimedia.org/P80743 and previous config saved to /var/cache/conftool/dbconfig/20250804-173745-fceratto.json
  • 17:35 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1187 (T399728)', diff saved to https://phabricator.wikimedia.org/P80742 and previous config saved to /var/cache/conftool/dbconfig/20250804-173518-fceratto.json
  • 17:35 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1187.eqiad.wmnet with reason: Maintenance
  • 17:34 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T399728)', diff saved to https://phabricator.wikimedia.org/P80741 and previous config saved to /var/cache/conftool/dbconfig/20250804-173454-fceratto.json
  • 17:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P80740 and previous config saved to /var/cache/conftool/dbconfig/20250804-172637-ladsgroup.json
  • 17:19 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P80739 and previous config saved to /var/cache/conftool/dbconfig/20250804-171945-fceratto.json
  • 17:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P80738 and previous config saved to /var/cache/conftool/dbconfig/20250804-171130-ladsgroup.json
  • 17:04 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180', diff saved to https://phabricator.wikimedia.org/P80737 and previous config saved to /var/cache/conftool/dbconfig/20250804-170436-fceratto.json
  • 16:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2168 (T400854)', diff saved to https://phabricator.wikimedia.org/P80736 and previous config saved to /var/cache/conftool/dbconfig/20250804-165623-ladsgroup.json
  • 16:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2168 (T400854)', diff saved to https://phabricator.wikimedia.org/P80735 and previous config saved to /var/cache/conftool/dbconfig/20250804-165335-ladsgroup.json
  • 16:53 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2168.codfw.wmnet with reason: Maintenance
  • 16:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T400854)', diff saved to https://phabricator.wikimedia.org/P80734 and previous config saved to /var/cache/conftool/dbconfig/20250804-165312-ladsgroup.json
  • 16:49 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1180 (T399728)', diff saved to https://phabricator.wikimedia.org/P80733 and previous config saved to /var/cache/conftool/dbconfig/20250804-164928-fceratto.json
  • 16:48 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1180 (T399728)', diff saved to https://phabricator.wikimedia.org/P80732 and previous config saved to /var/cache/conftool/dbconfig/20250804-164759-fceratto.json
  • 16:47 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1180.eqiad.wmnet with reason: Maintenance
  • 16:47 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T399728)', diff saved to https://phabricator.wikimedia.org/P80731 and previous config saved to /var/cache/conftool/dbconfig/20250804-164736-fceratto.json
  • 16:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P80730 and previous config saved to /var/cache/conftool/dbconfig/20250804-163803-ladsgroup.json
  • 16:32 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P80729 and previous config saved to /var/cache/conftool/dbconfig/20250804-163226-fceratto.json
  • 16:31 Lucas_WMDE: lucaswerkmeister-wmde Deployed security patch for T401099
  • 16:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P80725 and previous config saved to /var/cache/conftool/dbconfig/20250804-162255-ladsgroup.json
  • 16:19 Daimona: Running maintenance script for T397270 in x1: testwiki, test2wiki, officewiki, wikishared
  • 16:17 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173', diff saved to https://phabricator.wikimedia.org/P80723 and previous config saved to /var/cache/conftool/dbconfig/20250804-161718-fceratto.json
  • 16:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2159 (T400854)', diff saved to https://phabricator.wikimedia.org/P80722 and previous config saved to /var/cache/conftool/dbconfig/20250804-160746-ladsgroup.json
  • 16:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2159 (T400854)', diff saved to https://phabricator.wikimedia.org/P80721 and previous config saved to /var/cache/conftool/dbconfig/20250804-160456-ladsgroup.json
  • 16:04 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2159.codfw.wmnet with reason: Maintenance
  • 16:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T400854)', diff saved to https://phabricator.wikimedia.org/P80720 and previous config saved to /var/cache/conftool/dbconfig/20250804-160433-ladsgroup.json
  • 16:04 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwmaint2002.codfw.wmnet
  • 16:04 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 16:04 jasmine@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mwmaint2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jasmine@cumin1003"
  • 16:03 jasmine@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mwmaint2002.codfw.wmnet decommissioned, removing all IPs except the asset tag one - jasmine@cumin1003"
  • 16:02 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1173 (T399728)', diff saved to https://phabricator.wikimedia.org/P80719 and previous config saved to /var/cache/conftool/dbconfig/20250804-160210-fceratto.json
  • 15:59 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1173 (T399728)', diff saved to https://phabricator.wikimedia.org/P80718 and previous config saved to /var/cache/conftool/dbconfig/20250804-155941-fceratto.json
  • 15:59 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1173.eqiad.wmnet with reason: Maintenance
  • 15:59 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T399728)', diff saved to https://phabricator.wikimedia.org/P80717 and previous config saved to /var/cache/conftool/dbconfig/20250804-155919-fceratto.json
  • 15:58 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 15:57 jasmine@cumin1003: START - Cookbook sre.dns.netbox
  • 15:52 jasmine@cumin1003: START - Cookbook sre.hosts.decommission for hosts mwmaint2002.codfw.wmnet
  • 15:50 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 15:49 jasmine@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts mwmaint1002.eqiad.wmnet
  • 15:49 jasmine@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:49 jasmine@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mwmaint1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jasmine@cumin1003"
  • 15:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P80716 and previous config saved to /var/cache/conftool/dbconfig/20250804-154925-ladsgroup.json
  • 15:49 jasmine@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: mwmaint1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - jasmine@cumin1003"
  • 15:44 jasmine@cumin1003: START - Cookbook sre.dns.netbox
  • 15:44 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1007.eqiad.wmnet
  • 15:44 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:44 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1007.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 15:44 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P80715 and previous config saved to /var/cache/conftool/dbconfig/20250804-154410-fceratto.json
  • 15:41 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1007.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 15:39 jasmine@cumin1003: START - Cookbook sre.hosts.decommission for hosts mwmaint1002.eqiad.wmnet
  • 15:34 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:34 brouberol@cumin1003: START - Cookbook sre.dns.netbox
  • 15:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P80714 and previous config saved to /var/cache/conftool/dbconfig/20250804-153418-ladsgroup.json
  • 15:33 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1042.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:31 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:30 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:30 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:29 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-airflow1007.eqiad.wmnet
  • 15:29 jgreen@dns1004: END - running authdns-update
  • 15:29 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:29 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1006.eqiad.wmnet
  • 15:29 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:29 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168', diff saved to https://phabricator.wikimedia.org/P80713 and previous config saved to /var/cache/conftool/dbconfig/20250804-152903-fceratto.json
  • 15:28 jgreen@dns1004: START - running authdns-update
  • 15:28 vriley@cumin1002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:27 vriley@cumin1002: START - Cookbook sre.hosts.provision for host cloudcephosd1043.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
  • 15:26 vriley@cumin1002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephosd1043
  • 15:26 brouberol@cumin1003: START - Cookbook sre.dns.netbox
  • 15:26 vriley@cumin1002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephosd1043
  • 15:25 vriley@cumin1002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:25 vriley@cumin1002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudcephosd1043 - vriley@cumin1002"
  • 15:25 vriley@cumin1002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt cloudcephosd1043 - vriley@cumin1002"
  • 15:24 cgoubert@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
  • 15:24 cgoubert@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
  • 15:23 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:22 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:22 vriley@cumin1002: START - Cookbook sre.dns.netbox
  • 15:21 cgoubert@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
  • 15:20 cgoubert@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
  • 15:19 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-airflow1006.eqiad.wmnet
  • 15:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2150 (T400854)', diff saved to https://phabricator.wikimedia.org/P80712 and previous config saved to /var/cache/conftool/dbconfig/20250804-151910-ladsgroup.json
  • 15:18 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1005.eqiad.wmnet
  • 15:18 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:18 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 15:18 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1005.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 15:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2150 (T400854)', diff saved to https://phabricator.wikimedia.org/P80710 and previous config saved to /var/cache/conftool/dbconfig/20250804-151621-ladsgroup.json
  • 15:16 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2150.codfw.wmnet with reason: Maintenance
  • 15:15 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T400854)', diff saved to https://phabricator.wikimedia.org/P80709 and previous config saved to /var/cache/conftool/dbconfig/20250804-151526-ladsgroup.json
  • 15:13 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1168 (T399728)', diff saved to https://phabricator.wikimedia.org/P80708 and previous config saved to /var/cache/conftool/dbconfig/20250804-151355-fceratto.json
  • 15:11 brouberol@cumin1003: START - Cookbook sre.dns.netbox
  • 15:11 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1168 (T399728)', diff saved to https://phabricator.wikimedia.org/P80707 and previous config saved to /var/cache/conftool/dbconfig/20250804-151127-fceratto.json
  • 15:11 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1168.eqiad.wmnet with reason: Maintenance
  • 15:11 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T399728)', diff saved to https://phabricator.wikimedia.org/P80706 and previous config saved to /var/cache/conftool/dbconfig/20250804-151105-fceratto.json
  • 15:06 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-airflow1005.eqiad.wmnet
  • 15:05 kemayo@deploy1003: Finished scap sync-world: Backport for GutterSidebarEditCheckDialog: Guard against null bounding rects (duration: 08m 16s)
  • 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1004.eqiad.wmnet
  • 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:04 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 15:03 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1004.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 15:00 kemayo@deploy1003: kemayo: Continuing with sync
  • 15:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P80705 and previous config saved to /var/cache/conftool/dbconfig/20250804-150018-ladsgroup.json
  • 14:59 brouberol@cumin1003: START - Cookbook sre.dns.netbox
  • 14:59 kemayo@deploy1003: kemayo: Backport for GutterSidebarEditCheckDialog: Guard against null bounding rects synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:57 kemayo@deploy1003: Started scap sync-world: Backport for GutterSidebarEditCheckDialog: Guard against null bounding rects
  • 14:55 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P80704 and previous config saved to /var/cache/conftool/dbconfig/20250804-145557-fceratto.json
  • 14:54 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-airflow1004.eqiad.wmnet
  • 14:52 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts an-airflow1002.eqiad.wmnet
  • 14:52 brouberol@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 14:52 brouberol@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 14:51 brouberol@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-airflow1002.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - brouberol@cumin1003"
  • 14:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P80703 and previous config saved to /var/cache/conftool/dbconfig/20250804-144509-ladsgroup.json
  • 14:45 brouberol@cumin1003: START - Cookbook sre.dns.netbox
  • 14:40 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165', diff saved to https://phabricator.wikimedia.org/P80702 and previous config saved to /var/cache/conftool/dbconfig/20250804-144050-fceratto.json
  • 14:38 brouberol@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-airflow1002.eqiad.wmnet
  • 14:32 Lucas_WMDE: UTC afternoon backport+config window hopefully done after some difficulties
  • 14:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1253 (T400854)', diff saved to https://phabricator.wikimedia.org/P80701 and previous config saved to /var/cache/conftool/dbconfig/20250804-143001-ladsgroup.json
  • 14:28 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Set wgCampaignEventsCountrySchemaMigrationStage to MIGRATION_WRITE_BOTH (T397476) (duration: 16m 17s)
  • 14:25 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1165 (T399728)', diff saved to https://phabricator.wikimedia.org/P80700 and previous config saved to /var/cache/conftool/dbconfig/20250804-142542-fceratto.json
  • 14:23 XioNoX: push pfw policies - https://phabricator.wikimedia.org/T400936
  • 14:23 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1165 (T399728)', diff saved to https://phabricator.wikimedia.org/P80699 and previous config saved to /var/cache/conftool/dbconfig/20250804-142314-fceratto.json
  • 14:23 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1015,1019].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 14:22 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1165.eqiad.wmnet with reason: Maintenance
  • 14:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1253 (T400854)', diff saved to https://phabricator.wikimedia.org/P80698 and previous config saved to /var/cache/conftool/dbconfig/20250804-142132-ladsgroup.json
  • 14:21 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1253.eqiad.wmnet with reason: Maintenance
  • 14:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T400854)', diff saved to https://phabricator.wikimedia.org/P80697 and previous config saved to /var/cache/conftool/dbconfig/20250804-142109-ladsgroup.json
  • 14:19 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde: Continuing with sync
  • 14:16 lucaswerkmeister-wmde@deploy1003: daimona, lucaswerkmeister-wmde: Backport for Set wgCampaignEventsCountrySchemaMigrationStage to MIGRATION_WRITE_BOTH (T397476) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:12 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Set wgCampaignEventsCountrySchemaMigrationStage to MIGRATION_WRITE_BOTH (T397476)
  • 14:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P80696 and previous config saved to /var/cache/conftool/dbconfig/20250804-140602-ladsgroup.json
  • 14:00 ebernhardson: T317599 start full-cluster reindex for eqiad/codfw/cloudelastic opensearch clusters
  • 13:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P80695 and previous config saved to /var/cache/conftool/dbconfig/20250804-135054-ladsgroup.json
  • 13:42 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
  • 13:42 jnuche@deploy1003: Finished deploy [releng/jenkins-deploy@b89eed0] (releasing): check fix for releases2003 (duration: 00m 26s)
  • 13:41 jnuche@deploy1003: Started deploy [releng/jenkins-deploy@b89eed0] (releasing): check fix for releases2003
  • 13:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
  • 13:40 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:40 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
  • 13:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1227 (T400854)', diff saved to https://phabricator.wikimedia.org/P80694 and previous config saved to /var/cache/conftool/dbconfig/20250804-133547-ladsgroup.json
  • 13:34 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:33 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
  • 13:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1227 (T400854)', diff saved to https://phabricator.wikimedia.org/P80693 and previous config saved to /var/cache/conftool/dbconfig/20250804-133314-ladsgroup.json
  • 13:33 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1227.eqiad.wmnet with reason: Maintenance
  • 13:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T400854)', diff saved to https://phabricator.wikimedia.org/P80692 and previous config saved to /var/cache/conftool/dbconfig/20250804-133251-ladsgroup.json
  • 13:30 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, kharlan: Continuing with sync
  • 13:26 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, kharlan: Backport for UserInfoCard: Add config var for making UIC available (T400627), CheckUser: Make user info card feature discoverable (T398681) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:24 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for UserInfoCard: Add config var for making UIC available (T400627), CheckUser: Make user info card feature discoverable (T398681)
  • 13:22 lucaswerkmeister-wmde@deploy1003: Finished scap sync-world: Backport for Use tempaccounts.dblist to manage rollout wikis (T400672) (duration: 16m 31s)
  • 13:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P80691 and previous config saved to /var/cache/conftool/dbconfig/20250804-131744-ladsgroup.json
  • 13:15 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 13:14 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 13:14 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T399728)', diff saved to https://phabricator.wikimedia.org/P80690 and previous config saved to /var/cache/conftool/dbconfig/20250804-131417-fceratto.json
  • 13:14 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, stran: Continuing with sync
  • 13:07 lucaswerkmeister-wmde@deploy1003: lucaswerkmeister-wmde, stran: Backport for Use tempaccounts.dblist to manage rollout wikis (T400672) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 13:05 lucaswerkmeister-wmde@deploy1003: Started scap sync-world: Backport for Use tempaccounts.dblist to manage rollout wikis (T400672)
  • 13:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P80689 and previous config saved to /var/cache/conftool/dbconfig/20250804-130236-ladsgroup.json
  • 12:59 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P80688 and previous config saved to /var/cache/conftool/dbconfig/20250804-125909-fceratto.json
  • 12:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 12:50 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
  • 12:49 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
  • 12:48 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
  • 12:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1202 (T400854)', diff saved to https://phabricator.wikimedia.org/P80687 and previous config saved to /var/cache/conftool/dbconfig/20250804-124729-ladsgroup.json
  • 12:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1202 (T400854)', diff saved to https://phabricator.wikimedia.org/P80686 and previous config saved to /var/cache/conftool/dbconfig/20250804-124500-ladsgroup.json
  • 12:44 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1202.eqiad.wmnet with reason: Maintenance
  • 12:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T400854)', diff saved to https://phabricator.wikimedia.org/P80685 and previous config saved to /var/cache/conftool/dbconfig/20250804-124438-ladsgroup.json
  • 12:44 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P80684 and previous config saved to /var/cache/conftool/dbconfig/20250804-124402-fceratto.json
  • 12:41 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 12:40 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
  • 12:37 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
  • 12:36 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
  • 12:35 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
  • 12:35 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
  • 12:34 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:33 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
  • 12:31 dcausse: repooling wdqs1011
  • 12:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P80683 and previous config saved to /var/cache/conftool/dbconfig/20250804-122931-ladsgroup.json
  • 12:28 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T399728)', diff saved to https://phabricator.wikimedia.org/P80682 and previous config saved to /var/cache/conftool/dbconfig/20250804-122855-fceratto.json
  • 12:26 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T399728)', diff saved to https://phabricator.wikimedia.org/P80681 and previous config saved to /var/cache/conftool/dbconfig/20250804-122614-fceratto.json
  • 12:26 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 12:25 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 12:24 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T399728)', diff saved to https://phabricator.wikimedia.org/P80680 and previous config saved to /var/cache/conftool/dbconfig/20250804-122454-fceratto.json
  • 12:22 dcausse: depooling & restarting blazegraph on wdqs1011 (stuck for 3 hours)
  • 12:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P80679 and previous config saved to /var/cache/conftool/dbconfig/20250804-121424-ladsgroup.json
  • 12:10 dcausse: depooling & restarting blazegraph on wdqs1016 (stuck for 7 days)
  • 12:09 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P80678 and previous config saved to /var/cache/conftool/dbconfig/20250804-120946-fceratto.json
  • 11:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1194 (T400854)', diff saved to https://phabricator.wikimedia.org/P80677 and previous config saved to /var/cache/conftool/dbconfig/20250804-115917-ladsgroup.json
  • 11:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1194 (T400854)', diff saved to https://phabricator.wikimedia.org/P80676 and previous config saved to /var/cache/conftool/dbconfig/20250804-115649-ladsgroup.json
  • 11:56 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1194.eqiad.wmnet with reason: Maintenance
  • 11:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T400854)', diff saved to https://phabricator.wikimedia.org/P80675 and previous config saved to /var/cache/conftool/dbconfig/20250804-115626-ladsgroup.json
  • 11:54 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P80674 and previous config saved to /var/cache/conftool/dbconfig/20250804-115438-fceratto.json
  • 11:42 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 11:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P80673 and previous config saved to /var/cache/conftool/dbconfig/20250804-114119-ladsgroup.json
  • 11:39 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T399728)', diff saved to https://phabricator.wikimedia.org/P80672 and previous config saved to /var/cache/conftool/dbconfig/20250804-113931-fceratto.json
  • 11:39 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 11:36 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T399728)', diff saved to https://phabricator.wikimedia.org/P80671 and previous config saved to /var/cache/conftool/dbconfig/20250804-113649-fceratto.json
  • 11:36 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 11:36 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T399728)', diff saved to https://phabricator.wikimedia.org/P80670 and previous config saved to /var/cache/conftool/dbconfig/20250804-113625-fceratto.json
  • 11:28 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 11:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P80668 and previous config saved to /var/cache/conftool/dbconfig/20250804-112612-ladsgroup.json
  • 11:25 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 11:21 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P80667 and previous config saved to /var/cache/conftool/dbconfig/20250804-112118-fceratto.json
  • 11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 11:16 btullis@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 11:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1191 (T400854)', diff saved to https://phabricator.wikimedia.org/P80666 and previous config saved to /var/cache/conftool/dbconfig/20250804-111103-ladsgroup.json
  • 11:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1191 (T400854)', diff saved to https://phabricator.wikimedia.org/P80665 and previous config saved to /var/cache/conftool/dbconfig/20250804-110834-ladsgroup.json
  • 11:08 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1191.eqiad.wmnet with reason: Maintenance
  • 11:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T400854)', diff saved to https://phabricator.wikimedia.org/P80664 and previous config saved to /var/cache/conftool/dbconfig/20250804-110811-ladsgroup.json
  • 11:06 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P80663 and previous config saved to /var/cache/conftool/dbconfig/20250804-110609-fceratto.json
  • 10:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P80662 and previous config saved to /var/cache/conftool/dbconfig/20250804-105303-ladsgroup.json
  • 10:51 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T399728)', diff saved to https://phabricator.wikimedia.org/P80661 and previous config saved to /var/cache/conftool/dbconfig/20250804-105101-fceratto.json
  • 10:48 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T399728)', diff saved to https://phabricator.wikimedia.org/P80660 and previous config saved to /var/cache/conftool/dbconfig/20250804-104823-fceratto.json
  • 10:48 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 10:48 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T399728)', diff saved to https://phabricator.wikimedia.org/P80659 and previous config saved to /var/cache/conftool/dbconfig/20250804-104800-fceratto.json
  • 10:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181', diff saved to https://phabricator.wikimedia.org/P80658 and previous config saved to /var/cache/conftool/dbconfig/20250804-103756-ladsgroup.json
  • 10:32 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P80657 and previous config saved to /var/cache/conftool/dbconfig/20250804-103252-fceratto.json
  • 10:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1181 (T400854)', diff saved to https://phabricator.wikimedia.org/P80656 and previous config saved to /var/cache/conftool/dbconfig/20250804-102248-ladsgroup.json
  • 10:17 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P80655 and previous config saved to /var/cache/conftool/dbconfig/20250804-101745-fceratto.json
  • 10:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1181 (T400854)', diff saved to https://phabricator.wikimedia.org/P80654 and previous config saved to /var/cache/conftool/dbconfig/20250804-101421-ladsgroup.json
  • 10:14 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1181.eqiad.wmnet with reason: Maintenance
  • 10:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T400854)', diff saved to https://phabricator.wikimedia.org/P80653 and previous config saved to /var/cache/conftool/dbconfig/20250804-101358-ladsgroup.json
  • 10:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 10:02 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T399728)', diff saved to https://phabricator.wikimedia.org/P80652 and previous config saved to /var/cache/conftool/dbconfig/20250804-100237-fceratto.json
  • 10:00 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T399728)', diff saved to https://phabricator.wikimedia.org/P80651 and previous config saved to /var/cache/conftool/dbconfig/20250804-095958-fceratto.json
  • 09:59 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 09:59 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T399728)', diff saved to https://phabricator.wikimedia.org/P80650 and previous config saved to /var/cache/conftool/dbconfig/20250804-095935-fceratto.json
  • 09:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P80649 and previous config saved to /var/cache/conftool/dbconfig/20250804-095851-ladsgroup.json
  • 09:46 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
  • 09:44 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P80648 and previous config saved to /var/cache/conftool/dbconfig/20250804-094428-fceratto.json
  • 09:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P80647 and previous config saved to /var/cache/conftool/dbconfig/20250804-094343-ladsgroup.json
  • 09:29 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P80646 and previous config saved to /var/cache/conftool/dbconfig/20250804-092920-fceratto.json
  • 09:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1174 (T400854)', diff saved to https://phabricator.wikimedia.org/P80645 and previous config saved to /var/cache/conftool/dbconfig/20250804-092836-ladsgroup.json
  • 09:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1174 (T400854)', diff saved to https://phabricator.wikimedia.org/P80644 and previous config saved to /var/cache/conftool/dbconfig/20250804-092606-ladsgroup.json
  • 09:25 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1174.eqiad.wmnet with reason: Maintenance
  • 09:25 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1171.eqiad.wmnet with reason: Maintenance
  • 09:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T400854)', diff saved to https://phabricator.wikimedia.org/P80643 and previous config saved to /var/cache/conftool/dbconfig/20250804-092445-ladsgroup.json
  • 09:14 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T399728)', diff saved to https://phabricator.wikimedia.org/P80642 and previous config saved to /var/cache/conftool/dbconfig/20250804-091413-fceratto.json
  • 09:11 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T399728)', diff saved to https://phabricator.wikimedia.org/P80641 and previous config saved to /var/cache/conftool/dbconfig/20250804-091128-fceratto.json
  • 09:11 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 09:11 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 09:10 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T399728)', diff saved to https://phabricator.wikimedia.org/P80640 and previous config saved to /var/cache/conftool/dbconfig/20250804-091048-fceratto.json
  • 09:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P80639 and previous config saved to /var/cache/conftool/dbconfig/20250804-090938-ladsgroup.json
  • 08:55 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P80638 and previous config saved to /var/cache/conftool/dbconfig/20250804-085540-fceratto.json
  • 08:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P80637 and previous config saved to /var/cache/conftool/dbconfig/20250804-085430-ladsgroup.json
  • 08:40 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P80636 and previous config saved to /var/cache/conftool/dbconfig/20250804-084032-fceratto.json
  • 08:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1170 (T400854)', diff saved to https://phabricator.wikimedia.org/P80635 and previous config saved to /var/cache/conftool/dbconfig/20250804-083921-ladsgroup.json
  • 08:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1170 (T400854)', diff saved to https://phabricator.wikimedia.org/P80634 and previous config saved to /var/cache/conftool/dbconfig/20250804-083646-ladsgroup.json
  • 08:36 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1170.eqiad.wmnet with reason: Maintenance
  • 08:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T400854)', diff saved to https://phabricator.wikimedia.org/P80633 and previous config saved to /var/cache/conftool/dbconfig/20250804-083623-ladsgroup.json
  • 08:25 fceratto@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T399728)', diff saved to https://phabricator.wikimedia.org/P80632 and previous config saved to /var/cache/conftool/dbconfig/20250804-082524-fceratto.json
  • 08:22 fceratto@cumin1002: dbctl commit (dc=all): 'Depooling db1159 (T399728)', diff saved to https://phabricator.wikimedia.org/P80631 and previous config saved to /var/cache/conftool/dbconfig/20250804-082237-fceratto.json
  • 08:22 fceratto@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on db1159.eqiad.wmnet with reason: Maintenance
  • 08:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P80630 and previous config saved to /var/cache/conftool/dbconfig/20250804-082116-ladsgroup.json
  • 08:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P80629 and previous config saved to /var/cache/conftool/dbconfig/20250804-080608-ladsgroup.json
  • 07:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1158 (T400854)', diff saved to https://phabricator.wikimedia.org/P80628 and previous config saved to /var/cache/conftool/dbconfig/20250804-075101-ladsgroup.json
  • 07:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1158 (T400854)', diff saved to https://phabricator.wikimedia.org/P80627 and previous config saved to /var/cache/conftool/dbconfig/20250804-074333-ladsgroup.json
  • 07:43 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 07:43 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1158.eqiad.wmnet with reason: Maintenance
  • 06:50 _joe_: deleting unhealthy thumbor pods
  • 06:26 _joe_: defragmented etcd k8s cluster in eqiad
  • 05:25 tstarling@deploy1003: Finished scap sync-world: Backport for In sitemap responses set CC: public (T400023) (duration: 37m 03s)
  • 05:13 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
  • 05:13 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
  • 05:13 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: UX improvements - oblivian@cumin1003
  • 05:13 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "UX improvements - oblivian@cumin1003"
  • 05:13 tstarling@deploy1003: krinkle, tstarling: Continuing with sync
  • 05:09 tstarling@deploy1003: krinkle, tstarling: Backport for In sitemap responses set CC: public (T400023) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 04:48 tstarling@deploy1003: Started scap sync-world: Backport for In sitemap responses set CC: public (T400023)
  • 01:11 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 11m 03s)
  • 01:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T400854)', diff saved to https://phabricator.wikimedia.org/P80626 and previous config saved to /var/cache/conftool/dbconfig/20250804-010722-ladsgroup.json
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 00:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P80625 and previous config saved to /var/cache/conftool/dbconfig/20250804-005214-ladsgroup.json
  • 00:37 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238', diff saved to https://phabricator.wikimedia.org/P80624 and previous config saved to /var/cache/conftool/dbconfig/20250804-003706-ladsgroup.json
  • 00:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2238 (T400854)', diff saved to https://phabricator.wikimedia.org/P80623 and previous config saved to /var/cache/conftool/dbconfig/20250804-002159-ladsgroup.json
  • 00:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2238 (T400854)', diff saved to https://phabricator.wikimedia.org/P80622 and previous config saved to /var/cache/conftool/dbconfig/20250804-001908-ladsgroup.json
  • 00:19 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2238.codfw.wmnet with reason: Maintenance
  • 00:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T400854)', diff saved to https://phabricator.wikimedia.org/P80621 and previous config saved to /var/cache/conftool/dbconfig/20250804-001845-ladsgroup.json
  • 00:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P80620 and previous config saved to /var/cache/conftool/dbconfig/20250804-000337-ladsgroup.json

2025-08-03

  • 23:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226', diff saved to https://phabricator.wikimedia.org/P80619 and previous config saved to /var/cache/conftool/dbconfig/20250803-234829-ladsgroup.json
  • 23:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2226 (T400854)', diff saved to https://phabricator.wikimedia.org/P80618 and previous config saved to /var/cache/conftool/dbconfig/20250803-233322-ladsgroup.json
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2226 (T400854)', diff saved to https://phabricator.wikimedia.org/P80617 and previous config saved to /var/cache/conftool/dbconfig/20250803-233037-ladsgroup.json
  • 23:30 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2226.codfw.wmnet with reason: Maintenance
  • 23:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T400854)', diff saved to https://phabricator.wikimedia.org/P80616 and previous config saved to /var/cache/conftool/dbconfig/20250803-233013-ladsgroup.json
  • 23:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P80615 and previous config saved to /var/cache/conftool/dbconfig/20250803-231505-ladsgroup.json
  • 22:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225', diff saved to https://phabricator.wikimedia.org/P80614 and previous config saved to /var/cache/conftool/dbconfig/20250803-225957-ladsgroup.json
  • 22:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2225 (T400854)', diff saved to https://phabricator.wikimedia.org/P80613 and previous config saved to /var/cache/conftool/dbconfig/20250803-224450-ladsgroup.json
  • 22:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2225 (T400854)', diff saved to https://phabricator.wikimedia.org/P80612 and previous config saved to /var/cache/conftool/dbconfig/20250803-224159-ladsgroup.json
  • 22:41 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2225.codfw.wmnet with reason: Maintenance
  • 22:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T400854)', diff saved to https://phabricator.wikimedia.org/P80611 and previous config saved to /var/cache/conftool/dbconfig/20250803-224147-ladsgroup.json
  • 22:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P80610 and previous config saved to /var/cache/conftool/dbconfig/20250803-222640-ladsgroup.json
  • 22:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207', diff saved to https://phabricator.wikimedia.org/P80609 and previous config saved to /var/cache/conftool/dbconfig/20250803-221132-ladsgroup.json
  • 21:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2207 (T400854)', diff saved to https://phabricator.wikimedia.org/P80608 and previous config saved to /var/cache/conftool/dbconfig/20250803-215625-ladsgroup.json
  • 21:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2207 (T400854)', diff saved to https://phabricator.wikimedia.org/P80607 and previous config saved to /var/cache/conftool/dbconfig/20250803-215335-ladsgroup.json
  • 21:53 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2207.codfw.wmnet with reason: Maintenance
  • 21:51 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2197.codfw.wmnet with reason: Maintenance
  • 21:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T400854)', diff saved to https://phabricator.wikimedia.org/P80606 and previous config saved to /var/cache/conftool/dbconfig/20250803-215131-ladsgroup.json
  • 21:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P80605 and previous config saved to /var/cache/conftool/dbconfig/20250803-213623-ladsgroup.json
  • 21:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189', diff saved to https://phabricator.wikimedia.org/P80604 and previous config saved to /var/cache/conftool/dbconfig/20250803-212116-ladsgroup.json
  • 21:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2189 (T400854)', diff saved to https://phabricator.wikimedia.org/P80603 and previous config saved to /var/cache/conftool/dbconfig/20250803-210608-ladsgroup.json
  • 21:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2189 (T400854)', diff saved to https://phabricator.wikimedia.org/P80602 and previous config saved to /var/cache/conftool/dbconfig/20250803-210318-ladsgroup.json
  • 21:03 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2189.codfw.wmnet with reason: Maintenance
  • 21:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T400854)', diff saved to https://phabricator.wikimedia.org/P80601 and previous config saved to /var/cache/conftool/dbconfig/20250803-210255-ladsgroup.json
  • 20:47 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P80600 and previous config saved to /var/cache/conftool/dbconfig/20250803-204747-ladsgroup.json
  • 20:32 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175', diff saved to https://phabricator.wikimedia.org/P80599 and previous config saved to /var/cache/conftool/dbconfig/20250803-203238-ladsgroup.json
  • 20:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2175 (T400854)', diff saved to https://phabricator.wikimedia.org/P80598 and previous config saved to /var/cache/conftool/dbconfig/20250803-201730-ladsgroup.json
  • 20:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2175 (T400854)', diff saved to https://phabricator.wikimedia.org/P80597 and previous config saved to /var/cache/conftool/dbconfig/20250803-201435-ladsgroup.json
  • 20:14 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2175.codfw.wmnet with reason: Maintenance
  • 20:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T400854)', diff saved to https://phabricator.wikimedia.org/P80596 and previous config saved to /var/cache/conftool/dbconfig/20250803-201412-ladsgroup.json
  • 19:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P80595 and previous config saved to /var/cache/conftool/dbconfig/20250803-195904-ladsgroup.json
  • 19:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148', diff saved to https://phabricator.wikimedia.org/P80594 and previous config saved to /var/cache/conftool/dbconfig/20250803-194357-ladsgroup.json
  • 19:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2148 (T400854)', diff saved to https://phabricator.wikimedia.org/P80593 and previous config saved to /var/cache/conftool/dbconfig/20250803-192846-ladsgroup.json
  • 19:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2148 (T400854)', diff saved to https://phabricator.wikimedia.org/P80592 and previous config saved to /var/cache/conftool/dbconfig/20250803-192551-ladsgroup.json
  • 19:25 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2148.codfw.wmnet with reason: Maintenance
  • 19:24 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 19:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T400854)', diff saved to https://phabricator.wikimedia.org/P80591 and previous config saved to /var/cache/conftool/dbconfig/20250803-192426-ladsgroup.json
  • 19:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P80590 and previous config saved to /var/cache/conftool/dbconfig/20250803-190919-ladsgroup.json
  • 18:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1259', diff saved to https://phabricator.wikimedia.org/P80589 and previous config saved to /var/cache/conftool/dbconfig/20250803-185411-ladsgroup.json
  • 18:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1259 (T400854)', diff saved to https://phabricator.wikimedia.org/P80588 and previous config saved to /var/cache/conftool/dbconfig/20250803-183904-ladsgroup.json
  • 18:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1259 (T400854)', diff saved to https://phabricator.wikimedia.org/P80587 and previous config saved to /var/cache/conftool/dbconfig/20250803-183624-ladsgroup.json
  • 18:36 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1259.eqiad.wmnet with reason: Maintenance
  • 18:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T400854)', diff saved to https://phabricator.wikimedia.org/P80586 and previous config saved to /var/cache/conftool/dbconfig/20250803-183601-ladsgroup.json
  • 18:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to https://phabricator.wikimedia.org/P80585 and previous config saved to /var/cache/conftool/dbconfig/20250803-182054-ladsgroup.json
  • 18:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1254', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20250803-180541-ladsgroup.json
  • 17:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1254 (T400854)', diff saved to https://phabricator.wikimedia.org/P80583 and previous config saved to /var/cache/conftool/dbconfig/20250803-175034-ladsgroup.json
  • 17:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1254 (T400854)', diff saved to https://phabricator.wikimedia.org/P80582 and previous config saved to /var/cache/conftool/dbconfig/20250803-174354-ladsgroup.json
  • 17:43 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1254.eqiad.wmnet with reason: Maintenance
  • 17:42 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1239.eqiad.wmnet with reason: Maintenance
  • 17:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T400854)', diff saved to https://phabricator.wikimedia.org/P80581 and previous config saved to /var/cache/conftool/dbconfig/20250803-174235-ladsgroup.json
  • 17:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P80580 and previous config saved to /var/cache/conftool/dbconfig/20250803-172727-ladsgroup.json
  • 17:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233', diff saved to https://phabricator.wikimedia.org/P80579 and previous config saved to /var/cache/conftool/dbconfig/20250803-171218-ladsgroup.json
  • 16:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1233 (T400854)', diff saved to https://phabricator.wikimedia.org/P80578 and previous config saved to /var/cache/conftool/dbconfig/20250803-165710-ladsgroup.json
  • 16:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1233 (T400854)', diff saved to https://phabricator.wikimedia.org/P80577 and previous config saved to /var/cache/conftool/dbconfig/20250803-165432-ladsgroup.json
  • 16:54 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1233.eqiad.wmnet with reason: Maintenance
  • 16:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T400854)', diff saved to https://phabricator.wikimedia.org/P80576 and previous config saved to /var/cache/conftool/dbconfig/20250803-165409-ladsgroup.json
  • 16:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P80575 and previous config saved to /var/cache/conftool/dbconfig/20250803-163901-ladsgroup.json
  • 16:23 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229', diff saved to https://phabricator.wikimedia.org/P80574 and previous config saved to /var/cache/conftool/dbconfig/20250803-162354-ladsgroup.json
  • 16:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1229 (T400854)', diff saved to https://phabricator.wikimedia.org/P80573 and previous config saved to /var/cache/conftool/dbconfig/20250803-160846-ladsgroup.json
  • 16:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1229 (T400854)', diff saved to https://phabricator.wikimedia.org/P80572 and previous config saved to /var/cache/conftool/dbconfig/20250803-160616-ladsgroup.json
  • 16:06 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1229.eqiad.wmnet with reason: Maintenance
  • 16:05 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1225.eqiad.wmnet with reason: Maintenance
  • 16:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T400854)', diff saved to https://phabricator.wikimedia.org/P80571 and previous config saved to /var/cache/conftool/dbconfig/20250803-160455-ladsgroup.json
  • 15:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P80570 and previous config saved to /var/cache/conftool/dbconfig/20250803-154947-ladsgroup.json
  • 15:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197', diff saved to https://phabricator.wikimedia.org/P80569 and previous config saved to /var/cache/conftool/dbconfig/20250803-153439-ladsgroup.json
  • 15:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1197 (T400854)', diff saved to https://phabricator.wikimedia.org/P80568 and previous config saved to /var/cache/conftool/dbconfig/20250803-151932-ladsgroup.json
  • 15:17 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1197 (T400854)', diff saved to https://phabricator.wikimedia.org/P80567 and previous config saved to /var/cache/conftool/dbconfig/20250803-151702-ladsgroup.json
  • 15:16 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1197.eqiad.wmnet with reason: Maintenance
  • 15:16 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T400854)', diff saved to https://phabricator.wikimedia.org/P80566 and previous config saved to /var/cache/conftool/dbconfig/20250803-151639-ladsgroup.json
  • 15:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P80565 and previous config saved to /var/cache/conftool/dbconfig/20250803-150132-ladsgroup.json
  • 14:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188', diff saved to https://phabricator.wikimedia.org/P80564 and previous config saved to /var/cache/conftool/dbconfig/20250803-144624-ladsgroup.json
  • 14:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1188 (T400854)', diff saved to https://phabricator.wikimedia.org/P80563 and previous config saved to /var/cache/conftool/dbconfig/20250803-143117-ladsgroup.json
  • 14:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1188 (T400854)', diff saved to https://phabricator.wikimedia.org/P80562 and previous config saved to /var/cache/conftool/dbconfig/20250803-142847-ladsgroup.json
  • 14:28 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1188.eqiad.wmnet with reason: Maintenance
  • 14:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T400854)', diff saved to https://phabricator.wikimedia.org/P80561 and previous config saved to /var/cache/conftool/dbconfig/20250803-142824-ladsgroup.json
  • 14:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P80560 and previous config saved to /var/cache/conftool/dbconfig/20250803-141316-ladsgroup.json
  • 13:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182', diff saved to https://phabricator.wikimedia.org/P80559 and previous config saved to /var/cache/conftool/dbconfig/20250803-135808-ladsgroup.json
  • 13:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1182 (T400854)', diff saved to https://phabricator.wikimedia.org/P80558 and previous config saved to /var/cache/conftool/dbconfig/20250803-134300-ladsgroup.json
  • 13:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1182 (T400854)', diff saved to https://phabricator.wikimedia.org/P80557 and previous config saved to /var/cache/conftool/dbconfig/20250803-134019-ladsgroup.json
  • 13:40 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1182.eqiad.wmnet with reason: Maintenance
  • 13:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T400854)', diff saved to https://phabricator.wikimedia.org/P80556 and previous config saved to /var/cache/conftool/dbconfig/20250803-134008-ladsgroup.json
  • 13:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P80555 and previous config saved to /var/cache/conftool/dbconfig/20250803-132500-ladsgroup.json
  • 13:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162', diff saved to https://phabricator.wikimedia.org/P80554 and previous config saved to /var/cache/conftool/dbconfig/20250803-130952-ladsgroup.json
  • 12:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1162 (T400854)', diff saved to https://phabricator.wikimedia.org/P80553 and previous config saved to /var/cache/conftool/dbconfig/20250803-125444-ladsgroup.json
  • 12:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1162 (T400854)', diff saved to https://phabricator.wikimedia.org/P80552 and previous config saved to /var/cache/conftool/dbconfig/20250803-125214-ladsgroup.json
  • 12:52 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1162.eqiad.wmnet with reason: Maintenance
  • 12:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T400854)', diff saved to https://phabricator.wikimedia.org/P80551 and previous config saved to /var/cache/conftool/dbconfig/20250803-125152-ladsgroup.json
  • 12:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P80550 and previous config saved to /var/cache/conftool/dbconfig/20250803-123644-ladsgroup.json
  • 12:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156', diff saved to https://phabricator.wikimedia.org/P80549 and previous config saved to /var/cache/conftool/dbconfig/20250803-122136-ladsgroup.json
  • 12:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1156 (T400854)', diff saved to https://phabricator.wikimedia.org/P80548 and previous config saved to /var/cache/conftool/dbconfig/20250803-120629-ladsgroup.json
  • 12:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1156 (T400854)', diff saved to https://phabricator.wikimedia.org/P80547 and previous config saved to /var/cache/conftool/dbconfig/20250803-120346-ladsgroup.json
  • 12:03 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
  • 12:03 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1156.eqiad.wmnet with reason: Maintenance
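
The dbctl commit entries above share a regular shape: time, user@host, a quoted commit message (sometimes carrying a task ID such as T400854), a diff location, and a saved-config path. A minimal Python sketch for pulling those fields out of one entry follows. The pattern is inferred only from the entries in this log and is not a documented format; the names `DbctlCommit` and `parse_dbctl_commit` are hypothetical and are not part of dbctl or conftool.

```python
import re
from typing import NamedTuple, Optional


class DbctlCommit(NamedTuple):
    """Fields of one 'dbctl commit' log entry (hypothetical structure)."""
    time: str
    user: str
    host: str
    message: str
    task: Optional[str]   # e.g. "T400854", or None if the message has no task
    paste: Optional[str]  # e.g. "P80547", or None if the diff upload failed


# Assumption: this pattern is inferred from the log lines above,
# not taken from any documented dbctl output format.
_DBCTL = re.compile(
    r"(?P<time>\d{2}:\d{2}) (?P<user>[^@\s]+)@(?P<host>[^:\s]+): "
    r"dbctl commit \(dc=all\): '(?P<message>[^']*)', "
    r"diff saved to (?P<diff>.*?) and previous config saved to (?P<backup>\S+)"
)
_TASK = re.compile(r"\((T\d+)\)")
_PASTE = re.compile(r"https://phabricator\.wikimedia\.org/(P\d+)$")


def parse_dbctl_commit(line: str) -> Optional[DbctlCommit]:
    """Return the parsed entry, or None if the line is not a dbctl commit."""
    m = _DBCTL.search(line)
    if m is None:
        return None
    task = _TASK.search(m["message"])
    paste = _PASTE.search(m["diff"])
    return DbctlCommit(
        time=m["time"],
        user=m["user"],
        host=m["host"],
        message=m["message"],
        task=task.group(1) if task else None,
        paste=paste.group(1) if paste else None,
    )
```

Entries whose diff upload failed (the 18:05 entry on 2025-08-03 reads "Unable to send diff to phaste") still parse, with `paste` set to `None`. Entries that are not dbctl commits, such as the cookbook downtime lines, return `None`.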

2025-08-02

  • 21:49 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2239.codfw.wmnet with reason: Maintenance
  • 21:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T400854)', diff saved to https://phabricator.wikimedia.org/P80546 and previous config saved to /var/cache/conftool/dbconfig/20250802-214929-ladsgroup.json
  • 21:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P80544 and previous config saved to /var/cache/conftool/dbconfig/20250802-213421-ladsgroup.json
  • 21:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2227', diff saved to https://phabricator.wikimedia.org/P80543 and previous config saved to /var/cache/conftool/dbconfig/20250802-211914-ladsgroup.json
  • 21:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2227 (T400854)', diff saved to https://phabricator.wikimedia.org/P80542 and previous config saved to /var/cache/conftool/dbconfig/20250802-210406-ladsgroup.json
  • 20:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2227 (T400854)', diff saved to https://phabricator.wikimedia.org/P80541 and previous config saved to /var/cache/conftool/dbconfig/20250802-204951-ladsgroup.json
  • 20:49 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2227.codfw.wmnet with reason: Maintenance
  • 20:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T400854)', diff saved to https://phabricator.wikimedia.org/P80540 and previous config saved to /var/cache/conftool/dbconfig/20250802-204928-ladsgroup.json
  • 20:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P80539 and previous config saved to /var/cache/conftool/dbconfig/20250802-203421-ladsgroup.json
  • 20:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209', diff saved to https://phabricator.wikimedia.org/P80538 and previous config saved to /var/cache/conftool/dbconfig/20250802-201913-ladsgroup.json
  • 20:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2209 (T400854)', diff saved to https://phabricator.wikimedia.org/P80537 and previous config saved to /var/cache/conftool/dbconfig/20250802-200405-ladsgroup.json
  • 19:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2209 (T400854)', diff saved to https://phabricator.wikimedia.org/P80536 and previous config saved to /var/cache/conftool/dbconfig/20250802-194953-ladsgroup.json
  • 19:49 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2209.codfw.wmnet with reason: Maintenance
  • 19:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T400854)', diff saved to https://phabricator.wikimedia.org/P80535 and previous config saved to /var/cache/conftool/dbconfig/20250802-194931-ladsgroup.json
  • 19:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P80534 and previous config saved to /var/cache/conftool/dbconfig/20250802-193423-ladsgroup.json
  • 19:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194', diff saved to https://phabricator.wikimedia.org/P80533 and previous config saved to /var/cache/conftool/dbconfig/20250802-191915-ladsgroup.json
  • 19:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2194 (T400854)', diff saved to https://phabricator.wikimedia.org/P80532 and previous config saved to /var/cache/conftool/dbconfig/20250802-190408-ladsgroup.json
  • 18:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2194 (T400854)', diff saved to https://phabricator.wikimedia.org/P80531 and previous config saved to /var/cache/conftool/dbconfig/20250802-184952-ladsgroup.json
  • 18:49 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2194.codfw.wmnet with reason: Maintenance
  • 18:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T400854)', diff saved to https://phabricator.wikimedia.org/P80530 and previous config saved to /var/cache/conftool/dbconfig/20250802-184929-ladsgroup.json
  • 18:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P80529 and previous config saved to /var/cache/conftool/dbconfig/20250802-183421-ladsgroup.json
  • 18:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190', diff saved to https://phabricator.wikimedia.org/P80528 and previous config saved to /var/cache/conftool/dbconfig/20250802-181914-ladsgroup.json
  • 18:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2190 (T400854)', diff saved to https://phabricator.wikimedia.org/P80527 and previous config saved to /var/cache/conftool/dbconfig/20250802-180406-ladsgroup.json
  • 17:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2190 (T400854)', diff saved to https://phabricator.wikimedia.org/P80526 and previous config saved to /var/cache/conftool/dbconfig/20250802-174952-ladsgroup.json
  • 17:49 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2190.codfw.wmnet with reason: Maintenance
  • 17:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T400854)', diff saved to https://phabricator.wikimedia.org/P80525 and previous config saved to /var/cache/conftool/dbconfig/20250802-174929-ladsgroup.json
  • 17:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P80524 and previous config saved to /var/cache/conftool/dbconfig/20250802-173422-ladsgroup.json
  • 17:33 hnowlan: clean up some misbehaving thumbor pods
  • 17:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177', diff saved to https://phabricator.wikimedia.org/P80523 and previous config saved to /var/cache/conftool/dbconfig/20250802-171914-ladsgroup.json
  • 17:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2177 (T400854)', diff saved to https://phabricator.wikimedia.org/P80522 and previous config saved to /var/cache/conftool/dbconfig/20250802-170407-ladsgroup.json
  • 16:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2177 (T400854)', diff saved to https://phabricator.wikimedia.org/P80521 and previous config saved to /var/cache/conftool/dbconfig/20250802-165012-ladsgroup.json
  • 16:50 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2177.codfw.wmnet with reason: Maintenance
  • 16:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T400854)', diff saved to https://phabricator.wikimedia.org/P80520 and previous config saved to /var/cache/conftool/dbconfig/20250802-164949-ladsgroup.json
  • 16:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P80519 and previous config saved to /var/cache/conftool/dbconfig/20250802-163441-ladsgroup.json
  • 16:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156', diff saved to https://phabricator.wikimedia.org/P80518 and previous config saved to /var/cache/conftool/dbconfig/20250802-161933-ladsgroup.json
  • 16:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2156 (T400854)', diff saved to https://phabricator.wikimedia.org/P80517 and previous config saved to /var/cache/conftool/dbconfig/20250802-160426-ladsgroup.json
  • 15:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2156 (T400854)', diff saved to https://phabricator.wikimedia.org/P80516 and previous config saved to /var/cache/conftool/dbconfig/20250802-155032-ladsgroup.json
  • 15:50 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2156.codfw.wmnet with reason: Maintenance
  • 15:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T400854)', diff saved to https://phabricator.wikimedia.org/P80515 and previous config saved to /var/cache/conftool/dbconfig/20250802-155008-ladsgroup.json
  • 15:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P80514 and previous config saved to /var/cache/conftool/dbconfig/20250802-153501-ladsgroup.json
  • 15:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149', diff saved to https://phabricator.wikimedia.org/P80513 and previous config saved to /var/cache/conftool/dbconfig/20250802-151953-ladsgroup.json
  • 15:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2149 (T400854)', diff saved to https://phabricator.wikimedia.org/P80512 and previous config saved to /var/cache/conftool/dbconfig/20250802-150446-ladsgroup.json
  • 14:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2149 (T400854)', diff saved to https://phabricator.wikimedia.org/P80511 and previous config saved to /var/cache/conftool/dbconfig/20250802-145049-ladsgroup.json
  • 14:50 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2149.codfw.wmnet with reason: Maintenance
  • 14:47 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
  • 14:43 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1240.eqiad.wmnet with reason: Maintenance
  • 14:43 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T400854)', diff saved to https://phabricator.wikimedia.org/P80510 and previous config saved to /var/cache/conftool/dbconfig/20250802-144311-ladsgroup.json
  • 14:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P80509 and previous config saved to /var/cache/conftool/dbconfig/20250802-142803-ladsgroup.json
  • 14:12 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212', diff saved to https://phabricator.wikimedia.org/P80508 and previous config saved to /var/cache/conftool/dbconfig/20250802-141256-ladsgroup.json
  • 13:57 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1212 (T400854)', diff saved to https://phabricator.wikimedia.org/P80507 and previous config saved to /var/cache/conftool/dbconfig/20250802-135748-ladsgroup.json
  • 13:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1212 (T400854)', diff saved to https://phabricator.wikimedia.org/P80506 and previous config saved to /var/cache/conftool/dbconfig/20250802-135234-ladsgroup.json
  • 13:52 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 13:52 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1212.eqiad.wmnet with reason: Maintenance
  • 13:51 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T400854)', diff saved to https://phabricator.wikimedia.org/P80505 and previous config saved to /var/cache/conftool/dbconfig/20250802-135152-ladsgroup.json
  • 13:36 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P80504 and previous config saved to /var/cache/conftool/dbconfig/20250802-133645-ladsgroup.json
  • 13:21 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198', diff saved to https://phabricator.wikimedia.org/P80503 and previous config saved to /var/cache/conftool/dbconfig/20250802-132137-ladsgroup.json
  • 13:06 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1198 (T400854)', diff saved to https://phabricator.wikimedia.org/P80502 and previous config saved to /var/cache/conftool/dbconfig/20250802-130629-ladsgroup.json
  • 13:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1198 (T400854)', diff saved to https://phabricator.wikimedia.org/P80501 and previous config saved to /var/cache/conftool/dbconfig/20250802-130143-ladsgroup.json
  • 13:01 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1198.eqiad.wmnet with reason: Maintenance
  • 13:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T400854)', diff saved to https://phabricator.wikimedia.org/P80500 and previous config saved to /var/cache/conftool/dbconfig/20250802-130120-ladsgroup.json
  • 12:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P80499 and previous config saved to /var/cache/conftool/dbconfig/20250802-124612-ladsgroup.json
  • 12:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189', diff saved to https://phabricator.wikimedia.org/P80498 and previous config saved to /var/cache/conftool/dbconfig/20250802-123105-ladsgroup.json
  • 12:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1189 (T400854)', diff saved to https://phabricator.wikimedia.org/P80497 and previous config saved to /var/cache/conftool/dbconfig/20250802-121557-ladsgroup.json
  • 12:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1189 (T400854)', diff saved to https://phabricator.wikimedia.org/P80496 and previous config saved to /var/cache/conftool/dbconfig/20250802-121112-ladsgroup.json
  • 12:11 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1189.eqiad.wmnet with reason: Maintenance
  • 12:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T400854)', diff saved to https://phabricator.wikimedia.org/P80495 and previous config saved to /var/cache/conftool/dbconfig/20250802-121050-ladsgroup.json
  • 11:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P80494 and previous config saved to /var/cache/conftool/dbconfig/20250802-115542-ladsgroup.json
  • 11:40 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175', diff saved to https://phabricator.wikimedia.org/P80493 and previous config saved to /var/cache/conftool/dbconfig/20250802-114035-ladsgroup.json
  • 11:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1175 (T400854)', diff saved to https://phabricator.wikimedia.org/P80492 and previous config saved to /var/cache/conftool/dbconfig/20250802-112527-ladsgroup.json
  • 11:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1175 (T400854)', diff saved to https://phabricator.wikimedia.org/P80491 and previous config saved to /var/cache/conftool/dbconfig/20250802-112037-ladsgroup.json
  • 11:20 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1175.eqiad.wmnet with reason: Maintenance
  • 11:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T400854)', diff saved to https://phabricator.wikimedia.org/P80490 and previous config saved to /var/cache/conftool/dbconfig/20250802-112015-ladsgroup.json
  • 11:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P80489 and previous config saved to /var/cache/conftool/dbconfig/20250802-110507-ladsgroup.json
  • 10:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166', diff saved to https://phabricator.wikimedia.org/P80488 and previous config saved to /var/cache/conftool/dbconfig/20250802-104959-ladsgroup.json
  • 10:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1166 (T400854)', diff saved to https://phabricator.wikimedia.org/P80487 and previous config saved to /var/cache/conftool/dbconfig/20250802-103452-ladsgroup.json
  • 10:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1166 (T400854)', diff saved to https://phabricator.wikimedia.org/P80486 and previous config saved to /var/cache/conftool/dbconfig/20250802-103001-ladsgroup.json
  • 10:29 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1166.eqiad.wmnet with reason: Maintenance
  • 10:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T400854)', diff saved to https://phabricator.wikimedia.org/P80485 and previous config saved to /var/cache/conftool/dbconfig/20250802-102938-ladsgroup.json
  • 10:14 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P80484 and previous config saved to /var/cache/conftool/dbconfig/20250802-101431-ladsgroup.json
  • 09:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157', diff saved to https://phabricator.wikimedia.org/P80483 and previous config saved to /var/cache/conftool/dbconfig/20250802-095923-ladsgroup.json
  • 09:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1157 (T400854)', diff saved to https://phabricator.wikimedia.org/P80482 and previous config saved to /var/cache/conftool/dbconfig/20250802-094416-ladsgroup.json
  • 09:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1157 (T400854)', diff saved to https://phabricator.wikimedia.org/P80481 and previous config saved to /var/cache/conftool/dbconfig/20250802-093924-ladsgroup.json
  • 09:39 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1157.eqiad.wmnet with reason: Maintenance
  • 09:36 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1150.eqiad.wmnet with reason: Maintenance
  • 01:11 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 10m 52s)
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image

2025-08-01

  • 23:57 ryankemper@cumin2002: END (PASS) - Cookbook sre.wdqs.data-reload (exit_code=0) reloading wikidata_main on wdqs1022.eqiad.wmnet from DumpsSource.HDFS (hdfs:///wmf/data/discovery/wikidata/munged_n3_dump/wikidata/main/20250714/ using stat1009.eqiad.wmnet)
  • 21:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T400854)', diff saved to https://phabricator.wikimedia.org/P80480 and previous config saved to /var/cache/conftool/dbconfig/20250801-213802-ladsgroup.json
  • 21:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P80479 and previous config saved to /var/cache/conftool/dbconfig/20250801-212254-ladsgroup.json
  • 21:07 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P80478 and previous config saved to /var/cache/conftool/dbconfig/20250801-210746-ladsgroup.json
  • 20:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2228 (T400854)', diff saved to https://phabricator.wikimedia.org/P80477 and previous config saved to /var/cache/conftool/dbconfig/20250801-205239-ladsgroup.json
  • 20:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2228 (T400854)', diff saved to https://phabricator.wikimedia.org/P80476 and previous config saved to /var/cache/conftool/dbconfig/20250801-204903-ladsgroup.json
  • 20:48 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2228.codfw.wmnet with reason: Maintenance
  • 20:48 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T400854)', diff saved to https://phabricator.wikimedia.org/P80475 and previous config saved to /var/cache/conftool/dbconfig/20250801-204840-ladsgroup.json
  • 20:33 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P80474 and previous config saved to /var/cache/conftool/dbconfig/20250801-203332-ladsgroup.json
  • 20:18 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P80473 and previous config saved to /var/cache/conftool/dbconfig/20250801-201825-ladsgroup.json
  • 20:03 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2223 (T400854)', diff saved to https://phabricator.wikimedia.org/P80472 and previous config saved to /var/cache/conftool/dbconfig/20250801-200317-ladsgroup.json
  • 19:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2223 (T400854)', diff saved to https://phabricator.wikimedia.org/P80471 and previous config saved to /var/cache/conftool/dbconfig/20250801-195940-ladsgroup.json
  • 19:59 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2223.codfw.wmnet with reason: Maintenance
  • 19:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T400854)', diff saved to https://phabricator.wikimedia.org/P80470 and previous config saved to /var/cache/conftool/dbconfig/20250801-195917-ladsgroup.json
  • 19:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P80468 and previous config saved to /var/cache/conftool/dbconfig/20250801-194409-ladsgroup.json
  • 19:29 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P80467 and previous config saved to /var/cache/conftool/dbconfig/20250801-192901-ladsgroup.json
  • 19:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2211 (T400854)', diff saved to https://phabricator.wikimedia.org/P80466 and previous config saved to /var/cache/conftool/dbconfig/20250801-191354-ladsgroup.json
  • 19:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2211 (T400854)', diff saved to https://phabricator.wikimedia.org/P80465 and previous config saved to /var/cache/conftool/dbconfig/20250801-191016-ladsgroup.json
  • 19:10 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2211.codfw.wmnet with reason: Maintenance
  • 19:08 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2201.codfw.wmnet with reason: Maintenance
  • 19:08 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T400854)', diff saved to https://phabricator.wikimedia.org/P80464 and previous config saved to /var/cache/conftool/dbconfig/20250801-190817-ladsgroup.json
  • 18:53 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P80463 and previous config saved to /var/cache/conftool/dbconfig/20250801-185310-ladsgroup.json
  • 18:38 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P80462 and previous config saved to /var/cache/conftool/dbconfig/20250801-183802-ladsgroup.json
  • 18:22 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2192 (T400854)', diff saved to https://phabricator.wikimedia.org/P80461 and previous config saved to /var/cache/conftool/dbconfig/20250801-182254-ladsgroup.json
  • 18:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2192 (T400854)', diff saved to https://phabricator.wikimedia.org/P80460 and previous config saved to /var/cache/conftool/dbconfig/20250801-182017-ladsgroup.json
  • 18:20 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2192.codfw.wmnet with reason: Maintenance
  • 18:19 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T400854)', diff saved to https://phabricator.wikimedia.org/P80459 and previous config saved to /var/cache/conftool/dbconfig/20250801-181954-ladsgroup.json
  • 18:16 cjming@deploy1003: Finished scap sync-world: Backport for Revert^2 "MetricsPlatform: Disable synchronous configs fetching" (duration: 09m 13s)
  • 18:10 cjming@deploy1003: cjming: Continuing with sync
  • 18:09 cjming@deploy1003: cjming: Backport for Revert^2 "MetricsPlatform: Disable synchronous configs fetching" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 18:07 cjming@deploy1003: Started scap sync-world: Backport for Revert^2 "MetricsPlatform: Disable synchronous configs fetching"
  • 18:04 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P80458 and previous config saved to /var/cache/conftool/dbconfig/20250801-180447-ladsgroup.json
  • 17:49 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P80457 and previous config saved to /var/cache/conftool/dbconfig/20250801-174939-ladsgroup.json
  • 17:35 cjming@deploy1003: Finished scap sync-world: Backport for Enable AA test on all wikis (T399486) (duration: 08m 06s)
  • 17:34 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2178 (T400854)', diff saved to https://phabricator.wikimedia.org/P80456 and previous config saved to /var/cache/conftool/dbconfig/20250801-173431-ladsgroup.json
  • 17:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2178 (T400854)', diff saved to https://phabricator.wikimedia.org/P80455 and previous config saved to /var/cache/conftool/dbconfig/20250801-173056-ladsgroup.json
  • 17:30 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2178.codfw.wmnet with reason: Maintenance
  • 17:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T400854)', diff saved to https://phabricator.wikimedia.org/P80454 and previous config saved to /var/cache/conftool/dbconfig/20250801-173033-ladsgroup.json
  • 17:30 cjming@deploy1003: ksarabia, cjming: Continuing with sync
  • 17:29 cjming@deploy1003: ksarabia, cjming: Backport for Enable AA test on all wikis (T399486) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 17:27 cjming@deploy1003: Started scap sync-world: Backport for Enable AA test on all wikis (T399486)
  • 17:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P80453 and previous config saved to /var/cache/conftool/dbconfig/20250801-171525-ladsgroup.json
  • 17:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P80452 and previous config saved to /var/cache/conftool/dbconfig/20250801-170018-ladsgroup.json
  • 16:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2171 (T400854)', diff saved to https://phabricator.wikimedia.org/P80451 and previous config saved to /var/cache/conftool/dbconfig/20250801-164510-ladsgroup.json
  • 16:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2171 (T400854)', diff saved to https://phabricator.wikimedia.org/P80450 and previous config saved to /var/cache/conftool/dbconfig/20250801-164134-ladsgroup.json
  • 16:41 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2171.codfw.wmnet with reason: Maintenance
  • 16:41 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T400854)', diff saved to https://phabricator.wikimedia.org/P80449 and previous config saved to /var/cache/conftool/dbconfig/20250801-164111-ladsgroup.json
  • 16:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P80448 and previous config saved to /var/cache/conftool/dbconfig/20250801-162603-ladsgroup.json
  • 16:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P80447 and previous config saved to /var/cache/conftool/dbconfig/20250801-161056-ladsgroup.json
  • 16:08 jly@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:08 jly@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:08 jly@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:08 jly@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:08 jly@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
  • 16:07 jly@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
  • 16:07 jly@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
  • 16:07 jly@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
  • 16:07 jly@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
  • 16:07 jly@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
  • 15:55 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2157 (T400854)', diff saved to https://phabricator.wikimedia.org/P80446 and previous config saved to /var/cache/conftool/dbconfig/20250801-155548-ladsgroup.json
  • 15:52 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2157 (T400854)', diff saved to https://phabricator.wikimedia.org/P80445 and previous config saved to /var/cache/conftool/dbconfig/20250801-155212-ladsgroup.json
  • 15:52 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2157.codfw.wmnet with reason: Maintenance
  • 15:51 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
  • 15:50 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1245.eqiad.wmnet with reason: Maintenance
  • 15:50 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T400854)', diff saved to https://phabricator.wikimedia.org/P80444 and previous config saved to /var/cache/conftool/dbconfig/20250801-155024-ladsgroup.json
  • 15:35 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P80442 and previous config saved to /var/cache/conftool/dbconfig/20250801-153516-ladsgroup.json
  • 15:30 jiji@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
  • 15:22 jiji@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: apply
  • 15:20 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230', diff saved to https://phabricator.wikimedia.org/P80441 and previous config saved to /var/cache/conftool/dbconfig/20250801-152009-ladsgroup.json
  • 15:13 jiji@deploy1003: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
  • 15:12 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.provision (exit_code=99) for device lsw1-e2-codfw.mgmt.codfw.wmnet
  • 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 15:12 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for lsw1-e2-codfw - ayounsi@cumin1003"
  • 15:12 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove management record for lsw1-e2-codfw - ayounsi@cumin1003"
  • 15:09 jiji@deploy1003: helmfile [codfw] START helmfile.d/services/thumbor: apply
  • 15:08 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 15:05 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1230 (T400854)', diff saved to https://phabricator.wikimedia.org/P80440 and previous config saved to /var/cache/conftool/dbconfig/20250801-150501-ladsgroup.json
  • 15:02 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1230 (T400854)', diff saved to https://phabricator.wikimedia.org/P80439 and previous config saved to /var/cache/conftool/dbconfig/20250801-150228-ladsgroup.json
  • 15:02 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1230.eqiad.wmnet with reason: Maintenance
  • 15:01 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1216.eqiad.wmnet with reason: Maintenance
  • 15:01 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T400854)', diff saved to https://phabricator.wikimedia.org/P80438 and previous config saved to /var/cache/conftool/dbconfig/20250801-150119-ladsgroup.json
  • 14:53 cjming@deploy1003: Finished scap sync-world: Backport for Revert "MetricsPlatform: Disable synchronous configs fetching" (duration: 08m 50s)
  • 14:48 cjming@deploy1003: cjming: Continuing with sync
  • 14:46 cjming@deploy1003: cjming: Backport for Revert "MetricsPlatform: Disable synchronous configs fetching" synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 14:46 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P80437 and previous config saved to /var/cache/conftool/dbconfig/20250801-144611-ladsgroup.json
  • 14:44 cjming@deploy1003: Started scap sync-world: Backport for Revert "MetricsPlatform: Disable synchronous configs fetching"
  • 14:31 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P80436 and previous config saved to /var/cache/conftool/dbconfig/20250801-143104-ladsgroup.json
  • 14:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1207 (T400854)', diff saved to https://phabricator.wikimedia.org/P80435 and previous config saved to /var/cache/conftool/dbconfig/20250801-141553-ladsgroup.json
  • 14:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1207 (T400854)', diff saved to https://phabricator.wikimedia.org/P80434 and previous config saved to /var/cache/conftool/dbconfig/20250801-141320-ladsgroup.json
  • 14:13 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1207.eqiad.wmnet with reason: Maintenance
  • 14:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T400854)', diff saved to https://phabricator.wikimedia.org/P80433 and previous config saved to /var/cache/conftool/dbconfig/20250801-141308-ladsgroup.json
  • 14:05 elukey: upgrade redis-server and tools package on idm nodes for security upgrades
  • 13:58 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P80432 and previous config saved to /var/cache/conftool/dbconfig/20250801-135800-ladsgroup.json
  • 13:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P80431 and previous config saved to /var/cache/conftool/dbconfig/20250801-134253-ladsgroup.json
  • 13:37 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
  • 13:37 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-e2-codfw - ayounsi@cumin1003"
  • 13:37 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add management record for lsw1-e2-codfw - ayounsi@cumin1003"
  • 13:33 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
  • 13:33 ayounsi@cumin1003: START - Cookbook sre.network.provision for device lsw1-e2-codfw.mgmt.codfw.wmnet
  • 13:27 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1200 (T400854)', diff saved to https://phabricator.wikimedia.org/P80430 and previous config saved to /var/cache/conftool/dbconfig/20250801-132745-ladsgroup.json
  • 13:25 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1200 (T400854)', diff saved to https://phabricator.wikimedia.org/P80429 and previous config saved to /var/cache/conftool/dbconfig/20250801-132514-ladsgroup.json
  • 13:25 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1200.eqiad.wmnet with reason: Maintenance
  • 13:24 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T400854)', diff saved to https://phabricator.wikimedia.org/P80428 and previous config saved to /var/cache/conftool/dbconfig/20250801-132451-ladsgroup.json
  • 13:09 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P80427 and previous config saved to /var/cache/conftool/dbconfig/20250801-130943-ladsgroup.json
  • 12:58 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 274685
  • 12:57 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 274685
  • 12:57 Amir1: re-running recountCategories.php on all wikis except s4 and s1 (T400987)
  • 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 263252
  • 12:57 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 263252
  • 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 37662
  • 12:56 jiji@deploy1003: helmfile [staging] DONE helmfile.d/services/thumbor: apply
  • 12:56 jiji@deploy1003: helmfile [staging] START helmfile.d/services/thumbor: apply
  • 12:56 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 37662
  • 12:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 5400
  • 12:54 ladsgroup@deploy1003: Finished scap sync-world: Backport for recountCategories: Avoid escpaing column name (T400987) (duration: 08m 36s)
  • 12:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P80426 and previous config saved to /var/cache/conftool/dbconfig/20250801-125436-ladsgroup.json
  • 12:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 5400
  • 12:49 ladsgroup@deploy1003: ladsgroup: Continuing with sync
  • 12:48 ladsgroup@deploy1003: ladsgroup: Backport for recountCategories: Avoid escaping column name (T400987) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 12:47 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
  • 12:46 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
  • 12:46 ladsgroup@deploy1003: Started scap sync-world: Backport for recountCategories: Avoid escaping column name (T400987)
  • 12:39 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1185 (T400854)', diff saved to https://phabricator.wikimedia.org/P80424 and previous config saved to /var/cache/conftool/dbconfig/20250801-123928-ladsgroup.json
  • 12:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1185 (T400854)', diff saved to https://phabricator.wikimedia.org/P80423 and previous config saved to /var/cache/conftool/dbconfig/20250801-123057-ladsgroup.json
  • 12:30 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1185.eqiad.wmnet with reason: Maintenance
  • 12:30 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T400854)', diff saved to https://phabricator.wikimedia.org/P80422 and previous config saved to /var/cache/conftool/dbconfig/20250801-123034-ladsgroup.json
  • 12:15 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P80421 and previous config saved to /var/cache/conftool/dbconfig/20250801-121526-ladsgroup.json
  • 12:00 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P80420 and previous config saved to /var/cache/conftool/dbconfig/20250801-120019-ladsgroup.json
  • 11:45 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1161 (T400854)', diff saved to https://phabricator.wikimedia.org/P80419 and previous config saved to /var/cache/conftool/dbconfig/20250801-114511-ladsgroup.json
  • 11:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1161 (T400854)', diff saved to https://phabricator.wikimedia.org/P80418 and previous config saved to /var/cache/conftool/dbconfig/20250801-114238-ladsgroup.json
  • 11:42 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 12:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
  • 11:42 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1161.eqiad.wmnet with reason: Maintenance
  • 11:42 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T400854)', diff saved to https://phabricator.wikimedia.org/P80417 and previous config saved to /var/cache/conftool/dbconfig/20250801-114155-ladsgroup.json
  • 11:26 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P80415 and previous config saved to /var/cache/conftool/dbconfig/20250801-112647-ladsgroup.json
  • 11:24 hnowlan@deploy1003: helmfile [eqiad] DONE helmfile.d/services/thumbor: sync
  • 11:18 hnowlan@deploy1003: helmfile [eqiad] START helmfile.d/services/thumbor: sync
  • 11:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P80414 and previous config saved to /var/cache/conftool/dbconfig/20250801-111139-ladsgroup.json
  • 10:56 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db1159 (T400854)', diff saved to https://phabricator.wikimedia.org/P80413 and previous config saved to /var/cache/conftool/dbconfig/20250801-105631-ladsgroup.json
  • 10:54 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db1159 (T400854)', diff saved to https://phabricator.wikimedia.org/P80412 and previous config saved to /var/cache/conftool/dbconfig/20250801-105400-ladsgroup.json
  • 10:53 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db1159.eqiad.wmnet with reason: Maintenance
  • 10:18 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 10:14 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 10:01 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 09:52 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 09:44 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bookworm
  • 09:38 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bookworm
  • 09:38 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bullseye
  • 09:32 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bullseye
  • 09:02 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp2043.codfw.wmnet with OS bullseye
  • 08:55 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS bullseye
  • 08:06 bwojtowicz@deploy1003: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main'.
  • 08:01 bwojtowicz@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main'.
  • 07:55 bwojtowicz@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main'.
  • 07:42 brouberol@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/mediawiki-dumps-legacy: apply
  • 07:41 brouberol@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/services/mediawiki-dumps-legacy: apply
  • 06:04 tstarling@deploy1003: Finished scap sync-world: Backport for Enable sitemaps API (T400023) (duration: 49m 59s)
  • 05:59 tstarling@deploy1003: tstarling: Continuing with sync
  • 05:16 tstarling@deploy1003: tstarling: Backport for Enable sitemaps API (T400023) synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
  • 05:14 tstarling@deploy1003: Started scap sync-world: Backport for Enable sitemaps API (T400023)
  • 01:11 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 10m 44s)
  • 01:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
  • 00:59 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T400854)', diff saved to https://phabricator.wikimedia.org/P80408 and previous config saved to /var/cache/conftool/dbconfig/20250801-005907-ladsgroup.json
  • 00:51 eileen: * civicrm upgraded from 82a5306d to f202b616
  • 00:44 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P80407 and previous config saved to /var/cache/conftool/dbconfig/20250801-004359-ladsgroup.json
  • 00:28 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224', diff saved to https://phabricator.wikimedia.org/P80406 and previous config saved to /var/cache/conftool/dbconfig/20250801-002852-ladsgroup.json
  • 00:13 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2224 (T400854)', diff saved to https://phabricator.wikimedia.org/P80405 and previous config saved to /var/cache/conftool/dbconfig/20250801-001345-ladsgroup.json
  • 00:11 ladsgroup@cumin1002: dbctl commit (dc=all): 'Depooling db2224 (T400854)', diff saved to https://phabricator.wikimedia.org/P80404 and previous config saved to /var/cache/conftool/dbconfig/20250801-001119-ladsgroup.json
  • 00:11 ladsgroup@cumin1002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6:00:00 on db2224.codfw.wmnet with reason: Maintenance
  • 00:10 ladsgroup@cumin1002: dbctl commit (dc=all): 'Repooling after maintenance db2217 (T400854)', diff saved to https://phabricator.wikimedia.org/P80403 and previous config saved to /var/cache/conftool/dbconfig/20250801-001055-ladsgroup.json

Other archives

See Server Admin Log/Archives.