Marostegui (Manuel Aróstegui)
Staff Database Administrator

User Details

User Since
Sep 1 2016, 6:48 AM (466 w, 2 d)
Availability
Available
IRC Nick
marostegui
LDAP User
Marostegui
MediaWiki User
MArostegui (WMF)

TZ: UTC +1/+2

Recent Activity

Tue, Jul 29

Marostegui triaged T399633: Fix AUTO_INCREMENT values for the content_models table as Low priority.

I just quickly checked the first example you gave and:

[email protected][frwiki]> select * from content_models;
+----------+------------------------+
| model_id | model_name             |
+----------+------------------------+
|        9 | MassMessageListContent |
|        2 | Scribunto              |
|        6 | css                    |
|        3 | flow-board             |
|        5 | javascript             |
|        7 | json                   |
|       10 | sanitized-css          |
|        8 | text                   |
|        1 | wikitext               |
+----------+------------------------+
9 rows in set (0.002 sec)
Tue, Jul 29, 7:53 AM · DBA
Marostegui added a comment to T388389: Create a cookbook to depool parsercache sections.

What's pending to close?

Tue, Jul 29, 7:43 AM · DBA
Marostegui moved T254738: Create a monitoring check for event_scheduler and for installed events from In progress to Ready on the DBA board.
Tue, Jul 29, 6:54 AM · Sustainability (Incident Followup), DBA
Marostegui closed T377451: zarcillo: Take inventory of all active use cases, a subtask of T384810: MariaDB lifetime management system, as Declined.
Tue, Jul 29, 6:53 AM · Patch-For-Review, DBA
Marostegui closed T377451: zarcillo: Take inventory of all active use cases as Declined.

I am going to decline this for now as this will be taken care of by T384810

Tue, Jul 29, 6:53 AM · Data-Persistence-Automations, DBA
Marostegui moved T375589: mariadb monitoring: buffer pool usage from In progress to Ready on the DBA board.
Tue, Jul 29, 6:52 AM · Data-Persistence-Automations, DBA
Marostegui moved T399927: Transition codfw data persistence external storage (es) hosts to 10G from Ready to In progress on the DBA board.
Tue, Jul 29, 6:51 AM · SRE, DC-Ops, ops-codfw, DBA
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Tue, Jul 29, 6:36 AM · DBA
Marostegui added a comment to T400214: Q1:rack/setup/install db126[0-3].

Patches are done

Tue, Jul 29, 6:26 AM · SRE, Data-Persistence, ops-eqiad, DC-Ops
Marostegui updated the task description for T400214: Q1:rack/setup/install db126[0-3].
Tue, Jul 29, 6:26 AM · SRE, Data-Persistence, ops-eqiad, DC-Ops
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

You can probably skip 2039 for now and jump to 2040 until we figure out what's best for 2039.

Tue, Jul 29, 6:14 AM · SRE, DC-Ops, ops-codfw, DBA
Marostegui updated the task description for T399927: Transition codfw data persistence external storage (es) hosts to 10G.
Tue, Jul 29, 5:12 AM · SRE, DC-Ops, ops-codfw, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

@Marostegui es2038 is moved, updated, and powered up!

Tue, Jul 29, 5:11 AM · SRE, DC-Ops, ops-codfw, DBA

Mon, Jul 28

Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Mon, Jul 28, 2:22 PM · Data-Engineering, Schema-change-in-production, DBA
Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Mon, Jul 28, 2:20 PM · Data-Engineering, Schema-change-in-production, DBA
Marostegui closed T400599: Switchover x3 master (db2162 -> db2241) as Resolved.

Done

Mon, Jul 28, 10:04 AM · DBA
Marostegui updated the task description for T400599: Switchover x3 master (db2162 -> db2241).
Mon, Jul 28, 10:03 AM · DBA
Marostegui updated the task description for T400599: Switchover x3 master (db2162 -> db2241).
Mon, Jul 28, 10:01 AM · DBA
Marostegui added a parent task for T400599: Switchover x3 master (db2162 -> db2241): Unknown Object (Task).
Mon, Jul 28, 9:51 AM · DBA
Marostegui claimed T400599: Switchover x3 master (db2162 -> db2241).
Mon, Jul 28, 9:50 AM · DBA
Marostegui merged task T400598: Switchover x3 master (db2162 -> db2241) into T400599: Switchover x3 master (db2162 -> db2241).
Mon, Jul 28, 9:50 AM · DBA
Marostegui merged T400598: Switchover x3 master (db2162 -> db2241) into T400599: Switchover x3 master (db2162 -> db2241).
Mon, Jul 28, 9:49 AM · DBA
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Mon, Jul 28, 9:25 AM · DBA
Marostegui added a parent task for T400591: Switchover s7 master (db2218 -> db2220): Unknown Object (Task).
Mon, Jul 28, 9:16 AM · DBA
Marostegui closed T400591: Switchover s7 master (db2218 -> db2220) as Resolved.

Done

Mon, Jul 28, 9:07 AM · DBA
Marostegui closed T400591: Switchover s7 master (db2218 -> db2220), a subtask of T399955: Migrate s7 to MariaDB 10.11, as Resolved.
Mon, Jul 28, 9:07 AM · DBA
Marostegui updated the task description for T400591: Switchover s7 master (db2218 -> db2220).
Mon, Jul 28, 9:06 AM · DBA
Marostegui updated the task description for T400591: Switchover s7 master (db2218 -> db2220).
Mon, Jul 28, 9:02 AM · DBA
Marostegui updated the task description for T400591: Switchover s7 master (db2218 -> db2220).
Mon, Jul 28, 9:00 AM · DBA
Marostegui added a comment to T400591: Switchover s7 master (db2218 -> db2220).

Old 300
API 100

Mon, Jul 28, 8:59 AM · DBA
Marostegui added a comment to T400591: Switchover s7 master (db2218 -> db2220).

db2218 patched

Mon, Jul 28, 7:54 AM · DBA
Marostegui updated the task description for T399540: Upgrade masters to 10.6.22 and 10.11.13 .2 update.
Mon, Jul 28, 7:53 AM · DBA
Marostegui added a parent task for T400591: Switchover s7 master (db2218 -> db2220): T399955: Migrate s7 to MariaDB 10.11.
Mon, Jul 28, 7:51 AM · DBA
Marostegui added a subtask for T399955: Migrate s7 to MariaDB 10.11: T400591: Switchover s7 master (db2218 -> db2220).
Mon, Jul 28, 7:51 AM · DBA
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Mon, Jul 28, 7:37 AM · DBA
Marostegui updated the task description for T400591: Switchover s7 master (db2218 -> db2220).
Mon, Jul 28, 7:16 AM · DBA
Marostegui claimed T400591: Switchover s7 master (db2218 -> db2220).
Mon, Jul 28, 7:16 AM · DBA
Marostegui closed T400513: db2196 (x1 codfw master) unreachable as Resolved.

I am repooling this host as replica and closing this for now.

Mon, Jul 28, 7:03 AM · DBA
Marostegui added a comment to T400513: db2196 (x1 codfw master) unreachable.

There are no HW errors on the idrac and the syslog only shows:

Mon, Jul 28, 6:59 AM · DBA
Marostegui added a comment to T400513: db2196 (x1 codfw master) unreachable.

There are no HW errors on the idrac and the syslog only shows (before and after the crash):

Mon, Jul 28, 6:58 AM · DBA
Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Mon, Jul 28, 6:41 AM · Data-Engineering, Schema-change-in-production, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

@Jhancock.wm es2038 is ready for you

Mon, Jul 28, 6:37 AM · SRE, DC-Ops, ops-codfw, DBA

Fri, Jul 25

Marostegui added a comment to T400513: db2196 (x1 codfw master) unreachable.

db2196 is no longer a master - I will leave it depooled and with notifications disabled for the weekend and will investigate more next week.

Fri, Jul 25, 10:39 PM · DBA
Marostegui closed T400514: Switchover x1 master (db2196 -> db2215) as Resolved.

Done

Fri, Jul 25, 10:38 PM · DBA
Marostegui closed T400514: Switchover x1 master (db2196 -> db2215), a subtask of T400513: db2196 (x1 codfw master) unreachable, as Resolved.
Fri, Jul 25, 10:38 PM · DBA
Marostegui updated the task description for T400514: Switchover x1 master (db2196 -> db2215).
Fri, Jul 25, 10:38 PM · DBA
Marostegui updated the task description for T400514: Switchover x1 master (db2196 -> db2215).
Fri, Jul 25, 10:35 PM · DBA
Marostegui renamed T400513: db2196 (x1 codfw master) unreachable from db2196 unreachable to db2196 (x1 codfw master) unreachable.
Fri, Jul 25, 10:33 PM · DBA
Marostegui claimed T400514: Switchover x1 master (db2196 -> db2215).
Fri, Jul 25, 10:28 PM · DBA
Marostegui added a parent task for T400514: Switchover x1 master (db2196 -> db2215): T400513: db2196 (x1 codfw master) unreachable.
Fri, Jul 25, 10:27 PM · DBA
Marostegui added a subtask for T400513: db2196 (x1 codfw master) unreachable: T400514: Switchover x1 master (db2196 -> db2215).
Fri, Jul 25, 10:27 PM · DBA
Marostegui added a comment to T400513: db2196 (x1 codfw master) unreachable.

I will do an emergency switchover of this host; I don't trust it, especially now that we are starting the weekend.

Fri, Jul 25, 10:26 PM · DBA
Marostegui added a comment to T400513: db2196 (x1 codfw master) unreachable.

db2196 was rebooted via idrac - it shows no HW errors in its logs

Fri, Jul 25, 10:26 PM · DBA
Marostegui updated the task description for T399927: Transition codfw data persistence external storage (es) hosts to 10G.
Fri, Jul 25, 3:40 PM · SRE, DC-Ops, ops-codfw, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

Thanks, I will get es2038 ready by Monday

Fri, Jul 25, 3:40 PM · SRE, DC-Ops, ops-codfw, DBA
Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Fri, Jul 25, 2:43 PM · Data-Engineering, Schema-change-in-production, DBA
Marostegui updated the task description for T399540: Upgrade masters to 10.6.22 and 10.11.13 .2 update.
Fri, Jul 25, 9:10 AM · DBA
Marostegui added a comment to T394487: Migrate backup sources to MariaDB 10.11.

@Marostegui All backups are now generated with the 10.11 package.

Fri, Jul 25, 9:08 AM · database-backups
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Fri, Jul 25, 8:50 AM · DBA
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Fri, Jul 25, 6:14 AM · DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

@Jhancock.wm es2037 is ready for you - homer was run.

Fri, Jul 25, 6:05 AM · SRE, DC-Ops, ops-codfw, DBA
Marostegui closed T400436: Switchover es6 master (es2037 -> es2035) as Resolved.

Done

Fri, Jul 25, 6:02 AM · DBA
Marostegui updated the task description for T400436: Switchover es6 master (es2037 -> es2035).
Fri, Jul 25, 6:01 AM · DBA
Marostegui claimed T400436: Switchover es6 master (es2037 -> es2035).
Fri, Jul 25, 5:56 AM · DBA
Marostegui closed T400435: Switchover es7 master (es2038 -> es2039) as Resolved.

Done

Fri, Jul 25, 5:55 AM · DBA
Marostegui updated the task description for T400435: Switchover es7 master (es2038 -> es2039).
Fri, Jul 25, 5:55 AM · DBA
Marostegui updated the task description for T400435: Switchover es7 master (es2038 -> es2039).
Fri, Jul 25, 5:53 AM · DBA
Marostegui claimed T400435: Switchover es7 master (es2038 -> es2039).
Fri, Jul 25, 5:50 AM · DBA

Thu, Jul 24

Marostegui updated the task description for T399927: Transition codfw data persistence external storage (es) hosts to 10G.
Thu, Jul 24, 5:27 PM · SRE, DC-Ops, ops-codfw, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

es2036 done

[   21.582858] bnxt_en 0000:4b:00.0 eno12399np0: NIC Link is Up, 10000 Mbps (NRZ) full duplex, Flow control: none
[   21.582868] bnxt_en 0000:4b:00.0 eno12399np0: FEC autoneg off encoding: None
Thu, Jul 24, 5:27 PM · SRE, DC-Ops, ops-codfw, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

@Marostegui today or tomorrow is fine.

Thu, Jul 24, 4:40 PM · SRE, DC-Ops, ops-codfw, DBA
Marostegui updated the task description for T400213: Q1:rack/setup/install db224[5-8].
Thu, Jul 24, 1:24 PM · SRE, Data-Persistence, ops-codfw, DC-Ops
Marostegui added a comment to T400213: Q1:rack/setup/install db224[5-8].

Patches are done

Thu, Jul 24, 1:23 PM · SRE, Data-Persistence, ops-codfw, DC-Ops
Marostegui updated the task description for T399927: Transition codfw data persistence external storage (es) hosts to 10G.
Thu, Jul 24, 7:20 AM · SRE, DC-Ops, ops-codfw, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

@Marostegui lemme know when you want to do es2036

Thu, Jul 24, 7:20 AM · SRE, DC-Ops, ops-codfw, DBA
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Thu, Jul 24, 7:18 AM · DBA
Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Thu, Jul 24, 7:10 AM · Data-Engineering, Schema-change-in-production, DBA
Marostegui updated the task description for T400195: Q1:rack/setup/install es2049-es2057.
Thu, Jul 24, 7:01 AM · SRE, Data-Persistence, ops-codfw, DC-Ops
Marostegui added a comment to T400195: Q1:rack/setup/install es2049-es2057.

Patches done

Thu, Jul 24, 7:01 AM · SRE, Data-Persistence, ops-codfw, DC-Ops
Marostegui added a comment to T400198: Q1:rack/setup/install es1049-es1057.

Patches done

Thu, Jul 24, 6:50 AM · SRE, Data-Persistence, ops-eqiad, DC-Ops
Marostegui updated the task description for T400198: Q1:rack/setup/install es1049-es1057.
Thu, Jul 24, 6:50 AM · SRE, Data-Persistence, ops-eqiad, DC-Ops

Tue, Jul 22

Marostegui added a comment to T398422: MetricsPlatform: InstrumentConfigFetcher: Make fetching asynchronous.

it recovered yes, thank you

Tue, Jul 22, 12:55 PM · Metrics Platform, MW-1.45-notes (1.45.0-wmf.13; 2025-08-05), Experimentation Lab (Experiment Platform Sprint 10), Patch-For-Review
Marostegui added a comment to T398422: MetricsPlatform: InstrumentConfigFetcher: Make fetching asynchronous.

Thanks, let's see if traffic goes back to normal

Tue, Jul 22, 10:49 AM · Metrics Platform, MW-1.45-notes (1.45.0-wmf.13; 2025-08-05), Experimentation Lab (Experiment Platform Sprint 10), Patch-For-Review
Marostegui added a comment to T398422: MetricsPlatform: InstrumentConfigFetcher: Make fetching asynchronous.

This deployment matches a sudden increase in our mainstash database cluster:
https://grafana.wikimedia.org/d/000000278/mysql-aggregated?orgId=1&from=2025-07-21T12:17:48.161Z&to=2025-07-22T10:06:34.613Z&timezone=utc&var-site=eqiad&var-group=core&var-shard=ms1&var-shard=ms2&var-shard=ms3&var-role=$__all

Tue, Jul 22, 10:10 AM · Metrics Platform, MW-1.45-notes (1.45.0-wmf.13; 2025-08-05), Experimentation Lab (Experiment Platform Sprint 10), Patch-For-Review
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Tue, Jul 22, 10:01 AM · DBA
Marostegui updated the task description for T399955: Migrate s7 to MariaDB 10.11.
Tue, Jul 22, 9:57 AM · DBA
Marostegui updated the task description for T383795: Move sX to STATEMENT based replication.
Tue, Jul 22, 9:47 AM · DBA
Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Tue, Jul 22, 9:38 AM · Data-Engineering, Schema-change-in-production, DBA
Marostegui updated the task description for T399249: Add cl_timestamp_id index to categorylinks table.
Tue, Jul 22, 8:31 AM · Data-Engineering, Schema-change-in-production, DBA
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

@Jhancock.wm es2035 is ready

Tue, Jul 22, 6:58 AM · SRE, DC-Ops, ops-codfw, DBA

Mon, Jul 21

Marostegui added a comment to T391056: Drop afl_patrolled_by from abuse_filter_log in production.

@FCeratto-WMF now that the s1 codfw master is switched, remember to apply this schema change there too.

Mon, Jul 21, 5:02 PM · Data-Engineering-Radar, Schema-change-in-production, DBA, Data-Engineering
Marostegui updated the task description for T383795: Move sX to STATEMENT based replication.
Mon, Jul 21, 2:36 PM · DBA
Marostegui added a comment to T400055: PHP Warning: Undefined array key "DEFAULT".

Well, for now it seems to have gone away again – last seen Jul 21, 2025 @ 13:38:00.096 UTC, which was a few minutes before repooling db1166.

My suspicion at the moment is that dbctl had some kind of rare error condition which caused the DEFAULT key to be missing, but fixed itself on the next run.

Mon, Jul 21, 2:02 PM · Dumps-Generation, Wikidata, Wikimedia-production-error
Marostegui added a comment to T399927: Transition codfw data persistence external storage (es) hosts to 10G.

Yeah, we'll do one at a time, to be on the safe side. Do you want me to get es2035 ready for tomorrow?

Mon, Jul 21, 1:25 PM · SRE, DC-Ops, ops-codfw, DBA
Marostegui added a comment to T400055: PHP Warning: Undefined array key "DEFAULT".

First seen on Jul 21, 2025 @ 12:17:59.330 UTC. In the SAL, two @Marostegui messages (repooling db1157 and depooling db1166) are suspiciously close to that, and the stack trace says “DB config” – any chance this could be related?

Mon, Jul 21, 1:22 PM · Dumps-Generation, Wikidata, Wikimedia-production-error
Marostegui updated the task description for T383795: Move sX to STATEMENT based replication.
Mon, Jul 21, 1:12 PM · DBA
Marostegui updated the task description for T383795: Move sX to STATEMENT based replication.
Mon, Jul 21, 1:11 PM · DBA
Marostegui updated the task description for T383795: Move sX to STATEMENT based replication.
Mon, Jul 21, 11:31 AM · DBA
Marostegui moved T399302: Catalog x1 tables from Triage to In progress on the DBA board.
Mon, Jul 21, 10:26 AM · Connection-Team, Growth-Team (Current Sprint), Trust and Safety Product Sprint (Sprint Cannoli (July 7 - July 25)), CampaignEvents, Community-Tech, DBA, MediaWiki-extensions-UrlShortener, Reading List Service, MediaWiki-extensions-LoginNotify, Notifications (Echo), CheckUser, MediaWiki-extensions-BounceHandler, GrowthExperiments, MediaModeration
Marostegui updated subscribers of T395669: globalblocks table: SQL in extension and production have different type for gb_address.
Mon, Jul 21, 10:24 AM · Trust and Safety Product Sprint, Data-Persistence, Schema-change, Trust and Safety Product Team, GlobalBlocking, Data-Engineering