Closed
Description
The issue occurred on my own staging environment during LWT testing with tablets enabled. The error itself:
< t:2025-08-22 11:46:39,578 f:db_log_reader.py l:123 c:sdcm.db_log_reader p:DEBUG > 2025-08-22T11:46:15.339+00:00 longevity-lwt-24h-lwt-test-db-node-353e20c2-0-4 !INFO | scylla[4700]: [shard 0:strm] repair - repair[e0f68f3e-a74a-4d3e-9920-819b62a70b67]: Finished to process repair_flush_hints_batchlog_request from node=10.142.0.34 updated=false flush_hints_batchlog_time= 1755863136 flush_cache_time=60000ms flush_duration=0s
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Run: 35448721-7f4d-11f0-805e-42010a8e0032
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Status: ERROR
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Cause: see more errors in logs: master 10.142.0.35 keyspace lwt_keyspace table lwt_io$paxos command 0: schedule tablet repair task: giving up after 10 attempts: agent [HTTP 500] std::runtime_error (Can't set repair request on a co-located table)
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Start time: 22 Aug 25 11:43:23 UTC
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: End time: 22 Aug 25 11:46:22 UTC
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Duration: 2m59s
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Progress: 99%
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Intensity: 1
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Parallel: 0
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Host: invalid IP
< t:2025-08-22 11:46:39,579 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Datacenters:
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: - us-east1
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>:
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ╭───────────────────────────────┬────────────────────────────────┬──────────┬──────────╮
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ Keyspace │ Table │ Progress │ Duration │
< t:2025-08-22 11:46:39,580 f:db_log_reader.py l:123 c:sdcm.db_log_reader p:DEBUG > 2025-08-22T11:46:14.999+00:00 longevity-lwt-24h-lwt-test-db-node-353e20c2-0-3 !INFO | scylla[6438]: [shard 5: gms] hints_manager - Draining starts for 5d5f5d46-e93b-44f1-b53c-733eab1e35e6
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
< t:2025-08-22 11:46:39,580 f:db_log_reader.py l:123 c:sdcm.db_log_reader p:DEBUG > 2025-08-22T11:46:15.753+00:00 longevity-lwt-24h-lwt-test-db-node-353e20c2-0-5 !INFO | scylla[5888]: [shard 0: gms] tablets - Set sstables_repaired_at=0 table=52fc3eb0-7f41-11f0-80b5-8f6de9b0b120 tablet=69
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ banned_keyspace │ table1 │ 100% │ 4s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ lwt_keyspace │ lwt_io │ 100% │ 13s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ lwt_keyspace │ lwt_io$paxos │ 73%/27% │ 2m42s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ system_distributed_everywhere │ cdc_generation_descriptions_v2 │ 100% │ 1s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ system_distributed │ cdc_generation_timestamps │ 100% │ 1s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ system_distributed │ cdc_streams_descriptions_v2 │ 100% │ 1s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ system_distributed │ service_levels │ 100% │ 1s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ system_distributed │ view_build_status │ 100% │ 1s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ├───────────────────────────────┼────────────────────────────────┼──────────┼──────────┤
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: │ system_replicated_keys │ encrypted_keys │ 100% │ 1s │
< t:2025-08-22 11:46:39,580 f:base.py l:229 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: ╰───────────────────────────────┴────────────────────────────────┴──────────┴──────────╯
< t:2025-08-22 11:46:39,581 f:base.py l:141 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.142.0.50>: Command "sudo sctool -c 57d3fa0a-508e-490b-a4aa-bf37417d9b83 progress repair/3617483c-6936-4da5-b283-7d0e45bb7a7e" finished with status 0
No keyspaces were passed as parameters when invoking the SM repair. I'm not sure whether this is a bug, but it seems that SM either shouldn't report this error or shouldn't try to repair this table (or co-located tables in general) at all.
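For triage, here is a minimal Python sketch that scans an SCT log like the one above and lists the rows of the `sctool progress` table that did not reach 100%. The helper name `incomplete_tables` and the regex are my own assumptions, based only on the row layout visible in this log; this is not part of SCT or Scylla Manager.

```python
import re

# Matches a data row of the sctool progress table as it appears in the SCT log,
# e.g. "... <10.142.0.50>: │ lwt_keyspace │ lwt_io$paxos │ 73%/27% │ 2m42s │".
# Border lines use other box-drawing characters, so they never match.
ROW_RE = re.compile(
    r"│\s*([^\s│]+)\s*│\s*([^\s│]+)\s*│\s*([^\s│]+)\s*│\s*([^\s│]+)\s*│\s*$"
)


def incomplete_tables(log_lines):
    """Return (keyspace, table, progress) for every row whose progress is not 100%."""
    result = []
    for line in log_lines:
        match = ROW_RE.search(line)
        if not match:
            continue  # not a table row (border, db log line, etc.)
        keyspace, table, progress, _duration = match.groups()
        if keyspace == "Keyspace":
            continue  # header row
        if progress != "100%":
            result.append((keyspace, table, progress))
    return result
```

Run against the log excerpt above, this should report only the `lwt_keyspace` / `lwt_io$paxos` row, which matches the table named in the `Cause:` line.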
Argus run