pg_autoctl perform failover

Environment: two-node active-passive setup with a third node dedicated as the monitor (PostgreSQL 14, Ubuntu 22.04)
Purpose: manage automatic **failover/failback** without manual intervention

After configuration, the monitor node shows node_1 (active) and node_2 (passive) as healthy, with reported and assigned states of **PRIMARY** and **SECONDARY** respectively. All prerequisites indicated that the nodes were correctly configured and ready.

Scenario: before go-live, I tried a manual failover to get a feel for the process.

Command used (run from the current primary node): pg_autoctl perform failover
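For anyone reproducing this, here is a minimal Python sketch of how the manual failover could be wrapped so that the state is printed before and after. The data directory path passed in is hypothetical, and the `default` formation and group `0` are assumed to match this setup; `failover_command` and `run_failover` are illustrative helper names, not part of pg_auto_failover.

```python
import shutil
import subprocess


def failover_command(pgdata, formation="default", group=0):
    """Build the argv for a manual failover; nothing is executed here."""
    return [
        "pg_autoctl", "perform", "failover",
        "--pgdata", pgdata,
        "--formation", formation,
        "--group", str(group),
    ]


def run_failover(pgdata):
    """Print the state, trigger the failover, then print the state again."""
    if shutil.which("pg_autoctl") is None:
        raise RuntimeError("pg_autoctl not found on PATH")
    show_state = ["pg_autoctl", "show", "state", "--pgdata", pgdata]
    subprocess.run(show_state, check=True)                # state before
    subprocess.run(failover_command(pgdata), check=True)  # trigger failover
    subprocess.run(show_state, check=True)                # state after
```

Comparing the two state listings should make it easier to see at which transition a node gets stuck.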

Findings:
- a background process changed the state of node_1 to demoted/catchingup and of node_2 to wait_primary/wait_primary
- pg_basebackup was initiated from node_2 onto node_1 after the contents of the data directory had been removed
- because the data directory was empty, the cluster went DOWN
- there was a mismatch for special files between the config directory and the data directory (symlinks are used)
- because the required special files were not where they were expected, the base backup restore went into an endless loop
- the pgautofailover_replicator password had to be assigned on the system
- manual intervention was needed to provide the base backup to the data directory
- all required files had to be made available at their respective locations
- lastly, the max_wal_senders parameter was the show stopper (its value was the same as defined at an earlier stage)
- ultimately both nodes reached their required states, but only after time-consuming effort.
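To make "both nodes in their required states" checkable, here is a rough Python sketch that tests whether a two-node group has converged. It assumes the node list comes from the JSON output of `pg_autoctl show state --json` and that each entry carries `current_group_state` and `assigned_group_state` keys; those key names are an assumption and should be verified against the installed version. `is_converged` is a hypothetical helper.

```python
def is_converged(nodes):
    """Return True when every node reports its assigned state and the
    group consists of exactly one primary and one secondary."""
    states = []
    for node in nodes:
        reported = node["current_group_state"]   # assumed key name
        assigned = node["assigned_group_state"]  # assumed key name
        if reported != assigned:
            return False  # node is still transitioning
        states.append(reported)
    return sorted(states) == ["primary", "secondary"]


# States taken from the findings above, right after the failover command
mid_failover = [
    {"current_group_state": "demoted", "assigned_group_state": "catchingup"},
    {"current_group_state": "wait_primary", "assigned_group_state": "wait_primary"},
]
print(is_converged(mid_failover))
```

Polling a check like this after the failover command would show whether the nodes settle on their own or stay in an intermediate state.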

If this much effort is needed for a manually triggered failover, how reliable is the tool in a **GO-LIVE** scenario? Is any extra work required to make the process fully automatic?

Any experience or suggestions from the expert team are welcome.

Thanks in advance

pg_autoctl perform failover #1088
