
Slony-I replication stopped working

0 votes
1 answer
384 views
I inherited a 5 node postgres cluster running postgres 8.4 and slony 1.2.21. We are in the process of migrating the application to all new code and have wanted to do very little maintenance on the cluster. Yesterday we decided to take two unused nodes out of the cluster. We used a slonik script to DROP NODE for the two nodes. This seemed to work correctly and we shut down the nodes today.

However, I noticed this morning that our master database, which receives the writes, is not replicating the changes to the rest of the servers. I have tried everything I can think of, but nothing seems to work. When I run a query to collect the status, I see that events have not been acknowledged since yesterday. The st_last_received has not changed at all.

     st_origin | st_received | st_last_event |      st_last_event_ts      | st_last_received |    st_last_received_ts     | st_last_received_event_ts  | st_lag_num_events |              now
    -----------+-------------+---------------+----------------------------+------------------+----------------------------+----------------------------+-------------------+-------------------------------
            25 |          24 |      26196903 | 2016-11-29 17:39:06.859051 |         26187885 | 2016-11-29 12:51:45.396619 | 2016-11-28 11:11:48.909855 |              9018 | 2016-11-29 17:39:07.247598-05
            25 |          27 |      26196903 | 2016-11-29 17:39:06.859051 |         26187885 | 2016-11-28 11:11:49.203193 | 2016-11-28 11:11:48.909855 |              9018 | 2016-11-29 17:39:07.247598-05
            25 |          26 |      26196903 | 2016-11-29 17:39:06.859051 |         26187885 | 2016-11-28 11:11:50.253235 | 2016-11-28 11:11:48.909855 |              9018 | 2016-11-29 17:39:07.247598-05

I first restarted the slony daemons on all the nodes and have since done this multiple times. I have set debug logging to level 4 and have combed through the logs without finding a single issue. I have looked through all the *.sl_* tables for anything that might tell me why it is not working.
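For anyone reading the status output, the lag figures can be re-derived by hand. The following is only a minimal Python sketch over the values pasted above: the dictionary layout and variable names are made up for illustration, and it assumes that `now` and `st_last_received_event_ts` are in the same time zone (the output only shows an offset on `now`).

```python
from datetime import datetime

# Values copied from the status output in the question (one dict per receiver).
rows = [
    {"st_received": 24, "st_last_event": 26196903, "st_last_received": 26187885},
    {"st_received": 27, "st_last_event": 26196903, "st_last_received": 26187885},
    {"st_received": 26, "st_last_event": 26196903, "st_last_received": 26187885},
]

# Lag in events: last event generated on the origin minus last event confirmed.
lags = {r["st_received"]: r["st_last_event"] - r["st_last_received"] for r in rows}

# Time since the last confirmed event was created on the origin.
# Assumption: both timestamps are local time in the same zone.
now = datetime(2016, 11, 29, 17, 39, 7, 247598)
last_confirmed_event = datetime(2016, 11, 28, 11, 11, 48, 909855)
stalled_for = now - last_confirmed_event

print(lags)
print(stalled_for)
```

All three receivers stopped at the same event, 26187885, which is why the three rows share one lag value.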
Our configuration is as follows for the important replication set.

    select * from _ads.sl_set;
     set_id | set_origin | set_locked | set_comment
    --------+------------+------------+-----------------
          1 |         25 |            | mgt tables

    select * from _ads.sl_subscribe;
     sub_set | sub_provider | sub_receiver | sub_forward | sub_active
    ---------+--------------+--------------+-------------+------------
           1 |           25 |           26 | t           | t
           1 |           25 |           27 | t           | t
           2 |           25 |           27 | t           | t
           1 |           25 |           24 | t           | t

    select * from _ads.sl_listen;
     li_origin | li_provider | li_receiver
    -----------+-------------+-------------
            24 |          24 |          25
            26 |          26 |          25
            27 |          27 |          25
            27 |          25 |          26
            26 |          25 |          27
            27 |          25 |          24
            24 |          25 |          27
            24 |          25 |          26
            26 |          25 |          24
            26 |          24 |          25
            27 |          26 |          25
            26 |          27 |          25
            24 |          26 |          25
            27 |          24 |          25
            24 |          27 |          25
            25 |          25 |          24
            25 |          25 |          26
            25 |          25 |          27

Any advice, assistance, or idea on where to look would be greatly appreciated. I am in full-on panic mode now.
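One mechanical cross-check of the tables above is whether every subscriber has a listen entry for events that originate on node 25 and arrive via its provider. The sketch below does that in Python on the rows as pasted. It is a consistency check under assumptions, not a diagnosis: the variable names are hypothetical, and it assumes set 2 also originates on node 25, which the sl_set output does not show.

```python
# Assumption: both sets originate on node 25 (sl_set above only lists set 1).
set_origin = 25

# (sub_set, sub_provider, sub_receiver) copied from sl_subscribe.
subscriptions = [(1, 25, 26), (1, 25, 27), (2, 25, 27), (1, 25, 24)]

# (li_origin, li_provider, li_receiver) copied from sl_listen.
listen = {
    (24, 24, 25), (26, 26, 25), (27, 27, 25), (27, 25, 26), (26, 25, 27),
    (27, 25, 24), (24, 25, 27), (24, 25, 26), (26, 25, 24), (26, 24, 25),
    (27, 26, 25), (26, 27, 25), (24, 26, 25), (27, 24, 25), (24, 27, 25),
    (25, 25, 24), (25, 25, 26), (25, 25, 27),
}

# A subscription is "covered" if its receiver listens for the origin's
# events through the subscription's provider.
missing = [
    (s, p, r) for (s, p, r) in subscriptions
    if (set_origin, p, r) not in listen
]

# Every node id that sl_listen still refers to.
nodes_in_listen = {n for row in listen for n in row}

print(missing)
print(sorted(nodes_in_listen))
```

On the data shown, no subscription lacks its listen entry, so this particular check does not explain the stall.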
Asked by Chris Hinshaw (103 rep)
Nov 29, 2016, 11:15 PM
Last activity: Dec 10, 2016, 10:09 PM