Hello,
I have an issue with VIP stability on a new site (there is no production traffic there yet). Our monitoring systems alert us that the VIP is down and unreachable, even though both devices, primary and backup, are up. We have to restart the machine to bring the VIP online again.
In the example below, our monitoring systems registered that the VIP went down at about 20:00.
The messages log of the backup machine shows something earlier:
Nov 3 18:51:01 pzn-wsg-002 systemd: Started Cleanup of Temporary Directories.
Nov 3 19:03:46 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Transition to MASTER STATE
Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Entering MASTER STATE
Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) setting protocol VIPs.
Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: Sending gratuitous ARP on eth0 for 172.21.47.20
Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 172.21.47.20
Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: Sending gratuitous ARP on eth0 for 172.21.47.20
Nov 3 19:03:48 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Received advert with higher priority 70, ours 50
Nov 3 19:03:48 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Entering BACKUP STATE
Nov 3 19:03:48 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) removing protocol VIPs.
Nov 3 19:48:04 pzn-wsg-002 systemd: Reloading.
The messages log of the primary device shows nothing related to Keepalived:
Nov 3 19:51:55 pzn-wsg-001 systemd: Started Cleanup of Temporary Directories.
Nov 3 20:48:04 pzn-wsg-001 systemd: Reloading.
The HAProxy logs contain no entries for this event.
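To compare the two hosts side by side, the VRRP state changes can be pulled out of the messages log with a short script. This is only a minimal sketch in Python: the function name vrrp_transitions and the SAMPLE list are illustrative, the sample lines are copied from the backup log above, and the pattern assumes Keepalived writes "Entering MASTER/BACKUP/FAULT STATE" in the same format as in that excerpt.

```python
import re

# Matches syslog lines in the format seen in the excerpt above, e.g.
# "Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Entering MASTER STATE"
# Assumption: FAULT transitions are logged with the same wording.
TRANSITION = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+(?P<host>\S+)\s+"
    r"Keepalived_vrrp\[\d+\]:\s+VRRP_Instance\((?P<inst>[^)]+)\)\s+"
    r"Entering (?P<state>MASTER|BACKUP|FAULT) STATE"
)


def vrrp_transitions(lines):
    """Return (timestamp, host, instance, state) for each state change found."""
    found = []
    for line in lines:
        match = TRANSITION.match(line)
        if match:
            found.append(
                (match["ts"], match["host"], match["inst"], match["state"])
            )
    return found


# Sample lines taken from the backup machine's log in this post.
SAMPLE = [
    "Nov 3 19:03:46 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Transition to MASTER STATE",
    "Nov 3 19:03:47 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Entering MASTER STATE",
    "Nov 3 19:03:48 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Received advert with higher priority 70, ours 50",
    "Nov 3 19:03:48 pzn-wsg-002 Keepalived_vrrp[3154]: VRRP_Instance(VI_1) Entering BACKUP STATE",
]

for ts, host, inst, state in vrrp_transitions(SAMPLE):
    print(f"{ts}  {host}  {inst} -> {state}")
```

The same function accepts an open file object, so it could be run against the full messages log on each machine and the two outputs compared by timestamp.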
How can I troubleshoot this issue?
Regards
Piotr K.