[incident + post-mortem] 2024-11-26 06:40 PM - 2024-11-27 02:00 PM (Australia EST) - Relay Server Performance Degradation (Melbourne 1, Australia)

Incident Overview

Investigation Log Timeline (AEST)

2024-11-26 2:19 PM: We started receiving complaints about significantly increased connection times in Oceania.

2024-11-26 7:07 PM: We narrowed the issue down to the Melbourne 1 Relay Server and began investigating.

2024-11-26 10:50 PM: We found that significant service degradation had begun at 6:50 PM on the Melbourne 1 Relay Server vultr-mel-2.vmsproxy.com (67.219.103.112).

2024-11-27 1:02 AM: We identified the issue and started working on a solution.

2024-11-27 2:00 PM:

  • We deployed two new Relay Servers in the Asia-Pacific region:

    • Sydney, Australia 1 relay-au-syd-1-prod-dp.vmsproxy.com (95.173.193.212)

    • Sydney, Australia 2 relay-au-syd-2-prod-dp.vmsproxy.com (95.173.193.213)

  • We disabled two existing Relay Servers in the Asia-Pacific region:

    • Sydney 4, Australia vultr-syd-4.vmsproxy.com (45.77.51.96)

    • Melbourne 1, Australia vultr-mel-2.vmsproxy.com (67.219.103.112)

  • We restarted the Connection Mediator in the region to re-route traffic to the new Relay Servers.

2024-11-27 2:10 PM: The performance improvement was confirmed. The Firewall Passlist article was updated.

Root Cause

The incident was caused by high load on the Melbourne 1 Relay Server.

Corrective Actions

  1. Relay Servers Update
    The issue will be fully resolved once the Relay Servers have been updated in all regions.

  2. Enhanced Monitoring
    The monitoring system will be updated to track issues related to Relay Server performance.

  3. Required Actions
    Update your Firewall Passlist configurations and monitoring endpoints according to the Firewall Passlist article.
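
The passlist change implied by the 2:00 PM timeline entry can be sketched as a simple set update. This is a minimal illustration only: the helper `update_passlist` and the flat list-of-strings layout are hypothetical, and it assumes the entries for the disabled Relay Servers can safely be removed. The Firewall Passlist article remains the authoritative source for what to allow.

```python
# Hypothetical helper for illustration; not part of any vendor tooling.
# Hostnames and IP addresses are taken from the incident timeline.

# Relay Servers disabled on 2024-11-27 (assumed safe to remove from a passlist).
DISABLED = {
    "vultr-syd-4.vmsproxy.com": "45.77.51.96",
    "vultr-mel-2.vmsproxy.com": "67.219.103.112",
}

# Relay Servers deployed on 2024-11-27.
ADDED = {
    "relay-au-syd-1-prod-dp.vmsproxy.com": "95.173.193.212",
    "relay-au-syd-2-prod-dp.vmsproxy.com": "95.173.193.213",
}


def update_passlist(entries):
    """Return a sorted passlist with disabled relays removed and new relays added.

    `entries` is any iterable of hostnames and/or IP addresses (assumed format).
    Entries unrelated to the affected relays are kept unchanged.
    """
    stale = set(DISABLED) | set(DISABLED.values())
    fresh = set(ADDED) | set(ADDED.values())
    return sorted((set(entries) - stale) | fresh)


if __name__ == "__main__":
    current = ["45.77.51.96", "vultr-mel-2.vmsproxy.com", "10.0.0.1"]
    for entry in update_passlist(current):
        print(entry)
```

Matching on both hostname and IP address means the same helper works whether a firewall is configured by name or by address.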
