Performance Issues - AWS network
Incident Report for TrackVia
Resolved
This incident has been resolved.
Posted Sep 18, 2023 - 22:07 MDT
Monitoring
TrackVia's cloud provider has successfully applied an update to the subsystem responsible for network mapping propagation to address resource contention. Network mapping propagation times have stabilized but have not yet begun to trend toward normal levels. The provider expects that trend to begin over the next 30 minutes, at which time we expect latencies and error rates to improve. We will continue to keep you updated on our progress towards full recovery.
Posted Sep 18, 2023 - 18:36 MDT
Update
TrackVia's cloud provider continues to progress toward resolving the increased networking latencies and errors. At this time, they are approximately 50% complete with the update to address resource contention within the subsystem responsible for network mapping propagation. Their current expectation is to have the problem resolved within the next 60 to 90 minutes, and we will continue to provide updates as recovery progresses.
Posted Sep 18, 2023 - 17:31 MDT
Update
While TrackVia's cloud provider continues to make progress in addressing the issue, we wanted to provide some more details. Starting at 10:00 AM PDT this morning, TrackVia's cloud provider has been experiencing a delay in the propagation of network configuration changes to the underlying hardware that ensures network packets can flow between source and destination. The root cause appears to be increased load on the subsystem responsible for handling these network mappings.

Once load has been reduced on the subsystem responsible for network mapping propagation, we would expect full recovery.

We will continue to keep you updated as we make progress towards full recovery.
Posted Sep 18, 2023 - 15:33 MDT
Update
TrackVia's cloud provider is continuing to investigate increased networking latencies and errors affecting Availability Zones in the US-WEST-2 Region. The issue affects some instances in each of these Availability Zones, where network mappings are not being propagated to the underlying hardware. Any changes to network configurations will see delayed mappings, which will affect network connectivity.

Other TrackVia services, such as microservices and backup/recovery, may experience delays as well as increased error rates due to this issue. AWS is working to resolve the issue and is seeing improvements as a result of these efforts. We will continue to keep you updated as we make progress towards full recovery.
Posted Sep 18, 2023 - 13:46 MDT
Update
TrackVia's cloud provider is continuing to work on a fix for this issue.
Posted Sep 18, 2023 - 13:10 MDT
Identified
[11:57 AM PDT] TrackVia's cloud provider confirmed increased networking latencies and errors affecting multiple Availability Zones in the US-WEST-2 Region. We have identified a potential root cause of the errors and are attempting mitigations. Early signs are that these mitigations are reducing error rates and latencies. We continue to work towards a full root cause and recovery.

[11:43 AM PDT] TrackVia's cloud provider is investigating increased networking latencies and errors affecting multiple Availability Zones in the US-WEST-2 Region.
Posted Sep 18, 2023 - 13:09 MDT
This incident affected: Commercial Cloud, Private Cloud, and HIPAA Cloud.