PractiTest Service Status Log
2022 - Updates:
Thursday, June 2nd, 8:06-8:50 GMT - US Only - Issues with a Push to production (32 min overall of intermittent connection issues)
Thursday, March 3rd, 0:33-1:11 GMT - US Only - Login infrastructure issues (34 min overall of intermittent connection issues)
Wednesday, January 19th, 7:48-7:55 GMT - US Only - Memory issues on web machines
Friday, January 8th, 23:56-0:05 GMT - US Only - Scheduled db maintenance
2021 - Updates (US Datacenter & EU datacenter - total of 99.99% uptime each)
Tuesday, Dec 7th, 2:33-8:29 GMT - EU Only -> AWS issues caused 6 min overall of intermittent accessibility issues for users
Friday, October 22nd, 23:09-23:13 GMT - US Only -> Scheduled database maintenance
Monday, October 18th, 7:37-7:45 GMT - US Only -> Slow query in the product caused a database IO load
Monday, September 27th, 7:15-7:21 GMT - US Only -> AWS issues caused a database failover
Thursday, September 9th, 4:36-5:10 GMT - US Only -> Login infrastructure issue
Monday, July 26th, 12:59-13:02 GMT - US Only -> Spike on incoming requests caused timeouts for some users
Tuesday, July 6th, 3:17-3:26 GMT - US Only -> Heavy load on our servers due to a DB CPU spike
Tuesday, March 16th, 6:53 - 6:54 GMT - US Only - Downtime due to a system update that didn't handle the ORM caching well. Service was unavailable for ~90 seconds
Monday, March 22nd, 7:02 - 7:08 GMT - US Only - 6 minutes downtime due to an update
2020 - Updates (US Datacenter & EU datacenter - total of more than 99.99% uptime each)
Wednesday, Nov 25th, 15:16 - 20:30 GMT - AWS outages affected access to our service. During some periods access was not possible, while during other periods the service worked properly with a small number of interruptions for a limited number of users - https://www.zdnet.com/article/aws-outage-impacts-thousands-of-online-services/
Tuesday, Oct 20th, 23:03-23:12 - EU datacenter only - service failure due to cache server node failure.
Friday, Sep 4th, 7:30-7:31, 7:33-7:34 - US Only - A bad query generated an excessive load on DB. US service was partially unavailable for 2 minutes. We fixed the code that generated the bad query.
Saturday, July 18th, 19:00-19:42 GMT - Scheduled maintenance completed (scheduled for 60 minutes)
Tuesday, July 14th, 15:23-15:35 GMT - US Only - cache servers errors. US service was unavailable for 12 minutes.
Saturday, Jun 6, 8:47-8:51 GMT - US Only - cache server was down. US service was unavailable for 4 minutes.
Saturday, April 18, 11:28 - 11:36 GMT - US Only - a scheduled maintenance operation that was not supposed to incur downtime caused our US database servers to time out. US service was partially available during this time.
2019 - Updates (Total of more than 99.99% uptime)
Friday, September 6th, 19:37-19:44 GMT - US only - HW failure on DB server. The failover server was up after less than a minute, but not all the application servers were restarted automatically; we will fix the restart configuration.
Wednesday, July 31st, 7:38-7:41 GMT - Heavy load on and off in our servers, due to a table lock from one of our background processes.
Tuesday, July 23rd, 12:55-12:59 GMT - Heavy load on and off in our servers. We haven't found the root cause; we are switching to a different performance tool.
Monday, July 22nd, 20:53 - 21:08 GMT - Heavy load on and off in many of our servers because of the logging subsystem. We increased capacity for now and will configure alarms to detect issues before they occur.
Saturday, June 8th, 20:11-20:14 GMT - Scheduled maintenance completed (scheduled for 20 minutes)
Tuesday, June 4th, 17:32 - 17:36 GMT - another DB CPU spike which dropped service availability.
Monday, April 22nd 19:40 GMT - we had a DB CPU spike which dropped service availability for many customers for 7 minutes overall.
2018 - Updates (Total of more than 99.99% uptime)
Saturday, Dec 22nd, 9:10-9:12 GMT - Scheduled maintenance completed (total scheduled for 20 minutes)
Saturday, Dec 1st, 13:06-13:09 GMT -> 3 minutes of an interrupted service - due to a cache server replacement by AWS.
Sunday, Oct 3rd, 20:03 - 20:09 GMT- Scheduled maintenance completed (total scheduled for 15 minutes)
Wednesday, October 31st, 9:56-10:02 UTC - some users were experiencing a heavy load error, due to migration in our backend servers.
Tuesday, August 21st - 14:03-14:04 GMT - some users were experiencing a heavy load error for about 40 seconds.
Sunday, August 12th - 8:13 - 8:16 GMT- Scheduled maintenance completed (about 3 minutes off services, total scheduled for half an hour)
Saturday, August 11th - 11:22 - 11:24 GMT- 2 minutes of an interrupted service.
Tuesday, July 17th - 9:28 - 9:30 GMT -> 5 minutes of an interrupted service.
Saturday, June 23rd - 13:00 - 13:47 GMT- Scheduled maintenance completed (about 45 minutes of on/off services, total scheduled for the full hour)
Wednesday, April 11th, 14:49-14:54 -> 6 minutes of interrupted service due to bad network timeout configurations (micro-service networking issue)
Monday, March 5th - 10:54-10:55 GMT -> 2 minutes of overloaded results due to a high average latency inside PT network
2017 - Updates (Total of more than 99.96% uptime)
Monday, November 13th - 11:51-12:21 -> Heavy load -> service went up and down multiple times during this time.
Tuesday, November 7th - Heavy load exceptions for some users due to high migration volume between 16:26 and 16:32 GMT
Monday, November 6th - Heavy load and connection issues for some users between 13:46 and 13:53 GMT
Wednesday, August 30 - 13:12-13:14 -> Some of our main modules were not working properly for 2 minutes due to a bad version push. Reverted immediately.
Tuesday, April 4th - 12:33-12:42, 12:58-13:01 GMT -> Heavy load issues from multiple locations. 12 minutes total of interrupted service.
Saturday, March 18th - Connection issues between 09:11 and 09:22 GMT
Thursday, February 2nd - Issues causing servers outages for ~22 minutes at: 4:07-4:11, 8:31-8:35, 20:39-20:43, 20:56-21:05 (all in GMT)
2016 - Updates (Total of more than 99.9% uptime)
Wednesday, Dec 28, 4:16PM - 5:33PM GMT - We experienced issues due to a disk failure from one of our servers. This caused services to fail and come back for the whole 78 minutes. We replaced that EC2 machine, and we'll work to eliminate this SPOF.
Saturday, October 8th, 5:00AM - 6:58AM GMT- Scheduled maintenance for 118 minutes.
Wednesday, October 5th, 7:12-10:20 - Intermittent issues causing some of our users not to be able to access the system for periods of time.
Wednesday, September 21st, 7:06-7:10 - Heavy load message for some of our users, some could not use the Test Library module for 2.5 minutes.
Wednesday, August 24th, 13:20-13:21 GMT - Heavy load message for some of our users. This was resolved within ~2 minutes.
Saturday, July 16th, 12:00PM - 12:38PM GMT- Scheduled maintenance for 38 minutes (originally scheduled 3 hours).
Wednesday, April 27th, 6:05PM - 6:10PM GMT - we experienced some issues with one of our servers, users might have been affected for about 5 minutes.
Sunday, April 17th, 6:49AM - 6:52AM GMT - Downtime for 3 minutes due to an error in mod_passenger.
Sunday, January 31st, 7:07AM - 7:22AM GMT - Downtime for 15 minutes due to a problem in ELB (Elastic Load Balancer).
Saturday, January 30th, 12:05PM - 12:40PM GMT- Scheduled maintenance for 35 minutes (originally scheduled 1 hour).
2015 - Updates (Total of more than 99.99% uptime)
Friday, Oct 16, 17:43-17:50 GMT - Heavy load delays on our servers for 7 minutes, some users experienced problems connecting to PT.
Tuesday, May 19, 20:55-20:58 GMT - Heavy load delays on our servers for 4 minutes.
Thursday, April 16, 08:20 - 08:53 GMT - There was a global issue with our DNS provider (GoDaddy); browsers which had to refresh the DNS entry from the GoDaddy servers could not connect.
2014 - Updates (Total of more than 99.99% uptime)
Friday, December 26, 06:18 - 06:20 GMT - Heavy load delays on our servers for 3 minutes, limited number of users experienced problems connecting to PT
Thursday, December 11, 11:30 - 11:33 GMT - Heavy load delays on our servers for 4 minutes, users experienced problems connecting to PT
Sunday, September 29, 12:26 - 12:28 GMT - Issues with one of our servers for 3 minutes, users experienced problems connecting to PT
Friday, September 12, 6:30 - 7 GMT - slowness in several modules of the application
Saturday, August 3rd, 5:00 - 8:05 GMT- Scheduled maintenance for 3:05 hours (originally scheduled 4 hours).
Friday, April 4, 19:13- 19:25 GMT - an elevated error rate (some users experienced delays for up to 12 minutes)
2013 - No downtimes (100% uptime!). One scheduled maintenance:
Saturday, Dec 7, 9:00-9:10PM GMT - Scheduled maintenance for 10 minutes
2012 Updates (total of more than 99.8% uptime):
Dec 25, 9:33-9:36 GMT - Some latency issues (for about 3 min) caused by AWS.
Oct 22nd 5:45 PM GMT- Oct 23 8:45 AM GMT - Downtime due to Amazon AWS Storage Issue. We had to restore from our backups.
October 15th, 7:51 AM - 11:52 AM GMT - Limited connectivity caused by AWS internal issues. Some users experienced high response times.
March 15th, 9:35AM - 9:48AM GMT - Downtime due to Amazon AWS connectivity issue
March 3rd, Saturday, 9PM - 11PM GMT- Scheduled maintenance for two hours