Anticipating Traffic Surges - Lessons Learned from ESPN Crash
September 13, 2016

Michelle McLean
ScaleArc

ESPN made headlines this past weekend – the bittersweet kind. Unfortunately, the news was that ESPN's fantasy football app crashed on the first Sunday of the NFL season. Where's the "sweet" part? The crash likely signals that the app is hugely popular with users.

We see these types of stories often during so-called "surge" events, like when Black Friday takes down a retailer. Why? Often, it's the database that's been swamped in the process.

The application-to-database connection is fragile because applications tie directly into the database, and the app's code must match the database infrastructure. For example, if the database has multiple servers that can all respond to an inbound request, the application needs to know which type of server to send each request to. Teaching the application to route requests this way can improve response time, but the work isn't trivial: a programmer must go through hundreds of thousands of lines of code to specify how to handle reads vs. writes, and the changes can introduce errors.
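
To make the reads-vs-writes decision concrete, here is a minimal Python sketch of the routing logic an application would need. The class name `QueryRouter` and the server names are hypothetical, chosen only for illustration, and the prefix check is a deliberate simplification: a real router would also need to handle transactions, locking reads, and replication lag.

```python
import itertools


class QueryRouter:
    """Sends read statements to replicas and everything else to the primary.

    Illustrative only: the names and the prefix-based check are assumptions,
    not a description of any particular product or of ESPN's setup.
    """

    READ_PREFIXES = ("select", "show")

    def __init__(self, primary, replicas):
        self.primary = primary
        # Rotate through replicas; fall back to the primary if there are none.
        self._replicas = itertools.cycle(replicas or [primary])

    def is_read(self, sql):
        # Naive classification based on the first keyword of the statement.
        return sql.lstrip().lower().startswith(self.READ_PREFIXES)

    def route(self, sql):
        # Reads are spread across replicas; writes always go to the primary.
        return next(self._replicas) if self.is_read(sql) else self.primary
```

Every place in the application that issues a query would have to go through logic like this, which is why retrofitting it into a large codebase is error-prone.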

Any recent changes by ESPN to increase database capacity or update the app could have jeopardized that fragile connection. If ESPN recently modified the application to talk to different database servers, for example, the team might have accidentally introduced a "bad" query that the database can't handle, or might have changed how the application talks to the database and broken that connection.

Organizations that anticipate a surge in traffic should follow a number of best practices to ensure a smooth experience for their customers, including:

1. Freezing code early

Despite the understandable desire to make the app or site as current as possible, it's essential for engineering to enforce a code freeze many weeks before the "go live" date. Quality assurance (QA) and other testing require adequate time to ensure the updated site or app is working as needed.

2. Load testing

A big part of that testing work needs to come in the form of load testing. After a QA team has performed functional testing – that is, does each feature work – the next step is to see how the code performs when it's swamped with traffic. The key is to perform this load testing with traffic that's as close to production traffic as possible.
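
As a rough illustration of what a load test measures, the Python sketch below fires many concurrent requests and reports the error count and an approximate 95th-percentile latency. The function `run_load_test` and its `send_request` parameter are hypothetical names; in practice `send_request` would replay traffic that resembles production, and a dedicated load-testing tool would usually replace a hand-rolled harness like this.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load_test(send_request, total_requests, concurrency):
    """Calls send_request(i) total_requests times using a pool of threads.

    send_request is a caller-supplied function (an assumption of this sketch)
    that issues one request and raises an exception on failure.
    """

    def timed_call(i):
        start = time.perf_counter()
        try:
            send_request(i)
            ok = True
        except Exception:
            ok = False
        return ok, time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))

    # Only successful requests contribute to the latency figure.
    latencies = sorted(elapsed for ok, elapsed in results if ok)
    errors = sum(1 for ok, _ in results if not ok)
    # Approximate 95th percentile of successful request latencies.
    p95 = latencies[int(0.95 * (len(latencies) - 1))] if latencies else None
    return {"requests": total_requests, "errors": errors, "p95_seconds": p95}
```

Raising `concurrency` step by step and watching where errors appear or latency climbs gives a first estimate of how much traffic the stack can absorb.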

3. Increasing resiliency at the data tier

The lifeblood of any app or site is data; without it, you're down. To build in resiliency at this layer, organizations need to employ techniques such as database scale-out, which keeps multiple copies of the data available, and database load balancing, which ensures each user's request is served by the fastest-responding server.
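
One simple way to approximate "send traffic to the fastest-responding server" is to keep a moving average of each server's response time and pick the lowest. The Python sketch below shows that idea; the class `LatencyAwareBalancer`, the `smoothing` parameter, and the server names are assumptions made for illustration, not a description of how any specific load balancer works.

```python
class LatencyAwareBalancer:
    """Picks the server with the lowest smoothed response time.

    Illustrative sketch: uses an exponentially weighted moving average so
    that recent measurements count more than old ones.
    """

    def __init__(self, servers, smoothing=0.2):
        self.smoothing = smoothing
        # Starting at 0.0 means servers with no measurements are tried first.
        self.avg_latency = {server: 0.0 for server in servers}

    def record(self, server, seconds):
        # Blend the new measurement into the running average for this server.
        previous = self.avg_latency[server]
        self.avg_latency[server] = (
            (1 - self.smoothing) * previous + self.smoothing * seconds
        )

    def pick(self):
        # Choose the server whose smoothed latency is currently lowest.
        return min(self.avg_latency, key=self.avg_latency.get)
```

A production balancer would also need health checks so that a server that stops responding altogether is taken out of rotation instead of merely being ranked last.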

4. Enabling redundancy in all network services

Beyond the data tier, organizations need to make sure the rest of the technology stack has as much redundancy built in as possible. Web server infrastructure and web load balancers are critical, as is network redundancy into both the web farms and the database server clusters. If you're hosting the app or service in the cloud, ensure a redundant version is available in an alternate cloud region.

Michelle McLean is VP of Marketing at ScaleArc.
