Start with Part One of this article: Down Goes the Internet (Again) – Are You Ready?
In this era of unprecedented complexity, it's virtually impossible for a modern website to eliminate all the risk associated with using third parties. However, there are proactive strategies an organization can implement to better manage and minimize its risk. These include:
1. Proactively monitor speed and availability
Proactively monitor the speed and availability of websites, web applications and mobile sites from the true end-user perspective.
Today, many elements stand between your data center and your users: not just third-party services, but also content delivery networks (CDNs), local and regional ISPs, mobile carrier networks and browsers. Measuring performance from your data center alone is insufficient – unless, of course, your users live in your data center, which is highly unlikely.
The true browser-based perspective is the only place where you can accurately gauge your users' experience at the end of an extremely long and complicated technology path known as the application delivery chain. Today's new-generation application performance management (APM) solutions are based on this true user perspective.
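To make the end-user perspective concrete, here is a minimal sketch of how browser-side measurements might be broken down by where users actually are. The sample data, the segment names and the `median_load_by_segment` helper are all hypothetical and for illustration only; a real APM product would collect and aggregate this for you.

```python
from collections import defaultdict
from statistics import median

# Hypothetical real-user measurements: (region, network, page load time in seconds).
measurements = [
    ("us-east", "broadband", 1.8),
    ("us-east", "broadband", 2.1),
    ("us-east", "mobile", 4.9),
    ("eu-west", "broadband", 2.6),
    ("eu-west", "mobile", 6.3),
    ("eu-west", "mobile", 5.7),
]

def median_load_by_segment(rows):
    """Group load times by (region, network) and return the median of each group."""
    groups = defaultdict(list)
    for region, network, seconds in rows:
        groups[(region, network)].append(seconds)
    return {segment: median(values) for segment, values in groups.items()}

by_segment = median_load_by_segment(measurements)
for segment, value in sorted(by_segment.items()):
    print(segment, f"{value:.2f}s")
```

In this made-up data, mobile users in one region wait roughly three times as long as broadband users in another, a gap that a single measurement taken inside the data center would not reveal.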
2. Monitor all transactions
Monitor all transactions, 24x7, along the complete application delivery chain. Sampling is not a sufficient means of gauging performance, because a major performance issue may very well occur outside your testing interval – think of the Amazon EC2 outage that impacted Netflix on Christmas day last year!
Due to the unpredictability of major service outages, you need to be monitoring all transactions around the clock, to identify all performance aberrations and their root causes – both within and beyond the firewall – quickly and accurately, and get ahead of them.
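The sampling problem can be illustrated with a small simulation. This is only a sketch under stated assumptions: the 30-minute probe schedule, the 10-minute outage window and the function names are hypothetical, and a one-minute probe is used as a simple stand-in for monitoring every transaction.

```python
def probe_times(interval_min, day_minutes=24 * 60):
    """Minutes of the day at which a scheduled probe would run."""
    return list(range(0, day_minutes, interval_min))

def outage_detected(probes, outage_start, outage_end):
    """True if any probe falls inside the half-open outage window [start, end)."""
    return any(outage_start <= t < outage_end for t in probes)

# Hypothetical outage: 10 minutes long, starting at minute 605 of the day (10:05).
outage = (605, 615)

sampled = probe_times(interval_min=30)    # synthetic check every 30 minutes
continuous = probe_times(interval_min=1)  # stand-in for watching every transaction

print("30-minute sampling sees outage:", outage_detected(sampled, *outage))
print("1-minute coverage sees outage:", outage_detected(continuous, *outage))
```

With these numbers the 30-minute probes run at minutes 600 and 630, so the outage falls entirely between two checks and goes unseen, while the continuous stand-in catches it.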
3. Baseline and uphold performance-focused SLAs
Service-level agreements (SLAs) promising a certain level of availability on the part of a third-party service provider mean very little when it comes to performance.
For example, just because your cloud service provider's servers are up and running does not mean your users are experiencing an acceptable level of speed and reliability. Remember, third-party services of all types are serving thousands of customers like you around the globe, and a spike in another customer's traffic may impact you.
With little insight into third-party service providers' capacity planning decisions, you need to monitor performance levels yourself to ensure they don't drop off, and validate these against performance-focused SLAs. To get a sense of how a third-party service provider may be impacting your overall performance, it can be helpful to compare your site's speed and availability before the third-party service is added with the same measurements taken afterwards.
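A before-and-after comparison of this kind might look like the following sketch. Everything here is an assumption made for illustration: the sample measurements, the `summarize` and `meets_sla` helpers, and the SLA thresholds (99% availability, 2.5-second median load time) are hypothetical and not taken from any real contract.

```python
from statistics import median

def summarize(samples):
    """Summarize non-empty samples of (succeeded, load_seconds) pairs.

    Assumes at least one sample succeeded; failed requests count against
    availability and are excluded from the load-time median.
    """
    ok_times = [seconds for succeeded, seconds in samples if succeeded]
    return {
        "availability": len(ok_times) / len(samples),
        "median_load": median(ok_times),
    }

def meets_sla(summary, min_availability, max_median_load):
    """Check a summary against hypothetical performance-focused SLA thresholds."""
    return (summary["availability"] >= min_availability
            and summary["median_load"] <= max_median_load)

# Hypothetical measurements before and after adding a third-party service.
before = [(True, 1.9), (True, 2.0), (True, 2.2), (True, 2.1)]
after = [(True, 2.8), (True, 3.1), (False, 0.0), (True, 2.9)]

baseline = summarize(before)
current = summarize(after)

for label, summary in (("before", baseline), ("after", current)):
    verdict = meets_sla(summary, min_availability=0.99, max_median_load=2.5)
    print(label, summary, "meets SLA:", verdict)
```

The point of the sketch is that the check covers speed as well as uptime: in this made-up data the "after" measurements would miss the target on median load time alone, even if every request had succeeded.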
4. Utilize industry resources
Utilize industry resources to better assess if the source of a performance problem lies with you or one of your third-party service providers, as well as the likely performance impact on your customers.
These resources may not prevent third-party service outages from happening, but they can help companies better understand the source of performance problems so they can get in front of them more confidently and efficiently.
Conclusion
The reality is that the delivery chain underlying the services we often take for granted is so tenuous that it's a marvel they don't break down more often. While outages may be inevitable, this does not make them any less costly or damaging to a company's reputation and revenues.
For example, on August 19, Amazon's North American retail site went down for about 49 minutes, with visitors greeted with the word “oops.” No explanation was given, but one estimate by Forbes put the cost to Amazon at nearly $2 million in sales.
But it's not just the “big guys” like Amazon that you need to focus on. The fact is that little storms are happening on the internet all the time, and you need to be prepared for them. When it comes to surviving and thriving in the age of increasing web complexity, an ounce of prevention can be worth a pound of cure. By taking advantage of several relatively simple and inexpensive approaches, organizations can better exploit all that third-party services have to offer, while reducing the inherent risks.
Klaus Enzenhofer is Technology Strategist for Compuware APM’s Center of Excellence.