This week marks the first Amazon Web Services user conference. The AWS event, re:Invent, is being held in Las Vegas, and given the massive awareness that AWS and its ecosystem have, we should see lots of product announcements from both Amazon itself and ecosystem companies.
First up is Boundary, which is today releasing a new set of monitoring capabilities for AWS customers, designed to give users early warning of application infrastructure issues. This would appear to be the holy grail from an operations perspective: the ability to get a proactive sense of performance issues takes much of the risk and burden out of cloud operations, and out of operations more generally. Boundary takes its aggregate understanding of “normal” application behavior and, using an analytics engine, warns users of potential problems.
The new version collects performance data every second and, over time, builds an understanding of the dynamic application topology that can provide real-time, analytics-driven warnings of performance abnormalities. Alongside the core reporting functions, Boundary has also introduced a data store that enables customers to archive detailed performance data and report on long-term trends and comparative metrics.
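To make the idea of learning "normal" behavior and flagging deviations more concrete, here is a minimal sketch of one common approach: a rolling baseline with a standard-deviation threshold. It is not Boundary's actual analytics engine, whose internals are not described here. The class name `BaselineMonitor`, the `window` and `threshold` parameters, and the sample latency values are all assumptions made purely for illustration.

```python
from collections import deque
from statistics import mean, stdev


class BaselineMonitor:
    """Hypothetical rolling-baseline detector (illustrative only)."""

    def __init__(self, window=60, threshold=3.0):
        # Keep only the most recent `window` samples, e.g. one per second.
        self.samples = deque(maxlen=window)
        # Number of standard deviations that counts as abnormal.
        self.threshold = threshold

    def observe(self, value):
        """Return True if `value` deviates from the baseline, then record it."""
        abnormal = False
        if len(self.samples) >= 2:
            baseline = mean(self.samples)
            spread = stdev(self.samples)
            # A perfectly flat history has zero spread; skip the check then.
            if spread > 0 and abs(value - baseline) > self.threshold * spread:
                abnormal = True
        self.samples.append(value)
        return abnormal


if __name__ == "__main__":
    monitor = BaselineMonitor(window=10, threshold=3.0)
    # Made-up latency readings in milliseconds, ending with a spike.
    for latency in [100, 102, 98, 101, 99, 100, 103, 97, 100, 101, 500]:
        if monitor.observe(latency):
            print(f"warning: latency {latency} ms is outside the normal range")
```

A production system would presumably use far richer models and account for topology and seasonality; the sketch only shows the basic shape of comparing each new sample against a learned baseline.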
Using the combined reporting capability and long-term data store, Boundary customers can examine all the metrics for prior periods to help in problem diagnosis. Over time this provides an opportunity to improve performance and hence reduce the impact of service degradation on customers.
Interestingly, Boundary reports that it detected the recent AWS outage more than two hours before Amazon announced it, and that a customer detected the Azure outage 15 hours before Microsoft announced it. This level of forward insight is increasingly important for cloud users.
While many talk about the cloud as some kind of magic bullet that renders any sort of management unnecessary, the truth is somewhat more nuanced than that. While it is true that the cloud generally abstracts responsibility for operations away from the customer, a degree of management is still required. Giving a proactive view of performance, and allowing a user to gain a degree of predictive insight from long-term trends, is a positive move for cloud customers.
That said, monitoring without management seems to me to be only a partial answer. I predict some consolidation, whereby monitoring and management solutions are combined into a single service. While tight integrations between management and monitoring solutions are useful and valuable, an even tighter link that creates a one-stop shop is preferable.