Thanks to the maturation of DevOps culture, the ubiquity of cloud services, and the momentum of microservices-based architectures, it’s safe to say that modern software teams now need to monitor their applications and infrastructure much differently from how they did a few years back.
They need observability—i.e., the gathering, visualisation, and analysis of metrics, traces, and logs in one place in order to gain full insight into the operation of their systems.
The term “observability” is not just a fancy synonym for monitoring.
Observability is the answer when traditional monitoring won’t suffice and modern software teams need to play offence with their software.
Development teams face huge pressure to continually innovate and ship new features in order to stay ahead of the competition.
The bar is continually being raised by end-users, and there’s an expectation that software will deliver more when it comes to rich features and performance.
Customers have also become less tolerant of slow, error-prone experiences, and won’t hesitate to uninstall an app if it crashes or causes errors.
Observability vs basic monitoring
High demands from the business and from end-users make it critical for companies to quickly identify and respond to errors as soon as they arise.
Traditional monitoring once relied on error logs and a variety of metrics to identify a problem, so that a team could respond as quickly as possible.
Observability goes deeper.
It looks at the “why” rather than just the “what-went-wrong” – gathering data about every component within a system.
It’s not just about collecting error data, but also analysing this data and gaining insights into the reasons why an error occurred.
Yuri Shkuro, author of “Mastering Distributed Tracing”, and software engineer at Uber Technologies, explains the difference this way:
- Monitoring is about measuring what you decide in advance is important
- Observability is the ability to ask questions about your system that you don’t know upfront
In the software lifecycle, observability encompasses the gathering, visualisation, and analysis of metrics, events, logs, and traces to gain a holistic understanding of a system’s operation.
It lets you understand why something is wrong, compared to monitoring, which simply tells you when something is wrong.
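To make the distinction concrete, here is a minimal Python sketch, assuming each request is recorded as a structured event. The event fields (service, region, version, status, duration_ms), the sample data, and the function names are all hypothetical and chosen for illustration only; they are not taken from any particular tool. The first function is a monitoring-style check on a metric chosen in advance, while the second lets you slice errors by whichever fields turn out to matter after the fact.

```python
from collections import Counter

# Hypothetical structured events; every field name and value is an
# illustrative assumption, not the schema of a real product.
EVENTS = [
    {"service": "checkout", "region": "apac", "version": "1.4.2", "status": 500, "duration_ms": 910},
    {"service": "checkout", "region": "apac", "version": "1.4.2", "status": 500, "duration_ms": 870},
    {"service": "checkout", "region": "emea", "version": "1.4.1", "status": 200, "duration_ms": 120},
    {"service": "search", "region": "apac", "version": "2.0.0", "status": 200, "duration_ms": 95},
    {"service": "checkout", "region": "apac", "version": "1.4.1", "status": 200, "duration_ms": 130},
]


def error_rate(events):
    """Share of events with a 5xx status (0.0 for an empty list)."""
    if not events:
        return 0.0
    errors = sum(1 for e in events if e["status"] >= 500)
    return errors / len(events)


def monitoring_alert(events, threshold=0.25):
    """Monitoring: answers one question decided in advance.

    Tells you that something is wrong, but not where or why.
    """
    return error_rate(events) > threshold


def errors_by(events, *fields):
    """Observability-style query: count errors grouped by any fields,
    chosen at investigation time rather than upfront."""
    counts = Counter()
    for e in events:
        if e["status"] >= 500:
            counts[tuple(e[f] for f in fields)] += 1
    return counts


if __name__ == "__main__":
    print(monitoring_alert(EVENTS))
    print(errors_by(EVENTS, "service", "version"))
    print(errors_by(EVENTS, "region"))
```

In this toy data set the alert only signals that the error rate is too high. Grouping the same events by service and version, a question nobody had to anticipate, shows that the errors all come from one release of one service.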
The rise of DevOps and automation
The growing trend of DevOps and automation in software development has also created the need for observability.
Automation reduces repetitive, low-value work, but it can also fail.
For a CI/CD (continuous integration and continuous delivery) process to work properly, it needs continual feedback.
Frequent deployments and dynamic infrastructure mean introducing more risk more frequently.
Changes shouldn’t simply be pushed through without a clear way to tell whether they make things better or worse.
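One simple form this feedback can take is comparing the failure rate before and after a deployment. The Python sketch below assumes request outcomes are available as lists of booleans (True for success); the function names and the one-percentage-point tolerance are hypothetical choices for illustration, not a recommended threshold.

```python
def failure_rate(outcomes):
    """Fraction of failed requests; outcomes is a list of booleans (True = success)."""
    if not outcomes:
        return 0.0
    return sum(1 for ok in outcomes if not ok) / len(outcomes)


def deployment_verdict(before, after, tolerance=0.01):
    """Classify a deployment by the change in failure rate.

    tolerance is an assumed noise margin: changes smaller than it
    are reported as "unchanged".
    """
    delta = failure_rate(after) - failure_rate(before)
    if delta > tolerance:
        return "worse"
    if delta < -tolerance:
        return "better"
    return "unchanged"


if __name__ == "__main__":
    before = [True] * 98 + [False] * 2   # 2% failures before the deploy
    after = [True] * 90 + [False] * 10   # 10% failures after the deploy
    print(deployment_verdict(before, after))
```

A real pipeline would draw on far richer signals, such as latency, traces, and logs, but the principle is the same: every change is followed by a measurement that says whether to keep it or roll it back.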
Team members have to understand and be able to troubleshoot parts of an application that they weren’t previously familiar with.
For example, a database expert must also know about networking and APIs.
The range of technologies involved is too vast for any one person to master. Observability helps teams to better understand different technologies in the context of the work they accomplish.
One global food company that we work with has 1,000 developers in its engineering organisation working across 15 platforms, with over 500 applications to oversee.
No one has the time to inspect each application individually, so a high-level view with insight into errors becomes vital for seeing how all systems are performing and whether any issues need fixing.
The advantages of observability
Observability has a range of important benefits for development teams, such as enabling the creation of a data-driven, innovative culture while delivering high-quality software at scale.
Observability delivers a connected view of all of the performance data in one place, in real time.
It reveals how all entities in a system are related to each other at any moment in time. And this allows teams to pinpoint issues faster, understand why they happened, and ultimately deliver optimal customer experiences.
Successful digital businesses are turning to observability to get ahead and build better software.
by Dmitri Chen, New Relic APJ executive vice president & general manager
This article was first published by IT Brief