Observability – Now What?

Bob Quillin November 7, 2024
Image Credit ESA/Hubble & NASA, M. Sun

First there was monitoring. And logging. Then came traces. And let’s add RUM and profiling into that mix, too. Oh yeah, don’t forget APM. That’s a lot of signals, and I’m sure I missed a few. Along the way though, “Observability” was born. It first appeared in this new context in print in a Twitter paper in 2013. In parallel, it had already started to spring up around the industry, serving to collect a set of discrete concepts into a holistic mission for a growing group of practitioners looking for a name that better described their work and their philosophy. 

Modern Control Theory and Observability

Scientifically, the term “observability” goes back even further to 1960, to the Father of Modern Control Theory Professor Rudolf E. Kálmán. It’s fascinating to look back at the math and critical supporting concepts such as Controllability and the Principle of Duality that form the basis for Control Systems and compare where Observability is today versus where it started back then. More on that later. 

Observability Adrift

Over the last decade, observability has succeeded in creating an umbrella concept that elevates all the individual building blocks into a greater purpose, or as Honeycomb puts it, to search for the unknown unknowns with the lofty goal of truly understanding your systems. Unfortunately, it also created an umbrella excuse that gave many vendors the green light to collect all the data that now fell under this new umbrella term, from logs to metrics to APM to RUM to whatever comes next, creating higher and higher bills, less choice, and the dreaded vendor lock-in dilemma. The gap between the greater goals of observability and today’s monolithic vendor superstore solutions has never seemed wider.

Why Are We Observing?

Worse than that, the current state of Observability and the original Control Theory concepts also couldn’t be more divergent. In Control Theory, observability is the dual of controllability, with the greater goal focused on control, not just observation. The question shouldn’t be how much more observability data we can or should generate, store, or dashboard. The question we should be asking ourselves is why. Why are we generating this data in the first place? What are we actually controlling? The unintended consequence of searching for unknown unknowns is that most teams just collect everything they can until their costs get too high. At that point, they have to turn off certain signals, sample, or re-instrument their applications. Unfortunately, new problems then emerge or old problems return without the data to troubleshoot, analyze, or diagnose them. And then we repeat that cycle over and over.

Blame the Cloud Era

The client-server, distributed computing, and virtualization period that preceded the cloud created huge advancements in network, system, storage, and application management. SNMP democratized network instrumentation, and enterprise management solutions consolidated what were once siloed disciplines into unified platforms that enabled broader automation, visibility, runbooks, and root cause analysis.

That all changed with the cloud, containers, microservices, and an explosion of so much more data, more complexity, new security concerns, and new vendors who geared up to capitalize on all of these. The previous problems didn’t disappear, but were swept up with a whole new set of challenges. The OpenView, Tivoli, and vCenter platforms faded away and were replaced with new Observability solutions like Datadog, CloudWatch, and Grafana.

Sadly, just collecting, storing, and dashboarding everything in one place didn’t solve the problem. Sometimes we didn’t know what to collect so we collected everything that was available from logs to metrics to more. Sometimes we didn’t have a choice and just collected whatever was available from our developers, cloud providers, and vendors. And still we searched and chased the elusive “observability” goal.

With the new awareness brought about by this observability movement, we got smarter and began to tune our signals, collection, and storage based on cost, specific use case, or both. But the incredible rate of change in our environments and infrastructure made this manual process of tuning, re-tuning, and tuning again a maddening cycle of whack-a-mole. As soon as we got the telemetry and KPIs we liked, their price went up or they became a premium offer, or just as likely, our applications, infrastructure, and business changed, which shifted the needs and economics again.

Changes Ahead – Observability Leads Us To?

It’s clear that a new AI era is emerging, and many predict it will be much bigger than the current cloud epoch. History tells us that new problems, new platforms, and a new era will usher in new solutions while previous winners hang on but eventually fade away. Observability as we know it is over 11 years old. It has succeeded in raising the awareness and science of how we manage our systems and should serve us well as a building block and foundation as we move forward.

Observability was essential for the cloud. We will need more than observability for what’s coming next. Observability is a starting point but it should be leading us to much higher aspirational goals of stability and controllability. AI will unleash a whole new set of data plane intelligence at the edge, control plane intelligence in the middle, and management plane intelligence for teams in the form of new agentic AI and generative AI solutions. 

OpenTelemetry, much like SNMP, Kubernetes, and Docker before it, will accelerate innovation and democratize and open up instrumentation. This will perhaps be the first domino to fall in breaking up the observability superstores, unleashing new waves of higher-order solutions, much better business data, greater efficiency, lower costs, and a new era of choice for organizations.

What’s next for observability? Thank it for its service and now reach for something more.

For more information, please contact us at info@controltheory.com
