Observability at Scale

Speaker:
Natalia Chechina


Abstract:

Observability is about understanding your system – its performance and the reasons for its actions, inactions, and failures – and being able to act on system limitations before they become a problem. The main tools of observability are metrics and logs. To empower developers instead of burdening them with overhead and endless useless data, both metrics and logs should be designed into the system rather than added on afterwards. This is particularly true for large-scale systems.

In this talk, I will share my experience and rules of thumb for working with metrics and logs at scale. I will also cover the theory behind metrics and logs.

Key Takeaways:

  • Observability is one of the key elements of code development and maintenance.

Target Audience:

  • Everybody involved in code development.

Tags:
observability, performance, scale