Ricardo Castro
mccricardo.bsky.social
Senior Principal Engineer, tech speaker & writer, @DevOpsPorto and @DevOpsDaysPT, @CDeliveryFdn Ambassador, martial arts amateur, and metal lover. Opinions are my own.

mccricardo.com
"You Should Write An Agent" by Thomas Ptacek

fly.io/blog/everyon...
You Should Write An Agent
They're like riding a bike: easy, and you don't get it until you try.
fly.io
November 10, 2025 at 5:01 PM
"Faster root cause for slow traces with ClickStack Event Deltas" by Dale McDiarmid

clickhouse.com/blog/%20fast...
Faster root cause for slow traces with ClickStack Event Deltas
Read how ClickStack's improved Event Deltas make it effortless to pinpoint the root causes of performance outliers in observability data - turning complex trace analysis into instant, actionable…
clickhouse.com
November 10, 2025 at 1:01 PM
"Announcing Istio 1.28.0"

istio.io/latest/news/...
Announcing Istio 1.28.0
Istio 1.28 Release Announcement.
istio.io
November 8, 2025 at 6:01 PM
"Cloud Native Computing Foundation Announces Graduation of Crossplane"

www.cncf.io/announcement...
Crossplane’s Graduation Announcement
Graduation marks Crossplane’s readiness for widespread use and its evolution from a control plane framework to groundwork for intelligent, secure, and scalable cloud operations and platform…
www.cncf.io
November 7, 2025 at 5:01 PM
TicketOps is perfectly fine for relatively stable stuff.

At scale, it breaks.
November 7, 2025 at 2:51 PM
"SQL expressions in Grafana: Combine and manipulate data from multiple sources" by Sam Jewell and Kyle Brandt

grafana.com/blog/2025/10...
SQL expressions in Grafana: Combine and manipulate data from multiple sources | Grafana Labs
SQL expressions are a versatile and powerful feature that opens up all sorts of creative possibilities by manipulating and combining data from different data sources.
grafana.com
November 7, 2025 at 1:01 PM
In the dawn of a new wave of AI, if you're still thinking about infrastructure as code and not infrastructure as software, you're living in the past.
November 7, 2025 at 12:56 PM
SRE is much more than just incident response.

I thought this needed to be highlighted since many are talking about "AI SRE", which mostly focuses on incident response.
November 6, 2025 at 6:03 PM
"OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces" by Anjali Udasi

last9.io/blog/consist...
OTel Updates: Consistent Probability Sampling Fixes Fragmented Traces | Last9
One sampling decision, propagated everywhere. OpenTelemetry's Consistent Probability Sampling fixes fragmented traces across services.
last9.io
November 6, 2025 at 1:01 PM
Consistency is underrated.

Many people believe in a "big bang" event that propels their career. And while there are certain cases where that's true, consistency is usually a better investment of your time.

Invest in being consistent and you'll reap rewards.
November 5, 2025 at 6:02 PM
"Introducing Agent HQ: Any agent, any way you work" by Kyle Daigle

github.blog/news-insight...
Introducing Agent HQ: Any agent, any way you work
At Universe 2025, GitHub's next evolution introduces a single, unified workflow for developers to be able to orchestrate any agent, any time, anywhere.
github.blog
November 5, 2025 at 5:01 PM
"Effortless Observability - Integrating CloudWatch Application Signals with OpenTelemetry" by Tobias Schmidt

awsfundamentals.com/blog/cloudwa...
How to Use AWS CloudWatch Application Signals with OpenTelemetry on ECS Fargate and Lambda
This guide shows how to connect CloudWatch Application Signals with OpenTelemetry. See simple steps for ECS Fargate and Lambda. Example code included. Get clear metrics and traces fast.
awsfundamentals.com
November 5, 2025 at 1:01 PM
"Go and enhance your calm: demolishing an HTTP/2 interop problem" by Lucas Pardue and Zak Cutner

blog.cloudflare.com/go-and-enhan...
Go and enhance your calm: demolishing an HTTP/2 interop problem
HTTP/2 implementations often respond to suspected attacks by closing the connection with an ENHANCE_YOUR_CALM error code. Learn how a common pattern of using Go's HTTP/2 client can lead to unintended…
blog.cloudflare.com
November 4, 2025 at 5:04 PM
"From Signals to Reliability: SLOs, Runbooks and Post-Mortems" by Fatih Koç

fatihkoc.net/posts/sre-ob...
From Signals to Reliability: SLOs, Runbooks and Post-Mortems
Build reliability with SLOs, runbooks and post-mortems. Turn observability into systematic incident response and learning. Practical examples for Kubernetes environments.
fatihkoc.net
November 4, 2025 at 1:02 PM
Reliability, like any other feature, needs to be prioritised accordingly.

There will be times when reliability work will be the priority. Other times, product features will be the priority.
And so on.

If one topic massively overshadows all the others, problems will arise.
November 3, 2025 at 6:03 PM
For platforms to be valuable they need to be force multipliers.

That means being more than the sum of their parts.
November 3, 2025 at 1:02 PM
You always need to take roles and titles with a grain of salt.

I often meet DevOps/SREs/PlatEng all doing very similar jobs.

I also often meet groups of DevOps doing quite different jobs. The same applies to SREs and PlatEngs.

Context is crucial.
October 31, 2025 at 6:03 PM
Some people look down on quality assurance and security, or think of them as annoyances.

In the age of AI, if they continue to have that perspective, they'll have a rude awakening.
October 31, 2025 at 1:04 PM
Important: hire adults.

Also important: treat them like adults.
October 31, 2025 at 12:50 PM
Strive for civil discourse on your teams.

Some of the most creative solutions I've seen were born from discussions between people with completely different views on how to approach a problem.

Promoting diversity lays a good foundation for this to happen organically.
October 31, 2025 at 9:46 AM
People who say "that's a DevOps team problem" have absolutely no clue what DevOps is about.
October 30, 2025 at 6:02 PM
For complex issues, I like runbooks because they allow me to really understand the problem before trying to automate it.

In the long run, for most issues, I strive for automation. But starting with runbooks allows me to understand the quirks before automating.
October 30, 2025 at 1:05 PM