infra weekly recap: Early October 2025
Another Saturday, so time for another weekly recap!
## Looping mailing list bounces are not fun
We had a bit of fun in the early part of the week with our mailman server alerting about having a lot of mails in the queue. Looking at it, I found that they were almost entirely bounces, but why?
Well, it was a sad confluence of events: some providers send bounce emails that are almost completely useless. I'll go ahead and name names: Mimecast (used by Red Hat) and IBM (their own thing I guess?). These bounces don't include the original email, don't include headers from the original email, and don't include who the email was sent to. So, for example, say a fictional someone named Bob Smith signs up for a fedora list with [email protected]. They then leave the company, and emails to them bounce with a message saying "foobar@somethinginternal" bounced. You have no way to tell who it really was unless their internal name and external name match up. mailman cannot process these bounce messages at all, so it just drops them.
But it gets worse. If someone in that state was an owner of a list, and they also enabled the 'send bounce emails you cannot process to list owners' option, then... the email bounces, the bounce can't be processed, it gets sent to the list owners, where it causes another bounce, etc. ;(
I managed to figure out the addresses causing the current issue, but it's frustrating. </end rant>
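For contrast, here is a rough sketch of what a *useful* bounce gives you. A standards-style delivery status notification carries a `message/delivery-status` part with a `Final-Recipient` field, and that is the piece missing from the bounces above. This uses only Python's standard `email` module; it is not how mailman does its bounce detection, and the function name and sample message are made up for illustration.

```python
import email


def failed_recipients(raw_bounce: str) -> list[str]:
    """Collect the addresses named in Final-Recipient fields of a bounce."""
    msg = email.message_from_string(raw_bounce)
    found = []
    # walk() visits every MIME part, including the header blocks that the
    # parser builds for a message/delivery-status part.
    for part in msg.walk():
        value = part.get("Final-Recipient")
        if value:
            # "rfc822; bob@example.com" -> "bob@example.com"
            found.append(value.split(";", 1)[-1].strip())
    return found


# Hypothetical, minimal bounce in delivery-status-notification form.
SAMPLE_DSN = "\n".join([
    "From: MAILER-DAEMON@example.com",
    "To: somelist-bounces@example.org",
    "Subject: Undelivered Mail",
    "MIME-Version: 1.0",
    'Content-Type: multipart/report; report-type=delivery-status; boundary="BOUND"',
    "",
    "--BOUND",
    "Content-Type: text/plain",
    "",
    "Delivery failed.",
    "",
    "--BOUND",
    "Content-Type: message/delivery-status",
    "",
    "Reporting-MTA: dns; mail.example.com",
    "",
    "Final-Recipient: rfc822; bob@example.com",
    "Action: failed",
    "Status: 5.1.1",
    "",
    "--BOUND--",
    "",
])

# Hypothetical bounce with only a vague text body and no status part.
USELESS_BOUNCE = "\n".join([
    "From: MAILER-DAEMON@example.com",
    "Subject: Delivery problem",
    "",
    "foobar@somethinginternal bounced",
    "",
])

if __name__ == "__main__":
    print(failed_recipients(SAMPLE_DSN))
    print(failed_recipients(USELESS_BOUNCE))
```

With the first kind of message you can map the bounce back to a subscriber. With the second there is nothing structured to go on, which is exactly the situation described above.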
## rdu2-cc to rdu3 dc move
The move of our rdu2 community cage hardware to rdu3 continues. I was working on network ACLs to pass to the networking folks this week. Hopefully I got everything at least to a working starter state.
Still looking like we are going to try a November move, but I am hoping we can get a new replacement machine in before that so I can actually migrate pagure.io before the move. The rest of the hosts there are not too critical and can be down, but it would be nice to avoid downtime for pagure.io.
## mass update/reboot
We did another mass update/reboot cycle this week. Wanted to get everything up and on the latest updates before going into final freeze next week.
To give a bit of history here, you may wonder why we do these periodic outages instead of just making it so everything is up all the time. We may consider that again, but at least in the past there were problems with databases and a few other difficult-to-manage things. Of course you can definitely set up databases with clusters these days, and we might try to move to that at some point, but in the past failing over and back was prone to a lot of issues. In the meantime, a few hours every few months doesn't seem like an undue burden.
## some builder adjustments / additions
This last week I brought online 5 more buildhw-a64's (hardware aarch64 builders). With that done, I then adjusted our 'heavybuilder' and 'secure-boot' channels to take advantage of the new hardware.
So, I think we are in pretty good shape on x86_64 and aarch64 now. On power, our power10 buildvm's seem to be doing fine to me. We are planning some changes there in the coming weeks though: we are moving from an 'entire machine is kvm host' setup to using lpars (logical partitions). This will allow us to move half of the current builders to a second power10 chassis and perhaps increase performance. On s390x, nothing much has changed. We continue to restrict ci jobs there, and I try to balance the number of builders against their specs.
## DANE update
Small update for anyone who noticed or cares: I updated the ssl cert for *.fedoraproject.org last week, and finally got to updating the DANE record for it today.
DANE is a way to tie an ssl cert to dns for the host. Postfix and exim at least can automatically use that to verify things, and I have a firefox extension that tells you whether a site validates or not.
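To make the "tie a cert to dns" part a bit more concrete, here is a minimal sketch of how the data for a TLSA record can be derived from a certificate. It assumes usage 3 (DANE-EE), selector 0 (full certificate) and matching type 1 (SHA-256); those parameters are my choice for illustration, not necessarily what the fedoraproject.org record uses, and the function names are made up.

```python
import hashlib
import ssl


def tlsa_rdata_from_der(der_cert: bytes) -> str:
    """Return TLSA record data '3 0 1 <sha256 hex>' for a DER-encoded cert.

    3 = DANE-EE (match the server's own cert), 0 = hash the full
    certificate, 1 = SHA-256. These values are an assumption for this sketch.
    """
    digest = hashlib.sha256(der_cert).hexdigest()
    return f"3 0 1 {digest}"


def tlsa_rdata_from_pem(pem_cert: str) -> str:
    """Same as above, starting from a PEM-encoded certificate string."""
    return tlsa_rdata_from_der(ssl.PEM_cert_to_DER_cert(pem_cert))
```

The result goes into a TLSA record at a name like `_443._tcp.<hostname>`. Since this variant hashes the whole certificate, the record has to be regenerated whenever the cert is replaced.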
## slim7x laptop news
I've kept using my lenovo slim7x laptop and switched over to mainline rawhide kernels a while back. The only missing thing that wasn't upstreamed for me was bluetooth support, and since that was heading upstream, I got the Fedora kernel maintainers to just include the patch.
Recent merge window kernels seem to have broken something in the devicetree file for the laptop tho. It boots to a blank screen with the dtb from the kernel. Pass it an old one and it works fine.
There's still work to do to get the devicetree files on the live media, at which point booting from usb on these just becomes a manual step of passing the right dtb, which is a great deal better than 'build your own live media with devicetree files on it'.
I guess for now I'll just keep daily driving it, but the lack of webcam is kinda annoying.
## Radxa Orion O6
Picked up one of these the other day with a set of flimsy excuses: "I can use it to build kernels for the laptop" and "I can help test fedora rcs". It was also on sale at the time. :)
I just installed it this morning. Pretty painless overall, just switching it to 'acpi' mode from 'devicetree' and then an annoying detour of it not liking the first usb stick I plugged in. With an older one it booted right up with the f43 workstation live and was installed a few minutes later.
Will probably do a separate blog post with a review soon.
## comments? additions? reactions?
As always, comment on mastodon: