Brandon Rohrer
@brandonrohrer.com
Robotics and Reinforcement Learning tinkerer. brandonrohrer.com Wrangler of algorithms for Confluence @ Atlassian. Eater of bread. Sipper of whisky. Reports to a Shih Tzu.
brandonrohrer.com
Update from my long-running #ReinforcementLearning side project.

It uses a combination of new tools to control a simulated pendulum.
Solving an easy RL problem on hard mode
Brandon Rohrer
A story about a reinforcement learning approach learning to make a pendulum stand up straight, while making very few assumptions.

tl;dr
This is a demonstration of an RL approach that uses a combination of new tools to control a simulated pendulum:

- BucketTree to learn a discretization of the pendulum’s continuous state variables, angle and angular velocity,
- Ziptie to bundle the discretized values into discrete states,
- Fuzzy Naive Cartographer (FNC) to learn common state-action-state sequences and to make conditional predictions of reward for each action.

It all runs in Myrtle, a real-time reinforcement learning workbench.

This approach requires only a little domain-specific design and makes very few assumptions about its world. The physical representation of the pendulum is simplistic. The arm has uniform mass. It has a small amount of rotational friction, proportional to its speed. And gravity acts on the pendulum's center of mass, pulling it downward.

As always, the canonical source of information is the code itself.

Sensors
There are two quantities returned by sensors: pendulum position and rotational speed.

Position, θ, is measured in radians, zero when pointing straight down, θ = π/2 when pointing to the right, θ = π when pointing upward, and continuing around to almost 2π when it reaches the bottom again and resets to 0.

Angular speed, ω, is measured in radians per second. Positive angular speed is counter-clockwise (the direction of increasing position) and negative speed is clockwise.

Actions
Actions take the form of torque, τ, applied to the base of the pendulum. Positive torque accelerates the pendulum counter-clockwise, in the direction of increasing position. Negative torque accelerates it in the clockwise direction.

Actions are discrete in time. Each action is a constant torque that lasts for 1/8 second.

There are 13 discrete values the torque can take, 6 positive, 6 negative, and zero. Possible torque values are distributed nonuniformly across the range, with the middle half of the range having denser coverage. This results in a finer grained representation of small torques, useful for making fine-tuning adjustments.
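
A minimal sketch of what such an action set could look like. The actual torque values are not given here, so the magnitudes below (`MAX_TORQUE`, `POSITIVE_FRACTIONS`) are hypothetical, chosen only so that the middle half of the range gets denser coverage. The project's code remains the canonical source.

```python
# Hypothetical values: the real torque magnitudes are not stated in the post.
MAX_TORQUE = 1.0
# Six positive levels, spaced more finely in the inner half of the range.
POSITIVE_FRACTIONS = (0.125, 0.25, 0.375, 0.5, 0.75, 1.0)


def torque_values(max_torque=MAX_TORQUE):
    """Return 13 torques: 6 negative, zero, 6 positive, in ascending order."""
    positive = [fraction * max_torque for fraction in POSITIVE_FRACTIONS]
    negative = [-torque for torque in reversed(positive)]
    return negative + [0.0] + positive


def action_to_torque(index, max_torque=MAX_TORQUE):
    """Map a discrete action index (0 through 12) to a torque."""
    return torque_values(max_torque)[index]
```

With these assumed values, 9 of the 13 torques fall in the middle half of the range, which is where the fine-tuning adjustments happen.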

Reward
The pendulum returns reward, r, related to how high its swinging end reaches, that is, its vertical distance from its straight down θ = 0 position. It reaches a maximum of r = 2 at the straight up θ = π position. A successful learning curve will work its way up to 2 and stay there.
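
One formula consistent with this description is r = 1 - cos(θ), sketched below. This assumes a unit-length arm (`ARM_LENGTH` is a name introduced here for illustration) and a reward exactly equal to the height of the swinging end. The post only says the reward is "related to" that height, so treat this as an illustration, not the project's reward function.

```python
import math

ARM_LENGTH = 1.0  # assumed: a unit-length arm gives a maximum reward of 2


def wrap_angle(theta):
    """Map any angle onto [0, 2*pi), mirroring the sensor's reset to 0 at the bottom."""
    return theta % (2 * math.pi)


def reward(theta):
    """Height of the swinging end above its straight down (theta = 0) position."""
    return ARM_LENGTH * (1 - math.cos(theta))
```

Under these assumptions the reward is 0 hanging straight down, 1 pointing sideways, and 2 standing straight up.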
brandonrohrer.com
New post: Controlling IP traffic on your webserver

A cool part about having your own webserver is that you get to choose who can visit. When IP addresses try to access sensitive files or aggressively scrape, you can just block them.

Here's how.

brandonrohrer.com/hosting5.html
Block an IP address
The most straightforward way to block an IP address is in the firewall. It is the tool built specifically for this.

To block the address 101.101.101.101, run from the command line


sudo ufw insert 1 deny from 101.101.101.101

This instructs ufw (the Uncomplicated FireWall) to insert a rule at the top of the list (position 1) to deny all incoming traffic from the address. After running this, no restart of the firewall is needed. The rule is active. (ufw docs)

The position 1 is important because in ufw, the first rule that matches is applied. If there was a rule to allow all addresses that started with 101. and that rule came before the deny rule, then the deny rule would never be reached.
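
The effect of rule order can be shown with a toy first-match evaluator. This is a sketch of the first-match idea only, not how ufw is implemented. The `first_match` function, the rule tuples, and the `default` argument are all made up for illustration.

```python
import ipaddress


def first_match(rules, address, default="allow"):
    """Return the action of the first rule whose network contains the address.

    rules is a list of (action, network) pairs, checked top to bottom.
    """
    ip = ipaddress.ip_address(address)
    for action, network in rules:
        if ip in ipaddress.ip_network(network):
            return action
    return default  # assumed fallback when no rule matches


# The broad allow rule comes first, so the deny rule is never reached.
allow_first = [("allow", "101.0.0.0/8"), ("deny", "101.101.101.101")]
# Placing the deny rule at position 1 makes it take effect.
deny_first = [("deny", "101.101.101.101"), ("allow", "101.0.0.0/8")]

print(first_match(allow_first, "101.101.101.101"))
print(first_match(deny_first, "101.101.101.101"))
```

The first call reports the address as allowed and the second as denied, even though both lists contain the same two rules.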

While it's possible to block specific ports, or even to block an IP address from seeing particular pages, complex rules and conditions get difficult to analyze very quickly, and can lead to cases where there are loopholes. Use fancy rule combinations sparingly.

Parse logs
In their raw form access logs are technically human-readable, but they are a lot. I found it really useful to do a little parsing to pull out the bits I'm interested in. (I'm working with the default nginx log format, so adjust this according to your own.)

I wrote a script to take advantage of the repeatable structure of these logs to dissect them into their parts. It uses tricks like splitting the log based on brackets and spaces. It produces a pandas dataframe with columns containing the IP address, requested URI, HTTP status code, and every component of the date and time.
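
A rough sketch of that kind of parsing, assuming nginx's default "combined" log format. This is not the script described above. The function name `parse_line`, the field names, and the sample log line are invented for illustration, and the sketch sticks to the standard library and returns a plain dict.

```python
from datetime import datetime


def parse_line(line):
    """Split one nginx 'combined' format log line into a dict of fields."""
    # The timestamp is the only bracketed section, so split around it.
    before, rest = line.split("[", 1)
    timestamp, after = rest.split("]", 1)
    ip = before.split(" ")[0]

    # The request is the first double-quoted section after the timestamp.
    quoted = after.split('"')
    request_parts = quoted[1].split(" ")
    uri = request_parts[1] if len(request_parts) > 1 else ""
    # The status code is the first token after the closing quote.
    status = int(quoted[2].split()[0])

    when = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S %z")
    return {
        "ip": ip, "uri": uri, "status": status,
        "year": when.year, "month": when.month, "day": when.day,
        "hour": when.hour, "minute": when.minute, "second": when.second,
    }


# Made-up sample line using a documentation-reserved IP address.
sample = ('203.0.113.7 - - [12/Mar/2025:06:25:24 +0000] '
          '"GET /hosting5.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0"')
record = parse_line(sample)
```

A list of these dicts, one per log line, can be passed to `pandas.DataFrame(...)` to get one column per field.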
Reposted by Brandon Rohrer
brandonrohrer.com
Now you’re just being cruel
brandonrohrer.com
My design friends don’t use the word “abomination” lightly but…
brandonrohrer.com
Well that made my day
brandonrohrer.com
💯 my mind is finally at peace. Until the next itch sets in.
brandonrohrer.com
daaaamn 🤘🤘🤘 that is a hell of a jam
brandonrohrer.com
This dumb joke was in my head for months
brandonrohrer.com
High five to that prof
brandonrohrer.com
They should tell you that.
brandonrohrer.com
You will be asked how much money your team will spend every month for the next 12 months. How many individuals will use the feature that you just built, and how often. How many hours it will take to complete an engineering task that has never been done before and is not yet fully defined.
brandonrohrer.com
They don’t tell you in engineering school how much of your job will be telling the future.
brandonrohrer.com
Caring for Your Webserver is out.

It covers
- browsing the access logs
- catching missed pages
- automatically adding .html to requests, when needed
- redirecting URLs
- setting up log rotations
- finding a content provider for large files

brandonrohrer.com/hosting4.html
Caring for your webserver
Brandon Rohrer

In parts one through three, we set up a web server, connected it to a domain name, and instituted some basic security. Now comes the fun part! The web server is humming along doing its thing, and we can watch and admire, making little improvements here and there. It’s not required, but for some, this is part of the payoff.

A webserver needs weeding and watering
It can be counterintuitive that a thing made out of code should need ongoing attention and care. On the surface it seems like it should be self-sufficient, like a wall made of stones. We put the pieces in place where we want them and when we’re happy with it, we stop. It's all silicon and bits after all, why should it need watching?

The bigger picture, though, is that the web server operates in a world that’s always changing. Software updates cause tools to behave differently. Edits happen to the HTML and other content we host. There are dramatic changes in who is trying to reach the content and for what purposes. There can be outages, policy changes, and any number of second-order effects in the wider world that can make our web server stop operating the way we want. So on that scale a web server starts to more closely resemble a vegetable garden—something growing, decaying, and very much a product of the environment that it's in.

This page is my notes on care and feeding practices I've found helpful and enjoyable. I'm not an expert on this, and this is not authoritative by any means. But I put it here in case you find it helpful. If I missed anything, or got it egregiously wrong, please let me know.
brandonrohrer.com
And still another perk - RSS feeds are usually message-in-a-bottle one-way communication, buuuut if you publish a new post on RSS a day before you officially announce it anywhere else, you can see how much traffic it gets in that first day and get a sense of how many folks are tuning in to your feed
brandonrohrer.com
zipties are the baling wire of the modern era
brandonrohrer.com
come on buddy i feel like you're not even trying
403 196.251.87.13 /images/images/cache.php?pass=3a0e007c895888c9c4bf234e2772a0dd
403 196.251.87.13 /images/images/images/cache.php?pass=331b3elaefcffd452db92a2cicé61c525
403 196.251.87.13 /images/images/images/images/cache.php?pass=26aadf8d261066584120034c6303716F
403 196.251.87.13 /images/images/images/images/images/cache.php?pass=c93ef613c69e7f82b5446413c46bBaeca
403 196.251.87.13 /images/images/images/images/images/images/cache.php?pass=6765bcbc711528e2760d2133cdf17685
403 196.251.87.13 /images/images/images/images/images/images/images/cache.php?pass=e2f2d10231c076274cbc76336b530c67
403 196.251.87.13 /images/images/images/images/images/images/images/images/cache.php?pass=cfe52c34e32b72e0844fd3ac8c834e96
403 196.251.87.13 /images/images/images/images/images/images/images/images/images/cache.php?pass=b4a@cel105e127632d397c11b662c026
403 196.251.87.13 /images/images/images/images/images/images/images/images/images/images/cache.php?pass=1105a5a43bd65cd41776c47401232849
403 196.251.87.13 /images/images/images/images/images/images/images/images/images/images/images/cache.php?pass=341a599afb55832de18a18d89cff3cse
403 196.251.87.13 /wp-admin/network/network/cache.php?pass=38efBafddaalead7a?21aef35Fab6d?209
Reposted by Brandon Rohrer
eugenevinitsky.bsky.social
For folks considering grad school in ML, my advice is to explore programs that mix ML with a domain interest. ML programs are wildly oversubscribed while a lot of the fun right now is in figuring out what you can do with it
brandonrohrer.com
I created some educational content pre-LLMs and found that there was only a small fraction of folks who wanted to understand the workings. The vast majority wanted simple recipes and rote methods. It was rough if you’re trying to sell subscriptions but good for connecting with seekers of knowledge.
brandonrohrer.com
My new pomodoro app doesn’t have many features, but there are also no updates and I won’t have to recharge it for 500 years.
A large hourglass by a pen, notebook, and pair of reading glasses
brandonrohrer.com
alt text :chefs-kiss:
brandonrohrer.com
“I don’t want to have to keep thinking about that”
If something is important but you don’t want to spend time on it maybe one of those things isn’t true.
brandonrohrer.com
“But what if it’s really really important?”
Three people get the reminder. This is a solved problem.