A whole lot of managering going on
October 28, 2025 at 5:05 AM
A whole lot of managering going on
I think Phil is not good at the managering
February 23, 2025 at 10:20 PM
I think Phil is not good at the managering
tasque managering here too because why not
July 28, 2025 at 6:36 AM
tasque managering here too because why not
straight up managering that thang
August 3, 2025 at 12:10 PM
straight up managering that thang
Did Neville and Rob Gale go to the same school of managering where they teach you to constantly be yelling instructions at your team the entire match, seems like it would really annoying as a player
August 17, 2025 at 3:27 AM
Did Neville and Rob Gale go to the same school of managering where they teach you to constantly be yelling instructions at your team the entire match, seems like it would really annoying as a player
I was in Providence today for my haircut (bc I cannot find a barber in Boston I like) but also at Heartleaf for Indie Bookstore day! I got to say hi to Penny and got a whole bunch of new books
April 26, 2025 at 10:50 PM
I was in Providence today for my haircut (bc I cannot find a barber in Boston I like) but also at Heartleaf for Indie Bookstore day! I got to say hi to Penny and got a whole bunch of new books
That’s what my FFXIV hours look like if you add them up, but that was across like 6 years, and it’s an MMO.
This is a lot of Football Managering.
This is a lot of Football Managering.
December 11, 2024 at 12:28 AM
That’s what my FFXIV hours look like if you add them up, but that was across like 6 years, and it’s an MMO.
This is a lot of Football Managering.
This is a lot of Football Managering.
New #blog: Categorising My Daily Todo List with Deepseek-r1
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Categorising My Daily Todo List with Deepseek-r1
I've made no _particular_ secret of the fact that I am **not** a fan of large language models: they're overhyped, under deliver and were built off the back of a ripping off of others work at a scale that (for anyone else) would have resulted in jail and/or bankruptcy (even before the more incontrovertible acts of copyright infringement are considered).
But, I also work in a industry which is going through a phase of significant AI use (even if a lot of companies do come to regret being over-eager).
As a generalist, still decades from retirement, I _have_ to keep my hand in.
If nothing else, doing so means that I'm better positioned to explain _why_ and _when_ they seem to be ill-suited.
Moving on, I've finally got around to reading the book that I mentioned while procrastinating recently.
In one section, the author talks about reviewing how their time is spent:
> Every quarter, I spend a few hours categorizing my calendar from the past three months to figure out how I’ve invested my time. This is useful for me to reflect on the major projects I’ve done, and also to get a sense of my general allocation of time.
This struck me as a _particularly_ sensible thing to start doing.
However, I needed a slightly different approach: although it plays a part, my calendar is not the main organiser of my day because I work from a daily todo list in Obsidian.
Although small/quick tasks might sometimes get omitted, they're still far more likely to appear in my TODO list than in my calendar.
The trade off, of course, was that I would only able to collect the numbers of task completed rather than a measure of time spent.
Pivoting back to AI, I decided that categorising TODO list items **should** be a piece of cake for any reasoning model.
This post discusses how I developed metrics about my todo list along with where (and why) I found the LLM unreliable.
* * *
### Requirements
I had a simple (but strict) set of requirements
1. **The LLM must be local** : I'm a manager and my todo list sometimes contains references to team members. I'm not comfortable sending _any_ of that to one of the hosted providers1
2. **The script should be cron-able** : the whole point is that it should just run without me needing to do anything
3. **It must use my existing format** : Shuffling the TODO list structure around is OK, but changing the way that I write TODOs is likely to be unrealistic - it's supposed to be somewhere to quickly note a need to do something, rather than a document which requires a linter
4. **Tasks should be sorted into a finite set of categories** : LLMs are quite inconsistent in their output, so it'd be unwise to rely on the LLM to also define the categories: I'd end up with tasks spread over "discussions", "email" and "correspondence"
5. **The output should be simple counts** : The aim here wasn't to create a second copy of my TODO list, but instead to express it in terms of simple metrics
* * *
### TODO List Structure
My list uses a simple markdown template:
# Daily Notes: {{date}}
### TODO
- [ ] Backlog item
### Today
- [ ] Incomplete task
- [x] Complete task
- [ ] Morning Sync (09:00)
**Calls**
**Morning Routine**
- [ ] Slack
- [ ] Email
.. etc ..
**Follow Up Tomorrow**
The `TODO` section contains my backlog which is rolled into the current days notes by the Daily todos rollover plugin (along with any incomplete items under `Today`).
As I work through my morning routine, I populate the `Today` section with calls broken out under their own subheading.
* * *
### Processing
Because they're stored as markdown, the lists are easy to process and the headings make it easy to see which section an item falls under.
I've used a similar format in previous roles and previously wrote a simple burndown stats script which built metrics by walking through line by line:
for line in fh.readlines():
# Process the line
if "### TODO" in line:
in_section = "main"
print("Started")
continue
if "**Calls**" in line:
in_section = "calls"
continue
if "**Morning Routine**" in line:
in_section = "morning-routine"
continue
elif in_section and line.startswith('- [ ]'):
counters["items"] += 1
elif in_section and line.startswith('- [x]'):
counters["items"] += 1
counters["complete"] += 1
For this project, I added a function call to the conditional block for completed items:
category = categorise_item(line, in_section, ai, categories)
* * *
### Adding AI
I span up an ollama docker container and loaded Deepseek-r1 into it
ollama run deepseek-r1:7b
Use of the 7 billion token model was a compromise: the solution had to run on a laptop (I actually started with the much smaller `phi3:mini-4k` but it _really_ struggled to give accurate responses).
I have a python wrapper that I previously wrote to provide a command-line REPL for interacting with Ollama3, so I imported that.
Although there's a bit of class initialisation to do, it makes sending a message simple:
response = ai.chat_with_model("Hello, are you there?")
Next, I needed to write a prompt to tell the LLM what to do with the messages that would be sent to it:
You are a task categorisation specialist and are being employed in order to identify what categories of work each of the user's completed todo list items falls into.
The user will provide you with a file in markdown format which consists of a list of checkboxes with todo list items next to them.
Using the rules below, you are to categorise any task items that the user passes you.
**File Structure**
* An item starting with - [ ] indicates a task that has not yet been completed, ignore these.
* An item starting with - [x] indicates that the task was completed, you should categorise the item
* Items that follow the text Calls are calls - these need further classification (see the calls selection below)
* Items under Morning Routine are routine items, these should be classified as "daily admin"
You may encounter a heading with the text Defunct. If so, ignore everything that follows it.
**Calls**
There are two types of call classificiations: call-meetings and call-11
If the todo item includes the text 1:1, it's a call-11, otherwise it should be classified as call-meeting.
**Categories**
ALL completed items must be categorised but must ONLY be categorised into one of the following categories
* call-meeting
* call-11
* daily admin
* incident response
* PR work
* documentation and planning
* other
To test, I passed it a single item2
- [x] Swear at AI
The response and the reasoning seemed in line with what I wanted
> Okay, let's tackle this problem step by step. The user has provided a task list with several items marked as completed (indicated by - [x]). My job is to categorize each of these items into one of the specified categories based on the given rules.
>
> First, I need to go through each item and determine its category. Let's look at each item:
>
> 1. "- [x] Swear at AI" This task clearly states a negative action but doesn't provide any specific context like calls or routine tasks. Since it's under "Other" as per the categories list, this should be classified as "other".
>
>
> I don't see any other items provided in the example besides this one. Each item must be categorized without inventing new ones that aren't present in the user's message.
>
> So, the only category needed here is "other".
>
> { "category": "other" }
I threw it a few more items and all came back correctly.
* * *
#### Problems
However, issues soon began to creep in.
Despite having been passed a list of acceptable categories, the model invented a new one: security.
Expanding my prompt to include explicit rules (you MUST NOT return a category not listed above) helped a little, but not much.
To improve the chances of getting a successful response, I adjusted my script to add the ability to trigger a reprompt:
def get_ai_category(item, ai, section, ffs=False):
''' Ask the AI to categorise
We often need to strip the response though, because many reasoning
models include their reasoning too
'''
msg = []
if ffs:
msg.append("Reminder: you MUST stick to the categories provided in the prompt. You're a very naughty LLM, your previous response made changes - please try again but stick specifically to the rules in the prompt")
msg.append(item)
response = ai.chat_with_model("\n".join(msg)).split("\n")
However, it didn't help much: the LLM often replied with the same thing the second time.
Taking a closer look at the responses revealed something even more frustrating though: sometimes it selected the correct category but then fucked up the spelling.
For example, instead of `documentation and planning` it returned the following
* document and planning
* documentaton and planning
* documenta...
* documentations and planning
Sometimes the response would contain a valid category name but prefixed by some other text
*category*: documentation and planning
*final decision*: calls-11
I messed aroumd with making it respond with JSON, but nothing _really_ helped.
In the end, I created a function called `unfuck_ai_response` and started adding string processing to account for its common mistakes.
As a side note, the AI also **staunchly** refused to allow me to use non-words as category names.
I defined a category with the name `managering` but it insisted on returning `managerial`. That behaviour makes sense, because _actual words_ are obviously going to have more weight, but it doesn't help with the feeling that AI is a fun sponge.
* * *
#### Taking Load Off the AI
Even with the new function, there were still far too many items that weren't being categorised correctly.
Quite a few of the regular items in my todo list can be categorised by looking for specific strings, so I adjusted the script to start handling those simpler items itself:
if in_section == "calls":
if "1:1" in item:
return "call-11"
elif "adhoc" in item:
return "call-unplanned"
else:
return "call-meeting"
I adjusted the script so that all todo items categorised at `other` would be written to a file before triggering a full run across all of my notes.
Even if it didn't make sense to adjust the script for every `other` item, the list allowed me to spot patterns and build the ruleset out.
* * *
### Collecting Metrics
With the script now capable of emitting semi-reliable metrics, I needed to collect the data somewhere.
For this project, I decided to give VictoriaMetrics a try: it accepts a variety of input formats, including Influx line protocol, so making the script generate LP gave me the ability to try VictoriaMetrics whilst retaining the ability to pivot back to InfluxDB if needed.
* * *
#### Standing VictoriaMetrics up
Using the VictoriaMetrics docs as a guide, I was able to come up with a `docker-compose` snippet to stand a container up:
victoria_metrics:
restart: always
image: victoriametrics/victoria-metrics:v1.122.0
container_name: victoria_metrics
command:
- "-selfScrapeInterval=5s"
- "-storageDataPath=victoria-metrics-data"
- "-retentionPeriod=100y"
- "-disablePerDayIndex"
ports:
- 8428:8428
volumes:
- /home/ben/docker_files/victoria/data:/victoria-metrics-data
Writing line protocol in was easy:
curl \
-d 'metric,foo=bar field1=3.4' \
http://127.0.0.1:8428/api/v2/write?db=bar
So I updated my script:
print("Submitting to VictoriaMetrics")
res = requests.post(
f"{vm_url}/api/v2/write?db=workload_stats",
data = '\n'.join(lp_buffer)
)
print(res.status_code)
print(res.text)
Note: VictoriaMetrics doesn't have the concept of separate databases and the name of the DB provided instead gets added as a label.
After the next script run, I was able to query data:
curl -s 'https://vm.83n.uk/api/v1/series?match[]=todo_list_completed'
{
"status":"success",
"data":[{
"__name__":"todo_list_completed",
"db":"workload_stats",
"hostname":"flibble"
}]
}
I could also delete that series - something that I used _quite a bit_ whilst continuing to refine the script's output
curl -s 'https://vm.83n.uk/api/v1/admin/tsdb/delete_series?match[]=todo_list_completed'
* * *
#### Visualising
I installed the VictoriaMetrics plugin for Grafana and started to build a dashboard
Historically, I've written quite a lot of Flux on this blog, but VictoriaMetrics uses MetricsQL.
It's a PromQL-alike language, which itself is similar to LogQL, so I didn't experience any real issues writing queries to drive the graphs:
todo_list_items_count{db="workload_stats", category!="error", category!="scheduled-for-tomorrow"}
The visual query builder is also pretty good.
* * *
#### Metrics: use of AI
The TODO processing script also writes metrics about itself into VictoriaMetrics, so the dashboard includes a section specifically for script run stats.
One graph indicates the proportion of TODO items categorised by the script vs categorised by AI:
Although I'd been careful to only add general rules (rather than targetting specific items), the ruleset embedded within the script accounted for two thirds of all categorisations.
I also wanted to understand how often AI was misbehaving, so I adjusted the script to calculate an AI reliability score:
# Calculate a reliability score for the AI
# for every retry we had to send the count goes down.
if source == "ai":
# So the logic here is that we want to generate
# a percentage reliability score based on
# comparison to the max retry depth.
#
# We add 1 onto max_retry_depth to account for
# the first request (which doesn't get treated
# as a retry)
#
# So, if max_retry_depth is 4, the max becomes 5
# and scores would look like this:
#
# * we didn't retry at all: 100%
# * we retried once: 80%
# * we retried 4 times: 20%
# * we burned through the retries: 0%
#
score = (1 - (RETRY_DEPTH / (max_retry_depth + 1))) * 100
For each TODO list run, the script reports mean, min and max reliability scores.
It seems that it _generally_ behaves, but when it misbehaves, it does so quite stubbornly:
* * *
### Conclusion
I now have a script which can _roughly_ categorise my TODO items from each day ready for visualisation in a dashboard to shows where the majority of my efforts are spent.
My initial hopes for AI making it quick and easy were quickly, if predictably, dashed: the LLM does a "good enough" job most of the time, but only because substring matching within the script handles ~66% of items.
One of the **huge** benefits of having opted for a local model is that I don't have to worry about results suddenly drifting as the result of a provider deploying an update which results in the model responding differently.
That consistency has its limit though: the dashboard has looked slightly different every time that I've triggered a full re-run - the LLM is rarely consistent in the categories that it selects.
Equally, though, there probably isn't any _reasonable_ way that I could have script logic correctly categorise the 1/3rd that ends up being passed off to the LLM.
Although I could either limit the language used or include a category alongside items, the likely result of that added burden is that the list would fail to include so much of my workday.
* * *
1. It doesn't _matter_ what their terms say. Data protection by design and by default, if it can be done without sending to OpenAI/Google/Musk then it should be ↩
2. It's possible that I've skipped a few iterations in this post and was growing fed up at this point ↩
3. In case you haven't picked up on this: the privacy benefits mean that local AI will always be > remote AI ↩
www.bentasker.co.uk
July 31, 2025 at 7:30 AM
New #blog: Categorising My Daily Todo List with Deepseek-r1
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Kicking off my (usually) weekly bout of Football Managering in about 10 minutes or so - would love to have some company as I try to steer a lowly Vanarama League North side to promotion! www.twitch.tv/valerynorth
valerynorth - Twitch
Genderfluid enby leftie/socialist writer/musician/game creator in my mid-40s. Watch a lot of sports, play a lot of video games, play a lot of musical instruments. Love TTRPGs, especially slightly quir...
www.twitch.tv
April 16, 2024 at 6:23 PM
Kicking off my (usually) weekly bout of Football Managering in about 10 minutes or so - would love to have some company as I try to steer a lowly Vanarama League North side to promotion! www.twitch.tv/valerynorth
Tell me this is good managering then.
May 11, 2025 at 3:29 PM
Tell me this is good managering then.
I work on a team where, when you ask PMs if they've talked to developers about the semantics and behavior of the developer tool they are product managering, they say "yes" because they talked to downstream developers' PMs
fucking maddening
fucking maddening
September 10, 2024 at 12:31 AM
I work on a team where, when you ask PMs if they've talked to developers about the semantics and behavior of the developer tool they are product managering, they say "yes" because they talked to downstream developers' PMs
fucking maddening
fucking maddening
Into these sister fingering, kopparberg avoiding, blue seat dressing, self harm managering, bin sniffing, paedo harbouring, massive advertise board using, 19 premier league rival letter writing cheats
December 1, 2024 at 12:04 PM
Into these sister fingering, kopparberg avoiding, blue seat dressing, self harm managering, bin sniffing, paedo harbouring, massive advertise board using, 19 premier league rival letter writing cheats
U nas menadžer bio na godišnjem 3 sedmice i kaže vraća se u srijedu što je sutra.
Lik je laganica jutros u ofisu bio tačno u 8 i gledao nas kako jedno po jedno dolazimo tek poslije 9.
I smijao nam se - sleepy heads, coming in so late, hm?
Managering done right ✅
Lik je laganica jutros u ofisu bio tačno u 8 i gledao nas kako jedno po jedno dolazimo tek poslije 9.
I smijao nam se - sleepy heads, coming in so late, hm?
Managering done right ✅
November 7, 2023 at 10:17 AM
U nas menadžer bio na godišnjem 3 sedmice i kaže vraća se u srijedu što je sutra.
Lik je laganica jutros u ofisu bio tačno u 8 i gledao nas kako jedno po jedno dolazimo tek poslije 9.
I smijao nam se - sleepy heads, coming in so late, hm?
Managering done right ✅
Lik je laganica jutros u ofisu bio tačno u 8 i gledao nas kako jedno po jedno dolazimo tek poslije 9.
I smijao nam se - sleepy heads, coming in so late, hm?
Managering done right ✅
My art journy this year , not much, altho i could have a Miji bingo if soemthing funny comes up hehe
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
September 19, 2025 at 3:35 PM
My art journy this year , not much, altho i could have a Miji bingo if soemthing funny comes up hehe
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
it’s taking me back to junior high stage managering… she would be a supporting character who was shocked to not get the lead. Aevery show i would be telling her to be quiet backstage and put her props back where they go!!
July 22, 2025 at 1:54 AM
it’s taking me back to junior high stage managering… she would be a supporting character who was shocked to not get the lead. Aevery show i would be telling her to be quiet backstage and put her props back where they go!!
The Corroded Coffin guys reunite to go back on tour with them and Eddie is working double time but everyone is happy and a big family and they take care of each other and make a lot of money the end.
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
February 12, 2025 at 10:10 PM
The Corroded Coffin guys reunite to go back on tour with them and Eddie is working double time but everyone is happy and a big family and they take care of each other and make a lot of money the end.
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
is cousell a better manager than ross?
yes but also i've spent years mocking counsell for being a crybaby about rain delays, at least when i mocked ross i actually liked him and just didn't like his managering
yes but also i've spent years mocking counsell for being a crybaby about rain delays, at least when i mocked ross i actually liked him and just didn't like his managering
November 6, 2023 at 11:37 PM
is cousell a better manager than ross?
yes but also i've spent years mocking counsell for being a crybaby about rain delays, at least when i mocked ross i actually liked him and just didn't like his managering
yes but also i've spent years mocking counsell for being a crybaby about rain delays, at least when i mocked ross i actually liked him and just didn't like his managering
shes tasqué managering it
September 3, 2025 at 5:04 PM
shes tasqué managering it
ugh i'm mister managering so hard and so effectively today it makes me fucking sick
August 28, 2023 at 7:20 PM
ugh i'm mister managering so hard and so effectively today it makes me fucking sick
Seagull managering.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
August 7, 2025 at 6:35 PM
Seagull managering.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
Making an offer on the way to the airport to go give a talk about hiring: next-level managering by @marisabrandt.😂 #velocityconf
November 14, 2024 at 1:41 PM
Making an offer on the way to the airport to go give a talk about hiring: next-level managering by @marisabrandt.😂 #velocityconf
LYDIA: *rolls her eyes*
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
April 30, 2025 at 2:18 PM
LYDIA: *rolls her eyes*
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
I've done so much managering today
May 12, 2025 at 1:07 PM
I've done so much managering today
Whilst working at a Warhammer store, every new manager had to do what we called ‘managering’, finding something to change. One of them was concerned that a pot of demo paint was going missing a week. His solution was to superglue the paint pots to the tables. Next manager had to write off the tables
February 18, 2025 at 7:11 PM
Whilst working at a Warhammer store, every new manager had to do what we called ‘managering’, finding something to change. One of them was concerned that a pot of demo paint was going missing a week. His solution was to superglue the paint pots to the tables. Next manager had to write off the tables