A whole lot of managering going on
October 28, 2025 at 5:05 AM
A whole lot of managering going on
I'm not gentle parenting myself, I'm night shift managering myself. If i don't close right (dishes, shower) it's gonna be hell on opening shift (leaving on time for work) and then I'm the one having to explain to the owner (me) wtf happened last night (unfortunately also me)
September 30, 2025 at 2:06 AM
I'm not gentle parenting myself, I'm night shift managering myself. If i don't close right (dishes, shower) it's gonna be hell on opening shift (leaving on time for work) and then I'm the one having to explain to the owner (me) wtf happened last night (unfortunately also me)
My art journy this year , not much, altho i could have a Miji bingo if soemthing funny comes up hehe
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
September 19, 2025 at 3:35 PM
My art journy this year , not much, altho i could have a Miji bingo if soemthing funny comes up hehe
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
This year got my big commission that i proud of and other piece to even be small
Myselft,i prefer do it with my own hands
SP in editing and drawing and litter in managering
shes tasqué managering it
September 3, 2025 at 5:04 PM
shes tasqué managering it
ce n'est pas près de s'arrêter quand on voit le patron de microsoft faire du vibe-managering :D
September 3, 2025 at 1:42 PM
ce n'est pas près de s'arrêter quand on voit le patron de microsoft faire du vibe-managering :D
Did Neville and Rob Gale go to the same school of managering where they teach you to constantly be yelling instructions at your team the entire match, seems like it would really annoying as a player
August 17, 2025 at 3:27 AM
Did Neville and Rob Gale go to the same school of managering where they teach you to constantly be yelling instructions at your team the entire match, seems like it would really annoying as a player
Seagull managering.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
August 7, 2025 at 6:35 PM
Seagull managering.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
He did this all the time as a developer, too. He'd show up to the site, inspect things like he knows what he's looking at, make demands as if he's some genius architect, berate the foreman, give one of those "no excuses, get it done" commands, and then leave.
straight up managering that thang
August 3, 2025 at 12:10 PM
straight up managering that thang
New #blog: Categorising My Daily Todo List with Deepseek-r1
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Categorising My Daily Todo List with Deepseek-r1
I've made no _particular_ secret of the fact that I am **not** a fan of large language models: they're overhyped, under deliver and were built off the back of a ripping off of others work at a scale that (for anyone else) would have resulted in jail and/or bankruptcy (even before the more incontrovertible acts of copyright infringement are considered).
But, I also work in a industry which is going through a phase of significant AI use (even if a lot of companies do come to regret being over-eager).
As a generalist, still decades from retirement, I _have_ to keep my hand in.
If nothing else, doing so means that I'm better positioned to explain _why_ and _when_ they seem to be ill-suited.
Moving on, I've finally got around to reading the book that I mentioned while procrastinating recently.
In one section, the author talks about reviewing how their time is spent:
> Every quarter, I spend a few hours categorizing my calendar from the past three months to figure out how I’ve invested my time. This is useful for me to reflect on the major projects I’ve done, and also to get a sense of my general allocation of time.
This struck me as a _particularly_ sensible thing to start doing.
However, I needed a slightly different approach: although it plays a part, my calendar is not the main organiser of my day because I work from a daily todo list in Obsidian.
Although small/quick tasks might sometimes get omitted, they're still far more likely to appear in my TODO list than in my calendar.
The trade off, of course, was that I would only able to collect the numbers of task completed rather than a measure of time spent.
Pivoting back to AI, I decided that categorising TODO list items **should** be a piece of cake for any reasoning model.
This post discusses how I developed metrics about my todo list along with where (and why) I found the LLM unreliable.
* * *
### Requirements
I had a simple (but strict) set of requirements
1. **The LLM must be local** : I'm a manager and my todo list sometimes contains references to team members. I'm not comfortable sending _any_ of that to one of the hosted providers1
2. **The script should be cron-able** : the whole point is that it should just run without me needing to do anything
3. **It must use my existing format** : Shuffling the TODO list structure around is OK, but changing the way that I write TODOs is likely to be unrealistic - it's supposed to be somewhere to quickly note a need to do something, rather than a document which requires a linter
4. **Tasks should be sorted into a finite set of categories** : LLMs are quite inconsistent in their output, so it'd be unwise to rely on the LLM to also define the categories: I'd end up with tasks spread over "discussions", "email" and "correspondence"
5. **The output should be simple counts** : The aim here wasn't to create a second copy of my TODO list, but instead to express it in terms of simple metrics
* * *
### TODO List Structure
My list uses a simple markdown template:
# Daily Notes: {{date}}
### TODO
- [ ] Backlog item
### Today
- [ ] Incomplete task
- [x] Complete task
- [ ] Morning Sync (09:00)
**Calls**
**Morning Routine**
- [ ] Slack
- [ ] Email
.. etc ..
**Follow Up Tomorrow**
The `TODO` section contains my backlog which is rolled into the current days notes by the Daily todos rollover plugin (along with any incomplete items under `Today`).
As I work through my morning routine, I populate the `Today` section with calls broken out under their own subheading.
* * *
### Processing
Because they're stored as markdown, the lists are easy to process and the headings make it easy to see which section an item falls under.
I've used a similar format in previous roles and previously wrote a simple burndown stats script which built metrics by walking through line by line:
for line in fh.readlines():
# Process the line
if "### TODO" in line:
in_section = "main"
print("Started")
continue
if "**Calls**" in line:
in_section = "calls"
continue
if "**Morning Routine**" in line:
in_section = "morning-routine"
continue
elif in_section and line.startswith('- [ ]'):
counters["items"] += 1
elif in_section and line.startswith('- [x]'):
counters["items"] += 1
counters["complete"] += 1
For this project, I added a function call to the conditional block for completed items:
category = categorise_item(line, in_section, ai, categories)
* * *
### Adding AI
I span up an ollama docker container and loaded Deepseek-r1 into it
ollama run deepseek-r1:7b
Use of the 7 billion token model was a compromise: the solution had to run on a laptop (I actually started with the much smaller `phi3:mini-4k` but it _really_ struggled to give accurate responses).
I have a python wrapper that I previously wrote to provide a command-line REPL for interacting with Ollama3, so I imported that.
Although there's a bit of class initialisation to do, it makes sending a message simple:
response = ai.chat_with_model("Hello, are you there?")
Next, I needed to write a prompt to tell the LLM what to do with the messages that would be sent to it:
You are a task categorisation specialist and are being employed in order to identify what categories of work each of the user's completed todo list items falls into.
The user will provide you with a file in markdown format which consists of a list of checkboxes with todo list items next to them.
Using the rules below, you are to categorise any task items that the user passes you.
**File Structure**
* An item starting with - [ ] indicates a task that has not yet been completed, ignore these.
* An item starting with - [x] indicates that the task was completed, you should categorise the item
* Items that follow the text Calls are calls - these need further classification (see the calls selection below)
* Items under Morning Routine are routine items, these should be classified as "daily admin"
You may encounter a heading with the text Defunct. If so, ignore everything that follows it.
**Calls**
There are two types of call classificiations: call-meetings and call-11
If the todo item includes the text 1:1, it's a call-11, otherwise it should be classified as call-meeting.
**Categories**
ALL completed items must be categorised but must ONLY be categorised into one of the following categories
* call-meeting
* call-11
* daily admin
* incident response
* PR work
* documentation and planning
* other
To test, I passed it a single item2
- [x] Swear at AI
The response and the reasoning seemed in line with what I wanted
> Okay, let's tackle this problem step by step. The user has provided a task list with several items marked as completed (indicated by - [x]). My job is to categorize each of these items into one of the specified categories based on the given rules.
>
> First, I need to go through each item and determine its category. Let's look at each item:
>
> 1. "- [x] Swear at AI" This task clearly states a negative action but doesn't provide any specific context like calls or routine tasks. Since it's under "Other" as per the categories list, this should be classified as "other".
>
>
> I don't see any other items provided in the example besides this one. Each item must be categorized without inventing new ones that aren't present in the user's message.
>
> So, the only category needed here is "other".
>
> { "category": "other" }
I threw it a few more items and all came back correctly.
* * *
#### Problems
However, issues soon began to creep in.
Despite having been passed a list of acceptable categories, the model invented a new one: security.
Expanding my prompt to include explicit rules (you MUST NOT return a category not listed above) helped a little, but not much.
To improve the chances of getting a successful response, I adjusted my script to add the ability to trigger a reprompt:
def get_ai_category(item, ai, section, ffs=False):
''' Ask the AI to categorise
We often need to strip the response though, because many reasoning
models include their reasoning too
'''
msg = []
if ffs:
msg.append("Reminder: you MUST stick to the categories provided in the prompt. You're a very naughty LLM, your previous response made changes - please try again but stick specifically to the rules in the prompt")
msg.append(item)
response = ai.chat_with_model("\n".join(msg)).split("\n")
However, it didn't help much: the LLM often replied with the same thing the second time.
Taking a closer look at the responses revealed something even more frustrating though: sometimes it selected the correct category but then fucked up the spelling.
For example, instead of `documentation and planning` it returned the following
* document and planning
* documentaton and planning
* documenta...
* documentations and planning
Sometimes the response would contain a valid category name but prefixed by some other text
*category*: documentation and planning
*final decision*: calls-11
I messed aroumd with making it respond with JSON, but nothing _really_ helped.
In the end, I created a function called `unfuck_ai_response` and started adding string processing to account for its common mistakes.
As a side note, the AI also **staunchly** refused to allow me to use non-words as category names.
I defined a category with the name `managering` but it insisted on returning `managerial`. That behaviour makes sense, because _actual words_ are obviously going to have more weight, but it doesn't help with the feeling that AI is a fun sponge.
* * *
#### Taking Load Off the AI
Even with the new function, there were still far too many items that weren't being categorised correctly.
Quite a few of the regular items in my todo list can be categorised by looking for specific strings, so I adjusted the script to start handling those simpler items itself:
if in_section == "calls":
if "1:1" in item:
return "call-11"
elif "adhoc" in item:
return "call-unplanned"
else:
return "call-meeting"
I adjusted the script so that all todo items categorised at `other` would be written to a file before triggering a full run across all of my notes.
Even if it didn't make sense to adjust the script for every `other` item, the list allowed me to spot patterns and build the ruleset out.
* * *
### Collecting Metrics
With the script now capable of emitting semi-reliable metrics, I needed to collect the data somewhere.
For this project, I decided to give VictoriaMetrics a try: it accepts a variety of input formats, including Influx line protocol, so making the script generate LP gave me the ability to try VictoriaMetrics whilst retaining the ability to pivot back to InfluxDB if needed.
* * *
#### Standing VictoriaMetrics up
Using the VictoriaMetrics docs as a guide, I was able to come up with a `docker-compose` snippet to stand a container up:
victoria_metrics:
restart: always
image: victoriametrics/victoria-metrics:v1.122.0
container_name: victoria_metrics
command:
- "-selfScrapeInterval=5s"
- "-storageDataPath=victoria-metrics-data"
- "-retentionPeriod=100y"
- "-disablePerDayIndex"
ports:
- 8428:8428
volumes:
- /home/ben/docker_files/victoria/data:/victoria-metrics-data
Writing line protocol in was easy:
curl \
-d 'metric,foo=bar field1=3.4' \
http://127.0.0.1:8428/api/v2/write?db=bar
So I updated my script:
print("Submitting to VictoriaMetrics")
res = requests.post(
f"{vm_url}/api/v2/write?db=workload_stats",
data = '\n'.join(lp_buffer)
)
print(res.status_code)
print(res.text)
Note: VictoriaMetrics doesn't have the concept of separate databases and the name of the DB provided instead gets added as a label.
After the next script run, I was able to query data:
curl -s 'https://vm.83n.uk/api/v1/series?match[]=todo_list_completed'
{
"status":"success",
"data":[{
"__name__":"todo_list_completed",
"db":"workload_stats",
"hostname":"flibble"
}]
}
I could also delete that series - something that I used _quite a bit_ whilst continuing to refine the script's output
curl -s 'https://vm.83n.uk/api/v1/admin/tsdb/delete_series?match[]=todo_list_completed'
* * *
#### Visualising
I installed the VictoriaMetrics plugin for Grafana and started to build a dashboard
Historically, I've written quite a lot of Flux on this blog, but VictoriaMetrics uses MetricsQL.
It's a PromQL-alike language, which itself is similar to LogQL, so I didn't experience any real issues writing queries to drive the graphs:
todo_list_items_count{db="workload_stats", category!="error", category!="scheduled-for-tomorrow"}
The visual query builder is also pretty good.
* * *
#### Metrics: use of AI
The TODO processing script also writes metrics about itself into VictoriaMetrics, so the dashboard includes a section specifically for script run stats.
One graph indicates the proportion of TODO items categorised by the script vs categorised by AI:
Although I'd been careful to only add general rules (rather than targetting specific items), the ruleset embedded within the script accounted for two thirds of all categorisations.
I also wanted to understand how often AI was misbehaving, so I adjusted the script to calculate an AI reliability score:
# Calculate a reliability score for the AI
# for every retry we had to send the count goes down.
if source == "ai":
# So the logic here is that we want to generate
# a percentage reliability score based on
# comparison to the max retry depth.
#
# We add 1 onto max_retry_depth to account for
# the first request (which doesn't get treated
# as a retry)
#
# So, if max_retry_depth is 4, the max becomes 5
# and scores would look like this:
#
# * we didn't retry at all: 100%
# * we retried once: 80%
# * we retried 4 times: 20%
# * we burned through the retries: 0%
#
score = (1 - (RETRY_DEPTH / (max_retry_depth + 1))) * 100
For each TODO list run, the script reports mean, min and max reliability scores.
It seems that it _generally_ behaves, but when it misbehaves, it does so quite stubbornly:
* * *
### Conclusion
I now have a script which can _roughly_ categorise my TODO items from each day ready for visualisation in a dashboard to shows where the majority of my efforts are spent.
My initial hopes for AI making it quick and easy were quickly, if predictably, dashed: the LLM does a "good enough" job most of the time, but only because substring matching within the script handles ~66% of items.
One of the **huge** benefits of having opted for a local model is that I don't have to worry about results suddenly drifting as the result of a provider deploying an update which results in the model responding differently.
That consistency has its limit though: the dashboard has looked slightly different every time that I've triggered a full re-run - the LLM is rarely consistent in the categories that it selects.
Equally, though, there probably isn't any _reasonable_ way that I could have script logic correctly categorise the 1/3rd that ends up being passed off to the LLM.
Although I could either limit the language used or include a category alongside items, the likely result of that added burden is that the list would fail to include so much of my workday.
* * *
1. It doesn't _matter_ what their terms say. Data protection by design and by default, if it can be done without sending to OpenAI/Google/Musk then it should be ↩
2. It's possible that I've skipped a few iterations in this post and was growing fed up at this point ↩
3. In case you haven't picked up on this: the privacy benefits mean that local AI will always be > remote AI ↩
www.bentasker.co.uk
July 31, 2025 at 7:30 AM
New #blog: Categorising My Daily Todo List with Deepseek-r1
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
Author: Ben Tasker
https://www.bentasker.co.uk/posts/blog/software-development/ai-todo-list-categorisation.html?utm_source=rss
#ai #deepseek #genai #llm #managering #ollama #softwaredevelopment #victoriametrics
tasque managering here too because why not
July 28, 2025 at 6:36 AM
tasque managering here too because why not
Betterer at managering the wreckedconomy.
July 28, 2025 at 4:34 AM
Betterer at managering the wreckedconomy.
He is so funny. The orher day he brought his brother in as an intern, while he recovered from managering Mom all morning lol. Same dad, diff mom, one year apart!
July 26, 2025 at 10:41 PM
He is so funny. The orher day he brought his brother in as an intern, while he recovered from managering Mom all morning lol. Same dad, diff mom, one year apart!
it’s taking me back to junior high stage managering… she would be a supporting character who was shocked to not get the lead. Aevery show i would be telling her to be quiet backstage and put her props back where they go!!
July 22, 2025 at 1:54 AM
it’s taking me back to junior high stage managering… she would be a supporting character who was shocked to not get the lead. Aevery show i would be telling her to be quiet backstage and put her props back where they go!!
#TavQOTD Althea, a nurse, catches Astarion creeping around the hospital's blood storage area and is convinced to help him.
Minuet is managering her sibling's music act and Raphael comes creeping around looking for new contracts. #BG3Althea #BG3Minuet #BG3
Minuet is managering her sibling's music act and Raphael comes creeping around looking for new contracts. #BG3Althea #BG3Minuet #BG3
July 4, 2025 at 3:25 AM
#TavQOTD Althea, a nurse, catches Astarion creeping around the hospital's blood storage area and is convinced to help him.
Minuet is managering her sibling's music act and Raphael comes creeping around looking for new contracts. #BG3Althea #BG3Minuet #BG3
Minuet is managering her sibling's music act and Raphael comes creeping around looking for new contracts. #BG3Althea #BG3Minuet #BG3
scar! These are gorgeous! Thanks for sharing them with us! 🥰🫂💛
How is managering going this week?
How is managering going this week?
May 29, 2025 at 3:49 PM
scar! These are gorgeous! Thanks for sharing them with us! 🥰🫂💛
How is managering going this week?
How is managering going this week?
I've done so much managering today
May 12, 2025 at 1:07 PM
I've done so much managering today
Tell me this is good managering then.
May 11, 2025 at 3:29 PM
Tell me this is good managering then.
LYDIA: *rolls her eyes*
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
April 30, 2025 at 2:18 PM
LYDIA: *rolls her eyes*
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
CHRIS: Question is, are YOU a diva? Are you gonna refute every bit of gossip that goes around about your managering style? *teases* Are you going to deny the absolute reaming you give anyone that catches you without your 'face' on? *chuckles* Or the stomp in your stride
8/?
I was in Providence today for my haircut (bc I cannot find a barber in Boston I like) but also at Heartleaf for Indie Bookstore day! I got to say hi to Penny and got a whole bunch of new books
April 26, 2025 at 10:50 PM
I was in Providence today for my haircut (bc I cannot find a barber in Boston I like) but also at Heartleaf for Indie Bookstore day! I got to say hi to Penny and got a whole bunch of new books
In baseball writing online, there's a term for coaches who so need to be seen as effective leaders that they constantly make visible "I'm in charge here" moves: "managering." They'd rather do something counterproductive than be seen doing nothing.
March 6, 2025 at 2:09 PM
In baseball writing online, there's a term for coaches who so need to be seen as effective leaders that they constantly make visible "I'm in charge here" moves: "managering." They'd rather do something counterproductive than be seen doing nothing.
I think Phil is not good at the managering
February 23, 2025 at 10:20 PM
I think Phil is not good at the managering
Whilst working at a Warhammer store, every new manager had to do what we called ‘managering’, finding something to change. One of them was concerned that a pot of demo paint was going missing a week. His solution was to superglue the paint pots to the tables. Next manager had to write off the tables
February 18, 2025 at 7:11 PM
Whilst working at a Warhammer store, every new manager had to do what we called ‘managering’, finding something to change. One of them was concerned that a pot of demo paint was going missing a week. His solution was to superglue the paint pots to the tables. Next manager had to write off the tables
The Corroded Coffin guys reunite to go back on tour with them and Eddie is working double time but everyone is happy and a big family and they take care of each other and make a lot of money the end.
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
February 12, 2025 at 10:10 PM
The Corroded Coffin guys reunite to go back on tour with them and Eddie is working double time but everyone is happy and a big family and they take care of each other and make a lot of money the end.
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
Steve is just number one fan/managering off the side like "that's my family!" with the biggest grin
Anyway I’m taking Valencia to the La Liga title today. That’s what Christmas is all about. Lazy days football managering.
December 23, 2024 at 9:55 AM
Anyway I’m taking Valencia to the La Liga title today. That’s what Christmas is all about. Lazy days football managering.