Mason James
masonjames.bsky.social
masonjames.com | growagainorchids.com
I see. Thanks for the explanation internet friend!

Lots of ways to get code into LLMs, but this is one of my favorites for quick big experiments. I'm often using it in combo with Gemini's 2M window, testing limits on context size
January 4, 2025 at 4:20 AM
Not "random", but now the output is trying to include images as SVGs or something because I raised the slider?

If so maybe the interface could warn or offer an option to exclude assets? I was hoping to capture oddly long contribution docs, examples, etc, but not media.
January 3, 2025 at 8:32 PM
Ah! Then here it is - I thought this slider meant it would simply include random files in the repo that were unusually large. When I up the slider, that idea is confirmed by seeing more estimated tokens... I thought there was just more data available. What does the slider do? 🤷‍♂️
January 3, 2025 at 8:28 PM
Yup yup. In this repo's output, I removed 6 occurrences, which took it from 2.7M tokens to about 400k: github.com/roboflow/sup...
GitHub - roboflow/supervision: We write your reusable computer vision tools. 💜
January 3, 2025 at 4:08 AM
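A rough sketch of how the cleanup described in the post above might be automated on a saved text digest. It assumes the embedded images show up as JSON-style `"image/png": "<long base64 string>"` entries, the way Jupyter notebooks store cell outputs. That format is a guess and is not confirmed by the posts. The function name `strip_image_payloads` and the placeholder text are hypothetical, and this is post-processing on the output text, not a gitingest feature.

```python
import re

# Assumption: image data appears as  "image/png": "iVBORw0KGgo..."
# (notebook-style output). Only payloads of 200+ characters are matched,
# so short values are left alone.
IMAGE_PAYLOAD = re.compile(r'("image/png"\s*:\s*")[A-Za-z0-9+/=\s\\]{200,}(")')


def strip_image_payloads(text: str, placeholder: str = "<image data removed>") -> tuple[str, int]:
    """Replace long base64 image/png payloads with a short placeholder.

    Returns the cleaned text and the number of payloads replaced.
    """
    return IMAGE_PAYLOAD.subn(
        lambda m: m.group(1) + placeholder + m.group(2),
        text,
    )
```

Running this over the digest before pasting it into an LLM keeps the surrounding docs and examples while dropping the encoded image data. The returned count gives a quick check of how many occurrences were removed.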
I absolutely love this project - not a coder so using it all the time to provide context :)

One thing I've hit a couple of times is that if the repo contains data that outputs an image/png, gitingest includes the whole hash. This quickly skyrockets the number of tokens, but I'm not sure how to exclude it.
January 1, 2025 at 4:37 PM
excellent, thank you!
November 23, 2024 at 1:17 AM
Is there a list of labelers somewhere? Didn't even know this existed til your post.
November 23, 2024 at 1:12 AM