Egor Marin
marinegor.bsky.social
ML Scientist @ ENPICOM B.V. (Den Bosch, Netherlands)

computational biology, ML, protein design, cheminformatics, fancy dev tooling, tinge of bouldering

https://marinegor.dev
or about people obtaining the first structure of a very important drug target (multiple molecules in the clinic), but not publishing it because they couldn't settle a priority conflict within their institute

hope to see it in a better state one day

3/3
December 23, 2025 at 11:07 AM
for instance, I know about a PI deliberately obscuring experimental details for years so that they could keep a de facto monopoly on their method

2/3
December 23, 2025 at 11:07 AM
obligatory Teams meme
December 15, 2025 at 6:12 PM
wait until you try teams
December 15, 2025 at 5:29 PM
>Can polars call sassy on a batch of rows at once? that would probably make things more efficient, especially for small (short read) records.
Yes, it internally operates on chunked arrays, and I believe there's also an option to control their size somehow, although I've never done it myself.
December 10, 2025 at 5:03 PM
>I don't think I have the experience and time to make such a plugin right now, but would be happy to help :)
thanks, noted! I'm figuring out how to do that for another use case (currently working on pickling, see issue: github.com/birkenfeld/s...), but later it should carry over directly to sassy.
December 10, 2025 at 5:03 PM
>does it support multithreading?

yes, polars has its own global thread pool, which is based on rayon, so it manages all reading and compute together.

You can have a look at e.g. polars-distance for inspiration: github.com/ion-elgreco/...
December 10, 2025 at 4:37 PM
>but with polars you'd probably want to have the bindings directly in the Rust backend right?
yep, exactly -- to avoid crossing the Python/Rust boundary twice.

>You're thinking of a plugin that filters the rows (records) of a table?
something like that, yes: gist.github.com/marinegor/a5...

(note `Lazy`)
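
To make the "lazy row filter" idea concrete, here is a minimal plain-Python sketch. It does not use polars or sassy: `hamming_hit` and `lazy_filter` are hypothetical names, and the sliding-window Hamming check is only a stand-in for a real approximate matcher. The point is that records are pulled and tested one at a time, only when the caller iterates.

```python
from typing import Iterable, Iterator


def hamming_hit(pattern: str, text: str, max_mismatches: int) -> bool:
    """True if `pattern` occurs in `text` with at most `max_mismatches` substitutions."""
    m = len(pattern)
    for start in range(len(text) - m + 1):
        window = text[start:start + m]
        mismatches = sum(a != b for a, b in zip(pattern, window))
        if mismatches <= max_mismatches:
            return True
    return False


def lazy_filter(
    records: Iterable[dict], pattern: str, max_mismatches: int = 1
) -> Iterator[dict]:
    """Yield matching records one by one; nothing is consumed until iteration starts."""
    for record in records:
        if hamming_hit(pattern, record["seq"], max_mismatches):
            yield record


# Toy records (made up for illustration); in practice this would be a stream.
reads = [
    {"id": "r1", "seq": "TTACGTAA"},
    {"id": "r2", "seq": "GGGGGGGG"},
    {"id": "r3", "seq": "CCACCTCC"},
]
hits = lazy_filter(iter(reads), pattern="ACGT")  # no work done yet
kept_ids = [r["id"] for r in hits]               # work happens here
```

A real plugin would presumably run the matching inside the Rust backend on whole chunks rather than row by row in Python; this sketch only shows the lazy-evaluation shape.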
December 10, 2025 at 4:37 PM
main reason for me to implement such plugins, apart from fun, is getting automatic access to many tabular formats (via e.g. polars-bio: github.com/biodatageeks...) and their lazy streaming from s3, gcp and so on, which is extremely useful in more production-like settings.
GitHub - biodatageeks/polars-bio: Blazing-Fast Bioinformatic Operations on Python DataFrames
December 10, 2025 at 3:58 PM
oh wow, I didn't know it looks so cool in action, kudos to the tui design!

Have you ever considered building a polars plugin for that as well? We've been using polars a lot to consume sequence data for ML, and I wonder if a plugin is something you have on your roadmap.
December 10, 2025 at 3:58 PM
kinda reminds me of these projects: github.com/rustedpy

sadly neither `maybe` nor `result` is maintained anymore💔
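
For readers who haven't seen the pattern, here is a from-scratch sketch of a Rust-style result type in plain Python. It is not the API of the rustedpy packages: `Ok`, `Err`, `parse_int` and `describe` are names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar, Union

T = TypeVar("T")
E = TypeVar("E")


@dataclass(frozen=True)
class Ok(Generic[T]):
    """Successful outcome carrying a value."""
    value: T


@dataclass(frozen=True)
class Err(Generic[E]):
    """Failed outcome carrying an error description."""
    error: E


Result = Union[Ok[T], Err[E]]


def parse_int(text: str) -> Result[int, str]:
    """Return Ok(number) instead of raising, or Err(message) on bad input."""
    try:
        return Ok(int(text))
    except ValueError:
        return Err(f"not an integer: {text!r}")


def describe(result: Result[int, str]) -> str:
    """The caller has to handle both branches explicitly."""
    if isinstance(result, Ok):
        return f"ok: {result.value}"
    return f"error: {result.error}"
```

The design choice is that failure becomes part of the return type, so a caller cannot forget about it the way an uncaught exception can be forgotten.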
November 27, 2025 at 10:40 PM
that was actually my motivation behind choosing a uni: I deliberately went for the one offering the most flexibility throughout the bachelor's, and ended up using it to its maximum.
November 23, 2025 at 6:40 PM
damn that's complicated! I guess I'm too lazy to go that deep😁

ah, and caps is language switching for me, also insanely useful.
November 23, 2025 at 10:49 AM
hey Pablo, congrats on the publication!

I imagine you started working on it before MDAnalysis introduced parallelization, so I wonder if you'd be interested in implementing it for eRMSF as described here: docs.mdanalysis.org/stable/docum...
4.2.4. Parallel analysis — MDAnalysis 2.10.0 documentation
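
As a rough picture of the split-apply-combine idea behind parallel trajectory analysis, here is a stdlib-only sketch that computes a plain per-atom RMSF from chunks of frames. It does not use the MDAnalysis API and is not the eRMSF implementation: `partial_sums` and `rmsf` are hypothetical names, the frames are toy data, and threads are used only to keep the example self-contained (in pure Python they will not give a real speedup).

```python
import math
from concurrent.futures import ThreadPoolExecutor


def partial_sums(frames):
    """Apply step: per-atom coordinate sums and squared-norm sums for one chunk."""
    n_atoms = len(frames[0])
    sums = [[0.0, 0.0, 0.0] for _ in range(n_atoms)]
    squares = [0.0] * n_atoms
    for frame in frames:
        for i, (x, y, z) in enumerate(frame):
            sums[i][0] += x
            sums[i][1] += y
            sums[i][2] += z
            squares[i] += x * x + y * y + z * z
    return len(frames), sums, squares


def rmsf(frames, n_workers=2):
    """Split frames into chunks, reduce each chunk, then combine into per-atom RMSF."""
    if not frames:
        raise ValueError("need at least one frame")
    chunk = math.ceil(len(frames) / n_workers)
    chunks = [frames[i:i + chunk] for i in range(0, len(frames), chunk)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(partial_sums, chunks))

    # Combine step: RMSF_i = sqrt(<|r_i|^2> - |<r_i>|^2)
    n_total = sum(n for n, _, _ in parts)
    values = []
    for i in range(len(frames[0])):
        mean = [sum(p[1][i][k] for p in parts) / n_total for k in range(3)]
        mean_sq = sum(p[2][i] for p in parts) / n_total
        variance = mean_sq - sum(c * c for c in mean)
        values.append(math.sqrt(max(variance, 0.0)))
    return values


# Toy trajectory: atom 0 is fixed, atom 1 oscillates along x between -1 and 1.
toy_frames = [
    [(1.0, 2.0, 3.0), (-1.0, 0.0, 0.0)],
    [(1.0, 2.0, 3.0), (1.0, 0.0, 0.0)],
    [(1.0, 2.0, 3.0), (-1.0, 0.0, 0.0)],
    [(1.0, 2.0, 3.0), (1.0, 0.0, 0.0)],
]
toy_rmsf = rmsf(toy_frames)
```

Because each chunk only returns sums, the combine step is cheap and does not depend on how the frames were split.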
November 22, 2025 at 11:28 PM
<<<< binding right cmd to escape
November 22, 2025 at 11:02 PM
please tell me there was "rust is blazingly fast" option
November 17, 2025 at 8:45 PM
I checked Planck's and Einstein's obituaries -- they're referred to as Dr. throughout. But I guess it's, unfortunately, not the worst disgrace she's had.
November 9, 2025 at 4:49 PM
within 24h, someone I've never met comes in with a chunky PR that a) is more concise than my code, b) fixes all of my issues, c) also fixes the relevant documentation. All I had left to do was add this person to the changelog.

Feels unexpectedly good, I should say.
November 6, 2025 at 12:29 PM
I immediately wonder if there's any preference for antibody fold around heavy/light chain variable fragment length🤔
October 27, 2025 at 9:19 PM