Tanya Shapiro
@tanyashapiro.bsky.social
7.1K followers 370 following 1.2K posts
Solopreneur @ IndieVisual
Data geek who loves problem solving
Rambles about fitness & random stuff too
https://indievisual.tech
Pinned
tanyashapiro.bsky.social
"Fall down seven times, get up eight" - Japanese proverb

"Fix one bug, spawn ten more." - Developer proverb
Reposted by Tanya Shapiro
dariia.bsky.social
This workshop is tomorrow, don't miss your chance to register!
Register or sponsor a student by donating to support Ukraine!
Details: bit.ly/3wBeY4S
Please share!
#AcademicSky #EconSky #RStats
dariia.bsky.social
❗️Our next workshop will be on October 9th, 6 pm CEST titled From slow analysis to a fast and structured program in R by
@johanzvrskovec.bsky.social

Register or sponsor a student by donating to support Ukraine!
Details: bit.ly/3wBeY4S
Please share!
#AcademicSky #EconSky #RStats
tanyashapiro.bsky.social
What’s your data passion project these days?

I’ve gotten into gamer stats territory in my free time and it brings me so much joy 👾
tanyashapiro.bsky.social
Oh my god it’s only Monday
tanyashapiro.bsky.social
SDLC: ship, debug, learn, cry
tanyashapiro.bsky.social
Yessss. I know this journey well!
tanyashapiro.bsky.social
Data Magic, brought to you by R + {httr}
tanyashapiro.bsky.social
Have I mentioned how much I love APIs?

I had to clean up 500 dupe CRM records. Alas, there was no bulk merge feature in the UI. It would take hours to clean up 😱

And since laziness is my greatest coding motivator, I started to dig thru dev docs… lo & behold, they had an API!

THANK YOU DATA MAGIC
tanyashapiro.bsky.social
But @alanau.bsky.social totally agree with you. It's a tough but necessary exercise.

And I'm a people pleaser. So I do this to myself 😅
tanyashapiro.bsky.social
The logic is sound. In a perfect world, everyone is equal parts data consumer and data conservator.

Alas, data quality assurance is not on everyone’s priority list. But analytics and “insights” are…and thus most of the work gets relegated downstream to analysts, engineers, and data scientists 🙃
tanyashapiro.bsky.social
Me, the system admin, cleaning up data after users who are on a mission to fill the CRM with dirty, filthy data.

This reminds me of a talk by renowned behavioral economist Dan Ariely about work/motivation using Legos...sigh.

#iamsisyphus
a woman is kneeling on the floor in a living room with toys and a baby and a dog .
tanyashapiro.bsky.social
I had to do a double take on this map.

To really highlight “high” vs “low,” it’s clearer to use completely different colors instead of similar hues. The choice below reads like a sequential scale - light blue feels like a middle bin, not a low one.

Color opinions aside, interesting info!
Choropleth map of the United States depicting States with the highest and lowest proportion of same sex couples per 1,000 households. The legend shows dark blue box to represent “Highest proportion of same sex couples” while a light blue box represents “Lowest proportion of same sex couples.” Other states are muted with a grey blue tone and have values that are somewhere between the high and low bins. Graphic by UCLA School of Law Williams Institute.
tanyashapiro.bsky.social
I always tell myself “this is it, it’s the last time I fall for this”

And yet here I am. Again. #foolmemorethantwice
tanyashapiro.bsky.social
The audacity! Yeah those are equally rude and disappointing (one reason I’m more selective on connection requests)

Equally annoying, I’ve had sales guys abuse my “Discovery Call” scheduling link to pitch me THEIR product/service. 🙄
tanyashapiro.bsky.social
Opening LinkedIn to see a shiny red inbox notification

…only to realize it’s another sponsored message

Suckered. Every. Time.
a man in a green suit and tie is smiling in a dark room
tanyashapiro.bsky.social
truly! i'm leaning heavily on fuzzyjoin & stringdist packages so i get away with minimal thinking (thank you #rstats community). processing lv was ⚡
tanyashapiro.bsky.social
in case anyone wants to embark on this learning journey with me, I used Claude to help me make a cheatsheet of methodologies and best use cases.
List of string matching methodologies in a table along with info on best use cases, examples, and characteristics.
tanyashapiro.bsky.social
learning about different string matching methodologies today - Levenshtein, q-gram, cosine, Jaro...

data people who wrangle text data for a living are truly the unsung heroes of our data community.
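For anyone curious what two of these look like under the hood, here is a minimal sketch in plain Python (not the R packages mentioned in this thread, and not their API). The function names `levenshtein` and `qgram_distance` are made up for illustration; the q-gram version assumes the common definition of summing absolute differences in q-gram counts.

```python
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete a character from a
                current[j - 1] + 1,      # insert a character into a
                previous[j - 1] + cost,  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def qgram_distance(a: str, b: str, q: int = 2) -> int:
    """Sum of absolute differences between the q-gram counts of a and b."""
    grams_a = Counter(a[i:i + q] for i in range(len(a) - q + 1))
    grams_b = Counter(b[i:i + q] for i in range(len(b) - q + 1))
    return sum(abs(grams_a[g] - grams_b[g]) for g in grams_a.keys() | grams_b.keys())
```

Roughly speaking, edit distance tends to suit typos in short strings, while q-gram counts are more forgiving when chunks of text are reordered.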
tanyashapiro.bsky.social
Excited to see this out in the wild. Proud of this work. So much fun getting to collaborate with you and the WSDA team!
tanyashapiro.bsky.social
Thank you! Yes, fairly custom. Lot of JavaScript behind the scenes 😊