Hellina Hailu Nigatu
@hellinanigatu.bsky.social
2.2K followers 250 following 130 posts
CS PhD candidate @UCBerkeley. Interested in multilingual and low-resourced language NLP + HCI. @SIGHPC CDS Fellow. Interned @MBZUAI. Current intern at DAIR Website: https://hhnigatu.github.io
Pinned
hellinanigatu.bsky.social
I started a personal project a while back...Working on African languages, I meet people from all over the continent. I found it very interesting how similar we can be on some things and how drastically different we are on others. So I decided to read one book from each African country...🧵
Reposted by Hellina Hailu Nigatu
rajiinio.bsky.social
There is so much about navigating the Internet in a low resourced language that makes one unnecessarily vulnerable to malicious actors. It's not just a quality of experience difference, but literally the soft belly through which misinformation spreaders attack.
hellinanigatu.bsky.social
Very excited for our upcoming #AIES paper Into the Void: Understanding Online Health Information in Low-Web Data Languages.

Link: arxiv.org/pdf/2509.20245

1/n
hellinanigatu.bsky.social
This work was done with my wonderful collaborators Nuredin Ali, Fiker Tewelde, @schancellor.bsky.social and @iamdaricia.bsky.social

5/n
hellinanigatu.bsky.social
Based on our findings, we introduce the concept of Data Horizons: a critical boundary where algorithmic structures begin to degrade the relevance and reliability of search results.

4/n
hellinanigatu.bsky.social
We investigate online health information on #YouTube and #TikTok in two low-web data languages, Amharic and Tigrinya. We find that linguistic, technological, and socio-cultural constraints on information access and production lead to degraded information quality for low-web data languages.

3/n
hellinanigatu.bsky.social
While social media platforms are increasingly being used as sources of information for critical sectors like healthcare, the quality and quantity of information available is not always guaranteed, especially for languages with limited data available online.
2/n
hellinanigatu.bsky.social
Happy holiday to us all! (እንኳን አብሮ አደረሰን!)
So far so good navigating the documentation! Will reach out if i need help or have questions 😊 thank you!
hellinanigatu.bsky.social
@meg48.bsky.social's Ethiopian New Year's gift to me is a new version of HornMorpho, exactly as I am working on a project that requires a morphological analyzer for Amharic, Tigrinya, and Afan Oromo 💃💃
hellinanigatu.bsky.social
That explains a lot 😂😂
hellinanigatu.bsky.social
What are you up to Nina 👀
Reposted by Hellina Hailu Nigatu
schasins.bsky.social
If you or your students are interested in visualization tools, may I suggest signing up for my student @parkie-doo.sh's study! We're learning *a lot* about how to build direct manipulation programming tools these days! Please pass the sign up link along to your labs!
docs.google.com/forms/d/e/1F...
Reposted by Hellina Hailu Nigatu
milamiceli.bsky.social
I am thrilled to be recognized by TIME as one of the 100 most influential people worldwide in the field of artificial intelligence for my work with @dataworkersinquiry.bsky.social.

>> #TIME100AI time.com/time100ai

I want to take this opportunity to share a few reflections on this work 👇🧵
Portrait of Milagros Miceli in a frame that reads TIME100/AI 2025.
hellinanigatu.bsky.social
Oh no! I ran out of wall space for my tally!!!😌
hellinanigatu.bsky.social
I am gonna start a tally for every time i have to contend with publication policies at top tier conferences that implicitly stall Global South scholarship.
hellinanigatu.bsky.social
Came across a common Ethiopian name on one of the poems in this book as a dedication 😊
hellinanigatu.bsky.social
this is not to say all MT is bad or MT has no place in contribution...more on that as an output of my work @dairinstitute.bsky.social 😎
hellinanigatu.bsky.social
Lol here is an example:

A google translated Tigrinya article: ti.wikipedia.org/wiki/%E1%88%...

English version: en.wikipedia.org/wiki/Wedding...

I took the part that says "Ethiopia" from the English article and ran it through Google Translate...almost identical output save a few words.
hellinanigatu.bsky.social
Book #11
Missing in Action and Presumed Dead by Rashidah Ismaili from Benin

Got this from Thrift Books and by luck got a version with the author signature ☺️

It's a beautiful collection of poems, and my fav one is Nomad, attached in the picture below
hellinanigatu.bsky.social
Omg our advisor @schasins.bsky.social got us beanbags for our lab space a while back and we loveee them
hellinanigatu.bsky.social
This is a good step IMO...but i think we conflate "Wikipedia" with "English Wikipedia" and "AI Generated" with "LLM generated"

We should also be having conversations on Machine Translated text in non-English Wikipedia...those are also "AI Generated"😐
datasociety.bsky.social
Wikipedia's policy for handling AI-generated articles could be "an important example for how to deal with the growing AI slop problem from a platform that has so far managed to withstand various forms of enshittification that have plagued the rest of the internet." www.404media.co/wikipedia-ed...
Wikipedia Editors Adopt ‘Speedy Deletion’ Policy for AI Slop Articles
“The ability to quickly generate a lot of bogus content is problematic if we don't have a way to delete it just as quickly.”
www.404media.co
hellinanigatu.bsky.social
Was a pleasure to work with you Chinasa❤ here is to many more collaborations 🥂
Reposted by Hellina Hailu Nigatu
chinasa.bsky.social
My latest work, “Examining the Cultural Encoding of Gender Bias in LLMs for Low-Resourced African Languages,” co-authored with Abigail Oppong and Hellina Nigatu, is now published at the Workshop on Gender Bias in Natural Language Processing at #ACL2025!

aclanthology.org/2025.gebnlp-...
Screenshot of paper on the ACL website with the title (Examining the Cultural Encoding of Gender Bias in LLMs for Low-Resourced African Languages) and abstract that reads: "Abstract
Large Language Models (LLMs) are deployed in several aspects of everyday life. While the technology could have several benefits, like many socio-technical systems, it also encodes several biases. Trained on large, crawled datasets from the web, these models perpetuate stereotypes and regurgitate representational bias that is rampant in their training data. Languages encode gender in varying ways; some languages are grammatically gendered, while others do not. Bias in the languages themselves may also vary based on cultural, social, and religious contexts. In this paper, we investigate gender bias in LLMs by selecting two languages, Twi and Amharic. Twi is a non-gendered African language spoken in Ghana, while Amharic is a gendered language spoken in Ethiopia. Using these two languages on the two ends of the continent and their opposing grammatical gender system, we evaluate LLMs in three tasks: Machine Translation, Image Generation, and Sentence Completion. Our results give insights into the gender bias encoded in LLMs using two low-resourced languages and broaden the conversation on how culture and social structures play a role in disparate system performances."