#ahrefsbot
Block the Top 10 Website Scrapers Today!
Protect Your Privacy & Content

Click on The Alt Text of the Image to see the top 10 Scraper Bots!

#socialmediamarketing #smallbusinesstips #contentcreation #socialmediastrategist #womeninbusiness
#smallbusinessshoutouts
February 28, 2025 at 5:29 AM
They are bots, mostly Googlebot

Googlebot 11,448
AhrefsBot 9,128
Unknown robot identified by bot\* 7,723+
Applebot 2,719
SemrushBot 2,182
empty user agent string 2,408
Mediapartners-Google 2,324
bingbot 2,129
feed 2,016
Googlebot-Image 2,043
Go-http-client 1,670
facebookexternalhit 1,589
October 31, 2024 at 3:56 PM
Ok actually, because robots.txt is optional and ignorable, here is our current, full AI bot blocking solution using the non-optional .htaccess mod_rewrite (see alt text)
October 28, 2025 at 5:09 PM
Adding this to your .htaccess is a better defense against bots and AI scrapers than robots.txt
December 2, 2024 at 8:26 PM
It looks like several scrapers have found my Discworld tar pit. 😈

Today's stats so far:
GPTBot/1.2: 12405406 Bytes
Googlebot/2.1: 1391937 Bytes
ClaudeBot/1.0: 6359 Bytes
Amazonbot/0.1: 4622 Bytes
AhrefsBot/7.0: 1414 Bytes

#discworld #auditortrap #aipoisoning #iocaine
May 22, 2025 at 12:20 PM
799683 requests in 5 days. And more than half an hour processing time to evaluate the logs.

Breakdown:

532335 49.46% 12 0.06% 17.91 GiB GPTBot/1.2
57612 5.35% 15 0.07% 486.08 MiB AhrefsBot/7.0
56898 5.29% 5 0.02% 3.05 GiB […]

[Original post on rollenspiel.social]
January 6, 2025 at 3:25 PM
Over 98% of the December traffic to an old site was from bots. Seems portentous. And sad, tbh.
ClaudeBot/1.0
Googlebot/2.1
bingbot/2.0
ChatGPT-User/1.0
AhrefsBot/7.0
panscient
SemrushBot/7~bl
DotBot/1.2
meta-externalagent/1.1
ImagesiftBot
bidswitchbot/1.0
January 3, 2025 at 11:03 AM
Single quotes (`'`) are not string delimiters in Caddy. See Caddyfile Concepts in the Caddy documentation. In your case you don't need quotes at all; just use `header User-Agent *AhrefsBot*`, for example.

Use `abort` instead of `respond`; it drops the connection without writing a response.
July 20, 2024 at 9:58 PM
Our robots.txt:

User-agent: *
Disallow:
#
User-agent: AhrefsBot
User-agent: Scrapy
User-agent: Barkrowler
User-agent: GPTBot
User-agent: AI2Bot
User-agent: Ai2Bot-Dolma
User-agent: Amazonbot
User-agent: Applebot
User-agent: Applebot-Extended
User-agent: Bytespider
User-agent: CCBot
User-agent […]
Original post on mastodon.scot
June 1, 2025 at 8:06 PM
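A minimal sketch of how a compliant crawler would read a file shaped like the one above, using Python's standard `urllib.robotparser`. The post is cut off before the rule that closes the second group, so the `Disallow: /` line is an assumption, as are the shortened agent list and the example.com URL.

```python
from urllib.robotparser import RobotFileParser

# Shortened version of the file above. The final "Disallow: /" is an
# assumption: the original post is truncated before the group's rule.
ROBOTS_TXT = """\
User-agent: *
Disallow:
#
User-agent: AhrefsBot
User-agent: GPTBot
User-agent: Bytespider
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/blog/post"  # hypothetical URL

# Several User-agent lines in a row share the rules that follow them.
for agent in ("AhrefsBot", "GPTBot", "SomeBrowser"):
    print(agent, parser.can_fetch(agent, url))
```

An empty `Disallow:` allows everything, so only the agents named in the second group are refused. This only describes what a well-behaved crawler would do; nothing in the file enforces it.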
A data octopus with a bitter aftertaste! In this article we show you who really profits from AhrefsBot and how you can ban it from your website 👇

teufelswerk.net/ahrefsbot-kl...

#ahref #ahrefsbot #datenkrake #crawler #website #seo #datensicherheit #cybersicherheit
AhrefsBot steals your data: here's how to stop it!
It's not only people who roam the net. Silent, invisible visitors scurry through the virtual corridors of your website everywhere, AhrefsBot for example.
teufelswerk.net
September 5, 2025 at 9:17 PM
Ah yes... we see those hit across a fair bit of our infra here at work. It's annoying as fuck to deal with because it's not just amazonbot, it's also ahrefsbot, claudebot, etc and so on and so forth...
January 18, 2025 at 5:52 AM
Here's a non-exhaustive list of bots that hit the website within the few minutes I tailed the access logs:

- AhrefsBot
- meta-externalagent
- bingbot
- Bytespider
- Amazonbot
- Googlebot
- PetalBot
- SemrushBot
- ChatGPT-User
June 6, 2025 at 9:43 AM
I've been running the user agent blocker extension on my personal site on Netlify the last few days to get some data on the type of user agents hitting the site before I do any blocking.

Out of the LLM-type bots, PetalBot (never heard of it) and ClaudeBot seem to be the most active.
April 4, 2025 at 9:54 PM
25%, actually
with a quick-and-dirty grep
the winners, in ascending order, are: claudebot Googlebot BLEXBot AhrefsBot SemrushBot amazonbot robot petalbot bingbot ... and the very original 'bot'
June 6, 2025 at 2:51 PM
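A rough Python equivalent of that kind of quick-and-dirty grep, as a sketch only: it counts tokens ending in "bot" in access log lines. The sample lines are hypothetical, and the regex is an assumption about what the grep matched, not the poster's actual command.

```python
import re
from collections import Counter

# Matches tokens that end in "bot" (any case), e.g. "AhrefsBot",
# "bingbot", or the bare token "bot".
BOT_TOKEN = re.compile(r"[A-Za-z0-9_-]*bot\b", re.IGNORECASE)


def count_bots(lines):
    """Count bot-like tokens, at most once per token per log line."""
    counts = Counter()
    for line in lines:
        tokens = {m.group(0).lower() for m in BOT_TOKEN.finditer(line)}
        counts.update(tokens)
    return counts


# Hypothetical log lines in combined log format, for illustration only.
SAMPLE = [
    '203.0.113.5 - - [06/Jun/2025:14:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '203.0.113.9 - - [06/Jun/2025:14:00:02 +0000] "GET /feed HTTP/1.1" 200 900 "-" '
    '"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"',
    '203.0.113.5 - - [06/Jun/2025:14:00:03 +0000] "GET /about HTTP/1.1" 200 700 "-" '
    '"Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)"',
    '198.51.100.7 - - [06/Jun/2025:14:00:04 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0"',
]

counts = count_bots(SAMPLE)
for name, hits in counts.most_common():
    print(name, hits)
```

A pattern this loose also picks up words like "robot" from the info URLs inside user agent strings, which is one plausible reason generic entries such as 'robot' and 'bot' show up in lists like the one above.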
In my logs, I see many, many visits from #aibot #ahrefsbot. 😶

Well, I've added it to my robots.txt file... Let's hope it gets the message. 🤞

#AutoHébergement
November 7, 2025 at 1:28 AM
Just from today:

AhrefsBot
AliyunSecBot
ClaudeBot
DotBot
GPTBot
PerplexityBot
PetalBot
SearchBot
SemrushBot
YandexBot
Amazonbot
Applebot
bingbot
bot
bots
claudebot
dotbot
Googlebot
gptbot
imgbotapp
joeytalbot
nanamikubota
nonsabotage
petalbot
robot
searchbot
January 6, 2025 at 11:47 AM
Children’s Health Defense, the organization from RFK Jr., includes Yoast generated Sitemaps for its WordPress site to literally spread publicly harmful information.
Robots.txt only has AhrefsBot, DataForSeoBot, FacebookBot, and SemrushBot listed.

Per Ed Martin's standard for Wikimedia...
April 26, 2025 at 1:01 AM
A sampling of bots crawling my website at haibane.info

I'm going to add a robots.txt to exclude all but Google.

100 SemrushBot
189 GPTBot
195 um-IC
310 AhrefsBot
341 UptimeRobot
405 MJ12bot
481 Googlebot
haibane.info
January 4, 2025 at 11:32 PM
Seems ahrefsbot has been crawling all over our blog ... thousands of http requests per week for a blog that has about 120 posts and a handful of pages. And which gets updated maybe once every couple of weeks.

That's just unreasonable, an unacceptable an […]

[Original post on mastodon.scot]
June 1, 2025 at 8:05 PM
Hi 👋, we are a SaaS company, doing SaaS things, but also an internet crawler (AhrefsBot), currently the biggest crawler after Google's one.
December 16, 2024 at 11:17 AM

if ($http_user_agent ~ "meta-externalagent|Semrush|DataForSeoBot|GPTBot|AhrefsBot|bingbot/|Bytespider|TikTokSpider") {
    return 444;
}

Now CPU usage by Forgejo and Postgres is close to zero.

There is also a good list here, but it rejects requests even from […]
Original post on gts.dc09.ru
March 30, 2025 at 4:07 PM
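To check which user agents a pattern like the nginx rule above would catch before deploying it, the same alternation can be tried in Python. This is a sketch: the `is_blocked` helper and the sample user agent strings are made up for illustration, and it only mirrors the regex match, not nginx itself.

```python
import re

# Same alternation as the nginx rule above. nginx's "~" operator is a
# case-sensitive regex match, which re.search mirrors by default.
BLOCKED = re.compile(
    r"meta-externalagent|Semrush|DataForSeoBot|GPTBot|AhrefsBot"
    r"|bingbot/|Bytespider|TikTokSpider"
)


def is_blocked(user_agent: str) -> bool:
    """Return True if the user agent would match the blocking pattern."""
    return BLOCKED.search(user_agent) is not None


# Hypothetical user agent strings, for illustration only.
samples = [
    "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)",
    "Mozilla/5.0 (X11; Linux x86_64) Firefox/126.0",
    "ahrefsbot",  # lowercase: not matched, the pattern is case-sensitive
]

for ua in samples:
    print(is_blocked(ua), ua)
```

In nginx, `return 444` closes the connection without sending a response. A case-insensitive match would use `~*` in nginx, or `re.IGNORECASE` in this sketch.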