You're right, I did not see that bit. It's late here, which could be the reason :)
|
I installed this mod yesterday, and since then I have been bombarded by the AhrefsBot spider. I've tried adding it to this mod, but no joy.
|
Quote:
If the person put "AhrefsBot" in this Mod, it should be blocked no matter the IP. |
I think you'll find that if you check WOL when AhrefsBot is online and then choose to show the user agent from the dropdown, "ahrefsbot" isn't actually in their user agent. I think I posted about this to another user a few posts or so ago.
|
Why not just put
* / It blocks all user agents and only requires the standard one-line box. This is what I have done. |
Quote:
Here's some UA's that ahrefs use:
Mozilla/5.0 (compatible; AhrefsBot/1.0; +http://ahrefs.com/robot/)
Mozilla/5.0 (compatible; AhrefsBot/2.0; +http://ahrefs.com/robot/)
Mozilla/5.0 (compatible; AhrefsBot/3.0; +http://ahrefs.com/robot/)
Mozilla/5.0 (compatible; SiteBot/0.1; +http://www.sitebot.org/robot/)
Mozilla/5.0 (compatible; SiteBot/0.1; +http://www.sitebot.org/robot/),gzip(gfe)
Mozilla/5.0 (compatible; SiteBot/0.1; +http://www.sitebot.org/robot/),gzip(...(gfe),gzip(gfe)
So use SiteBot or Ahrefs as the banning UA's :) |
For those who want to block access from the Facebook external hit bot:
facebookexternalhit/1.0 |
Quote:
Pretty sure the OP is saying he sees "ahrefsbot" in his WOL after adding it to your mod. |
Quote:
EDIT: this is actually mentioned in the FAQ that's referenced in the mod's description. |
You only need to enter choopa if that displays in their UA. They'll be banned immediately and will disappear from WOL after the WOL timeout. The IPs are of no consequence: I have a ban-IP mod, but this one bans the string found in the UA, so they can use 100 different IPs for the same UA and they will still be banned :)
|
That's what I did yesterday morning, but in the afternoon I had around 36 AhrefsBots under the UA Choopa.net. I added choopa to your mod in the morning, but they were still there in the afternoon, which is why I asked. I haven't seen them this morning though..
|
Do you use any mod for who-visited statistics? If you don't, the only other explanation is that they had already accessed a thread or area before you banned them. The only time they can be banned is when they release that thread or area to move to another; after that they're history :)
|
I have boofo's mod for displaying spiders. I'm not sure any spiders can view our site, since you need to be registered to view any content on it.
Great mod and thanks for your help.. |
It doesn't matter that they cannot view any content; your thread URLs are being indexed, which is why you are being crawled. Naturally they see the same as a guest. I just viewed your site: you should also turn off displaying WOL for guests, it will save you queries and bandwidth :)
|
Cheers Simon :D
The feckers are still getting in: AhrefsBot Spider 02:15 PM / Viewing Index NIRC: 173.199.115.83.choopa.net |
If that's from the logging, then yes, you will get that until every thread they have tried to index previously becomes a 301 permanent redirect. If not, and you want to PM me temp admin access with all permissions, I'll take a look and see what I can do for you.
|
OK, I've checked and I don't see any of these bots in your native vBulletin WOL. The other mods you have for statistics, total visitors, etc. WILL log these as visiting, because the bots are directly accessing a URL and the logging is done before the URL loads completely. My mod also bans them at this point, so both mods are working :)
Just as a note, you're using create a thread, so you can quickly get thousands of threads; it's better to use the output.txt logging :)
Note to all! If you have Simon in your ban list, this will ban the following:
simon
SimonLloyd
Lloyd simon
thisisanincrediblylongsimonwordhere
Get the idea? You don't need to add all of those to your ban list, simply because the mod looks for the string "simon" (case doesn't matter) in the entire string. So, if you'd used this in your list:
Simon*\Lloyd
it would NOT ban:
Simon
Simon Lloyd
thisissimonlloydinastring
but it WOULD ban:
Simon*\Lloyd-in(this.string)
thisstringSimon*\Lloydhere
....etc
Hope you all understand this better now and can get to removing duplicates from your list.
@tricksodave, you can delete the temp account for me now, thanks. Also, if you read the above, please prune your list.
If any of you have any trouble with editing your lists, let me know and I'll help with anything you're stuck with :) |
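For anyone who finds it easier to read as code, the matching rule described above (each ban-list entry is searched for, ignoring case, anywhere in the user agent) might look roughly like this. It is only a Python sketch of the described behaviour, not the mod's actual code; the name `is_banned` is made up for the example.

```python
def is_banned(user_agent, ban_list):
    """Return True if any ban-list entry occurs anywhere in the user agent.

    Assumption: a plain, case-insensitive substring test, as described in
    the post above. `is_banned` is a hypothetical helper, not part of the mod.
    """
    ua = user_agent.lower()
    # Empty entries are skipped, because an empty string matches everything.
    return any(entry.lower() in ua for entry in ban_list if entry)


if __name__ == "__main__":
    # One short entry covers every variant that contains it.
    for ua in ["simon", "SimonLloyd", "thisisanincrediblylongsimonwordhere"]:
        print(ua, is_banned(ua, ["Simon"]))

    # A longer, more specific entry only matches when the whole entry appears.
    for ua in ["Simon Lloyd", "thisstringSimon*\\Lloydhere"]:
        print(ua, is_banned(ua, ["Simon*\\Lloyd"]))
```

This is why a single short entry such as "ahrefs" makes longer duplicates in the list unnecessary.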
Thanks Simon, That's helped me understand it a bit better. Thanks again...
|
Simon Lloyd
i c wat u done there :) haha |
Simon, does this also block Facebook's scraper? I am getting slammed by Facebook IPs and spiders:
facebookexternalhit/1.0 (+http://www.facebook.com/externalhit_uatext.php)
facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
I did it through htaccess, but that blocks my ability to post any articles to Facebook with a thumbnail. Here is a link: http://www.botopedia.org/user-agent-...k-external-hit |
Or is there a way to slow these guys down with crawl-delay like this:
User-Agent: *
Crawl-Delay: 10
I read you should use the agent by name instead of the above. Do you know how, and does Facebook follow the above? |
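On the "agent by name" part of the question, here is a rough sketch of what a named rule next to the wildcard rule could look like, checked with Python's standard `urllib.robotparser`. The robots.txt content and the delay values are made-up examples. It only shows what a parser that understands Crawl-delay would read; it says nothing about whether Facebook's fetcher actually honours the directive.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: a rule naming Facebook's fetcher plus a wildcard
# rule for everyone else. The delay values are examples only.
ROBOTS_TXT = """\
User-agent: facebookexternalhit
Crawl-delay: 10

User-agent: *
Crawl-delay: 2
"""


def delay_for(user_agent, robots_txt=ROBOTS_TXT):
    """Return the Crawl-delay this parser reads for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.crawl_delay(user_agent)


if __name__ == "__main__":
    fb = "facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)"
    print("named rule:", delay_for(fb))
    print("wildcard rule:", delay_for("SomeOtherBot/1.0"))
```

A crawler that ignores robots.txt, or ignores Crawl-delay specifically, will not be slowed down by either rule.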
Here is something else on Facebook's bots, spiders, or whatever they really are. Facebook claims they are not spiders or bots but scrapers. However, I have been getting 500 server-side errors, and when I check my error logs I see Facebook IPs around the times they are hitting my site, sometimes over 100 times within 2 minutes... sigh...
Help? lol.... |
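To check a claim like "over 100 hits within 2 minutes" against a log, a sliding-window count is one way to do it. The sketch below assumes you have already pulled the request timestamps for one client out of your access log (that parsing step is not shown); the function name `find_bursts` and the default thresholds are just illustrative.

```python
from collections import deque
from datetime import datetime, timedelta


def find_bursts(timestamps, limit=100, window=timedelta(minutes=2)):
    """Return the timestamps at which more than `limit` hits fell inside `window`.

    Assumption: `timestamps` are datetime objects for one client (or one
    user agent), already extracted from the access log.
    """
    recent = deque()  # hits that are still inside the current window
    bursts = []
    for ts in sorted(timestamps):
        recent.append(ts)
        # Drop hits that are older than the window relative to this hit.
        while ts - recent[0] > window:
            recent.popleft()
        if len(recent) > limit:
            bursts.append(ts)
    return bursts


if __name__ == "__main__":
    start = datetime(2012, 1, 1, 14, 0, 0)  # arbitrary example start time
    hits = [start + timedelta(seconds=i) for i in range(150)]  # one hit per second
    bursts = find_bursts(hits)
    print("first burst at:", bursts[0] if bursts else None)
```

Comparing the burst times with the times of the 500 errors would show whether the two really line up.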
Is there only Facebook in the error log? As for banning bots or whoever, read my post above; as you'll see, it all depends on the UAs of each bot. Banning is a personal thing. Most bots don't recognise the delay command in robots.txt. Maybe look at their IP range and ban some of their IPs; you can use my other mod for that.
|
CAG you haven't downloaded or marked this as installed!
|
Hey! I did download it but I didn't hit installed! lol... Sorry...
As for banning the IPs, I have done that, but it blocks the ability to post articles with the right info on Facebook, so I have to make a decision here on whether Facebook will help my site or not. I was really just asking about the crawl-delay, which shouldn't have been asked here, Simon. I apologize for that. |
Banning IPs only affects incoming traffic, unless you've banned them in cPanel or htaccess. As for asking about the delay, there's no problem, I like to help where I can.
|
Aboundex/0.2
seems to be a new one. Here is the full thing:
Aboundex/0.2 (http://www.aboundex.com/crawler/)
The IP is 173.193.219.168-static.reverse.softlayer.com
I have checked the IP and it has come back that a spam bot is using it. If someone wants to run checks, see if it needs to be added to the list. I am not 100% sure it is, which is why it needs checking first. |
Quote:
Yeah, I used htaccess to ban them completely. I need to find out exactly which IPs I can ban and still allow Facebook to work properly when I post links to articles or posts... sigh... lol. But thanks for understanding and helping out, it is very much appreciated, Simon. |
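One way to reason about "which IPs can I ban and still let the good ones through" is to test each address against an allow list before the ban list. The sketch below uses Python's standard `ipaddress` module. The ranges are placeholders from the reserved documentation blocks, NOT Facebook's real ranges, and `decide` is a hypothetical helper; you would need to substitute ranges you have verified yourself.

```python
import ipaddress

# Placeholder ranges from the reserved documentation blocks (RFC 5737).
# They are NOT Facebook's real ranges; replace them with verified ones.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]
BANNED_NETWORKS = [ipaddress.ip_network("198.51.100.0/24")]


def decide(ip_text):
    """Return 'allow', 'ban', or 'unknown' for a single IP address."""
    ip = ipaddress.ip_address(ip_text)
    # The allow list is checked first so a trusted range is never banned
    # by a broader ban entry.
    if any(ip in net for net in ALLOWED_NETWORKS):
        return "allow"
    if any(ip in net for net in BANNED_NETWORKS):
        return "ban"
    return "unknown"


if __name__ == "__main__":
    for address in ["192.0.2.15", "198.51.100.7", "203.0.113.9"]:
        print(address, decide(address))
```

Addresses that come back as "unknown" are the ones worth looking up before deciding either way.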
Quote:
Great! Now you tell me! lol... Thanks again, Max. I will have to take a careful look at the IPs and try to see if they match Facebook's then. |
From what I've seen over the years facebook's bots have good behavior and only come to see you when something is posted there, from your site. Then they don't crawl around and they SURE don't go anywhere suspicious.
|
Yeah, nothing suspicious about Facebook's crawlers, scrapers, or bots, whatever they are. But it has caused my forums to pop the 500 internal server error a bunch of times. I check around the time those errors happen and are reported to me, and it's Facebook's IPs around the times of the 500 errors.
|
Well, I decided to completely deny Facebook's crawlers, scrapers, spiders, or bots access to my site.
I deleted all their active sessions from my database through phpMyAdmin and added "facebook" to the list, and I haven't gotten one single Facebook critter on my site since. It sucks because I can no longer share anything on Facebook from my forums, but I just had to do it. Facebook won't reply and doesn't seem to care about eating up bandwidth with their crawlers. Oh well. |