Miscellaneous Hacks - Ban Spiders by User Agent

Simon Lloyd
08-08-2011, 10:00 PM
What this mod does
With this mod you can enter user agents to watch or ban. You can also receive emails or have an Output.txt file created and updated with the time and date of visits. It doesn't have to be just spiders: you can watch, log or ban any user agent!

How to install
Simply import the product ban_spider. The mod is active by default, but none of the other options are turned on.

What is a UserAgent?
http://en.wikipedia.org/wiki/User_agent

Understanding a UserAgent string
http://user-agent-string.info/parse

Genuine User Getting Blocked?
https://vborg.vbsupport.ru/showpost.php?p=2252866&postcount=105

Tools to help
http://whatsmyuseragent.com/SwitchingUserAgents.asp
http://www.botsvsbrowsers.com/SimulateUserAgent.asp

FAQ
https://vborg.vbsupport.ru/showpost.php?p=2256512&postcount=137

How does it work?
https://vborg.vbsupport.ru/showpost.php?p=2377564&postcount=381

What's a bot?
http://en.wikipedia.org/wiki/Spambot

How do i ban a bot?
https://vborg.vbsupport.ru/showpost.php?p=2319989&postcount=318
https://vborg.vbsupport.ru/showpost.php?p=2244937&postcount=51

Where's output.txt located?
https://vborg.vbsupport.ru/showpost.php?p=2265096&postcount=216

Bad bot lists
https://vborg.vbsupport.ru/showpost.php?p=2281880&postcount=259
https://vborg.vbsupport.ru/showpost.php?p=2265667&postcount=224
https://vborg.vbsupport.ru/showpost.php?p=2309385&postcount=281

Tested on vB3.7.x, vB3.8.x and vB4.x.x, but should work on any version.

____________________________________________________________________
Special thanks to:
Lior
KH99
BoP5
for helping me sort out a few issues

...and beta testers
ForceHSS (Special thanks to Force for latest testing)
ozzy47
GreyHost

If you use this please mark as INSTALLED

History
9th June 2011 Original xml added
12th June 2011 Added both email notification and text file logging
22nd June 2011 Version 2.0.0, Added create thread on activity

Added match facility: you can now use something like Yandex and it will match MOZILLA/5.0 (COMPATIBLE; YANDEXBOT/3.0; +HTTP://YANDEX.COM/BOTS)
Added clickable link to visited thread

22nd September 2011 added user redirect url selection
08th October Beta testing started for thread creation.
20th October Beta testing started for emailing.
21st October Beta testing complete Ver 3.0.0 uploaded
29th October minor fix added to cope with empty userid on thread creation
30th October Beta testing automatic redirection to spiders/bots IP
31st October New xml uploaded with automatic redirect to IP
25th November Minor fix for blank forumid fixed
26th November 2011 Fixed version check & create thread Off by default
17th December 2014 Version 3.1.0 uploaded: hook changed, extra logging and statistics added by Ozzy47 (Chris)
18th December 2014 Version 3.1.1 uploaded: prevents spiders being counted when the mod is turned off.
17th December 2014 Version 3.1.2 uploaded, due to rogue code from another mod
The Bad Bots list is now included in the product :)
Please prune out all those that you wish to be able to see your site (I suggest you definitely prune out "DA" and "Custo").

Support will now only be given to those who have this mod marked as INSTALLED

Sforums
08-09-2011, 12:01 PM
When you say "User Agents", you mean users or what? Not sure I understand purpose of this mod?

Simon Lloyd
08-09-2011, 01:44 PM
A user agent string is sent by pretty much everyone and everything that visits your site. If your vBulletin options are set so that admins can resolve IP addresses, go to Who's Online and click the IP address shown on the right for a guest, spider or forum member; it will resolve to a user agent string something like the one I posted above. Remember, Google is your friend: http://en.wikipedia.org/wiki/User_agent

Jncocontrol
08-10-2011, 01:17 AM
From my understanding, spiders are supposed to be our friends in the forum community?

Boofo
08-10-2011, 01:55 AM
Not when there are 25 of the same spiders crawling the site day after day, non-stop. And from China, to boot.

oddball118
08-10-2011, 03:03 AM
I was wondering about that Baidu Spider. What would I add to the list to get rid of them? I usually see 10 - 20 at a time.

Boofo
08-10-2011, 03:17 AM
Baidu

Simon Lloyd
08-10-2011, 06:00 AM
Thanks Boofo. If you added Baiduspider, it would ban everything that has exactly that in the UA, but not everything that has Baidu. Using just Baidu bans everything that has that as part of its string, so Baiduspider would be banned too.
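
A minimal sketch of the matching described above, assuming a plain case-insensitive substring test. The mod itself is a PHP plugin whose code is not shown in this thread; the Python below is only an illustration, `is_banned` is a hypothetical name, and the second user agent is made up.

```python
def is_banned(user_agent, ban_list):
    """Return True if any ban-list entry occurs anywhere in the user agent.

    Matching ignores case, so "Baidu" also matches "BAIDUSPIDER".
    """
    ua = user_agent.lower()
    return any(entry.lower() in ua for entry in ban_list if entry)


baidu_ua = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
other_baidu_ua = "Mozilla/5.0 (compatible; Baidu-Transcoder/1.0)"  # made-up UA for illustration

print(is_banned(baidu_ua, ["Baiduspider"]))        # True
print(is_banned(other_baidu_ua, ["Baiduspider"]))  # False: "Baiduspider" is not in it
print(is_banned(other_baidu_ua, ["Baidu"]))        # True: the shorter entry is broader
```

The shorter the entry, the more user agents it catches, which is why "Baidu" covers every Baidu crawler while "Baiduspider" covers only that one.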

AURFSCAN
08-10-2011, 07:11 AM
I like the txt output. I'll have to check this out .... tagged

my htaccess :)

BrowserMatchNoCase Baiduspider bad_bot
Deny from env=bad_bot

Wayne Luke
08-11-2011, 05:42 PM
Baidu

I banned them at the server level. Not catering to the Chinese or Asian market and never will cater to the Chinese or Asian market so don't need them to index my site.

oddball118
08-11-2011, 11:57 PM
Works as advertised. Thanks!

Simon Lloyd
08-12-2011, 02:20 PM
I banned them at the server level. Not catering to the Chinese or Asian market and never will cater to the Chinese or Asian market so don't need them to index my site.

It's been a long time since I've seen your name on a post Wayne, nice to see you again :)

Simon Lloyd
08-12-2011, 02:22 PM
How did you do it at the server level?

If you use cPanel you can use the IP deny manager and ban a whole block of IP's (as well as single ones)

Boofo
08-12-2011, 03:22 PM
I thought he was talking of banning spiders themselves and not IPs.

Simon Lloyd
08-12-2011, 04:10 PM
I banned them at the server level. Not catering to the Chinese or Asian market and never will cater to the Chinese or Asian market so don't need them to index my site.

By the way Wayne, a long while ago I joined vbCodex, then it went offline for a long while, and when it came back up (as it is now) there were no members or posts and I'm not registered anymore. What's happening, or do you have another vB coding related site?

Conehead555
08-24-2011, 12:47 PM
Puts this over the header (4.1.4):

Warning: stristr() [function.stristr]: Empty delimiter in [path]/includes/class_bootstrap.php(917) : eval()'d code on line 39

Warning: stristr() [function.stristr]: Empty delimiter in [path]/includes/functions.php(7115) : eval()'d code on line 58

bosanci28
08-25-2011, 12:20 AM
I have installed this mod, but can someone post a screenshot showing the right settings for it?

I have this spider that is filling the online page with:

119.63.196.xx, with lots of different numbers at the end. I think, like other people say in the forum,
it is the "Baidu" spider...


119.63.196.57
119.63.196.45
119.63.196.14
119.63.196.13
119.63.196.40
119.63.196.79
119.63.196.76
119.63.196.49
119.63.196.102
119.63.196.114
119.63.196.47
119.63.196.121
119.63.196.116
119.63.196.27


how to stop this...


thank you,

Simon Lloyd
08-25-2011, 02:56 AM
Puts this over the header (4.1.4):

Warning: stristr() [function.stristr]: Empty delimiter in [path]/includes/class_bootstrap.php(917) : eval()'d code on line 39

Warning: stristr() [function.stristr]: Empty delimiter in [path]/includes/functions.php(7115) : eval()'d code on line 58
That's because you turned the mod on without selecting anything to ban! See previous posts.
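
The warnings above come from PHP's stristr() being handed an empty search string. A rough illustration of the same pitfall, assuming the ban list is just a list of strings: in Python an empty string is "found" in every string, so an unguarded matcher with a blank entry would ban everybody. `active_patterns` and `is_banned` are hypothetical names, not the mod's code.

```python
def active_patterns(ban_list):
    """Drop blank entries so an empty setting never reaches the matcher."""
    return [p for p in ban_list if p.strip()]


def is_banned(user_agent, ban_list):
    patterns = active_patterns(ban_list)
    if not patterns:          # nothing configured: ban nobody
        return False
    ua = user_agent.lower()
    return any(p.lower() in ua for p in patterns)


# An empty needle matches every string, which is why the guard matters.
print("" in "Mozilla/5.0")                 # True
print(is_banned("Mozilla/5.0", [""]))      # False: the blank entry is ignored
print(is_banned("Mozilla/5.0", []))        # False
```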

Simon Lloyd
08-25-2011, 02:57 AM
Simply put Baidu in the box for banning bots :)

bosanci28
08-25-2011, 03:03 AM
like this:?
see pic:

thanks

KHALIK
08-25-2011, 04:29 AM
Installed

ty

Simon Lloyd
08-25-2011, 05:56 AM
like this:?
see pic:

thanks

No, you need to remove the IPs from there. This mod is not for banning IPs; this one is: https://vborg.vbsupport.ru/showthread.php?t=268146

Simply leave Baidu there and the mod will do the rest :)

WEBDosser
08-25-2011, 05:58 AM
thanks :)

Kangaroo666
08-25-2011, 07:18 AM
Works well, finally got rid of those pesky Baidu spiders. Thanks m8.

bosanci28
08-25-2011, 12:52 PM
No, you need to remove the IP's from there, this mod is not for banning IP's this one https://vborg.vbsupport.ru/showthread.php?t=268146 is!

Simply leave Baidu there and the mod will do the rest :)
OK, done.
I also don't see any spiders from those 119.xxx IPs anymore for now, but I will keep checking...

thanks for your help.

lcn
08-27-2011, 09:18 PM
installed, thank you.

Is there a comprehensive list of bad spider bots?

Some I have added to my site

Baidu
Yandex
EmailSiphon
EmailWolf
ExtractorPro
Crescent
CherryPicker
[Ww]eb[Bb]andit
WebEMailExtrac
NICErsPRO
Teleport
Zeus
Wget
LinkWalker
sitecheck.internetseer.com
ia_archiver
DIIbot
psbot
EmailCollector
nasty
verynastystuff
i-am-nasty
Twiceler


http://www.spam-whackers.com/bad.bots.htm


http://www.forumpostersunion.com/showthread.php?t=1644

ponydaddy
09-08-2011, 11:32 PM
very nice mod works well just what I was looking for

ponydaddy
09-08-2011, 11:33 PM
Modification of the Month. clicked it well worth it

ForceHSS
09-08-2011, 11:35 PM
click it as well very good mod could not get rid of some bots this stopped them right away thanks for this

Simon Lloyd
09-09-2011, 08:15 AM
List of bad bots added in modification description, please remember to prune out all those that you wish to be able to see your site!

Boofo
09-09-2011, 09:04 AM
Simon, it would be nice to add a text file with the bad bots listing in a zip for the mod. That way the users of this mod have it on hand locally. ;)

doctorsexy
09-09-2011, 09:46 AM
Installed 4.1.5... thank you

jaffaman
09-09-2011, 10:59 AM
Installed 4.1.5 PL1 Thanks.

Simon Lloyd
09-09-2011, 10:10 PM
Simon, it would be nice to add a text file with the bad bots listing in a zip for the mod. That way the users of this mod have it on hand locally. ;)

I'll just add another attachment. I gave the list openly like that so folk could either copy it or pick out the ones they wanted to ban, plus it's no mystery as to what they're getting, but I'll definitely do that tomorrow :)

ForceHSS
09-10-2011, 12:17 PM
the Create New Thread option does not work

Simon Lloyd
09-10-2011, 01:14 PM
I'll check that out, will post back :)

Simon Lloyd
09-10-2011, 03:42 PM
It seems that if the "Ban spiders in list" option is checked, the spider is dealt with before the thread is created; if that option isn't checked, a thread is created. I will of course work on getting both to work together. I must have tested them one by one when creating the mod :)

ForceHSS
09-10-2011, 04:29 PM
thank you will be nice to see it working

DaffyDuck
09-11-2011, 06:36 PM
Works well enough for me (killed those baidu spiders) but would be nice to see both features work together.

ForceHSS
09-12-2011, 04:22 AM
Works well enough for me (killed those baidu spiders) but would be nice to see both features work together.

I am sure he will get it working soon

TheWhite
09-12-2011, 05:52 AM
Well guys, I have a decent forum with 185k members with a decent dedicated server and an acceptable Google adsense income.

In the last few years I've tried to "ban" the vicious Baidu spider, first with robots.txt (useless because it doesn't obey it), then with .htaccess, which worked fine for a couple of years, but since a month ago it has somehow found a way of not obeying that either.
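
For context on why robots.txt alone may not help: the file only states the site's wishes, and it is up to each crawler to check it. A small sketch using Python's standard `urllib.robotparser`, which answers the question a well-behaved crawler would ask; the robots.txt content and the path are examples, not taken from any site in this thread.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: Baiduspider
Disallow: /

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A polite crawler asks this before fetching; nothing forces it to.
print(parser.can_fetch("Baiduspider", "/forum/showthread.php"))  # False
print(parser.can_fetch("Googlebot", "/forum/showthread.php"))    # True
```

A crawler that never performs this check simply carries on, which is why blocking has to happen on the server side.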

In the last week or so I started banning the IPs as a last resort, but these guys come out with a new one in a couple of days, so this morning I did a Google search and came to this WONDERFUL VB MOD. This coder deserves a medal and not only the MOD OF THE MONTH (so vote for him!!). After an hour, Baidu is NO MORE!! I hit 253GB of bandwidth this morning, which is over my average monthly (time period) rate by at least 75GB, not to mention the server slowdowns.

Baidu doesn't bring any good traffic (adsense wise), it only does harm by eating up your resources and slowing down your forum thus causing the Google Crawlers (GOOD BOTS) to take more time on indexing your forum which is bad.

I haven't tested the logging and emailing reportings yet but I will in the next few days.

Cheers!!

Simon Lloyd
09-12-2011, 06:00 AM
Hi Guys, thanks for your kind words :), the logging seems to work fine with the banning but i suggest you only turn logging on at least 30 minutes after setting the banning as the text file can get huge quickly. I'm only a few more tests off releasing the fix for "Thread Creation" so maybe later today when i have time or first thing tomorrow ;)

TheWhite
09-12-2011, 06:14 AM
Hi Simon, great work!!

I don't want to turn on the server logging because I don't like big files but the thread creation ( similar to the Multiple Login Detection one) would be very much appreciated.
The mod must remain lean and mean ;)

I'm using VB 3.6.12 so I hope you keep this mod compatible.

Regards

Simon Lloyd
09-12-2011, 06:30 AM
As an extra thought, if you have a large .htaccess your forum will slow down, as every user has to be compared against it, or at least that's what I've been led to believe!

Simon Lloyd
09-12-2011, 06:33 AM
Hi Simon, great work!!

I don't want to turn on the server logging because I don't like big files but the thread creation ( similar to the Multiple Login Detection one) would be very much appreciated.
The mod must remain lean and mean ;)

I'm using VB 3.6.12 so I hope you keep this mod compatible.

Regards

I hadn't tested this that far back but glad it works for you, it should remain compatible :).

Boofo
09-12-2011, 02:02 PM
I'll just add another attachment, i gave the list openly like that so folk could just either copy it or pick out the ones they wanted to ban, plus its no mystery as to what they're getting but i'll definately do that tomorrow :)

Any word on this yet?

Simon Lloyd
09-12-2011, 02:37 PM
Boofo i've been a little preoccupied, promise to do it in the next hour :)

TheWhite
09-12-2011, 03:57 PM
When are you going to fix/add the thread notification?
Regards

ForceHSS
09-12-2011, 04:02 PM
still does not post here are my settings

Simon Lloyd
09-12-2011, 04:19 PM
Lol, the update you were notified of was just for the text file with bad bots being added. I'm still working on getting banning and thread creation working at the same time :)

Simon Lloyd
09-12-2011, 04:24 PM
As a side note, you don't need the full user agent string anymore to ban them; you can now enter any part of the string, e.g.
bai will result in Baidu being banned, as will any other string containing "bai".
Entering Mozilla will result in every user agent string containing that being banned.

So entering the bot name rather than the full user agent string will do: enter Baidu for that spider. Don't enter Ya as something to ban, as Yahoo will be banned just as Yandex will.
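
One way to see how far a short entry reaches is to run it over a handful of user agents before adding it to the list. A sketch under the same assumption as before (case-insensitive substring matching); `caught_by` is a hypothetical helper and the Yahoo string is a shortened sample for illustration.

```python
def caught_by(pattern, user_agents):
    """List the user agents that contain `pattern`, ignoring case."""
    needle = pattern.lower()
    return [ua for ua in user_agents if needle in ua.lower()]


# Sample strings for illustration only.
samples = [
    "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
    "Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)",
    "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)",
]

print(len(caught_by("Ya", samples)))       # 2: Yandex and Yahoo
print(len(caught_by("Yandex", samples)))   # 1
print(len(caught_by("Mozilla", samples)))  # 3: everything
```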

ForceHSS
09-12-2011, 06:54 PM
ok did not know as it did not say this in your first post

TheWhite
09-13-2011, 04:23 PM
A little word of advice, don't get carried away with the bot banning because it might affect your Google revenue in a negative way, start with these for a while and control your traffic wisely.

Baidu
Yeti
Twiceler

Regards

Boofo
09-13-2011, 04:45 PM
A little word of advice, don't get carried away with the bot banning because it might affect your Google revenue in a negative way, start with these for a while and control your traffic wisely.

Baidu
Yeti
Twiceler

Regards

Don't forget Yandex.

niteflyer32
09-13-2011, 07:36 PM
Works great except one thing with the Baidu spider, it knocked them off the forum for about 3 hours. Now I see the Baidu spider back again on the online list.????

Update: Looks like the mod knocked Baidu off again, haven't seen them for 1 hour. There was 4 Baidu spiders all with different IPs that showed up and then nothing. Is this the mod working or should we not even see the bots that are on the blacklist? I don't see other spiders we've blacklisted showing back up yet.

ForceHSS
09-13-2011, 10:55 PM
I have it like this in the list Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
and they never enter

Simon Lloyd
09-14-2011, 05:59 AM
Don't worry if you see them in the WOL (Who's Online). Remember that although they are redirected the moment they arrive, vBulletin may already have registered their arrival and will show them in WOL for a split second. They only go missing from WOL when your WOL timeout has expired; I have mine set to 900 seconds (15 minutes), so I have to wait beyond that time before I see them disappear.

The fix I'm working on to make "Create thread" work with "Ban spiders in list" currently uses a hook that's compiled later, so the bots always show in WOL (until they get the message for the 301): the minute one is redirected, another of their bots tries to crawl your site, and it is instantly redirected too. I could release that version, but it would amount to loads of messages here saying "I can still see the spiders", so I won't release it until I can make the bots disappear after the WOL timeout.
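
The timeout behaviour can be pictured as a simple cutoff on the last recorded activity. This is only a sketch of that idea, assuming the WOL list keeps a session until the timeout has passed since its last hit; vBulletin's actual session handling is not shown in this thread and `shown_in_wol` is a hypothetical name.

```python
from datetime import datetime, timedelta


def shown_in_wol(last_activity, now, timeout_seconds=900):
    """A session stays listed until `timeout_seconds` have passed since its last hit."""
    return now - last_activity < timedelta(seconds=timeout_seconds)


redirected_at = datetime(2011, 9, 14, 5, 0, 0)   # bot hit the forum once and was redirected

print(shown_in_wol(redirected_at, redirected_at + timedelta(minutes=5)))   # True
print(shown_in_wol(redirected_at, redirected_at + timedelta(minutes=16)))  # False
```

So a bot that was redirected on arrival can still be listed for up to the full timeout, even though it never got any further.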

niteflyer32
09-17-2011, 01:07 PM
The Baidu bots do seem to disappear once they are found. We used to have scores of them on the WOL and now just one or two show up and they are gone shortly.

One issue we have with a forum member who is getting blocked and the Google re-direct, he is using IE 8 and on Cox.net

Here is our block list, I'm not sure which one is blocking him. Any ideas? Thanks.

==================================

Simon Lloyd
09-17-2011, 03:38 PM
What you'll need to get him to do is visit here http://whatsmyuseragent.com/ and get him to post you the entire contents of the box that says "Your UserAgent" and then see if any part of that string matches any of your blocked bots, post back and i'll take a look :)

P.S can you edit your last post and remove all those bots and add an attachment of a text file? that way folk don't have to scroll for ages to read the thread :)

niteflyer32
09-18-2011, 06:28 AM
Sorry about the long post, got it changed to a text file.

Here is the forum member's response from the "Your UserAgent" link. I'm not sure which one from our list is blocking him, possibly one of the 3 Mozillas on the list?

Thanks for the help.

Your User Agent: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; Creative AutoUpdate v1.40.01)

Simon Lloyd
09-18-2011, 08:14 AM
Firstly, it looks like your client's UA has been altered in the past. I have two suggestions you can try right now: first change DA in your list to something like DA1 and ask him to try; if that doesn't do it, try altering Net Extractor to NET1 Extractor. These are the only things I can see that may be a problem. If that doesn't work I'll look at it in more depth, but in the meantime your user can perform a UA switch by following the instructions here http://whatsmyuseragent.com/SwitchingUserAgents.asp

Simon Lloyd
09-18-2011, 03:36 PM
I've got some news for you: I tested your client's UA against my forum using your spider list and was able to gain access to my site, so it's not this mod keeping him/her out.

You can check for yourself here http://www.botsvsbrowsers.com/SimulateUserAgent.asp just enter his entire user agent in the test window and your site below, hit Go and enter the Captcha that pops into the results window, hit Go again and you'll see your site :)

So from that, this mod's fine and not causing a problem.

EDIT: BTW, do you know that every one of the bots you supplied in that list has a space before it? The mod will look for a space before each of those words if that's how it appears in your ban spider list.
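
To illustrate the leading-space problem: with substring matching, " Wget" (with a space) is a different pattern from "Wget". A sketch that cleans a pasted list before matching, assuming one entry per line (the thread does not say how the mod separates entries); `parse_ban_list` and `is_banned` are hypothetical names and the Wget user agent is an example string.

```python
def parse_ban_list(raw_text):
    """Split a one-entry-per-line setting into clean patterns."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


def is_banned(user_agent, patterns):
    ua = user_agent.lower()
    return any(p.lower() in ua for p in patterns)


raw = " Wget\n LinkWalker"              # entries pasted with a leading space
wget_ua = "Wget/1.12 (linux-gnu)"

print(is_banned(wget_ua, raw.splitlines()))      # False: " wget" is not in the UA
print(is_banned(wget_ua, parse_ban_list(raw)))   # True once the space is stripped
```

An entry with a leading space can still match when the word happens to follow a space inside the user agent, which may be why such a list appears to work for most bots.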

niteflyer32
09-18-2011, 06:38 PM
We changed DA on the UA list and now the member can access the forum????

I'm not seeing the space before the UA in the list I posted. The list appears to be working as it has knocked many of the bots off our WOL list.

Simon Lloyd
09-18-2011, 06:51 PM
Well, if it's working for you all well and good :)

At least you now have the tools to test any future queries with users having problems with access because of their UA.

Simon Lloyd
09-18-2011, 07:02 PM
Hmmm seems to be a problem with page 5 of thread....this is a test post!

Edit: thread back to normal now ;)

niteflyer32
09-18-2011, 07:16 PM
Thank you for your time and help. This mod rocks.

Simon Lloyd
09-18-2011, 08:33 PM
Since you like it and it works for you can you mark it installed please!

niteflyer32
09-18-2011, 09:18 PM
Oops, I thought I had already marked that. Done. And MOTM done too.

We have 2 other users saying they are getting blocked now. I think I may have gone overboard on blocking UAs.

Your User Agent is:
Mozilla/4.0 (compatible; MSIE 8.0; AOL 9.0; AOLBuild 4327.5204; Windows NT
5.1; Trident/4.0; GTB6.4; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR
3.0.4506.2152; .NET CLR 3.5.30729; customie8)

and

Your User Agent: Mozilla/5.0 (Windows NT 5.1) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/14.0.835.163 Safari/535.1

Also, I selected the option to have a log created and written to output.txt in the forums root but I'm not seeing that log txt file. Any idea why that log is not appearing?

Simon Lloyd
09-19-2011, 06:10 AM
Thanks for that. Have you used the tools (links) I gave you? You can test their user agent against your site to see if it is the mod blocking them. Remember that when you enter something like DA, the mod looks for anything that contains just that; your other member's issue was sorted when you changed DA to DA1, because their user agent contained "Creative AutoUpdate". When banning UAs with just a couple of letters, you are best entering either a longer string or, better still, the entire user agent.

I only tested the logging and create thread on their own, that is to say without the banning option turned on. There is a small glitch that I'm working on: spiders are banned while your forum style is being compiled for them, but the notifications are created after the forum has been shown to them completely. The spiders are therefore taken care of well before it ever gets to the notification stage; you're not seeing notifications because there's nothing to notify.

I am working on getting them both to work together and will post here when i manage that :)
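
The ordering problem described above can be sketched as follows: if the visit is recorded first and the redirect decided afterwards, both features can coexist. This is an illustration of that ordering only, not the fix the author eventually shipped; `handle_visit` is a hypothetical name and the in-memory buffer stands in for Output.txt.

```python
import io
from datetime import datetime


def handle_visit(user_agent, patterns, log, now=None):
    """Log a matching visit first, then report that it should be redirected.

    Returns True when the visitor matched and should be sent away.
    """
    ua = user_agent.lower()
    hit = next((p for p in patterns if p and p.lower() in ua), None)
    if hit is None:
        return False
    stamp = (now or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    log.write(f"{stamp}\t{hit}\t{user_agent}\n")   # notification happens before the ban
    return True


log = io.StringIO()                 # stands in for Output.txt
banned = handle_visit("Baiduspider/2.0", ["Baidu"], log, datetime(2011, 9, 19, 6, 10))
print(banned)                       # True
print(log.getvalue(), end="")
```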

Simon Lloyd
09-19-2011, 06:27 AM
Your first UA is being blocked because you have "custo" in your block list!
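
When a genuine member is blocked, listing which entries hit their user agent narrows it down quickly. A sketch using the two user agents posted earlier in this thread and a few of the entries mentioned; the matching rule (case-insensitive substring) is an assumption and `matching_patterns` is a hypothetical helper.

```python
def matching_patterns(user_agent, ban_list):
    """Return every ban-list entry found in the user agent, ignoring case."""
    ua = user_agent.lower()
    return [p for p in ban_list if p and p.lower() in ua]


ban_list = ["Baidu", "DA", "custo", "Net Extractor"]   # a few entries mentioned in this thread

ua_cox = ("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; "
          ".NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; "
          ".NET4.0C; Creative AutoUpdate v1.40.01)")
ua_aol = ("Mozilla/4.0 (compatible; MSIE 8.0; AOL 9.0; AOLBuild 4327.5204; Windows NT 5.1; "
          "Trident/4.0; GTB6.4; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; "
          ".NET CLR 3.5.30729; customie8)")

print(matching_patterns(ua_cox, ban_list))  # ['DA'], from "AutoUpdate"
print(matching_patterns(ua_aol, ban_list))  # ['custo'], from "customie8"
```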

Simon Lloyd
09-19-2011, 07:05 AM
Your second UA can access your site (i just used the UA simulator from the link i posted above)

Simon Lloyd
09-19-2011, 07:14 AM
post just to get to page 5 as this thread appears to be faulty!

niteflyer32
09-19-2011, 07:44 AM
Thanks for the help on the UA list.

When I tested the 2nd UA I listed above with your UA test website http://www.botsvsbrowsers.com/SimulateUserAgent.asp I see our forum but I also see a "Request Status: 500 : Internal Server Error" at the top. Is that caused by our UA block list or is that a server setting issue I need to talk to our webhost about? I also get a "500 : Internal Server Error" when trying to verify we have our Comscore analytic code (like Google Analytics) inserted on the forum.

What is the status of the spiders Majestics MJ12bot Spider, Speedy Spider and Voila Spider? I've searched them and get info that is dated or conflicting on if they are good or bad. They are hitting our site pretty often.

The page 5 weirdness of this thread you're seeing appears to be okay on my end.

BadgerDog
09-19-2011, 11:51 AM
Installed with thanks .. :)

Unfortunately, I'm still getting Baidu spiders, even 4 days after installing this.

Attached is screen pic. What am I doing wrong?

Regards,
Doug

Simon Lloyd
09-19-2011, 12:59 PM
For now there is an issue when banning bots with one of the notifications enabled (either create thread or the Output.txt file on the server), so if you have those enabled please disable them. To ban bots for now you must have only the mod activated and "ban bots in list" selected (and of course the bots that you want to ban). Do that and all will be good :)

You will receive notice when I solve the "working together" issue.

Simon Lloyd
09-19-2011, 01:04 PM
@niteflyer32, try turning the mod off then trying the UA at the UA simulation site and see what it returns, the mod shouldn't cause a 500 error and then show you the site, if you don't have access you simply get redirected so you wouldn't see your site.

I haven't researched the spiders. I built this mod to cut down on my server load as Baidu were hammering it: I have between 200 and 350 Baidu spiders at my site at any one time, and because they index so vigorously their demand on the server is huge (although while I'm working on the issues with the notification I am allowing all bots at the moment).

BadgerDog
09-19-2011, 01:48 PM
For now there is an issue when banning bots and having one of the notification enabled (either create thread or Output.txt file on server), so if you have those enabled please disable them, to ban bots for now you must only have the mod activated and ban bots in list selected (and of course bots that you want to ban, do that and all will be good :)

You will recieve notice when i solve the "working together" issue.

Thanks .. :)

I've turned OFF logging (email notifications were already off) and I'll monitor it now ...

Regards,
Doug

BadgerDog
09-22-2011, 10:09 AM
Thanks .. :)

I've turned OFF logging (email notifications were already off) and I'll monitor it now ...

Regards,
Doug

That doesn't work either.... :confused:

Still getting lots of Baidu and Yandex spiders ...

I'm not sure this mod is working at all, regardless of any options set, or turned ON or OFF ... ;)

Regards,
Doug

smirkley
09-22-2011, 04:08 PM
Still testing but I can say so far,... NICE !!

Thank you.

I am only banning 4 user agents at the moment, but I wish to ask: is there a condensed version of 'must ban' user agents from that list here, as compared to the whole list? I don't want to go crazy and ban too much, especially if it hurts my membership or AdSense revenue.

So far I ban:

Baidu
Yeti
Twiceler
Yandex

Simon Lloyd
09-22-2011, 04:52 PM
That doesn't work either.... :confused:

Still getting lots of Baidu and Yandex spiders ...

I'm not sure this mod is working at all, regardless of any options set, or turned ON or OFF ... ;)

Regards,
Doug

If you want to PM me admin access details and the URL I'll take a look :)

Simon Lloyd
09-22-2011, 04:56 PM
Still testing but I can say so far,... NICE !!

Thank you.


I am only banning 4 useragnts at the moment, but I wish to ask is there a condensed version of 'must ban' useragents off that list here, as compared to the whole list? I dont want to go crazy and ban too much especially if it hurts my membership or adsense rev.


So far I ban:

Baidu
Yeti
Twiceler
Yandex

99% of the Chinese bots will bring no traffic, so banning them won't hurt your AdSense revenue. On my other sites I ban ALL Chinese bots as they index far too aggressively; these are the ones I ban at my other sites:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Hope that helps you, but of course its a personal thing ;)

BadgerDog
09-22-2011, 05:07 PM
if you want to pm me admin access details and url i'll take a look :)

Well, there's nothing really to look at except your settings ... (see pic)...

Are they correct?

Regards,
Doug

Simon Lloyd
09-22-2011, 05:14 PM
That looks OK. Next you need to check what your session timeout is set at, as nothing goes missing from the WOL until that has expired. If the timeout has passed, you've been watching WOL and they remain after that time, then click WOL to view all those online, select yes for user agent from the dropdown, copy the UA and try it here http://www.botsvsbrowsers.com/SimulateUserAgent.asp to see what results you get. The UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact you can try that at the link i gave you, make sure you set it to look at your site :)
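
The same check can be done from a script by sending a request that carries the spider's user agent and looking at the status that comes back. A sketch using Python's standard urllib; the URL is a placeholder, `build_probe` and `probe` are hypothetical helpers, and `probe` needs network access, so only the request is built here.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so the 301 itself is visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def build_probe(url, user_agent):
    """Prepare a request that announces itself with the given user agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, user_agent):
    """Return (status, Location header) for one request; needs network access."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(build_probe(url, user_agent), timeout=10) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        return err.code, err.headers.get("Location")


baidu_ua = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
request = build_probe("http://example.com/forum/", baidu_ua)   # placeholder URL
print(request.get_header("User-agent"))
```

Calling `probe()` against your own forum URL with a banned user agent should then report a 301 and the redirect target, if the mod is doing its job.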

smirkley
09-22-2011, 05:39 PM
99% of the chinese bots will bring no traffic so won't hurt your adsense revenue, on my other sites i ban ALL chinese bots as they index far too agressively, these are the ones i ban at my other sites:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Hope that helps you, but of course its a personal thing ;)

Thank you. Helps.
After checking my session expiration setting, I just watched the lil' critters disappear!

Will watch for the upcoming fix, and if all works after testing, will most certainly vote MOTM!

BadgerDog
09-22-2011, 05:45 PM
That looks ok, next you need to check your session timeout settings and see what it's set at as nothing goes missing from the WOL until that has expired, if the timeout has passed and you've been watching WOL and they remain after that time then click WOL to view all those online, from the dropdown select yes for useragent and copy the UA then try it here http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get, the UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact you can try that at the link i gave you, make sure you set it to look at your site :)

It's set for default 20 minutes, but PaulM's guest mod is showing dozens of accesses (logins) from those bots that have occurred in the last 24 hours, so am I misunderstanding what this mod is supposed to do?

Shouldn't there be NO logins by Baidu and Yandex spiders for at least 23 hours ago, since this mod has been running with your corrected settings for days?

Thanks .. :)

Regards,
Doug

Simon Lloyd
09-22-2011, 05:56 PM
What you forget is that they have to attempt access to your site to get banned (redirected with a 301), so that's why Paul's mod is showing those to you. Also, bots don't access the homepage, then select a forum, then select a thread; they just go straight for a thread (or post), so as soon as that happens Paul's mod will log them. But if you look at WOL, are they there now?

I doubt it :). Paul's mod is doing the job it set out to do, and mine should be doing its job too. Did you test the UA I gave above at the link I gave? If so, what were the results?

Simon Lloyd
09-22-2011, 06:01 PM
Thank you. Helps.
After checking my session expiration setting, and just watched the lil' critters disapear!

Will watch for the fix upcoming, and if al works after testing, will most certainly vote motm!

I'm close to a fix for this, but it will probably mean an additional PHP file to be uploaded. It seems it can't work comfortably with the bots being redirected the moment they call the forum to load, as that leaves nothing for the notification to notify. All the others work comfortably together, i.e. Output.txt logging, email and create thread. It's just that when you ban the bot you either ban it late, which means it will always be seen in WOL, or ban it early, so it's very rarely seen there; it's the early bit that's causing the issue!

smirkley
09-22-2011, 07:23 PM
That looks ok, next you need to check your session timeout settings and see what it's set at as nothing goes missing from the WOL until that has expired, if the timeout has passed and you've been watching WOL and they remain after that time then click WOL to view all those online, from the dropdown select yes for useragent and copy the UA then try it here http://www.botsvsbrowsers.com/SimulateUserAgent.asp and see what results you get, the UA will look something like this:
Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

In fact you can try that at the link i gave you, make sure you set it to look at your site :)

Using this site and useragent tag to test, I get varying results.

1 - If I use just my home page (cms) it doesn't seem to be working. Not sure if this is even an issue really, as my Baidu bot count is nil now with this mod; maybe it just doesn't work with cms.

2 - when I add the necessary /forums/ to my url on the test page, it seems to be working, but it redirects to google.com.hk (is that normal?)

Simon Lloyd
09-22-2011, 07:29 PM
Right, it won't work with cms as that's outside of the /forum folder, and yes, they are getting redirected to a Chinese Google :)

smirkley
09-22-2011, 07:44 PM
Right, it won't work with cms as that's outside of the /forum folder, and yes, they are getting redirected to a Chinese Google :)

Ahh, ok that explains it then.

1 - Are there plans to make this work with the vB suite (i.e. cms/forum/blog/groups, etc.)?

2 - Can you, when you are able, make it so the admin can set where they want the redirect to go? (I would rather redirect to Baidu themselves. I don't want to play mean with Google, as they can get real pissy if they don't like it and track back the redirects. Don't want to be on Google's bad side, ya know.)

3 - (and last, I promise) Are the 'redirects' true permanent 301s by definition?

Simon Lloyd
09-22-2011, 08:10 PM
301 is set in the redirect. I don't think I will be able to set the redirect back to the bot's own source, but I can certainly make the redirect selectable (and will in the next day or so). I don't know whether I'll venture into getting it to work with cms at the moment (as I have a lot on), but anything in /forum and lower, i.e. /forum/blog etc., will benefit from the mod (or should!)

smirkley
09-22-2011, 08:15 PM
301 is set in the redirect. I don't think I will be able to set the redirect back to the bot's own source, but I can certainly make the redirect selectable (and will in the next day or so).

Thanks. Looking forward to this. (is there a suggested redirect url that is effective and safe?)


I don't know whether I'll venture into getting it to work with cms at the moment (as I have a lot on), but anything in /forum and lower, i.e. /forum/blog etc., will benefit from the mod (or should!)

No problem.

Thanks for everything on this.

Simon Lloyd
09-22-2011, 08:50 PM
Added user redirect entry box, now you can select where you send them bad bots ;)

Suggestion to redirect them to:
http://www.klikhierniet.net/
It's Dutch (meaning: don't click here).
I think you'll find it's very annoying, and I think you could find similar ones in English.

smirkley
09-22-2011, 10:01 PM
Thanks for the option of where to redirect.

I set it for www.baidu.com,... checked it on the checking link you posted,.. and it shows a successful redirect back to them.

I know all bots entered will go to baidu.com, but I am not concerned with that.
(I thought funny the link you suggested ;) )

Simon Lloyd
09-22-2011, 10:07 PM
Thanks for reporting back that it works ok, i just dashed it off and was about to mark this as a beta when i realised i hadn't tested it so THANKS for being a guinea pig!

That link is annoying though :)

Boofo
09-22-2011, 10:14 PM
Is there a setting yet for the redirect link?

smirkley
09-22-2011, 10:17 PM
Yes, and I tested and it works

ForceHSS
09-27-2011, 03:11 PM
any word when the Forum ID and Thread Username options will work

Simon Lloyd
09-27-2011, 04:21 PM
I'm still working on that. I'm trying my best to keep it all in one xml product. At the moment I'm experimenting with a separate PHP file called from the product, but that's not my goal.

You can use the thread creation...etc but without actually banning the spider at the moment.

There will be an update notice sent to you when i replace the xml or files here just as you did for the "use your own redirection url" change :)

oldfan
09-27-2011, 05:32 PM
installed and thanks :D

ForceHSS
09-27-2011, 06:08 PM
I'm still working on that. I'm trying my best to keep it all in one xml product. At the moment I'm experimenting with a separate PHP file called from the product, but that's not my goal.

You can use the thread creation...etc but without actually banning the spider at the moment.

There will be an update notice sent to you when i replace the xml or files here just as you did for the "use your own redirection url" change :)

thanks ur the best

Simon Lloyd
10-03-2011, 02:44 AM
I will be releasing a beta at the back end of next week. If anyone wants to try it, PM me. Remember I said BETA!, so it may have some issues that I'll need feedback on. I'm still trying to keep this in one product without additional files :)

Simon Lloyd
10-03-2011, 02:51 AM
Beta for Thread creation and Banning working together to be released - target date 7th October 2011

voglermc
10-03-2011, 07:32 PM
I just had a member get rejected using his android phone

Simon Lloyd
10-03-2011, 07:59 PM
This mod can ONLY block those useragents that you have entered in the list. Firstly, get your user to go here http://whatsmyuseragent.com/ (via his phone) and find out what his useragent is. Then go here http://www.botsvsbrowsers.com/SimulateUserAgent.asp, paste his UA string in and test it to see if you get denied or not.

Something in his useragent string is in your list, so it's not the mod's fault, as it's banning what you ask it to :)
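
To make the point above concrete, here is a minimal sketch of how a list-based ban could end up catching a genuine visitor. It is written in Python rather than the mod's own PHP, and it assumes plain case-insensitive substring matching, which the thread does not confirm. The function name, the sample Android UA and the "Mobile" list entry are all hypothetical.

```python
def matching_entries(user_agent, ban_list):
    """Return every ban-list entry that occurs somewhere inside the UA string.

    Assumption: entries are matched as case-insensitive substrings.
    """
    ua = user_agent.lower()
    hits = []
    for entry in ban_list:
        token = entry.strip().lower()
        if token and token in ua:  # skip blank lines in the list
            hits.append(entry.strip())
    return hits


# Hypothetical UA of a member's Android phone.
ANDROID_UA = (
    "Mozilla/5.0 (Linux; U; Android 2.3.4; en-us) "
    "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
)

# A too-generic entry such as "Mobile" would catch this genuine visitor.
print(matching_entries(ANDROID_UA, ["Baidu", "Yandex", "Mobile"]))
```

Running a member's UA through something like this against your own list shows which entry is responsible, so you can remove or narrow it.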

voglermc
10-03-2011, 08:46 PM
Thanks

Simon Lloyd
10-03-2011, 08:51 PM
If you get stuck let me know :)

appsfinder
10-06-2011, 12:46 PM
Hi, I'm trying to reinstall but now I get XML Error: Not well-formed (invalid token) at Line 1
Can anyone help?

Simon Lloyd
10-06-2011, 01:55 PM
Uninstall what you have and then install the latest one here. If you're still having problems let me know and I'll check the xml out. It's possible that it's a minimum version or thread fault; whichever it is, it's an easy fix :)

rob39
10-06-2011, 10:12 PM
Awesome Mod....I was soooo sick of seeing "Baidu".....got my eye on your "Insert Objects/ads anywhere", also.....

Simon Lloyd
10-07-2011, 08:33 AM
Glad you like it :) if you haven't and you like it that much you can visit the MOTM link above ;)

Simon Lloyd
10-07-2011, 08:46 AM
Hi all, the planned release of the Beta i mentioned won't happen today, our daughter struggled terribly to bring our grandson in to the world so all my thoughts and efforts have been toward helping her.

For the next week or so my support here may be sporadic because of that, so if by chance you don't get an answer from me right away please be patient and i will answer. :)

Simon Lloyd
10-07-2011, 03:27 PM
Ok, I said I wouldn't but it's been therapeutic. I have the beta ready for release; if any of you want to try it and report back then please PM me.

Without testing this in a few different environments I can't say that it will work ok for you; however, it works on my test board :)

Samhayne-STS
10-08-2011, 06:15 PM
Installed on forum version 4.1.7 and works great! We used to have 150+ spider bots crawling the site (mostly Baidu) and after we're down to 5-10: Google, Bing, Yahoo, etc.

Good stuff :)

Simon Lloyd
10-08-2011, 06:26 PM
Glad you like it :) Do you want to try the beta? The beta is where I'm attempting to get both the banning AND the create thread working together. Right now they will only work one at a time, so you can either ban OR create a thread (on detection) OR create a log OR send an email notification.

No pressure, but I'm looking for testers ;)

ForceHSS
10-08-2011, 06:49 PM
I will test pm me with the beta

Simon Lloyd
10-08-2011, 06:57 PM
Great thanks, pm'd you details :)

ozzy47
10-10-2011, 12:24 AM
So far the beta seems to be working great, I have Baidu in there now and it is posting threads, and showing in the output.txt, it was not doing either before.

Simon Lloyd
10-10-2011, 03:45 AM
That's great, thanks for testing. Is it also being banned?

-=Leb=-
10-10-2011, 11:19 AM
Oh man, your mod deserve gold. Baidu piece of shit was spamming my forum so bad . I had 129 baidu spiders was spamming my forum every day, and now all of them are gone thanks to you <3

Is it ok if I keep "create new thread for each UA detection" turned off?


BTW i voted :)

Simon Lloyd
10-10-2011, 12:06 PM
Yes, the current product only allows you to either ban the spider OR have a notification like output to log, email or create a thread, i have a beta which allows you to do both at the same time, a couple of people are trialing it right now.

You only need to have the mod active, ban spiders in list set to yes and some spiders in the list to ban :)

Thanks for voting MOTM, you will receive an update when this beta is proven ;)

ozzy47
10-10-2011, 08:39 PM
That's great, thanks for testing. Is it also being banned?


As far as I can tell it is.

Kat-2
10-10-2011, 09:17 PM
This mod is awesome. Thank you so much. I was so sick of the Baidu spider on my site. I installed this mod and watched as they left one by one, and they haven't been back. :D

Simon Lloyd
10-11-2011, 03:48 AM
Glad you're happy with it - watch out for the new release after this beta has been tested a while longer :)

voglermc
10-12-2011, 12:38 AM
Is anyone getting this in your error log?

[11-Oct-2011 08:33:56] die 3
[11-Oct-2011 08:35:19] die 3
[11-Oct-2011 08:35:26] die 3
[11-Oct-2011 08:35:29] die 3
[11-Oct-2011 08:35:30] die 3
[11-Oct-2011 08:35:40] die 3
[11-Oct-2011 08:35:40] die 3
[11-Oct-2011 08:35:49] die 3
[11-Oct-2011 08:35:49] die 3
[11-Oct-2011 08:35:51] die 3
[11-Oct-2011 08:36:00] die 3
[11-Oct-2011 08:36:00] die 3
[11-Oct-2011 08:36:08] die 3
[11-Oct-2011 08:36:08] die 3
[11-Oct-2011 08:38:35] die 3
[11-Oct-2011 08:38:58] die 3
[11-Oct-2011 11:32:59] die 4 true
[11-Oct-2011 11:32:59] die 4 true
[11-Oct-2011 13:24:17] die 3
[11-Oct-2011 13:27:56] die 3

ForceHSS
10-12-2011, 02:17 AM
are you using the beta or the one from the first post

Simon Lloyd
10-12-2011, 05:20 AM
It would be natural, I suspect, to get those. The mod performs all the redirects etc. at style_fetch, so even though spiders etc. go straight for a url they never get to see it, as it never completes for them, hence the connection dying for them. However, this mod does NOT give a Die(); command; it gives a 301 redirect.

@ForceHSS there's only you and one other using the beta and another has just requested to be a tester so that will be 3 of you :)

ForceHSS
10-12-2011, 06:42 AM
It would be natural, I suspect, to get those. The mod performs all the redirects etc. at style_fetch, so even though spiders etc. go straight for a url they never get to see it, as it never completes for them, hence the connection dying for them. However, this mod does NOT give a Die(); command; it gives a 301 redirect.

@ForceHSS there's only you and one other using the beta and another has just requested to be a tester so that will be 3 of you :)

Did not know how many. I thought that might have been from the beta; that's the reason I asked.

Simon Lloyd
10-12-2011, 09:34 AM
How's the testing going, ForceHSS?

voglermc
10-12-2011, 09:59 AM
Not the beta

Simon Lloyd
10-12-2011, 10:44 AM
Are you using my Ban IP mod? That mod actually gives the Die(); command, which breaks the connection. This mod allows the first connection but redirects immediately, so their request never completes.
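
As a rough illustration of "redirect immediately instead of die", here is a Python sketch (not the mod's PHP, and not its actual code). It assumes the check runs before any page content is built and that list entries are matched as case-insensitive substrings; the function name and return shape are made up for the example.

```python
def early_response(user_agent, ban_list, redirect_to):
    """Decide, before the page is rendered, whether to answer with a 301.

    Returns (status, headers) for a banned UA, or None to let the
    request carry on and load the forum page as normal.
    """
    ua = user_agent.lower()
    for entry in ban_list:
        token = entry.strip().lower()
        if token and token in ua:
            # Permanent redirect: the bot is sent elsewhere and the
            # thread or forum it asked for is never produced.
            return 301, {"Location": redirect_to}
    return None


BAIDU_UA = "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
print(early_response(BAIDU_UA, ["Baidu"], "http://www.baidu.com/"))
```

The request still reaches the server, which is why access logs and visit trackers see it, but the response is only the redirect.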

voglermc
10-12-2011, 10:55 AM
Nope, only this mod of yours

Simon Lloyd
10-12-2011, 11:24 AM
Well you can rest assured it's only banning or stopping complete connection for those in your list :)

ForceHSS
10-12-2011, 09:03 PM
How's the testing going, ForceHSS?

Going good, no bugs in it so far that I can see.
One thing would be nice to see: it seems to miss Thread Prefixes. Even if I force them on a section, it won't add them.

ozzy47
10-12-2011, 10:08 PM
If a spider is banned, how do I get it to crawl my site again? I tried your full ban list, and now my website monitor services are no longer checking my site.

I removed all spiders from admin except Baidu.

GreyGhost
10-12-2011, 10:26 PM
No pressure but im looking for testers ;)

Hi Simon, I just sent PM to test beta.

I have the released version installed on our vBCMS 4.1.7 but it doesn't seem to be banning Baidu. Our forums are located in the root with the CMS (so no /forums/), not sure if it's to do with this.

I have Track Guest Visits (https://vborg.vbsupport.ru/showthread.php?t=232182) installed and it still shows 40-50 Baidu every day.

I've double checked my settings... only have "Ban Spiders In List" selected, no logging etc.

My List is:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Anyway, will try beta and see if that fixes it.

8-)

PS. I hope your daughter and grandson are doing well.

Simon Lloyd
10-12-2011, 10:36 PM
going good no bugs in it so far that I can see
one thing would be nice to see as it seems to miss Thread Prefixes even if I make it forced to use them on a section it won't add them

It won't add prefixes, as they are added when the forum loads. Your actual url stays the same and a prefix is never added to it. Have you ever seen a url like this: http://www.mysite.com/showthread?t=[solved]12345 ??? :)

If a spider is banned, how do I get them to crawl my site again, I tried your full ban list, and now my website monitor services are no longer checking my site.

I removed all spiders from admin except Baidu.

You added your site monitoring service as a bad bot? Bad move! Remember we're sending them a 301, which is a permanent redirect. If you don't see them back in a week, check with them; you may be able to ask for your url to be crawled again.

Hi Simon, I just sent PM to test beta.

I have the released version installed on our vBCMS 4.1.7 but it doesn't seem to be banning Baidu. Our forums are located in the root with the CMS (so no /forums/), not sure if it's to do with this.

I have Track Guest Visits (https://vborg.vbsupport.ru/showthread.php?t=232182) installed and it still shows 40-50 Baidu every day.

I've double checked my settings... only have "Ban Spiders In List" selected, no logging etc.

My List is:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Anyway, will try beta and see if that fixes it.

8-)

PS. I hope your daughter and grandson are doing well.

Right, firstly, thanks, they're now doing great :) Your "Track Guest Visits" mod will ALWAYS show the spiders but your native vBulletin WOL will not. The TGV mod picks them up because they are actually accessing your site (so that mod's doing its job and recording them), but my mod prevents them from having their request completed, i.e. a direct request for a url is a forum access, but they are redirected permanently before the thread loads (so my mod is ALSO doing its job :))

Hope that clears things up for you all.

@GreyGhost i'll PM you details of the beta ;)

ozzy47
10-12-2011, 10:49 PM
Yeah my site monitoring site was in your bad bot list, and I did not see it.

ozzy47
10-12-2011, 11:15 PM
OK I got them back, 1 was missing, the other one was showing as guest, I upgraded and forgot to re up the spiders xml from WolfsHead

GreyGhost
10-12-2011, 11:16 PM
Right, firstly, thanks they're now doing great :),

Excellent! :)

Your "Track Guest Visits" mod will ALWAYS show the spiders but your native vBulletin WOL will not. The TGV mod picks them up because they are actually accessing your site (so that mod's doing its job and recording them), but my mod prevents them from having their request completed, i.e. a direct request for a url is a forum access, but they are redirected permanently before the thread loads (so my mod is ALSO doing its job :))

Hope that clears things up for you all.

Yes, suspected this was the case. Just after posting I tested it @ http://www.botsvsbrowsers.com/SimulateUserAgent.asp and Baidu were indeed being redirected (straight back to Baidu :D).

Great stuff!

8-)

ozzy47
10-12-2011, 11:22 PM
In your ban list you have InternetSeer.com

That would not be the same as this, Internet Seer Spider
because that is a web site monitoring service.

Simon Lloyd
10-13-2011, 12:37 AM
You may see Internet Seer Spider in your WOL, but you need to know its full UA (here's one: Mozilla/5.0 (compatible; Atwatch-InternetSeer monitoring; MSIE 6.0; ) Gecko) so we can decipher which part, if any, is being blocked. Maybe this will help you get them back: http://www.internetseer.com/help/faq.xtp

Just a word though, internetseer has always been just an email scraper. They trade off with "monitoring" your site, or of course you can pay them to do it; either way they are archiving email addresses found on your site. Still want to use 'em?

As a final word, I did mention when posting the list of bad bots (both here in the thread and in the txt file) to prune out those that you want to be able to see your site. You might also have liked to prune out ia_archiver, and you should definitely prune out DA and custo (as should most people).

Simon Lloyd
10-14-2011, 03:18 AM
Beta testing is going well :), those that are using the beta please pm me any results you have or bugs. If all is well i'll release it as stable next week ;)

ForceHSS
10-14-2011, 07:27 AM
Will post here and not pm. So far no bugs in the bot posting still have to test the other settings but I am sure others have but will test them as well

Simon Lloyd
10-14-2011, 08:25 AM
Will post here and not pm. So far no bugs in the bot posting still have to test the other settings but I am sure others have but will test them as well

Posting here is great, other folk get to know how it's going and if anything what to look out for :)

GreyGhost
10-14-2011, 09:34 AM
Posting here is great, other folk get to know how it's going and if anything what to look out for :)

You're welcome to quote my PMs here Simon, if you feel they'll be at all informative.
Although I will be removing the images after a few days so you may want to leave them out.

8-)

ForceHSS
10-14-2011, 10:09 AM
the email part of it doesn't seem to work

ozzy47
10-14-2011, 04:21 PM
I still see Baidu hitting my site every so often, and the last one that was banned was 10/11/2011 according to forum post and output.txt

Simon Lloyd
10-14-2011, 04:47 PM
Well, Baidu is persistent :) Remember that you are getting the notification BEFORE a thread or forum loads properly for them.

@ForceHSS & GreyGhost i'll work on the emailing a little later :)

stator
10-18-2011, 06:17 AM
I received this error after activation of "Write to log" option

Warning: fopen(output.txt) [function.fopen]: failed to open stream: Permission denied in [path]/includes/functions.php(7207) : eval()'d code on line 99

stator
10-18-2011, 06:28 AM
The highlighted word in the photo below is missing "i"

https://vborg.vbsupport.ru/external/2011/10/28.jpg

btw, does the url of redirect work? or I've to put something else?

thnx

Simon Lloyd
10-18-2011, 09:18 AM
I received this error after activation of "Write to log" option

This seems like you do not have write permissions on your server!
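
A minimal sketch of the permission problem, in Python rather than the mod's PHP: check that the log file (or, if it does not exist yet, its folder) is writable before appending to it. The function name and the line format are assumptions made for the example; the thread does not show what the mod actually writes to output.txt.

```python
import os
from datetime import datetime


def append_log_line(path, user_agent, now=None):
    """Append one timestamped line to the log; return False if not writable."""
    directory = os.path.dirname(os.path.abspath(path))
    # An existing file must itself be writable; a new file needs a writable folder.
    target = path if os.path.exists(path) else directory
    if not os.access(target, os.W_OK):
        return False  # the situation behind "failed to open stream: Permission denied"
    stamp = (now or datetime.now()).strftime("%d-%b-%Y %H:%M:%S")
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(f"[{stamp}] {user_agent}\n")
    return True
```

If a check like this returns False on your server, the usual fix is to give the web server user write access to the file or its folder.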

Simon Lloyd
10-18-2011, 09:20 AM
The highlighted word in the photo below is missing "i"

https://vborg.vbsupport.ru/attachment.php?attachmentid=133923&stc=1&d=1318922782

btw, does the url of redirect work? or I've to put something else?

thnx

The typo I will fix in future releases (if I remember; it has nothing to do with the operation of the mod). The url redirect works fine. As in the instructions, you cannot add http:// in that box, but you can use www. or just the domain. A word of warning: redirecting to Google is a very bad idea!
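
Why the box wants a bare domain can be pictured with a small Python sketch. It assumes (the thread does not say so) that the scheme is added to whatever is typed in the box when the Location header is built; the function name is hypothetical, and stripping a scheme typed by mistake is an extra safety step for the example, not something the mod is known to do.

```python
def build_location(entry):
    """Turn the redirect box entry (www.example.com or example.com) into a URL."""
    host = entry.strip()
    # Tolerate a scheme typed by mistake, so we never emit "http://http://...".
    for scheme in ("http://", "https://"):
        if host.lower().startswith(scheme):
            host = host[len(scheme):]
            break
    return "http://" + host


print(build_location("www.klikhierniet.net"))
```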

stator
10-19-2011, 05:02 PM
redirecting to Google is a very bad idea!

This is what I'm asking about. What do you suggest?

Simon Lloyd
10-19-2011, 06:05 PM
I made some suggestions a page or two back!, didn't the product you downloaded already have a url in the box?

stator
10-20-2011, 08:25 AM
I made some suggestions a page or two back!, didn't the product you downloaded already have a url in the box?

No, it hasn't.

ForceHSS
10-20-2011, 08:56 AM
No, it hasn't.

https://vborg.vbsupport.ru/showpost.php?p=2249057&postcount=94

use this one
www.klikhierniet.net

Simon Lloyd
10-20-2011, 10:43 AM
Thanks ForceHSS it must just be in the beta, i will release fix for emailing in the next day or so when i've finished work.

Simon Lloyd
10-20-2011, 10:44 AM
No, it hasn't.

You don't have this marked as installed, so I have no idea which version you downloaded!

dszuecs
10-20-2011, 01:45 PM
Just installed, works as described.
My problem: I had "Create New Thread" activated, didn't see the checkbox -.-"
No forum ID was given, so now my Top10 Stats shows two threads called "Activity from Bot No. 0 (Baidu) in your...", with no user and no forum. If I try to delete them it tells me that I don't have access :S

Any tips on how to delete those two posts?
Thanks!

Simon Lloyd
10-20-2011, 02:24 PM
Are you admin on your site? have you tried the prune posts in admincp?

Simon Lloyd
10-20-2011, 03:22 PM
Any of you that are currently beta testing, please PM me. I have fixed the emailing and it works perfectly on my test board. The email now includes the date and time of detection, the complete UserAgent string, a clickable link to the thread they visited, and the spider name in your ban list that they were matched by :)

I will pm all those that pm me at around 21:00 GMT

dszuecs
10-20-2011, 03:51 PM
Are you admin on your site? have you tried the prune posts in admincp?

Yes i am. Thanks for your message - i had to delete the threads via sql query haha.
Thanks & kindly regards

Simon Lloyd
10-20-2011, 03:52 PM
Glad you're sorted :)

Simon Lloyd
10-20-2011, 08:03 PM
Hi all, this next stage in the beta testing is going to be very short, i have proved it on 3 of my boards so i'm pretty confident that it does the business.

Version 3.0.0 will be released tomorrow around 8pm GMT provided there are no major issues raised by the beta testers.

I have further plans to develop this to provide some stats...etc (but don't hold your breath as i have an indepth programming project at work!) it may be some months yet :)

ForceHSS
10-21-2011, 06:19 AM
tested the email part over nite and it works well

Simon Lloyd
10-21-2011, 06:48 AM
Thanks ForceHSS, i'll upload that xml now then, you needn't download it as it will be the same version as you have now :)

Thanks for testing (credits given above!)

ForceHSS
10-21-2011, 10:58 AM
happy to help

nrasheed
10-21-2011, 01:26 PM
Hi,

I have Ver. 2.0.1. running on VB 4.1.7.

What would be the best approach to upgrade from 2.0.1 to 3.0?

Thanks,

Simon Lloyd
10-21-2011, 02:06 PM
Just import the product into admincp and check the overwrite radio button :)

stator
10-22-2011, 07:48 AM
You don't have this marked as installed so i have no idea of which version you downloaded!

Sorry, I marked it.
I've the vB4 version

Simon Lloyd
10-22-2011, 10:33 AM
Sorry, I marked it.
I've the vB4 version

Good, because this is the vb4 version thread ;)

wizardrule
10-24-2011, 06:01 AM
Great job, got rid of all my spiders. Love this, cannot beat it for account security. Just a note about adding an email in the notification space: I got flooded with emails of spiders being banned. It was truly amazing to see my notifications rising up the screen like building blocks.

Simon Lloyd
10-24-2011, 10:55 AM
Glad you like it. There is a warning right there near the notifications about the possibility of hundreds of mails. If you are on shared hosting, the chances are you are limited to 500 emails per day with a max of 500 per hour; dedicated servers usually get around 5000 per day with a max of 500 per hour. So using the email notification isn't the best option, in my opinion, and I only added all that notification stuff because people asked for it. The best one to use is the output file: you can prune it heavily at your leisure or just let it grow :) The only limit you then have is the amount of HDD space you have :)
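
For anyone who wants email anyway, one way to stay under a host's hourly limit is to cap how many notifications go out per hour and leave the rest to the log file. The Python sketch below is purely hypothetical: nothing in this thread says the mod does this, and the class name and the idea of a rolling one-hour window are assumptions for illustration.

```python
from collections import deque


class HourlyCap:
    """Allow at most `limit` notifications in any rolling 3600-second window."""

    def __init__(self, limit):
        self.limit = limit
        self.sent = deque()  # timestamps (in seconds) of notifications already allowed

    def allow(self, now):
        # Forget sends that are an hour old or more.
        while self.sent and now - self.sent[0] >= 3600:
            self.sent.popleft()
        if len(self.sent) >= self.limit:
            return False  # over the cap: rely on the output file instead
        self.sent.append(now)
        return True
```

With a cap of, say, 500, the 501st detection inside an hour would simply be logged rather than mailed.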

voglermc
10-25-2011, 01:32 PM
what url are most of you using to redirect spiders to?

Simon Lloyd
10-25-2011, 01:47 PM
There's one included in the product, but there's nothing to stop you creating an shtml page for a 301 and redirecting them there. What I would say, as previously, is do not redirect them to Google etc. :)

voglermc
10-25-2011, 02:00 PM
I removed that one and would rather redirect them away from my site altogether. Just wanted to know what a good one would be

Simon Lloyd
10-25-2011, 03:18 PM
Ok, in the next few days I'll release another beta where you can either enter a url to redirect to or redirect them back to their own IP address. Anyone think that would be a good idea?

voglermc
10-25-2011, 03:41 PM
Hell Yeah!

Alibass
10-28-2011, 06:59 PM
Installed and voted MOTM

I have a question, I've got a bot that's been sitting on my site for days.

Its UA is Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 2.0; and it shows in WOL as hosted-by.leaseweb.com
How can I get rid of this one other than blocking IP?

ForceHSS
10-28-2011, 07:07 PM
Installed and voted MOTM

I have a question, I've got a bot that's been sitting on my site for days.

It's UA is Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 2.0; and it shows in WOL as hosted-by.leaseweb.com
How can I get rid of this one other than blocking IP?

block its host

Alibass
10-28-2011, 07:14 PM
block its host

Are you talking about using this program and if so how or you talking about blocking in htaccess?

Simon Lloyd
10-28-2011, 08:56 PM
Installed and voted MOTM

I have a question, I've got a bot that's been sitting on my site for days.

Its UA is Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 2.0; and it shows in WOL as hosted-by.leaseweb.com
How can I get rid of this one other than blocking IP?

What is its full useragent?

Are you talking about using this program and if so how or you talking about blocking in htaccess?

Not htaccess but with this mod... although I do have a similar mod that bans IP addresses :)

Alibass
10-28-2011, 09:04 PM
block its host

What is its full useragent?

Not htaccess but with this mod... although I do have a similar mod that bans IP addresses :)

All I see is this,

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 2.0; )

ForceHSS
10-28-2011, 10:18 PM
Are you talking about using this program and if so how or you talking about blocking in htaccess?
use this program to block his host this is what it is for

Alibass
10-28-2011, 10:30 PM
use this program to block his host this is what it is for

As I posted in my first post the user-agent is,

Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 2.0; )

Web host is,

hosted-by.leaseweb.com

I have tried both of these in the spider list of this program and it doesn't work. Now if there is another way using this program then please let me know.

ForceHSS
10-29-2011, 12:18 AM
do you have the option Ban Spiders In List = yes

Alibass
10-29-2011, 12:53 AM
do you have the option Ban Spiders In List = yes

The mod works fine and as it was designed to do; I just can't get rid of this one bot without blocking the IP.

The user-agent is not like the others and I can't tag it in the spider list to work correctly. Look at the UA I posted and you can see there is no bot name listed in the UA. If someone can tell me what to add to the spider list to block this bot then that's great, but if not, please don't ask me if I have things ticked a certain way. This is a simple mod to use and by no means my first rodeo with mods. I'll just block the IP range if no one can give me an answer without asking me a hundred questions.
Sorry mate, not trying to be rude, but I'm looking for an answer to my question if someone knows it, not looking for questions. Thanks

Simon Lloyd
10-29-2011, 10:43 AM
The mod works fine and as it was designed to do; I just can't get rid of this one bot without blocking the IP.

The user-agent is not like the others and I can't tag it in the spider list to work correctly. Look at the UA I posted and you can see there is no bot name listed in the UA. If someone can tell me what to add to the spider list to block this bot then that's great, but if not, please don't ask me if I have things ticked a certain way. This is a simple mod to use and by no means my first rodeo with mods. I'll just block the IP range if no one can give me an answer without asking me a hundred questions.
Sorry mate, not trying to be rude, but I'm looking for an answer to my question if someone knows it, not looking for questions. Thanks

This mod will do its job. Simply enter the entire User Agent string in the list to ban and the mod will take care of the rest :)

If you experience further trouble I will investigate that bot further. I had a similar issue with a bot I thought I was banning, but then found the UA didn't contain the bot name, so in that instance I had to use the entire UA as I found it. Over the last few pages of this thread I've posted links on how you can check UAs against your site, find UAs etc. Have a look for those links, as they will help you in the fight against the bots ;)
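
A short Python sketch of why the whole UA is the safer entry when there is no bot name to key on. It assumes entries are matched as case-insensitive substrings of the UA, which this thread does not confirm; the function name is hypothetical and the "genuine IE7" string is just an illustrative browser UA.

```python
def is_banned(user_agent, ban_list):
    """True if any non-blank list entry occurs inside the UA (assumed matching rule)."""
    ua = user_agent.lower()
    return any(e.strip().lower() in ua for e in ban_list if e.strip())


# The UA reported above for the leaseweb-hosted bot.
BOT_UA = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.2; .NET CLR 2.0; )"
# Illustrative UA of a real visitor using IE7 on Windows XP.
GENUINE_IE7 = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)"

print(is_banned(BOT_UA, [BOT_UA]))                    # full UA entry hits the bot
print(is_banned(GENUINE_IE7, [BOT_UA]))               # and leaves the real visitor alone
print(is_banned(GENUINE_IE7, ["MSIE 7.0"]))           # a short token would ban real visitors too
print(is_banned(BOT_UA, ["hosted-by.leaseweb.com"]))  # a host name is not part of the UA
```

The last line is worth noting: the host name shown in WOL is not in the UA string, so under this assumption adding it to a user-agent list can never match.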

Simon Lloyd
10-29-2011, 10:51 AM
Here's another of leaseweb's UAs:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6

So again, enter this entire UA to ban them :)

Edit: Here's a few more!
Mozilla/4.0 (compatible; MSIE 6.0; Windows 98; Win 9x 4.90; Creative)
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; MRA 4.6 (build 01425); MRSPUTNIK 1, 5, 0, 19
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) Netscape/8.0.4
Mozilla/4.76 [en] (Windows NT 5.0; U)

The above UA's are a little older but there's still a possibility of them using them, as you can see from the timings of these posts (or edits) with a little research and the help of Google you too can find the UA's that you want to ban.

Alibass
10-29-2011, 11:57 AM
This mod will do its job, simply enter the entire User Agent string in the list to ban and the mod will take care of the rest :)

If you experience further trouble i will investigate that bot further, i had a similar issue with a bot i thought i was banning, but then found the UA didn't contain the bot name, so in this instance i had to use the entire UA as i found it. Over the last few pages of this thread you will see i've posted links on how you can check UA's against your site or find UA's...etc, have a look for those links as they will help you in the fight against the bots ;)
Simon

Thanks for the info and this great mod. I had entered the complete UA in the spider list before posting here and that did not seem to stop it. I will enter the UAs you posted and some more I have found on Google for leaseweb. I will report back to you if this doesn't stop the bot.

Alibass

mavigul
10-29-2011, 05:26 PM
my mistake, how can i delete the threads activity from blabla

Simon Lloyd
10-29-2011, 07:13 PM
my mistake, how can i delete the threads activity from blabla

I'm not with you, what do you mean?

Simon Lloyd
10-30-2011, 12:38 PM
Hi all, I have further improved this mod to allow you to automatically redirect the bot to its own IP. If you would like to beta test for me, as usual send me a PM and I'll give you download details. Like every other beta test, please try it with one bot at first (and do also use the bots vs browsers tool to check the redirection).

Looking forward to hearing from you :)

ForceHSS
10-30-2011, 12:39 PM
send will test

Simon Lloyd
10-30-2011, 12:50 PM
send will test

Thanks, PM sent :)

Simon Lloyd
10-31-2011, 07:56 AM
New feature, you can now automatically redirect the spider/bot to their own IP!
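
What "redirect to their own IP" amounts to can be sketched in a few lines of Python (the mod itself is a vBulletin product, and its code is not shown in this thread). The function name is hypothetical, and the choice of the http scheme and the bracket handling for IPv6 are assumptions of the example.

```python
import ipaddress


def location_for_visitor(remote_addr):
    """Build a Location URL that points a banned visitor back at its own address."""
    ip = ipaddress.ip_address(remote_addr)  # raises ValueError on garbage input
    # IPv6 literals must be wrapped in brackets inside a URL.
    host = f"[{ip}]" if ip.version == 6 else str(ip)
    return f"http://{host}/"


print(location_for_visitor("203.0.113.7"))
```

Validating the address first means a malformed or spoofed value never ends up in the header.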

Bengie
11-03-2011, 11:29 PM
Well there are a lot of happy people out there, but alas, they are all much more clever than I am.
I have downloaded the mod and it's sitting on my desktop, where do I upload it to please?
Also, what is WOL?

ForceHSS
11-04-2011, 12:52 AM
Upload the files to the same place as your forum files using an FTP program, then go to your admin panel plugins and install the XML.

Simon Lloyd
11-04-2011, 05:36 AM
Force, there aren't any files to upload :). Simply go to AdminCP > Plugins & Products > Manage Products > import new product and that's it. If you don't have a previous version there is no need to check the overwrite radio button.

WOL = Who's On Line

Bengie
11-04-2011, 07:24 AM
Thank you gentlemen. I did the import and 3.0.2 is now showing in my 'Installed Products', is that all I have to do, is it now active?

I might be almost 70, but how embarrassing, WOL = Who's On Line, sorry, should have seen that.

ForceHSS
11-04-2011, 03:15 PM
was thinking of another plugin, really should sleep more lol

Bengie
11-04-2011, 03:36 PM
Wonderful stuff, I can't thank you enough, and for the install instruction as well.
Baidu was creating havoc on my forum, sometimes there were around 30 at the same time and now there are none.

I have voted for this mod and recommend the writer is taken out for a beer or 3.

Installed and working fine. (vb 4.1.5)

Simon Lloyd
11-04-2011, 03:59 PM
I'll be round to collect those beers :)

Glad it's working as you expected!

gigawiz
11-04-2011, 05:20 PM
Can you modify this mod so that the output.txt file is stored in a folder of your choosing, which can then be CHMODed for read/write access? I don't like the idea of my entire forum folder having that sort of access.

Unless I am missing something of course :P

Epic mod/hack this thanks very much.

marked as installed.

gigawiz.

Simon Lloyd
11-04-2011, 06:31 PM
Your entire forum folder isn't being given "that sort of access"; it's simply one text file. All the restrictions that your other files have are unchanged. If I get time (really bogged down with working 2 jobs at the moment) I'll add a custom box so that you can set where the file is written, but if you CHMOD that to read-only then how can it write to it?

The contents of output.txt do not give any information about your site that you cannot get from browsing your site or its users. The fact that it's just bot information makes it even less desirable info :)

Glad you like the mod ;)

victorvu
11-04-2011, 08:04 PM
Simon:

thank you for this awesome mod. It works like a charm.

However, I am receiving the notification below. If I do not want them to visit my website, what else can I do? Is there a way to ban them when they type in the domain name?

DATE: 11-04-2011 14:58:13

UserAgent: Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

has visited

http://kieumau.net/index.php/forum/forumdisplay.php/181-Truy%E1%BB%87n-%C4%90%E1%BB%8Dc-D%C3%A0i-Ng%E1%BA%AFn-C%E1%BB%A7a-Nhi%E1%BB%81u-T%C3%A1c-Gi%E1%BA%A3?ltr=F

Found In Your List As Baidu

Regards,

Ban Spider Mod

Installed, voted and nominated.

Victor

Simon Lloyd
11-04-2011, 08:29 PM
Don't worry that you are getting notifications. They have many, many IPs, and they are getting redirected before the forum, thread or post that they request loads. You get notification of the thread they attempted to call.

So in a nutshell Victor, the mods working great, relax and worry not :)

Edit: You don't need to use the full UA to ban them, you can simply use Baidu in the list and that will do the trick. Using the full UA means you are only banning anything that visits with that entire UA, so if they use one that's slightly different they'll get through, but using just Baidu will trap them all!
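
To illustrate the difference between listing a short name and listing the full UA, here is a minimal sketch in Python. It is not the mod's code (the mod is a vBulletin product); the function name `find_banned_entry`, the case-insensitive comparison and the second UA string are all assumptions made for the example.

```python
def find_banned_entry(user_agent, ban_list):
    """Return the first ban-list entry contained in user_agent, or None."""
    ua = user_agent.lower()  # assumption: the comparison ignores case
    for entry in ban_list:
        if entry.lower() in ua:
            return entry
    return None


full_ua = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
           "+http://www.baidu.com/search/spider.html)")
# Hypothetical variant of the same spider with a slightly different UA.
variant_ua = "Mozilla/5.0 (compatible; Baiduspider-image/2.0)"

# A short name traps both variants.
print(find_banned_entry(full_ua, ["Baidu"]))
print(find_banned_entry(variant_ua, ["Baidu"]))
# The full UA as a list entry only traps that exact string.
print(find_banned_entry(variant_ua, [full_ua]))
```

The short entry matches anything containing it, which is also why very short entries can catch real users.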

Simon Lloyd
11-04-2011, 08:39 PM
Victor, you should also upgrade to the very latest version 3.0.2

victorvu
11-04-2011, 11:43 PM
Simon:

Thanks for the info .

I looked through a list of 20 emails and found that the interval between emails was consistently 4 minutes. So they must have automated the access and be recording the results, if any. That is why I asked whether, as soon as they enter the domain name, the redirect could take them to a sex website, for example. Before, I had 5 spiders on my site 24 hours a day. I do not know why they target my website.

It is working awesome with your mod.

Thanks again.

Victor

gigawiz
11-05-2011, 12:16 AM
OK now I feel a right idiot, it never occurred to me to just create the needed file in my forum root directory and CHMOD just the file for read/write access. I can't see the woods for the trees!
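
For anyone wanting to script that step, a minimal sketch in Python, assuming you have shell or script access to the forum root. The helper name `make_log_writable` and the mode `0o666` are assumptions for illustration; use whatever mode your host requires.

```python
import os


def make_log_writable(forum_root, filename="output.txt", mode=0o666):
    """Create the log file if missing and set permissions on that file only."""
    path = os.path.join(forum_root, filename)
    # Append mode creates the file without truncating an existing one.
    with open(path, "a", encoding="utf-8"):
        pass
    os.chmod(path, mode)  # only this file changes; the folder is untouched
    return path
```

Calling `make_log_writable("/path/to/forum")` leaves the permissions of the folder and of every other file as they were.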

On a side note due to me making a slight error while setting up the hack I have threads made by the hack in all sorts of places, the threads don't actually exist and just make the forum look a mess. Any idea on how to remove them? Somebody mentioned about a SQL thing to do but I have no idea about that.

Thanks for the support.

Oh and I forgot to mention that I am running v3.8.5 of vBulletin.

gigawiz.

EDIT - I currently have a specific forum for the threads created by this hack and if I don't put them there then they end up everywhere. How do I set it so that no threads are made at all and only the output.txt file is written?

Simon Lloyd
11-05-2011, 06:32 AM
Firstly, this is the vB4 thread, so version-specific questions should be in the thread for that version. However, to clean up just go to AdminCP > Maintenance > Update Counters, then update forum information.

To NOT create threads (which of course was my recommendation) then simply uncheck the radio button for "Create Thread" :)

gigawiz
11-05-2011, 12:24 PM
You sir are a gentleman and a scholar! That cleanup bit did just the trick. Sorry for posting in the wrong version thread, I will look for the other one. Should I re-post my previous questions over in that thread? I don't want to be seen as double posting.

Thanks again for your continued support!

gigawiz.

Simon Lloyd
11-05-2011, 04:47 PM
gigawiz, no need to post in the other thread now. The mods are the same, but different versions can give different errors, which is why I have versions of this for vB3.7 and vB3.8.

Ath3na
11-06-2011, 06:56 PM
Awesome mod, thanks so much for this.

Spent hours trying to get rid of Baiduspider via htaccess and robots.txt then found this.
Twenty minutes after having it turned on no crappy unwanted bots.

Voted for MOTM

One quick question. I turned on the logging to the output.txt file that should be in my forum root, but I didn't see it generated in my httpdocs folder once the bots were removed.

I then turned logging off after just twenty minutes of having the mod installed. Does the log take a while to generate?

Thanks for this mod, it is really helpful

Simon Lloyd
11-06-2011, 07:05 PM
The output.txt is generated as bots found in your list attempt to call a forum or thread. There's no time lag and the file should be created straight away. If you have no CMS then the file should be available at www.mysite.com/output.txt; if the forum is in a folder then it will be something like www.mysite.com/forum/output.txt
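
As a rough picture of "created straight away", here is a minimal sketch in Python of appending a record on each matching visit. It is not the mod's code: the function name `append_log_entry` is hypothetical, and the record layout simply reuses the fields shown in the notification email earlier in the thread, which is an assumption about what output.txt contains.

```python
from datetime import datetime


def append_log_entry(path, user_agent, url, matched_entry, when=None):
    """Append one visit record to the log file, creating the file if needed."""
    when = when or datetime.now()
    stamp = when.strftime("%m-%d-%Y %H:%M:%S")
    record = (
        f"DATE: {stamp}\n"
        f"UserAgent: {user_agent}\n"
        f"has visited {url}\n"
        f"Found In Your List As {matched_entry}\n\n"
    )
    # Append mode creates the file on the first matching visit.
    with open(path, "a", encoding="utf-8") as log:
        log.write(record)
```

Because the file is only written when a listed bot calls a page, no file appears until the first such visit, and the web server needs write access to it.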

Any issues post back and i'll deal with them for you :)

Ath3na
11-06-2011, 07:40 PM
Ah ok, I will turn the logging back on and let you know. Should be fine though.

Thanks

bigtree
11-07-2011, 03:06 PM
Just installed this, very cool, thank you!
I'm using the full list but I see many are not. I don't care about most Asian traffic. Actually, I only care about the main bots, the rest can go you know where. Is the full list recommended then?


I don't need a log, notifications or to create threads etc. I just want to turn this on and have it work without having to dump logs etc. I've set it to the top 3 and pointing to www.klikhierniet.net Is this enough?

Thanks again!

Simon Lloyd
11-07-2011, 05:55 PM
Firstly, glad you like it :)

You don't have to have any logging of any sort; that stuff was added by request so folk could monitor things. The FULL list isn't exhaustive and there are many missing from it. Denying bots/spiders is a personal thing, so just use the names of those that you don't want to see your site (if you are using the full list, remove DA and Custo as these may cause issues with real users). Remember you are banning bots/spiders by user agent, and what you see in WOL isn't necessarily in the UA. If you go to WOL and then choose the option for displaying user agents as well, it will help you.

I personally ban:
Yandex
Yeti
Youdao
Sogou
SoSo
Baidu
spinn3r
psbot
SBIder
exabot
speedy
omgili
wget

Amongst a few others. Like I said, you can ban as aggressively as you like :)
EDIT: You can use the automatic option of redirecting each spider/bot to their own IP address instead of redirecting to a site!
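
A minimal sketch in Python of how the two redirect choices could be expressed. This is not the mod's code; the function name `redirect_target`, its parameters and the `http://` scheme are assumptions for illustration.

```python
import ipaddress


def redirect_target(remote_addr, fixed_url=None, redirect_to_own_ip=True):
    """Return the URL a banned visitor would be sent to."""
    if redirect_to_own_ip:
        ip = ipaddress.ip_address(remote_addr)  # raises ValueError if invalid
        # IPv6 literals must be wrapped in brackets inside a URL.
        host = f"[{ip}]" if ip.version == 6 else str(ip)
        return f"http://{host}/"
    return fixed_url


print(redirect_target("180.76.5.1"))
print(redirect_target("180.76.5.1", "http://www.klikhierniet.net",
                      redirect_to_own_ip=False))
```

Either way the bot is sent elsewhere before the forum page is built, which is where the bandwidth and CPU saving comes from.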

bigtree
11-07-2011, 10:26 PM
This is such a great Mod! You are king!

RE: You can use the automatic option of redirecting each spider/bot to their own IP address instead of redirecting to a site!
What does the most damage to them without helping the bot to learn from this?

spillage
11-07-2011, 11:50 PM
Great mod, Simon.
I'm loving the difference it makes.

Today I noticed the Baidu spider on my site, despite it being in the ban list.

Any ideas?

Simon Lloyd
11-08-2011, 05:23 AM
bigtree, I have no idea what does the most damage to them :). I originally built this to get rid of the Chinese bots/spiders from my site as they were using up a lot of bandwidth and CPU time.

spillage, I'll bet you are using Paul M's mod to track guest visits or something like that. If so, read back a page or two of this thread :)

Glad you're both happy with it!

BadgerDog
11-08-2011, 10:38 AM
I installed the update "31st October New xml uploaded with automatic redirect to IP" a few days ago and I noticed that my visitor numbers seemed to jump and be much higher afterwards. It used to work fine with the previous version.

I took the advice here and waited, but even after a few days, I'm still seeing "Baidu" spiders appearing and active in the "Who's On-line", even though this mod is active and Baidu is in the list of banned spiders?

What am I missing?

My ban list says ...

Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Regards,
Doug

ForceHSS
11-08-2011, 12:07 PM
Try this list, it works well for me:

Baidu
almaden
Anarchie
ASPSeek
attach
autoemailspider
BackWeb
Bandit
BatchFTP
BlackWidow
Bot\mailto:craftbot@yahoo.com
Buddy
bumblebee
CherryPicker
ChinaClaw
CICC
Collector
Copier
Copyscape
Crescent
DIIbot
DISCo
DISCo\Pump
dotbot
Download\Demon
Download\Wonder
Downloader
Drip
DSurf15a
eCatch
EasyDL/2.99
EirGrabber
email
EmailCollector
EmailSiphon
EmailWolf
Express\WebPictures
ExtractorPro
EyeNetIE
FileHound
FlashGet
FrontPage
GetRight
GetSmart
GetWeb!
gigabaz
Go\!Zilla
Go!Zilla
Go-Ahead-Got-It
gotit
Grabber
GrabNet
Grafula
grub-client
HMView
HTTrack
httpdown
.*httrack.*
ia_archiver
Image\Stripper
Image\Sucker
Indy*Library
Indy\Library
InterGET
InternetLinkagent
Internet\Ninja
InternetSeer.com
Iria
JBH*agent
JetCar
JOC\Web\Spider
JustView
larbin
LeechFTP
LexiBot
lftp
Link*Sleuth
likse
//Link
LinkWalker
Mag-Net
Magnet
Mass\Downloader
Memo
Microsoft.URL
MIDown\tool
Mirror
Mister\PiX
Mozilla.*Indy
Mozilla.*NEWT
Mozilla*MSIECrawler
MS\FrontPage*
MSFrontPage
MSIECrawler
MSProxy
Navroad
NearSite
NetAnts
NetMechanic
NetSpider
Net\Vampire
NetZIP
NICErsPRO
Ninja
Nutch
Octopus
Offline\Explorer
Offline\Navigator
Openfind
PageGrabber
Papa\Foto
pavuk
pcBrowser
Ping
PingALink
Pockey
psbot
Pump
QRVA
RealDownload
Reaper
Recorder
ReGet
Scooter
Seeker
Siphon
sitecheck.internetseer.com
SiteSnagger
SlySearch
SmartDownload
Snake
sogou
Soso
SpaceBison
Spinn3r
sproose
Stripper
Sucker
SuperBot
SuperHTTP
Surfbot
Szukacz
tAkeOut
Teleport\Pro
URLSpiderPro
Vacuum
VoidEYE
vBSEO
Web\Image\Collector
Web\Sucker
WebAuto
[Ww]eb[Bb]andit
webcollage
WebCopier
Web\Downloader
WebEMailExtrac.*
WebFetch
WebGo\IS
WebHook
WebLeacher
WebMiner
WebMirror
WebReaper
WebSauger
Website
Website\eXtractor
Website\Quester
Webster
WebStripper
WebWhacker
WebZIP
Wget
Whacker
Widow
WWWOFFLE
x-Tractor
Xaldon\WebSpider
Xenu
Yandex
Yeti
YOUDAOBOT
Zeus.*Webster
Zeus

Simon Lloyd
11-08-2011, 12:52 PM
Hi Doug, are you using any visitor tracking mods?

BadgerDog
11-08-2011, 07:39 PM
Only Paul M's Guest mod, but I always have with previous versions.

I'm referring to the vBulletin "Who Is On-Line" display, not his guest display, as the place where the Baidu spiders have started to appear.

Thanks .. :)

Regards,
Doug

spillage
11-09-2011, 01:51 AM
Paul M mods on my site:

vBulletin Cron Based Database Backup
Doublepost Prevention

The latter is a recent addition, but other than that single occasion, I haven't seen any others getting by.

bigtree
11-09-2011, 04:53 AM
That's a serious list, ForceHSS. I would like to get rid of all the BS spiders on my site, but I worry about ranking. What effect does a list like this have?

How do we tell if a guest is a bad spider?

Simon Lloyd
11-09-2011, 04:56 AM
Doug, if you want to PM me access I'll take a look, but it won't be until around 5pm GMT.

Simon Lloyd
11-09-2011, 04:57 AM
spillage, glad to hear it's working for you :)

Simon Lloyd
11-09-2011, 04:59 AM
bigtree, you can't tell whether a guest is a bad spider other than by doing lots of net searching. Check out the honeypot project and sites like that.

BadgerDog
11-09-2011, 09:43 AM
Thanks ... :)

Here's a pic of my current "Who is On-line" display showing "Baidu" spiders with the mod active...

Regards,
Doug

Simon Lloyd
11-09-2011, 04:39 PM
Doug, I see. Can you PM me a link to your site? There's not much I can do without being able to investigate it. One thing you might want to check is that you don't have any leading or trailing spaces around the Baidu in your list, so it shouldn't look like this:
" Baidu" or "Baidu " (without the quotes of course :))
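
A minimal sketch in Python of why a stray space matters under a plain substring match, and how a list can be cleaned before use. Whether the mod trims entries itself is not stated in this thread; `parse_ban_list` is a hypothetical helper.

```python
def parse_ban_list(raw_text):
    """Split the list into entries, trimming spaces and dropping blank lines."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


ua = ("Mozilla/5.0 (compatible; Baiduspider/2.0; "
      "+http://www.baidu.com/search/spider.html)")
raw = "Yandex\nBaidu \n\n  Yeti\n"

print("Baidu " in ua)        # False: the trailing space breaks the match
print("Baidu" in ua)         # True
print(parse_ban_list(raw))   # ['Yandex', 'Baidu', 'Yeti']
```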

EasyEazy
11-12-2011, 09:16 AM
Just what I'm looking for. Installed and will see how it goes.

Thank you.

fly
11-14-2011, 05:50 PM
Thanks. I recently switched from Apache to Nginx. I used to have an .htaccess that worked great, but Nginx doesn't use em.

This saved me from having to figure out how to do it with Nginx. :D

KissOfDeath
11-16-2011, 03:57 PM
Is it just spiders this bans, or users as well?

I've got a few spambots making spam posts, for example:

AtaTuvisio "173.193.227.141" resolves to 173.193.227.141-static.reverse.softlayer.com, so I've added "static.reverse.softlayer.com" to the list but it does not seem to be doing anything. Am I understanding this wrong?

I also have some that resolve to "localhost". What do I do about those?

Simon Lloyd
11-16-2011, 04:19 PM
Localhost? That's your server then???
If you have banned that user agent, then after your WOL timeout (and after they have tried moving to view another thread, etc.) they will be gone!

Anyway, aside from that, what you are showing isn't their UA, it's the hostname their IP resolves to. Go to WOL; at the bottom there's a dropdown "Display User Agent". Change that to yes and then find them :)

KissOfDeath
11-16-2011, 05:29 PM
Ah, I understand now, thanks. A lot of the spam posters are Mozilla, so I guess I can't really ban that. I'm trying to find a balance that does not wrongfully ban people. I tried the plugins that use StopForumSpam, but they block someone just for using an ISP like AOL or Tiscali and you lose legitimate traffic.....

The localhost ones are weird. It's not my server; the IP address is completely different from my server's, "113.165.164.99" for example.

Simon Lloyd
11-16-2011, 08:13 PM
No matter :), just ban their UA and you'll be good!

Alan_SP
11-25-2011, 04:00 AM
To NOT create threads (which of course was my recommendation) then simply uncheck the radio button for "Create Thread" :)

I installed vB4 version (I followed at first vB3 version). I have vB4.1.2PL4.

I read that someone had problems with "phantom" threads created after installing mod. I too just had this experience.

Of course in settings I selected that threads aren't created at all. Also, I had empty place for forum and user ID.

But threads were created, using user ID 1 (the first user I created when installing the forum). The forum ID is missing; I see an empty name instead of the forum's name.

I see all of this with Valter's Advanced Forum Statistics: https://vborg.vbsupport.ru/showthread.php?t=235841

It looks like:

Activity from Bot No. 7 (Baidu) in your list 0 0 primus ▲ 25-11, 05:31

Note that the last thing is no forum name, i.e. empty space. :(

I can't see them in What's New. Or anywhere else in forum. When I try to access them I get error message. I don't know where they are really.

This is link to last one: http://slobodni.net/showthread.php?t=107268&goto=newpost

Threads are created, but I don't know where. Also I don't know where can I remove them and how? 6 Threads are created at the moment.

Trying to prevent this, I put forum ID in settings as well as user ID, so I could at least delete threads upon creation.

So, mod creates threads somehow even if option is disabled. How and where, not sure, but I see them. :(

The interesting thing is, it created only these six threads and at the moment there aren't any new ones. I would expect more activity from banned bots. Very strange.


EDIT: Just noticed this:

You had this in conditions:
$vbulletin->options ['bs_report_createthread'

I just changed this to:
$vbulletin->options['bs_report_createthread'

Note there was one space. Not sure if this is bug that caused thread creation or not, but I just removed that one space.

Simon Lloyd
11-25-2011, 06:35 AM
It shouldn't create threads, and it may have been that typo (thanks for spotting it). In the updated XML I have also added a fix so that if the forum ID is blank the routine is exited :)
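
The guard described here could look roughly like this minimal Python sketch. It is not the mod's code: `bs_report_createthread` is the option name quoted above, while `bs_report_forumid` and the function name are hypothetical.

```python
def should_create_thread(options):
    """Return True only when thread creation is enabled and a forum is set."""
    if not options.get("bs_report_createthread"):
        return False
    # 'bs_report_forumid' is a hypothetical option name for this sketch.
    forum_id = str(options.get("bs_report_forumid", "")).strip()
    # A blank or non-numeric forum ID exits without creating anything.
    return forum_id.isdigit() and int(forum_id) > 0


print(should_create_thread({"bs_report_createthread": 1,
                            "bs_report_forumid": ""}))
print(should_create_thread({"bs_report_createthread": 1,
                            "bs_report_forumid": "12"}))
```

Checking both conditions means a blank forum field can no longer produce orphan threads even if the option is switched on by mistake.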

EuroBeat2
11-25-2011, 06:45 AM
Threads are created, but I don't know where. Also I don't know where can I remove them and how? 6 Threads are created at the moment.

I had the same problem. However, being stupid I turned ON option to create threads, but have not specified the forum. Obviously threads should not be created if forum was not specified and I believe author is going to fix it.

Anyway, the best way to clear the threads (I got 20 of them) is to use your AdminCP -> Threads & Posts -> Prune and purge them. I removed them using "Other Options", putting the string "Activity From Bot" in the TITLE field, and it took care of it. I checked the DB and they were removed. After this operation the Advanced Statistics plugin no longer reports them. :)

Hope it helps.

EB

Amadeusmq
11-25-2011, 07:07 AM
Does anyone have a pruned list of just the really BAD spiders? Honestly, my only problem has been with Baidu (80 spiders at one point) ....and I'm not really knowledgeable enough to know which of that LONG list are: "really bad", "kindof bad", "annoying", etc.

Thanks!

Simon Lloyd
11-25-2011, 07:19 AM
It really depends on your target audience. If you were catering for China then you'd allow Baidu; if you were catering for Russia then you'd allow Yandex (Russian Yahoo). Most of the bots in that short list (trust me, it is short, as there are many more that add no value to your site or stats) are known scrapers, bad bots or foreign spiders that you just don't need.

Check other sites and see which bots they are allowing, see if they are in your country, research the ones they show to see if they're good or bad for your site.

Amadeusmq
11-25-2011, 09:28 AM
Just as a FYI, the "Check Version" option in the admincp->Plugins&Products->Installed Products listing doesn't work.

The error message is:

Version check failed. No version number was found at this location. The URL for the version check may be incorrect, or the server may be experiencing problems. Please try again later.

The hard link in the listing does correctly point to this thread; however, the version check doesn't work properly.

Simon Lloyd
11-25-2011, 03:14 PM
Thanks i'll sort that out :)

Alan_SP
11-26-2011, 01:40 AM
Anyway, the best way to clear the threads (I got 20 of them) is to use your AdminCP -> Threads And Post -> Prune and purge them. I basically removed them using "Other Option'" and put the string: "Activity From Bot" in the TITLE field and it took care of it. I checked DB and they were removed. After this operation Advanced Statistics Plugin no longer reports them. :)

Thanks.

I found another method. Go to ACP->Maintenance->Update Counters->Remove Orphan Threads (at the end of the list).

This sorted it out quickly and I too didn't see threads anymore.

Alan_SP
11-26-2011, 01:44 AM
It shouldn't create threads and it may have been that typo (thanks for spotting it) in the updated xml i have also added a fix so that if the forumid is blank the routine is exited :)

Yes, I know. It's very strange that it created these six threads.

Perhaps you should also use 0 instead of 1 as the default for the thread creation option? Maybe that could help?

EuroBeat2
11-26-2011, 01:46 AM
Thanks.

I found other method. Go to ACP->Maintenance->Update Counters->Remove Orphan Threads (at the end of list)

This sorted it out quickly and I too didn't see threads anymore.

Good thinking. I missed that one.

EB

Simon Lloyd
11-26-2011, 05:26 AM
...... Obviously threads should not be created if forum was not specified and I believe author is going to fix it.

Fixed :)

Just as a FYI, the "Check Version" option in the admincp->Plugins&Products->Installed Products listing doesn't work.

The error message is:

The hard link in the listing does correctly point to this thread; however, the version check doesn't work properly.

Fixed :)

Yes, I know. It's very strange that it created this six threads.

Probably you should also put 0 instead 1 in options for thread creating by default? Maybe that could help?

Fixed :)