vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 4.x Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=245)
-   -   Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208)

Simon Lloyd 10-10-2011 12:06 PM

Yes, the current product only allows you to either ban the spider OR get a notification (output to log, email, or a new thread). I have a beta which allows you to do both at the same time; a couple of people are trialing it right now.

You only need to have the mod active, "Ban Spiders In List" set to yes, and some spiders in the list to ban :)

Thanks for voting MOTM, you will receive an update when this beta is proven ;)

ozzy47 10-10-2011 08:39 PM

Quote:

Originally Posted by Simon Lloyd (Post 2255388)
That's great, thanks for testing. Is it also being banned?


As far as I can tell it is.

Kat-2 10-10-2011 09:17 PM

This mod is awesome. Thank you so much. I was so sick of the Baidu spider on my site. I installed this mod and watched as they left one by one, and they haven't been back. :D

Simon Lloyd 10-11-2011 03:48 AM

Glad you're happy with it - watch out for the new release after this beta has been tested a while longer :)

voglermc 10-12-2011 12:38 AM

Is anyone getting this in your error log?

[11-Oct-2011 08:33:56] die 3
[11-Oct-2011 08:35:19] die 3
[11-Oct-2011 08:35:26] die 3
[11-Oct-2011 08:35:29] die 3
[11-Oct-2011 08:35:30] die 3
[11-Oct-2011 08:35:40] die 3
[11-Oct-2011 08:35:40] die 3
[11-Oct-2011 08:35:49] die 3
[11-Oct-2011 08:35:49] die 3
[11-Oct-2011 08:35:51] die 3
[11-Oct-2011 08:36:00] die 3
[11-Oct-2011 08:36:00] die 3
[11-Oct-2011 08:36:08] die 3
[11-Oct-2011 08:36:08] die 3
[11-Oct-2011 08:38:35] die 3
[11-Oct-2011 08:38:58] die 3
[11-Oct-2011 11:32:59] die 4 true
[11-Oct-2011 11:32:59] die 4 true
[11-Oct-2011 13:24:17] die 3
[11-Oct-2011 13:27:56] die 3

ForceHSS 10-12-2011 02:17 AM

Are you using the beta or the one from the first post?

Simon Lloyd 10-12-2011 05:20 AM

I suspect it would be natural to get those. The mod performs all the redirects etc. at the style_fetch hook, so even though spiders go straight for a URL they never get to see it: the request never completes for them, hence their connection dying. However, this mod does NOT issue a die() command; it gives a 301 redirect.

@ForceHSS there's only you and one other using the beta and another has just requested to be a tester so that will be 3 of you :)
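The behaviour described in this post (compare the visitor's user agent against a ban list, then answer with a 301 rather than killing the script) can be sketched roughly as below. This is a minimal Python illustration, not the mod's actual PHP code: the function name, the sample ban list, the redirect host, and the case-insensitive substring match are all assumptions.

```python
from typing import Optional, Tuple

# Hypothetical ban list; the real mod reads its list from the admin settings.
BANNED_AGENTS = ["Baidu", "Yandex", "sogou"]

# Hypothetical redirect target; the thread only states that a 301 is sent.
REDIRECT_HOST = "www.example.com"


def check_user_agent(user_agent: str) -> Tuple[int, Optional[str]]:
    """Return (status, location): 301 plus a target for banned agents, else 200."""
    ua = user_agent.lower()
    for entry in BANNED_AGENTS:
        # Assumption: entries are matched as case-insensitive substrings.
        if entry.lower() in ua:
            return 301, "http://" + REDIRECT_HOST + "/"
    return 200, None
```

A banned agent gets the redirect before any page content is produced, which fits the description that the spider's request "never completes".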

ForceHSS 10-12-2011 06:42 AM

Quote:

Originally Posted by Simon Lloyd (Post 2256158)
I suspect it would be natural to get those. The mod performs all the redirects etc. at the style_fetch hook, so even though spiders go straight for a URL they never get to see it: the request never completes for them, hence their connection dying. However, this mod does NOT issue a die() command; it gives a 301 redirect.

@ForceHSS there's only you and one other using the beta and another has just requested to be a tester so that will be 3 of you :)

I did not know how many there were. I thought that might have been from the beta, which is the reason I asked.

Simon Lloyd 10-12-2011 09:34 AM

How's the testing going, ForceHSS?

voglermc 10-12-2011 09:59 AM

Not the beta

Simon Lloyd 10-12-2011 10:44 AM

Are you using my Ban IP mod? That mod actually gives the die() command, which breaks the connection. This mod allows the first connection but redirects immediately, so their request never completes.

voglermc 10-12-2011 10:55 AM

Nope, only this mod of yours

Simon Lloyd 10-12-2011 11:24 AM

Well you can rest assured it's only banning or stopping complete connection for those in your list :)

ForceHSS 10-12-2011 09:03 PM

Quote:

Originally Posted by Simon Lloyd (Post 2256224)
How's the testing going, ForceHSS?

Going well, no bugs in it so far that I can see.
One thing that would be nice to see: it seems to miss thread prefixes. Even if I force their use on a section, it won't add them.

ozzy47 10-12-2011 10:08 PM

If a spider is banned, how do I get it to crawl my site again? I tried your full ban list, and now my website monitoring services are no longer checking my site.

I removed all spiders from admin except Baidu.

GreyGhost 10-12-2011 10:26 PM

Quote:

Originally Posted by Simon Lloyd (Post 2254810)
No pressure but im looking for testers ;)

Hi Simon, I just sent PM to test beta.

I have the released version installed on our vBCMS 4.1.7 but it doesn't seem to be banning Baidu. Our forums are located in the root with the CMS (so no /forums/), not sure if it's to do with this.

I have Track Guest Visits installed and it still shows 40-50 Baidu every day.

I've double checked my settings... only have "Ban Spiders In List" selected, no logging etc.

My List is:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Anyway, will try beta and see if that fixes it.

8-)

PS. I hope your daughter and grandson are doing well.

Simon Lloyd 10-12-2011 10:36 PM

Quote:

Originally Posted by ForceHSS (Post 2256479)
Going well, no bugs in it so far that I can see.
One thing that would be nice to see: it seems to miss thread prefixes. Even if I force their use on a section, it won't add them.

It won't add prefixes, as they are added when the forum loads. Your actual URL stays the same; a prefix is never added to it. Have you ever seen a URL like this: http://www.mysite.com/showthread?t=[solved]12345 ??? :)

Quote:

Originally Posted by ozzy47 (Post 2256502)
If a spider is banned, how do I get it to crawl my site again? I tried your full ban list, and now my website monitoring services are no longer checking my site.

I removed all spiders from admin except Baidu.

You added your site monitoring service as a bad bot? Bad move! Remember, we're sending them a 301, which is a permanent redirect. If you don't see them back in a week, check with them; you may need to ask for your URL to be crawled again.

Quote:

Originally Posted by GreyGhost (Post 2256505)
Hi Simon, I just sent PM to test beta.

I have the released version installed on our vBCMS 4.1.7 but it doesn't seem to be banning Baidu. Our forums are located in the root with the CMS (so no /forums/), not sure if it's to do with this.

I have Track Guest Visits installed and it still shows 40-50 Baidu every day.

I've double checked my settings... only have "Ban Spiders In List" selected, no logging etc.

My List is:
Yandex
Yeti
Baidu
soso
sogou
ichiro
speedy
spinn3r
mlbot
psbot
SBIder
Ezooms
snap shots
metauri
YoudaoBot
youdao

Anyway, will try beta and see if that fixes it.

8-)

PS. I hope your daughter and grandson are doing well.

Right, firstly, thanks, they're now doing great :). Your "Track Guest Visits" mod will ALWAYS show the spiders but your native vBulletin WOL will not. The TGV mod picks them up because they are actually accessing your site (so that mod's doing its job and recording them), but my mod prevents their request from completing, i.e. a direct request for a URL counts as a forum access, but they are redirected permanently before the thread loads (so my mod is ALSO doing its job :))

Hope that clears things up for you all.

@GreyGhost I'll PM you details of the beta ;)

ozzy47 10-12-2011 10:49 PM

Yeah my site monitoring site was in your bad bot list, and I did not see it.

ozzy47 10-12-2011 11:15 PM

OK, I got them back. One was missing and the other was showing as a guest; I upgraded and forgot to re-upload the spiders XML from WolfsHead.

GreyGhost 10-12-2011 11:16 PM

Quote:

Originally Posted by Simon Lloyd (Post 2256512)
Right, firstly, thanks they're now doing great :),

Excellent! :)

Quote:

your "Track Guest Visits" mod will ALWAYS show the spiders but your native vBulletin WOL will not. The TGV mod picks them up because they are actually accessing your site (so that mod's doing its job and recording them), but my mod prevents their request from completing, i.e. a direct request for a URL counts as a forum access, but they are redirected permanently before the thread loads (so my mod is ALSO doing its job :))

Hope that clears things up for you all.

Yes, I suspected this was the case. Just after posting I tested it at http://www.botsvsbrowsers.com/SimulateUserAgent.asp and Baidu was indeed being redirected (straight back to Baidu :D).

Great stuff!

8-)

ozzy47 10-12-2011 11:22 PM

In your ban list you have InternetSeer.com

That would not be the same as "Internet Seer Spider", would it? That one is a website monitoring service.

Simon Lloyd 10-13-2011 12:37 AM

You may see "Internet Seer Spider" in your WOL, but you need to know its full UA (here's one: Mozilla/5.0 (compatible; Atwatch-InternetSeer monitoring; MSIE 6.0; ) Gecko) so we can decipher which part, if any, is being blocked. Maybe this will help you get them back: http://www.internetseer.com/help/faq.xtp

Just a word though: InternetSeer has always been just an email scraper. They trade off "monitoring" your site, or of course you can pay them to do it; either way they are archiving email addresses found on your site. Still want to use 'em?

As a final word, I did mention when posting the list of bad bots (both here in the thread and in the txt file) to prune out those that you want to be able to see your site. You might also like to prune out ia_archiver, and you should definitely prune out DA and custo (as should most people).
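One way to work out which ban-list entry, if any, catches a particular user agent is to test each entry against the full UA string. The sketch below is in Python rather than the mod's PHP, and it assumes a case-insensitive substring match, which the thread does not confirm; the function name and the sample list are hypothetical.

```python
from typing import List


def matching_entries(user_agent: str, ban_list: List[str]) -> List[str]:
    """Return the ban-list entries found inside the user agent string."""
    ua = user_agent.lower()
    # Assumption: an entry "matches" when it occurs as a case-insensitive substring.
    return [entry for entry in ban_list if entry.lower() in ua]


ua = "Mozilla/5.0 (compatible; Atwatch-InternetSeer monitoring; MSIE 6.0; ) Gecko"
print(matching_entries(ua, ["InternetSeer.com", "InternetSeer", "ia_archiver"]))
```

Under that substring assumption, only the shorter entry InternetSeer matches the UA quoted above; InternetSeer.com does not, because ".com" never appears in the string. Whether the real mod behaves the same way depends on how it compares entries.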

Simon Lloyd 10-14-2011 03:18 AM

Beta testing is going well :). Those that are using the beta, please PM me any results you have or bugs. If all is well I'll release it as stable next week ;)

ForceHSS 10-14-2011 07:27 AM

Will post here and not PM. So far no bugs in the bot posting. I still have to test the other settings; I am sure others have, but I will test them as well.

Simon Lloyd 10-14-2011 08:25 AM

Quote:

Originally Posted by ForceHSS (Post 2257069)
Will post here and not PM. So far no bugs in the bot posting. I still have to test the other settings; I am sure others have, but I will test them as well.

Posting here is great, other folk get to know how it's going and if anything what to look out for :)

GreyGhost 10-14-2011 09:34 AM

Quote:

Originally Posted by Simon Lloyd (Post 2257084)
Posting here is great, other folk get to know how it's going and if anything what to look out for :)

You're welcome to quote my PMs here Simon, if you feel they'll be at all informative.
Although I will be removing the images after a few days so you may want to leave them out.

8-)

ForceHSS 10-14-2011 10:09 AM

The email part of it doesn't seem to work.

ozzy47 10-14-2011 04:21 PM

I still see Baidu hitting my site every so often, and the last one that was banned was 10/11/2011 according to forum post and output.txt

Simon Lloyd 10-14-2011 04:47 PM

Well, Baidu is persistent :). Remember that you are getting the notification BEFORE a thread or forum loads properly for them.

@ForceHSS & GreyGhost I'll work on the emailing a little later :)

stator 10-18-2011 06:17 AM

I received this error after activating the "Write to log" option:

Quote:

Warning: fopen(output.txt) [function.fopen]: failed to open stream: Permission denied in [path]/includes/functions.php(7207) : eval()'d code on line 99

stator 10-18-2011 06:28 AM

The highlighted word in the photo below is missing "i"

https://vborg.vbsupport.ru/external/2011/10/28.jpg

Btw, does the redirect URL work, or do I have to put something else?

thnx

Simon Lloyd 10-18-2011 09:18 AM

Quote:

Originally Posted by stator (Post 2258475)
I received this error after activating the "Write to log" option:

This looks like you do not have write permissions on your server!
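The fopen warning quoted above means the web server process could not open output.txt for writing. A quick way to check that kind of problem is sketched below in Python (the mod itself is PHP). The file name output.txt comes from the error message; the helper name and the idea of checking the parent directory when the file does not exist yet are assumptions for illustration.

```python
import os


def log_is_writable(path: str) -> bool:
    """Report whether the current process could write to the log file at `path`."""
    if os.path.exists(path):
        # The file exists: it must itself be writable.
        return os.access(path, os.W_OK)
    # The file does not exist yet: its directory must allow creating it.
    directory = os.path.dirname(os.path.abspath(path))
    return os.access(directory, os.W_OK)


print(log_is_writable("output.txt"))
```

Run it as the same user the web server runs as, from the directory the script writes to; if it prints False, the file or directory permissions need adjusting.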

Simon Lloyd 10-18-2011 09:20 AM

Quote:

Originally Posted by stator (Post 2258478)
The highlighted word in the photo below is missing "i"

https://vborg.vbsupport.ru/attachmen...1&d=1318922782

Btw, does the redirect URL work, or do I have to put something else?

thnx

I will fix the typo in a future release (if I remember); it has nothing to do with the operation of the mod. The redirect URL works fine. As the instructions say, you cannot add http:// in that box but can use www. or just the domain. A word of warning: redirecting to Google is a very bad idea!
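The rule for the redirect box (domain only, no http://) could be enforced along these lines. This is a Python sketch under the assumption that the scheme is prepended to the box value when the redirect is built; the thread does not show how the mod actually does it, and the function name is hypothetical.

```python
def build_location(box_value: str) -> str:
    """Turn the redirect box value (a bare domain) into a full redirect URL."""
    host = box_value.strip()
    if not host:
        raise ValueError("redirect target is empty")
    if "://" in host:
        # The instructions say not to enter http:// in the box.
        raise ValueError("enter the domain only, without http://")
    # Assumption: the scheme is added here, not typed by the admin.
    return "http://" + host


print(build_location("www.example.com"))
```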

stator 10-19-2011 05:02 PM

Quote:

Originally Posted by Simon Lloyd (Post 2258503)
redirecting to Google is a very bad idea!

This is what I am asking about. What do you suggest?

Simon Lloyd 10-19-2011 06:05 PM

I made some suggestions a page or two back! Didn't the product you downloaded already have a URL in the box?

stator 10-20-2011 08:25 AM

Quote:

Originally Posted by Simon Lloyd (Post 2259041)
I made some suggestions a page or two back! Didn't the product you downloaded already have a URL in the box?

No, it didn't.

ForceHSS 10-20-2011 08:56 AM

Quote:

Originally Posted by stator (Post 2259259)
No, it didn't.

https://vborg.vbsupport.ru/showpost....7&postcount=94

use this one
www.klikhierniet.net

Simon Lloyd 10-20-2011 10:43 AM

Thanks ForceHSS, it must just be in the beta. I will release a fix for emailing in the next day or so when I've finished work.

Simon Lloyd 10-20-2011 10:44 AM

Quote:

Originally Posted by stator (Post 2259259)
No, it didn't.

You don't have this marked as installed, so I have no idea which version you downloaded!

dszuecs 10-20-2011 01:45 PM

Just installed, works as described.
My problem: I had "Create New Thread" activated and didn't see the checkbox -.-"
No forum ID was given, so now my Top10 Stats shows two threads called "Activity from Bot No. 0 (Baidu) in your...", with no user and no forum. If I try to delete them, it tells me that I don't have access :S

Any tips on how to delete those two posts?
Thanks!


All times are GMT.