vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 4.x Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=245)
-   -   Miscellaneous Hacks - Ban Spiders by User Agent (https://vborg.vbsupport.ru/showthread.php?t=268208)

vb50kgpoo 11-12-2012 01:28 PM

Quote:

Originally Posted by ForceHSS (Post 2380510)
What is their full host name?

That is it!
They go by those generic names only, nothing else.

Simon Lloyd 11-12-2012 04:31 PM

Quote:

Originally Posted by vb50kgpoo (Post 2380473)
Hi Simon
Yours is a great product. I made the mistake of uninstalling it in order to use AbyssGuard, which is plagued with problems. I have now reinstalled Ban Spiders By User Agent. One question, are there any ramifications in banning \wbot[\/\-] with your mod? I ask as putting \wbot[\/\-] directly into my htaccess banning mechanism causes issues.
Regards / RSVP

In my mod you are banning any user agent that contains one of the strings in your list. I very much doubt that \wbot[\/\-] is found in any user agent, as that looks like a regular expression. In my mod simply wbot will do, if that's in their user agent.
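To illustrate the difference between the two kinds of matching, here is a minimal Python sketch. It is not the mod's code (the mod is a vBulletin plugin); it only assumes, from the description above, that the mod does a plain "is this string somewhere in the user agent" test. The two user agents and the function names are made up for the example, and whether the real mod ignores upper/lower case is not stated in this thread, so the sketch compares case-sensitively.

```python
import re

# The string from the question, kept exactly as typed.
PATTERN = r"\wbot[\/\-]"

def banned_by_substring(user_agent, needle):
    # Plain containment test: the needle is treated as literal text.
    return needle in user_agent

def banned_by_regex(user_agent, pattern):
    # Regular-expression test, the way an .htaccess rule would read it.
    return re.search(pattern, user_agent) is not None

# Hypothetical user agents, for illustration only.
ua_new = "Mozilla/5.0 (compatible; newbot/1.0)"
ua_example = "Mozilla/5.0 (compatible; examplebot/1.0)"

# As literal text, the pattern never appears in a user agent.
print(banned_by_substring(ua_new, PATTERN))       # False
# As a regex, it matches any word character followed by "bot/" or "bot-".
print(banned_by_regex(ua_new, PATTERN))           # True
print(banned_by_regex(ua_example, PATTERN))       # True
# The plain string "wbot" only hits agents that really contain "wbot".
print(banned_by_substring(ua_new, "wbot"))        # True
print(banned_by_substring(ua_example, "wbot"))    # False
```

So under this assumption the regex-looking entry would match nothing in the list, while plain wbot catches fewer agents than the regex would.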

Quote:

Originally Posted by vb50kgpoo (Post 2380504)
Also.......

Does anyone know what these bots are;

Robot ID   Hits   Bandwidth   Last visit       Hits on robots.txt
robot      772    8576221     20121111093454   0
crawl      699    9556953     20121108085243   0
spider     5      114750      20121106065956   0

Bad bots using generic names?

Those are just stats from cPanel AWStats, and they mean nothing other than that spiders were identified with those words in their identifier.

Quote:

Originally Posted by vb50kgpoo (Post 2380545)
That is it!
They go by those generic names only, nothing else.

Those are not their user agents. Read some of the links that I took the time and trouble to post in the mod description on how to find the user agent.

bzcomputers 11-20-2012 03:09 PM

Quote:

Originally Posted by TheSupportForum (Post 2379008)
Won't MSIE 1 block MSIE Beta 10?

Which means MSIE 8 and 9 visitors only.

It won't block MSIE Beta 10, but it will block MSIE 10. Best to remove MSIE 1 from your list now. I haven't seen any references in the output.txt file of anything prior to MSIE 5 so removing MSIE 1 probably won't cause any problems.

https://vborg.vbsupport.ru/external/2012/11/15.jpg


Edit: If you were wondering what the IP Address 108.2.106.107 is, it is Verizon's Search Engine (www.verizon.net) definitely not something you would want to block.

Max Taxable 11-20-2012 03:35 PM

Quote:

Originally Posted by bzcomputers (Post 2382955)
It won't block MSIE Beta 10, but it will block MSIE 10. Best to remove MSIE 1 from your list now. I haven't seen any references in the output.txt file of anything prior to MSIE 5 so removing MSIE 1 probably won't cause any problems

Using the time based Mod in my signature, I often see user agents with early IE, such as 3, 4 and 5, but you're right - I haven't seen any IE 1 or 2 in so long, there's no doubt it would be good to remove MSIE 1 from the ban list.

Thanks for the information!

Simon Lloyd 11-20-2012 05:57 PM

Quote:

Originally Posted by bzcomputers (Post 2382955)
It won't block MSIE Beta 10, but it will block MSIE 10. Best to remove MSIE 1 from your list now. I haven't seen any references in the output.txt file of anything prior to MSIE 5 so removing MSIE 1 probably won't cause any problems.

https://vborg.vbsupport.ru/external/2012/11/15.jpg


Edit: If you were wondering what the IP Address 108.2.106.107 is, it is Verizon's Search Engine (www.verizon.net) definitely not something you would want to block.

It will block MSIE Beta 10 if it appears like that in the user agent. This mod will block ALL instances where the exact string in your list is found in the UA.
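A small Python sketch of why MSIE 1 in the list also catches MSIE 10. It is only an illustration of plain substring matching as described in this thread, not the mod's actual code. The function name is invented, the IE 8/9/10 strings are typical samples, and the "MSIE Beta 10" agent is hypothetical.

```python
def is_banned(user_agent, ban_list):
    # Banned if any list entry appears anywhere in the user agent,
    # character for character (spaces included).
    return any(entry in user_agent for entry in ban_list)

# Sample user agents; the last one is made up to show the "Beta" case.
agents = {
    "IE 8": "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)",
    "IE 9": "Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0)",
    "IE 10": "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)",
    "IE Beta 10": "Mozilla/5.0 (compatible; MSIE Beta 10; Windows NT 6.1)",
}

# "MSIE 1" is the start of "MSIE 10.0", so IE 10 is caught;
# "MSIE Beta 10" does not contain "MSIE 1", so it is not.
with_msie_1 = {name: is_banned(ua, ["MSIE 1"]) for name, ua in agents.items()}
print(with_msie_1)

# Listing the exact text "MSIE Beta 10" catches only that agent.
with_beta = {name: is_banned(ua, ["MSIE Beta 10"]) for name, ua in agents.items()}
print(with_beta)
```

Under these assumptions both earlier posts hold: MSIE 1 hits IE 10 but not a "Beta 10" agent, and a "Beta 10" agent is only blocked if that exact text is in the list.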

My Hattiesburg 12-15-2012 09:45 PM

Okay, I've installed this and it seems to be working fine, but I have some questions.

At first the Baidu spider was hammering us, hitting the site about every 8 seconds, but now it seems to have more or less given up on us. Yandex, on the other hand, seems to have intensified its attempted crawls. At first it was hitting the site about every 25 seconds, but over the course of this install it has ranged everywhere from every 1 second to where it's at now, about every 1.5 minutes. The 1.5 minute attempts have only occurred in the last couple of days.

A couple of days ago when Yandex was hitting us every 1 second, we had some server load issues. I don't know if this is related but it seems it might be and I'm wondering if logging the blocks to a text file might be counterproductive in that area, writing to the file every second or so.

Also, Yandex is using the same IP address every time, so I thought it might be best to just block it using the .htaccess file, but that doesn't seem to have had any effect. Is this mod redirecting Yandex before it has a chance to read the .htaccess file or is Yandex simply ignoring it?

smirkley 12-15-2012 10:24 PM

Your server would be the one to decide if Yandex is blocked by the .htaccess file or not.
If the IP is denied in .htaccess, your server will block it before any vB module loads, or any other page for that matter.

Simon Lloyd 12-16-2012 05:29 AM

Smirkley is right, .htaccess is loaded before anything else. As for my mod, using Yandex as a blocking string will block it. If it is still showing, that's because either they don't actually have Yandex in the user agent or you've entered more than just Yandex as the string. Remember, my mod bans anything that contains exactly the string you entered (including spaces), so if your strings in my mod look like this:
Baidu
Yandex
SoSo
.....etc
then yandex will be blocked, however if it looks like this:
Baidu
Yandex123
SoSo
....etc
then any bot with just Yandex in its string, or any other kind of Yandex like YandexWorld, will not be blocked, but anything containing Yandex123 will be blocked.
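The two lists above can be tried out with a short Python sketch. Again, this is not the mod's code, just plain substring matching as described in this post. The helper name is invented, and the comparison here is case-sensitive because the thread does not say whether the mod ignores case. The user agent is a typical YandexBot string.

```python
def first_match(user_agent, ban_list):
    # Return the first list entry found inside the user agent, or None.
    # Empty entries are skipped so a blank line cannot ban everyone.
    for entry in ban_list:
        if entry and entry in user_agent:
            return entry
    return None

yandex_ua = "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"

good_list = ["Baidu", "Yandex", "SoSo"]
bad_list = ["Baidu", "Yandex123", "SoSo"]
spaced_list = ["Baidu", "Yandex ", "SoSo"]  # note the trailing space

print(first_match(yandex_ua, good_list))    # Yandex
print(first_match(yandex_ua, bad_list))     # None
print(first_match(yandex_ua, spaced_list))  # None
```

The last case shows why stray spaces matter: "Yandex " with a trailing space is not found in "YandexBot", so nothing is blocked.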

As for writing to a file, just turn that bit off. It's there simply for test purposes, troubleshooting or checking on individual user agents.

My Hattiesburg 12-16-2012 08:20 PM

Your mod is blocking Yandex, but I just figured since it was using the same IP address every time it might be better to just go ahead and block that IP.

I guess I didn't do something right in the .htaccess file because Yandex was getting past it and was getting blocked by the mod.

Simon Lloyd 12-17-2012 03:52 PM

Yandex doesn't always use the same IP. Somewhere earlier in the pages for this mod I think I posted how to do it in .htaccess, if you wish.

