Remove Bot SIDs from URL Requests
Version: 1.00, by calorie
Developer Last Online: Nov 2023

vBulletin Version: 3.0.3
Released: 09-29-2004 Last Update: Never Installs: 16
 
No support by the author.

Hack 1: vB303_remove_bot_sids_1.txt

I noticed that some bots send requests with session IDs (SIDs) in the URL. One such bot is msnbot. I don't know how its current code works, but it seems to treat each different SID as a new link. Here is a quick and dirty hack to prevent this. You need the $_SERVER['HTTP_USER_AGENT'] and $_SERVER['REQUEST_URI'] array elements, or their equivalents, to use this mini hack.

The first step of the hack prevents SIDs from appearing in new requests. The second step forces a redirect to strip the SIDs from links the bot already remembers. There is no need to apply this hack for bots that have google, slurp@inktomi, or yahoo! slurp as part of their user agent. Like I said, it is a quick and dirty hack, but it does what I need it to do. If you use this mod, a click of the install button is appreciated.
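The redirect step can be sketched roughly as follows. This is not the code from the attached text file (which is PHP and is not reproduced on this page); it is a minimal Python illustration of the same idea, assuming the SID travels in the `s` query parameter as in vBulletin URLs. The names `BOT_MARKERS`, `strip_sid`, and `redirect_target` are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of user-agent substrings for bots that should not see SIDs.
BOT_MARKERS = ("msnbot",)


def strip_sid(request_uri):
    """Return the URI without its `s` parameter, or None if it has no `s`."""
    parts = urlsplit(request_uri)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    if not any(key == "s" for key, _ in pairs):
        return None
    kept = [(key, value) for key, value in pairs if key != "s"]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def redirect_target(user_agent, request_uri, markers=BOT_MARKERS):
    """Return the SID-free URI to redirect a listed bot to, else None."""
    agent = user_agent.lower()
    if not any(marker in agent for marker in markers):
        return None  # ordinary visitors keep their session ID
    return strip_sid(request_uri)


if __name__ == "__main__":
    agent = "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
    uri = "/forum/showthread.php?s=6f58fd4a031cc78ad7043cdfa0de3287&t=952"
    print(redirect_target(agent, uri))
```

A request handler would answer with a 301 and a Location header whenever `redirect_target` returns a value, and serve the page normally when it returns None. Redirecting only when an `s` parameter is actually present avoids sending the bot in a loop.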

Hack 2: vB303_remove_bot_sids_2.txt

To see the list of bots that may appear on the Who's Online list, go to AdminCP >> vBulletin Options >> Who's Online Options and look at these settings: Spider Identification Strings, Enable Spider Display, and Spider Identification Description.

However, according to http://www.vbulletin.com/forum/showthread.php?t=112022, the user agents that don't receive session IDs are hard coded in the sessions.php file. The bots that are hard coded are as follows: google, slurp@inktomi, yahoo! slurp

Thus the bots in the "who's online list" and the bots in the "remove SID list" are currently not the same. This hack withholds session IDs from the bots listed in the vBulletin Options, rather than only from those that are hard coded in the script.

Some pages may already have been crawled by a bot that is not hard coded in the "remove SID list", so that bot may keep spidering with session IDs in its requests. This hack includes an optional step to remove session IDs from such requests via a redirect.
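The matching idea behind this second hack can be sketched as follows. This is an illustration in Python, not the contents of the attached file, and it assumes the Spider Identification Strings option holds one case-insensitive substring per line. The function names are hypothetical.

```python
def parse_spider_strings(option_text):
    """Split the option text into lowercase substrings, skipping blank lines."""
    return [line.strip().lower() for line in option_text.splitlines() if line.strip()]


def matching_spider(user_agent, spider_strings):
    """Return the first configured string found in the user agent, else None."""
    agent = user_agent.lower()
    for marker in spider_strings:
        if marker in agent:
            return marker
    return None


def should_issue_sid(user_agent, spider_strings):
    """Only visitors that match no configured spider string get a SID in URLs."""
    return matching_spider(user_agent, spider_strings) is None


if __name__ == "__main__":
    # Hypothetical option value, as it might be entered in the AdminCP.
    spiders = parse_spider_strings("googlebot\nmsnbot\n\n  Slurp \n")
    print(matching_spider("msnbot/1.0 (+http://search.msn.com/msnbot.htm)", spiders))
    print(should_issue_sid("Mozilla/5.0 (Windows NT 5.1)", spiders))
```

Driving the check from the same list that feeds the Who's Online display keeps the two bot lists from drifting apart.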

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #22
02-28-2005, 02:24 AM
calorie
Join Date: May 2003
Posts: 2,804
Thanks given: 0
Thanked 0 times in 0 posts

Here is just one example where a bot indexed a thread before vB 3.0.6 was released and still remembers the SID, even though it is now reindexing a vB 3.0.7 board:
Code:
207.46.98.56 - - [26/Feb/2005:10:31:06 -0800] "GET /forum/showthread.php?s=6f58fd4a031cc78ad7043cdfa0de3287&t=952 HTTP/1.0" 301 0 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"

207.46.98.56 - - [26/Feb/2005:10:34:22 -0800] "GET /forum/showthread.php?t=952 HTTP/1.0" 200 36160 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"
Note that the first request contains a SID and is 301 redirected, hence the second request without the SID. Run tail -f on your access log and watch.
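If watching the log by eye gets tedious, a small script can flag the same pattern. The sketch below is Python and assumes the combined log format shown in the example; the regular expressions and the `summarize` helper are hypothetical, and the SID pattern tolerates stray whitespace after `s=`.

```python
import re

# Combined log format: ip, identity, user, [time], "request", status, size, "referer", "agent".
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<uri>.*?) (?P<proto>HTTP/[\d.]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)

# A 32-character hex SID passed as the `s` query parameter.
SID_PARAM = re.compile(r'[?&]s=\s*[0-9a-f]{32}', re.IGNORECASE)


def summarize(line):
    """Return status, SID presence, and user agent for one log line, or None."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "status": int(match.group("status")),
        "has_sid": bool(SID_PARAM.search(match.group("uri"))),
        "agent": match.group("agent"),
    }


if __name__ == "__main__":
    sample = (
        '207.46.98.56 - - [26/Feb/2005:10:31:06 -0800] '
        '"GET /forum/showthread.php?s=6f58fd4a031cc78ad7043cdfa0de3287&t=952 HTTP/1.0" '
        '301 0 "-" "msnbot/1.0 (+http://search.msn.com/msnbot.htm)"'
    )
    print(summarize(sample))
```

A line with `has_sid` true and status 301 means the redirect step fired. A bot line with `has_sid` true and status 200 means that bot is still being served pages under a SID.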