Go Back   vb.org Archive > vBulletin Modifications > Archive > vB.org Archives > vBulletin 3.5 > vBulletin 3.5 Add-ons
FAQ Community Calendar Today's Posts Search

Reply
 
Thread Tools
Spider Watcher Details »»
Spider Watcher
Version: 1.0.0 B10, by mikelbeck mikelbeck is offline
Developer Last Online: Feb 2016 Show Printable Version Email this Page

Version: 3.5.4 Rating:
Released: 01-05-2006 Last Update: 08-08-2006 Installs: 194
DB Changes Uses Plugins Template Edits
Additional Files Is in Beta Stage  
No support by the author.

Spider Watcher
Author: Mikel Beck (mikel.beck@elite-computing.net)


This hack keeps track of the spiders (Search Engine robots) that visit your fourm. Every time a guest visits a page, the guest's IP address, user agent and the page they visited are logged to the database.

When somebody views the spider statistics page, this data is "rolled up", meaning the raw data is collated, the spider's name is determined by comparing the user agent to data contained in the spiders_bulletin.xml file, and the number of pages and visits is summarized and writted back to the database. In addition, and data from non-bots is removed.

The data is then displayed in a easy to read format for your viewing pleasure.

If the user viewing the report has permissions to view IP addresses, these are displayed as well.

A live version of the report from one of my sites can be seen here: http://www.happyhourpub.com/spiders.php

Also see the attached screenshot for an exmaple.


Revision History:
1.0.0 Beta 1 - 01/05/2006
- Initial Release

1.0.0 Beta 2 - 01/06/2006
- Included templates for spiders.php
- Removed text from templates, added them as phrases

1.0.0 Beta 3 - 01/07/2006
- Split up the display of "known" and "unknown" spiders

1.0.0 Beta 4 - 01/25/2006
- Corrected potentional SQL injection issue in plug-in
- Reduced the number of SQL queries required to display statistics
- Corrected date/time display issue

1.0.0 Beta 5 - 02/01/2006
- Reduced the number of SQL queries required to display statistics

1.0.0 Beta 6 - 02/08/2006
- No release

1.0.0 Beta 7 - 02/11/2006
- Corrected issue with "unknown" spiders not being displayed properly.
- Added tracking of the type of spider (searchspider, link checker, etc)

1.0.0 Beta 8 - 02/19/2006
- Change the display of IP addresses to be a pop-up so they're all not displayed on the main page.
- Combined the spiders that have the same name but different user agents.

1.0.0 Beta 9 - 03/10/2006
- Changed the display to group similar spiders together (search spiders, http check spiders, etc)

1.0.0 Beta 10 - 08/08/2006
- Changed how the rollup functions. Instead of rolling up every time somebody views the spider page, it rolls up once per hour.
- Corrected a few bugs here and there, mostly related to removing entries from the database.

Installation Instructions
1. Upload spiders.php to the root of your forum.
2. Upload spiders_rollup.php to the includes/cron directory.
3. Import the file product-spiderwatcher.xml using the Manage Products module.
4. Add a link to spiders.php on your navbar or footer.
5. Add a cron job with the following information:
Title: Spider Watcher Rollup
Day of the Week: *
Day of the Month: *
Hour: *
Minute: 0 - - -
Log entries: Yes
Filename: ./includes/cron/spiders_rollup.php

Upgrade Instructions
1. Upload (and overwrite) spiders to the root of your forum.
2. Upload spiders_rollup.php to the includes/cron directory.
3. Import the file product-spiderwatcher.xml using the Manage Products module. Make sure the "Allow Overwrite" option is set to "Yes".
4. Add a link to spiders.php on your navbar or footer.
5. Add a cron job with the following information:
Title: Spider Watcher Rollup
Day of the Week: *
Day of the Month: *
Hour: *
Minute: 0 - - -
Log entries: Yes
Filename: ./includes/cron/spiders_rollup.php

***UPGRADE NOTE***
When you upgrade from version 1.0.0 Beta 7 to 1.0.0 Beta 8 your existing spider data will be lost!


To make sure that you can decode the maximum amount of spiders, you should grab the latest spiderlist.xml and replace the spiders_vbulletin.xml file in your forumhome/includes/xml/ directory with the one from this thread: http://www.vbulletin.com/forum/showthread.php?t=76662

Supporters / CoAuthors

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #292  
Old 04-22-2006, 07:01 PM
The Notorious's Avatar
The Notorious The Notorious is offline
 
Join Date: Jan 2006
Posts: 118
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Good hack, installed.
Reply With Quote
  #293  
Old 04-27-2006, 03:24 PM
Trana Trana is offline
 
Join Date: Apr 2005
Posts: 604
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

What does everyone recommend I do with this nasty beast:

Unknown Spider
Mozilla/5.0 (000000000; 0; 000 000 00 0 000000; 00000; 0000000000) 00000000000000 000000000000000

82.48.249.204
82.56.186.64

Is there an ID for it? If it won't identify itself should I just start blocking the IPs? Can I restrict them in my robots.txt?

Thanks!
Reply With Quote
  #294  
Old 04-27-2006, 04:00 PM
DementedMindz DementedMindz is offline
 
Join Date: Jan 2006
Posts: 1,474
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

personally im about to remove this hack... it has high queries and it sucks they dont update...
Reply With Quote
  #295  
Old 04-27-2006, 04:35 PM
Trana Trana is offline
 
Join Date: Apr 2005
Posts: 604
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Quote:
Originally Posted by DementedMindz
personally im about to remove this hack... it has high queries and it sucks they dont update...
Its pretty stupid that it won't update old records in the DB, and since it uses so many queries it is obviously coded poorly.
But won't the high number of queries on affect the server when you load the spider page? I only allow admins to load it, so people shouldn't be hitting it all the time.
Reply With Quote
  #296  
Old 04-27-2006, 04:55 PM
DementedMindz DementedMindz is offline
 
Join Date: Jan 2006
Posts: 1,474
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

yeah they affect the server load... im removing it since i dont think it will be updated anytime soon... if it is then i may reinstall but as of right now it sucks editing from sql or having 500 bots unknown
Reply With Quote
  #297  
Old 04-27-2006, 05:15 PM
Trana Trana is offline
 
Join Date: Apr 2005
Posts: 604
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

But it only affects the server when you actually load the page right? Its not an ongoing thing with cron or anything is it?
Reply With Quote
  #298  
Old 04-27-2006, 10:04 PM
mikelbeck's Avatar
mikelbeck mikelbeck is offline
 
Join Date: Jul 2005
Location: 4C6F6E672049736C616E642C2
Posts: 238
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

It's not "stupid" that it doesn't update, the installation instructions say to update your spiders xml file BEFORE you install this. If you do that, then you won't have anything to update, as it'd be using the latest data.

The only time there should be a lot of queries is when it hasn't "rolled up" the data recently. After that first hit the number of queries should drop.

Quote:
and since it uses so many queries it is obviously coded poorly
That may be the stupidiest thing I've ever heard.

Look, people, I wrote this thing for myself. I posted it here thinking that other people may benefit from it. If you want to use it, use it. If you don't, then don't. But don't go bashing it or my code. If you think you can do better, then write your own.

I will update it when I can, right now my top priority is finding a new job so I can support my family. Updating the the freebie hacks that I write when the mood strikes me aren't even in the top 10 of my list of priorities right now.

{edit}
One other thing... The first version that I put out deciphered the spider data when the user viewed the spiders page. If the spiders xml file had been updated after the data was collected it would be interpreted properly. But that version had way too many queries and everybody complained about it. Now the spider data is decoded when the spider visits and it's written to the database. The spiders page just collates all of that data and doesn't do any deciphering.
Reply With Quote
  #299  
Old 04-27-2006, 10:40 PM
xStylezx xStylezx is offline
 
Join Date: Mar 2006
Posts: 50
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

This hack is great bro.The only real issue i see with it is that sometimes the date and time doesnt get updated on a fresh spider visit to the forum.Sometimes the date reflects the true last visit and sometimes it doesnt.That is about the only thing i wish could be fixed up.Other than that,this hack is excellent and dont worry about harsh opinions of it,i cant wait to see the next update
Reply With Quote
  #300  
Old 04-28-2006, 02:36 PM
Trana Trana is offline
 
Join Date: Apr 2005
Posts: 604
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

Mike,

I appreciate this hack, it gives me good visibility to what the spiders are actually doing. I'm not a very good coder so I can't improve on your work.

My previous message was just stating that 58 queries is an extraordinarily large amount for such a simple page. It was my impression that the data would be regenerated when the spider page was loaded only, now I realize that its doing a lot of work each time a spider hits the site.

Thanks again.
Reply With Quote
  #301  
Old 04-29-2006, 05:48 AM
Logikos Logikos is offline
 
Join Date: Jan 2003
Posts: 2,924
Благодарил(а): 0 раз(а)
Поблагодарили: 0 раз(а) в 0 сообщениях
Default

great hack, but there lays a very crtitcal problem for me at least. When viewing the page.

Page generated in 0.59078 seconds with 50 queries [Server Loads: 0.41 0.22 : 0.10]

Thats quite alot of quires. The more spiders it fetches, the more queires this adds.
Reply With Quote
Reply


Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off

Forum Jump


All times are GMT. The time now is 02:38 PM.


Powered by vBulletin® Version 3.8.12 by vBS
Copyright ©2000 - 2024, vBulletin Solutions Inc.
X vBulletin 3.8.12 by vBS Debug Information
  • Page Generation 0.09052 seconds
  • Memory Usage 2,317KB
  • Queries Executed 26 (?)
More Information
Template Usage:
  • (1)SHOWTHREAD
  • (1)ad_footer_end
  • (1)ad_footer_start
  • (1)ad_header_end
  • (1)ad_header_logo
  • (1)ad_navbar_below
  • (1)ad_showthread_beforeqr
  • (2)bbcode_quote
  • (1)footer
  • (1)forumjump
  • (1)forumrules
  • (1)gobutton
  • (1)header
  • (1)headinclude
  • (1)modsystem_post
  • (1)navbar
  • (6)navbar_link
  • (120)option
  • (1)pagenav
  • (1)pagenav_curpage
  • (4)pagenav_pagelink
  • (2)pagenav_pagelinkrel
  • (11)post_thanks_box
  • (11)post_thanks_button
  • (1)post_thanks_javascript
  • (1)post_thanks_navbar_search
  • (11)post_thanks_postbit_info
  • (10)postbit
  • (11)postbit_onlinestatus
  • (11)postbit_wrapper
  • (1)spacer_close
  • (1)spacer_open
  • (1)tagbit_wrapper 

Phrase Groups Available:
  • global
  • inlinemod
  • postbit
  • posting
  • reputationlevel
  • showthread
Included Files:
  • ./showthread.php
  • ./global.php
  • ./includes/init.php
  • ./includes/class_core.php
  • ./includes/config.php
  • ./includes/functions.php
  • ./includes/class_hook.php
  • ./includes/modsystem_functions.php
  • ./includes/functions_bigthree.php
  • ./includes/class_postbit.php
  • ./includes/class_bbcode.php
  • ./includes/functions_reputation.php
  • ./includes/functions_post_thanks.php 

Hooks Called:
  • init_startup
  • init_startup_session_setup_start
  • init_startup_session_setup_complete
  • cache_permissions
  • fetch_threadinfo_query
  • fetch_threadinfo
  • fetch_foruminfo
  • style_fetch
  • cache_templates
  • global_start
  • parse_templates
  • global_setup_complete
  • showthread_start
  • showthread_getinfo
  • forumjump
  • showthread_post_start
  • showthread_query_postids
  • showthread_query
  • bbcode_fetch_tags
  • bbcode_create
  • showthread_postbit_create
  • postbit_factory
  • postbit_display_start
  • post_thanks_function_post_thanks_off_start
  • post_thanks_function_post_thanks_off_end
  • post_thanks_function_fetch_thanks_start
  • post_thanks_function_fetch_thanks_end
  • post_thanks_function_thanked_already_start
  • post_thanks_function_thanked_already_end
  • fetch_musername
  • postbit_imicons
  • bbcode_parse_start
  • bbcode_parse_complete_precache
  • bbcode_parse_complete
  • postbit_display_complete
  • post_thanks_function_can_thank_this_post_start
  • pagenav_page
  • pagenav_complete
  • tag_fetchbit_complete
  • forumrules
  • navbits
  • navbits_complete
  • showthread_complete