Spider Watcher
Version: 1.0.0 B10, by mikelbeck
Developer Last Online: Feb 2016

vBulletin Version: 3.5.4
Released: 01-05-2006
Last Update: 08-08-2006
Installs: 194
DB Changes, Uses Plugins, Template Edits, Additional Files, Is in Beta Stage
No support by the author.

Spider Watcher
Author: Mikel Beck (mikel.beck@elite-computing.net)


This hack keeps track of the spiders (search engine robots) that visit your forum. Every time a guest visits a page, the guest's IP address, user agent, and the page they visited are logged to the database.

When somebody views the spider statistics page, this data is "rolled up": the raw data is collated, the spider's name is determined by comparing the user agent to the data in the spiders_vbulletin.xml file, and the number of pages and visits is summarized and written back to the database. In addition, any data from non-bots is removed.
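
The roll-up described above can be sketched roughly as follows. This is a minimal illustration in Python, not the hack's own PHP code: the names `KNOWN_SPIDERS`, `identify_spider` and `roll_up`, the substring matching, and the sample rows are all assumptions made for the example, and it only handles spiders that appear in the known list.

```python
from collections import defaultdict

# Hypothetical signatures: substring of the user agent -> display name.
# The hack itself reads its spider list from an XML file instead.
KNOWN_SPIDERS = {
    "Googlebot": "Google",
    "Slurp": "Yahoo! Slurp",
}


def identify_spider(user_agent):
    """Return the spider's name, or None if no known signature matches."""
    lowered = user_agent.lower()
    for ident, name in KNOWN_SPIDERS.items():
        if ident.lower() in lowered:
            return name
    return None


def roll_up(raw_hits):
    """Collapse raw (ip, user_agent, page) rows into per-spider totals.

    Rows whose user agent matches no known spider are dropped.
    """
    summary = defaultdict(lambda: {"hits": 0, "pages": set(), "ips": set()})
    for ip, user_agent, page in raw_hits:
        name = identify_spider(user_agent)
        if name is None:
            continue  # not a recognised bot: discard the row
        entry = summary[name]
        entry["hits"] += 1
        entry["pages"].add(page)
        entry["ips"].add(ip)
    return dict(summary)


# Made-up sample rows: two spider hits and one ordinary browser.
raw_hits = [
    ("68.142.249.15", "Mozilla/5.0 (compatible; Yahoo! Slurp)", "/index.php"),
    ("68.142.249.19", "Mozilla/5.0 (compatible; Yahoo! Slurp)", "/showthread.php?t=1"),
    ("203.0.113.7", "Mozilla/5.0 (Windows; U) Firefox/1.5", "/index.php"),
]
report = roll_up(raw_hits)
```

Here `report` ends up with a single entry for the one spider seen, and the browser row is discarded, which mirrors the "data from non-bots is removed" step.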

The data is then displayed in an easy-to-read format for your viewing pleasure.

If the user viewing the report has permissions to view IP addresses, these are displayed as well.

A live version of the report from one of my sites can be seen here: http://www.happyhourpub.com/spiders.php

Also see the attached screenshot for an example.


Revision History:
1.0.0 Beta 1 - 01/05/2006
- Initial Release

1.0.0 Beta 2 - 01/06/2006
- Included templates for spiders.php
- Removed text from templates, added them as phrases

1.0.0 Beta 3 - 01/07/2006
- Split up the display of "known" and "unknown" spiders

1.0.0 Beta 4 - 01/25/2006
- Corrected potential SQL injection issue in plug-in
- Reduced the number of SQL queries required to display statistics
- Corrected date/time display issue

1.0.0 Beta 5 - 02/01/2006
- Reduced the number of SQL queries required to display statistics

1.0.0 Beta 6 - 02/08/2006
- No release

1.0.0 Beta 7 - 02/11/2006
- Corrected issue with "unknown" spiders not being displayed properly.
- Added tracking of the type of spider (search spider, link checker, etc.)

1.0.0 Beta 8 - 02/19/2006
- Changed the display of IP addresses to be a pop-up so they're not all displayed on the main page.
- Combined the spiders that have the same name but different user agents.

1.0.0 Beta 9 - 03/10/2006
- Changed the display to group similar spiders together (search spiders, http check spiders, etc)

1.0.0 Beta 10 - 08/08/2006
- Changed how the rollup functions. Instead of rolling up every time somebody views the spider page, it rolls up once per hour.
- Corrected a few bugs here and there, mostly related to removing entries from the database.

Installation Instructions
1. Upload spiders.php to the root of your forum.
2. Upload spiders_rollup.php to the includes/cron directory.
3. Import the file product-spiderwatcher.xml using the Manage Products module.
4. Add a link to spiders.php on your navbar or footer.
5. Add a cron job with the following information:
Title: Spider Watcher Rollup
Day of the Week: *
Day of the Month: *
Hour: *
Minute: 0 - - -
Log entries: Yes
Filename: ./includes/cron/spiders_rollup.php
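
To illustrate what those schedule fields mean, here is a small Python sketch that treats "*" as "any value" and checks a point in time against the settings. It only shows the meaning of the fields; `SCHEDULE`, `field_matches` and `is_due` are hypothetical names, and this is not how vBulletin's own task scheduler is implemented.

```python
from datetime import datetime

# The settings from the instructions: every weekday, every day of the
# month, every hour, at minute 0. "*" stands for "any value".
SCHEDULE = {"weekday": "*", "day": "*", "hour": "*", "minute": 0}


def field_matches(setting, value):
    """A field matches when it is the wildcard or equals the value."""
    return setting == "*" or setting == value


def is_due(schedule, moment):
    """Return True if `moment` satisfies every field of the schedule."""
    return (
        field_matches(schedule["weekday"], moment.weekday())
        and field_matches(schedule["day"], moment.day)
        and field_matches(schedule["hour"], moment.hour)
        and field_matches(schedule["minute"], moment.minute)
    )


on_the_hour = is_due(SCHEDULE, datetime(2006, 8, 8, 14, 0))
half_past = is_due(SCHEDULE, datetime(2006, 8, 8, 14, 30))
```

With these settings only the minute field restricts anything, so the roll-up is due once per hour, on the hour.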

Upgrade Instructions
1. Upload (and overwrite) spiders.php to the root of your forum.
2. Upload spiders_rollup.php to the includes/cron directory.
3. Import the file product-spiderwatcher.xml using the Manage Products module. Make sure the "Allow Overwrite" option is set to "Yes".
4. Add a link to spiders.php on your navbar or footer.
5. Add a cron job with the following information:
Title: Spider Watcher Rollup
Day of the Week: *
Day of the Month: *
Hour: *
Minute: 0 - - -
Log entries: Yes
Filename: ./includes/cron/spiders_rollup.php

***UPGRADE NOTE***
When you upgrade from version 1.0.0 Beta 7 to 1.0.0 Beta 8 your existing spider data will be lost!


To make sure that you can identify as many spiders as possible, you should grab the latest spiderlist.xml and use it to replace the spiders_vbulletin.xml file in your forumhome/includes/xml/ directory. The latest list is available in this thread: http://www.vbulletin.com/forum/showthread.php?t=76662
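
As a rough illustration of how a spider list like this can be turned into a lookup table, here is a Python sketch. The XML layout in `SAMPLE_XML` (a `spider` element with an `ident` attribute and a `name` child) is an assumption made for the example, so check it against your actual file; `load_spiders` is a hypothetical helper, not part of the hack.

```python
import xml.etree.ElementTree as ET

# Assumed layout, for illustration only; the real file may differ.
SAMPLE_XML = """
<searchspiders>
    <spider ident="Googlebot">
        <name>Google</name>
    </spider>
    <spider ident="Slurp">
        <name>Yahoo! Slurp</name>
    </spider>
</searchspiders>
"""


def load_spiders(xml_text):
    """Map each lower-cased user agent signature to the spider's name."""
    root = ET.fromstring(xml_text)
    spiders = {}
    for node in root.iter("spider"):
        ident = node.get("ident")
        name = node.findtext("name")
        if ident and name:  # skip incomplete entries
            spiders[ident.lower()] = name.strip()
    return spiders


spiders = load_spiders(SAMPLE_XML)
```

The more entries the file contains, the more user agents can be matched to a name, which is why replacing it with the latest list helps.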

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
#132
01-25-2006, 01:48 AM
XtremeOffroad
Join Date: Jul 2005
Posts: 236
Thanks: 0
Thanked 0 times in 0 posts

Getting this error
Warning: array_multisort(): Argument #1 is expected to be an array or a sort flag in /spiders.php on line 223

And still no spiders show on my site.
I had asked for help with this in the past with no reply.

#133
01-25-2006, 02:19 AM
Zia
Join Date: Dec 2005
Location: golpo.net
Posts: 931
Thanks: 0
Thanked 0 times in 0 posts

Sounds nice.
This detects search spiders and unknown spiders, but how can I show the bot's home page URL on the bots list page (the generated page), below the bot's name?

Just curious, has anyone tried to detect unknown spiders?

Thanks

#134
01-25-2006, 02:24 AM
Megareus Rex
Join Date: Feb 2004
Location: Pennsylvania, USA
Posts: 243
Thanks: 0
Thanked 0 times in 0 posts

A concern and suggestion.

One of the spiders (Yahoo! Slurp) is returning literally DOZENS of IPs. Here's the particular block I have now (and I've only had it for a couple days):

Quote:
0.0.0.115, 68.142.249.15, 68.142.249.19, 68.142.249.20, 68.142.249.21, 68.142.249.25, 68.142.249.31, 68.142.249.35, 68.142.249.38, 68.142.249.42, 68.142.249.44, 68.142.249.47, 68.142.249.48, 68.142.249.58, 68.142.249.61, 68.142.249.67, 68.142.249.68, 68.142.249.73, 68.142.249.81, 68.142.249.83, 68.142.249.84, 68.142.249.85, 68.142.249.91, 68.142.249.92, 68.142.249.96, 68.142.249.98, 68.142.249.99, 68.142.249.102, 68.142.249.108, 68.142.249.110, 68.142.249.112, 68.142.249.115, 68.142.249.116, 68.142.249.117, 68.142.249.119, 68.142.249.120, 68.142.249.124, 68.142.249.127, 68.142.249.132, 68.142.249.152, 68.142.249.154, 68.142.249.159, 68.142.249.164, 68.142.249.168, 68.142.249.176, 68.142.249.188, 68.142.249.191, 68.142.249.201, 68.142.249.207, 68.142.249.208, 68.142.250.11, 68.142.250.12, 68.142.250.13, 68.142.250.14, 68.142.250.15, 68.142.250.22, 68.142.250.26, 68.142.250.28, 68.142.250.36, 68.142.250.43, 68.142.250.53, 68.142.250.65, 68.142.250.73, 68.142.250.77, 68.142.250.79, 68.142.250.83, 68.142.250.86, 68.142.250.91, 68.142.250.93, 68.142.250.101, 68.142.250.102, 68.142.250.111, 68.142.250.114, 68.142.250.116, 68.142.250.118, 68.142.250.119, 68.142.250.122, 68.142.250.124, 68.142.250.126, 68.142.250.130, 68.142.250.131, 68.142.250.141, 68.142.250.142, 68.142.250.147, 68.142.250.152, 68.142.250.153, 68.142.250.154, 68.142.250.155, 68.142.250.158, 68.142.250.163, 68.142.250.165, 68.142.250.167, 68.142.250.169, 68.142.250.172, 68.142.250.176, 68.142.250.180, 68.142.250.181, 68.142.250.183, 68.142.250.187, 68.142.250.189, 68.142.250.193, 68.142.250.199, 68.142.250.202, 68.142.250.203, 68.142.250.208, 68.142.251.14, 68.142.251.18, 68.142.251.19, 68.142.251.23, 68.142.251.25, 68.142.251.34, 68.142.251.46, 68.142.251.47, 68.142.251.59, 68.142.251.69, 68.142.251.81, 68.142.251.85, 68.142.251.86, 68.142.251.92, 68.142.251.96, 68.142.251.101, 68.142.251.110, 68.142.251.113, 68.142.251.119, 68.142.251.123, 68.142.251.129, 68.142.251.132, 68.142.251.144, 68.142.251.153, 
68.142.251.154, 68.142.251.155, 68.142.251.159, 68.142.251.167, 68.142.251.170, 68.142.251.180, 68.142.251.184, 68.142.251.185, 68.142.251.190, 68.142.251.196, 68.142.251.201, 68.142.251.203, 202.160.180.127
You might want to consider adding an IP display limit with a "View All" link to see all the IPs at your request, rather than displaying all on the main page itself.
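
One way the suggested display limit could work, sketched in Python rather than the hack's own PHP. The names `preview_ips` and `render_ip_cell`, the default limit of 5, and the wording of the link text are all assumptions made for the illustration.

```python
def preview_ips(ips, limit=5):
    """Return the first `limit` IPs and the number left hidden."""
    shown = ips[:limit]
    hidden = max(len(ips) - limit, 0)
    return shown, hidden


def render_ip_cell(ips, limit=5):
    """Build the text for one spider's IP column on the main page."""
    shown, hidden = preview_ips(ips, limit)
    text = ", ".join(shown)
    if hidden:
        # A real page would make this a link to the full list.
        text += f" ... (View All: {hidden} more)"
    return text


sample = ["68.142.249.15", "68.142.249.19", "68.142.249.20"]
short_cell = render_ip_cell(sample, limit=2)
full_cell = render_ip_cell(sample, limit=5)
```

A spider with only a handful of addresses is shown in full, while a long list such as the one quoted above is cut off after the first few entries.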

#135
01-26-2006, 12:35 AM
mikelbeck
Join Date: Jul 2005
Location: 4C6F6E672049736C616E642C2
Posts: 238
Thanks: 0
Thanked 0 times in 0 posts

Chris-777: Still working on that. Don't worry, I haven't forgotten about you.

XtremeOffroad: That sounds like there's nothing to be sorted, so the multisort is failing. I will put a check in there in the next version. As for no spiders... Do they visit your site? Are you sure that the plug-in is installed and active?

Zia: What if I made the spider's name a hotlink instead of displaying the URL under it?

Megareus Rex: I'll see what I can do with that. Maybe just display 5 or so with a "view others" link to show the rest? Keep in mind that only users who have the "view IP" privilege (such as admins or moderators) will see the IPs. All other users see nothing.

Everybody: What about the number of queries?

#136
01-26-2006, 12:37 AM
Guest210212002
Guest
Posts: n/a

Thanks dude. If it's just me that has the problem, no worries, I'll get by. It's a fantastic hack otherwise.

/salute

#137
01-26-2006, 12:43 AM
Megareus Rex
Join Date: Feb 2004
Location: Pennsylvania, USA
Posts: 243
Thanks: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by mikelbeck
Chris-777: Still working on that. Don't worry, I haven't forgotten about you.

XtremeOffroad: That sounds like there's nothing to be sorted, so the multisort is failing. I will put a check in there in the next version. As for no spiders... Do they visit your site? Are you sure that the plug-in is installed and active?

Zia: What if I made the spider's name a hotlink instead of displaying the URL under it?

Megareus Rex: I'll see what I can do with that. Maybe just display 5 or so with a "view others" link to show the rest? Keep in mind that only users who have the "view IP" privilege (such as admins or moderators) will see the IPs. All other users see nothing.

Everybody: What about the number of queries?
A "View All" link would definitely be good. I know only those with IP viewing perms can see them, but I'm one of those people, and it's rather annoying to see the Yahoo! Slurp spider growing ever larger... :P

As for the # of queries....well....just check this link out (look at the bottom):
http://www.evermoreforums.com/forums/spiders.php

I've been getting 800+ queries, though it seems to be growing (was only 700+ a few hours ago). So yeah...loads.

#138
01-26-2006, 12:56 AM
mikelbeck
Join Date: Jul 2005
Location: 4C6F6E672049736C616E642C2
Posts: 238
Thanks: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by Megareus Rex
I've been getting 800+ queries, though it seems to be growing (was only 700+ a few hours ago). So yeah...loads.
Yeah, that's way too much. I'll have to take another look at that.

I know how to fix it; the problem is I don't want to do it that way. What happens is this... When a guest arrives at a page, the spider plug-in writes a line to the database: just the timestamp, user agent and page. When you run the spiders page, it "rolls up" all of the data so it's displayed in a nice, neat format and all the old data is removed. That's what's creating so many queries.

If I were to change the plug-in to check whether there's already a record for that spider in the database and then just update it as the page is loaded, there would be no need to "roll up". But I think that would add a few queries on each page load (by a non-user, meaning a guest or spider) and I don't think that's a good idea.

I'll see what else I can come up with.
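
For comparison, the update-as-you-go alternative described in this post might look something like the following Python sketch. It uses an in-memory SQLite database purely as a stand-in; the table `spider_totals`, its columns, and the helpers `open_db` and `record_hit` are hypothetical and are not taken from the hack.

```python
import sqlite3


def open_db():
    """Create an in-memory database with one running-total row per spider."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE spider_totals ("
        " name TEXT PRIMARY KEY,"
        " hits INTEGER NOT NULL,"
        " last_seen INTEGER NOT NULL)"
    )
    return conn


def record_hit(conn, name, timestamp):
    """Bump the spider's total, inserting a row on its first visit.

    Returns how many statements were needed for this page load.
    """
    cur = conn.execute(
        "UPDATE spider_totals SET hits = hits + 1, last_seen = ? WHERE name = ?",
        (timestamp, name),
    )
    queries = 1
    if cur.rowcount == 0:  # no existing row for this spider yet
        conn.execute(
            "INSERT INTO spider_totals (name, hits, last_seen) VALUES (?, 1, ?)",
            (name, timestamp),
        )
        queries += 1
    conn.commit()
    return queries


conn = open_db()
first = record_hit(conn, "Yahoo! Slurp", 1138236960)
second = record_hit(conn, "Yahoo! Slurp", 1138237020)
hits = conn.execute(
    "SELECT hits FROM spider_totals WHERE name = ?", ("Yahoo! Slurp",)
).fetchone()[0]
```

The trade-off is visible in the return value: every guest or spider page load costs at least one extra write, and identifying the spider from its user agent would come on top of that, whereas the roll-up design defers all of that work to the statistics page.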

#139
01-26-2006, 02:52 AM
Megareus Rex
Join Date: Feb 2004
Location: Pennsylvania, USA
Posts: 243
Thanks: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by mikelbeck
Yeah, that's way too much. I'll have to take another look at that.

I know how to fix it; the problem is I don't want to do it that way. What happens is this... When a guest arrives at a page, the spider plug-in writes a line to the database: just the timestamp, user agent and page. When you run the spiders page, it "rolls up" all of the data so it's displayed in a nice, neat format and all the old data is removed. That's what's creating so many queries.

If I were to change the plug-in to check whether there's already a record for that spider in the database and then just update it as the page is loaded, there would be no need to "roll up". But I think that would add a few queries on each page load (by a non-user, meaning a guest or spider) and I don't think that's a good idea.

I'll see what else I can come up with.
Just letting you know, it's almost at 1100 queries, with a page generation time of 7+ seconds.

Not to mention the Yahoo Slurp spider's # of IPs has more than doubled (perhaps close to tripled) from earlier.

#140
01-26-2006, 07:15 AM
Brandon Sheley
Join Date: Mar 2005
Location: Google Kansas
Posts: 4,678
Thanks: 0
Thanked 0 times in 0 posts

Quote:
Originally Posted by mikelbeck
Are the times & dates being displayed properly for you?
Have the number of queries decreased for you?

Yes, queries are much lower, under 100.

Not sure how to check the time. Is there a way to reset it so it shows zero spiders viewed?

This would be handy, I think, so I could reset once a week or once a month to see how the spider traffic is.

Thank you, good hack still.

* Brandon Sheley upgraded with no problems

http://locoforum.com/forums/spiders.php
Page generated in 0.22789 seconds with 79 queries

#141
01-26-2006, 10:56 AM
Megareus Rex
Join Date: Feb 2004
Location: Pennsylvania, USA
Posts: 243
Thanks: 0
Thanked 0 times in 0 posts

And now my queries are up to 1750....
All times are GMT.

