Spider Watcher (https://vborg.vbsupport.ru/showthread.php?t=104582)
vb.org Archive, vBulletin 3.5 Add-ons

Trana 04-21-2006 02:04 AM

OK, I fixed the xml file. I'm assuming that it won't update the previous crawls in the database?

DementedMindz 04-21-2006 02:54 AM

Nah, it won't... that's why I've been asking for an update button or something. If you're good with SQL you can edit the rows yourself, which is what I did, or you can clear them out with SQL and let the spiders crawl again.
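
A rough sketch of the "edit it yourself with SQL" route, for anyone who wants to see its shape. The table and column names (spider_watcher, nice_bot, bot) come from the INSERT statement quoted later in this thread. Everything else is an assumption: matching by case-insensitive substring of the user agent is a guess at how idents are applied, the sample rows are made up, and an in-memory SQLite database stands in for the board's MySQL database so the example can run on its own.

```python
import sqlite3


def relabel_unknown(conn, ident, name):
    """Rename 'Unknown Spider' rows whose user agent contains `ident`.

    Returns the number of rows that were updated.
    """
    cur = conn.execute(
        "UPDATE spider_watcher SET nice_bot = ? "
        "WHERE nice_bot = 'Unknown Spider' "
        "AND instr(lower(bot), lower(?)) > 0",
        (name, ident),
    )
    conn.commit()
    return cur.rowcount


# Stand-in database with only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spider_watcher ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, nice_bot TEXT, bot TEXT, page TEXT)"
)
# Made-up sample rows: one crawl logged before the spider list was fixed,
# and one ordinary browser hit that should stay untouched.
conn.executemany(
    "INSERT INTO spider_watcher (nice_bot, bot, page) VALUES (?, ?, ?)",
    [
        ("Unknown Spider", "ABCdatos BotLink/1.0", "index"),
        ("Unknown Spider", "Mozilla/4.0 (compatible; MSIE 6.0)", "index"),
    ],
)

updated = relabel_unknown(conn, "abcdatos", "ABCdatos BotLink")
labels = [row[0] for row in conn.execute(
    "SELECT nice_bot FROM spider_watcher ORDER BY id")]
print(updated, labels)
```

The alternative mentioned above, emptying the table and letting the spiders crawl again, avoids any guessing about which old rows belong to which spider.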

Trana 04-21-2006 03:27 AM

I'm just going to wait for one to be recognized so I know the xml format is working and then I'll empty the table.

Thanks.

DementedMindz 04-21-2006 03:33 AM

no problem :)

mikelbeck 04-21-2006 07:08 PM

Quote:

Originally Posted by DementedMindz
Nah, it won't... that's why I've been asking for an update button or something. If you're good with SQL you can edit the rows yourself, which is what I did, or you can clear them out with SQL and let the spiders crawl again.

That's coming... I haven't had time to get much of anything done lately, but it's on the list of things that will be in the next version.

DementedMindz 04-21-2006 07:44 PM

Yeah, that would be nice... I can't wait to see the new version :)

Trana 04-21-2006 09:33 PM

Hmmm, they are still not being recognized by my site. This is what I have at the top of my file:

HTML Code:

<?xml version="1.0" encoding="ISO-8859-1" ?>
<searchspiders>
<!DOCTYPE spiderlist (View Source for full doctype...)>
- <!--
 $Id$
-->

- <spiderlist version="1.0">
- <spider ident="ABCdatos">
<name>ABCdatos BotLink</name>
<type>searchspider</type>
<info>http://www.robotstxt.org/wc/active/html/abcdatos.html</info>
<email>botlink+AEA-abcdatos.com</email>
- <addresses>
<address type="CIDR">217.126.0.0/18</address>
</addresses>
</spider>
- <spider ident="abot/">

Does this look like yours?

Thanks.
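
One quick way to tell whether a spiders file is raw XML or a copy of a browser's rendered view (the leading "- " markers and the "View Source for full doctype..." placeholder in the paste above look like the latter) is to run it through an XML parser. The sketch below is only that, a sketch: the element and attribute names (spiderlist, spider, ident) are taken from the paste, both sample strings are shortened stand-ins, and it says nothing about which exact layout vBulletin or Spider Watcher expects.

```python
import xml.etree.ElementTree as ET


def list_spider_idents(xml_text):
    """Parse the text as XML and return every <spider ident="..."> value.

    Raises xml.etree.ElementTree.ParseError if the text is not well-formed.
    """
    root = ET.fromstring(xml_text)
    return [spider.get("ident") for spider in root.iter("spider")]


# Shortened stand-in for a raw spiders file.
RAW_FILE = """<spiderlist version="1.0">
  <spider ident="ABCdatos">
    <name>ABCdatos BotLink</name>
    <addresses>
      <address type="CIDR">217.126.0.0/18</address>
    </addresses>
  </spider>
  <spider ident="abot/">
    <name>abot</name>
  </spider>
</spiderlist>"""

# Shortened stand-in for text copied out of a browser's XML tree view.
BROWSER_VIEW = """- <spiderlist version="1.0">
- <spider ident="ABCdatos">
<name>ABCdatos BotLink</name>
</spider>
</spiderlist>"""

print(list_spider_idents(RAW_FILE))

try:
    list_spider_idents(BROWSER_VIEW)
    browser_view_ok = True
except ET.ParseError as err:
    browser_view_ok = False
    print("not well-formed XML:", err)
```

If the file fails this check, re-downloading it directly, rather than copying it out of the browser window, is the first thing to try.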

DementedMindz 04-21-2006 10:26 PM

Here is mine... just upload and overwrite yours, or use it as an example. I'm going to update mine in a few days as I have to add some new bots to it...

Trana 04-22-2006 03:17 PM

That worked great! Thanks again.

DementedMindz 04-22-2006 03:32 PM

No problem, glad I could help...

The Notorious 04-22-2006 07:01 PM

Good hack, installed.

Trana 04-27-2006 03:24 PM

What does everyone recommend I do with this nasty beast:

Unknown Spider
Mozilla/5.0 (000000000; 0; 000 000 00 0 000000; 00000; 0000000000) 00000000000000 000000000000000

82.48.249.204
82.56.186.64

Is there an ID for it? If it won't identify itself should I just start blocking the IPs? Can I restrict them in my robots.txt?

Thanks!
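
On the question of identifying those addresses: the spider list quoted earlier in the thread describes known spiders partly by CIDR range, so one cheap check is whether the two IPs fall inside any listed range. A minimal sketch follows, using only the single range that appears in this thread; a real check would load every range from the spiders file. As far as I know, robots.txt rules are keyed on the User-agent name rather than on IP address, and only well-behaved crawlers honor them, so a spider that won't identify itself probably has to be handled by IP at the server or firewall level instead.

```python
import ipaddress


def matching_ranges(ip, cidrs):
    """Return the CIDR ranges from `cidrs` that contain the address `ip`."""
    addr = ipaddress.ip_address(ip)
    return [cidr for cidr in cidrs if addr in ipaddress.ip_network(cidr)]


# The only range quoted in this thread (the ABCdatos entry).
KNOWN_RANGES = ["217.126.0.0/18"]

for ip in ("82.48.249.204", "82.56.186.64"):
    hits = matching_ranges(ip, KNOWN_RANGES)
    print(ip, "->", hits if hits else "no known spider range")
```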

DementedMindz 04-27-2006 04:00 PM

Personally, I'm about to remove this hack... it runs a lot of queries, and it sucks that old records don't update...

Trana 04-27-2006 04:35 PM

Quote:

Originally Posted by DementedMindz
Personally, I'm about to remove this hack... it runs a lot of queries, and it sucks that old records don't update...

It's pretty stupid that it won't update old records in the DB, and since it uses so many queries it is obviously coded poorly.
But won't the high number of queries only affect the server when you load the spider page? I only allow admins to load it, so people shouldn't be hitting it all the time.

DementedMindz 04-27-2006 04:55 PM

Yeah, they affect the server load... I'm removing it since I don't think it will be updated anytime soon. If it is, then I may reinstall, but as of right now it sucks editing from SQL or having 500 unknown bots.

Trana 04-27-2006 05:15 PM

But it only affects the server when you actually load the page, right? It's not an ongoing thing with cron or anything, is it?

mikelbeck 04-27-2006 10:04 PM

It's not "stupid" that it doesn't update, the installation instructions say to update your spiders xml file BEFORE you install this. If you do that, then you won't have anything to update, as it'd be using the latest data.

The only time there should be a lot of queries is when it hasn't "rolled up" the data recently. After that first hit the number of queries should drop.

Quote:

and since it uses so many queries it is obviously coded poorly
That may be the stupidest thing I've ever heard.

Look, people, I wrote this thing for myself. I posted it here thinking that other people may benefit from it. If you want to use it, use it. If you don't, then don't. But don't go bashing it or my code. If you think you can do better, then write your own.

I will update it when I can. Right now my top priority is finding a new job so I can support my family. Updating the freebie hacks that I write when the mood strikes me isn't even in the top 10 of my list of priorities right now.

{edit}
One other thing... The first version that I put out deciphered the spider data when the user viewed the spiders page. If the spiders xml file had been updated after the data was collected it would be interpreted properly. But that version had way too many queries and everybody complained about it. Now the spider data is decoded when the spider visits and it's written to the database. The spiders page just collates all of that data and doesn't do any deciphering.

xStylezx 04-27-2006 10:40 PM

This hack is great, bro. The only real issue I see with it is that sometimes the date and time don't get updated on a fresh spider visit to the forum. Sometimes the date reflects the true last visit and sometimes it doesn't. That is about the only thing I wish could be fixed. Other than that, this hack is excellent, and don't worry about harsh opinions of it. I can't wait to see the next update.

Trana 04-28-2006 02:36 PM

Mike,

I appreciate this hack, it gives me good visibility to what the spiders are actually doing. I'm not a very good coder so I can't improve on your work.

My previous message was just stating that 58 queries is an extraordinarily large number for such a simple page. I was under the impression that the data would be regenerated only when the spider page was loaded; now I realize that it's doing a lot of work each time a spider hits the site.

Thanks again.

Logikos 04-29-2006 05:48 AM

Great hack, but there is a very critical problem, for me at least, when viewing the page:

Page generated in 0.59078 seconds with 50 queries [Server Loads: 0.41 0.22 : 0.10]

That's quite a lot of queries. The more spiders it fetches, the more queries this adds.

The Notorious 05-01-2006 01:14 AM

This hack killed my server today. It was working fine earlier today, but it messed up my SQL and took the server down.
I got like hundreds of e-mails like this one:

Database error in vBulletin 3.5.4:

Invalid SQL:
INSERT INTO spider_watcher (nice_bot, bot, ip_address, page, type, info, timestamp) VALUES ('Unknown Spider','Mozilla/4.0 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)','1198658994','index','','', NOW());

MySQL Error : Duplicate entry '431497' for key 1
Error Number : 1062
Date : Sunday, April 30th 2006 @ 08:17:10 PM
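
A side note for reading that error mail: the ip_address value '1198658994' looks like an IPv4 address stored as a plain integer, in the style of PHP's ip2long(). That is an assumption, since the hack's storage format is not spelled out in this thread. Under that assumption it can be turned back into dotted form as follows; the mask handles the negative values that ip2long() can produce on 32-bit PHP builds.

```python
import ipaddress


def int_to_ip(value):
    """Convert an integer-encoded IPv4 address (int or numeric string) to dotted form."""
    # Masking to 32 bits maps negative ip2long()-style values back into range.
    return str(ipaddress.IPv4Address(int(value) & 0xFFFFFFFF))


print(int_to_ip("1198658994"))  # 71.114.21.178
```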

mikelbeck 05-01-2006 01:47 AM

Quote:

Originally Posted by The Notorious
This hack killed my server today. It was working fine earlier today, but it messed up my SQL and took the server down.
I got like hundreds of e-mails like this one:

Database error in vBulletin 3.5.4:

Invalid SQL:
INSERT INTO spider_watcher (nice_bot, bot, ip_address, page, type, info, timestamp) VALUES ('Unknown Spider','Mozilla/4.0 Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)','1198658994','index','','', NOW());

MySQL Error : Duplicate entry '431497' for key 1
Error Number : 1062
Date : Sunday, April 30th 2006 @ 08:17:10 PM

I don't think this error was caused by this hack... The key that it's complaining about is an auto_increment field, as far as I know there's no way to force it to try to insert a duplicate key like this is reporting. And the sql line shown above does not have anything about the key field that it's complaining about.

mikelbeck 05-01-2006 01:50 AM

Quote:

Originally Posted by LiveWire
Great hack, but there is a very critical problem, for me at least, when viewing the page:

Page generated in 0.59078 seconds with 50 queries [Server Loads: 0.41 0.22 : 0.10]

That's quite a lot of queries. The more spiders it fetches, the more queries this adds.

True, and I don't see any way to reduce those queries right now. The only thing I can think of is to run a job that will occasionally "roll up" the spider data, so that when you view the spiders page it only displays data that's already been collated. But then the data displayed won't always be up to date. If anybody else has any ideas, I'm open to suggestions.
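
To make the "roll up" idea concrete, here is one possible shape for it, offered as a sketch rather than as how Spider Watcher does or will do it. A periodic job collapses the per-visit rows into one summary row per spider with a single GROUP BY, and the spiders page then needs only one SELECT against the summary. The spider_watcher columns come from the INSERT quoted in this thread; the spider_rollup table, its columns, the sample rows, and the use of SQLite in place of MySQL are all assumptions made so the example runs on its own.

```python
import sqlite3


def roll_up(conn):
    """Rebuild the summary table from the raw per-visit rows."""
    conn.execute("DELETE FROM spider_rollup")
    conn.execute(
        "INSERT INTO spider_rollup (nice_bot, hits, last_visit) "
        "SELECT nice_bot, COUNT(*), MAX(timestamp) "
        "FROM spider_watcher GROUP BY nice_bot"
    )
    conn.commit()


def spiders_page(conn):
    """What the spiders page would read: one query, already collated."""
    return conn.execute(
        "SELECT nice_bot, hits, last_visit FROM spider_rollup ORDER BY nice_bot"
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE spider_watcher ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, nice_bot TEXT, page TEXT, timestamp TEXT)"
)
# Hypothetical summary table, not part of the hack as described here.
conn.execute(
    "CREATE TABLE spider_rollup (nice_bot TEXT PRIMARY KEY, hits INTEGER, last_visit TEXT)"
)
# Made-up visits; ISO timestamps sort correctly as text.
conn.executemany(
    "INSERT INTO spider_watcher (nice_bot, page, timestamp) VALUES (?, ?, ?)",
    [
        ("ABCdatos BotLink", "index", "2006-04-30 10:00:00"),
        ("ABCdatos BotLink", "showthread", "2006-04-30 12:30:00"),
        ("Unknown Spider", "index", "2006-04-30 11:15:00"),
    ],
)

roll_up(conn)
summary = spiders_page(conn)
print(summary)
```

The trade-off is the one already described: the page is only as fresh as the last roll-up run.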

The Notorious 05-01-2006 01:57 AM

Quote:

Originally Posted by mikelbeck
I don't think this error was caused by this hack... The key that it's complaining about is an auto_increment field, as far as I know there's no way to force it to try to insert a duplicate key like this is reporting. And the sql line shown above does not have anything about the key field that it's complaining about.

Well as soon as I removed the hack it started working fine...

The Notorious 05-01-2006 03:16 AM

I started to get the same e-mails again. Can you tell me exactly what this hack modifies on my board so I can look if anything is left after the uninstall?
Thanks

mikelbeck 05-01-2006 05:40 PM

Quote:

Originally Posted by The Notorious
I started to get the same e-mails again. Can you tell me exactly what this hack modifies on my board so I can look if anything is left after the uninstall?
Thanks

If you used the un-install function, it should have removed everything.

It creates a few plug-ins:
Spider Watcher
Spiders Location (Part 1)
Spiders Location (Part 2)
Spider Watcher Template Group

and it creates a few templates in the "Spider Watcher Templates" group.

hambil 05-01-2006 05:46 PM

Quote:

Originally Posted by mikelbeck
True, and I don't see any way to reduce those queries right now. The only thing I can think of is to run a job that will occasionally "roll up" the spider data, so that when you view the spiders page it only displays data that's already been collated. But then the data displayed won't always be up to date. If anybody else has any ideas, I'm open to suggestions.

It's not a high-traffic page. A lot of users jump onto the 'queries bad' bandwagon a little too quickly. Queries make the vB world run.

The Notorious 05-02-2006 11:43 AM

Quote:

Originally Posted by mikelbeck
If you used the un-install function, it should have removed everything.

It creates a few plug-ins:
Spider Watcher
Spiders Location (Part 1)
Spiders Location (Part 2)
Spider Watcher Template Group

and it creates a few templates in the "Spider Watcher Templates" group.

Thanks mate, I stopped getting the e-mails and the server is running just fine. :cool:

Dr.Viggy 05-03-2006 11:06 PM

nice hack, thanks.

*installed

mikelbeck 05-04-2006 01:18 AM

Quote:

Originally Posted by The Notorious
Thanks mate, I stopped getting the e-mails and the server is running just fine. :cool:

With or without this hack installed?

kurtbarker 05-04-2006 01:30 AM

i've been running this hack for a while, really nice work there mate...
and with no dramas ;)

The Notorious 05-05-2006 03:54 PM

Quote:

Originally Posted by mikelbeck
With or without this hack installed?

Without.

kofoid 05-12-2006 09:05 PM

Should it work instantly?

Zia 05-13-2006 09:41 AM

Quote:

Originally Posted by The Notorious
Without.


Really... really strange...

It's working SMOOTHLY on our board... why are you facing problems?

I guess there's another thing that's creating the problems.
Do you use the "User access based on post count" hack? If yes, try without the post count hack...

maranello 05-17-2006 10:27 PM

I uploaded the new spider list from the given link and replaced the old one on my board with it, but I still get Unknown Spider ALL the time.

http://www.total-clan.com/forums/spiders.php

maranello 05-18-2006 01:11 AM

hello anyone?

Zia 05-18-2006 04:37 AM

Quote:

Originally Posted by maranello
I uploaded the new spider list from the given link and replaced the old one on my board with it, but I still get Unknown Spider ALL the time.

http://www.total-clan.com/forums/spiders.php


Did you update your spider list XML properly?
Get that from vbulletin.com.

maranello 05-18-2006 04:44 AM

What did I say in my first post? lol :) I updated it :)

maranello 05-18-2006 04:47 AM

http://www.vbulletin.com/forum/showpost.php?p=565415&postcount=12

Isn't this the list we are talking about? The spiders.xml file?

Trana 05-18-2006 01:16 PM

It sounds like you are having the same problem that I had. The format for the spider file is not the same as what VB expects. Read back a few pages in this thread and there is another copy of the spider file you can use to get started.


All times are GMT.