vb.org Archive


SkuZZy 11-29-2002 05:24 AM

google spiders 60% of my threads
 
This hack has been released!

Click here: https://vborg.vbsupport.ru/showthrea...threadid=47087

woohoo :D

After trying the other two known spider scripts out there (overgrown's hack and fastforward's) and having no luck with them, I decided to try Xenon's. Back in mid-October I added the "beta" archive script (made by Xenon, who is awesome) to my forums, and just today Google finally added many of my threads, about 2000, to their engine! FINALLY!!!

http://www2.google.com/search?hl=en&...ttleforums.com

I might consider releasing the hack (Xenon wants me to) if there is enough interest in it... or at least release it in beta form. However, with the release of vb3.0 so close, I'm not sure it would be worth the effort to release it as a full hack... so maybe I'll just release it as a beta the way it is (as seen at http://www.battleforums.com/history) for those who are going to stay with vb2.

Any interest for this to be released? (edit: I've released it, visit https://vborg.vbsupport.ru/showthrea...threadid=47087)

Smoothie 11-29-2002 07:00 AM

yep. :)

thomas 11-29-2002 07:23 AM

I'm definitely interested! :)

GTGT 11-29-2002 10:35 AM

Hell yes!

Is it the best google hack out there?

SkuZZy 11-29-2002 10:43 AM

Quote:

Originally posted by GTGT
Hell yes!

Is it the best google hack out there?

Well, as far as getting threads spidered, yes. It will put all your threads in plain .html and it draws directly from the database, so it's instant. There are some requirements for it though, and it is lacking a few features. For instance, it requires mod_rewrite if you want the files to be .html, and there is no feature in it right now to block out private forums selectively. I could probably add those small features to it though, just wondering if it's worth it since 2.2.9 is supposedly the last version of the 2.x.x series... and vb3 will probably be out in a few weeks. Which is why I started this thread, to see how much interest there is in this hack..... ;)

Xenon 11-29-2002 11:19 AM

well, there will be many users who won't upgrade quickly, so it's worth releasing :)

thanx for the kind words :)

Logician 11-29-2002 11:22 AM

If Stefan coded it, I'm sure it's great. :classic:

I'm just curious about the algorithm: does it compile a .html page for every thread separately and save it on the server, or does it create the html page on the fly whenever it's requested (without it physically existing on the server)?

Whichever way it works, I believe you or Stefan should release it. It will definitely prove useful for many..

Xenon 11-29-2002 11:44 AM

well, the html file part is SkuZZy's work, my script just created URLs free of & and ?....

Velocd 11-29-2002 02:39 PM

Skuzzy forgot to mention that he is still using some of the code from either overgrown's or fastforward's hack, which is why it's producing the .html extensions, and which is why search engines are picking up on it better. The visual appearance and layout of his Archiv Hack is much different from the original, not to mention that the original purpose of the first archiv hack wasn't to allow easier spidering for search engines, but to reduce file size in the database.

Anyway, Skuzzy please if you could share the method for doing this as Logician mentioned. It would be of great help :p

SkuZZy 11-30-2002 03:10 PM

Quote:

Originally posted by Velocd
Skuzzy forgot to mention that he is still using some of the code from either overgrown's or fastforward's hack, which is why it's producing the .html extensions, and which is why search engines are picking up on it better. The visual appearance and layout of his Archiv Hack is much different from the original, not to mention that the original purpose of the first archiv hack wasn't to allow easier spidering for search engines, but to reduce file size in the database.

Anyway, Skuzzy please if you could share the method for doing this as Logician mentioned. It would be of great help :p

I'm not using any code from overgrown's hack. Overgrown's hack never made .html extensions, just directories... but it used 404 errors to do them, which search engines don't like. With these scripts, I simply used mod_rewrite to do the .html part. It was pretty easy and it's all done server-side, so Google can't tell the difference. The .html files don't actually exist though. About fastforward's hack: I'm not using any of his code, but it's the same basic idea. The only difference is that these archive scripts require no hacking of the board. It shouldn't even really be considered a hack, just an "addon". It's as simple as uploading the 3 scripts to a sub-directory and changing a few variables (you need to have mod_rewrite enabled though). With fastforward's hack (not that I have anything against it, it's a good hack) you needed to modify a lot of templates and it gets pretty messy. Plus, with vb3 around the corner, it is silly to hack a board up when an upgrade will soon be here ;)

But anyways, I've made up my mind... I will be releasing this hack. I'm going to add a few variables to it and clean it up a bit and should release it over the weekend, if I get some time. I want to state once again though, this is Xenon's hack and he is the mastermind behind it, I've just spiced it up a bit.

So look forward to seeing this "addon" soon.

Thanks,
SkuZZy
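
As a rough illustration of the idea in this post (static-looking .html URLs that are translated server-side into a dynamic script call, so no .html file ever exists on disk), here is a minimal Python sketch. It is not the released script and not mod_rewrite itself: the URL layout (`t-<id>.html`, `f-<id>.html`), the script name `index.php`, and the parameter names are all assumptions made up for the example.

```python
import re

# Hypothetical URL scheme: /history/t-<threadid>.html and /history/f-<forumid>.html.
# The real archive's URL layout is not given in the thread.
STATIC_URL = re.compile(r"^/history/(?P<kind>[tf])-(?P<id>\d+)\.html$")


def rewrite(path):
    """Translate a static-looking path into a dynamic script URL, or None."""
    match = STATIC_URL.match(path)
    if match is None:
        return None  # not an archive URL, leave it alone
    param = "threadid" if match.group("kind") == "t" else "forumid"
    # index.php is an assumed script name, used for illustration only.
    return "/history/index.php?{}={}".format(param, match.group("id"))


print(rewrite("/history/t-46201.html"))  # /history/index.php?threadid=46201
print(rewrite("/history/f-111.html"))    # /history/index.php?forumid=111
print(rewrite("/history/logo.gif"))      # None
```

The crawler only ever sees the left-hand form, which is why the page looks like a plain file even though it is generated from the database on each request.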

Logician 11-30-2002 04:16 PM

Quote:

Originally posted by SkuZZy
The .html files don't actually exist though.
I was hoping so. This is the wise algorithm. I'm not sure if my server supports the feature, but if it does, I'm definitely interested in this hack.. Thx to you both for coding and sharing..

Chruser 12-01-2002 01:56 AM

An excellent hack, if I may add. I just hope it is as good as it sounds. :)

KelteN 12-01-2002 03:33 AM

Release it! Release it! :D

SkuZZy 12-01-2002 12:11 PM

w00t! ok, the script is all done... nothing special, but definitely cleaned up. The best part about it is how easy it is to set up: just upload all the files to a directory, change 3 variables and that's it. There are also header/footer files so you can control background color, etc.

Anyways, I need a couple of people to beta test it for me. If you're interested, send me a PM here on vb.org or contact me on AOL @ acidrush16. I need about 3.

Thanks,
SkuZZy

ArunanS 12-01-2002 12:44 PM

My archive looks like vB3 :p

I have yet to modify it to use html :)
http://www.noxmedia.net/support/archive

If there are any copyrights to the archive look, please tell me now so I may remove it :)

Also click on News and Announcements, because that is the only forum I have been testing my modifications and posts in :)

SkuZZy 12-01-2002 01:00 PM

Quote:

Originally posted by ArunanS
My archive looks like vB3 :p

I have yet to modify it to use html :)
http://www.noxmedia.net/support/archive

If there are any copyrights to the archive look, please tell me now so I may remove it :)

Also click on News and Announcements, because that is the only forum I have been testing my modifications and posts in :)

Looks nice man. Has Google spidered it yet? I noticed the thread times (they say "NaN" on your site) load slowly, like vBulletin 3.0's do. I sincerely hope Jelsoft fixes that. Some larger archives (with 300+ threads) can take literally 45 seconds to load completely, because the times load one at a time. Am I the only one experiencing this?

ArunanS 12-01-2002 01:13 PM

No...Google hasn't yet, I have yet to make the archive work efficiently. I also haven't put in the modification to make the html files :)

Erwin 12-01-2002 09:33 PM

Just using a modified extended version of fastforward's hack (that completely makes all URLs .html) I've got 35,000 threads in Google, and it's still rising. :D

Velocd 12-02-2002 04:43 AM

I've been considering this rewrite thing to eliminate sessionhash and add .HTML for google and such to index efficiently, but there are many hacks out there for this purpose.

I'm probably going to go with the one Filburt has posted at vBulletin.com, as it requires little hacking, and fastforward's didn't work too well.

What is your version of this Erwin? Is it a hack that could be released, or is it integrated very deeply into your site that to prepare a hack for it would be too troublesome?

I took a quick browse through your forums Erwin, and you did a great job with the rewrite, and how it's all organized ;)

subduck 12-02-2002 06:42 AM

I'm not upgrading to vb3!
So please release it for 2.8 users!

Thanks :)

pimpingfools 12-02-2002 07:34 AM

I second that last post..

Erwin 12-02-2002 09:03 PM

Quote:

Originally posted by Velocd
What is your version of this Erwin? Is it a hack that could be released, or is it integrated very deeply into your site that to prepare a hack for it would be too troublesome?

I took a quick browse through your forums Erwin, and you did a great job with the rewrite, and how it's all organized ;)

Essentially, fastforward's hack, but applied to every page in every thread, also to memberlist, avatarlist, search pages and changed most PHP files so that sessionhash and dynamic URLs are eliminated except for latest posts etc. - too extensive to release. I don't really look forward to vB3 when I have to recode all the files again. :)

Thanks for the compliment too.
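
To make the "sessionhash eliminated" part concrete, here is a small Python sketch of stripping a session parameter from a forum URL so that every visitor (and every crawler) sees the same address for the same thread. It is only an illustration of the idea, not Erwin's modification: the parameter names `s` and `sessionhash` are assumptions for the example.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed names of the session parameter; adjust to whatever the board emits.
SESSION_PARAMS = {"s", "sessionhash"}


def strip_session(url):
    """Return the URL with any session parameter removed from its query string."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in SESSION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_session("showthread.php?s=abc123&threadid=46201"))
# showthread.php?threadid=46201
```

Without this kind of cleanup a spider can see the same thread under a different URL on every visit, which is one reason session-tagged links tend to index poorly.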

indiamike 12-02-2002 09:09 PM

Though it's a little off topic...and I just want to chime in a little bit here....
I use vbportal and Google spiders my site almost every day (usually 20 robots at a time), sometimes twice a day. It doesn't list all the posts in the search engine; at one time it did but dropped a bunch. However, the big pitfall is that Google uses up around one gig of my bandwidth a month. I have a small site and that is still way too high.

I may start blocking Google just because of this....just a little warning if you're inviting Google.

Cheers Anyway
Mike

NTLDR 12-02-2002 09:12 PM

Perhaps you should create a non-graphical style that is used when the googlebot visits? I'm sure this wouldn't be too hard to hack in.
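
A minimal Python sketch of that suggestion: pick a lightweight style when the request's User-Agent looks like a crawler. The style names `default` and `textonly` and the token list are made up for the example, and a real board would hook this into its own style selection.

```python
# Substrings that mark a request as coming from a crawler (assumed list).
CRAWLER_TOKENS = ("googlebot",)


def pick_style(user_agent, default_style="default", crawler_style="textonly"):
    """Return the style name to render with, based on the User-Agent header."""
    ua = (user_agent or "").lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        return crawler_style
    return default_style


print(pick_style("Googlebot/2.1 (+http://www.googlebot.com/bot.html)"))  # textonly
print(pick_style("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"))  # default
```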

SkuZZy 12-02-2002 09:42 PM

Quote:

Originally posted by indiamike
Though it's a little off topic...and I just want to chime in a little bit here....
I use vbportal and Google spiders my site almost every day (usually 20 robots at a time), sometimes twice a day. It doesn't list all the posts in the search engine; at one time it did but dropped a bunch. However, the big pitfall is that Google uses up around one gig of my bandwidth a month. I have a small site and that is still way too high.

I may start blocking Google just because of this....just a little warning if you're inviting Google.

Cheers Anyway
Mike

Good point being made here. I can see how some people might be concerned with this. I should point out, the archive scripts I'm releasing are very, very small in size the way they come. Of course, if you add images and styles into them, then they get much bigger. But using them the way they are, the bandwidth used will be virtually unnoticeable. When it comes to getting Google to archive your posts, the simpler = the better ;)

Erwin 12-02-2002 10:14 PM

You can use robots.txt to block Google or any other spiders spidering specific sections of your site or certain files like .gif etc.
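
For anyone who wants to check a robots.txt before relying on it, here is a small Python sketch using the standard library's `urllib.robotparser`. The directory names are hypothetical. The sketch blocks by directory rather than by file type, since matching on an extension like .gif depends on pattern extensions that not every crawler or parser honors.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Googlebot out of the image directory,
# and keep every crawler out of a private section.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /forum/images/
Disallow: /forum/private/

User-agent: *
Disallow: /forum/private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Googlebot may fetch archive pages but not images.
print(parser.can_fetch("Googlebot", "/forum/history/index.html"))  # True
print(parser.can_fetch("Googlebot", "/forum/images/logo.gif"))     # False
# Other crawlers are only kept out of the private section.
print(parser.can_fetch("OtherBot", "/forum/images/logo.gif"))      # True
print(parser.can_fetch("OtherBot", "/forum/private/notes.html"))   # False
```

Keep in mind that robots.txt is only a request to well-behaved crawlers; it does not protect private forums, which still need the board's own permissions.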

Smoothie 12-03-2002 01:21 AM

Quote:

and there is no feature in it right now to block out private forums selectively. I could probably add those small features to it though
SkuZZy- This is something I need to have before using this. I only have one private forum, but it's the mod forum. Some pretty heavy stuff goes on there. I would need to block this forum.

Also, in your instructions;
Quote:

Add a link to the archive on your main forums page, otherwise google won't know it exists and therefore won't spider it!
What's the best way to do this? Just a plain old link on forumhome? Then other members will be able to view it?

SkuZZy 12-03-2002 01:24 AM

Quote:

Originally posted by Smoothie
SkuZZy- This is something I need to have before using this. I only have one private forum, but it's the mod forum. Some pretty heavy stuff goes on there. I would need to block this forum.

Also, in your instructions; What's the best way to do this? Just a plain old link on forumhome? Then other members will be able to view it?

I'll see about adding something to block out private forums, shouldn't be too hard. About the link, just add a text link to it at the bottom of your forums or whatever. You don't need to do this, you could just add the site to google (http://www.google.com/addurl.html) but linking it on your site is the best and fastest way to get it spidered... plus it will inherit your "PR" which will help the pages rank higher.

DrkFusion 12-03-2002 01:38 AM

Doesn't Xenon's hack block all private forums to guests and normal members?

Smoothie, just add a link at the bottom :)

Smoothie 12-03-2002 01:55 AM

Ok, probably a dumb question, but after I add the link to my forums, what link do I submit to google?

Erwin 12-03-2002 01:58 AM

Add your homepage = google will then spider all the links on your site it can spider.

SkuZZy 12-03-2002 02:01 AM

Quote:

Originally posted by DrkFusion
Doesn't Xenon's hack block all private forums to guests and normal members?

Smoothie, just add a link at the bottom :)

yeah it does already block access to private forums, but not selectively... so if you want a private forum to be spidered, you're out of luck. For now I don't think it needs to be added (selectivity)... ;)

Smoothie 12-03-2002 02:06 AM

SkuZZy-

Would it be possible to exclude all private forums from displaying?

fello9 12-03-2002 03:45 AM

I WANT IT!!! I WANT IT!!! I WANT IT!!!

Please release it ASAP!

Thank you!

Velocd 12-03-2002 04:32 AM

Quote:

Originally posted by Erwin
You can use robots.txt to block Google or any other spiders spidering specific sections of your site or certain files like .gif etc.
Funny this was mentioned, because when looking into Filburt's tutorial for creating friendly URLs, I also came across a thread by MarkB asking how to stop webcrawlers from consuming so much bandwidth.

Take a look at the following thread for more solutions, in regard to also using robots.txt:
http://www.vbulletin.com/forum/showt...threadid=44966

I installed Filburt's hack as well today, and it was incredibly easy to set up compared to the troubles I had with fastforward's hack. If anyone is interested in Filburt's, here is the thread:
http://www.vbulletin.com/forum/showt...threadid=56783

SkuZZy 12-03-2002 12:38 PM

Quote:

Originally posted by Smoothie
SkuZZy-

Would it be possible to exclude all private forums from displaying?

Yes, that is built in.

Xenon 12-03-2002 03:43 PM

I don't understand why you want private forums to be spidered but not shown to users, bit confusing to me ;)

@Smoothie: The permissions are the same as you have in your normal board, so being logged in as admin will show you all private forums in the archive as well...

Smoothie 12-03-2002 05:49 PM

I checked, and mod_rewrite is installed. How do I see if it's enabled?

Smoothie 12-03-2002 05:53 PM

SkuZZy-

I tried your script, but I'm getting a no-permissions error.

SkuZZy 12-03-2002 06:40 PM

Quote:

Originally posted by Smoothie
SkuZZy-

I tried your script, but I'm getting a no-permissions error.

What are you trying to do? You want your private forums to be spidered? Or you don't want them to be spidered? Private forums will show up, but when you click on them, access will be denied (unless you're logged into an account that has access to view them).


All times are GMT. The time now is 09:37 PM.
