vb.org Archive

vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 2.x Beta Releases (https://vborg.vbsupport.ru/forumdisplay.php?f=5)
-   -   Have Googlebot Crawl EVERYTHING (theoretically) (https://vborg.vbsupport.ru/showthread.php?t=50805)

filburt1 03-24-2003 10:00 PM

Have Googlebot Crawl EVERYTHING (theoretically)
 
Googlebot supposedly won't crawl links that have sessionhashes (32-character hex strings) in their URLs. So this hack detects when Googlebot is browsing your forum and, if it is, deletes the sessionhash. So, theoretically, Googlebot will be able to crawl every single guest-accessible link on your board. All of them. This means every thread, every forum, every post, even useless stuff like the memberlist and the FAQ.

Simple to install (1 step). Don't expect instant gratification, because Googlebot will only reindex your site on its next crawl.
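
The hack's own code is not reproduced in this post. As a rough sketch of the idea described above, and not filburt1's actual edit, the following PHP checks the user agent and strips a 32-character hex hash from a URL. It is written for PHP 7+, whereas vBulletin 2.x ran on much older PHP. The function names and the `s=` query parameter are assumptions made for illustration.

```php
<?php
// Illustrative only: is_googlebot() and strip_session_hash() are hypothetical
// helpers, not vBulletin functions.

function is_googlebot(string $userAgent): bool
{
    // stripos() is case-insensitive and returns 0 for a match at the very
    // start of the string, so the result must be compared with !== false.
    return stripos($userAgent, 'googlebot') !== false;
}

function strip_session_hash(string $url): string
{
    // Assumes the hash travels as "s=<32 hex chars>" in the query string.
    $url = preg_replace('/(?<=[?&])s=[0-9a-f]{32}(&|$)/i', '', $url);
    // Remove a dangling "?" or "&" left behind at the end of the URL.
    return rtrim($url, '?&');
}

$url = 'showthread.php?s=0123456789abcdef0123456789abcdef&threadid=50805';
if (is_googlebot('Googlebot/2.1')) {
    $url = strip_session_hash($url);
}
echo $url, "\n"; // showthread.php?threadid=50805
```

The regular expression only touches a parameter named `s` whose value is exactly 32 hex characters, so other query parameters pass through unchanged.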

Intex 03-25-2003 05:00 PM

Another great hack filburt1. Not sure I need it crawling my internal site, but nevertheless very cool.

colicab-d 03-25-2003 05:22 PM

lol so thats why u wanted the sessions code

filburt1 03-25-2003 05:38 PM

No it's not...

VampireMan 03-25-2003 05:54 PM

Simple but very effective hack, thanks :-)

Xanthine 03-25-2003 06:02 PM

Sounds good thanks Filburt

Overgrow 03-25-2003 06:35 PM

Excellent :)


One note: this may not allow Google to see all of your posts (theoretically). If you have old threads that would not show up on forumdisplay by default, Googlebot will not use the select box to reveal them.

Elegantly simple though and makes perfect sense.


* Overgrow clicks install

Davey 03-25-2003 06:42 PM

Filburt how did you find that special part of code through all that 'mess'?
;):p:D:).
Thanks.
*Installs*.

Dave.

Webdork 03-25-2003 08:31 PM

On my site I require registration for users to view threads. Is there a way to let Google act as a "registered" user so it can still crawl all threads?

filburt1 03-25-2003 08:32 PM

Yes but smart users can get around it...

edit: actually I thought of a way to fix that, too. Expect that hack eventually.
note to self: check first few IP bits and HTTP_USER_AGENT, update $bbuserinfo.
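
filburt1's note to self can be read as: treat a visitor as Googlebot only when both the user agent and the leading bits of the IP address match. A minimal sketch of that check follows, assuming PHP 7+. The helper names are hypothetical, and `192.0.2.0/24` is a reserved documentation range standing in for whatever address ranges Google actually uses, which the thread does not list. How `$bbuserinfo` would then be updated is not shown in the thread and is left out here.

```php
<?php
// Hypothetical helpers for illustration; not part of vBulletin.

function ip_in_cidr(string $ip, string $cidr): bool
{
    $parts = explode('/', $cidr, 2);
    if (count($parts) !== 2) {
        return false;
    }
    $ipLong  = ip2long($ip);
    $netLong = ip2long($parts[0]);
    $bits    = (int) $parts[1];
    if ($ipLong === false || $netLong === false || $bits < 1 || $bits > 32) {
        return false;
    }
    // Keep only the first $bits bits of each address and compare them.
    $mask = -1 << (32 - $bits);
    return ($ipLong & $mask) === ($netLong & $mask);
}

function is_trusted_googlebot(string $userAgent, string $ip, array $cidrs): bool
{
    // The user agent alone is trivial to fake, so it is only the first test.
    if (stripos($userAgent, 'googlebot') === false) {
        return false;
    }
    foreach ($cidrs as $cidr) {
        if (ip_in_cidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}

// Placeholder range (TEST-NET-1), NOT a real Google range.
$assumedRanges = ['192.0.2.0/24'];
var_dump(is_trusted_googlebot('Googlebot/2.1', '192.0.2.55', $assumedRanges));   // true
var_dump(is_trusted_googlebot('Googlebot/2.1', '198.51.100.7', $assumedRanges)); // false
```

Requiring both conditions addresses the "smart users can get around it" concern only as far as the IP list is accurate; a visitor who merely sets a Googlebot user agent fails the second test.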

corsacrazy 03-25-2003 08:35 PM

installed ;) nice one

off topic

how r u lot doing those funky sigs? i believe u started the trend filb!

Webdork 03-25-2003 08:37 PM

I don't really mind if a smart user can get in without registering. My forums aren't private; I just like to know who is using them. But the true value is of course in having as many threads as possible in Google to attract new users.

Velocd 03-26-2003 12:21 AM

Nice hack, and I'll be awaiting that suggestion by Webdork in post #9 as well ;)

* Velocd clicks install

filburt1 03-26-2003 01:42 AM

Forget it, I can't stand looking at sessions.php after trying for an hour.

Slynderdale 03-26-2003 05:31 PM

Hmm, there's a similar feature in Tech's vB Archive hack that also covers several search bots, not just Google.

filburt1 03-26-2003 05:33 PM

This is one file edit, don't think you need to install that beast...

Slynderdale 03-26-2003 05:41 PM

You don't need to install vB Archive for the sessionhash stripping; you can find it in this post:
https://vborg.vbsupport.ru/showthrea...706#post342706

Just add a small function to functions.php and edit global.php, and it strips the sessionhash for all search bots.

Also nice hack though filburt.
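
Slynderdale's suggestion covers several search bots rather than Googlebot alone. The linked post is not reproduced here, so the following is only a sketch of what such a small helper might look like, assuming PHP 7+. The function name, the list of user-agent substrings, and the stand-in `$session` array are assumptions; a real list would need to be checked against each crawler's documented user agent.

```php
<?php
// Hypothetical helper; the substrings below are illustrative, not exhaustive.
function is_search_bot(
    string $userAgent,
    array $needles = ['googlebot', 'slurp', 'scooter', 'fast-webcrawler', 'ia_archiver']
): bool {
    foreach ($needles as $needle) {
        // Case-insensitive substring match; position 0 counts as a match.
        if (stripos($userAgent, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Stand-in for the forum's session data (assumed shape).
$session = ['sessionhash' => '0123456789abcdef0123456789abcdef'];

if (is_search_bot('Scooter/3.3')) {
    // Dropping the hash means generated links carry no session parameter.
    unset($session['sessionhash']);
}

var_dump(isset($session['sessionhash'])); // false
```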

CityInet 03-26-2003 08:53 PM

this is not working for me. At the top of my home page I get...
if (strpos(strtolower($_SERVER['HTTP_USER_AGENT']), "googlebot") != false) { unset($session['sessionhash']); }

any suggestions?

filburt1 03-26-2003 09:04 PM

Read the instructions again.
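
Two things are worth noting about the line CityInet quotes. PHP source that shows up as text on the page has been placed somewhere the PHP interpreter does not execute it, for example outside `<?php ... ?>` tags; where exactly it was pasted is not stated. Separately, the comparison `!= false` is loose: `strpos()` returns `0` when the match is at the start of the string, and `0 != false` evaluates to false. The sketch below illustrates this with an assumed user-agent string that begins with `Googlebot`. Whether the hack's instructions use the same comparison is not shown in the thread.

```php
<?php
// Assumed example user agent; the point is only that it starts with "Googlebot".
$userAgent = 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)';

// The match is at offset 0, so strpos() returns int 0, not false.
$pos = strpos(strtolower($userAgent), 'googlebot');

// Loose comparison: 0 is treated as equal to false, so this is false.
$looseMatch = ($pos != false);

// Strict comparison: int 0 is not identical to false, so this is true.
$strictMatch = ($pos !== false);

var_dump($pos, $looseMatch, $strictMatch);
```

With the loose comparison, a crawler whose user agent starts with the search term would never be detected, so `!== false` is the safer form.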

DWZ 03-27-2003 03:00 AM

Quote:

Yesterday at 08:35 AM corsacrazy said this in Post #11
off topic

how r u lot doing those funky sigs? i believe u started the trend filb!

ummmm... how did you do that? lol

EDIT: found the hack ;)

Grunt 03-27-2003 03:39 AM

so this will do everything that the VBarchive hack will do with one file edit?

filburt1 03-27-2003 02:13 PM

Well, it has the same objective as vBArchive, but it doesn't do everything vBArchive does.

Grunt 03-27-2003 02:24 PM

i recall that only 1 of the many steps of VBArchive was to remove the sessionhashes, that is why i ask

MaXxed 04-04-2003 04:55 AM

How is this working for everyone? Any results? I have it installed and saw a crawler today, but it didn't stay long...

filburt1 04-09-2003 02:52 PM

Googlebot is swarming my forums right now so when the index is updated we should know if this works.

amykhar 04-09-2003 04:06 PM

I have filburt1's template mod installed, so Google has already been crawling a lot of stuff. However, it never picked up the calendar or FAQ before. I put this hack in, and now my calendar is indexed on Google.

Amy

Smoothie 04-09-2003 04:26 PM

and google is registering, once again.

MaXxed 04-10-2003 03:00 AM

So far I'm showing no significant increase in Google links, unfortunately. I will give it more time :)

mymilkexpired 04-10-2003 01:37 PM

any movement on this? i would like to know how effective it is to gauge whether or not i should install it! thanks...

MaXxed 04-14-2003 12:11 AM

Well, after installing this, my Google links dropped by about 100, from 696 to 595. I was getting better results from a simple sitemap.

Not sure why this is, but apparently it doesn't work? I will wait for comments before uninstalling, but it will take a month to fix the results with Google.

zajako 04-14-2003 06:03 PM

i will install this when i get to my home machine !!

thanks filburt :]

Ryzer 06-01-2003 10:07 AM

THNX m8! :)

PixelFx 06-02-2003 09:25 AM

Looks great, however I'm wondering how well this would work with robots.txt for sections of a site you don't want indexed?

Skyline_GT 06-04-2003 01:43 AM

Nice hack
installed:D

gmarik 06-05-2003 02:07 PM

Filburt, how can I hide sessions forever?
Is there a hack to write the sessions in cookies and not show them for user at all?

Zach 06-06-2003 06:31 AM

Quote:

Yesterday at 08:07 AM gmarik said this in Post #35
Filburt, how can I hide sessions forever?
Is there a hack to write the sessions in cookies and not show them for user at all?


That would be pointless: the hash in the URL is there to keep track of non-cookied users.
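
Zach's point is that the hash in the URL is the fallback for visitors whose browsers do not send cookies. A minimal sketch of that decision, assuming PHP 7+; the function name and the `sessionhash` cookie name are assumptions for illustration, not vBulletin's actual code.

```php
<?php
// Hypothetical helper: decide whether the session hash must ride in the URL.
function session_url_suffix(array $cookies, string $sessionHash): string
{
    // If the browser sent the session cookie back, the URL can stay clean.
    if (isset($cookies['sessionhash']) && $cookies['sessionhash'] === $sessionHash) {
        return '';
    }
    // Otherwise the hash has to travel in the URL to keep the session alive.
    return '?s=' . $sessionHash;
}

$hash = '0123456789abcdef0123456789abcdef';

// Visitor without cookies: the hash is appended.
echo 'index.php' . session_url_suffix([], $hash), "\n";

// Visitor who returned the cookie: no hash in the URL.
echo 'index.php' . session_url_suffix(['sessionhash' => $hash], $hash), "\n";
```

Hiding the hash from every visitor would therefore drop session tracking for anyone with cookies disabled, which is the trade-off Zach describes.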

Zach 06-06-2003 06:43 AM

I dunno, but I have the session hash in my URLs and Google crawls them. I have not got the results I hope to get yet, but I went from like a hundred to 1000 or so.

I am using something else I threw together to help google along but I know for sure that the hash is still there.

Zach 06-06-2003 06:52 AM

This is rather weird, my stats program stopped updating a couple of mornings ago - guess it got sick of Scooter.

But this is basically four days and four hours, I guess, of stats on search engines. Hmm, now off to figure out what's up with my stats, or server time, or what.

Robot                                   Hits   Bandwidth  Last visit
Scooter (AltaVista)                     74909  4.53 GB    04 Jun 2003 - 22:57
Googlebot (Google)                      10091  591.56 MB  04 Jun 2003 - 19:54
Inktomi Slurp                           5833   320.94 MB  05 Jun 2003 - 04:07
Fast-Webcrawler (AllTheWeb)             3777   167.30 MB  05 Jun 2003 - 04:07
Alexa (IA Archiver)                     1076   12.58 MB   05 Jun 2003 - 03:49
IBM_Planetwide                          615    1.15 MB    04 Jun 2003 - 19:33
Road Runner: The ImageScape Robot       366    955.57 KB  04 Jun 2003 - 15:14
LinkWalker                              137    9.79 MB    03 Jun 2003 - 13:01
SurveyBot                               123    3.09 MB    02 Jun 2003 - 06:29
MSIECrawler                             81     307.02 KB  02 Jun 2003 - 11:39
Pioneer                                 59     184.58 KB  01 Jun 2003 - 08:31
Unknown robot (identified by 'crawl')   54     1.67 MB    04 Jun 2003 - 15:54
InternetSeer                            9      0          05 Jun 2003 - 02:08
arks                                    8      63.65 KB   04 Jun 2003 - 19:36
BaiDuSpider                             7      289.54 KB  04 Jun 2003 - 05:12
WISENutbot (Looksmart)                  6      108.84 KB  03 Jun 2003 - 23:19
Calif                                   6      25.09 KB   02 Jun 2003 - 16:11
WebClipping.com                         3      23.83 KB   03 Jun 2003 - 11:24
Unknown robot (identified by 'spider')  2      99.97 KB   03 Jun 2003 - 15:08
ZealBot                                 2      30.25 KB   03 Jun 2003 - 06:07
Wget                                    1      49.42 KB   03 Jun 2003 - 16:56

I have no idea what got up Scooter's ass. I have never seen anything like it. He hit me for ten gigs the last week of May.

Dunno, but he can do what he pleases as long as I go over there and find myself number one for terms like

fantasy sports
sports forums

Zach 06-07-2003 02:33 PM

You know I can be so stupid at times, or I spend too much time in front of the computer :)

Never considered that my computer clock/calendar would be off by two days :)

S.Shady 06-10-2003 01:26 AM

googlebot is on my forums now :) 16 mins on index.php :-\


how long does it take for the site results to be on google ?


All times are GMT.
