Have Googlebot Crawl EVERYTHING (theoretically)


filburt1
03-24-2003, 10:00 PM
Googlebot supposedly won't crawl links that have sessionhashes (32 character hex strings) in the URLs. So this hack detects when Googlebot is browsing your forum, and if it is, deletes the sessionhash. So, theoretically, Googlebot will be able to crawl every single guest-accessible link on your board. All of them. This means every thread, every forum, every post, even useless stuff like the memberlist and the FAQ.

Simple to install (1 step). Don't expect instant gratification, though: Googlebot will only reindex your site on its next crawl.
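
The idea in the post above can be sketched roughly as follows. This is not the hack's actual code: the function names `is_googlebot` and `strip_sessionhash` are hypothetical, and rewriting a URL with a regular expression is a stand-in technique chosen for illustration (the hack edits the session code instead). It assumes the hash travels in an `s=` query parameter as a 32-character hex string.

```php
<?php
// Hypothetical helper: true when the User-Agent string mentions Googlebot.
function is_googlebot($userAgent)
{
    return strpos(strtolower($userAgent), 'googlebot') !== false;
}

// Hypothetical helper: drop an "s=<32 hex chars>" parameter from a URL.
function strip_sessionhash($url)
{
    // Remove the parameter together with the '&' that follows it, if any,
    // keeping the '?' or '&' that preceded it.
    $url = preg_replace('/([?&])s=[0-9a-f]{32}(&|$)/i', '$1', $url);
    // Tidy a trailing '?' or '&' left behind when s= was the last parameter.
    return rtrim($url, '?&');
}
```

A caller would only strip the hash when `is_googlebot($_SERVER['HTTP_USER_AGENT'])` is true, so ordinary guests without cookies keep their sessions.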

Intex
03-25-2003, 05:00 PM
Another great hack filburt1. Not sure I need it crawling my internal site, but nevertheless very cool.

colicab-d
03-25-2003, 05:22 PM
lol so thats why u wanted the sessions code

filburt1
03-25-2003, 05:38 PM
No it's not...

VampireMan
03-25-2003, 05:54 PM
simple but very effective hack , thanks :-)

Xanthine
03-25-2003, 06:02 PM
Sounds good thanks Filburt

Overgrow
03-25-2003, 06:35 PM
Excellent :)


One note: This may not allow Google to see all of your posts (theoretically). If you have old ones that would not naturally show up on forumdisplay, Google will not use the select box to see all of the old posts.

Elegantly simple though and makes perfect sense.


* Overgrow clicks install

Davey
03-25-2003, 06:42 PM
Filburt how did you find that special part of code through all that 'mess'?
;):p:D:).
Thanks.
*Installs*.

Dave.

Webdork
03-25-2003, 08:31 PM
On my site I require registration for users to view threads. Is there a way to allow Google to become a "registered" user so it can crawl all threads still?

filburt1
03-25-2003, 08:32 PM
Yes but smart users can get around it...

edit: actually I thought of a way to fix that, too. Expect that hack eventually.
note to self: check first few IP bits and HTTP_USER_AGENT, update $bbuserinfo.
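
A minimal sketch of the check described in that note, requiring both a matching User-Agent and a matching address prefix. Everything here is an assumption for illustration: the function name `looks_like_trusted_bot` is hypothetical, and the prefix `192.0.2.` is a documentation range, not a real crawler network. The `$bbuserinfo` update is left out because the thread does not show it.

```php
<?php
// Hypothetical prefixes for illustration only (192.0.2.0/24 is reserved
// for documentation and is not a real crawler network).
$trustedPrefixes = array('192.0.2.');

function looks_like_trusted_bot($userAgent, $ip, $prefixes)
{
    // The User-Agent must name the bot...
    if (strpos(strtolower($userAgent), 'googlebot') === false) {
        return false;
    }
    // ...and the address must start with one of the configured prefixes.
    foreach ($prefixes as $prefix) {
        if (strpos($ip, $prefix) === 0) {
            return true;
        }
    }
    return false;
}
```

It would be called as `looks_like_trusted_bot($_SERVER['HTTP_USER_AGENT'], $_SERVER['REMOTE_ADDR'], $trustedPrefixes)`. Checking the address as well matters because the User-Agent header alone is trivial to fake.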

corsacrazy
03-25-2003, 08:35 PM
installed ;) nice one

off topic

how r u lot doing those funky sigs ? i believe u started the trend filb !

Webdork
03-25-2003, 08:37 PM
I don't really mind if a smart user can get in without registering. My forums aren't private; I just like to know who is using them. But the true value is of course in having as many threads as possible in Google to attract new users.

Velocd
03-26-2003, 12:21 AM
Nice hack, and I'll be awaiting that suggestion by Webdork in post #9 as well ;)

* Velocd clicks install

filburt1
03-26-2003, 01:42 AM
Forget it, I can't stand looking at sessions.php after trying for an hour.

Slynderdale
03-26-2003, 05:31 PM
Hmm, there's a similar hack for this in TECK's vBArchive hack that also covers several search bots, not just Google.

filburt1
03-26-2003, 05:33 PM
This is one file edit, don't think you need to install that beast...

Slynderdale
03-26-2003, 05:41 PM
You don't need to install vBArchive for the sessionhash stripping, you can find it in this post:
https://vborg.vbsupport.ru/showthread.php?postid=342706#post342706

Just add a small function to functions.php and edit global.php, and it strips all sessionhashes for search bots.

Also nice hack though filburt.

CityInet
03-26-2003, 08:53 PM
this is not working for me. At the top of my home page I get...
if (strpos(strtolower($_SERVER['HTTP_USER_AGENT']), "googlebot") != false) { unset($session['sessionhash']); }

any suggestions?

filburt1
03-26-2003, 09:04 PM
Read the instructions again.
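
Separately from the install problem, the comparison in the line quoted above deserves a note. `strpos()` returns the integer 0 when the match is at the very start of the string, and `0 != false` evaluates to false in PHP, so a User-Agent that begins with the bot's name would slip past that check. The sketch below shows the difference; the User-Agent value is only an example, and whether this affected real crawls is not established in this thread.

```php
<?php
// Example value only: a User-Agent that begins with the bot's name.
$userAgent = 'Googlebot/2.1 (+http://www.googlebot.com/bot.html)';

// The match is at offset 0, so strpos() returns int(0), not false.
$position = strpos(strtolower($userAgent), 'googlebot');

// Loose comparison: 0 == false, so the match is treated as "not found".
$looseMatch = ($position != false);

// Strict comparison: only a real boolean false means "not found".
$strictMatch = ($position !== false);

var_dump($looseMatch);  // bool(false)
var_dump($strictMatch); // bool(true)
```

Using `!== false` in the condition avoids the problem without changing anything else in the line.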

DWZ
03-27-2003, 03:00 AM
Yesterday at 08:35 AM corsacrazy said this in Post #11 (https://vborg.vbsupport.ru/showthread.php?postid=373258#post373258)
off topic

how r u lot doing those funky sigs ? i beleive u started the trend filb !

ummmm... how did you do that? lol

EDIT: found the hack ;)

Grunt
03-27-2003, 03:39 AM
so this will do everything that the VBarchive hack will do with one file edit?

filburt1
03-27-2003, 02:13 PM
Well, it has the same objective as vBArchive, it doesn't do everything it does.

Grunt
03-27-2003, 02:24 PM
i recall that only 1 of the many steps of vBArchive was to remove the sessionhashes, that is why i ask

MaXxed
04-04-2003, 04:55 AM
How is this working for everyone? Any results? I have it installed and saw a crawler today, but it didn't stay long....

filburt1
04-09-2003, 02:52 PM
Googlebot is swarming my forums right now so when the index is updated we should know if this works.

amykhar
04-09-2003, 04:06 PM
I have filburt1's template mod installed; therefore Google has already been crawling a lot of stuff. However, it never picked up the calendar or FAQ before. I put this hack in, and now my calendar is indexed on Google.

Amy

Smoothie
04-09-2003, 04:26 PM
and google is registering, once again.

MaXxed
04-10-2003, 03:00 AM
So far I'm showing no significant increase in Google links, unfortunately. I will give it more time :)

mymilkexpired
04-10-2003, 01:37 PM
any movement on this? i would like to know how effective it is to gauge whether or not i should install it! thanks...

MaXxed
04-14-2003, 12:11 AM
Well, after installing this, my Google links dropped by about 100, from 696 to 595. I was getting better results from a simple sitemap.

Not sure why this is, but it doesn't seem to work. I will wait for comments before uninstalling, but it will take a month to fix the results with Google.

zajako
04-14-2003, 06:03 PM
i will install this when i get to my home machine !!

thanks filburt :]

Ryzer
06-01-2003, 10:07 AM
THNX m8! :)

PixelFx
06-02-2003, 09:25 AM
looks great, however I'm wondering how well this would work with robots.txt for sections of a site you don't want indexed?

Skyline_GT
06-04-2003, 01:43 AM
Nice hack
installed:D

gmarik
06-05-2003, 02:07 PM
Filburt, how can I hide sessions forever?
Is there a hack to write the sessions in cookies and not show them for user at all?

Zach
06-06-2003, 06:31 AM
Yesterday at 08:07 AM gmarik said this in Post #35 (https://vborg.vbsupport.ru/showthread.php?postid=404744#post404744)
Filburt, how can I hide sessions forever?
Is there a hack to write the sessions in cookies and not show them for user at all?


That would be pointless - you are trying to keep track of non cookied users.

Zach
06-06-2003, 06:43 AM
I dunno, but I have the session hash in my URLs and Google crawls them. I have not got the results I hope to get yet, but I went from like a hundred links to 1000 or so.

I am using something else I threw together to help google along but I know for sure that the hash is still there.

Zach
06-06-2003, 06:52 AM
This is rather weird, my stats program stopped updating a couple of mornings ago - guess it got sick of Scooter.

But this is basically four days and four hours, I guess, of stats on search engines. Hum, now off to figure out what's up with my stats, or server time or what.

Robot                                   Hits    Bandwidth   Last visit
Scooter (AltaVista)                     74909   4.53 GB     04 Jun 2003 - 22:57
Googlebot (Google)                      10091   591.56 MB   04 Jun 2003 - 19:54
Inktomi Slurp                           5833    320.94 MB   05 Jun 2003 - 04:07
Fast-Webcrawler (AllTheWeb)             3777    167.30 MB   05 Jun 2003 - 04:07
Alexa (IA Archiver)                     1076    12.58 MB    05 Jun 2003 - 03:49
IBM_Planetwide                          615     1.15 MB     04 Jun 2003 - 19:33
Road Runner: The ImageScape Robot       366     955.57 KB   04 Jun 2003 - 15:14
LinkWalker                              137     9.79 MB     03 Jun 2003 - 13:01
SurveyBot                               123     3.09 MB     02 Jun 2003 - 06:29
MSIECrawler                             81      307.02 KB   02 Jun 2003 - 11:39
Pioneer                                 59      184.58 KB   01 Jun 2003 - 08:31
Unknown robot (identified by 'crawl')   54      1.67 MB     04 Jun 2003 - 15:54
InternetSeer                            9       0           05 Jun 2003 - 02:08
arks                                    8       63.65 KB    04 Jun 2003 - 19:36
BaiDuSpider                             7       289.54 KB   04 Jun 2003 - 05:12
WISENutbot (Looksmart)                  6       108.84 KB   03 Jun 2003 - 23:19
Calif                                   6       25.09 KB    02 Jun 2003 - 16:11
WebClipping.com                         3       23.83 KB    03 Jun 2003 - 11:24
Unknown robot (identified by 'spider')  2       99.97 KB    03 Jun 2003 - 15:08
ZealBot                                 2       30.25 KB    03 Jun 2003 - 06:07
Wget                                    1       49.42 KB    03 Jun 2003 - 16:56

I have no idea what got up Scooter's ass. I have never seen anything like it. He hit me for ten gigs the last week of May.

Dunno, but he can do what he pleases as long as I go over there and find myself number one for terms like

fantasy sports
sports forums

Zach
06-07-2003, 02:33 PM
You know I can be so stupid at times, or I spend too much time in front of the computer :)

Never considered that my computer clock/calendar would be off by two days :)

S.Shady
06-10-2003, 01:26 AM
googlebot is on my forums now :) 16 mins on index.php :-\


how long does it take for the site results to be on google ?

Zach
06-11-2003, 12:11 AM
Yesterday at 07:26 PM S.Shady said this in Post #40 (https://vborg.vbsupport.ru/showthread.php?postid=406829#post406829)
googlebot is on my forums now :) 16 mins on index.php :-\


how long does it take for the site results to be on google ?

Well, do not hold your breath waiting

It can take weeks to months, and will take months to start getting the good results. That also works in reverse: I have pages in Google and other engines that have not existed for two years.

S.Shady
06-11-2003, 12:41 AM
hmm damn so its not auto. ok then ill forget all about it then check again in about a month. Thanks :) now i wont be wasting time by checking for awhile. i can finally clean house ;)

GameCrash
06-11-2003, 03:24 PM
/me clicks install

I, Brian
07-15-2003, 04:55 PM
I know this hack works because I joined filburt's design forum in its early days, then lost the link in a re-format of my hard drive - a few months later I found a backlink from his forums. :)

He's shown by example how this hack can work. And Google indexing of forums is precisely what sent me looking at vBulletin. Forum-Forum also had a big part to play in that when I saw how well they were indexed by Google.

Just starting up vB 2.3 and will implement filburt's hack as soon as I'm sure everything is running smoothly. :)

Erwin
07-15-2003, 10:41 PM
vB3 implements this already automatically. :) Sessionhash is automatically disabled for Google, Inktomi etc.

xs1
07-20-2003, 11:48 AM
i don't have a sessions.php file, any help? :( do i have to edit the one in the admin folder?

clamcrusher
07-24-2003, 08:16 PM
will this allow googlebot to read forums you have set to private, like for example a forum i made that only mods and admins can view? what is talked about in there i dont want shared.

also i heard a rumor that it might be able to read pm's too?

NTLDR
07-24-2003, 08:26 PM
will this allow googlebot to read forums you have set to private,

No

also i heard a rumor that it might be able to read pm's too?

and No :) All this does is remove the sessionhash in the URLs when google is crawling.

BlueHawk
07-27-2003, 02:17 PM
Nice hack, so many peeps need relevant info on problems they are having, hopefully our forums will now help :)

Pikok
07-29-2003, 08:22 AM
03-25-03 at 11:31 PM Webdork said this in Post #9 (https://vborg.vbsupport.ru/showthread.php?postid=373251#post373251)
On my site I require registration for users to view threads. Is there a way to allow Google to become a "registered" user so it can crawl all threads still?

This is what I did and tested it with a couple of different Web robots and it appears to be working okay. Already having inphinity's UserAgent Checking hack installed, I just wrote a bit of code to work along with it to set permissions for the robots.
1. Install inphinity's UserAgent Checking (https://vborg.vbsupport.ru/showthread.php?postid=352073#post352073) add-on for TECK's vBArchive (https://vborg.vbsupport.ru/showthread.php?s=&threadid=47667). UserAgent Checking works independently of vBArchive.
2. Create a usergroup for Web robots and set up the various permissions as desired.
3. In "root/global.php" find:

$permissions=getpermissions();

And add above that:

// Search Engine Perms Start
$sep_checkagent = $bbuserinfo['useragent'];
// useragentcheck() comes from the UserAgent Checking add-on installed in step 1
$sep_robot = useragentcheck($sep_checkagent, 1);
if ($sep_robot) {
    $bbuserinfo['usergroupid'] = 10; // Set usergroup to match the robot usergroup
    $bbuserinfo['username'] = $sep_robot; // Don't think this line's really needed, but I make use of it
}
// Search Engine Perms End

I guess you could say this would be an add-on for an add-on.. lol Let me know if it works for you.

Salazar
07-30-2003, 11:26 AM
Thanks filburt1! :)

* Salazar clicks install...

Khashyar
08-09-2003, 06:55 PM
Thank you for the hack....

I just installed the EASY-to-install hack in my forums, and I will let you all know what happens.

Has anyone noticed an increase on their website's search engine listings after installing the hack?

Thanks again, filburt :)

Khashyar
www.russiameetingplace.com/forums

Khashyar
08-09-2003, 06:56 PM
filburt (and others...)

Would it be useful to also specify other search engines in a similar hack in the sessions.php file?

Thanks,

Khashyar
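
One way to extend the same check to several crawlers is to loop over a list of User-Agent substrings, sketched below. The function name `matching_bot` is hypothetical, and the substrings are assumptions based on the robot names mentioned in this thread; they should be verified against your own access logs before relying on them.

```php
<?php
// Assumed substrings, lowercase; verify them against your own access logs.
$botSignatures = array('googlebot', 'slurp', 'scooter', 'fast-webcrawler', 'ia_archiver');

// Hypothetical helper: returns the matching signature, or false if none match.
function matching_bot($userAgent, $signatures)
{
    $agent = strtolower($userAgent);
    foreach ($signatures as $signature) {
        if (strpos($agent, $signature) !== false) {
            return $signature;
        }
    }
    return false;
}
```

The sessionhash would then be dropped whenever `matching_bot($_SERVER['HTTP_USER_AGENT'], $botSignatures) !== false`.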

Matrixgl
08-17-2003, 04:18 AM
It's not working.... I did change admin/sessions.php about 10 days ago, still no results :'(

dethfire
09-20-2003, 10:59 PM
hmmm doesn't seem to work, I see googlebot is looking at this page /archive/topic/6075-1.html?

shouldn't the "s" string be unset?

MaDCaT75
09-23-2003, 10:59 PM
Great hack.

rikku3978
03-12-2004, 01:36 AM
I will use this, the more hits the better, right? :P

CF Web
03-30-2004, 04:10 PM
*bump*

Anyone else with results on the effectiveness of this hack?

zeeko1212
03-02-2005, 06:39 PM
very effective hack , thanks !

arsec
01-31-2006, 01:47 PM
i don't have the sessions.php file, is that normal? i'm downloading it now

Andreas
01-31-2006, 01:52 PM
Erm ... are you aware that this is a vBulletin 2.3.X Hack?