The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
Details
For vB 2.0
This little hackette is a quick fix that lets search engine bots spider your threads. Although it allows the bots to index every thread on your site, it will not make the threads 'search engine optimized': the bots see exactly what you see when you visit your site. It simply removes the CGI bits (the query string) from the URLs, because those are what stop most search engine bots from spidering more than one level deep. If you want a hack that lets you fully customize how a thread looks to the search engine bot, you should look at Overgrow's more complete hack here. The advantage of this hack over Overgrow's is that it does not require htaccess support, which can have performance issues. This could also be seen as a disadvantage, though, as my hack requires that you have mod_rewrite enabled on your Apache server, whereas Overgrow's method should work with just about any web host out there. Take yer pick.
Comments
#272
I'm getting a bit carried away by this now, and someone needs to stop me.
I've applied the same idea to help reduce bandwidth. In my .htaccess file I've added this:
Code:
    #
    # avatar.php rewriting
    #
    # av1-1053412959.gif = userid + dateline
    RewriteRule ^av([0-9]+)-([0-9]+)\.gif$ avatar.php?userid=$1&dateline=$2 [L]
    #
    # attachment.php rewriting
    #
    # atp157156.gif = postid + extension
    RewriteRule ^atp([0-9]+)\.([a-z]+)$ attachment.php?postid=$1 [L]
    # att157156.gif = attachmentid + extension
    RewriteRule ^att([0-9]+)\.([a-z]+)$ attachment.php?attachmentid=$1 [L]

Simply change all references of avatar.php?userid=$post[userid]&$post[dateline]... and variations, for av$post[userid]-$post[dateline].gif
Don't worry about the extension: the correct mime-type will be returned by the PHP, and that's what's important.
Then change the template postbit_attachment so that the URL for attachments is this: atp$post[postid].$post[attachmentextension]
Note that this also sidesteps a bug in Mozilla whereby downloading a zip file from a PHP page would prompt a php file extension rather than zip.
I've been hacking for sure, and again I don't clearly recall every change I made. But if you got the gist of everything else in this thread then I've no doubt you can do this.
Essentially the point is that the lack of a querystring allows the browser and proxies/caches to cache the avatars and attachments. This obviously reduces bandwidth, and also reduces database load.
I have high hopes for this little addition to this very fine hack.
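To make the mapping easier to follow, here is a rough JavaScript sketch of what those three rules do to an incoming path. It only mimics the pattern matching (with the dot before the extension escaped) and is not how Apache itself is implemented; the function name rewriteStaticUrl is made up for illustration.

```javascript
// Hypothetical helper that mimics the three RewriteRules from the post.
// It is an illustration only, not part of vBulletin or mod_rewrite.
function rewriteStaticUrl(path) {
  var rules = [
    // av<userid>-<dateline>.gif -> avatar.php
    { pattern: /^av([0-9]+)-([0-9]+)\.gif$/, target: "avatar.php?userid=$1&dateline=$2" },
    // atp<postid>.<extension> -> attachment.php looked up by post
    { pattern: /^atp([0-9]+)\.([a-z]+)$/, target: "attachment.php?postid=$1" },
    // att<attachmentid>.<extension> -> attachment.php looked up by attachment
    { pattern: /^att([0-9]+)\.([a-z]+)$/, target: "attachment.php?attachmentid=$1" }
  ];
  for (var i = 0; i < rules.length; i++) {
    if (rules[i].pattern.test(path)) {
      // The first matching rule wins, like the [L] flag
      return path.replace(rules[i].pattern, rules[i].target);
    }
  }
  return null; // no rule matched, so the path would be left alone
}
```

For example, rewriteStaticUrl("av1-1053412959.gif") gives "avatar.php?userid=1&dateline=1053412959". The extension captured by the attachment rules is thrown away, which is why the post says not to worry about it.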
#273
One last bug fix.
If you use lots of standard avatars, the page navigation over pages of avatars from member.php will be broken (you'll only ever get the first page), so you will also need to insert this:
Code:
    RewriteCond %{QUERY_STRING} ^(.*)-([0-9]+)\.html$
    RewriteRule ^member\.php$ member.php?%1&pagenumber=%2? [L]
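As a rough illustration of what that condition and rule do, the JavaScript sketch below applies the same pattern to a query string and rebuilds the member.php URL. The function name and the sample query string are invented for the example, and the sketch leaves out the trailing ? from the substitution.

```javascript
// Hypothetical helper mimicking the RewriteCond/RewriteRule pair above.
// queryString is everything after the "?" in a member.php request.
function rewriteMemberQuery(queryString) {
  // Same pattern as the RewriteCond: "<anything>-<page>.html"
  var match = /^(.*)-([0-9]+)\.html$/.exec(queryString);
  if (!match) {
    return null; // condition not met, the request is left untouched
  }
  // %1 is the original query, %2 becomes the pagenumber parameter
  return "member.php?" + match[1] + "&pagenumber=" + match[2];
}
```

With an invented query string such as "action=example-2.html", this returns "member.php?action=example&pagenumber=2", which is the form the page navigation needs.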
#274
Btw, I am doing something similar in vB3 - it's a lot easier, since sessionhash is coded by itself now.
|
#275
Thanks for the terrific job on this great hack, guys. I'm one of the few who doesn't need to have their forums added to search engines, but it's great for everyone who uses Google AdSense to gain some revenue. I have a question: does anyone know how to get the sessionhashes removed from the navbits? i.e.:
My_vbulletin_board > Some_Forum > This_is_my_post
That's the trail listed near the top of each forum- or post-level page. The same problem affects the Forum Jump menu on the bottom left and no doubt many other pages, but from personal experience these are the most commonly used ones.
This is not interesting for those who use this hack purely to get their pages Googled, but for those using it for AdSense, a sessionhash nearly always means that you'll be getting charity placeholders instead of banners (since the crawler thinks that it hasn't cached the page to generate an appropriate banner). It also looks like any forum link that sends you back to the index via anything other than http://www.mywebsite.com/forums/index.php (i.e. http://www.mywebsite.com/forums/index.php?s= or just http://www.mywebsite.com/forums/) will generate placeholder banners too. Perhaps this can be avoided as well?
#276
My reason for all of this is AdSense... I'm not bothered about spidering at all.
I manually removed all mentions of 'sessionhash' as appropriate throughout the whole codebase (PHP and templates). There are a few subtle ones that linger. For example, in the replacement variables for the styles, modify the header to remove the sessionhash from the main image and core navigation. The page nav bit is buried in admin/functions.php, and you can remove the sessionhash from there. I then also adjusted all of my user options and registration forms to remove the option to not use cookies, and modified the FAQ to say that cookies are compulsory.
#277
One thing to point out is that even when you successfully remove all sessionhashes... Google spiders still visit with one!
I think their software has learnt vB and just compensates and discards it. But this didn't bother me, because the lack of sessionhashes and querystrings does help with being cached by proxies (the particularly dumb ones that AOL seem to use). So there is a benefit to it, but not as much as you think there will be.
#279
Oh no, my mistake
In my online.php there was still a place where a sessionhash was being echoed, and I incorrectly thought that the spiders were using a hash. They're not: it's just the display to me of where the spider is that inserted the hash. Ignore that last bit. Which is good, as it is now clearly working better than I thought.
#280
Thanks for the reply, Buro9. I wonder if it's possible to get a step-by-step guide on how to remove all the sessionhashes on every page where it is needed. If you or anyone has that amount of spare time, of course.
filburt1's beta script looks like something that could work with AdSense too; this might be interesting to look into. Did anyone try anything like this out for AdSense?
EDIT: VB3 works like a charm with AdSense. I can't wait for RC1 (just like nearly everyone else here).
#281
Found another bug:
The admin function to merge threads did not work, because your thread URLs are now in a different format. postings.php and the action 'domergethread' expected a URL with 'threadid=' in it. But if you've followed all instructions (!) your formats are more similar to:
http://www.bowlie.com/forum/t5249.html
and
http://www.bowlie.com/forum/t5249-15-3.html
So, to fix this, do this:
FIND (in postings.php):
Code:
    $getthreadid=intval(substr($mergethreadurl,strpos($mergethreadurl,"threadid=")+9));
REPLACE WITH:
Code:
    // HACK : START : SPIDER FRIENDLY URLS
    //$getthreadid=intval(substr($mergethreadurl,strpos($mergethreadurl,"threadid=")+9));
    $getthreadid = intval(preg_replace("/(^.*\/t)|(-[\d]+-[\d]+)|(\.html)/", "", $mergethreadurl));
    // HACK : END : SPIDER FRIENDLY URLS

All it's doing is stripping the threadid out of the new-format URL and putting it in the variable in the same way the old code did.
As you'll note, I always leave old code lying around commented out in case I ever want to roll back. It's just my style, but if you trust my work you can delete that line. I also always leave those START and END blocks in, so I can see what the hell I changed and why.

Ogmuk, simply put: I started by using the template search to find all templates with 'sessionhash' in them. Then I edited each and every template (nigh on all of them) and removed the applicable code, which usually boils down to:
Code:
    s=$session[sessionhash]

Once it was removed from all templates, I then searched through all .php files in the root of the forum directories and similarly removed all sessionhashes, EXCEPT where I found $dbsession[sessionhash], as this was usually being written TO the cookie and wasn't being echoed. You will have to read through each instance, but it's obvious that if it's appearing in a URL you can strip it out, whereas if it's in code then you'll probably want to keep it there.

And lastly, I have AdSense running on my site, and thought I'd share this last tip for you:
AdSense advises you not to place adverts on pages that you have to be logged on to view, or on search results pages. The former is because they'll never correctly spider it and serve relevant adverts (I bet you see ones for password cracking and security!), and the latter is because the pages change too frequently, so by the time one is spidered it's useless.
Both in effect will show inappropriate or public service adverts, which do nothing for your revenue, lower your click-through rate by increasing impressions, and also generate server load by sending too many spiders your way.
So I've written some JavaScript to only put adverts on pages that I know I WANT to show AdSense adverts on. Here it is for you:
Code:
    <script type="text/javascript">
    // Pages on which AdSense adverts should be shown
    var adPages = new Array(
        'forumdisplay.php', 'index.php', 'announcement.php', 'showthread.php',
        'calendar.php', 'donate.php', 'misc.php', 'memberlist.php',
        'vbstats.php', 'member.php', 'forum/f', 'forum/t'
    );
    var returnAdvert = false;
    var pageString = document.location.href;
    // Show an advert if the current URL contains any of the listed pages
    for (var ii = 0; ii < adPages.length; ii++) {
        if (pageString.indexOf(adPages[ii]) >= 0) {
            returnAdvert = true;
            break;
        }
    }
    // The bare forum root has no filename to match, so check it explicitly
    if (returnAdvert == false && pageString == "http://www.bowlie.com/forum/") {
        returnAdvert = true;
    }
    if (returnAdvert == true) {
        var google_ad_client='pub-9576666925012421';
        var google_ad_width=468;
        var google_ad_height=60;
        var google_ad_format='468x60_as';
        document.write('<scr'+'ipt type="text/javascr'+'ipt" language="JavaScr'+'ipt" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"></scr'+'ipt>');
    }
    </script>

Hope all of that info helps everyone.
Cheers
David K
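For anyone who wants to sanity-check the merge-thread pattern from this post outside PHP, here is a rough JavaScript equivalent. It is only a sketch: the function name threadIdFromUrl is invented, and the g flag stands in for preg_replace replacing every match.

```javascript
// Hypothetical JavaScript equivalent of the preg_replace line in postings.php.
// Strips everything up to the last "/t", any "-<n>-<n>" paging suffix and ".html",
// then reads what is left as an integer, as intval() does in the PHP version.
function threadIdFromUrl(mergeThreadUrl) {
  var stripped = mergeThreadUrl.replace(/(^.*\/t)|(-\d+-\d+)|(\.html)/g, "");
  return parseInt(stripped, 10);
}
```

Both URL formats quoted in the post come out as the same thread id: threadIdFromUrl("http://www.bowlie.com/forum/t5249.html") and threadIdFromUrl("http://www.bowlie.com/forum/t5249-15-3.html") each return 5249.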