

Author: fastforward (Developer, last online Nov 2011)

Version: 2.2.x
Released: 05-24-2001 | Last Update: Never | Installs: 35
 
No support by the author.

For vB 2.0

This little hackette is a quick fix to allow search engine bots to spider your threads.

Although this will allow the bots to index every thread on your site, it will not make the threads 'search engine optimized'. They will see exactly what you see when you visit your site. It simply removes the CGI bits (the query string) from the URLs, since those are what stop most search engine bots from spidering more than one level deep.

If you want a hack that lets you fully customize how the thread will look to the search engine bot, you should look at Overgrow's more complete hack here.

The advantage of this hack over Overgrow's is that it does not require htaccess support, which can have performance issues. This could also be seen as a disadvantage, though, as my hack requires that you have mod_rewrite enabled on your Apache server, whereas Overgrow's method should work with just about any web host out there.

Take yer pick

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #272  
Old 07-17-2003, 08:13 PM
buro9 buro9 is offline
 
Join Date: Feb 2002
Location: London, UK
Posts: 585

I'm getting a bit carried away by this now, and someone needs to stop me

I've applied the same idea to assist with reducing bandwidth

In my .htaccess file I've added this:

Code:
  #
  # avatar.php rewriting
  # (dots are escaped so they only match a literal '.')
  #
  # av1-1053412959.gif = userid + dateline
  RewriteRule ^av([0-9]+)-([0-9]+)\.gif$ avatar.php?userid=$1&dateline=$2 [L]

  #
  # attachment.php rewriting
  #
  # atp157156.gif = postid + extension
  RewriteRule ^atp([0-9]+)\.([a-z]+)$ attachment.php?postid=$1 [L]
  # att157156.gif = attachmentid + extension
  RewriteRule ^att([0-9]+)\.([a-z]+)$ attachment.php?attachmentid=$1 [L]
And you should be able to figure the rest out

Simply replace all references to avatar.php?userid=$post[userid]&$post[dateline]... (and variations) with av$post[userid]-$post[dateline].gif

Don't worry about the extension; the correct mime-type will be returned by the PHP, and that's what's important.

Then change the postbit_attachment template so that the URL for attachments is: atp$post[postid].$post[attachmentextension]
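
To sanity-check the rules before touching the templates, here is a minimal JavaScript sketch that mimics the three RewriteRules above and shows which script URL each friendly file name would map to. The `rules` table, the `rewrite` function and the sample file names are only illustrative (they are not part of the hack), and the dots are escaped here so that they only match a literal full stop.

```javascript
// Hypothetical helper: mirrors the three rewrite rules so they can be tried offline.
const rules = [
  // av<userid>-<dateline>.gif -> avatar.php
  { pattern: /^av([0-9]+)-([0-9]+)\.gif$/, target: "avatar.php?userid=$1&dateline=$2" },
  // atp<postid>.<ext> -> attachment.php by post id (the extension is ignored)
  { pattern: /^atp([0-9]+)\.([a-z]+)$/, target: "attachment.php?postid=$1" },
  // att<attachmentid>.<ext> -> attachment.php by attachment id
  { pattern: /^att([0-9]+)\.([a-z]+)$/, target: "attachment.php?attachmentid=$1" },
];

function rewrite(fileName) {
  for (const rule of rules) {
    if (rule.pattern.test(fileName)) {
      // The first matching rule wins, much like the [L] flag.
      return fileName.replace(rule.pattern, rule.target);
    }
  }
  return null; // no rule matched, so the request would be left alone
}

console.log(rewrite("av1-1053412959.gif")); // avatar.php?userid=1&dateline=1053412959
console.log(rewrite("atp157156.gif"));      // attachment.php?postid=157156
```

Anything that does not fit one of the three shapes (avatar.php itself, for instance) returns null, which corresponds to the request passing through untouched.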

Note that this also sidesteps a bug in Mozilla whereby downloading a zip file from a php page would prompt a php file extension rather than zip.

I've been hacking for sure, and again I don't recall clearly every change I made. But if you got the gist of everything else in this thread then I've no doubt you can do this.

Essentially the point is that a lack of querystring allows the browser and proxies/caches to cache the avatars and attachments.

This obviously reduces bandwidth... and also reduces database load.

I have high hopes for this little addition to this very fine hack
  #273  
Old 07-20-2003, 08:33 PM
buro9 buro9 is offline
 
Join Date: Feb 2002
Location: London, UK
Posts: 585

One last bug fix.

If you use lots of standard avatars... then the page navigation over pages of avatars from member.php will be broken (you'll only ever get the first page)... so you will also need to insert this:

Code:
RewriteCond %{QUERY_STRING} ^(.*)-([0-9]+)\.html$
RewriteRule ^member\.php$ member.php?%1&pagenumber=%2? [L]
That will be after lierduh's other corrective rewrites.
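
As a rough illustration of what that condition captures, the JavaScript sketch below applies the same pattern to a query string. The function name and the sample query string are made up for the example; the real query string depends on how your other rewrites build the page links.

```javascript
// Hypothetical helper: mirrors the RewriteCond pattern ^(.*)-([0-9]+)\.html$
function splitPagedQuery(queryString) {
  const match = /^(.*)-([0-9]+)\.html$/.exec(queryString);
  if (match === null) {
    return null; // the condition fails, so the rule would not fire
  }
  // match[1] plays the role of %1 (the original query),
  // match[2] plays the role of %2 (the page number).
  return match[1] + "&pagenumber=" + match[2];
}

console.log(splitPagedQuery("action=getinfo&userid=5-2.html"));
// action=getinfo&userid=5&pagenumber=2
```

A query string without the trailing -N.html part returns null here, meaning member.php would be served with its query string unchanged.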
  #274  
Old 07-21-2003, 02:14 AM
Erwin Erwin is offline
 
Join Date: Jan 2002
Posts: 7,604

Btw, I am doing something similar in vB3 - it's a lot easier, since sessionhash is coded by itself now.
  #275  
Old 07-22-2003, 12:58 AM
Ogmuk Ogmuk is offline
 
Join Date: Jun 2003
Posts: 113

Thanks for the terrific job on the great hack guys. I'm one of the few who doesn't need to have their forums added to search engines, but it's great for everyone who uses Google AdSense to gain some revenue. I have a question, does anyone know how to get the sessionhashes removed from the navbits? i.e.:
My_vbulletin_board > Some_Forum > This_is_my_post
Listed near the top in each forum or post level page. This same problem affects the Forum Jump menu on the bottom left and no doubt many other pages, but from personal experience, these are most commonly used ones.

This won't matter to those who use this hack purely to get their pages Googled, but for those using it for AdSense, sessionhashes nearly always mean you'll get charity placeholder banners instead (since the crawler thinks it hasn't cached the page and so can't generate an appropriate banner).

It also looks as though any link on the forum that sends you back to the index via a URL other than http://www.mywebsite.com/forums/index.php (i.e. http://www.mywebsite.com/forums/index.php?s= or just http://www.mywebsite.com/forums/) will generate placeholder banners too. Perhaps this can be avoided as well?
  #276  
Old 07-23-2003, 04:56 AM
buro9 buro9 is offline
 
Join Date: Feb 2002
Location: London, UK
Posts: 585

My reason for all of this is AdSense... I'm not bothered about spidering at all.

I manually removed all mentions of 'sessionhash' as appropriate throughout the whole codebase (php & templates).

There are a few subtle ones that linger... for example, in the replacement variables for the styles: modify the header there to remove the sessionhash from the main image and core navigation.

The page nav bit is buried in admin/functions.php and you can remove the sessionhash from there.

I also then adjusted all of my user options and registration forms to remove the option to not use cookies. And modified the FAQ to say that cookies are compulsory.
  #277  
Old 07-23-2003, 04:59 AM
buro9 buro9 is offline
 
Join Date: Feb 2002
Location: London, UK
Posts: 585

One thing to point out is that even when you successfully remove all sessionhashes... Google spiders still visit with one!

I think their software has learnt vb and just compensates and discards. But this didn't bother me because the lack of sessionhashes and querystrings does help with being cached by proxies (the particularly dumb ones that AOL seem to use). So there is a benefit to it... but not as much as you think there will be.
  #278  
Old 07-23-2003, 10:04 AM
Erwin Erwin is offline
 
Join Date: Jan 2002
Posts: 7,604

Quote:
Today at 03:59 PM buro9 said this in Post #276
One thing to point out is that even when you successfully remove all sessionhashes... Google spiders still visit with one!
What do you mean?
  #279  
Old 07-23-2003, 11:21 AM
buro9 buro9 is offline
 
Join Date: Feb 2002
Location: London, UK
Posts: 585

Oh no, my mistake

In my online.php there was still a place where a sessionhash was being echoed and I incorrectly thought that the spiders were using a hash... but they're not... it's just the display to me of where the spider is that inserted the hash.

Ignore that last bit

Which is good... as now it clearly is working better than I thought.
  #280  
Old 07-23-2003, 11:38 AM
Ogmuk Ogmuk is offline
 
Join Date: Jun 2003
Posts: 113

Thanks for the reply, buro9. I wonder if it's possible to get a step-by-step guide on how to remove all the sessionhashes on every page where it is needed. If you or anyone has that amount of spare time, of course.

filburt1's beta script looks like something that could work with AdSense too, this might be interesting to look into. Did anyone try anything like this out for AdSense?

EDIT: VB3 works like a charm with AdSense. I can't wait for RC1 (just like nearly everyone else here).
  #281  
Old 07-23-2003, 03:35 PM
buro9 buro9 is offline
 
Join Date: Feb 2002
Location: London, UK
Posts: 585

Found another bug:

The admin function to merge threads did not work, because your thread URLs are now in a different format.

postings.php and the action 'domergethread' expected a URL with 'threadid=' in it. But if you've followed all instructions (!) your formats are more similar to:

http://www.bowlie.com/forum/t5249.html
and
http://www.bowlie.com/forum/t5249-15-3.html

So, to fix this, do this:

FIND (in postings.php):
Code:
$getthreadid=intval(substr($mergethreadurl,strpos($mergethreadurl,"threadid=")+9));
And replace with:
Code:
  // HACK : START : SPIDER FRIENDLY URLS
  //$getthreadid=intval(substr($mergethreadurl,strpos($mergethreadurl,"threadid=")+9));
  $getthreadid = intval(preg_replace("/(^.*\/t)|(-[\d]+-[\d]+)|(\.html)/", "", $mergethreadurl));
  // HACK : END : SPIDER FRIENDLY URLS
If your format is slightly different, modify the regexp pattern to match.

All it's doing is stripping out the threadid from the new format URL and putting that in the variable in the same way the old code did.
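
To check the pattern against your own URL format before editing postings.php, the same replacement can be tried in JavaScript. This is only a sketch: `extractThreadId` and `captureThreadId` are made-up names, and the second function is an alternative that captures the id directly rather than stripping everything around it.

```javascript
// Same idea as the preg_replace: strip everything that is not the thread id.
function extractThreadId(url) {
  const stripped = url.replace(/(^.*\/t)|(-\d+-\d+)|(\.html)/g, "");
  // Fall back to 0 when nothing numeric is left, as PHP's intval() would.
  return parseInt(stripped, 10) || 0;
}

// Alternative sketch: capture the id instead of stripping around it.
function captureThreadId(url) {
  const match = /\/t(\d+)(?:-\d+-\d+)?\.html$/.exec(url);
  return match ? parseInt(match[1], 10) : 0;
}

console.log(extractThreadId("http://www.bowlie.com/forum/t5249.html"));      // 5249
console.log(extractThreadId("http://www.bowlie.com/forum/t5249-15-3.html")); // 5249
```

Both functions return 0 for an old-style showthread.php?threadid= URL, so if you still expect those to be pasted into the merge form you would need to keep the original check as a fallback.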

As you'll note, I always leave old-code lying around commented out in case I ever want to roll-back... it's just my style... but if you trust my work you can delete that line.

I also always leave those START and END blocks in, so I can see what the hell I changed and why



Ogmuk, simply put... I started by using the template search to find all templates with 'sessionhash' in them. Then I edited each and every template (nigh on all of them) and removed the applicable code... which usually boils down to:

Code:
s=$session[sessionhash]
And that is buried in nearly all URLs and also in some hidden form fields.

Once removed from all templates, I then searched through all .php files in the root of the forum directories, and similarly replaced all sessionhashes. EXCEPT where I found $dbsession[sessionhash] as this was usually being written TO the cookie and wasn't being echoed.

You will have to read through each instance, but it's obvious that if it's appearing in a URL you can strip it out... but if it's in code then you'll probably want to keep it there.
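
If you'd rather not do every template by hand, the URL case can be sketched as a small text filter. This is a hypothetical helper, not something I ran against a live board: it assumes the templates separate parameters with a plain & and it only handles the parameter inside URLs, so hidden form fields and anything in the PHP files still need reading and editing by eye.

```javascript
// Hypothetical helper: removes s=$session[sessionhash] from URLs in template text.
function stripSessionHash(template) {
  const token = "s=$session[sessionhash]";
  return template
    .split("?" + token + "&").join("?") // first parameter, others follow
    .split("&" + token).join("")        // not the first parameter
    .split("?" + token).join("");       // the only parameter
}

console.log(stripSessionHash("showthread.php?s=$session[sessionhash]&threadid=$threadid"));
// showthread.php?threadid=$threadid
console.log(stripSessionHash("index.php?s=$session[sessionhash]"));
// index.php
```

Because it looks for the exact text s=$session[sessionhash], it leaves $dbsession[sessionhash] alone.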


And lastly... I have AdSense running on my site, and thought I'd share this last tip for you:

AdSense advises you not to place adverts on pages that you have to be logged in to view, or on search results pages. The former is because they'll never correctly spider it and serve relevant adverts (I bet you see ones for password cracking and security!), and the latter is because the pages change too frequently, so by the time a page is spidered it's useless. Both will in effect show inappropriate or public service adverts, which do nothing for your revenue... they lower your click-through rate by increasing impressions, and also generate server load by sending too many spiders your way.

So... I've written some JavaScript to only put adverts on pages that I know I WANT to show AdSense adverts on... here it is for you

Code:
<script type="text/javascript">
var adPages = new Array(
  'forumdisplay.php',
  'index.php',
  'announcement.php',
  'showthread.php',
  'calendar.php',
  'donate.php',
  'misc.php',
  'memberlist.php',
  'vbstats.php',
  'member.php',
  'forum/f',
  'forum/t'
);
var returnAdvert = false;
var pageString = new String();
pageString = document.location.href;
for (var ii = 0; ii < adPages.length; ii++) {
  if (pageString.indexOf(adPages[ii]) >= 0) {
    returnAdvert = true;
    break;
  }
}
if (returnAdvert == false && document.location.href == "http://www.bowlie.com/forum/") {
  returnAdvert = true;
}
if (returnAdvert == true) {
  var google_ad_client='pub-9576666925012421';
  var google_ad_width=468;
  var google_ad_height=60;
  var google_ad_format='468x60_as';
  document.write('<scr'+'ipt type="text/javascr'+'ipt" language="JavaScr'+'ipt" src="http://pagead2.googlesyndication.com/pagead/show_ads.js"></scr'+'ipt>');
}
</script>
All you need to do is change my forum path to your full forum path and put in your ad_client code (otherwise I get your money!).
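
If you want to try the page check outside a browser first, the matching logic can be pulled into a small function. This is just a sketch: `shouldShowAd`, its parameters and the shortened page list are illustrative, not part of the script above.

```javascript
// Hypothetical helper: decides whether an address is on the advert whitelist.
function shouldShowAd(href, adPages, forumRoot) {
  // The bare forum root has no file name in it, so it is checked separately.
  if (href === forumRoot) {
    return true;
  }
  // Show adverts only when the address contains one of the allowed page names.
  return adPages.some(function (page) {
    return href.indexOf(page) >= 0;
  });
}

const pages = ["forumdisplay.php", "showthread.php", "forum/f", "forum/t"];
const root = "http://www.bowlie.com/forum/";

console.log(shouldShowAd(root + "t5249.html", pages, root)); // true
console.log(shouldShowAd(root + "search.php", pages, root)); // false
```

Bear in mind the match is a plain substring test, so an entry like 'forum/t' also matches any other file in the forum directory whose name starts with t.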

Hope all of that info helps everyone.

Cheers

David K