vb.org Archive

Member Bots 1.0 - Allow bots to act as members (https://vborg.vbsupport.ru/showthread.php?t=108070)

trilljester 02-14-2006 10:00 PM

Member Bots 1.0 - Allow bots to act as members
 
--------- Member Bots 1.0 ---------
Written by Trilljester - http://www.trilliummud.com

This small mod lets you treat search engine robots (e.g. GoogleBot, Yahoo Slurp) as registered users instead of guests, which is how vBulletin treats them by default.

This is handy if your site allows limited access to guests but full access to registered users. By default, a search engine robot can only index what a guest can see; with this mod, it can index your site as a registered user would see it.

This mod uses the spiders_vbulletin.xml file to determine whether a visitor is a bot, so make sure that file is up to date. The one that comes with vBulletin is pretty sparse, but it works. If you're only interested in GoogleBot, the default file works great.

WARNING! This mod requires a very small edit to a vBulletin core file. This means that whenever you upgrade to a new version of vBulletin and overwrite this file, you'll need to redo the edit. If the idea of editing a vBulletin core file bothers you, don't install this mod! I won't be held responsible for your forum being fouled up. This mod is very safe, however, so don't fret.

Install Instructions:

1. Unzip the product-memberbot.xml file on your system.

2. Go to your vBulletin AdminCP and click on Manage Products under the Plugin Manager section.

3. Click on Add/Import Product.

4. Click on Browse on the first line (Upload the XML from your computer) and locate the product-memberbot.xml file. Click on Import to upload it to the system.

5. Click on Plugin Manager, and ensure that the new plugin named Bot Checker, which is found under the init_startup hook, is enabled.

6. Now the fun part: you'll need to edit the core vBulletin file init.php, which is located in the includes directory under your forum home.

For example, if your forum home is located in /home/blah/forum, then init.php is in /home/blah/forum/includes

If you can edit the file on the server without having to download it, more power to you.

In init.php, find this line (it should be near the very bottom of the file; it was line 403 in mine):

Code:

if (!empty($db->explain))

Add ABOVE this line:

Code:

// Put detected bots into usergroup 2 (see the NOTE below about group IDs)
if ($is_bot == 1) {
        $vbulletin->userinfo['usergroupid'] = 2;
}

NOTE: My registered users group is ID #2. Yours may be different; you can change this to any group ID you like.

7. Save init.php and upload it back to your server if you had to download the file to edit it.

If you're editing on the server, save it, and that's it!
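
As a rough sketch of what the step 6 edit does, here is a standalone version that runs outside vBulletin. The stdClass object only stands in for the real $vbulletin registry, $bot_usergroupid is a made-up variable name, and the guest group ID of 1 is an assumption about a default install. It is not the mod's actual code.

```php
<?php
// Stand-in for vBulletin's registry object so this sketch runs on its own.
$vbulletin = new stdClass();
// Assumption: guests start out in usergroup 1 on a default install.
$vbulletin->userinfo = array('userid' => 0, 'usergroupid' => 1);

$is_bot = 1;           // in the real mod this flag is expected to be set before init.php reaches the edit
$bot_usergroupid = 2;  // hypothetical variable: the group bots should borrow

// Same idea as the init.php edit, but tolerant of $is_bot being unset.
if (!empty($is_bot)) {
    $vbulletin->userinfo['usergroupid'] = $bot_usergroupid;
}

echo $vbulletin->userinfo['usergroupid'], "\n"; // prints 2
```

Keeping the group ID in one variable makes it easier to switch bots to a different group later.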

==========================================================
That's it! Search Engine bots should now "see" your forum as a regular user, not a guest.

Questions, comments, and improvements are more than welcome. Please use this thread as a central support site. It makes it easier for me to help out. You may also visit my site and fire me off a PM for help, or just send a quick thanks or shout-out.

Enjoy!
// trilljester

trilljester 02-15-2006 09:18 PM

Reserving for future use. If any....

bashy 02-15-2006 09:53 PM

Hi

Me installed...
I take it the bots are now registered users and not showing as guests anymore?

Smiry Kin's 02-15-2006 10:04 PM

Hmm, I read somewhere that doing this was against the bots' agreements, etc.?

trilljester 02-15-2006 10:08 PM

Quote:

Originally Posted by bashy
Hi

Me installed...
I take it the bots are now registered users and not showing as guests anymore?

No, they'll still show up as Guest, but their usergroup will be that of a member, thus allowing them to see what members can see.

trilljester 02-15-2006 10:19 PM

Quote:

Originally Posted by Smiry Kin's
Hmm, I read somewhere that doing this was against the bots' agreements, etc.?

This is a well-known and well-liked feature of IPB that, in my opinion, is sorely lacking in VBulletin. I sent a quick email to Google regarding this issue and they responded that this is not an issue to them since they don't personally review every site that GoogleBot spiders.

If you install this mod and see issues with Google/Yahoo/etc... then remove it and everything will go back to the way it was.

PixelFx 02-15-2006 10:47 PM

this kicks a$$, thanks for releasing it

mightyb 02-15-2006 11:19 PM

Quote:

Originally Posted by Smiry Kin's
Hmm, I read somewhere that doing this was against the bots' agreements, etc.?

Good point! This might be considered search engine cloaking, because spiders will see a different page from what regular visitors (guests) see. But I don't know...

Smiry Kin's 02-15-2006 11:41 PM

Well, it happened to one of my sites. Google has now blocked it from their search engine. I'm just saying, be careful. :) In fact, they banned my whole AdSense account, and they still won't reply to my emails to this day.

Ramsesx 02-16-2006 02:29 AM

Hm, I would very much like to install it, but most of my visitors currently come from Google, and it would be a disaster if they banned my site. :ermm:
@trilljester
could you be so kind and post exactly what google wrote to you?
Thanks

GamerJunk.net 02-16-2006 02:41 AM

Amazing! Any plans to have them show as users on the memberlist like GoogleBot and YahooSlurp?

AshokForums.com 02-16-2006 02:44 AM

thanks for this excellent hack.. love it

kall 02-16-2006 02:56 AM

Quote:

Originally Posted by Smiry Kin's
Hmm, I read somewhere that doing this was against the bots' agreements, etc.?

Highly unlikely.

If this is cloaking, having a Guest usergroup is in and of itself, cloaking.

If a registered user happens to come across the Google Cache of one of your pages, and goes to visit it, they see something different as well.

trilljester 02-16-2006 05:57 AM

Here's what Google wrote back to me:

Quote:

Thank you for your note. We recognize your concern. Please be advised that
we don't personally review individual sites, nor do we comment on
webmaster techniques or the details of our search technology beyond what
appears on our site.

We've dedicated an entire section of our site to answering the most common
questions from those who maintain and/or promote websites. You'll find all
of our publicly available information posted at
http://www.google.com/webmasters/index.html

I did some more research on their site, and nowhere in Google's definitions does this qualify as a "bad" practice. It could be loosely, and I mean very loosely, defined as cloaking, in the sense that two sets of content are being presented. However, you are not hiding content from anyone.

There are a lot of websites out on the net that require you to register or even pay to see content. Yet they're indexed and served in Google. If this were a rampant violation of their terms of service, they'd clearly state it. Again, it's your choice to use this mod or not. If you are extremely worried, then don't use it. Simple as that!

Onwards:

Quote:

Amazing! Any plans to have them show as users on the memberlist like GoogleBot and YahooSlurp?

Well, on my site, when Googlebot shows up, it displays them as their true identity on the Who's Online display. It's probably not a good idea to make them actual members, they're just borrowing the members' group.

Zia 02-16-2006 06:01 AM

Hmm, I don't know whether it would be better to install it or not...
Somewhere on a famous webmaster forum I read that "cloaking" for spiders/bots might be a reason for a site to get banned from the search engine...

...looks for experts' comments....

Corriewf 02-16-2006 06:28 AM

This is a great mod!

Think about the logic you guys are preaching. By the same logic, we could get banned for an indexed thread that has been moved to a private area..... There is no black and white here, but nothing in life is. If the bot has indexed material that is available on your site, then you're fine by Google's TOS. Cloaking is presenting material that does not actually exist on your site.

Install and have fun! ;)

Cap'n Steve 02-16-2006 06:46 AM

If you're worried about the engines getting annoyed with you, the Search Engine-Friendly Archive is much shadier than this.

Snake 02-16-2006 08:18 AM

Thank you very much, just installed the hack and it works great! ;)

kofoid 02-16-2006 02:30 PM

YAY YAY YAY! I have been wanting this for a while - THANK YOU

bashy 02-16-2006 04:04 PM

Hi

I still have bots getting the No Permissions when trying to view a forum that users are allowed to view?

trilljester 02-16-2006 04:22 PM

Quote:

Originally Posted by bashy
Hi

I still have bots getting the No Permissions when trying to view a forum that users are allowed to view?

Are the bots in your spiders_vbulletin.xml file (located in includes/xml)? This mod uses this file to determine if the visitor is a bot or not.

ScienceofSpock 02-17-2006 06:04 AM

Just a quick question/security concern:
How are you identifying SE bots? Are you using the user-agent?
If so, are bots allowed to post (considering they're regarded as regular members) ?
If so, What's to stop me loading up Opera, changing my user-agent to googlebot and posting on your forums?

I'm not trying to be sarcastic here, I'd really like to add this to my forum. I'm just trying to play devil's advocate and perform due diligence.

trilljester 02-17-2006 04:38 PM

Quote:

Originally Posted by ScienceofSpock
Just a quick question/security concern:
How are you identifying SE bots? Are you using the user-agent?
If so, are bots allowed to post (considering they're regarded as regular members) ?
If so, What's to stop me loading up Opera, changing my user-agent to googlebot and posting on your forums?

I'm not trying to be sarcastic here, I'd really like to add this to my forum. I'm just trying to play devil's advocate and perform due diligence.

Yes, the modification uses the spiders_vbulletin.xml and checks the user-agent to determine if you're a bot or not. And if you follow the default installation, you'll be allowing bots or humans who are clever enough to change their user-agent to post on your site.

HOWEVER, you can set whatever group you want the bots/humans to be in. If you want the bots to be able to see your site but not post, create a new usergroup that doesn't allow posting, just viewing. Which is what I would recommend in any case, so nobody can spoof a bot and post on your forums.

To change to that group, just modify the code you added in init.php and change the group ID to whatever ID you have set up for that special group.
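
To illustrate the user-agent check described above, here's a minimal sketch. It is not the actual Bot Checker plugin code: memberbot_is_spider() and the hard-coded ident list are hypothetical (the mod itself takes its idents from spiders_vbulletin.xml), and it assumes PHP 5 or later for stripos().

```php
<?php
// Hypothetical helper: true if the user-agent contains any known spider ident.
function memberbot_is_spider($userAgent, array $idents)
{
    foreach ($idents as $ident) {
        // Case-insensitive substring match; skip empty idents so they never match everything.
        if ($ident !== '' && stripos($userAgent, $ident) !== false) {
            return true;
        }
    }
    return false;
}

// Idents copied from the spider entries discussed in this thread.
$idents = array('Googlebot/', 'Mediapartners-Google');

$realBot = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';
$browser = 'Opera/8.5 (Windows NT 5.1; U; en)';

var_dump(memberbot_is_spider($realBot, $idents)); // bool(true)
var_dump(memberbot_is_spider($browser, $idents)); // bool(false)
```

Note that a browser sending the first string as its user-agent passes this check just as well as the real Googlebot does, which is exactly why a view-only group is the safer choice.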

lionheart53 02-17-2006 07:34 PM

I want only the spider for google adsense to get through and not the others. So I tried changing the spiders_vbulletin.xml to just what's below but then my pages in the forum don't come up at all. Any ideas why? Is there a better way to do this?

<?xml version="1.0" encoding="ISO-8859-1"?>

<searchspiders>
<spider ident="Mediapartners-Google">
<name>Google AdSense</name>
<info>https://www.google.com/adsense/faq</info>
<email>adsense-support@google.com</email>
</spider>
</searchspiders>

<!-- CVS: $RCSfile: spiders_vbulletin.xml,v $ - $Revision: 1.2 $ -->

trilljester 02-17-2006 08:02 PM

Quote:

Originally Posted by lionheart53
I want only the spider for google adsense to get through and not the others. So I tried changing the spiders_vbulletin.xml to just what's below but then my pages in the forum don't come up at all. Any ideas why? Is there a better way to do this?

For some reason not entirely clear to me, VBulletin doesn't handle a single entry in the spiders_vbulletin.xml file very well. So, just add another Spider in the file. For your example, I added GoogleBot as well as Google Adsense.

Try this spiders_vbulletin.xml:

Code:

<?xml version="1.0" encoding="ISO-8859-1"?>

<spiderlist version="1.0">
        <spider ident="Mediapartners-Google">
                <name>Google AdSense</name>
                <type>searchspider</type>
                <info>https://www.google.com/adsense/faq</info>
                <email>adsense-support@google.com</email>
        </spider>
        <spider ident="Googlebot/">
                <name>Google</name>
                <type>searchspider</type>
                <info>http://www.google.com/bot.html</info>
                <email>googlebot@google.com</email>
                <addresses>
                        <address type="range">64.233.160.0-64.233.191.255</address>
                        <address type="range">66.249.64.0-66.249.95.255</address>
                        <address type="range">72.14.192.0-72.14.207.255</address>
                        <address type="range">216.239.32.0-216.239.63.255</address>
                </addresses>
        </spider>
</spiderlist>
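
The <addresses> ranges in the file above suggest one way to make spoofing harder: also compare the visitor's IP address against the listed ranges. Here's a minimal sketch, assuming IPv4 only. memberbot_ip_in_range() and memberbot_ip_in_any_range() are hypothetical helpers, not part of vBulletin or this mod, and the ranges are simply the ones posted above, which may not stay accurate.

```php
<?php
// Hypothetical helper: true if the IPv4 address $ip lies within "start-end".
function memberbot_ip_in_range($ip, $range)
{
    $parts = explode('-', $range);
    if (count($parts) !== 2) {
        return false;
    }
    $ipLong    = ip2long($ip);
    $startLong = ip2long(trim($parts[0]));
    $endLong   = ip2long(trim($parts[1]));
    if ($ipLong === false || $startLong === false || $endLong === false) {
        return false; // not a valid IPv4 address or range
    }
    // Convert to unsigned values so the comparison also works on 32-bit PHP builds.
    $ipNum    = (float) sprintf('%u', $ipLong);
    $startNum = (float) sprintf('%u', $startLong);
    $endNum   = (float) sprintf('%u', $endLong);
    return $ipNum >= $startNum && $ipNum <= $endNum;
}

// Hypothetical helper: true if $ip falls inside at least one of the ranges.
function memberbot_ip_in_any_range($ip, array $ranges)
{
    foreach ($ranges as $range) {
        if (memberbot_ip_in_range($ip, $range)) {
            return true;
        }
    }
    return false;
}

// Ranges copied from the Googlebot entry above.
$googlebotRanges = array(
    '64.233.160.0-64.233.191.255',
    '66.249.64.0-66.249.95.255',
    '72.14.192.0-72.14.207.255',
    '216.239.32.0-216.239.63.255',
);

var_dump(memberbot_ip_in_any_range('66.249.66.1', $googlebotRanges)); // bool(true)
var_dump(memberbot_ip_in_any_range('192.0.2.10', $googlebotRanges));  // bool(false)
```

A check like this could be combined with the user-agent match, so that a visitor only counts as a bot when both agree.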


lionheart53 02-17-2006 08:07 PM

Thanks. That resolved it. Very weird but I can work with that.

trilljester 02-17-2006 08:19 PM

Quote:

Originally Posted by lionheart53
Thanks. That resolved it. Very weird but I can work with that.

Yeah, it is weird. I'll have to look into it a little more. I might have to update the mod to use its own XML file for which bots you want in or not, so as not to mess with the original file.

Maybe if I have time this weekend...
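
If the mod did get its own XML file, reading it could look roughly like this. This is a sketch only, assuming PHP 5's SimpleXML extension and the same <spiderlist> / <spider ident="..."> layout as the file posted earlier. memberbot_load_idents() is a hypothetical name, and this is not how vBulletin itself parses the file.

```php
<?php
// Hypothetical helper: parse a spider list and return the ident strings.
function memberbot_load_idents($xml)
{
    $list = simplexml_load_string($xml);
    if ($list === false) {
        return array(); // malformed XML: treat as "no known spiders"
    }
    $idents = array();
    // Iterating works the same way for one <spider> entry or many.
    foreach ($list->spider as $spider) {
        $idents[] = (string) $spider['ident'];
    }
    return $idents;
}

// A list with a single entry, like the AdSense-only case discussed earlier.
$xml = <<<XML
<spiderlist version="1.0">
    <spider ident="Mediapartners-Google">
        <name>Google AdSense</name>
        <type>searchspider</type>
    </spider>
</spiderlist>
XML;

echo implode(', ', memberbot_load_idents($xml)), "\n"; // prints Mediapartners-Google
```

A separate file parsed this way would leave the stock spiders_vbulletin.xml untouched, so upgrades couldn't overwrite the bot list.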

Zia 02-18-2006 03:55 AM

Quote:

Originally Posted by trilljester
For some reason not entirely clear to me, VBulletin doesn't handle a single entry in the spiders_vbulletin.xml file very well. So, just add another Spider in the file. For your example, I added GoogleBot as well as Google Adsense.

Try this spiders_vbulletin.xml:

Code:

<?xml version="1.0" encoding="ISO-8859-1"?>

<spiderlist version="1.0">
        <spider ident="Mediapartners-Google">
                <name>Google AdSense</name>
                <type>searchspider</type>
                <info>https://www.google.com/adsense/faq</info>
                <email>adsense-support@google.com</email>
        </spider>
        <spider ident="Googlebot/">
                <name>Google</name>
                <type>searchspider</type>
                <info>http://www.google.com/bot.html</info>
                <email>googlebot@google.com</email>
                <addresses>
                        <address type="range">64.233.160.0-64.233.191.255</address>
                        <address type="range">66.249.64.0-66.249.95.255</address>
                        <address type="range">72.14.192.0-72.14.207.255</address>
                        <address type="range">216.239.32.0-216.239.63.255</address>
                </addresses>
        </spider>
</spiderlist>


Nice support! Would you mind releasing an updated spiderlist.xml?

I trust you've tweaked/customized your spider list. Can you release it?
In fact, nowadays no one is taking care of the spider list...

dutchbb 02-18-2006 03:28 PM

It's a black-hat technique.

Google says not to optimize your pages for spiders; if you do, you are taking a high risk...

trilljester 02-18-2006 10:59 PM

Quote:

Originally Posted by Zia
Nice support! Would you mind releasing an updated spiderlist.xml?

I trust you've tweaked/customized your spider list. Can you release it?
In fact, nowadays no one is taking care of the spider list...

I downloaded the one offered at vbulletin.com - here's the link:

http://www.ragnarokonline.de/spiderlist/spiderlist.xml

Upload it into includes/xml, name it spiders_vbulletin.xml

You could also name it that before uploading and overwrite the one in that directory. I keep one called spiders_vbulletin.xml.new in includes/xml so when upgrading VB versions, I can just copy that one over the default one given in VB.

Andreas 02-19-2006 05:56 PM

Attention
Using this hack is to be considered cloaking your site, and you risk being banned from search engine indexes, as recently happened to BMW.

trilljester 02-19-2006 09:37 PM

Andreas: Do you have official word from the search engines that this is cloaking?

Also, how do you explain the thousands of Invision based boards (which have this built in) that do exactly what this mod does, yet are NOT banned from the search engines? If this is cloaking, then what do you consider the "search engine friendly" archive as mentioned earlier in this thread?

Andreas 02-20-2006 04:50 PM

http://www.google.de/intl/us/webmasters/guidelines.html

Quote:

Make pages for users, not for search engines. Don't deceive your users or present different content to search engines than you display to users, which is commonly referred to as "cloaking."

(I am pretty sure other search engines have similar terms)

If a search engine indexes a thread that is only accessible to registered users (or, even worse, only to paying users), and I access your site as a guest, I do get different content (a no permission error).
As you are intentionally allowing search engines to spider content that is not publicly accessible (to attract more users than there would be if you only had public content indexed), this is to be considered cloaking.

The search engine friendly archive is different in 2 ways:
a) I do get the same content through the archive as I do get through the forum
b) If I, as a Guest, access the archive I do get the exact same output as does a search engine spider.

Of course there are thousands of sites out there that are cloaking and still are indexed ... how many pages does Google have indexed?
Just because they have not been discovered and removed yet does not mean that they never will be.

If you come across sites that are cloaking, report them - this increases the chance of getting them removed.

lionheart53 02-20-2006 04:54 PM

I'm not an expert in this and don't know about the Google spider for content, but I do know that for adsense the contact at google I was emailing with specifically mentioned that it was okay for me to have adsense ads if I could set up a way for the google adsense spider to get in to the data even though the content normally requires login. So at least for that one bot I'm assuming they wouldn't call it cloaking since they told me to do it.

I'd be surprised if they called it cloaking if a user can get to that same data by logging in since you're then not giving different data, just requiring login. For paying for access, that may be something different since a regular person couldn't get to that data.

trilljester 02-20-2006 05:29 PM

Andreas: Would you agree that allowing a spider to see content that only registered (not paying) users can see is not technically cloaking? All the incoming user has to do is register to see the content.

Maybe a mod on top of this one would be a "snippet" where the incoming visitor could see some of the first post in a thread, and would have to register to see the full content.

If you are charging users to see your forums and using this hack, then I agree 100%, it is cloaking, because you are now requiring a user to pay to see the content.

But for a simple 30 second registration requirement, that's really stretching the definition of "cloaking". One good example of why this could not be defined as cloaking is the New York Times. To see articles, you must create a free account. Yet, NYT articles pop up on search engines all the time. You'd think Google or Yahoo would be a little more inclined to ban a major site like the NYT as opposed to a small site like mine.

We could sit here and debate this until we turn blue. I go with my original edict. If you are worried about being banned from search engines then don't install this hack.

Andreas 02-20-2006 05:38 PM

It would still be cloaking, as I as a guest user cannot view the content.

Being able to view it when logged in is not the solution; I as a user coming from a search engine result might not even be able to register at all, or I might not want to do this.

EoD from my side :)

Andreas 02-21-2006 03:34 AM

Idea to make this a Plugin (untested):

Create a Bot User, set $_COOKIE[COOKIE_PREFIX . 'userid'] and $_COOKIE[COOKIE_PREFIX . 'password'] in init_startup accordingly.

Moya 02-21-2006 07:41 AM

Quote:

Originally Posted by Andreas
It would still be cloaking, as I as a guest user cannot view the content.

Being able to view it when logged in is not the solution; I as a user coming from a search engine result might not even be able to register at all, or I might not want to do this.

EoD from my side :)


Hehehehe, I don't mean to attack him or his hack, but I think we can test this out. We can try to search for subjects related to his link. Once we find one that requires a login to access the content, we report it to Google and ask for their opinion on whether this is cloaking or not.

:D :D just my lousy 0.02

trilljester 02-21-2006 04:59 PM

Quote:

Originally Posted by Moya
Hehehehe, I don't mean to attack him or his hack, but I think we can test this out. We can try to search for subjects related to his link. Once we find one that requires a login to access the content, we report it to Google and ask for their opinion on whether this is cloaking or not.

:D :D just my lousy 0.02

Example of what I'm talking about:

http://www.nytimes.com/2006/02/14/po...hitehouse.html

I did a search on Google for: cheney shooting + new york times

The site comes back with a 1 sentence snippet from the article, but requires a registration to read the full article.

So, is it cloaking?

treasureman 02-21-2006 10:52 PM

trilljester,

I want to thank you for this mod! I have installed it and it is working beautifully! This is a standard feature with Invision board and I used it for 3 years with no problems.

For the vBulletin developers who are so dead set against this mod: It is NOT cloaking! Cloaking is when you show different content to search engines than to users. With this mod we are showing the SAME content; it's just that the users have to register to see a little deeper into the site.

Cloaking is showing information to the search engines that is DIFFERENT or unrelated, such as pages stuffed with keywords or pages that are not relevant to your content, just to increase ranking. In other words cloaking is showing misleading content. This is not what this mod is about.


All times are GMT.
