Stop Spammers with rel=nofollow in URLs!
Version: 1.00, by kall
Developer Last Online: Aug 2021

Version: 3.0.5
Released: 01-19-2005 Last Update: Never Installs: 41
 
No support by the author.

In their first cooperative move in nearly ten years, the major search engines have unveiled a new indexing command for web authors that they all recognize. They hope it will help reduce the link and comment spam that plagues many websites by removing the point of posting it in the first place.

The new "nofollow" attribute value for links originated as an idea at Google in late 2004. MSN and Yahoo, as well as major blogging vendors, have since jumped on board.

The Nofollow Attribute

The new attribute value is "nofollow", written as rel="nofollow" inside an anchor tag.
When added to a link, it serves as a flag telling the search engines that the link has not been explicitly approved by the site owner, so they should not follow it or pass on any of the PageRank of the referring page (the one on your site).

For example, this is how the HTML markup for an ordinary link might look:

<a href="http://www.somedomain.com/page.html">My forums are the best lol lol lol click here!!</a>

This is how the link would look after the nofollow attribute has been added:

<a href="http://www.somedomain.com/page.html" rel="nofollow">My forums are the best lol lol lol click here!!</a>

This would also be acceptable, as the order of attributes within the anchor tag makes no difference:

<a rel="nofollow" href="http://www.site.com/page.html">Visit My Page</a>

Once it is added, the search engines that support the attribute will understand that the link has not been approved by the site owner.

Think of it as a way to flag to them, "I didn't post this link -- someone else did."

Quote:
Originally Posted by Alkatraz
If Google sees nofollow as part of a link, it will:

1. NOT follow through to that page.
2. NOT count the link in calculating PageRank link popularity scores.
3. NOT count the anchor text in determining what terms the page being linked to is relevant for.
The site that is being linked to will gain nothing from the link, so the whole point of doing it in the first place is removed.

WHAT WILL THIS DO, IN ESSENCE?

This will affect URLs in posts as well as signatures: anything that goes through the bbcodeparse function, as far as I can tell. It also works retroactively, meaning it will affect all existing posts and signatures, or it did for me anyway.

Update:

Thanks to Michael Morris and natez0rz for pointing out that using the $post global would be a much better idea.

To change the conditional number of posts, alter
PHP Code:
OR $post['posts'] > 50
to whatever you like.

It should work with all vB 3.0.x versions, but was tested on 3.0.6.

Files to modify: 1

1/ Open your includes/functions_bbcodeparse.php file

Find:
PHP Code:
    if ($type == 'url')
    {
        // standard URL hyperlink
        return "<a href=\"$rightlink\" target=\"_blank\">$text</a>";
    }
    else
    {
        // email hyperlink (mailto:)
Replace with:
PHP Code:
    if ($type == 'url')
    {
        global $post;

        if (
            is_member_of($post, 6) // Admins are exempt
            OR is_member_of($post, 5) // Super Moderators are exempt
            OR is_member_of($post, 7) // Moderators are exempt
            OR $post['posts'] > 50 // People with over 50 posts are exempt
        )
        {
            // standard URL hyperlink
            return "<a href=\"$rightlink\" target=\"_blank\">$text</a>";
        }
        else
        {
            // link from a non-exempt poster: tell search engines not to follow it
            return "<a href=\"$rightlink\" rel=\"nofollow\" target=\"_blank\">$text</a>";
        }
    }
    else
    {
        // email hyperlink (mailto:)
2/ Save and Upload.

3/ Relax, safe in the knowledge that spammers linking from your site are doing so for no reason whatsoever.

4/ Edit: exclude staff usergroups and members with over 50 posts.
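
For anyone who wants to reason about the exemption rule outside of vBulletin, here is a minimal sketch of the same decision in Python. It is only an illustration, not part of the modification: the names `Poster`, `is_exempt` and `render_link` are hypothetical, the group IDs and the 50-post threshold simply mirror the values used above, and it checks one group ID per poster, which is a simplification.

```python
from dataclasses import dataclass
from html import escape

# Values mirrored from the modification above; adjust to taste.
EXEMPT_GROUPS = {5, 6, 7}   # staff usergroup IDs
POST_THRESHOLD = 50         # posters with MORE than this many posts are exempt


@dataclass
class Poster:
    """Hypothetical stand-in for the poster data the forum passes around."""
    usergroup_id: int
    posts: int


def is_exempt(poster: Poster) -> bool:
    """True if links from this poster should NOT get rel="nofollow"."""
    return poster.usergroup_id in EXEMPT_GROUPS or poster.posts > POST_THRESHOLD


def render_link(url: str, text: str, poster: Poster) -> str:
    """Build the anchor tag, adding rel="nofollow" for non-exempt posters."""
    rel = "" if is_exempt(poster) else ' rel="nofollow"'
    return f'<a href="{escape(url, quote=True)}"{rel} target="_blank">{escape(text)}</a>'


if __name__ == "__main__":
    print(render_link("http://www.somedomain.com/page.html", "click here", Poster(2, 3)))
    print(render_link("http://www.somedomain.com/page.html", "click here", Poster(6, 0)))
```

Note that a poster with exactly 50 posts is still tagged, because the comparison is "greater than", the same as in the PHP condition.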

Show Your Support

  • This modification may not be copied, reproduced or published elsewhere without author's permission.

Comments
  #12  
01-20-2005, 04:43 PM
kall
Join Date: Apr 2004
Location: New Zealand
Posts: 2,608

Quote:
Originally Posted by neocorteqz
thanks.

one small question.

How do we verify it's working?
View the source of any page with a posted URL or a signature.

CTRL-F for nofollow.
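
If you would rather not eyeball the page source, the same check can be scripted. The following is a rough sketch in Python using only the standard library; the names `LinkAudit` and `audit_links` are made up for this example, and it assumes you have already saved or fetched the page HTML as a string.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collects every <a href=...> and whether its rel contains "nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, has_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return  # named anchors and the like
        # rel is a space-separated token list, so split before checking
        rel_tokens = (attr.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def audit_links(html_source: str):
    """Return [(href, has_nofollow), ...] for all links in the given HTML."""
    parser = LinkAudit()
    parser.feed(html_source)
    parser.close()
    return parser.links


if __name__ == "__main__":
    sample = (
        '<a href="http://www.somedomain.com/page.html" rel="nofollow" '
        'target="_blank">spam</a> <a href="index.php">home</a>'
    )
    for href, tagged in audit_links(sample):
        print("nofollow" if tagged else "followed", href)
```

Links that show up as "followed" are either the forum's own navigation links or were posted by an exempt member.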

  #13  
01-20-2005, 06:14 PM
neocorteqz
Join Date: May 2002
Location: Barefoot Bay Fl
Posts: 473

Quote:
Originally Posted by kall
View the source of any page with a posted URL or a signature.

CTRL-F for nofollow.
thanks.

  #14  
01-20-2005, 07:30 PM
yoyoyoyo
Join Date: Dec 2004
Location: USA
Posts: 1,612

ADDON "HACK": You can also add the "noindex, nofollow" rule to each page header, as well as to the URL.

Go to your admin control panel, open the style manager, choose to edit the headinclude template, and look for:

PHP Code:
<meta http-equiv="Content-Type" content="text/html; charset=$stylevar[charset]" />
and add the following AFTER:
PHP Code:
<meta name="robots" content="noindex, nofollow" />
If you want to have the page indexed but still keep the nofollow rule in effect, use this instead:
PHP Code:
<meta name="robots" content="nofollow" />
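
To confirm which robots directives a page actually ends up serving after a template edit like this, something along these lines could be used. It is a small Python sketch with the standard library only; `RobotsMeta` and `robots_directives` are hypothetical names, and the page HTML is assumed to be available as a string.

```python
from html.parser import HTMLParser


class RobotsMeta(HTMLParser):
    """Gathers the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        # content is a comma-separated list such as "noindex, nofollow"
        for token in (attr.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html_source: str):
    """Return the set of robots meta directives found in the HTML."""
    parser = RobotsMeta()
    parser.feed(html_source)
    parser.close()
    return parser.directives


if __name__ == "__main__":
    head = '<head><meta name="robots" content="noindex, nofollow" /></head>'
    print(sorted(robots_directives(head)))
```

A value written with a space, such as "no follow", would show up here as the token `no follow`, which makes that kind of typo easy to spot.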

  #15  
01-20-2005, 08:39 PM
Erwin
Join Date: Jan 2002
Posts: 7,604

Interesting.

  #16  
01-20-2005, 08:43 PM
Dean C
Join Date: Jan 2002
Location: England
Posts: 9,071

Not to criticise your modification but I'd say this was a poor way of implementing this. As soon as you put no follow on the links it'll:

Quote:
1. NOT follow through to that page.
2. NOT count the link in calculating PageRank link popularity scores.
3. NOT count the anchor text in determining what terms the page being linked to is relevant for.
Also, your addon will mean Google will not try to index the page. Maybe I'm missing something here, but why on earth would you not want the search engines to index your page? The only use for this will be on blog comment pages. Just because a spambot sees rel="nofollow" inside your link does not mean it won't spam the email.

  #17  
01-20-2005, 09:01 PM
kall
Join Date: Apr 2004
Location: New Zealand
Posts: 2,608

Quote:
Originally Posted by Dean C
Not to criticise your modification but I'd say this was a poor way of implementing this. As soon as you put no follow on the links it'll:



Also, your addon will mean Google will not try to index the page. Maybe I'm missing something here, but why on earth would you not want the search engines to index your page? The only use for this will be on blog comment pages. Just because a spambot sees rel="nofollow" inside your link does not mean it won't spam the email.
*seethes at criticism*

But seriously...it's not spambots that are 'targeted' by this hack, it's the Spammers that send them out.

The theory goes that if people were to implement this idea, there would be no reason for the Spammers to send out the bots in the first place. At least, it removes the advantage of having PR from the sites they are spamming be added to their site.

How would you go about implementing this?

Regarding the addon: I don't know why he is suggesting to have noindex in the header of each page...not something I would do myself.

  #18  
01-20-2005, 09:21 PM
yoyoyoyo
Join Date: Dec 2004
Location: USA
Posts: 1,612

Quote:
Originally Posted by Dean C
Also, your addon will mean Google will not try to index the page. Maybe I'm missing something here, but why on earth would you not want the search engines to index your page? The only use for this will be on blog comment pages. Just because a spambot sees rel="nofollow" inside your link does not mean it won't spam the email.
Quote:
Originally Posted by kall
Regarding the addon: I don't know why he is suggesting to have noindex in the header of each page...not something I would do myself.
Not to hijack the thread, but there are many good reasons not to be indexed, and it is up to each person to decide whether they want to be or not. That is why I gave the alternative of including only the "nofollow" meta tag instead of both noindex and nofollow. It seems to me that doing one without the other (putting nofollow in the URL but not in the meta tag) is only half of the solution.

You can also tell the spider to ignore only specific parts of your site in a few different ways. One way is to use a "robots.txt" file. The robots.txt is a TEXT file (not HTML!) which has a section for each robot to be controlled. Each section has a user-agent line which names the robot to be controlled and has a list of "disallows" and "allows". Each disallow will prevent any address that starts with the disallowed string from being accessed. Similarly, each allow will permit any address that starts with the allowed string from being accessed. The (dis)allows are scanned in order, with the last match encountered determining whether an address is allowed to be used or not. If there are no matches at all then the address will be used.

Using a robots.txt file is easy. If your site is located at:
http://domain.com/mysite/index.html
you will need to be able to create a file located here:
http://domain.com/robots.txt

Here's an example:

Code:
user-agent: FreeFind
   disallow: /mysite/test/
   disallow: /mysite/cgi-bin/post.cgi?action=reply
   disallow: /a
In this example the following addresses would be ignored by the spider:

Code:
http://domain.com/mysite/test/index.html
   http://domain.com/mysite/cgi-bin/post.cgi?action=reply&id=1
   http://domain.com/mysite/cgi-bin/post.cgi?action=replytome
   http://domain.com/abc.html
and the following ones would be allowed:

Code:
http://domain.com/mysite/test.html
   http://domain.com/mysite/cgi-bin/post.cgi?action=edit
   http://domain.com/mysite/cgi-bin/post.cgi
   http://domain.com/bbc.html
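
The allowed and ignored lists above can be checked mechanically. The sketch below uses Python's standard `urllib.robotparser` module as a stand-in; it is not FreeFind's own parser and its matching rules are not identical in every case, but for a file like this one that contains only disallow lines it should give the same answers. The helper name `is_allowed` is made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The first example robots.txt from above, verbatim apart from indentation.
ROBOTS_TXT = """\
user-agent: FreeFind
disallow: /mysite/test/
disallow: /mysite/cgi-bin/post.cgi?action=reply
disallow: /a
"""


def is_allowed(url: str, agent: str = "FreeFind") -> bool:
    """Ask the standard-library parser whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://domain.com/mysite/test/index.html",
        "http://domain.com/abc.html",
        "http://domain.com/mysite/test.html",
        "http://domain.com/bbc.html",
    ):
        print("allowed" if is_allowed(url) else "ignored", url)
```

Each disallow is a plain prefix match on the path, which is why /abc.html is caught by "disallow: /a" while /bbc.html is not.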
It is also possible to use an "allow" in addition to disallows. For example:

Code:
user-agent: FreeFind
   disallow: /cgi-bin/
   allow: /cgi-bin/Ultimate.cgi
   allow: /cgi-bin/forumdisplay.cgi
This robots.txt file prevents the spider from accessing any cgi-bin address except Ultimate.cgi and forumdisplay.cgi.

Using allows can often simplify your robots.txt file.
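
The "last match wins" rule described earlier is easy to get wrong, so here is a minimal Python sketch of it. This is written from the description in this post only, not from any spider's actual source, and `last_match_allowed` is a hypothetical name. Other robots.txt parsers may resolve allow/disallow conflicts differently (Python's own `urllib.robotparser`, for instance, stops at the first matching rule), so treat this as an illustration of the rule as stated here.

```python
def last_match_allowed(path, rules):
    """Apply allow/disallow prefixes in file order; the last match decides.

    rules is a list of (kind, prefix) pairs, kind being "allow" or "disallow".
    If nothing matches at all, the address is used (allowed).
    """
    allowed = True
    for kind, prefix in rules:
        # An empty prefix (a bare "disallow:") restricts nothing, so skip it.
        if prefix and path.startswith(prefix):
            allowed = (kind == "allow")
    return allowed


# The allow example from above, in file order.
CGI_RULES = [
    ("disallow", "/cgi-bin/"),
    ("allow", "/cgi-bin/Ultimate.cgi"),
    ("allow", "/cgi-bin/forumdisplay.cgi"),
]

if __name__ == "__main__":
    for path in ("/cgi-bin/Ultimate.cgi", "/cgi-bin/post.cgi", "/index.html"):
        print(path, last_match_allowed(path, CGI_RULES))
```

Because the broad disallow comes first and the specific allows come after it, the allows override it for those two scripts only.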

Here's another example which shows a robots.txt with two sections in it. One for "all" robots, and one for the FreeFind spider:

Code:
user-agent: *
   disallow: /cgi-bin/

   user-agent: FreeFind
   disallow:
In this example all robots except the FreeFind spider will be prevented from accessing files in the cgi-bin directory. FreeFind will be able to access all files (a disallow with nothing after it means "allow everything").

Examples:

To prevent FreeFind from indexing your site at all:

Code:
user-agent: FreeFind
disallow: /
To prevent FreeFind from indexing common Front Page image map junk:

Code:
user-agent: FreeFind
disallow: /_vti_bin/shtml.exe/
To prevent FreeFind from indexing a test directory and a private file:

Code:
user-agent: FreeFind
disallow: /test/
disallow: /private.html
To let FreeFind index everything but prevent other robots from accessing certain files:

Code:
user-agent: *
disallow: /cgi-bin/
disallow: /this.html
disallow: /and.html
disallow: /that.html

user-agent: FreeFind
disallow:
Here are some more examples:

The exclusion:
http://mysite.com/ignore.html
prevents that file from being included in the index.

The exclusion:
http://mysite.com/archive/*
prevents everything in the "archive" directory from being included in the index.

The exclusion:
/archive/*
prevents everything in any "archive" directory from being included in the index regardless of the site it's on.

The exclusion:
http://mysite.com/*.txt
prevents files on "mysite.com" that end with the extension ".txt" from being included in the index.

The exclusion:
*.txt
prevents all files that end with the extension ".txt" from being included in the index regardless of what site they're on.

The exclusion:
http://mysite.com/alphaindex/?.html
prevents a file like "http://mysite.com/alphaindex/a.html" from being indexed, but would allow a file "http://mysite.com/alphaindex/aardvark.html" to be indexed.

The exclusion:
http://mysite.com/alphaindex/?.html index=no follow=yes
prevents a file like "http://mysite.com/alphaindex/a.html" from being added to the index but would allow the spider to find and follow the links in that page.

The exclusion:
http://mysite.com/endwiththis.html index=yes follow=no
allows that file to be added to the index but prevents the spider from following any of the links in that file.
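
The wildcard exclusions above read like ordinary glob patterns, where "*" stands for any run of characters and "?" for exactly one. Assuming that is how they are matched (the post does not spell out the exact rules, and patterns without a site prefix such as /archive/* may be handled differently), Python's standard `fnmatch` module can be used to try a pattern against a URL. `is_excluded` is a made-up helper name.

```python
from fnmatch import fnmatchcase


def is_excluded(url: str, pattern: str) -> bool:
    """Glob-style, case-sensitive match of a full URL against an exclusion."""
    return fnmatchcase(url, pattern)


if __name__ == "__main__":
    single = "http://mysite.com/alphaindex/?.html"
    print(is_excluded("http://mysite.com/alphaindex/a.html", single))
    print(is_excluded("http://mysite.com/alphaindex/aardvark.html", single))
    print(is_excluded("http://mysite.com/notes.txt", "*.txt"))
    print(is_excluded("http://mysite.com/archive/2004/old.html",
                      "http://mysite.com/archive/*"))
```

The "?" pattern matches a.html but not aardvark.html because it accepts exactly one character before ".html".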

  #19  
01-20-2005, 10:37 PM
yoyoyoyo
Join Date: Dec 2004
Location: USA
Posts: 1,612

Quote:
Originally Posted by Natch
If you don't mind my saying: "no kidding" or "so?"
You could say that about a lot of the mods here, but each little addon or mod is like a lesson in the workings of PHP and vBulletin, so I find it interesting. I am sorry if it bothers you, but hopefully some other people will find it interesting or helpful.
Quote:
Originally Posted by Natch
I can't think of a reason why I would want to block Legit spiders (those that respect robots.txt restrictions), and a spambot spider is likely as not to ignore meta tags and robots.txt anyway.
Well, maybe you can't, but obviously others can and desire that function, thus the "rules." I did not invent the meta tags. You are correct that bots don't always play by the rules, but some do, and these are the ones that this hack was addressing.

Actually....ya know what - forget about the "no index" option... forget I even mentioned it- this hack was about the "no follow" so if you are planning on implementing the first hack I suggest adding the meta in the header, and I apologize for trying to toss in more info than was needed.

  #20  
01-21-2005, 12:14 AM
Princeton
Join Date: Nov 2001
Location: Vineland, NJ
Posts: 6,693

The "rel" attribute has been around since HTML 3.2.

It's getting a lot of attention these days because of the "junk" pages that are being indexed by the major search engines, most notably those caused by individuals who comment-spam a blog, wiki, or forum site. The search engines are looking for a way to conserve resources (use them where it counts) and avoid indexing sites with no relevant content.

So they are asking the community to start using the rel="nofollow" attribute to help them stop, or at the very least slow, the "spamming".

When an individual spams a site they leave links on the post hoping that the search engines will "follow" the link back to their site. When they (the spammers) do this they are hoping to increase their "popularity" with search engines.

The rel="nofollow" attribute does not prevent the search engines from indexing your pages. Nor does it prevent the other site from being indexed when search engines reach it directly.

It will simply tell the search engine not to follow the link that was posted on your page (thread/post) -- that was NOT created by you.

ABOUT THE HACK
If you are worried about your PAGERANK then use this.

If you want to prevent spammers from posting in your forum then this hack will not help. Spammers will continue doing what they do ... the best route is to remove the post and ban the user. Most will not even know you are using rel="nofollow" and some will not even understand it.

SOME CONTROLS ARE NEEDED
I think there should be some controls.

For example, converting all posted links with rel="nofollow" also punishes those who are loyal to the site.

Why not help your loyal members with their site "popularity"? Do not convert links posted by loyal users. Allow the search engines to follow these links. Some sites can even list this as a membership benefit. -- just throwing ideas

Anyway, what I'm trying to get to is that the ADMIN should have some control over what links get rel="nofollow".

As it is now, all "in-house" links are tagged with rel="nofollow" which may hurt your "popularity".

  #21  
01-21-2005, 01:07 AM
kall
Join Date: Apr 2004
Location: New Zealand
Posts: 2,608

Quote:
Originally Posted by princeton
SOME CONTROLS ARE NEEDED
I think there should be some controls.

For example, converting all posted links with rel="nofollow" also punishes those who are loyal to the site.

Why not help your loyal members with their site "popularity"? Do not convert links posted by loyal users. Allow the search engines to follow these links. Some sites can even list this as a membership benefit. -- just throwing ideas

Anyway, what I'm trying to get to is that the ADMIN should have some control over what links get rel="nofollow".

As it is now, all "in-house" links are tagged with rel="nofollow" which may hurt your "popularity".
Alrighty then, try this:

PHP Code:
    if ($type == 'url')
    {
        global $bbuserinfo;

        if (is_member_of($bbuserinfo, 6))
        {
            // standard URL hyperlink
            return "<a href=\"$rightlink\" target=\"_blank\">$text</a>";
        }
        else
        {
            return "<a href=\"$rightlink\" rel=\"nofollow\" target=\"_blank\">$text</a>";
        }
    }
    else
This will make it so anyone who is an admin (group 6 - change this to whatever you want) will not have their links tagged with the nofollow attribute.

The syntax for multiple groups escapes me at present, but if someone can remind me, I will change it.