Stop Spammers with rel=nofollow in URLs!
In the first cooperative move in nearly ten years, the major search engines have unveiled a new indexing command for web authors that they all recognize. They hope it will help reduce the link and comment spam that plagues many web sites by removing the point of doing it in the first place.
The new "nofollow" attribute that can be associated with links originated as an idea from Google in late 2004, and MSN and Yahoo, as well as major blogging vendors, have jumped on board.

The Nofollow Attribute
The new attribute is called "nofollow", with rel="nofollow" being the format inserted within an anchor tag. When added to any link, it serves as a flag to tell the search engines that the link has not been explicitly approved by the site owner, and that they should therefore "not follow" it, or not use the PageRank of the referring page (on your site) in any way.

For example, this is how the HTML markup for an ordinary link might look:
<a href="http://www.somedomain.com/page.html">My forums are the best lol lol lol click here!!</a>

This is how the link would look after the nofollow attribute has been added:
<a href="http://www.somedomain.com/page.html" rel="nofollow">My forums are the best lol lol lol click here!!</a>

This would also be acceptable, as the order of attributes within the anchor tag makes no difference:
<a rel="nofollow" href="http://www.site.com/page.html" >Visit My Page</a>

Once added, the search engines supporting the attribute will understand that the link has not been approved in some way by the site owner. Think of it as a way to flag to them, "I didn't post this link -- someone else did."
Quote:
WHAT WILL THIS DO, IN ESSENCE?
This will affect URLs in posts, as well as signatures: anything that goes through the bbcodeparse function, as far as I can tell/guess. It also works retroactively, meaning it will affect all existing posts and signatures... or it did for me anyway.
Update: Thanks to Michael Morris and natez0rz for pointing out that using the $post global would be a much better idea.
To change the conditional number of posts, alter
PHP Code:
It should work with all vB 3.0.x versions, but was tested on 3.0.6.
File to modify: 1
1/ Open your includes/functions_bbcodeparse.php file
Find:
PHP Code:
PHP Code:
3/ Relax, safe in the knowledge that spammers linking from your site are doing so for no reason whatsoever. :)
4/ Edit: exclude staff usergroups and members with over 50 posts. |
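For anyone who wants the decision rule spelled out, here is a minimal sketch in Python rather than vBulletin's PHP. It is not the hack's actual code: the names `needs_nofollow` and `render_link`, the staff usergroup IDs, and treating exactly 50 posts as "trusted" are all assumptions made for illustration. It only shows the rule described in step 4: add rel="nofollow" unless the poster is staff or has reached the post threshold.

```python
from html import escape

# Hypothetical usergroup IDs for staff; real boards will use their own.
STAFF_GROUPS = {5, 6, 7}
# Assumption: posters below this count get rel="nofollow" on their links.
POST_THRESHOLD = 50


def needs_nofollow(post_count, usergroup_id,
                   threshold=POST_THRESHOLD, staff_groups=STAFF_GROUPS):
    """Return True when a poster's links should carry rel="nofollow"."""
    if usergroup_id in staff_groups:
        return False
    return post_count < threshold


def render_link(url, text, post_count, usergroup_id):
    """Build an anchor tag, adding rel="nofollow" for untrusted posters."""
    rel = ' rel="nofollow"' if needs_nofollow(post_count, usergroup_id) else ""
    return '<a href="{}"{}>{}</a>'.format(escape(url, quote=True), rel, escape(text))


if __name__ == "__main__":
    # A new member (3 posts) versus an established member (120 posts).
    print(render_link("http://www.somedomain.com/page.html", "click here", 3, 2))
    print(render_link("http://www.somedomain.com/page.html", "click here", 120, 2))
```

The key point is that the check uses the poster's post count and usergroup, not the viewer's.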
Very interesting. Thanks for the news and the modification instructions.
|
excellent, thanks!
|
Quote:
edit: found this which explains it all http://blog.searchenginewatch.com/blog/050118-204728 Quote:
|
Quote:
|
Nice little mod this, thank you.. :)
|
Very interesting! Thanks!
|
Quote:
1. I read about it on Slashdot
2. I search for it here
3. I open includes/functions_bbcodeparse.php and insert a very small amount of data
4. I astound myself by discovering it works
5. I upload it to this thread. :)
A nice easy install for something that took about 10 minutes from conception to release. :D |
Nice hack :)
|
thanks.
one small question. How do we verify it's working? |
Quote:
CTRL-F for nofollow. :) |
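If you would rather check a saved copy of a thread page than eyeball the source, something like the following Python sketch could do the CTRL-F for you. It is only an illustration: the class name `LinkAudit` and the function `audit_links` are made up here, and it assumes you have already saved the page's HTML source as a string.

```python
from html.parser import HTMLParser


class LinkAudit(HTMLParser):
    """Collect hrefs, split by whether the link carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.nofollow = []
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attrs = dict(attrs)
        href = attrs.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "external nofollow"
        rel_tokens = (attrs.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.nofollow.append(href)
        else:
            self.followed.append(href)


def audit_links(page_source):
    """Return (nofollow_hrefs, followed_hrefs) found in the HTML source."""
    parser = LinkAudit()
    parser.feed(page_source)
    parser.close()
    return parser.nofollow, parser.followed


if __name__ == "__main__":
    sample = (
        '<a href="http://www.somedomain.com/page.html" rel="nofollow">spam</a> '
        '<a href="index.php">Forum Home</a>'
    )
    flagged, plain = audit_links(sample)
    print("nofollow:", flagged)
    print("followed:", plain)
```

Links posted in threads should show up in the first list, while your own template links should stay in the second.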
Quote:
|
ADDON "HACK": You can also add the "no index and no follow" rule to each page header as well as the URL.
Go to your admin control panel, open the style manager, choose to edit the headinclude template, and look for: PHP Code:
PHP Code:
PHP Code:
|
Interesting. :)
|
Not to criticise your modification but I'd say this was a poor way of implementing this. As soon as you put no follow on the links it'll:
Quote:
|
Quote:
But seriously...it's not spambots that are 'targeted' by this hack, it's the Spammers that send them out. The theory goes that if people were to implement this idea, there would be no reason for the Spammers to send out the bots in the first place. At least, it removes the advantage of having PR from the sites they are spamming be added to their site. How would you go about implementing this? regarding the addon: I don't know why he is suggesting to have noindex in the header of each page...not something I would do myself. |
Quote:
Quote:
You can also tell the spider to ignore only specific parts of your site in a few different ways. One way is to use a "robots.txt" file.

The robots.txt is a TEXT file (not HTML!) which has a section for each robot to be controlled. Each section has a user-agent line which names the robot to be controlled and has a list of "disallows" and "allows". Each disallow will prevent any address that starts with the disallowed string from being accessed. Similarly, each allow will permit any address that starts with the allowed string to be accessed. The (dis)allows are scanned in order, with the last match encountered determining whether an address is allowed to be used or not. If there are no matches at all then the address will be used.

Using a robots.txt file is easy. If your site is located at:
http://domain.com/mysite/index.html
you will need to be able to create a file located here:
http://domain.com/robots.txt

Here's an example:
Code:
user-agent: FreeFind
Code:
http://domain.com/mysite/test/index.html
Code:
http://domain.com/mysite/test.html
Code:
user-agent: FreeFind

Using allows can often simplify your robots.txt file. Here's another example which shows a robots.txt with two sections in it. One for "all" robots, and one for the FreeFind spider:
Code:
user-agent: *

Examples:
To prevent FreeFind from indexing your site at all:
Code:
user-agent: FreeFind
Code:
user-agent: FreeFind
Code:
user-agent: FreeFind
Code:
user-agent: *

The exclusion: http://mysite.com/ignore.html
prevents that file from being included in the index.
The exclusion: http://mysite.com/archive/*
prevents everything in the "archive" directory from being included in the index.
The exclusion: /archive/*
prevents everything in any "archive" directory from being included in the index regardless of the site it's on.
The exclusion: http://mysite.com/*.txt
prevents files on "mysite.com" that end with the extension ".txt" from being included in the index.
The exclusion: *.txt
prevents all files that end with the extension ".txt" from being included in the index regardless of what site they're on.
The exclusion: http://mysite.com/alphaindex/?.html
prevents a file like "http://mysite.com/alphaindex/a.html" from being indexed, but would allow a file "http://mysite.com/alphaindex/aardvark.html" to be indexed.
The exclusion: http://mysite.com/alphaindex/?.html index=no follow=yes
prevents a file like "http://mysite.com/alphaindex/a.html" from being added to the index but would allow the spider to find and follow the links in that page.
The exclusion: http://mysite.com/endwiththis.html index=yes follow=no
allows that file to be added to the index but prevents the spider from following any of the links in that file. |
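The matching rule in the quoted robots.txt description (prefix match, rules scanned in order, last match wins, no match means allowed) can be sketched in a few lines of Python. This is only a sketch of that description, not FreeFind's implementation; the function name `is_allowed` and the example rules are hypothetical, and other crawlers may resolve conflicting rules differently.

```python
from urllib.parse import urlsplit


def is_allowed(url, rules):
    """Apply (kind, prefix) rules in order; the last matching rule wins.

    `rules` is a list such as [("disallow", "/mysite/test/")].
    With no match at all the address is allowed, as the quoted text states.
    """
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    allowed = True
    for kind, prefix in rules:
        if path.startswith(prefix):
            allowed = kind.lower() == "allow"
    return allowed


if __name__ == "__main__":
    # Hypothetical rules: block a test directory but re-allow one subfolder.
    rules = [
        ("disallow", "/mysite/test/"),
        ("allow", "/mysite/test/public/"),
    ]
    for address in (
        "http://domain.com/mysite/test/index.html",
        "http://domain.com/mysite/test.html",
        "http://domain.com/mysite/test/public/faq.html",
    ):
        print(address, "->", "allowed" if is_allowed(address, rules) else "blocked")
```

Because the last match wins under this description, a later allow can carve an exception out of an earlier, broader disallow.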
Quote:
Quote:
Actually... ya know what, forget about the "no index" option. Forget I even mentioned it. This hack was about the "no follow", so if you are planning on implementing the first hack I suggest adding the meta in the header, and I apologize for trying to toss in more info than was needed. |
The "rel" attribute has been around since HTML 3.2.
It's getting a lot of attention these days because of the "junk" pages that are being indexed by the major search engines, most notably those caused by individuals who comment-spam a blog, wiki, or forum site. The search engines are looking for a way to conserve resources (use them where it counts) and prevent indexing of sites with no relevant content. So they are asking the community to start using the rel="nofollow" attribute to help them stop, or at the very least slow, the "spamming".
When an individual spams a site they leave links in the post hoping that the search engines will "follow" the link back to their site. By doing this, the spammers hope to increase their "popularity" with search engines.
The rel="nofollow" attribute does not prevent the search engines from indexing your pages. Nor does it prevent the other site from being indexed when search engines reach it directly. It simply tells the search engine not to follow a link that was posted on your page (thread/post) but was NOT created by you.
ABOUT THE HACK
If you are worried about your PAGERANK then use this. If you want to prevent spammers from posting in your forum then this hack will not help. Spammers will continue doing what they do... the best route is to remove the post and ban the user. Most will not even know you are using rel="nofollow" and some will not even understand it.
SOME CONTROLS ARE NEEDED
I think there should be some controls. For example, converting all posted links with rel="nofollow" also punishes those who are loyal to the site. Why not help your loyal members with their site "popularity"? Do not convert links posted by loyal users. Allow the search engines to follow these links. Some sites can even list this as a membership benefit. Just throwing ideas out there.
Anyway, what I'm trying to get to is that the ADMIN should have some control over what links get rel="nofollow". As it is now, all "in-house" links are tagged with rel="nofollow" which may hurt your "popularity". |
Quote:
PHP Code:
The syntax for multiple groups escapes me at present, but if someone can remind me, I will change it. |
Quote:
PHP Code:
edit, actually I believe it would be PHP Code:
|
Quote:
This hack adds the "no follow" attribute to links which have gone through vB's bbcode parser (posts, sigs, etc) and will stop spiders following them but the links in your templates (almost all probably pointing to other pages on your own site) won't be affected and so will be followed. The "no follow" in the header meta will tell spiders to not follow any links at all, including to the rest of your own site. |
Quote:
|
Nice hack.
I made a modification to it so that only new members (in my case, those with fewer than 30 posts) have rel="nofollow" attached to the links in their posts. Established members are not penalised and the searchbots will still follow and index their links. Replace PHP Code:
PHP Code:
|
Quote:
Cretins get high rankings by exploiting loopholes in the way that search engines rank sites/pages. Essentially, if a search engine "sees" a URL that has numerous other pages linking to it, it thinks that site is popular. This may be the case, but if I were to visit every single one of the users' sites from this forum alone and post a link to my board on each one, suddenly my site's ranking would increase as the bots "see" lots of links to my site. This is the essence of "google bombing" (search for WMD in Google with I'm Feeling Lucky, for example). Making links POSTED IN THREADS nofollow means that the search engine bots IGNORE links in posts but will still index your pages and follow all the other links in your pages, unless you've messed with robots.txt or put the nofollow elsewhere. I think that explains it.... *INSTALLED* |
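To make the "lots of links means popular" idea concrete, here is a toy Python sketch. Real ranking algorithms are far more involved than a raw link count, so treat this purely as an illustration; the function name `count_followed_inbound` and the example URLs are made up. It shows why spam links stop paying off once they carry nofollow: they simply are not counted.

```python
from collections import Counter


def count_followed_inbound(links):
    """Count inbound links per target, skipping links flagged nofollow.

    `links` is an iterable of (source_page, target_url, nofollow) tuples.
    """
    counts = Counter()
    for source, target, nofollow in links:
        if not nofollow:
            counts[target] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical crawl results: two spam links in forum posts (nofollow),
    # one ordinary link from a blog, and one template link inside a forum.
    links = [
        ("forum-a/thread-1", "http://spam.example/", True),
        ("forum-b/thread-9", "http://spam.example/", True),
        ("blog-c/post-4", "http://spam.example/", False),
        ("forum-a/index", "http://forum-a.example/faq", False),
    ]
    for target, total in count_followed_inbound(links).items():
        print(target, total)
```

In this toy model the spammer gets credit for one link instead of three, while the forum's own template links are unaffected.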
Quote:
I kinda thought that was obvious, but thank you for spelling it out so well for those people who seem to think this is some kind of 'Stop Spiders from indexing your site' hack, which..quite frankly, would be a retarded idea. |
Installed, however, is there a way this can be implemented into signatures as well? It seems as if it's taking no effect there. *edit* Errr, maybe it is. It should work everywhere, right? Profile pages don't seem to include it, maybe my browser is cached.
|
Although I think yoyo's post needed to be clarified, I think people are being hard on him. There are many boards out there with many goals. As long as people understand what he's doing and implement it to further their goals, his posts have been very helpful AND he clearly knows what he's talking about.
Thanks for the hack Kall and the additions. I like the one based on number of posts. It will be cool if the addons can be mentioned in the first thread for new people coming onto this thread. This will probably grow into a big thread until vb supports the tag....WHICH I suggest they do right away so that they can boost their pagerank by getting mentioned in Google's blog which announced this ;) (just kidding, they are using a redirect anyway.) |
Couldn't get this to work. I followed the instructions but no "nofollow" showed when viewing source.
|
Quote:
|
Quote:
|
Quote:
|
Quote:
:clicks ignore: Now, might I recommend the following. Your conditional references $bbuserinfo. That will only reflect the viewing user, so a spider gets the same treatment for every link in posts regardless of who posted them. To pull up the post user data, use $post. You'll have to global it. Further, instead of using a set usergroup, I recommend using a post count threshold of 50. I doubt many spammers will reach that threshold, while most of your regulars will. Hence PHP Code:
|
Quote:
PHP Code:
|
I tried to do what natez0rz did in the post above me but I get an error.. :(
Code:
Parse error: parse error, unexpected T_ELSE in /home/forum/public_html/forums/includes/functions_bbcodeparse.php on line 1523 Code:
if ($type == 'url') |
Quote:
*updates hack* Now it will affect users with under 50 posts only. |
Works great, clicks install.
I modified it a bit for simplification. If the user IS a member of New Members usergroup, then it posts the modified URL. |
I just thought I'd point out that in the SEO community the use of nofollow has been suggested as a way to devalue pages.
The purpose of a nofollow tag is not simply to indicate that a link should be ignored, but also that the page content is subject to third-party interference. Therefore, if people think that implementing this hack could be a great way to "hoard PageRank" or some other such strategy, they may find that what they actually achieve is simply a way of telling search engines that the forum itself is nothing but a collection of "free for all" pages and should be devalued. Webmasters should consider very carefully why they wish to use nofollow, because if it's for SEO purposes then there is a real possibility that the plan could backfire on you. |
Quote:
This hack applies the attribute to links posted by (and in the signatures of) members of Usergroups defined by the admin. It does not affect board-wide links or template links or anything else. How that would tell the engines that the forum is nothing but a collection of free-for-all pages is beyond me. Please explain. Or did you totally miss the point of this hack? |
Anyone with before and after SEO stats with this mod installed?
That'll explain to everyone if this mod hurts rankings and put the second guessing to rest. Something is needed to keep the spammers at bay. If it's true that the bots are reading images via OCR, even image verification won't help. :( Chris |