The Archive of Official vBulletin Modifications Site. It is not a VB3 engine, just a parsed copy!
Stop Spammers with rel=nofollow in URLs!
In the first cooperative move in nearly ten years, the major search engines have unveiled a new indexing command for web authors that they all recognize, one they hope will help reduce the link and comment spam that plagues many web sites by removing the point of posting it in the first place.
The new "nofollow" attribute that can be associated with links originated as an idea from Google in late 2004, and MSN and Yahoo, as well as major blogging vendors, have jumped onboard.

The Nofollow Attribute

The new attribute is called "nofollow", with rel="nofollow" being the format inserted within an anchor tag. When added to any link, it effectively serves as a flag telling the search engines that the link has not been explicitly approved by the site owner, and therefore to "not follow" it -- that is, not to pass the referring page's (your site's) PageRank through it in any way.

For example, this is how the HTML markup for an ordinary link might look:

<a href="http://www.somedomain.com/page.html">My forums are the best lol lol lol click here!!</a>

This is how the link would look after the nofollow attribute has been added:

<a href="http://www.somedomain.com/page.html" rel="nofollow">My forums are the best lol lol lol click here!!</a>

This would also be acceptable, as the order of attributes within the anchor tag makes no difference:

<a rel="nofollow" href="http://www.site.com/page.html">Visit My Page</a>

Once added, the search engines supporting the attribute will understand that the link has not been approved in some way by the site owner. Think of it as a way to flag to them, "I didn't post this link -- someone else did."
WHAT WILL THIS DO, IN ESSENCE? This will affect URLs in posts as well as signatures...anything that goes through the bbcodeparse function, as far as I can tell/guess. It will also work retroactively (it will affect all existing posts and signatures)...or it did for me, anyway.

Update: Thanks to Michael Morris and natez0rz for pointing out that using the $post global would be a much better idea. To change the conditional number of posts, alter: PHP Code:
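The hack's original code block was not preserved in this archive, but a minimal sketch of such a post-count conditional might look like the following (the array key 'posts' in the $post global and the threshold of 50 are assumptions you would adjust for your board, not the thread's original code):

```php
<?php
// Hedged sketch: decide whether a parsed link should carry
// rel="nofollow" based on the poster's post count. The $post global
// exists in vBulletin's bbcode-parsing context, but the exact array
// keys (here 'posts') may differ between 3.0.x releases.
function needs_nofollow($post, $threshold = 50)
{
    // Members over the threshold are trusted; everyone else
    // (including guests, where 'posts' is 0 or unset) gets nofollow.
    return intval($post['posts']) <= $threshold;
}

// Example usage when building the anchor:
$rel = needs_nofollow(array('posts' => 3)) ? ' rel="nofollow"' : '';
```

Adjust the threshold by changing the default value of $threshold (or the number in your actual conditional).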
It should work with all vB 3.0.x versions, but was tested on 3.0.6.

Files to modify: 1

1/ Open your includes/functions_bbcodeparse.php file

Find: PHP Code:
2/ Replace with: PHP Code:
3/ Relax, safe in the knowledge that spammers linking from your site are doing so for no reason whatsoever.

4/ Edit: exclude staff usergroups and members with over 50 posts.
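The find/replace blocks above did not survive in this archive, but the edit amounts to appending the rel attribute to the anchor the [url] handler emits in includes/functions_bbcodeparse.php. A hedged sketch of that idea (the variable names $rightlink and $text mirror names commonly seen in vB 3.0.x's URL handler, but the exact line varies by release, so treat everything here as an illustration rather than the original code):

```php
<?php
// Illustration only: build the anchor the way the bbcode parser
// might, appending rel="nofollow" when the poster is untrusted.
function build_url_anchor($rightlink, $text, $trusted)
{
    $rel = $trusted ? '' : ' rel="nofollow"';
    return "<a href=\"$rightlink\" target=\"_blank\"$rel>$text</a>";
}
```

Trusted posters (staff, or members past the post-count threshold) get a plain link; everyone else's links carry the attribute.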
Comments
#12
CTRL-F for nofollow.
#13
#14
ADDON "HACK": You can also add the "noindex and nofollow" rule to each page header, as well as to individual URLs.
Go to your admin control panel, open the Style Manager, choose to edit the headinclude template, and look for: PHP Code:
PHP Code:
PHP Code:
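The template code blocks were not preserved here, but the addon amounts to placing a robots meta tag in the output of the headinclude template. A sketch of the kind of line that would be added (this element is an assumption based on the description above, not the thread's original code; whether noindex is a good idea at all is questioned later in this thread):

```html
<!-- Hedged sketch: added near the top of the headinclude template.
     "noindex" asks engines not to index the page; "nofollow" asks
     them not to follow any links on it. Use with care on a forum
     you actually want indexed. -->
<meta name="robots" content="noindex,nofollow" />
```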
#15
Interesting.
#16
Not to criticise your modification, but I'd say this is a poor way of implementing it. As soon as you put nofollow on the links it'll:
Quote:
#17
Quote:
But seriously...it's not spambots that are 'targeted' by this hack, it's the spammers that send them out. The theory goes that if people were to implement this idea, there would be no reason for the spammers to send out the bots in the first place. At the very least, it removes the advantage of having PageRank from the sites they are spamming added to their own site. How would you go about implementing this?

Regarding the addon: I don't know why he is suggesting noindex in the header of each page...not something I would do myself.
#18
Quote:
You can also tell the spider to ignore only specific parts of your site in a few different ways. One way is to use a "robots.txt" file. The robots.txt is a TEXT file (not HTML!) which has a section for each robot to be controlled. Each section has a user-agent line, which names the robot to be controlled, and a list of "disallows" and "allows". Each disallow will prevent any address that starts with the disallowed string from being accessed. Similarly, each allow will permit any address that starts with the allowed string from being accessed. The (dis)allows are scanned in order, with the last match encountered determining whether an address is allowed to be used or not. If there are no matches at all, then the address will be used.

Using a robots.txt file is easy. If your site is located at:

    http://domain.com/mysite/index.html

you will need to be able to create a file located here:

    http://domain.com/robots.txt

Here's an example:

Code:
    user-agent: FreeFind
    disallow: /mysite/test/
    disallow: /mysite/cgi-bin/post.cgi?action=reply
    disallow: /a

With that robots.txt, these addresses would not be crawled:

Code:
    http://domain.com/mysite/test/index.html
    http://domain.com/mysite/cgi-bin/post.cgi?action=reply&id=1
    http://domain.com/mysite/cgi-bin/post.cgi?action=replytome
    http://domain.com/abc.html

while these would be:

Code:
    http://domain.com/mysite/test.html
    http://domain.com/mysite/cgi-bin/post.cgi?action=edit
    http://domain.com/mysite/cgi-bin/post.cgi
    http://domain.com/bbc.html

Allows can be combined with disallows, for example to block a cgi-bin directory while still permitting two scripts inside it:

Code:
    user-agent: FreeFind
    disallow: /cgi-bin/
    allow: /cgi-bin/Ultimate.cgi
    allow: /cgi-bin/forumdisplay.cgi

Using allows can often simplify your robots.txt file. Here's another example which shows a robots.txt with two sections in it, one for "all" robots and one for the FreeFind spider:

Code:
    user-agent: *
    disallow: /cgi-bin/

    user-agent: FreeFind
    disallow:

Examples:

To prevent FreeFind from indexing your site at all:

Code:
    user-agent: FreeFind
    disallow: /

To exclude a specific path:

Code:
    user-agent: FreeFind
    disallow: /_vti_bin/shtml.exe/

To exclude a directory and a specific file:

Code:
    user-agent: FreeFind
    disallow: /test/
    disallow: private.html

To exclude pages for all robots while letting FreeFind see everything:

Code:
    user-agent: *
    disallow: /cgi-bin/
    disallow: this.html
    disallow: and.html
    disallow: that.html

    user-agent: FreeFind
    disallow:

The exclusion: http://mysite.com/ignore.html prevents that file from being included in the index.

The exclusion: http://mysite.com/archive/* prevents everything in the "archive" directory from being included in the index.

The exclusion: /archive/* prevents everything in any "archive" directory from being included in the index, regardless of the site it's on.

The exclusion: http://mysite.com/*.txt prevents files on "mysite.com" that end with the extension ".txt" from being included in the index.

The exclusion: *.txt prevents all files that end with the extension ".txt" from being included in the index, regardless of what site they're on.

The exclusion: http://mysite.com/alphaindex/?.html prevents a file like "http://mysite.com/alphaindex/a.html" from being indexed, but would allow a file "http://mysite.com/alphaindex/aardvark.html" to be indexed.

The exclusion: http://mysite.com/alphaindex/?.html index=no follow=yes prevents a file like "http://mysite.com/alphaindex/a.html" from being added to the index, but would allow the spider to find and follow the links in that page.

The exclusion: http://mysite.com/endwiththis.html index=yes follow=no allows that file to be added to the index, but prevents the spider from following any of the links in that file.
#19
Quote:
Actually...ya know what -- forget about the "no index" option...forget I even mentioned it. This hack was about the "no follow", so if you are planning on implementing the first hack I suggest adding the meta in the header, and I apologize for trying to toss in more info than was needed.
#20
The "rel" attribute has been around since HTML 3.2.
It's getting a lot of attention these days because of the "junk" pages being indexed by the major search engines, most notably caused by individuals who spam comments on blog, wiki, or forum sites. The search engines are looking for a way to conserve resources (use them where it counts) and prevent indexing of sites with no relevant content, so they are asking the community to start using the rel="nofollow" attribute to help them stop -- or at the very least slow -- the spamming.

When individuals spam a site, they leave links in the post hoping that the search engines will "follow" the links back to their site, thereby increasing their "popularity" with the search engines. The rel="nofollow" attribute does not prevent the search engines from indexing your pages, nor does it prevent the other site from being indexed when search engines reach it directly. It simply tells the search engine not to follow a link posted on your page (thread/post) that was NOT created by you.

ABOUT THE HACK

If you are worried about your PageRank, then use this. If you want to prevent spammers from posting in your forum, this hack will not help: spammers will continue doing what they do, and the best route is still to remove the post and ban the user. Most will not even know you are using rel="nofollow", and some will not even understand it.

SOME CONTROLS ARE NEEDED

I think there should be some controls. For example, converting all posted links with rel="nofollow" also punishes those who are loyal to the site. Why not help your loyal members with their site "popularity"? Do not convert links posted by loyal users; allow the search engines to follow those links. Some sites could even list this as a membership benefit -- just throwing out ideas.

Anyway, what I'm trying to get at is that the ADMIN should have some control over which links get rel="nofollow". As it is now, all "in-house" links are tagged with rel="nofollow", which may hurt your members' "popularity".
#21
Quote:
PHP Code:
The syntax for multiple groups escapes me at present, but if someone can remind me, I will change it.
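For the multiple-usergroup syntax mentioned above, PHP's in_array() is the usual approach. A hedged sketch (the usergroup IDs 5, 6 and 7 are placeholders for your board's staff groups, and the 'usergroupid' key is assumed to be present in the $post global):

```php
<?php
// Hedged sketch: treat a post's author as staff if their usergroup
// is in a whitelist of IDs. Replace 5, 6, 7 with your own groups.
function is_staff($post, $staff_groups = array(5, 6, 7))
{
    return in_array(intval($post['usergroupid']), $staff_groups);
}
```

The whole trust test then becomes something like: is_staff($post) || $post['posts'] > 50.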