Search-Engine Listing (Dynamic)
I've created an alternative hack to the existing one which generated lots of little files. This one's a lot less system intensive and it doesn't involve static files, which means that you save A LOT of web space.
However, this will only work on Linux systems. (Contact me if you have access to your httpd.conf on another OS, and I'll alter the code so it works there.)
Ed and I tested this (thanks a bunch, Ed!) and everything seems to be working great!
Check out his archive at:
http://www.magic-singles.com/cpa/forums/search.php3
Directly in the vB directory, create an .htaccess containing:
<Files search>
DirectoryIndex search
ForceType application/x-httpd-php3
</Files>
Then, create a file called "search" in the vB directory containing:
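<?
// Split the requested URL on "/" and take the last piece as the thread ID,
// e.g. .../search/325 gives 325.
$searcharray=explode("/",$REQUEST_URI);
$searchcount=count($searcharray);
$spec = $searchcount - 1;
$threadid = $searcharray[$spec];
// Hand off to the normal thread page with $threadid already set.
require("showthread.php");
?>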
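The "search" file above trusts whatever comes after the last slash. A minimal sketch of the same last-segment idea with a digits-only check, assuming a current PHP version rather than PHP3; the function name thread_id_from_uri is hypothetical and not part of vBulletin:

```php
<?php
// Hypothetical helper (not part of vBulletin): pull a numeric thread ID
// out of the last path segment of a request URI such as "/forums/search/325".
function thread_id_from_uri(string $requestUri): ?int
{
    // Drop any query string, then split the path on "/".
    $path = explode('?', $requestUri, 2)[0];
    $segments = explode('/', rtrim($path, '/'));
    $last = end($segments);

    // Accept digits only, so arbitrary text is never passed on as a thread ID.
    return ctype_digit($last) ? (int) $last : null;
}

var_dump(thread_id_from_uri('/forums/search/325'));
```

A caller could send a 404 when the function returns null instead of loading showthread.php with an empty ID.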
Next, create a file in your existing vB directory called search.php3 containing:
<?
require("global.php");
// Connect to the database.
mysql_pconnect($server,$user,$password);
mysql_select_db($database);
// Fetch every visible thread, most recently active first.
$threads=$DB_site->query("SELECT threadid,title FROM thread WHERE visible=1 ORDER BY lastpost DESC");
// Print one query-string-free link per thread.
while ($threadarray = $DB_site->fetch_array($threads)) {
  $threadid = $threadarray["threadid"];
  $title = $threadarray["title"];
  print "<a href=\"search/$threadid\">$title</a><br>\n";
}
?>
Finally, create a line in the cssinclude section of the CP which reads:
<base href="http://yoursite.com/forums/">
and replace yoursite.com/forums/ with the domain + directory that vB is located in.
All should work well!
Just found a little glitch with this: it breaks the "jump to new post" link (from within a thread). To fix this, in showthread.php, find:
$newpostlink="#newpost";
And replace with: $newpostlink="showthread.php?threadid=$threadid#newpost";
That should do it...
I'm new to vBulletin but am very happy with the software so far. I had posted a few problems before but I realized they were all simply solved... thanks for the great hack.
On a slightly similar note, which search engines index .php files? How have you guys (and gals) found this to work with search engines?
[Edited by ExtremeFactor on 06-18-2000 at 01:21 AM]
Excellent job Stallion. Seeing as how this hack is designed to create an index that search engine spiders will crawl and index each individual thread, I have one simple question: Will they index this file? I mean without any text other than the links, I've heard that search engines wouldn't index these types of pages. I assume Ed has submitted his thread index thread, and I was wondering if any engines had indexed the individual threads?
I was also wondering how you would go about creating individual meta tags for the pages. Keyword and description meta tags aren't as important anymore, but it would still be nice to have them. Anyone know how to do this? (I'm thinking if a thread is called "How to make a million dollars" I'd like the meta description to say "How to make a million dollars, discussed on the (Insert your forum name here)".)
Sorry for rambling, it's getting late.
Cameron
Yes, search engines will index any file with any extension; it's just that they don't store query strings in their databases. This would mean that if they found a link to showthread.php?threadid=325, they would store it as showthread.php, which gives an error.
This hack creates an index of all threads, which is called with a query-string-less PHP file.
The threads themselves will be linked from the search engines with the new URLs, but they remain seamlessly identical to the showthread.php versions. It works well :)
As far as <meta> tags, that could be added via templates, for all thread pages.
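A minimal sketch of building the per-thread description asked about earlier, assuming the thread title and forum name are available as plain strings. The function name and the $forumName parameter are hypothetical; how the result would be wired into a vB template is not shown here:

```php
<?php
// Hypothetical helper: build a <meta> description tag from a thread title.
// $forumName is an assumed parameter, not a vBulletin variable.
function meta_description_tag(string $threadTitle, string $forumName): string
{
    $text = $threadTitle . ', discussed on the ' . $forumName;
    // Escape quotes and angle brackets so the title cannot break the attribute.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return '<meta name="description" content="' . $safe . '">';
}

echo meta_description_tag('How to make a million dollars', 'Example Forums'), "\n";
```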
Any way to get this working on my virtual server, which runs SunOS rather than Linux?
I don't think I clarified enough in my last post Stallion. I understand why you have to generate the page without the links containing ?'s. My question was if from experience, the search engines actually index every thread, or if they ignore the page because there are too many links and not enough text. I've heard that because link popularity is more and more important, many engines won't index pages with so many links without content.
Cameron
I have gone through all of the steps several times and I am getting a 404 error (file not found) every time I click on a link from search.php3. I'm using an .htaccess and I know that my .htaccess is functional because I use it for many things in a variety of directories. Is there a reason that this specific command might not work in my .htaccess? I assume that must be where the problem is, because I have made and modified all of the files, as noted above.
Any thoughts?
Is there any way to exclude private forums?
Will this work for FreeBSD?
Originally posted by Cold Steel
Will this work for FreeBSD?
Yes, it does.
http://www.aikiweb.com/forums/engine.php
Running on FreeBSD 4.0-stable, Apache 1.3.12, mySQL 3.22.32, and PHP 4.0.
TB2: Yes.. just change your sql statement in the search.php3 you created to read:
SELECT threadid,title FROM thread WHERE visible=1 and forumid != BADFORUMID1 and forumid != BADFORUMID2 ORDER BY lastpost DESC
Replace BADFORUMID1, BADFORUMID2, etc. with the forum IDs that you do not want to be indexed.
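With more than a couple of forums to hide, chaining "forumid !=" conditions gets long. A minimal sketch that builds the same query with NOT IN from a list of IDs; the function name thread_index_sql is hypothetical, and the table and column names are the ones used in search.php3 above:

```php
<?php
// Hypothetical helper: build the thread query with any number of excluded forums.
function thread_index_sql(array $excludedForumIds): string
{
    $sql = 'SELECT threadid,title FROM thread WHERE visible=1';
    // Cast every ID to an integer so nothing but numbers ends up in the SQL.
    $ids = array_map('intval', $excludedForumIds);
    if ($ids) {
        $sql .= ' AND forumid NOT IN (' . implode(',', $ids) . ')';
    }
    return $sql . ' ORDER BY lastpost DESC';
}

echo thread_index_sql([34, 7]), "\n";
```

The returned string would be passed to $DB_site->query() in place of the hard-coded statement.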
Stallion: I was just looking through the global.php file and it's not necessary for you to issue the mysql_pconnect() or mysql_select_db() calls at the top of search.php3. A database connection is already created inside of global.php.
Also, just to clarify for everyone: search.php3 can be any filename you wish. I named mine search_hack.php since we already had a search.php file we were using.
And lastly.. it's a drain on the server to use .htaccess files... those apache directives could just as easily be put inside of a set of Directory tags in the main httpd.conf file. Apache will be all the faster for it.
i.e.
<Directory /www/sitename/forums/>
  <Files search>
    DirectoryIndex search
    ForceType application/x-httpd-php
  </Files>
</Directory>
Also, for those of you using php4, as I am, notice that my ForceType directive does not end in php3 as in Stallion's example. PHP4 uses application/x-httpd-php instead.
Stallion: great hack.. thanks.
Including the <base href="forums/URL"> in the cssinclude messes up the forums for Netscape users. For example, when they went to my board, they would see the link for a forum as "http://forumdisplay.php?yadda-yadda". I've taken it out and it works fine now, but unfortunately the search page doesn't work anymore. What should I do to remedy this?
cool, i would like to have this setup and would need some guidance...
i am new to linux, new to dedicated server use and new to vbulletin... should i be messing around with it right now ?
Originally posted by JimF
Including the <base href="forums/URL"> in the cssinclude messes up the forums for Netscape users. For example, when they went to my board, they would see the link for a forum as "http://forumdisplay.php?yadda-yadda". I've taken it out and it works fine now, but unfortunately the search page doesn't work anymore. What should I do to remedy this?
That's odd. I don't see this behavior on my board when I look at it using Netscape and I've implemented the hack. Am I looking at it correctly?
My board: http://www.aikiweb.com/forums/
The hack: http://www.aikiweb.com/forums/engine.php
This is cool. Is there any simple way to create more of a structure? I.e.:
Main Page:
Forum 1
|_ New Page With Posts XX Per Page
Forum 2
|_ New Page With Posts XX Per Page
Forum 3
|_ New Page With Posts XX Per Page
Just an idea...
Originally posted by BassWriters
My question was if from experience, the search engines actually index every thread, or if they ignore the page because there are too many links and not enough text.
It's been done this way for other forums/news scripts, so I'd imagine it's the preferred method and will work fine with spiders.
1) Listen to danbeck...he speaks the truth :)
2) JimF, it should be the full URL, nothing relative
3) eva2000, as long as you don't overwrite any existing files, no harm can be done :)
4) Brian: not sure what you mean...
What I meant was for the main page to have links to the forums, and then for the page that shows the links for each forum to start a new page every 50 or so topics.
For example, the main page would have a link to each forum, and the page with the posts for each forum would show xxx topics per page and then link to the next page. That way, if you have a big forum, you don't have thousands of links all on one page.
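A minimal sketch of the "xxx topics per page" idea, assuming the thread rows have already been fetched into an array. The function name paginate_threads is hypothetical, and the 120 sample rows are made up for illustration; on a large board a LIMIT clause in the query would probably be the better route:

```php
<?php
// Hypothetical helper: split a list of thread rows into index pages.
function paginate_threads(array $threads, int $perPage = 50): array
{
    // array_chunk returns a list of pages, each holding up to $perPage rows.
    return array_chunk($threads, max(1, $perPage));
}

// Made-up sample data: 120 threads.
$threads = [];
for ($i = 1; $i <= 120; $i++) {
    $threads[] = ['threadid' => $i, 'title' => "Thread $i"];
}

$pages = paginate_threads($threads, 50);
echo count($pages), " pages\n";
```

Each page would then print its own links plus a link to the next page.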
I just don't see the logic behind that if the spiders will cycle through all the links anyways...
From what I have read spiders will dislike links pages with that many links....
But does this bog down the message boards when a web crawler hits the list of messages? Seems like each link is a call to vB (rather than just sucking up an html file).
Each link is a call to vB, but it'd be much more system intensive to force html file generation.
Remember: with the previous method, it's calling an HTML file which is THEN calling vB.
okay after applying a few hacks i think i want to get this in to my forum as well...
so are the first 2 posts of this thread all i need... the original code and instructions plus Ed's correction in the showthread.php file ?
Then, create a file called "search" in the vB directory containing:
<?
$searcharray=explode("/",$REQUEST_URI);
$searchcount=count($searcharray);
$spec = $searchcount - 1;
$threadid = $searcharray[$spec];
require("showthread.php");
?>
i am confused the file called search , what extension on the file ?
search.* ?
No extension whatsoever :)
Just "search"
oh... thanks. sort of had me confused for a moment :D
One thing that can be used to slow down the spidering of these pages, so as not to cause problems, is to give them a dynamic page extension such as .php or .shtml. This will let most spiders know that they should crawl more slowly.
-Brian
this is old but i am having problems....
http://animeboards.net/forums/search.php3
if i put the htaccess file in i get a 500 error, if i leave it out when i click a thread from the search results it redirects to my front page ?
Have you used .htaccess files on your server in the past?
If not, it's most likely a server-config issue.
yes i am using a htaccess file in my root directory to parse .php files and have my server setup for 404 error redirects to the main page
what should i do ?
Not sure about this one.
Did anyone else get similar problems?
My guess is that it's redirecting to the main page since it considers the URLs "not found", which I'm assuming is because your 404 error page is your main index file ;)
The problem is the 500 error, I guess.
Anyone?
what if i shoved this
<Files search>
DirectoryIndex search
ForceType application/x-httpd-php3
</Files>
into my htaccess in my root directory ?
Don't think so, since that "search" location is a relative path.
Hmm, what server are you running on?
a dedicated red hat linux 6.2 server using a webmin control panel
The fix to keep this from showing a private forum would be as follows:
In search.php3 change:
$threads=$DB_site->query("SELECT threadid,title FROM thread WHERE visible=1 ORDER BY lastpost DESC");
to:
$threads=$DB_site->query("SELECT threadid,title FROM thread WHERE visible=1 AND forumid <> 34 ORDER BY lastpost DESC");
Make sure you change "34" to the forumid of your private forum. (The forum id of a forum can be found by going to your main page and holding your mouse over a link, in the status bar you will see something like: "http://www.extremeforums.com/forums/forumdisplay.php?forumid=2" whatever number is next to "forumid=" is the forum id of that forum)
Sorry to put this in the terms that I did but some people seriously do have problems with this
~Chris
P.S Job well done Stallion!
Does anyone see any advantages / disadvantages to this:
<?
require("global.php");
mysql_pconnect($server,$user,$password);
mysql_select_db($database);
$threads=$DB_site->query("SELECT threadid,title,threadindex FROM thread WHERE visible=1 AND forumid <> 34 ORDER BY lastpost DESC");
while ($threadarray = $DB_site->fetch_array($threads)) {
$threadid = $threadarray["threadid"];
$title = $threadarray["title"];
$threadindex = $threadarray["threadindex"];
print "<a href=\"search/$threadid\">$threadindex</a><br><br>\n";
}
?>
Basically what you're doing is, instead of making the title of the thread the link, you're making the entire thread index (selected words from the thread) the link.
I was thinking this would give the spider more text to chew on and in turn, would make your site turn up in more searches, but I could be wrong. Any search engine pros wanna comment?
~Chris
I have a backend script that reports back all 404, 500, 400 etc etc errors. Well what I did was I went ahead and submitted my page to the engines that spider and within minutes I was getting logs of 404 not found errors on every one of the links that they were spidering from that page.
I think this is a real good idea and I really wish it would work :(
Funny thing is, though, if you go to the page you don't get the 404 errors.
http://www.extremeforums.com/forums/spider.php
here is the error....
ON YOUR SITE, Extreme Forums
ERROR CODE 404 NOT FOUND
OCCURRED ON Wed Aug 9 02:23:48 2000
WHEN THE URL /search/259 WAS REQUESTED
FROM THE PAGE
BY A USER AT 195.145.119.25
THE BROWSER WAS Infoseek Sidewinder/0.9
------------------------------------------------------------------------------
One thing I notice, though, is that the engines are looking for /search/***. Why aren't they looking for /forums/search/***? After all, that's what is listed on search.php3.
I'm sure someone knows more about this than I do, so could you please comment?
Thanks,
~Chris
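The thread never settles why the spider asked for /search/259. One possibility, offered only as a guess: a relative link is resolved against the base URL by dropping everything after the last "/", so a base href without a trailing slash would lose the /forums part. A minimal sketch of that rule, assuming a plain relative path and an absolute http base; resolve_relative is a hypothetical helper and example.com is a placeholder:

```php
<?php
// Hypothetical helper, not a full RFC 3986 resolver: it handles only a plain
// relative path such as "search/259" against an absolute http(s) base URL.
function resolve_relative(string $base, string $relative): string
{
    $parts = parse_url($base);
    $path = $parts['path'] ?? '/';
    // Keep everything up to and including the last "/" of the base path.
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $parts['scheme'] . '://' . $parts['host'] . $dir . $relative;
}

// With and without the trailing slash on the base URL.
echo resolve_relative('http://example.com/forums/', 'search/259'), "\n";
echo resolve_relative('http://example.com/forums', 'search/259'), "\n";
```

The first call yields http://example.com/forums/search/259 and the second http://example.com/search/259, which matches the shape of the URL in the error report. Whether that is what actually happened on this board is not known.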
well my problem still stands
http://animeboards.net/forums/search.php3
lists the threads but still doesn't work with the htaccess file in the forum directory...
I don't know why it won't work for some people, so don't ask.
But I do have a bug fix.
Instead of replacing $newpostlink="#newpost"; with $newpostlink="showthread.php?threadid=$threadid#newpost";, replace it with this:
if ($pagenumber=="1") {
  $newpostlink="showthread.php?threadid=$threadid#newpost";
} else {
  $newpostlink="showthread.php?threadid=$threadid&pagenumber=$pagenumber#newpost";
}
:D
[Edited by Ed Sullivan on 08-28-2000 at 01:55 AM]
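The same page-aware link can be sketched as a small function, assuming a current PHP version. The name new_post_link is hypothetical; the URL format is the one from the fix above:

```php
<?php
// Hypothetical helper wrapping the conditional above: the page number is
// added to the link only when the reader is past the first page.
function new_post_link(int $threadId, int $pageNumber = 1): string
{
    $link = 'showthread.php?threadid=' . $threadId;
    if ($pageNumber > 1) {
        $link .= '&pagenumber=' . $pageNumber;
    }
    return $link . '#newpost';
}

echo new_post_link(325, 2), "\n";
```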
thanks ed... i noticed that little bug on page spans...
still not working for the actual hack though :(
To implement Brian's "slow down the spider" suggestion of 7/13/00, I did this:
My "search" file was
<?
$searcharray=explode("/",$REQUEST_URI);
$searchcount=count($searcharray);
$spec = $searchcount - 1;
$threadid = $searcharray[$spec];
require("showthread.php");
?>
so I changed it to
<?
$searcharray=explode("/",$REQUEST_URI);
$searchcount=count($searcharray);
$spec = $searchcount - 1;
$threadid_with_dotphp_suffix = $searcharray[$spec];
$threadid = substr($threadid_with_dotphp_suffix, 0, -4);
require("showthread.php");
?>
and to add the ".php" to the end of the URL of each thread title in the search.php3 results listing, I modified this line in search.php3
print "<a href=\"search/$threadid\">$title</a><br>\n";
to become this
print "<a href=\"search/$threadid.php\">$title</a><br>\n";
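One thing to watch with substr($threadid_with_dotphp_suffix, 0, -4): it removes the last four characters whatever they are, so an old link without the suffix, such as search/259, would lose its ID. A minimal sketch that accepts both forms, assuming a current PHP version; thread_id_from_segment is a hypothetical name:

```php
<?php
// Hypothetical helper: accept "259" or "259.php" as the last URL segment.
function thread_id_from_segment(string $segment): ?int
{
    // Digits, optionally followed by ".php"; anything else is rejected.
    if (preg_match('/^(\d+)(?:\.php)?\z/', $segment, $m)) {
        return (int) $m[1];
    }
    return null;
}

var_dump(thread_id_from_segment('259.php'));
var_dump(thread_id_from_segment('259'));
```

This keeps links already picked up by spiders working after the suffix is added.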