TFSEO Google Sitemap Generator (URL rewriting support) (https://vborg.vbsupport.ru/showthread.php?t=198369)

SoftDux 02-22-2009 10:01 AM

Does anyone know if this mod will work on VB 3.6?

imported_silkroad 04-10-2009 08:46 AM

Can't get this to work with 3.7.x .....

Error:

[10-Apr-2009 05:30:44] PHP Fatal error: Call to undefined function convert_int_to_utf8() in /home/public_html/forum/sitemap.php(36) : regexp code on line 1

Crimm 04-14-2009 05:11 PM

I ran across this and I see people are having problems.

I really want this working so I have started working on it.

Update 1: I have noticed that there is no place for a table prefix, which a lot of us use. So I have added that to the variables.

Update 2: I noticed that a few statements nest several calls on one line, such as mysql_query inside mysql_num_rows. I have split some of these up for debugging purposes.

For example:

Code:

$result = mysql_query("select * from " . $prefix . "thread WHERE visible = 1",$link);
$num_rows = mysql_num_rows($result);   
$query = "select * from " . $prefix . "thread WHERE visible = 1  ORDER BY dateline desc" ; 
$result = mysql_query($query) or die("Query failed"); 

$num_rows2 = mysql_num_rows(mysql_query("select * from " . $prefix . "forum"));   
$query2 = "select * from " . $prefix . "forum ORDER BY forumid desc" ; 
$result2 = mysql_query($query2) or die("Query failed");

Update 3: It is failing validation (http://validator.w3.org/), but adding the necessary information to it causes the file to fail.

Update 4: It would be really nice if this file pulled from the config automatically.


Giving me the following file:

Code:

<?php

///////////////////////Configure this/////////////////////////////////////////////////////

$username="username"; //your db user name
$password="password"; // the pass for the db
$database="dbname"; // name of db
$server="localhost"; // server
$prefix = "vb_";
$sitename="http://something.com/"; // sitename including www
$maxkey = "8"; // max key words as you specified in tfseo control panel
$home_freq = "daily"; //always, hourly, daily, weekly, monthly, yearly, never
$home_priority = "1"; // google priority for home (www.yoursite.com)
$forum_freq = "daily"; //always, hourly, daily, weekly, monthly, yearly, never
$forum_priority = "0.8"; // google priority for forums (www.yoursite.com/f4)
$thread_freq = "daily"; //always, hourly, daily, weekly, monthly, yearly, never
$thread_priority = "0.4"; // google priority for threads (www.yoursite.com/f4/your-thread)

function remove_accents($string){ ////this strings must be the same you are using as character replacements, these are the "wide range"

    return strtr($string,
                "???????????????????????????????????????????????????????????????@?",
                "YuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyyEan");
}

////////////////////////////DONT NEED TO TOUCH ANYTHING BELOW//////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////



function unhtmlspecialchars($text)
{

                $text = preg_replace('/&#([0-9]+);/esiU', "convert_int_to_utf8('\\1')", $text);


        return str_replace(array('&lt;', '&gt;', '&quot;', '&amp;'), array('<', '>', '"', '&'), $text);
}



//connect to the database 
$link = mysql_connect($server,$username,$password);
mysql_select_db($database,$link) or die( "Unable to select database");

$result = mysql_query("select * from " . $prefix . "thread WHERE visible = 1",$link);
$num_rows = mysql_num_rows($result);   
$query = "select * from " . $prefix . "thread WHERE visible = 1  ORDER BY dateline desc" ; 
$result = mysql_query($query) or die("Query failed"); 

$num_rows2 = mysql_num_rows(mysql_query("select * from " . $prefix . "forum"));   
$query2 = "select * from " . $prefix . "forum ORDER BY forumid desc" ; 
$result2 = mysql_query($query2) or die("Query failed"); 

//this is the normal header applied to any Google sitemap.xml file 
echo '<?xml version="1.0" encoding="ISO-8859-1"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">'; 

//HOME RESULTS
 
$url_product ="http://" . $sitename;
         
$realdate = date("Y-m-d");
$year = substr($realdate,0,4); //work out the year
$mon  = substr($realdate,5,2); //work out the month
$day  = substr($realdate,8,2); //work out the day
$displaydate = $year.'-'.$mon.'-'.$day;

echo
'
<url> 
<loc>'.$url_product.'</loc> 
<lastmod>'.$displaydate.'</lastmod> 
<changefreq>'.$home_freq.'</changefreq> 
<priority>'.$home_priority.'</priority> 
</url> 
'; 


//FORUM RESULTS 
for($i=0;$i<$num_rows2; $i++)
{
$url_product ="http://" . $sitename . "/f" .mysql_result($result2,$i,"forumid");
   
/*you need to assign a date to the entity.  if you don't 
 store a timestamp in the Database then you need slapping*/ 
     
$realdate = date("Y-m-d");
$year = substr($realdate,0,4); //work out the year
$mon  = substr($realdate,5,2); //work out the month
$day  = substr($realdate,8,2); //work out the day
$displaydate = $year.'-'.$mon.'-'.$day;

echo
'
<url> 
<loc>'.$url_product.'</loc> 
<lastmod>'.$displaydate.'</lastmod> 
<changefreq>'.$forum_freq.'</changefreq> 
<priority>'.$forum_priority.'</priority> 
</url> 
'; 
}

//THREAD RESULTS 
for($i=0;$i<$num_rows; $i++)
{
//cleanurl
$title = mysql_result($result,$i,"title");
$a = strtolower($title);
$a = remove_accents($a);
$a = unhtmlspecialchars($a);
$a = str_replace("'", '', $a);
$a = preg_split("#[^a-z0-9]#", $a, -1, PREG_SPLIT_NO_EMPTY);
$a = array_slice($a, 0, $maxkey);
$a = implode("-",$a);
if (empty($a))
{
$a = 'thread';
}

//your url-product as we worked out in #4 
$url_product ="http://" . $sitename . "/f" .mysql_result($result,$i,"forumid") . "/" . $a . '-t' .mysql_result($result,$i,"threadid");
   
/*you need to assign a date to the entity.  if you don't 
 store a timestamp in the Database then you need slapping*/ 
     

$date = mysql_result($result,$i,"dateline"); //the date stored 
$realdate = date('Y-m-d H:i:s', $date);

$year = substr($realdate,0,4); //work out the year
$mon  = substr($realdate,5,2); //work out the month
$day  = substr($realdate,8,2); //work out the day

/*display the date in the format Google expects:
2006-01-29 for example*/

$displaydate = $year.'-'.$mon.'-'.$day;

//you can assign whatever changefreq and priority you like
echo
'
<url> 
<loc>'.$url_product.'</loc> 
<lastmod>'.$displaydate.'</lastmod> 
<changefreq>'.$thread_freq.'</changefreq> 
<priority>'.$thread_priority.'</priority> 
</url> 
'; 
  } 
 
mysql_close(); //close connection 
 
//close the XML attribute that we opened in #3 
echo 
'</urlset>'; 



?>

Errors I'm still receiving:
1) IE 7 says invalid XML
2) Awaiting Google Webmaster Tools to validate Sitemap.

Wants:
1) To pass validation
2) Pull info from config file

If I have time (which I don't have much of) - I'll come back here and update this and see if I can't help some people out :) by adding wants.

Note: I'm running 3.8.1 and I get output, but I'm not 100% sure that Google will accept it.

Also note: I'm not officially supporting this, like I said I don't have much time. I will try to get it working for myself and put it here and put notes, but that's about all I'm going to have time to do :)

Crimm 04-14-2009 05:25 PM

I just got returned a whole bunch of errors from Google Webmaster tools.

Let me keep working on this and see what I can present.

EDIT:

The errors were because of the $sitename variable.

Do not include http://
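The script builds every URL as "http://" . $sitename, so a $sitename that already starts with http:// yields a doubled prefix. A minimal sketch of a guard against that, assuming you would rather normalize the value than rely on the comment; normalize_sitename() is a hypothetical helper, not part of the original file:

```php
<?php
// Hypothetical helper (not in the original sitemap.php): strips a leading
// scheme and any trailing slash so that "http://" . $sitename . "/f4"
// produces exactly one prefix and one separator.
function normalize_sitename($sitename)
{
    $sitename = trim($sitename);
    // Drop a leading "http://" or "https://", case-insensitively.
    $sitename = preg_replace('#^https?://#i', '', $sitename);
    // Drop trailing slashes so "/f4" can be appended cleanly.
    return rtrim($sitename, '/');
}

$sitename = normalize_sitename("http://www.something.com/");
echo "http://" . $sitename . "/f4", "\n"; // prints http://www.something.com/f4
```

With this in place, either form of $sitename in the configuration block gives the same URLs.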

Crimm 04-14-2009 05:45 PM

Okay this update utilizes the config.php to pull database information, so you don't have to enter it.

The only thing you have to enter is the little bit of stuff at the top.

Code:

<?php
require_once('./includes/config.php');

///////////////////////Configure this/////////////////////////////////////////////////////

$sitename="SITENAME WITHOUT THE HTTP!!"; // sitename including www
$maxkey = "8"; // max key words as you specified in tfseo control panel
$home_freq = "daily"; //always, hourly, daily, weekly, monthly, yearly, never
$home_priority = "1"; // google priority for home (www.yoursite.com)
$forum_freq = "daily"; //always, hourly, daily, weekly, monthly, yearly, never
$forum_priority = "0.8"; // google priority for forums (www.yoursite.com/f4)
$thread_freq = "daily"; //always, hourly, daily, weekly, monthly, yearly, never
$thread_priority = "0.4"; // google priority for threads (www.yoursite.com/f4/your-thread)

function remove_accents($string){ ////this strings must be the same you are using as character replacements, these are the "wide range"

    return strtr($string,
                "???????????????????????????????????????????????????????????????@?",
                "YuAAAAAAACEEEEIIIIDNOOOOOOUUUUYsaaaaaaaceeeeiiiionoooooouuuuyyEan");
}

////////////////////////////DONT NEED TO TOUCH ANYTHING BELOW//////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////
///////////////////////////////////////////////////////////////////////////////////////////////

$username=$config['MasterServer']['username']; //your db user name
$password=$config['MasterServer']['password']; // the pass for the db
$database=$config['Database']['dbname']; // name of db
$server=$config['MasterServer']['servername']; // server
$prefix = $config['Database']['tableprefix'];


function unhtmlspecialchars($text)
{

                $text = preg_replace('/&#([0-9]+);/esiU', "convert_int_to_utf8('\\1')", $text);


        return str_replace(array('&lt;', '&gt;', '&quot;', '&amp;'), array('<', '>', '"', '&'), $text);
}



//connect to the database 
$link = mysql_connect($server,$username,$password);
mysql_select_db($database,$link) or die( "Unable to select database");

$result = mysql_query("select * from " . $prefix . "thread WHERE visible = 1",$link);
$num_rows = mysql_num_rows($result);   
$query = "select * from " . $prefix . "thread WHERE visible = 1  ORDER BY dateline desc" ; 
$result = mysql_query($query) or die("Query failed"); 

$num_rows2 = mysql_num_rows(mysql_query("select * from " . $prefix . "forum"));   
$query2 = "select * from " . $prefix . "forum ORDER BY forumid desc" ; 
$result2 = mysql_query($query2) or die("Query failed"); 

//this is the normal header applied to any Google sitemap.xml file 
echo '<?xml version="1.0" encoding="ISO-8859-1"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">'; 

//HOME RESULTS
 
$url_product ="http://" . $sitename;
         
$realdate = date("Y-m-d");
$year = substr($realdate,0,4); //work out the year
$mon  = substr($realdate,5,2); //work out the month
$day  = substr($realdate,8,2); //work out the day
$displaydate = $year.'-'.$mon.'-'.$day;

echo
'
<url> 
<loc>'.$url_product.'</loc> 
<lastmod>'.$displaydate.'</lastmod> 
<changefreq>'.$home_freq.'</changefreq> 
<priority>'.$home_priority.'</priority> 
</url> 
'; 


//FORUM RESULTS 
for($i=0;$i<$num_rows2; $i++)
{
$url_product ="http://" . $sitename . "/f" .mysql_result($result2,$i,"forumid");
   
/*you need to assign a date to the entity.  if you don't 
 store a timestamp in the Database then you need slapping*/ 
     
$realdate = date("Y-m-d");
$year = substr($realdate,0,4); //work out the year
$mon  = substr($realdate,5,2); //work out the month
$day  = substr($realdate,8,2); //work out the day
$displaydate = $year.'-'.$mon.'-'.$day;

echo
'
<url> 
<loc>'.$url_product.'</loc> 
<lastmod>'.$displaydate.'</lastmod> 
<changefreq>'.$forum_freq.'</changefreq> 
<priority>'.$forum_priority.'</priority> 
</url> 
'; 
}

//THREAD RESULTS 
for($i=0;$i<$num_rows; $i++)
{
//cleanurl
$title = mysql_result($result,$i,"title");
$a = strtolower($title);
$a = remove_accents($a);
$a = unhtmlspecialchars($a);
$a = str_replace("'", '', $a);
$a = preg_split("#[^a-z0-9]#", $a, -1, PREG_SPLIT_NO_EMPTY);
$a = array_slice($a, 0, $maxkey);
$a = implode("-",$a);
if (empty($a))
{
$a = 'thread';
}

//your url-product as we worked out in #4 
$url_product ="http://" . $sitename . "/f" .mysql_result($result,$i,"forumid") . "/" . $a . '-t' .mysql_result($result,$i,"threadid");
   
/*you need to assign a date to the entity.  if you don't 
 store a timestamp in the Database then you need slapping*/ 
     

$date = mysql_result($result,$i,"dateline"); //the date stored 
$realdate = date('Y-m-d H:i:s', $date);

$year = substr($realdate,0,4); //work out the year
$mon  = substr($realdate,5,2); //work out the month
$day  = substr($realdate,8,2); //work out the day

/*display the date in the format Google expects:
2006-01-29 for example*/

$displaydate = $year.'-'.$mon.'-'.$day;

//you can assign whatever changefreq and priority you like
echo
'
<url> 
<loc>'.$url_product.'</loc> 
<lastmod>'.$displaydate.'</lastmod> 
<changefreq>'.$thread_freq.'</changefreq> 
<priority>'.$thread_priority.'</priority> 
</url> 
'; 
  } 
 
mysql_close(); //close connection 
 
//close the XML attribute that we opened in #3 
echo 
'</urlset>'; 



?>

IE7 now renders it as XML. I have just resubmitted it and am waiting on Google Webmaster Tools.

CILGINKRAL_ 04-14-2009 06:17 PM

Thank you really, really good :)

Crimm 04-14-2009 07:00 PM

Looks like Google Webmaster Tools is taking it this time.

It'll take a long time though because I have a few LARGE RSS feeds coming in.

Tonight once I see it complete ... Maybe I'll work on further integrating it.

I wonder if paketeto would let me release it under 3.8 so that we have a clean place to discuss it.

Crimm 04-15-2009 01:19 AM

Google Webmaster Tools said:

Quote:

Property                     Status
Sitemap type                 Web
Format                       Sitemap
Submitted                    7 hours ago
Last downloaded by Google    5 hours ago
Status                       OK
Total URLs in Sitemap        6196

So this works.

I guess I'll wait to do any more modifications until I hear from paketeto.

I took the first version of vbtfseo and modified it heavily for my site, but this still works perfectly with it. I tested it on another site with vbtfseo version 2.X and it works there as well.

If anyone is interested in me working on this, please let me know.

zeus_r6 02-19-2010 04:43 PM

These are the errors I'm coming up with using your modified sitemap.php code:

78460
Parsing error
We were unable to read your Sitemap. It may contain an entry we are unable to recognize. Please validate your Sitemap before resubmitting.
Problem detected on: Feb 19, 2010

Warnings 78458
Invalid XML tag
This tag was not recognized. Please fix it and resubmit.
Parent tag: urlset
Tag: br
Problem detected on: Feb 19, 2010

Warnings 78459
Invalid XML tag
This tag was not recognized. Please fix it and resubmit.
Parent tag: urlset
Tag: b
Problem detected on: Feb 19, 2010

Warnings 78459
Invalid XML tag
This tag was not recognized. Please fix it and resubmit.
Parent tag: urlset
Tag: b
Problem detected on: Feb 19, 2010

Warnings 78459
Invalid XML tag
This tag was not recognized. Please fix it and resubmit.
Parent tag: urlset
Tag: b
Problem detected on: Feb 19, 2010

Warnings 78459
Invalid XML tag
This tag was not recognized. Please fix it and resubmit.
Parent tag: urlset
Tag: br
Problem detected on: Feb 19, 2010
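
The br and b tags in this report match the HTML markup PHP puts around its own warnings and notices when error display is on, so one plausible cause (not confirmed anywhere in this thread) is that a PHP message is being printed into the middle of the XML. A minimal sketch under that assumption: send errors to the log instead of the page, and XML-escape each URL before it goes into the loc element. The helper name xml_loc() is hypothetical.

```php
<?php
// Assumption: the stray <br>/<b> tags are PHP's HTML-formatted warnings
// leaking into the output. Log them instead of printing them.
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Hypothetical helper: escape &, <, >, " and ' so a URL is safe inside <loc>.
function xml_loc($url)
{
    $flags = ENT_QUOTES | ENT_SUBSTITUTE;
    return '<loc>' . htmlspecialchars($url, $flags, 'UTF-8') . '</loc>';
}

echo xml_loc('http://www.something.com/showthread.php?t=1&page=2'), "\n";
```

Checking the PHP error log after a run should then show which warning, if any, was ending up in the sitemap.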

Carpesimia 10-12-2011 12:18 AM

I know this response comes pretty late, but I just inherited a forum running TFSEO on 3.8.7, and my boss wanted a sitemap immediately. I downloaded this package and tested it out. Here is what I found, and I hope it helps you...

1) The sitemap ran for a while and then just stopped dead. It was missing a function called convert_int_to_utf8(). This could be the issue many of you reported. I added this function, and the sitemap ran to completion each time.

2) The sitemap generator is stupid and grabs all forums and threads, including private forums. So I added an "excluded_forums" variable, and excluded those forums in the 2 SQL statements.

3) The way this is written is S-L-O-W on a large forum. The forum I'm using has 184,000 threads and it took 38 minutes to create the sitemap. The box is a bad-ass machine too, with 32 GB of RAM and 4 quad-core processors. It's not a 386. It's slow because "select *" and mysql_result() grab a lot of data and then pick each piece out of the results one at a time. So yeah, SLOW.

4) Having 184k URLs, my sitemap is too big for Google to eat. (The sitemap protocol allows at most 50,000 URLs per file.)
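
Points 1 and 2 can be sketched as follows. This is not the code described in this post, which was never published in the thread; it is a plain reimplementation under stated assumptions. convert_int_to_utf8() keeps the name the script calls, unhtmlspecialchars() is rewritten with preg_replace_callback() because the /e modifier used in the posted file was removed in PHP 7, and excluded_forums_condition() is a hypothetical helper name.

```php
<?php
// Sketch of the missing helper: encode a Unicode code point as UTF-8 bytes.
function convert_int_to_utf8($intval)
{
    $cp = (int) $intval;
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp <= 0x10FFFF) {
        return chr(0xF0 | ($cp >> 18))
             . chr(0x80 | (($cp >> 12) & 0x3F))
             . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return ''; // outside the Unicode range
}

// Same job as the posted unhtmlspecialchars(), without the /e modifier.
function unhtmlspecialchars($text)
{
    $text = preg_replace_callback('/&#([0-9]+);/', function ($m) {
        return convert_int_to_utf8($m[1]);
    }, $text);
    return str_replace(array('&lt;', '&gt;', '&quot;', '&amp;'),
                       array('<', '>', '"', '&'), $text);
}

// Hypothetical helper: SQL condition that skips the listed forum IDs.
// IDs are cast to integers so nothing else can reach the query.
function excluded_forums_condition(array $excluded_forums)
{
    $ids = array_filter(array_map('intval', $excluded_forums));
    return $ids ? 'forumid NOT IN (' . implode(',', $ids) . ')' : '1 = 1';
}

$prefix = 'vb_';                 // assumed table prefix
$excluded_forums = array(12, 15); // assumed private forum IDs
echo "select threadid, forumid, title, dateline from " . $prefix
   . "thread WHERE visible = 1 AND "
   . excluded_forums_condition($excluded_forums), "\n";
```

The same condition can follow WHERE in the forum query, and naming only the columns the script reads also avoids the "select *" cost described in point 3.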

So here's what I had to do to make it work:

1) I replaced the I/O with a database library I use all the time. Just this one change took the processing of this program from 38 minutes down to 11 seconds. This is because instead of grabbing the data piecemeal from the database, I selected only the fields I was using, grabbed the data in bulk from the DB, and dropped it into an array for processing. So yeah, 11 seconds.

2) The file was still big, so I changed the program to write the file to disk instead of echoing the output. This way I can generate it via cron whenever I want.

3) I also added a new variable called $urlsperfile and wrote that many URLs to each sitemap_#.xml, cycling to a new file whenever I got more than $urlsperfile.

4) Added a call to gzip to compress all of the files I made.

5) Created an uncompressed sitemap.xml file, which is a sitemap index file pointing to all of the sitemap_#.xml.gz files I just processed.

6) Finally, made some very MINOR changes to the XML to meet the sitemaps.org requirements, found at: http://www.sitemaps.org/protocol.php
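
Steps 2 to 6 can be sketched roughly as below. This is not the actual program from this post, which was not published; the function names, the $baseurl parameter and the row layout are assumptions for illustration. It uses PHP's gzencode() (zlib extension) in place of a call to the gzip binary, the sitemaps.org 0.9 namespace instead of the older Google 0.84 one, and UTF-8, which the sitemaps.org protocol requires.

```php
<?php
// Sketch only: names and layout are assumptions, not the poster's code.
const SITEMAP_NS = 'http://www.sitemaps.org/schemas/sitemap/0.9';

// Render one <urlset> document from rows shaped like
// array('loc' => 'http://...', 'lastmod' => 'YYYY-MM-DD').
function build_urlset(array $rows)
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<urlset xmlns="' . SITEMAP_NS . '">' . "\n";
    foreach ($rows as $row) {
        $xml .= '<url><loc>' . htmlspecialchars($row['loc'], ENT_QUOTES, 'UTF-8')
              . '</loc><lastmod>' . $row['lastmod'] . '</lastmod></url>' . "\n";
    }
    return $xml . '</urlset>' . "\n";
}

// Render the sitemap index that points at each compressed part.
function build_sitemap_index(array $fileurls)
{
    $xml = '<?xml version="1.0" encoding="UTF-8"?>' . "\n"
         . '<sitemapindex xmlns="' . SITEMAP_NS . '">' . "\n";
    foreach ($fileurls as $url) {
        $xml .= '<sitemap><loc>' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8')
              . '</loc></sitemap>' . "\n";
    }
    return $xml . '</sitemapindex>' . "\n";
}

// Write sitemap_1.xml.gz ... sitemap_N.xml.gz into $dir, plus an
// uncompressed sitemap.xml index. Returns the public URLs of the parts.
function write_sitemaps(array $rows, $urlsperfile, $dir, $baseurl)
{
    $fileurls = array();
    foreach (array_chunk($rows, $urlsperfile) as $n => $chunk) {
        $name = 'sitemap_' . ($n + 1) . '.xml.gz';
        file_put_contents($dir . '/' . $name, gzencode(build_urlset($chunk), 9));
        $fileurls[] = $baseurl . '/' . $name;
    }
    file_put_contents($dir . '/sitemap.xml', build_sitemap_index($fileurls));
    return $fileurls;
}
```

A cron job would collect the rows in one bulk query and then call something like write_sitemaps($rows, 20000, '/path/to/forum', 'http://www.something.com'), where both the path and the site name are placeholders.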

The final outcome? A cron job that runs in less than 15 seconds and generates 10 sitemap files with 20k URLs each, plus one sitemap index file linking them all.

Oh, and this is on a 3.8.7 forum.

