vb.org Archive


speedway 02-02-2008 04:19 AM

PHP Help, Stripping HTML tags and updating MySQL.
 
Hi all

I am looking for a bit of help in PHP. For my forum's DB, I want to programmatically update every post's content with the same content minus all the HTML tags. Basically a small PHP app to:

connect to MySQL,
open the post table,
extract the pagetext value,
strip all HTML tags from it
write it back to that record
loop to the next record.

Due to my distinct lack of PHP knowledge I am struggling with the looping bit and the retrieve/modify/update bit, so I would appreciate some kind soul pointing me in the right direction.
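
The steps listed above could be sketched roughly as follows. This is a minimal sketch, not a tested migration script: it assumes a PDO connection and a table named `post` with `postid` and `pagetext` columns (add your table prefix if you use one), and the helper names `clean_pagetext` and `clean_all_posts` are made up for illustration. Back up the database before running anything like this.

```php
<?php
// Hypothetical helper: remove HTML tags from one post body.
function clean_pagetext(string $pagetext): string
{
    return strip_tags($pagetext);
}

// Hypothetical batch loop over the post table.
// Assumption: table "post" with columns "postid" and "pagetext".
function clean_all_posts(PDO $db): int
{
    $update = $db->prepare('UPDATE post SET pagetext = :text WHERE postid = :id');
    $changed = 0;

    // Read every row first, then write back only the rows that changed.
    $rows = $db->query('SELECT postid, pagetext FROM post')->fetchAll(PDO::FETCH_ASSOC);
    foreach ($rows as $row) {
        $clean = clean_pagetext($row['pagetext']);
        if ($clean !== $row['pagetext']) {
            $update->execute(array(':text' => $clean, ':id' => $row['postid']));
            $changed++;
        }
    }

    return $changed; // number of posts rewritten
}
```

Calling it would look something like `clean_all_posts(new PDO('mysql:host=localhost;dbname=forum', 'user', 'pass'));` with your own connection details in place of those placeholders. On a large board you would probably want to process rows in batches instead of loading them all with `fetchAll()`.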

Thanks in advance

Cheers
Bruce

MoT3rror 02-02-2008 04:41 AM

When the posts are brought out of the database they are stripped of all the HTML tags, etc., if you have HTML turned off, and the BBCode is then applied.

speedway 02-02-2008 05:16 AM

Quote:

Originally Posted by MoT3rror (Post 1434373)
When the posts are brought out of the database they are stripped of all the HTML tags, etc., if you have HTML turned off, and the BBCode is then applied.

Thanks, but I am not concerned with how they look on the page. I want to convert my DB to UTF-8 and there are problematic posts that have been pasted from Microsoft Word. These contain font tags, color tags, font size tags and weird character formatting that causes grief. I basically want to strip everything HTML-related from every post and then I can safely convert the contents to UTF-8.

Cheers
Bruce

Opserty 02-02-2008 08:31 AM

The 'post' table only stores the post with BBCode, unless you had HTML enabled on your board or something?

speedway 02-02-2008 08:48 AM

Quote:

Originally Posted by Opserty (Post 1434445)
The 'post' table only stores the post with BBCode, unless you had HTML enabled on your board or something?

Damn, see? Shows how much experience I have with this! :)

So, I need to remove all BBcode tags and restore the text to the Verdana, size 2 set. Any pointers to how I would do that - sites, texts, books, anything?

Cheers
Bruce

Dismounted 02-02-2008 09:50 AM

You'll need some regex replacements. (I hate regex, as it's really confusing, but I'm giving it a shot :p.)
PHP Code:

$newtext = preg_replace('/\[(.+?)\]/', '', $text);
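
To see what that one-liner does, here is a small sketch with a made-up sample string. It removes anything between square brackets, so it also removes bracketed text that is not BBCode.

```php
<?php
// Made-up sample post text for illustration.
$text = '[FONT=Verdana][SIZE=2]Hello [B]world[/B][/SIZE][/FONT] [sic]';

// Lazy match: "[" followed by one or more characters up to the nearest "]".
$newtext = preg_replace('/\[(.+?)\]/', '', $text);

echo $newtext, "\n"; // prints "Hello world " (note that "[sic]" is gone too)
```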


speedway 02-02-2008 11:51 AM

Thank you Sir.

Armed with that I went searching on Google and found this:

Code:

function stripBBCode($text_to_search) {
    $pattern = '|[[\/\!]*?[^\[\]]*?]|si';
    $replace = '';
    return preg_replace($pattern, $replace, $text_to_search);
}

but it zaps *all* BBCode. Would anyone have any idea on how to make it *not* remove things like:
[ QUOTE ]
[ /QUOTE ]
[ QUOTE=
[ B ]
[ I ]
[ U ]
or any of the other basic ones? (spaces included on purpose so the forum doesn't try and use them)
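
One way to keep a few basic tags and strip the rest is a negative lookahead on the tag name. The following is only a sketch: the function name `strip_bbcode_except` and the sample text are made up, and the pattern will also remove any bracketed word that merely looks like a tag.

```php
<?php
// Hypothetical helper: strip every BBCode-looking tag except those in $keep.
function strip_bbcode_except(string $text, array $keep): string
{
    // Escape each tag name so it is safe inside the pattern.
    $names = array();
    foreach ($keep as $tag) {
        $names[] = preg_quote($tag, '/');
    }
    $keepAlt = implode('|', $names);

    // \[\/?            opening bracket, optional slash for closing tags
    // (?!(?:..)[=\]])  negative lookahead: do not match tags named in $keep
    // [a-z][a-z0-9]*   the tag name
    // (?:=[^\]]*)?     optional "=value" part, e.g. [size=2]
    $pattern = '/\[\/?(?!(?:' . $keepAlt . ')[=\]])[a-z][a-z0-9]*(?:=[^\]]*)?\]/i';

    return preg_replace($pattern, '', $text);
}

$in = '[quote=Bob][font=Verdana][size=2][b]Hi[/b][/size][/font][/quote]';
echo strip_bbcode_except($in, array('quote', 'b', 'i', 'u')), "\n";
// prints "[quote=Bob][b]Hi[/b][/quote]"
```

The `i` modifier makes the match case-insensitive, so `[QUOTE]` and `[quote]` are treated the same.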

I did find this function as well:

Code:

function stripBBCode($stringInput) {
    if (strpos($stringInput, '[') !== false) {
        $validBBCodeArray = array(
            'b',
            'i',
            'u',
            'url',
            'quote',
        );
       
        $validBBCode = join('|', $validBBCodeArray);
       
        $stringOutput = preg_replace(
            '@\[(?:\/{0,1}?)(?:' . $validBBCode . ')(?:\s{0,1}?)(?:\/{0,1}?)\]@',
            '',
            $stringInput
        );
    } else {
        $stringOutput = $stringInput;
    }

    return $stringOutput;
}

but it strips everything *except* font and size tags (for some reason).

All the help so far is appreciated, I assure you. Now for that last little step :)

Cheers
Bruce

Dismounted 02-03-2008 04:57 AM

Try this (it will strip all font and size tags, but nothing else):
PHP Code:

$newtext = preg_replace('/\[(?:\/{0,1}?)(?:font|size)(?:\s{0,1}?)(?:\/{0,1}?)\]/', '', $text);
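
One thing worth noting about the pattern above: as written it only matches bare lowercase tags such as [font] or [/size], so tags that carry a value (for example [FONT=Verdana] or [SIZE=2]) or that are stored in upper case are left alone. A sketch that allows an optional =value and ignores case, using a made-up sample string:

```php
<?php
// Made-up sample post text for illustration.
$text = '[QUOTE=Bob][FONT=Verdana][SIZE=2][B]Hi[/B][/SIZE][/FONT][/QUOTE]';

// \/?            optional slash (closing tag)
// (?:font|size)  only these two tag names
// (?:=[^\]]*)?   optional "=value", e.g. =Verdana or =2
// i modifier     match [FONT] as well as [font]
$newtext = preg_replace('/\[\/?(?:font|size)(?:=[^\]]*)?\]/i', '', $text);

echo $newtext, "\n"; // prints "[QUOTE=Bob][B]Hi[/B][/QUOTE]"
```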


speedway 02-03-2008 08:18 AM

Thanks Hanson

I tried that but it strips everything *but* the font & size tags :) I have tried nutting this one out myself but am still lost.

Cheers
Bruce

Reecey 02-03-2008 08:23 AM

Where do I put this code? I would like to get rid of the HTML used on my forum too, as when I first opened, many posts were made using HTML and I would prefer it to be BBCode.

Dismounted 02-03-2008 10:54 AM

Quote:

Originally Posted by speedway (Post 1435184)
Thanks Hanson

I tried that but it strips everything *but* the font & size tags :) I have tried nutting this one out myself but am still lost.

Cheers
Bruce

In the second function you posted, you said it strips everything but the font/size tags. But my function does it as well? That's not really possible as I've reversed the conditions...
Quote:

Originally Posted by Reecey (Post 1435187)
Where do I put this code? I would like to get rid of the HTML used on my forum too, as when I first opened, many posts were made using HTML and I would prefer it to be BBCode.

This strips BB Code, not HTML.

kansei 02-27-2008 07:13 PM

Quote:

Originally Posted by Reecey (Post 1435187)
Where do I put this code? I would like to get rid of the HTML used on my forum too, as when I first opened, many posts were made using HTML and I would prefer it to be BBCode.

+1

A new member on the forum I admin copied a post from a different forum, which is vBulletin BUT ALLOWS HTML in posts.. bah!

It's seriously hundreds and hundreds of ugly Photobucket image tags with target _blank and oh so many things.

I spent 20 minutes manually changing them all to vbcode img tags but.. I'm only 1/3 of the way through?

Poet PHP 02-28-2008 05:17 AM

If you want to convert BBCode to HTML, use this:

PHP Code:

require_once('./global.php');
require_once(DIR . '/includes/class_bbcode.php');

// Plain "=" is used here; the older "=& new" form is a parse error on PHP 7+.
$bbcode_parser = new vB_BbCodeParser($vbulletin, fetch_tag_list());

$previewmessage = $bbcode_parser->parse($message);

And to remove them, use:

PHP Code:

$previewmessage = strip_tags(strip_bbcode($message, true, true));

See the result: www.akafi.net/tvv.php

