  #1  
03-17-2007, 08:36 PM
orth
Join Date: Mar 2007
Location: Toronto, Ontario, Canada
Posts: 7
Word Processors & vBulletin UTF-8 Stripped Posts (Possible Fix?)

Hi all, I'm fairly new to the inner workings of vBulletin, and I found myself tackling a bug that appeared when users pasted posts directly from MS Word or AppleWorks. These programs have SmartQuotes and AutoCorrect options, and the characters they insert seem to arrive in Windows-1252 encoding. When a user pasted such text, the post was cut off at the first character that was not valid UTF-8, and everything after it was lost. Googling for vBulletin-specific help didn't turn up much, but after a lot of searching I found a helpful post on the PHP forums. All I had to do was add the mb_convert_encoding() line at the top of htmlspecialchars_uni() in includes/class_core.php:

Code:
function htmlspecialchars_uni($text, $entities = true)
{
        // Added line: convert non-ASCII characters to HTML entities, auto-detecting the input encoding
        $text = mb_convert_encoding($text,"HTML-ENTITIES","auto");
        return str_replace(
                // replace special html characters
                array('<', '>', '"'),
                array('&lt;', '&gt;', '&quot;'),
                preg_replace(
                        // translates all non-unicode entities
                        '/&(?!' . ($entities ? '#[0-9]+' : '(#[0-9]+|[a-z]+)') . ';)/si',
                        '&amp;',
                        $text
                )
        );
}
As I said, I'm not very familiar with vBulletin's inner workings or with PHP, so I just want to make sure this change doesn't undermine any of the HTML security measures. If it doesn't, then maybe someone else who is having the same problem will find it useful.
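
For readers weighing this fix, here is a minimal sketch of a narrower variant. Rather than letting mb_convert_encoding() guess with "auto", it checks whether the text is already valid UTF-8 and otherwise assumes Windows-1252, which is what the smart quotes described above suggest. The function name normalize_to_utf8() is hypothetical and not part of vBulletin, and the Windows-1252 assumption would be wrong for a board that receives other legacy encodings.

```php
<?php
// Hypothetical helper, not part of vBulletin: normalise pasted text to UTF-8.
function normalize_to_utf8(string $text): string
{
    // Text that is already valid UTF-8 is returned unchanged.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }

    // Assumption: anything else came from a Windows-1252 source such as Word.
    return mb_convert_encoding($text, 'UTF-8', 'Windows-1252');
}

// Windows-1252 bytes for curly double quotes, an en dash and a curly apostrophe.
$pasted = "\x93Hello\x94 \x96 it\x92s";
$utf8   = normalize_to_utf8($pasted);
```

The conversion only re-encodes characters and does not add markup, so escaping of <, > and " can still be applied to the result afterwards.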

Thanks for any insight,
orth
  #2  
06-22-2007, 11:31 PM
Asilouhuette
Join Date: May 2007
Posts: 7

I have this problem as well. It seems to work fine for previews, but once you actually make the post it screws everything up.

Is that how you experienced it as well?

I haven't seen any other recommendations on this, so unless someone (from the vB team, maybe?) advises otherwise, I think I'll use this fix. Does anyone know whether it will be included in future versions?

Thanks
  #3  
06-23-2007, 02:55 AM
Dismounted
Join Date: Jun 2005
Location: Melbourne, Australia
Posts: 15,047

You should report bugs to the vBulletin Bug Tracker (http://www.vbulletin.com/forum/project.php?projectid=6).
  #4  
07-03-2007, 06:36 AM
Asilouhuette
Join Date: May 2007
Posts: 7

I tried this fix and unfortunately it didn't work for me.

I'm not sure that it is really a bug so much as a poor choice of character encoding on my part. Choosing UTF-8 may have let me avoid this.

I tried fixing it myself, with varying degrees of success. I added a str_replace() that replaces smart quotes and other special characters with their &blah; counterparts. This gave some improvement, but I couldn't find the best place to put it. I tried having it change $dataman->['pagetext'] after the various _presave hooks, but that didn't cover all cases: it would work for edits or for new posts, but not for both.

Could anyone point me to a hook that is reached by all new posts and post edits, and tell me which variable would be best to change at that hook?

I think it would be fairly trivial to do this without a hook, but that of course makes maintaining updates much harder. Here is the str_replace() I'm using:

PHP Code:
    $wordentities = array(chr(128), chr(130), chr(131), chr(132), chr(133), chr(134), chr(135), chr(136), chr(137), chr(138), chr(139), chr(140), chr(145), chr(146), chr(147), chr(148), chr(149), chr(150), chr(151), chr(152), chr(153), chr(154), chr(155), chr(156), chr(159));

    $htmlwordentities = array('&euro;', '&sbquo;', '&fnof;', '&bdquo;', '&hellip;', '&dagger;', '&Dagger;', '&circ;', '&permil;', '&Scaron;', '&lsaquo;', '&OElig;', '&lsquo;', '&rsquo;', '&ldquo;', '&rdquo;', '&bull;', '-', '&mdash;', '&tilde;', '&trade;', '&scaron;', '&rsaquo;', '&oelig;', '&Yuml;');

    return str_replace($wordentities, $htmlwordentities, $text);
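
The same mapping can also be written as a single associative array, which keeps each byte next to its replacement so the two lists cannot drift out of step. This is only a sketch: word_bytes_to_entities() is a hypothetical name, the map covers just a few of the characters listed above, and it assumes the board stores text in a single-byte charset such as Windows-1252. On a UTF-8 board the same byte values occur inside multi-byte characters, so a byte-level replacement like this would corrupt them.

```php
<?php
// Hypothetical helper, not part of vBulletin: replace a few Windows-1252
// "smart" punctuation bytes with named HTML entities.
function word_bytes_to_entities(string $text): string
{
    // Each key is a single Windows-1252 byte; each value is its entity.
    $map = array(
        chr(133) => '&hellip;', // horizontal ellipsis
        chr(145) => '&lsquo;',  // left single quote
        chr(146) => '&rsquo;',  // right single quote / apostrophe
        chr(147) => '&ldquo;',  // left double quote
        chr(148) => '&rdquo;',  // right double quote
        chr(151) => '&mdash;',  // em dash
    );

    // With an array argument, strtr() replaces every key found in $text.
    return strtr($text, $map);
}

echo word_bytes_to_entities(chr(147) . 'quoted' . chr(148)); // prints &ldquo;quoted&rdquo;
```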

Thanks in advance.
  #5  
11-08-2007, 09:18 PM
dnetzer
Join Date: Aug 2007
Posts: 4

Just stumbled upon this post. Thanks for the insightful information, people. Have you since found a better solution? I see people inserting text that is cut and pasted from Word docs. What a mess.