  #1  
Old 11-16-2014, 12:20 AM
omardealo
Join Date: Nov 2008
Location: Egypt
Posts: 235
[SOLVED] Arabic encoding with preg_match_all

Hello,

I am trying to detect banned words when a user submits a new post, but only the English words are found. I think the problem is the Arabic encoding. I tried to fix it with
PHP Code:
iconv("windows-1256", "utf-8", $string);
but it does not work. Any suggestions?


Code:
$wordss = "هالو|مرحبا|google.com";
$bwords = explode("|", $wordss);
//$string = $vbulletin->GPC['message'];
$string = 'BLA BLA مرحبا BLA BLA google.com BLA BLA BLA ';
// Match any banned word as a whole word, case-insensitively.
$matchFound = preg_match_all(
                "/\b(" . implode("|", $bwords) . ")\b/i",
                $string,
                $matches
              );
// Keep each matched word only once.
$words = array_unique($matches[0]);
print_r($words);

Output: google.com
Expected: google.com, مرحبا
  #2  
Old 11-16-2014, 01:40 AM
kh99
Join Date: Aug 2009
Location: Maine
Posts: 13,185

Maybe try putting a u at the end of your pattern string:
Code:
"/\b(" . implode("|", $bwords) . ")\b/iu"
to tell it to treat the pattern and the subject as UTF-8.
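
For reference, here is a minimal sketch of what the u modifier changes. It assumes the script file is saved as UTF-8 and that PHP's PCRE was built with Unicode support. The preg_quote() call and the variable names are additions for illustration and are not part of the original snippet.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literals below are UTF-8 bytes.
$subject = 'BLA BLA مرحبا BLA BLA google.com BLA BLA BLA';
$words   = array('هالو', 'مرحبا', 'google.com');

// preg_quote() escapes regex metacharacters, e.g. the dot in "google.com".
$quoted = array();
foreach ($words as $word) {
    $quoted[] = preg_quote($word, '/');
}
$alternation = implode('|', $quoted);

// Without /u the subject is processed byte by byte, so \b only knows
// about single-byte word characters.
$countBytes = preg_match_all('/\b(' . $alternation . ')\b/i', $subject, $byteMatches);

// With /u the pattern and subject are decoded as UTF-8 before matching.
$countUtf8 = preg_match_all('/\b(' . $alternation . ')\b/iu', $subject, $utf8Matches);

print_r($byteMatches[0]);
print_r($utf8Matches[0]);
```

If the /u call returns false rather than a count, preg_last_error() will say why; a subject that is not valid UTF-8 is a common cause.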
  #3  
Old 11-16-2014, 02:09 AM
omardealo
Join Date: Nov 2008
Location: Egypt
Posts: 235

Quote:
Originally Posted by kh99 View Post
Maybe try putting a u at the end of your pattern string:
Code:
"/\b(" . implode("|", $bwords) . ")\b/iu"
to tell it to treat the pattern and the subject as UTF-8.
Yes sir, I have already tried this pattern.
I changed /i to /iu and tried it in different places:
- in an external PHP file on the live site, with /iu only -> works
- in a vBulletin plugin on localhost, with /i only -> works
but:
- in an external PHP file on localhost -> does not work
- in a vBulletin plugin on the live site -> does not work

So I am confused :erm: and do not know what is wrong.

--------------- Added 11-16-2014 at 04:37 AM (GMT) ---------------

UPDATE:
When I save the PHP files with ANSI encoding, the Arabic results appear with the pattern "/\b(" . implode("|", $bwords) . ")\b/i".
But how do I solve this in a plugin?
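
One possible explanation for the mixed results is that the pattern and the post text are not always in the same encoding. The sketch below normalizes the subject to UTF-8 before matching with /u. It assumes the forum stores text as windows-1256; the helper name to_utf8() and its parameter are hypothetical and are not part of vBulletin.

```php
<?php
// Hypothetical helper (not a vBulletin function): return $text as UTF-8.
// $sourceCharset is an assumption about the forum's configured charset.
function to_utf8($text, $sourceCharset = 'windows-1256')
{
    // With /u, preg_match() returns false for subjects that are not valid
    // UTF-8, so an empty pattern doubles as a validity check.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    return iconv($sourceCharset, 'UTF-8', $text);
}

// Simulate a post that arrives as windows-1256 bytes.
$utf8Post   = 'BLA BLA مرحبا BLA BLA';
$cp1256Post = iconv('UTF-8', 'windows-1256', $utf8Post);

$pattern = '/\b(مرحبا)\b/iu';

// Matching the raw bytes with /u fails outright instead of returning 0.
$rawResult = preg_match_all($pattern, $cp1256Post, $rawMatches);
$rawError  = preg_last_error(); // PREG_BAD_UTF8_ERROR when the subject is not UTF-8

// After normalizing, the same pattern finds the word.
$found = preg_match_all($pattern, to_utf8($cp1256Post), $matches);
print_r($matches[0]);
```

The validity check is only a heuristic: short windows-1256 strings can occasionally happen to be valid UTF-8, so passing the known charset explicitly is safer where it is available.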
  #4  
Old 11-16-2014, 01:56 PM
kh99
Join Date: Aug 2009
Location: Maine
Posts: 13,185

I can't get it to work on my test system either, so I'm afraid I'm stumped. I googled for an answer, but the only thing I found was a mention that some versions of PHP may not handle UTF-8 matching correctly.
  #5  
Old 11-16-2014, 02:15 PM
omardealo
Join Date: Nov 2008
Location: Egypt
Posts: 235

Quote:
Originally Posted by kh99 View Post
I can't get it to work on my test system either, so I'm afraid I'm stumped. I googled for an answer, but the only thing I found was a mention that some versions of PHP may not handle UTF-8 matching correctly.
Yes, I also searched a lot on Google, thank you.
But I do not think PHP versions are the reason, because the code works in an external file on the same site without re-encoding it, yet it does not work in a vB plugin.
Anyway, can I do what I want another way? I need to match banned words and print them, with no problems for the Arabic words.

--------------- Added 11-16-2014 at 05:02 PM (GMT) ---------------

UPDATE:
I found the solution:

\b detects word boundaries; remove them to get a plain match.

Just use the pattern

Code:
"/(" . implode("|", $bwords) . ")/i"
Thanks, kh99
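
A caveat with dropping \b is that banned words then also match inside longer words. If that matters, one alternative is to keep the boundary check but express it with Unicode-aware lookarounds. The sketch below assumes the text is UTF-8; the function name find_banned_words() is hypothetical.

```php
<?php
// Hypothetical helper: list the banned words that occur as whole words.
// Assumes $text and $bannedWords are UTF-8.
function find_banned_words(array $bannedWords, $text)
{
    $quoted = array();
    foreach ($bannedWords as $word) {
        // Escape regex metacharacters such as the dot in "google.com".
        $quoted[] = preg_quote($word, '/');
    }

    // The lookarounds require that no letter, digit or underscore touches
    // the match on either side, in any script.
    $pattern = '/(?<![\p{L}\p{N}_])(?:' . implode('|', $quoted) . ')(?![\p{L}\p{N}_])/iu';

    if (preg_match_all($pattern, $text, $matches) === false) {
        return array(); // e.g. $text was not valid UTF-8
    }
    return array_values(array_unique($matches[0]));
}

$banned = array('هالو', 'مرحبا', 'google.com');
$hits   = find_banned_words($banned, 'BLA BLA مرحبا BLA BLA google.com BLA BLA BLA');
print_r($hits);
```

With this approach a banned word glued to other letters is not reported, which may or may not be what a word filter wants.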