Wlad 03-01-2010 01:18 PM

Censorship case-sensitive
 
Hello!

The word censor is case-sensitive for Cyrillic: the words "привет" and "Привет" are recognized as different. When I write words in the Latin alphabet, e.g. "hello" and "Hello", everything works normally and both words are replaced with "******". How can I make the censor case-insensitive for Cyrillic letters?

PS: sorry for my English, I use a translator

Marco van Herwaarden 03-01-2010 02:16 PM

I am not 100% sure what you are asking. Do you want both of those Russian words to be treated the same, regardless of case?

Wlad 03-01-2010 04:50 PM

Marco van Herwaarden, yes. If you write "Привет привет приВет ПРивет" and the censored word is "привет", the result is "Привет ****** приВет ПРивет". It should be "****** ****** ****** ******".

Marco van Herwaarden 03-02-2010 08:19 AM

I have not encountered this problem before, but my guess is that it is caused by the character set used for MySQL. If that doesn't know that both characters are the same letter, only in a different case, then it won't work.

kh99 03-02-2010 12:36 PM

I'm basing this answer on only about 10 minutes of research, but it looks like the censored words are detected using the PHP "preg" functions (http://us2.php.net/manual/en/ref.pcre.php), which are based on the PCRE library (http://www.pcre.org/). Beyond that, I don't know how to tell you to fix it. It could have to do with how the server locale is set or how those libraries were built. Some of the comments on this page might be helpful: http://us2.php.net/manual/en/function.setlocale.php
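To illustrate what may be going on: this is not vBulletin's code, only a small Python sketch. It assumes that the censor pattern is matched against raw UTF-8 bytes, which is a guess. In that case a case-insensitive match folds only ASCII letters, so "hello"/"Hello" match but "привет"/"Привет" do not. The function names censor_bytes and censor_unicode are made up for this example.

```python
import re

CENSOR_WORD = "привет"  # the censored word from this thread
TEXT = "Привет привет приВет ПРивет"


def censor_bytes(text: str, word: str) -> str:
    """Case-insensitive replace on raw UTF-8 bytes.

    With a bytes pattern, re.IGNORECASE only folds ASCII letters,
    so upper- and lower-case Cyrillic letters stay distinct.
    """
    pattern = re.compile(re.escape(word.encode("utf-8")), re.IGNORECASE)
    replaced = pattern.sub(b"*" * len(word), text.encode("utf-8"))
    return replaced.decode("utf-8")


def censor_unicode(text: str, word: str) -> str:
    """Case-insensitive replace on decoded text (Unicode-aware case folding)."""
    pattern = re.compile(re.escape(word), re.IGNORECASE)
    return pattern.sub("*" * len(word), text)


if __name__ == "__main__":
    print(censor_bytes(TEXT, CENSOR_WORD))       # Привет ****** приВет ПРивет
    print(censor_unicode(TEXT, CENSOR_WORD))     # ****** ****** ****** ******
    print(censor_bytes("hello Hello", "hello"))  # ***** *****
```

The byte-oriented version reproduces the result described above, while the Unicode-aware version censors all four spellings. Whether the PHP side behaves the same way depends on how the pattern is built and on the PCRE build, so treat this as a hint about where to look, not as a diagnosis.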

One comment on the PHP manual site (http://us2.php.net/manual/en/function.preg-match.php) mentions changing the pattern string to force the use of UTF-8. I have no idea whether this would fix your problem, but it's something you could probably try easily:

Quote:

I noticed that in order to deal with UTF-8 texts, without having to recompile php with the PCRE UTF-8 flag enabled, you can just add the following sequence at the start of your pattern: (*UTF8)

for instance : '#(*UTF8)[[:alnum:]]#' will return TRUE for '?' where '#[[:alnum:]]#' will return FALSE

found this very very useful tip after hours of research over the web directly in pcre website right here : http://www.pcre.org/pcre.txt
there are many further informations about UTF-8 support in the lib

hope that will help!

--
cedric

Doing a grep for $vbulletin->options['censorwords'] finds two files in the includes directory, functions.php and class_dm_user.php, so that's at least a place to start.

Maybe someone else out there knows more about locales?


All times are GMT.
