vb.org Archive (https://vborg.vbsupport.ru/index.php)
-   vBulletin 3.7 Add-ons (https://vborg.vbsupport.ru/forumdisplay.php?f=228)
-   -   Major Additions - Links and Downloads Manager (https://vborg.vbsupport.ru/showthread.php?t=166094)

AndrewD 08-30-2008 05:34 AM

Quote:

Originally Posted by derfelix (Post 1610354)

find:
PHP Code:

$find[] = '/(\b'.$w.'\b)/i'

replace with:
PHP Code:

$find[] = '/('.preg_quote($w).')/iu'

BUT.. my question.. is there a drawback?????

Felix

Thanks, Felix. Indeed the problem is/was the word boundary. The drawback with removing the word boundary markers is that you end up highlighting substrings in the results which the search itself did not match.

For example, suppose you have the string "happily merrily sadly happilymerrilysadly" and you search for merrily.

This should highlight as "happily [merrily] sadly happilymerrilysadly" (brackets mark the highlighted text here),

and it does with the word boundary flags in the regex.

But without them, it highlights as "happily [merrily] sadly happily[merrily]sadly".
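
The difference can be reproduced with a small sketch. The function name highlight_demo and the square-bracket markers are assumptions made for illustration only; this is not LDM's own highlighting code, and it uses plain ASCII text so that the behaviour of \b is not in question.

```php
<?php
// Illustrative sketch only: highlight_demo() is a hypothetical helper, not LDM code.
// It wraps every match of $word in square brackets.
function highlight_demo(string $text, string $word, bool $useBoundary): string
{
    // Escape regex metacharacters and the "/" delimiter in the search word.
    $quoted = preg_quote($word, '/');

    // With \b the word must stand alone; without it any substring matches.
    $pattern = $useBoundary
        ? '/\b' . $quoted . '\b/i'
        : '/' . $quoted . '/i';

    $result = preg_replace($pattern, '[$0]', $text);

    // preg_replace() returns null on error; fall back to the untouched text.
    return $result === null ? $text : $result;
}

$text = 'happily merrily sadly happilymerrilysadly';
echo highlight_demo($text, 'merrily', true), "\n";
echo highlight_demo($text, 'merrily', false), "\n";
```

With the boundary the second occurrence inside happilymerrilysadly is left alone; without it, that substring is bracketed as well.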

** Edited **

Can you try another way of solving the word boundary problem? Edit the loop in ldm_make_highlight_regex as follows:

Code:

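        foreach ($words AS $w) {
                if ($w != "") {
                        // word between two control/punctuation/separator characters
                        $find[] = '/([\p{C}\p{P}\p{Z}]' . $w . '[\p{C}\p{P}\p{Z}]' . ')/iu';
                        // word at the very start of the text
                        $find[] = '/^(' . $w . '[\p{C}\p{P}\p{Z}]' . ')/iu';
                        // word at the very end of the text
                        $find[] = '/([\p{C}\p{P}\p{Z}]' . $w . '$)/iu';
                }
        }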


pooslokka 08-30-2008 06:02 AM

Hi, we have been using your plugin for the last two months and it is working like a charm.

I have a question.

Are thumbnails auto-generated, or do we need to upload a thumbnail? This is for wallpapers.

:confused:

AndrewD 08-30-2008 06:32 AM

Quote:

Originally Posted by pooslokka (Post 1610433)
Hi, we have been using your plugin for the last two months and it is working like a charm.

I have a question.

Are thumbnails auto-generated, or do we need to upload a thumbnail? This is for wallpapers.

:confused:

First, you have to set the LDM admin setting link_imagesize, otherwise thumbs are not shown at all.

Then, if you fill in the Image field on the Add/Edit Entry form with an image URL or image upload, LDM will use that to generate the thumb.

Otherwise, if the entry url itself is an image, the thumbnail is autogenerated from the entry url.

Otherwise, if you have installed the id3tag_enhancements LDM extra and the url is an mp3, it will try to pull out the album art from the mp3.
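
The order of checks described above can be summarised in a rough sketch. Everything here is an assumption made for illustration: the function name pick_thumbnail_source, its parameters, the returned labels and the extension-based image test are hypothetical and are not taken from LDM's source (the syntax also assumes PHP 7.1 or later).

```php
<?php
// Hypothetical sketch of the fallback order described in the post; not LDM code.
// Returns a label naming where the thumbnail would come from, or null for none.
function pick_thumbnail_source(
    int $linkImageSize,   // value of the link_imagesize setting
    string $imageField,   // content of the Image field ('' if left empty)
    string $entryUrl,     // the entry's own URL
    bool $hasId3Extra     // whether the id3tag_enhancements extra is installed
): ?string {
    // link_imagesize must be set, otherwise no thumbs are shown at all.
    if ($linkImageSize <= 0) {
        return null;
    }
    // 1. An explicit image URL or upload wins.
    if ($imageField !== '') {
        return 'image_field';
    }
    // Assumption: decide by file extension whether the entry URL is an image.
    $path = (string) parse_url($entryUrl, PHP_URL_PATH);
    $ext  = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    // 2. The entry URL itself is an image.
    if (in_array($ext, ['jpg', 'jpeg', 'png', 'gif'], true)) {
        return 'entry_url';
    }
    // 3. An mp3 entry with the ID3 extra installed: try the album art.
    if ($hasId3Extra && $ext === 'mp3') {
        return 'mp3_album_art';
    }
    return null;
}

var_dump(pick_thumbnail_source(200, '', 'http://example.com/wall/beach.jpg', false));
```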

derfelix 08-30-2008 06:43 AM

Quote:

Originally Posted by AndrewD (Post 1610417)
Thanks, Felix. Indeed the problem is/was the word boundary. The drawback with removing the word boundary markers is that you end up highlighting substrings in the results which the search itself did not match.
For example, suppose you have the string "happily merrily sadly happilymerrilysadly" and you search for merrily.

This should highlight as "happily [merrily] sadly happilymerrilysadly" (brackets mark the highlighted text here),

and it does with the word boundary flags in the regex.

But without them, it highlights as "happily [merrily] sadly happily[merrily]sadly".

So we need to solve the word boundary problem in utf8.

Well then, I am happy :p then it is actually a feature.

If you search for "intern" in Google, words like international, internal or internship are highlighted in the description and the title!

I was going to modify the search from "word" to "*word*" anyway, because if I search for "luxury" and only have one entry with the word "luxuryhotels" in its description, I would get no results; it would not show up. In that case, at least the highlighting would already be done.
---------------------
On the other hand, using LDM as is, it is also not a major drawback:
if you are looking for merrily, it will only show you results where the word "merrily" is standalone, so you do have the correct results. And if an entry also contains an extra sadlymerrilysadly, then that will be highlighted too, which I think is a feature!
---------------------

So if that is the only drawback, I'm sticking with that solution, especially as PHP 6 is going to have full Unicode support, and I am ready to bet that in PHP 6 this problem will be solved!

But at least for the moment, adding the /u modifier (making it /iu) to the regex will help for languages like German, French or Spanish, as the highlighting will then work as you expect.


Felix
PS: Just seen your edit, testing now!

[EDIT]
Just tested your routine. It works fine with the description (but not with keywords), hmmm.

BUT with Chinese there is another problem. I did some reading (I do not understand Chinese).
I was trying to extract content to use as a description, and that is how I stumbled upon this article.
It says:
Quote:

Chinese sentences are written with no special delimiters such as space to indicate word boundaries. Existing Chinese NLP systems therefore employ preprocessors to segment sentences into words.
source: http://portal.acm.org/citation.cfm?id=981621

If this is true, I think that the "no boundary" version will, for the moment, be the easiest solution for Chinese.

pooslokka 08-30-2008 06:54 AM

Quote:

Originally Posted by AndrewD (Post 1610439)
First, you have to set the LDM admin setting link_imagesize, otherwise thumbs are not shown at all.

Then, if you fill in the Image field on the Add/Edit Entry form with an image URL or image upload, LDM will use that to generate the thumb.

Otherwise, if the entry url itself is an image, the thumbnail is autogenerated from the entry url.

Otherwise, if you have installed the id3tag_enhancements LDM extra and the url is an mp3, it will try to pull out the album art from the mp3.

Thanks, but how do I specify link_imagesize? No example is given.
Is this format OK: 200px? 150px?

AndrewD 08-30-2008 07:48 AM

Quote:

Originally Posted by pooslokka (Post 1610445)
Thanks, but how do I specify link_imagesize? No example is given.
Is this format OK: 200px? 150px?

No, you just put in a number. As the admin page says:

Size in pixels (along larger dimension) of thumbnail image shown within linkbit. 0: No thumbnails is displayed and data entry forms do not offer image fields.

The information about picking up current value of Thumbnail Size, as set in the vBulletin Admin Control Panel, is not working on every system, for a reason that I do not yet understand.
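
To illustrate what "size in pixels along the larger dimension" means for a plain number such as 200, here is a rough arithmetic sketch. The function thumbnail_dimensions is hypothetical, and both the rounding and the choice never to upscale small images are assumptions; LDM's real resizing code may behave differently.

```php
<?php
// Hypothetical sketch, not LDM code: fit an image so that its larger side
// equals $maxSize pixels, keeping the aspect ratio.
function thumbnail_dimensions(int $width, int $height, int $maxSize): array
{
    $larger = max($width, $height);

    // A size of 0 means "no thumbnails"; also guard against empty images.
    if ($maxSize <= 0 || $larger <= 0) {
        return [0, 0];
    }
    // Assumption: images already small enough are left at their own size.
    if ($larger <= $maxSize) {
        return [$width, $height];
    }
    // Scale both sides by the same factor, never going below 1 pixel.
    return [
        max(1, (int) round($width * $maxSize / $larger)),
        max(1, (int) round($height * $maxSize / $larger)),
    ];
}

// A 1920x1200 wallpaper with link_imagesize = 200 gives a 200x125 thumb.
print_r(thumbnail_dimensions(1920, 1200, 200));
```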

AndrewD 08-30-2008 07:54 AM

Quote:

Originally Posted by derfelix (Post 1610442)
Chinese sentences are written with no special delimiters such as space to indicate word boundaries. Existing Chinese NLP systems therefore employ preprocessors to segment sentences into words.

Yes, that was one of the things I discovered too, when I looked into this some time ago. There has been a very helpful and knowledgeable Chinese user (ItsBlack) on this forum (he has done all the Chinese translations); maybe he will spot this post and comment.

I will look at the keyword problem.

Edited:

Yes, of course, there needs to be a fourth line:
Code:

        $find[] = '/^(' . $w . '$)/iu';
This is all because the UTF-8 character classes do not map neatly onto \b, as far as I can tell: \b also matches at the start and end of the line, which the UTF-8 classes do not.
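
As a possible alternative to keeping four separate patterns, the same character classes can be placed in negative lookarounds, which also succeed at the start and end of the text. This is only a sketch under stated assumptions: make_highlight_pattern and highlight are hypothetical names, the <b> tags are a stand-in for whatever markup is actually used, and this is not the code that went into LDM.

```php
<?php
// Hypothetical sketch, not LDM code: one pattern instead of four.
// "(?<![^SEP])" reads "not preceded by a non-separator", which is true after
// a separator and also at the very start; "(?![^SEP])" mirrors it at the end.
function make_highlight_pattern(string $word): string
{
    $sep = '\p{C}\p{P}\p{Z}'; // control, punctuation and separator characters
    return '/(?<![^' . $sep . '])' . preg_quote($word, '/') . '(?![^' . $sep . '])/iu';
}

function highlight(string $text, string $word): string
{
    $result = preg_replace(make_highlight_pattern($word), '<b>$0</b>', $text);

    // preg_replace() returns null on error (for example invalid UTF-8 input).
    return $result === null ? $text : $result;
}

echo highlight('happily merrily sadly happilymerrilysadly', 'merrily'), "\n";
echo highlight('merrily', 'merrily'), "\n";
echo highlight('Er ist groß, nicht übergroß.', 'groß'), "\n";
```

Because the lookarounds consume nothing, only the word itself ends up inside the highlight, not the neighbouring space or punctuation. As discussed in this thread, this still does not help for Chinese, where words are not delimited at all.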

pooslokka 08-30-2008 08:57 AM

Quote:

Originally Posted by AndrewD (Post 1610459)
No, you just put in a number. As the admin page says:

Size in pixels (along larger dimension) of thumbnail image shown within linkbit. 0: No thumbnails is displayed and data entry forms do not offer image fields.

The information about picking up current value of Thumbnail Size, as set in the vBulletin Admin Control Panel, is not working on every system, for a reason that I do not yet understand.

Done, thumbs are generated nicely :up:

derfelix 08-30-2008 09:31 AM

Quote:

Originally Posted by AndrewD (Post 1610460)
maybe he will spot this post and comment.

Looking forward to that...
Quote:

Originally Posted by AndrewD (Post 1610460)
yes, of course, there needs to be a fourth line:
Code:

        $find[] = '/^(' . $w . '$)/iu';
This is all because the UTF-8 character classes do not map neatly onto \b, as far as I can tell: \b also matches at the start and end of the line, which the UTF-8 classes do not.

Works like a charm.

Felix

vbboarder 08-30-2008 02:32 PM

Yeeaahh - it's LDM's 1000th post in this thread alone!

Congrats Andrew on a successful mod & community, and thanks for all your help!


All times are GMT.