vb.org Archive


FractalizeR 05-04-2010 01:13 PM

Sphinx: WARNING: duplicate document ids found
 
The following is the output of cronjob /usr/local/sphinx/cron/delta.sh:

Code:

Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'ForumDelta'...
collected 43 docs, 0.0 MB
collected 1 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 0.0 Mhits, 100.0% done
total 43 docs, 5030 bytes
total 0.014 sec, 361480.03 bytes/sec, 3090.19 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'ThreadPostDelta'...
collected 2966 docs, 1.2 MB
collected 588 attr values
sorted 0.0 Mvalues, 100.0% done
sorted 0.1 Mhits, 84.8% done
WARNING: duplicate document ids found
total 2966 docs, 1159212 bytes
total 122.929 sec, 9429.94 bytes/sec, 24.13 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'DiscussionMessageDelta'...
collected 0 docs, 0.0 MB
collected 1 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.034 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'SocialGroupDelta'...
collected 0 docs, 0.0 MB
collected 1 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'VisitorMessageDelta'...
collected 0 docs, 0.0 MB
collected 1 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.014 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'BlogEntryDelta'...
collected 0 docs, 0.0 MB
collected 0 attr values
sorted 0.0 Mvalues, nan% done
total 0 docs, 0 bytes
total 0.046 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'BlogCommentDelta'...
collected 0 docs, 0.0 MB
collected 1 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.010 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).
Sphinx 0.9.8-id64-release (r1371)
Copyright (c) 2001-2008, Andrew Aksyonoff

using config file '/usr/local/sphinx/etc/vbulletin-sphinx.php'...
indexing index 'CMSArticlesDelta'...
collected 0 docs, 0.0 MB
collected 1 attr values
sorted 0.0 Mvalues, 100.0% done
total 0 docs, 0 bytes
total 0.011 sec, 0.00 bytes/sec, 0.00 docs/sec
rotating indices: succesfully sent SIGHUP to searchd (pid=11570).

Please look at ThreadPostDelta indexing:

The message "WARNING: duplicate document ids found" appears. Is that normal behavior for Sphinx? What is used as the document id?

sung 05-04-2010 02:41 PM

I got the warning as well (so glad it isn't just me), which I've reported in the vbulletin.com forums.

It can cause all sorts of nasty problems with Sphinx.

Quote:

There are a few different restrictions imposed on the source data which is going to be indexed by Sphinx, of which the single most important one is:

ALL DOCUMENT IDS MUST BE UNIQUE UNSIGNED NON-ZERO INTEGER NUMBERS (32-BIT OR 64-BIT, DEPENDING ON BUILD TIME SETTINGS).

If this requirement is not met, different bad things can happen. For instance, Sphinx can crash with an internal assertion while indexing; or produce strange results when searching due to conflicting IDs. Also, a 1000-pound gorilla might eventually come out of your display and start throwing barrels at you. You've been warned.
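The rule quoted above can be turned into a quick sanity check before indexing. This is only a minimal sketch, not anything Sphinx ships: the function name `check_document_ids` and the `bits` parameter are made up for illustration, and it assumes the ids from the source query have already been loaded into a Python list.

```python
from collections import Counter

def check_document_ids(ids, bits=64):
    """Return a list of human-readable problems found in `ids`.

    Hypothetical helper: checks the three conditions from the quote
    (unique, unsigned and non-zero, fits the build's id width).
    """
    max_id = (1 << bits) - 1
    problems = []
    for doc_id in ids:
        if not isinstance(doc_id, int) or doc_id <= 0:
            problems.append(f"{doc_id!r}: not a non-zero unsigned integer")
        elif doc_id > max_id:
            problems.append(f"{doc_id}: does not fit in {bits} bits")
    # Report every id that occurs more than once.
    for doc_id, n in Counter(ids).items():
        if n > 1:
            problems.append(f"{doc_id!r}: appears {n} times")
    return problems

if __name__ == "__main__":
    for line in check_document_ids([1, 2, 2, 0]):
        print(line)
```

An empty list means the ids satisfy the quoted requirement; pass `bits=32` for a build without id64.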

FractalizeR 05-04-2010 07:42 PM

The configuration file uses the following expression to build the so-called document ID, which MUST be unique:

Code:

SELECT (c.contenttypeid << 32) | (p.postid) AS id

For some reason the result is not unique. However, I don't see how that can happen unless the complete query really returns duplicate rows.
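To illustrate the point, here is a rough Python sketch of the same bit-packing and of how a repeated row turns into a repeated document ID. The sample rows and the helper names `make_doc_id` and `duplicate_doc_ids` are hypothetical; the real result set comes from the full query in the config, which is not shown in this thread.

```python
from collections import Counter

def make_doc_id(contenttypeid, postid):
    # Same expression as in the SELECT: content type in the high 32 bits,
    # post id in the low 32 bits.
    return (contenttypeid << 32) | postid

def duplicate_doc_ids(rows):
    """rows: iterable of (contenttypeid, postid) pairs.

    Returns {doc_id: count} for every id produced more than once.
    """
    counts = Counter(make_doc_id(c, p) for c, p in rows)
    return {doc_id: n for doc_id, n in counts.items() if n > 1}

if __name__ == "__main__":
    # Made-up result set: post 1001 comes back twice, as it would if a
    # JOIN in the query matched two rows for the same post.
    rows = [(1, 1000), (1, 1001), (1, 1001), (1, 1002)]
    print(duplicate_doc_ids(rows))
```

As long as each (contenttypeid, postid) pair is returned once and postid fits in 32 bits, the packed value is unique, so any id reported here points at a row the query returned more than once.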

graham_w 06-19-2010 10:58 PM

Did you ever sort this out? I'm noticing the same error.

Cheers

FractalizeR 06-20-2010 05:49 AM

No, but it looks like it doesn't affect search quality.

graham_w 06-20-2010 06:53 AM

Thanks for the reply. Yeah, I did find a thread saying something similar on the Sphinx website.

Cheers

JesterP 06-20-2010 04:35 PM

Quote:

Originally Posted by graham_w (Post 2056227)
Thanks for the reply - yeah I did find a thread saying similar on the sphinx website.

Cheers

I received this in my inbox this morning:

--->8---

### SAVE ORDERED IDS TO SEARCH CACHE ###;

MySQL Error : Duplicate entry '92f3f32f09b269797e91242ce55639a6-lastpost-DESC' for key 2
Error Number : 1062
Request Date : Sunday, June 20th 2010 @ 10:44:01 AM

---8<---
Everything is still running and I am not seeing anything bad happening. No errors since.

