Posted to xindice-users@xml.apache.org by Alexander DEJANOVSKI <ad...@hotmail.com> on 2004/02/26 06:41:09 UTC
Re: [work] Re: After creating index nothing is found anymore

I've had the same problem...
I actually solved it by setting the collection's compressed attribute to true.
So uncompressed collections seem to have issues with index management.
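
For reference, a minimal sketch of what that change looks like in the collection configuration quoted further down. Only the compressed attribute differs from the original snippet; the class name CompressedCollectionConfig and the collection name "demo" are placeholders for illustration, and the parse step just checks that the string is well-formed XML using the JDK parser. Whether this cures the index problem in a given Xindice build is not something the sketch can show.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class CompressedCollectionConfig {

    // Same configuration as in the quoted message, but with compressed="true".
    static String build(String collectionName) {
        return "<collection compressed=\"true\" name=\"" + collectionName + "\">"
             + "<filer class=\"org.apache.xindice.core.filer.BTreeFiler\" gzip=\"false\"/>"
             + "<indexes>"
             + "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\""
             + " name=\"internalLink_attr_idx\" pattern=\"link@viewid\" type=\"String\"/>"
             + "</indexes>"
             + "</collection>";
    }

    // Parses the configuration with the JDK parser to confirm it is well-formed.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Collection configuration is not well-formed", e);
        }
    }

    public static void main(String[] args) {
        Element root = parse(build("demo")).getDocumentElement();
        System.out.println("compressed=" + root.getAttribute("compressed"));
    }
}
```

The resulting DOM document would then be handed to collman.createCollection(...) exactly as in the original code.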

Bye


At 16:55 23/02/2004, you wrote:
>Re: your final question, there is currently no way
>to get just the doc ids.
>
>Best you can do to ensure efficiency is
>to select a small fragment of the resource
>thus minimizing the excess bytes transmitted.
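
A small sketch of the "select a small fragment" idea, using only the JDK's XPath engine on a made-up sample document (the XML content and the class name SmallFragmentQuery are assumptions for illustration). It compares selecting the whole document element with selecting just its src attribute. Against Xindice, the same XPath string would presumably be passed to the XPath query service and the id read from each returned resource with getDocumentId(), as mentioned in the quoted message; that part is not shown here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SmallFragmentQuery {

    // Evaluates an XPath expression against an XML string and returns the matching nodes.
    static NodeList select(String xml, String xpath) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
        } catch (Exception e) {
            throw new IllegalStateException("Query failed: " + xpath, e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample resource; real documents would be much larger.
        String xml = "<page><document src=\"170\"><body>long content</body></document></page>";

        // Whole element: the result carries the element and everything below it.
        NodeList whole = select(xml, "//document[@src='170']");
        // Attribute only: the result carries just the matched attribute value.
        NodeList small = select(xml, "//document[@src='170']/@src");

        System.out.println("element text length: " + whole.item(0).getTextContent().length());
        System.out.println("attribute value: " + small.item(0).getNodeValue());
    }
}
```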
>
>WRT the index creation problem,
>you could help the project if you can
>write a unit test demonstrating that
>index creation during collection creation
>fails, and filing a bug report in
>bugzilla including your unit test.
>
>Thanks!
>
>-Terry
>
>Sascha Kulawik wrote:
>
>>>Have you tried running the unit tests?
>>>
>>
>>No, not yet.
>>The http://marc.theaimsgroup.com/?l=xindice-users&m=107426829426034&w=2
>>post solved my problem, but I don't know why - so I no longer create the
>>index during the creation of the collection, but afterwards with the
>>function given there.
>>I've recreated the indexes with the patterns link@* and document@* - this is
>>much better. An XPath query now takes 30ms with about 1000 documents
>>- that's more than good enough for my use case.
>>Actually there is only one problem left - how can I speed up the queries
>>where I don't need the XPath result? Is there any way to get a
>>ResultSet back without any documents in it? In this case I only need the
>>document ids: res.getDocumentId();
>>
>>Thank you very much for your help,
>>
>>Sascha
>>
>>
>>
>>>The test IndexedSearchTest in
>>>java/tests/src/org/apache/xindice/integration/client/services
>>>includes a number of tests that test not only whether or not indexed 
>>>searching is working, but also test whether or not indexing speeds up 
>>>the query. One of the tests uses the following query:
>>>
>>>//phone[starts-with(@call, 'n')]
>>>
>>>That is very similar to the query used in:
>>>
>>> > If I'm doing a search like "//document[@src='170']", everything works
>>> > fine, except that it takes the same time as without an index.
>>>
>>>The IndexedSearchTest indexer for this case uses the pattern "*@call" to 
>>>speed up the //phone[starts-with(@call, 'n')] query. The pattern says 
>>>index all "call" Attributes regardless of what Element they belong to.
>>>
>>>Your indexer is defined with pattern "link@viewid". Since it does not 
>>>index ALL possible viewid Attributes (only the viewid Attribute of the 
>>>link Element) Xindice cannot use this index to search all occurrences of 
>>>viewid Attributes. Thus, you see no speedup. Try pattern "*@viewid" instead.
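
To make the pattern explanation above concrete, here is a toy model of which (element, attribute) pairs a pattern covers. It is only an illustration of the description in this thread, not Xindice's actual matching code; the class and method names are made up. It assumes a pattern without "@" refers to the element's own value rather than to any of its attributes, which would also explain why the pattern "document" gives no speedup for a query on @src.

```java
public class IndexPatternSketch {

    // Returns true if the pattern covers the given element/attribute pair.
    // Pass attribute == null to ask about the element's own value.
    static boolean covers(String pattern, String element, String attribute) {
        int at = pattern.indexOf('@');
        String elemPart = at < 0 ? pattern : pattern.substring(0, at);
        String attrPart = at < 0 ? null : pattern.substring(at + 1);

        boolean elemOk = elemPart.equals("*") || elemPart.equals(element);
        boolean attrOk;
        if (attrPart == null) {
            // No "@" in the pattern: assumed to cover element values only.
            attrOk = attribute == null;
        } else {
            attrOk = attribute != null
                    && (attrPart.equals("*") || attrPart.equals(attribute));
        }
        return elemOk && attrOk;
    }

    public static void main(String[] args) {
        System.out.println(covers("link@viewid", "link", "viewid"));     // true
        System.out.println(covers("link@viewid", "document", "viewid")); // false
        System.out.println(covers("*@viewid", "document", "viewid"));    // true
        System.out.println(covers("document", "document", "src"));       // false
        System.out.println(covers("document@*", "document", "src"));     // true
    }
}
```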
>>>
>>>I would expect to see the IndexedSearchTest fail if there is a problem.
>>>Otherwise, perhaps you have a corrupted index. Try removing it and 
>>>reindexing.
>>>
>>>-Terry
>>>
>>>Sascha Kulawik wrote:
>>>
>>>
>>>
>>>>Hello,
>>>>
>>>>I'm finally getting a headache from configuring Xindice. I'm using
>>>>Xindice 1.1b3 (I've also just tried a CVS checkout from this morning)
>>>>in JBoss with Jetty, deployed as an exploded war archive.
>>>>
>>>>I've created a collection with the following code snippet:
>>>>
>>>>---------------------------------------------------------------
>>>>String collectionConfig =
>>>>    "<collection compressed=\"false\" name=\"" + collectionName + "\">"
>>>>  + "<filer class=\"org.apache.xindice.core.filer.BTreeFiler\" gzip=\"false\"/>"
>>>>  + "<indexes>"
>>>>  + "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\""
>>>>  + " name=\"internalLink_attr_idx\" pattern=\"link@viewid\" type=\"String\"/>"
>>>>  + "<index class=\"org.apache.xindice.core.indexer.ValueIndexer\""
>>>>  + " name=\"document_attr_idx\" pattern=\"document\" type=\"String\"/>"
>>>>  + "</indexes>"
>>>>  + "</collection>";
>>>>col = DatabaseManager.getCollection(uri);
>>>>CollectionManager collman =
>>>>    (CollectionManager) col.getService("CollectionManager", "1.0");
>>>>try {
>>>>    collman.createCollection(collectionName,
>>>>        XercesHelper.string2Dom(collectionConfig));
>>>>} catch (Exception exe) {
>>>>    String errMsg = "Error during the converting of the Collection-String to XML-DOM";
>>>>    log.error(errMsg);
>>>>    throw new XMLDBException(ErrorCodes.VENDOR_ERROR, -1, errMsg, exe);
>>>>}
>>>>---------------------------------------------------------------
>>>>
>>>>If I search for "//document[@src='170']", everything works fine,
>>>>except that it takes the same time as without an index.
>>>>
>>>>If I search for "//link[@viewid='2045']", nothing happens: no result,
>>>>nothing. Without the index I get some results back. This XPath search
>>>>is very fast (80ms), but without any result it is obviously useless :)
>>>>The idx file of the first one is about 30kB in size, the second one is
>>>>6kB - this is the default, I think.
>>>>
>>>>For the first XPath query, all that matters is whether such a document
>>>>element exists in any XML document in the collection. I've seen on
>>>>MARC that this can be done faster, so that the result of the XPath
>>>>query is only the document itself or the id of the document. How is
>>>>this possible?
>>>
>>>
>>>>Thank you all very much,
>>>>
>>>>Sascha
>>>>
>>>>
>>>>
>>>
>>
>>
>