Posted to user@lucy.apache.org by Kieron Taylor <kt...@ebi.ac.uk> on 2013/10/01 11:40:36 UTC

Re: [lucy-user] Documents gone AWOL

On 30/09/2013 17:41, Peter Karman wrote:
> On 9/30/13 11:22 AM, Kieron Taylor wrote:
>
>> %%% Indexing %%%
>>
>> $lucy_indexer = Lucy::Index::Indexer->new(
>>              schema => $schema,
>>              index => $path,
>>              create => 1,
>> );
>>
>> #
>> while ($record = shift) {
>>
>>    %flattened_record = %{$record};
>>    $flattened_record{accessions} = join ' ',@accessions;
>>    # Array of values turned into whitespaced list.
>>    $lucy_indexer->add_doc(
>>            \%flattened_record
>>    );
>>
>> }
>>
>> # Commit is called every ~100k records, before spinning up another indexer
>> $lucy_indexer->commit;
>
>
> I assume you are not passing the 'create => 1' param for each
> $lucy_indexer.

Your comment suggests this would be terminal. I'll make certain I've not 
made a blunder.
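
A minimal sketch of the batching pattern in question, assuming one indexer
per batch. As far as I can tell from the Lucy::Index::Indexer documentation,
'create' only creates a missing index directory, and 'truncate' is the option
that discards existing data, but that reading is an assumption to verify
rather than trust. The helper name index_in_batches and its arguments are
hypothetical, chosen for illustration only.

```perl
use strict;
use warnings;
use Lucy;

# Hypothetical helper: index @$records in batches of $batch_size,
# using a fresh Indexer for every batch.
sub index_in_batches {
    my ( $schema, $path, $records, $batch_size ) = @_;
    my @pending = @$records;    # copy, so the caller's array is untouched
    my $first   = 1;

    while ( my @batch = splice @pending, 0, $batch_size ) {
        my $indexer = Lucy::Index::Indexer->new(
            schema => $schema,
            index  => $path,
            # Only the first indexer asks for the index to be created.
            # 'truncate' is never passed, so earlier batches are kept.
            ( $first ? ( create => 1 ) : () ),
        );
        $indexer->add_doc($_) for @batch;
        $indexer->commit;       # one commit per indexer
        $first = 0;
    }
    return;
}
```

Comparing the reader's doc_count with the number of records passed in is a
quick check that no batch has replaced an earlier one.
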
>>
>> %%% Querying %%%
>>
>> $query = 'accessions:UPI01';
>>
>> $searcher = Lucy::Search::IndexSearcher->new(
>>      index => $path,
>> );
>> $parser = Search::Query->parser(
>>    dialect => 'Lucy',
>>    fields  => $lucy_indexer->get_schema()->all_fields,
>> );
>>
>> $search = $parser->parse($query)->as_lucy_query;
>
>
> I would probably insert a debugging statement here to verify that the
> parser is doing what you think it is:
>
> $parsed_query = $parser->parse($query);
> printf("parsed_query:%s\n", $parsed_query);
> $lucy_query = $parsed_query->as_lucy_query;
> printf("lucy_query:%s\n", $lucy_query->dump);

Suggestion welcomed and assimilated.

> Instead of grep'ing the segment files, you might try seeing what Lucy
> reports via the API:
>
> https://metacpan.org/source/KARMAN/SWISH-Prog-Lucy-0.17/bin/lucyx-dump-terms

Ok. To be honest, I was only grasping at straws with grep, so I'm glad 
there's a more appropriate alternative.
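
For reference, a minimal sketch of listing the terms Lucy holds for one field
through the API instead of grepping segment files. This is not the linked
lucyx-dump-terms script, only a rough approximation of the idea; the function
name dump_terms is hypothetical, and the 'accessions' field in the usage note
is taken from the query above.

```perl
use strict;
use warnings;
use Lucy;

# Hypothetical helper: return a hash of term => document frequency for
# $field, summed over every segment of the index at $path.
sub dump_terms {
    my ( $path, $field ) = @_;
    my $reader = Lucy::Index::IndexReader->open( index => $path );
    my %freq;

    for my $seg_reader ( @{ $reader->seg_readers } ) {
        my $lex_reader = $seg_reader->obtain('Lucy::Index::LexiconReader');
        my $lexicon    = $lex_reader->lexicon( field => $field );
        next unless defined $lexicon;    # no terms for this field here

        while ( $lexicon->next ) {
            my $term = $lexicon->get_term;
            $freq{$term} += $lex_reader->doc_freq(
                field => $field,
                term  => $term,
            );
        }
    }
    return %freq;
}
```

Calling dump_terms( $path, 'accessions' ) and printing the sorted keys shows
whether a term such as UPI01 exists as a term in its own right, or only as
part of a longer whitespace-joined value.
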

Thanks very much,

Kieron