Posted to user@lucy.apache.org by Serkan Mulayim <se...@gmail.com> on 2016/11/10 22:38:43 UTC

[lucy-user] Does multivalued field support exist in Lucy?

Hi guys,

I would like to understand whether multivalued fields can be defined in
Lucy. To be more specific, can I put multiple StringType objects into a
single field?

I could not find any solution for this, and I do not want to use
RegexTokenizer for a few reasons:
1- I do not want any complexity or limitations in the way I index
tokens.
2- I would like to have a static library that does not depend on PCRE. (I
know this is a second question, but do you know which version of PCRE is
supported? I have 8.39, and I am receiving errors.)

Thanks,
Serkan

Re: [lucy-user] Does multivalued field support exist in Lucy?

Posted by Serkan Mulayim <se...@gmail.com>.
Thank you Nick, I will give it a try. It sounds really promising.



On Fri, Nov 11, 2016 at 11:03 AM, Nick Wellnhofer <we...@aevum.de>
wrote:

> On 11/11/2016 19:46, Serkan Mulayim wrote:
>
>> I was referring to C library not Perl, sorry for not putting it on my
>> question.
>>
>> Peter, regarding the multivalue fields, it seems like I, for sure, need to
>> create Whitespace tokenizer based on RegexTokenizer, can you or someone
>> please confirm? This would create the dependency for the PCRE.  In order to
>> make it I will need PCRE to be built as static library and linked with lucy
>> and my code then.
>>
>
> You can write a custom Analyzer that simply splits on a predefined
> character. Have a look at this thread for how to do this in C:
>
> https://lists.apache.org/thread.html/ea5b19eb7a8f688c85c8268b0119282936eb1d097b3b3306d4b909de@1427747314@%3Cdev.lucy.apache.org%3E
>
> Or here with proper indentation:
>
> http://mail-archives.apache.org/mod_mbox/lucy-dev/201503.mbox/%3cCAAS6=7hPSMNA=RrT63q1YPvTS=2Jphzfxu5ArXXS4fEgUGLLDA@mail.gmail.com%3e
>
> Nick
>
>

Re: [lucy-user] Does multivalued field support exist in Lucy?

Posted by Nick Wellnhofer <we...@aevum.de>.
On 11/11/2016 19:46, Serkan Mulayim wrote:
> I was referring to C library not Perl, sorry for not putting it on my
> question.
>
> Peter, regarding the multivalue fields, it seems like I, for sure, need to
> create Whitespace tokenizer based on RegexTokenizer, can you or someone
> please confirm? This would create the dependency for the PCRE.  In order to
> make it I will need PCRE to be built as static library and linked with lucy
> and my code then.

You can write a custom Analyzer that simply splits on a predefined character. 
Have a look at this thread for how to do this in C:

https://lists.apache.org/thread.html/ea5b19eb7a8f688c85c8268b0119282936eb1d097b3b3306d4b909de@1427747314@%3Cdev.lucy.apache.org%3E

Or here with proper indentation:

http://mail-archives.apache.org/mod_mbox/lucy-dev/201503.mbox/%3cCAAS6=7hPSMNA=RrT63q1YPvTS=2Jphzfxu5ArXXS4fEgUGLLDA@mail.gmail.com%3e

Nick
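
The splitting step that such a custom Analyzer would perform can be sketched
in plain C. This is only an illustration of the idea of splitting on a
predefined character: it deliberately uses no Lucy API, and the names
TokenSpan and split_on_char are hypothetical, not taken from Lucy or from the
linked thread. Skipping empty pieces is an assumption made here, not something
the thread specifies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token span: byte offsets into the source text. */
typedef struct {
    size_t start;  /* offset of the first byte of the token */
    size_t end;    /* offset one past the last byte of the token */
} TokenSpan;

/*
 * Split text[0..len) on the separator byte `sep`.
 * Writes at most max_spans spans into `spans` and returns the total number
 * of non-empty tokens found (which may exceed max_spans).
 * Empty pieces, such as those between two adjacent separators, are skipped.
 */
size_t split_on_char(const char *text, size_t len, char sep,
                     TokenSpan *spans, size_t max_spans) {
    assert(text != NULL || len == 0);
    size_t count = 0;
    size_t start = 0;
    for (size_t i = 0; i <= len; i++) {
        /* A token ends at a separator or at the end of the input. */
        if (i == len || text[i] == sep) {
            if (i > start) {
                if (count < max_spans) {
                    spans[count].start = start;
                    spans[count].end = i;
                }
                count++;
            }
            start = i + 1;
        }
    }
    return count;
}
```

In a real Analyzer, each span would presumably be turned into a token carrying
its start and end offsets; how that is done in Lucy's C bindings is covered in
the thread linked above rather than here.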


Re: [lucy-user] Does multivalued field support exist in Lucy?

Posted by Serkan Mulayim <se...@gmail.com>.
Thanks Peter and Nick for your quick responses.

I was referring to the C library, not Perl; sorry for not mentioning it in
my question.

Peter, regarding the multivalued fields, it seems that I will definitely need
to create a whitespace tokenizer based on RegexTokenizer. Can you or someone
else please confirm? This would create a dependency on PCRE: I would then
need to build PCRE as a static library and link it with Lucy and my code.

Nick, I probably messed something up in my environment. I ran the tests with
Lucy's default Makefile, which only creates a dynamic library, and
RegexTokenizer passed the test.

Thanks again guys,
Serkan

On Fri, Nov 11, 2016 at 4:28 AM, Nick Wellnhofer <we...@aevum.de>
wrote:

> On 10/11/2016 23:38, Serkan Mulayim wrote:
>
>> 2- I would like to have a static library which would not depend on PCRE. (I
>> know this is a second question but do you know which version of PCRE is
>> supported. I have 8.39, and I am receiving errors.)
>>
>
> This version of PCRE should be supported. What errors are you receiving?
>
> Nick
>
>

Re: [lucy-user] Does multivalued field support exist in Lucy?

Posted by Nick Wellnhofer <we...@aevum.de>.
On 10/11/2016 23:38, Serkan Mulayim wrote:
> 2- I would like to have a static library which would not depend on PCRE. (I
> know this is a second question but do you know which version of PCRE is
> supported. I have 8.39, and I am receiving errors.)

This version of PCRE should be supported. What errors are you receiving?

Nick


Re: [lucy-user] Does multivalued field support exist in Lucy?

Posted by Peter Karman <pe...@peknet.com>.
Serkan Mulayim wrote on 11/10/16, 4:38 PM:
> Hi guys,
>
> I would like to understand if multivalued fields can be defined in the
> Lucy? To be more specific can I put multiple StringType objects to a single
> field?
>

There is no native support in Lucy for multi-valued fields. You would need to
concatenate multiple strings together to store them under a single field.

See http://markmail.org/message/r2dyzgj6pcewaaq4

Dezi supports multi-valued fields through concatenation using the ASCII byte
\003 (ETX, end of text).

See https://metacpan.org/source/KARMAN/Dezi-App-0.014/lib/Dezi/Lucy/Indexer.pm#L110

-- 
Peter Karman  .  http://peknet.com/  .  peter@peknet.com
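
The concatenation approach described above can be sketched in plain C as
follows. This is a minimal illustration, not Dezi's or Lucy's code: the names
VALUE_SEP and join_values are hypothetical, and the sketch assumes that no
individual value contains the \003 byte itself. The joined string is what
would be stored under the single field.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Separator between values: ASCII ETX, as mentioned in the thread. */
#define VALUE_SEP '\003'

/*
 * Join n_values strings into one newly allocated, NUL-terminated string,
 * with VALUE_SEP between consecutive values.
 * Returns NULL if allocation fails; the caller must free() the result.
 * Assumption: no value contains VALUE_SEP.
 */
char *join_values(const char *const *values, size_t n_values) {
    size_t total = 1;  /* room for the terminating NUL */
    for (size_t i = 0; i < n_values; i++) {
        assert(values[i] != NULL);
        assert(strchr(values[i], VALUE_SEP) == NULL);
        total += strlen(values[i]);
        if (i + 1 < n_values) {
            total += 1;  /* room for one separator */
        }
    }

    char *out = malloc(total);
    if (out == NULL) {
        return NULL;
    }

    char *p = out;
    for (size_t i = 0; i < n_values; i++) {
        size_t len = strlen(values[i]);
        memcpy(p, values[i], len);
        p += len;
        if (i + 1 < n_values) {
            *p++ = VALUE_SEP;
        }
    }
    *p = '\0';
    return out;
}
```

Reading the values back is the reverse operation: split the stored string on
the same separator byte.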