Posted to solr-user@lucene.apache.org by jame vaalet <ja...@gmail.com> on 2011/10/24 13:41:49 UTC

indexing key value pair into lucene solr index

hi,
in my use case i have a list of key value pairs in each document object. if i
index them as separate index fields then in the result doc object i will get
two arrays corresponding to my keys and values. The problem i face here is
that there won't be any mapping between those keys and values.

do we have any easy way to index these data in solr ? thanks in advance ...

-- 

-JAME

Re: indexing key value pair into lucene solr index

Posted by Ken Krugler <kk...@transpac.com>.
As Karsten said, providing more detail re what you're actually trying to do usually makes for better and more helpful/accurate answers.

But I'm guessing you only want to search on the key, not the value, right?

If so, then:

1. Create a multi-value field with a custom type, indexed, stored.
2. During indexing, add entries as <key><tab><value>
3. In the custom type, set the analyzer to strip off the <tab><value> so you only index the key. E.g.

    <fieldType name="key_value" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true" omitTermFreqAndPositions="true" omitNorms="true">
      <analyzer type="index">
        <!-- Get rid of <tab><value> text at the end of each string.
             Note: \t\d+$ only matches numeric values; for arbitrary values use \t.*$ -->
        <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="\t\d+$" replacement="" />
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
      </analyzer>
    </fieldType>
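
The three steps above can be sketched from the client side as follows. This is a minimal illustration in Python, not part of the configuration itself: the helper names (`encode_pairs`, `indexed_text`, `decode_pairs`) are hypothetical, and `indexed_text` only imitates with a regular expression what the charFilter would do before tokenizing, so it assumes numeric values just as the `\t\d+$` pattern does.

```python
import re

# Same pattern as the charFilter above: a tab followed by digits at the end.
STRIP_VALUE = re.compile(r"\t\d+$")


def encode_pairs(pairs):
    """Build the entries for the multi-valued field, one "key<TAB>value" per pair."""
    entries = []
    for key, value in pairs:
        if "\t" in key:
            raise ValueError("keys must not contain a tab: %r" % key)
        entries.append("%s\t%s" % (key, value))
    return entries


def indexed_text(entry):
    """Imitate the index-time charFilter: only the key is left for tokenizing."""
    return STRIP_VALUE.sub("", entry)


def decode_pairs(stored_entries):
    """Recover (key, value) pairs from the stored field values of a result doc."""
    return [tuple(entry.split("\t", 1)) for entry in stored_entries]


pairs = [("color", "42"), ("max speed", "120")]
entries = encode_pairs(pairs)
print(entries)
print([indexed_text(e) for e in entries])
print(decode_pairs(entries))
```

Because the field is stored, each returned value still carries both halves of the pair, so the mapping between key and value survives even though only the key is searchable.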

-- Ken

--------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
custom big data solutions & training
Hadoop, Cascading, Mahout & Solr




Re: indexing key value pair into lucene solr index

Posted by ka...@gmx.de.
Hi Jame,

On preserving order in index fields:

If you don't need phrase queries on the key or the value, the order is simply the token "position".
If you do use phrase queries but no value has more than 50 tokens, you could still use positions and start each pair at position 100, 200, 300 ...
Otherwise you could use payloads.
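
The position idea can be illustrated with a little arithmetic. The sketch below is plain Python, not Lucene code; the function names, the whitespace tokenizing and the sample data are assumptions for illustration. Each pair starts at a multiple of 100, so a token's position divided by 100 identifies the pair it belongs to, provided no key or value exceeds the 50-token limit mentioned above.

```python
GAP = 100         # each pair starts at 100, 200, 300 ...
MAX_TOKENS = 50   # limit on tokens per key or value, as stated above


def assign_positions(texts, gap=GAP, max_tokens=MAX_TOKENS):
    """Return (token, position) tuples; entry i starts at position (i + 1) * gap."""
    positioned = []
    for index, text in enumerate(texts):
        tokens = text.split()
        if len(tokens) > max_tokens:
            raise ValueError("entry %d has more than %d tokens" % (index, max_tokens))
        start = (index + 1) * gap
        for offset, token in enumerate(tokens):
            positioned.append((token, start + offset))
    return positioned


def pair_index(position, gap=GAP):
    """Map a token position back to the index of the pair it came from."""
    return position // gap - 1


keys = ["color", "max speed"]
values = ["dark red", "120 km per hour"]

key_positions = assign_positions(keys)
value_positions = assign_positions(values)
print(key_positions)
print(value_positions)

# Tokens from the two fields that share a pair index belong together.
speed_position = dict(key_positions)["speed"]
matching = [tok for tok, pos in value_positions
            if pair_index(pos) == pair_index(speed_position)]
print(matching)
```

This only shows how positions could carry the pairing; actually querying across the two fields would still need custom query code.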

Imho there is no standard way to connect the positions of two fields.
You have to write your own Query.
My tip:
 Take org.apache.lucene.search.spans.TermSpans as a starting point and use the queryparser module.

btw:
Normally there is a standard solution in Lucene for each problem.
So please tell us more about your use case and somebody will have an answer that doesn't require programming it on your own.

Best regards
  Karsten




Re: indexing key value pair into lucene solr index

Posted by jame vaalet <ja...@gmail.com>.
thanks karsten.
can we preserve order within an index field ? if yes, i can index them
separately and map them using their order.


-- 

-JAME

Re: indexing key value pair into lucene solr index

Posted by ka...@gmx.de.
Hi Jame,

you can
 - generate one token for each pair (key, value) --> key_value
 - insert a gap between each pair and use phrase queries
 - use key as field-name (if you have a restricted set of keys)
 - wait for joins in Solr 4.0 (http://wiki.apache.org/solr/Join)
 - use position or payloads to connect key and value
 - tell the forum your exact use-case with examples
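
Two of the options above, "one token for each pair" and "key as field-name", can be sketched as document-building helpers. This is a hedged Python illustration only: the function names, the `attr_` prefix and the `pairs` field are hypothetical, and the `_s` suffix assumes the schema defines a `*_s` dynamic string field.

```python
import re


def to_field_name(key, suffix="_s"):
    """Turn a key into a field name; the "_s" suffix assumes a "*_s" dynamic field."""
    cleaned = re.sub(r"[^A-Za-z0-9]+", "_", key.strip()).strip("_").lower()
    if not cleaned:
        raise ValueError("key has no usable characters: %r" % key)
    return "attr_" + cleaned + suffix


def doc_with_key_fields(doc_id, pairs):
    """Option 'key as field-name': one field per key, holding that key's value."""
    doc = {"id": doc_id}
    for key, value in pairs:
        doc[to_field_name(key)] = value
    return doc


def doc_with_pair_tokens(doc_id, pairs, field="pairs"):
    """Option 'one token per pair': a multi-valued field of "key_value" tokens."""
    return {"id": doc_id, field: ["%s_%s" % (key, value) for key, value in pairs]}


pairs = [("Color", "red"), ("Max Speed", "120")]
print(doc_with_key_fields("doc1", pairs))
print(doc_with_pair_tokens("doc1", pairs))
```

The first form keeps key and value together by construction but multiplies the number of fields, which is why it suits a restricted set of keys.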

Best regards
  Karsten


RE: indexing key value pair into lucene solr index

Posted by "Jaeger, Jay - DOT" <Ja...@dot.wi.gov>.
Maybe put them in a single string field (or any other field type that is not analyzed -- certainly not text) using some character separator that will connect them, but won't confuse the Solr query parser?

So maybe you start out with key value pairs of

Key1 value1
Key2 value2
Key3 value3

Preprocess them for indexing, and then index (and search) for them as, for example, 

Key1$value1
Key2$value2
Key3$value3

(You could also store their individual values in a separate field, of course).
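
The preprocessing step might look roughly like the following Python sketch. The function names and the field name `kv` are hypothetical; the set of special characters is a deliberately conservative list of the characters used by the Lucene query syntax (`/` only matters in later versions), and `$` is not among them.

```python
# Conservative set of characters with special meaning in the Lucene query syntax.
QUERY_SPECIAL = set('+-&|!(){}[]^"~*?:\\/')
SEPARATOR = "$"


def join_pair(key, value, sep=SEPARATOR):
    """Join key and value into one untokenized term such as Key1$value1."""
    if sep in QUERY_SPECIAL or sep.isspace():
        raise ValueError("separator %r would confuse the query parser" % sep)
    if sep in key:
        raise ValueError("key %r already contains the separator" % key)
    return key + sep + value


def split_pair(term, sep=SEPARATOR):
    """Split a stored term back into (key, value) at the first separator."""
    key, value = term.split(sep, 1)
    return key, value


def exact_query(field, key, value):
    """Build a quoted query for one pair, escaping backslashes and quotes."""
    term = join_pair(key, value).replace("\\", "\\\\").replace('"', '\\"')
    return '%s:"%s"' % (field, term)


pairs = [("Key1", "value1"), ("Key2", "value2"), ("Key3", "value3")]
terms = [join_pair(k, v) for k, v in pairs]
print(terms)
print([split_pair(t) for t in terms])
print(exact_query("kv", "Key2", "value2"))
```

Splitting at the first separator means values may contain `$` while keys may not, which is why `join_pair` rejects such keys.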

JRJ
