Posted to solr-user@lucene.apache.org by Rishi Easwaran <ri...@aol.com> on 2013/05/28 17:12:04 UTC

Solr Composite Unique key from existing fields in schema

Hi All,

Historically we have used a single field in our schema as a uniqueKey.

  <field name="docid"        type="string"   indexed="true"  stored="true"  multiValued="false" required="true"/>
  <field name="userid"      type="string"   indexed="true"  stored="true"  multiValued="false" required="true"/> 
<uniqueKey>docid</uniqueKey>

We want to change this to a composite key, something like <uniqueKey>userid-docid</uniqueKey>.
I know I can generate the composite key at document insert time, using custom code that populates a new field, but I wanted to know whether Solr has a built-in mechanism for this. That would save us from creating and storing an extra field.

Thanks,

Rishi.





Re: Solr Composite Unique key from existing fields in schema

Posted by Jan Høydahl <ja...@cominvent.com>.
The cleanest approach is to do this from the outside, i.e. build the key before the document is sent to Solr.

Alternatively, populating your uniqueKey in a custom UpdateProcessor may work. You can try it.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
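
A minimal sketch of the "from the outside" approach, assuming a Python indexing client. The helper name with_composite_key, the destination field name compositeId and the "-" delimiter are illustrative choices, not anything Solr prescribes. The point is only that the key is assembled from a fixed list of field names, so the order in which fields arrive does not matter.

```python
import json


def with_composite_key(doc, key_fields=("userid", "docid"),
                       dest="compositeId", delimiter="-"):
    """Return a copy of doc with dest set to the key fields joined in a fixed order."""
    missing = [name for name in key_fields if name not in doc]
    if missing:
        raise ValueError(f"missing key fields: {missing}")
    keyed = dict(doc)
    # Iterate over key_fields (fixed order), never over the document itself.
    keyed[dest] = delimiter.join(str(doc[name]) for name in key_fields)
    return keyed


doc = {"docid": "1", "userid": "12345", "title": "Hello World"}
# This JSON array is the kind of body you would POST to the update handler.
print(json.dumps([with_composite_key(doc)]))
```

If the individual values can themselves contain the delimiter, two different (userid, docid) pairs could join to the same string, so pick a delimiter that cannot occur in either field.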



Re: Solr Composite Unique key from existing fields in schema

Posted by Rishi Easwaran <ri...@aol.com>.
Thanks Jack, looks like that will do the trick for me. I will try it out.

-----Original Message-----
From: Jack Krupansky <ja...@basetechnology.com>
To: solr-user <so...@lucene.apache.org>
Sent: Tue, May 28, 2013 12:07 pm
Subject: Re: Solr Composite Unique key from existing fields in schema


You can do this by combining the builtin update processors.

Add this to your solrconfig:

<updateRequestProcessorChain name="composite-id">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">docid_s</str>
    <str name="source">userid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter">--</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Add documents such as:

curl "http://localhost:8983/solr/update?commit=true&update.chain=composite-id" \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields in whatever order you want using "source" in the clone 
update processor, and pick your composite key field name as well. And set 
the delimiter string as well in the concat update processor.

I managed to reverse the field order from what you requested (userid, 
docid).

I used the standard Solr example schema, so I used dynamic fields for the 
two ids, but use your own field names.

-- Jack Krupansky

 

Re: Solr Composite Unique key from existing fields in schema

Posted by Jack Krupansky <ja...@basetechnology.com>.
Great. I also verified that a single CloneFieldUpdateProcessorFactory with
multiple source field names cannot guarantee the field order: the
underlying code iterates over the input values, checks each one against
the field selector for membership, and immediately adds matches to the
output, so changing the input order changes the output order. The field
names are also stored in a HashSet, which would tend to scramble their
order anyway.

-- Jack Krupansky
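
The behaviour described above can be reproduced with a small simulation. This is a sketch in Python, not Solr's actual Java code: the function name composite_id_single_clone and the list-of-pairs document are made up for illustration, and a Python set stands in for the HashSet of source names. The values are the ones from Rishi's report.

```python
def composite_id_single_clone(doc_fields, sources, delimiter="-"):
    """Mimic one clone step with several sources, followed by a concat.

    doc_fields is a list of (name, value) pairs in the order they arrived.
    """
    selector = set(sources)  # used for membership only, carries no order
    # Walk the INPUT in arrival order and keep whatever the selector accepts.
    values = [value for name, value in doc_fields if name in selector]
    return delimiter.join(values)


sources = ["userid", "docid"]  # configured order, ignored by the loop above
first = composite_id_single_clone([("docid", "1"), ("userid", "12345")], sources)
second = composite_id_single_clone([("userid", "12345"), ("docid", "1")], sources)
print(first)   # prints 1-12345
print(second)  # prints 12345-1
```

The same logical document yields two different keys, which is why it shows up as duplicates in the index.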

Re: Solr Composite Unique key from existing fields in schema

Posted by Rishi Easwaran <ri...@aol.com>.
Thanks Jack, that fixed it and guarantees the order.

As far as I can tell, SolrCloud 4.2.1 needs a uniqueKey defined in its schema, or I get an exception:
SolrCore Initialization Failures
 * testCloud2_shard1_replica1: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: QueryElevationComponent requires the schema to have a uniqueKeyField.

Now that I have an autogenerated composite ID, it has to become part of my schema as the uniqueKey for SolrCloud to work.
  <field name="docid"        type="string"   indexed="true"  stored="true" multiValued="false" required="true"/>
  <field name="userid"      type="string"   indexed="true"  stored="true" multiValued="false" required="true"/>
  <field name="compositeId"      type="string"   indexed="true"  stored="true" multiValued="false" required="true"/>
<uniqueKey>compositeId</uniqueKey>

Is there a way to avoid defining the compositeId field in my schema.xml? I would like to avoid the overhead of storing this field in my index.

Thanks,

Rishi.


 

 

 


Re: Solr Composite Unique key from existing fields in schema

Posted by Jack Krupansky <ja...@basetechnology.com>.
The TL;DR response: Try this:

<updateRequestProcessorChain name="composite-id">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">userid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">docid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter">--</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

That will assure that the userid gets processed before the docid.

I'll have to review the contract for CloneFieldUpdateProcessorFactory to see 
what is or ain't guaranteed when there are multiple input fields - whether 
this is a bug or a feature or simply undefined.

-- Jack Krupansky
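
To see why two separate clone steps pin the order, here is a toy model in Python. It is a sketch under stated assumptions, not the Solr implementation: clone_one, concat and composite_chain are hypothetical helpers that each mimic one processor in the chain above, using the same field names and "--" delimiter.

```python
def clone_one(doc, source, dest):
    """Toy stand-in for a clone step with a single source field."""
    out = dict(doc)
    if source in doc:
        # Append to whatever earlier steps already put in dest.
        out[dest] = list(doc.get(dest, [])) + [doc[source]]
    return out


def concat(doc, field, delimiter):
    """Toy stand-in for the concat step: join a multi-valued field."""
    out = dict(doc)
    if isinstance(out.get(field), list):
        out[field] = delimiter.join(out[field])
    return out


def composite_chain(doc):
    # The order is fixed by the order of the steps, not by the input.
    doc = clone_one(doc, "userid_s", "id")
    doc = clone_one(doc, "docid_s", "id")
    return concat(doc, "id", "--")


print(composite_chain({"docid_s": "doc-1", "userid_s": "user-1"})["id"])
print(composite_chain({"userid_s": "user-1", "docid_s": "doc-1"})["id"])
```

Both calls print user-1--doc-1, because each step handles exactly one source field and the steps always run in the same sequence.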


Re: Solr Composite Unique key from existing fields in schema

Posted by Rishi Easwaran <ri...@aol.com>.
I thought the same, but that doesn't seem to be the case.


 

 

 


Re: Solr Composite Unique key from existing fields in schema

Posted by Jack Krupansky <ja...@basetechnology.com>.
The order in the ID should be purely dependent on the order of the field 
names in the processor configuration:

<str name="source">docid_s</str>
<str name="source">userid_s</str>

-- Jack Krupansky


Re: Solr Composite Unique key from existing fields in schema

Posted by Rishi Easwaran <ri...@aol.com>.
Jack,

Not sure if this is the correct behaviour.
I set up the updateRequestProcessorChain as you suggested, but it looks like the compositeId that is generated depends on the input order.

For example:
If my input comes in as
<field name="docid">1</field>
<field name="userid">12345</field>

I get the following compositeId: 1-12345.

If I reverse the input

<field name="userid">12345</field>
<field name="docid">1</field>

I get the following compositeId: 12345-1.

In this case the compositeId is not stable for a given document, and I am getting duplicates.

Thanks,

Rishi.




Re: Solr Composite Unique key from existing fields in schema

Posted by Jack Krupansky <ja...@basetechnology.com>.
You can do this by combining the built-in update processors.

Add this to your solrconfig.xml:

<updateRequestProcessorChain name="composite-id">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">docid_s</str>
    <str name="source">userid_s</str>
    <str name="dest">id</str>
  </processor>
  <processor class="solr.ConcatFieldUpdateProcessorFactory">
    <str name="fieldName">id</str>
    <str name="delimiter">--</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>

Add documents such as:

curl "http://localhost:8983/solr/update?commit=true&update.chain=composite-id" \
-H 'Content-type:application/json' -d '
[{"title": "Hello World",
  "docid_s": "doc-1",
  "userid_s": "user-1",
  "comments_ss": ["Easy", "Fast"]}]'

And get results like:

"title":["Hello World"],
"docid_s":"doc-1",
"userid_s":"user-1",
"comments_ss":["Easy",
  "Fast"],
"id":"doc-1--user-1",

Add as many fields as you want, in whatever order you want, as "source" 
entries in the clone update processor, and set "dest" to the name you want 
for the composite key field. The delimiter string is set in the concat 
update processor.
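
To make the two steps concrete, here is a rough Python model of what the chain does to a document: the clone step copies the source values into the destination field as a list, and the concat step joins that list with the delimiter. This is an illustration only, not Solr's code, and the function names are made up. It assumes the values are picked up in the order the fields appear in the incoming document, which is the behaviour reported elsewhere in this thread; if your Solr version follows the "source" order instead, adjust the model accordingly.

```python
def clone_fields(doc, sources, dest):
    """Model of the clone step: gather source values into dest as a list."""
    out = dict(doc)
    # Assumption: values are collected in the order the fields appear in
    # the incoming document, not in the order of the "source" entries.
    out[dest] = [value for name, value in doc.items() if name in sources]
    return out


def concat_field(doc, field_name, delimiter):
    """Model of the concat step: join a multi-valued field into one string."""
    out = dict(doc)
    out[field_name] = delimiter.join(out[field_name])
    return out


doc = {"title": "Hello World", "docid_s": "doc-1", "userid_s": "user-1"}
result = concat_field(clone_fields(doc, ["docid_s", "userid_s"], "id"), "id", "--")
print(result["id"])  # doc-1--user-1
```

Under that assumption, the same document with userid_s listed first would yield "user-1--doc-1", so clients need to send the key fields in a consistent order.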

Note that my example puts docid before userid, which is the reverse of the 
order you asked for (userid, docid).

I used the standard Solr example schema, so I used dynamic fields for the 
two ids, but use your own field names.

-- Jack Krupansky
