Posted to solr-user@lucene.apache.org by Dotan Cohen <do...@gmail.com> on 2013/05/30 13:55:03 UTC

Removing a single value from a multiValue field

I have a Solr application with a multiValue field 'tags'. All fields
are indexed in this application. There exists a uniqueKey field 'id'
and a '_version_' field. This is running on Solr 4.x.

In order to add a tag, the application retrieves the full document,
creates a PHP array from the document structure, removes the
'_version_' field, and then adds the appropriate tag to the 'tags'
array. This is all then sent to Solr's update method via HTTP with
'overwrite=true'. Solr correctly replaces the extant document with the
new document, which is identical with the exception of a new value for
the '_version_' field and an additional value in the multiValued field
'tags'. This all works correctly.

I am now adding a feature where one can remove tags. I am using the
same business logic, but instead of adding a value to the 'tags'
array I am removing one. I can confirm that the data being sent to
Solr does not contain the removed tag. However, the old value of the
multiValued field seems to persist; that is, the old tag stays. I can
see that the '_version_' field has a new value, so the change was
properly committed.

Is there a known bug that overwriting such a doc...:
<doc>
    <arr name="tags">
        <str>a</str>
        <str>b</str>
    </arr>
</doc>

...with this doc...:
<doc>
    <arr name="tags">
        <str>a</str>
    </arr>
</doc>

...has no effect? Can values only be added to a multiValued field, but
not removed?

Thanks.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

Re: Removing a single value from a multiValue field

Posted by Dotan Cohen <do...@gmail.com>.
On Thu, May 30, 2013 at 5:01 PM, Jack Krupansky <ja...@basetechnology.com> wrote:
> You gave an XML example, so I assumed you were working with XML!
>

Right, I did give the output as XML. I find XML to be a great document
markup language, but a terrible command format, mostly due to the
(mis-)use of attributes.


> In JSON...
>
> [{"id": "doc-id", "tags": {"add": ["a", "b"]}}]
>
> and
>
> [{"id": "doc-id", "tags": {"set": null}}]
>

Thank you! That is far more intuitive and less ambiguous than the
XML, would you not agree?

> BTW, this kind of stuff is covered in the book, separate chapters for XML
> and JSON, each with dozens of examples like this.
>

I have not replied to the posts about the book, but I will definitely
order one. My vote is for spiral bound, though I know that
perfect-bound will look more professional on a bookshelf. I don't even
care what the book costs, within reason. Any resource that compiles in
a single package the wonderful methods that you and other contributors
mention here and in other places online will pay for itself in short
order. Apache Solr is an amazing product, but it is often obscure and
unintuitive. Other times one does not even know what Solr is capable
of, as in this thread, where I was parsing entire documents to change
the value of one multiValued field.

Thank you very much!

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

Re: Removing a single value from a multiValue field

Posted by Jack Krupansky <ja...@basetechnology.com>.
You gave an XML example, so I assumed you were working with XML!

In JSON...

[{"id": "doc-id", "tags": {"add": ["a", "b"]}}]

and

[{"id": "doc-id", "tags": {"set": null}}]
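
For anyone generating these payloads from application code, here is a
minimal Python sketch of the two JSON forms above, plus the "replace
the entire list" route for dropping one tag. The function names
atomic_update and remove_value are hypothetical, the current tag values
are assumed to be known to the application already, and actually
posting the payload to Solr's update handler is left out.

```python
import json


def atomic_update(doc_id, field, modifier, value):
    """Build a one-document atomic update payload (JSON update format).

    modifier is "add" (append values) or "set" (replace the whole field).
    A value of None serialises to null, which empties the field.
    """
    if modifier not in ("add", "set"):
        raise ValueError("unsupported modifier: %r" % modifier)
    return [{"id": doc_id, field: {modifier: value}}]


def remove_value(doc_id, field, current_values, unwanted):
    """Drop one value by replacing the list with everything except it."""
    remaining = [v for v in current_values if v != unwanted]
    # An empty result is sent as null, mirroring the "set": null form.
    return atomic_update(doc_id, field, "set", remaining or None)


if __name__ == "__main__":
    print(json.dumps(atomic_update("doc-id", "tags", "add", ["a", "b"])))
    print(json.dumps(atomic_update("doc-id", "tags", "set", None)))
    print(json.dumps(remove_value("doc-id", "tags", ["a", "b"], "b")))
```

The last call shows the case from the original question: tags "a" and
"b" with "b" removed become a "set" of the single remaining value "a".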

BTW, this kind of stuff is covered in the book, separate chapters for XML 
and JSON, each with dozens of examples like this.

-- Jack Krupansky


Re: Removing a single value from a multiValue field

Posted by Dotan Cohen <do...@gmail.com>.
On Thu, May 30, 2013 at 3:42 PM, Jack Krupansky <ja...@basetechnology.com> wrote:
> First, you cannot do any internal editing of a multi-valued list, other
> than:
>
> 1. Replace the entire list.
> 2. Add values on to the end of the list.
>

Thank you. I meant that I am actually editing the entire document:
reading it, changing the values that I need, and then 'updating' it. I
will look into updating only the single multiValued field.


> But you can do both of those operations on a single multivalued field with
> "atomic update" without reading and writing the entire document.
>
> Second, there is no "<arr>" element in the Solr Update XML format. Only
> "<field>".
>
> To simply replace the full, current value of one multi-valued field:
>
> <add>
>  <doc>
>    <field name="id">doc-id</field>
>    <field name="tags" update="set">a</field>
>    <field name="tags" update="set">b</field>
>  </doc>
> </add>
>
> If you simply want to append a couple of values:
>
> <add>
>  <doc>
>    <field name="id">doc-id</field>
>    <field name="tags" update="add">a</field>
>    <field name="tags" update="add">b</field>
>  </doc>
> </add>
>
> To empty out a multivalued field:
>
> <add>
>  <doc>
>    <field name="id">doc-id</field>
>    <field name="tags" update="set" null="true" />
>  </doc>
> </add>
>

Thank you. I will see about translating that into the JSON format that
I work with.

--
Dotan Cohen

http://gibberish.co.il
http://what-is-what.com

Re: Removing a single value from a multiValue field

Posted by Jack Krupansky <ja...@basetechnology.com>.
First, you cannot do any internal editing of a multi-valued list, other 
than:

1. Replace the entire list.
2. Add values on to the end of the list.

But you can do both of those operations on a single multivalued field with 
"atomic update" without reading and writing the entire document.

Second, there is no "<arr>" element in the Solr Update XML format. Only 
"<field>".

To simply replace the full, current value of one multi-valued field:

<add>
  <doc>
    <field name="id">doc-id</field>
    <field name="tags" update="set">a</field>
    <field name="tags" update="set">b</field>
  </doc>
</add>

If you simply want to append a couple of values:

<add>
  <doc>
    <field name="id">doc-id</field>
    <field name="tags" update="add">a</field>
    <field name="tags" update="add">b</field>
  </doc>
</add>

To empty out a multivalued field:

<add>
  <doc>
    <field name="id">doc-id</field>
    <field name="tags" update="set" null="true" />
  </doc>
</add>
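
If the update messages are produced by code rather than written by
hand, the three XML forms above could be generated along these lines.
This is only a sketch using Python's standard xml.etree.ElementTree;
the function name atomic_update_xml is hypothetical, the uniqueKey
field is assumed to be named "id" as in this thread, and sending the
message to Solr is not shown.

```python
import xml.etree.ElementTree as ET


def atomic_update_xml(doc_id, field, modifier, values):
    """Return an <add> message applying one modifier to one field.

    values is a list of strings. An empty list is only accepted with
    modifier "set" and produces the null="true" form that empties
    the field.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    ET.SubElement(doc, "field", {"name": "id"}).text = doc_id
    if not values:
        if modifier != "set":
            raise ValueError("only 'set' may be used with no values")
        ET.SubElement(doc, "field",
                      {"name": field, "update": "set", "null": "true"})
    else:
        # One <field> element per value, each carrying the modifier.
        for value in values:
            el = ET.SubElement(doc, "field",
                               {"name": field, "update": modifier})
            el.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    tags = ["a", "b"]
    # Remove "b" by replacing the whole list with what is left.
    print(atomic_update_xml("doc-id", "tags", "set",
                            [t for t in tags if t != "b"]))
```

Removing a single value is then just case 1, "replace the entire
list", with the unwanted value filtered out beforehand.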

-- Jack Krupansky
