You are viewing a plain text version of this content. The canonical link for it is here.
Posted to solr-dev@lucene.apache.org by "Ryan McKinley (JIRA)" <ji...@apache.org> on 2007/02/04 09:02:05 UTC

[jira] Created: (SOLR-139) Support updateable/modifiable documents

Support updateable/modifiable documents
---------------------------------------

                 Key: SOLR-139
                 URL: https://issues.apache.org/jira/browse/SOLR-139
             Project: Solr
          Issue Type: Improvement
          Components: update
            Reporter: Ryan McKinley


It would be nice to be able to update some fields on a document without having to insert the entire document.

Given the way lucene is structured, (for now) one can only modify stored fields.

While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.

for background, see:
http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment: getStoredFields.patch

Here's the latest patch, with the beginnings of a test, but I've run into some issues.

1) I'm getting occasional errors about the index writer already being closed (with lucene trunk), or null pointer exception with the lucene version we have bundled.  It's the same issue I believe... somehow the writer gets closed when someone else is trying to use it.  If I comment out verifyLatest() in the test (that calls getStoredFields), it no longer happens.

2) When I comment out getStoredFields, things seem to run correctly, but memory grows without bound and i eventually run out.  100 threads each limiting themselves to manipulating 16 possible documents each should not be able to cause this.

At this point, I think #2 is the most important to investigate.  It may be unrelated to this latest patch, and could be related to other peoples reports of running out of memory while indexing.


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: Eriks-ModifiableDocument.patch

applies with trunk...

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-ModifyInputDocuments.patch

implementing modifiable documents in UpdateRequestProcessor.

This adds two example docs:
modify1.xml and modify2.xml

<add mode="OVERWRITE,price:INCREMENT,cat:DISTINCT">
 <doc>
  <field name="id">CSC</field>
  <field name="name">Campbell's Soup Can</field>
  <field name="manu">Andy Warhol</field>
  <field name="price">23.00</field>
  <field name="popularity">100</field>
  <field name="cat">category1</field>
 </doc>
</add>

will increment the price by 23 each time it is run.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514234 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

Thanks Mike, I've made those changes to my local copy.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-269+139-ModifiableDocumentUpdateProcessor.patch

implements modifiable documents in the SOLR-269 update processor chain.

If the request does not have a 'mode' string, the ModifyDocumentProcessorFactory does not add a processor to the chain.



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by Yonik Seeley <yo...@apache.org>.
On 7/13/07, Mike Klaas <mi...@gmail.com> wrote:
> >>> ... ParallelReader, where some fields are in one sub-index ...
> >> the processor would ask the updateHandler for the existing
> >> document - the updateHandler deals with
> >> getting it to/from the right place.
> >
> > The big reason you would use ParallelReader is to avoid touching
> > the less-modified/bigger fields in one index when changing some of
> > the other fields in the other index.
>
> I've pondered this a few times: it could be a huge win for
> highlighting apps, which can be stored-field-heavy.
>
> However, I wonder if there is something that I am missing: PR
> requires perfect synchro of lucene doc ids, no?  If you update fields
> for a doc in one index, need not you (re-)store the fields in all
> other indices too, to keep the doc ids in sync?

Well, it would be tricky... one PR usecase would be to entirely
re-index one field (in it's own separate index) thus maintaining
synchronization with the main index. As Doug said
"ParallelReader was not really designed to support incremental updates of
fields, but rather to accellerate batch updates. For incremental
updates you're probably better served by updating a single index."

That's probably not too useful for a general purpose platform like Solr.

Another way to support a more incremental model is perhaps to split up
the smaller volatile index into many segments so that updating a
single doc involves rewriting just that segment.

There might also be possibilities in different types of IndexReader
implementations:  one could map docids to maintain synchronization.
This brings up a slightly different problem that lucene scorers expect
to go in docid order.

-Yonik

Re: [jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by Mike Klaas <mi...@gmail.com>.
On 13-Jul-07, at 1:53 PM, Yonik Seeley (JIRA) wrote:

>
>>> ... ParallelReader, where some fields are in one sub-index ...
>> the processor would ask the updateHandler for the existing  
>> document - the updateHandler deals with
>> getting it to/from the right place.
>
> The big reason you would use ParallelReader is to avoid touching  
> the less-modified/bigger fields in one index when changing some of  
> the other fields in the other index.

I've pondered this a few times: it could be a huge win for  
highlighting apps, which can be stored-field-heavy.

However, I wonder if there is something that I am missing: PR  
requires perfect synchro of lucene doc ids, no?  If you update fields  
for a doc in one index, need not you (re-)store the fields in all  
other indices too, to keep the doc ids in sync?

-mike

[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512617 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

>> ... ParallelReader, where some fields are in one sub-index ...
> the processor would ask the updateHandler for the existing document - the updateHandler deals with
> getting it to/from the right place.

The big reason you would use ParallelReader is to avoid touching the less-modified/bigger fields in one index when changing some of the other fields in the other index.

> What are you thinking? Adding the processor as a parameter to AddUpdateCommand?

I didn't have a clear alternative... I was just pointing out the future pitfalls of assuming too much implementation knowledge.



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521903 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

IMO, copyField targets should always be re-generated... so no, it doesn't seem like you should have to say anything about tag if you update erik_tag.

On a side note, I'm not sure how scalable the field-per-user strategy is.  There are some places in Lucene (segment merging for one) that go over each field, merging the properties.  Low thousands would be OK, but millions would not be OK.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509631 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> How can the UpdateHandler get access to pending documents? should it just use req.getSearcher()? 

I don't think so... the document may have been added more recently than the last searcher was opened.
We need to use the IndexReader from the UpdateHandler.  The reader and writer can both remain open as long as the reader is not used for deletions (which is a write operation and hence exclusive with IndexWriter).



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Mike Klaas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12514099 ] 

Mike Klaas commented on SOLR-139:
---------------------------------

It is my fault that the DUH2 locking is so hairy to begin with, so I should at least review changes to it ;)

With your last change, the locking looks sound.  However, I noticed a few things:

This comment is now inaccurate:
+    // need to start off with the write lock because we can't aquire
+    // the write lock if we need to.

Should openSearcher() call closeSearcher() instead of doing it manually?  It looks like searcherHasChanges is not being reset to false.



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470081 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

Is this what you are suggesting?

I added a second searcher to DirectUpdatehandler2 that is only closed when you call commit();  


  // Check if the document has not been commited yet
  Integer cnt = pset.get( id.toString() );
  if( cnt != null && cnt > 0 ) {
    commit( new CommitUpdateCommand(false) );
  }
  if( committedSearcher == null ) {
    committedSearcher = core.newSearcher("DirectUpdateHandler2.committed");
  }
  Term t = new Term( uniqueKey.getName(), uniqueKey.getType().toInternal( id.toString() ) );
  int docID = committedSearcher.getFirstMatch( t );
  if( docID >= 0 ) {
    Document luceneDoc = committedSearcher.doc( docID );
    // set the new doc as the existingDoc + our changes
    DocumentBuilder builder = new DocumentBuilder( schema );
    add.doc = builder.bulid( 
       this.modifyDocument( luceneDoc, cmd.doc, cmd.mode ) );
  }


- - - - - - -

This passes a new test that adds the same doc multiple times.  BUT it does commit each time.

Another alternative would be to keep a Map<String,Document> of the pending documents in memory.  Then we would not have to commit each time something has changed.







> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-XmlUpdater.patch

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

applies (almost cleanly) with trunk + SOLR-193

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623416#action_12623416 ] 

Noble Paul commented on SOLR-139:
---------------------------------

there are pros and cons w/ each approaches (am I discovering a universal truth here ;) )

Many approaches can to confuse users . I can propose something like 

<modifiable>true</modifiable> in the manIndex .
And say 
<modificationStrategy>solr.SepareteIndexStrategy</modificationStrategy>
or
<modificationStrategy>solr.SameIndexStrategy</modificationStrategy>

(I did not mention  the other two because of personal preferences ;)  )

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Hatcher updated SOLR-139:
------------------------------

    Attachment: Eriks-ModifiableDocument.patch

For posterity, here's the patch I'm running in Collex production right now, and its working fine thus far.

Sorry, I know this issue has a convoluted set of patches to follow at the moment.  I trust that when Ryan is back in action we'll get this tidied up and somehow get Yonik-satisfying refactorings :)

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470065 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

Oh, so we obviously need some tests that modify the same document multiple times w/o a commit in between.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Comment: was deleted

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470043 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

I just attached a modified XmlUpdateRequestHandler that uses the new IndexDocumentCommand .  I left this out originally because I think the discussion around syntax and functionality should be separated...  BUT without some example, it is tough to get a sense how this would work, so i added this example.

Check the new file:
monitor-modifier.xml

It starts with:
<add mode="cat=DISTINCT,features=APPEND,price=INCREMENT,sku=REMOVE,OVERWRITE">
<doc>
  <field name="id">3007WFP</field>
  ...

If you run  ./post.sh  monitor-modifier.xml multiple times and check: http://localhost:8983/solr/select?q=id:3007WFP you should notice
1) the price increments by 5 each time
2) there is an additional 'feature' line each time
3) the categories are distinct even if the input is not

sku=REMOVE is required because sku is a stored field that is written to with copyField.  

Although I think this syntax is reasonable, this is just an example intended to spark discussion.  Other things to consider:

* rather then 'field=mode,' we could do 'field:mode,' this may look less like HTTP request parameter syntax

* The update handler could skip any stored field that is the target of a 'copyField' automatically.  This is the most normal case, so it may be the most reasonable thing to do.



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

added missing files.  this should is ready to check over if anyone has some time

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Bill Au (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601408#action_12601408 ] 

Bill Au commented on SOLR-139:
------------------------------

I noticed that this bug is no longer included in the 1.3 release.  Are there any outstanding issues if all the fields are stored?  Requiring that all fields are stored for a document to be update-able seems like reasonable to me.  This feature will simplify things for Solr users who are doing a query to get all the fields following by an add when they only want to update a very small number of fields.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley reassigned SOLR-139:
----------------------------------

    Assignee: Ryan McKinley

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Lance Norskog (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12582401#action_12582401 ] 

Lance Norskog commented on SOLR-139:
------------------------------------

If I may comment?  

Is it ok if a first implementation requires all fields to stored, and then a later iteration supports non-stored fields?  This seems to be a complex problem, and you might decide later that the first design is completely bogus.




> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "David Smiley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591317#action_12591317 ] 

David Smiley commented on SOLR-139:
-----------------------------------

I agree Lance.  I don't know what the "first design" is that might be bogus you're talking about... but we definitely *eventually* want to handle non-stored fields.  I have an index that isn't huge and I'm salivating at the prospects of doing an update.  For non-stored fields... it seems that if an update always over-writes such fields with new data then we should be able to support that now easily because we don't care what the old data was.

For non-stored fields that need to be retained (Otis & Yoniks concern)... I wonder what Lucene exposes about the indexed data for a non-stored field.  We'd just want to copy this low-level data over to a new document, basically.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512558 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

> the udpate handler knows much more about the index than we do outside

Yes.  The patch i just attached only deals with documents that are already commited.  It uses req.getSearcher() to find existing documents.  

Beyond finding commited or non-commited Documents, is there anything else that it can do better?  

Is it enought to add something to UpdateHandler to ask for a pending or commited document by uniqueId?

I like having the the actual document manipulation happening in the Processor because it is an easy place to put in other things like grabbing stuff from a SQL database.  

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12520554 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

I'm experimenting with this patch with tagging.   I'm modeling the fields in this way, beyond general document metadata fields:

   <username>_tags
  usernames

And copyFielding *_tags into "tags". 

usernames field allows seeing all users who have tagged documents.

Users are allowed to "uncollect" an object, which would remove the <username>_tags field and remove their name from the usernames field.   Removing the <username>_tags field use case is covered with the <username>_tags:OVERWRITE mode.  But removing a username from the multiValued and non-duplicating usernames field is not.

An example (from some conversations with Ryan): 

 id: 10
 usernames: ryan, erik

You want to be able to remove 'ryan' but keep 'erik'.

Perhaps we need to add a 'REMOVE' mode to remove the first (all?)
matching values
 /update?mode=OVERWRITE,username=REMOVE
 <doc>
  id=10,
  usernames=ryan
 </doc>

and make the output:

 id: 10
 usernames: erik

But what about duplicate values?  

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521928 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

A mistake I had was my copyField target ("tag") was stored.  Setting it to be unstored alleviated the need to overwrite it - thanks!

One thing I noticed is that all fields sent in the update must be stored, but that doesn't really need to be the case with fields being overwritten -  perhaps that restriction should be lifted and only applies when the stored data is needed.

As for sending in a field that was the target of a copyField - I'm not doing this nor can I really envision this case, but it seemed like it might be a case to consider here.  Perhaps a "text" field that could be optionally set by the client, and is also the destination of copyField's of title, author, etc. 

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment: getStoredFields.patch

Attaching a patch for getStoredFields that appears to work.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509608 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

Another use case to keep in mind... someone might use an UpdateRequestProcessor
to add new fields to a document (something like copyField, but more complex).  Some of this logic might be based on the value of other fields themselves.

example1:  a userTag field that represents tags on objects of the form user#tagstring.
If user==member, then add tagstring to the indexed-only ownerTags field, else
add the tagstring to the socialTags field.

example2: an UpdateRequestProcessor is used to encode the value of a field with rot13... this should obviously only be done for new field values, and not values that are just being re-stored, so the UpdateRequestProcessor needs to be able to distinguish between the two.

example3: some field values are pulled from a database when missing rather than being stored values.

I think we need to try to find the right UpdateRequestProcessor interface to support all these with UpdateableDocuments.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by Ryan McKinley <ry...@gmail.com>.
Sorry I'm out of the loop on this.  In the last two weeks, I planned a 
wedding, got married saturday and moved out of our apartment.  We are 
now driving east to DC.

I'll try to catch up on this thread in a week (or so)

thanks
ryan


Yonik Seeley (JIRA) wrote:
>     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522333 ] 
> 
> Yonik Seeley commented on SOLR-139:
> -----------------------------------
> 
>> how about using <update> instead of <add>
> 
> We had previously talked about making this distinction (and configuration for each field) in the URL:
> http://localhost:8983/solr/update?mode=title:overwrite,cat:distinct
> This makes it usable and consistent for different update handlers and formats (CSV, future SQL, future JSON, etc)
> 
> but perhaps if we allowed the <add> tag to optionally be called something more neutral like <docs>?
> 
> wrt patches, I think the functionality needs refactoring so that modify document logic is in the update handler.  It seems like it's the only clean way from a locking perspective, and it also leaves open future optimizations (like using different indices depending on the fieldname and using a parallel reader across them).
> 
>> Support updateable/modifiable documents
>> ---------------------------------------
>>
>>                 Key: SOLR-139
>>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>>             Project: Solr
>>          Issue Type: Improvement
>>          Components: update
>>            Reporter: Ryan McKinley
>>            Assignee: Ryan McKinley
>>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>>
>>
>> It would be nice to be able to update some fields on a document without having to insert the entire document.
>> Given the way lucene is structured, (for now) one can only modify stored fields.
>> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
>> for background, see:
>> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293
> 


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522333 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> how about using <update> instead of <add>

We had previously talked about making this distinction (and configuration for each field) in the URL:
http://localhost:8983/solr/update?mode=title:overwrite,cat:distinct
This makes it usable and consistent for different update handlers and formats (CSV, future SQL, future JSON, etc)

but perhaps if we allowed the <add> tag to optionally be called something more neutral like <docs>?

wrt patches, I think the functionality needs refactoring so that modify document logic is in the update handler.  It seems like it's the only clean way from a locking perspective, and it also leaves open future optimizations (like using different indices depending on the fieldname and using a parallel reader across them).

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

An updated and cleaned up versoin.  The big change is to the SolrDocument interface - rather then expose Maps directly, it hides them in an interface:

http://solrstuff.org/svn/solrj/src/org/apache/solr/util/SolrDocument.java

Still needs some test cases

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Otis Gospodnetic (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573906#action_12573906 ] 

Otis Gospodnetic commented on SOLR-139:
---------------------------------------

I'm looking at this issue now.  I used to think "sure, this will work fine", but I'm looking at a 1B doc index (split over N servers) and am suddenly very scared of having to store more than necessary in the index.  In other words, writing custom field value loaders from external storage sounds like the right thing to do.  Perhaps one such loader could simply load from the index itself. :)


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Mark Diggory (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12787072#action_12787072 ] 

Mark Diggory commented on SOLR-139:
-----------------------------------

I notice this is a very long lived issue and that it is marked for 1.5.  Are there outstanding issues or problems with its usage if I apply it to my 1.4 source?

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.5
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment: getStoredFields.patch

Found another bug just by looking at it a little longer... pset needs to be synchronized if all we have is the read lock, since addDoc() actually changes the pset while only holding the readLock (but it also synchronizes).  I expanded the sync block to encompass the pset access and merged the two sync blocks.  Hopefully this one should be correct.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470035 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

SOLR-139-IndexDocumentCommand.patch adds a new command to UpdateHandler and deprecates 'AddUpdateCommand'

This patch is only concerned with adding updateability to the UpdateHandler, it does not deal with how request handlers specify what should happen with each field.

I added:

public class IndexDocumentCommand 
{
  public enum MODE {
    OVERWRITE, // overwrite existing values with the new one. (the default behavior)
    APPEND,    // add the new value to existing value
    DISTINCT,  // same as APPEND, but make sure each value is distinct
    INCREMENT, // increment existing value.  Must be a number!
    REMOVE     // remove the previous value.
  };

  public boolean overwrite = true;
  public SolrDocument doc;
  public Map<SchemaField,MODE> mode; // What to do for each field.  null is the default
  public int commitMaxTime = -1; // make sure the document is commited within this much time
}

RequestHandlers will need to fill up the 'mode' map if they want to support updateability.  Setting the mode.put( null, APPEND ) sets the default mode.




> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516473 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

So the big issue now is that I don't think we can use getStoredFields() and do document modification outside the update handler.  The biggest reason is that I think we need to be able to update documents atomically (in the sense that updates should not be lost).

Consider the usecase of adding a new tag to a multi-valued field:  if two different clients tag a document at the same time, it doesn't seem acceptable that one of the tags could be lost.  So I think that we need a modifyDocument() call on updateHandler, and perhaps a ModifyUpdateCommand to go along with it.

I'm not sure yet what this means for request processors.  Perhaps another method that handles the reloaded storedFields?

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

depending on SOLR-193

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470061 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

Haven't had a chance to check out any code, but a few quick comments:

If the field modes were parameters, they could be reused for other update handlers like SQL or CSV
Perhaps something like:
/update/xml?modify=true&f.price.mode=increment,f.features.mode=append

> sku=REMOVE is required because sku is a stored field that is written to with copyField.
I'm not sure I quite grok what REMOVE means yet, and how it fixes the copyField problem.

Another way to work around copyField is to only collect stored fields that aren't copyField targets.  Then you run the copyField logic while indexing the document again (as you should anyway).

I think I'll have a tagging usecase that requires removing a specific field value from a multivalued field.  Removing based on a regex might be nice too. though.

f.tag.mode=remove
f.tag.mode=removeMatching or removeRegex



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509635 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> (as long as you don't need multiple of them). 

If you had a bunch of independent things you wanted to do, having a single processor forces you to squish them together (say you even added another document type to your index and then wanted to add another mutation operation).

What if we had some sort of hybrid between the two approaches. Had a list of processors, and the last would have primary responsibility for getting the documents in the index?

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment: UpdateProcessor.patch

OK, this patch adds the ability to specify multiple processor factories.

Summary:
- decoupled changing the index from logging... I think it makes it much clearer how things work (there was much more logging code than anything else).  This also would allow Ryan to add back his incremental timing to a different processor.
- added SolrInputDocument to AddUpdateCommand, and added some methods
- removed NamedList return from the XML update handler and started passing SolrQueryResponse around instead (this is more future-proof and flexible)
- removed adding all the ids to the response... (back to 1.2 response format).  We probably shouldn't add ids by default... think of CSV uploading millions of records, etc.
- An array of factories is kept, and when a processor is instantiated, it is passed a "next" pointer.   An alternative would be to expose a "next factory" pointer to every factory (any advantage to having the current factory call getInstance() on the next factory instead of us doing that?)


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, UpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
that was quick!  thanks ryan!


On Aug 14, 2007, at 2:15 PM, Ryan McKinley (JIRA) wrote:

>
>      [ https://issues.apache.org/jira/browse/SOLR-139? 
> page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Ryan McKinley updated SOLR-139:
> -------------------------------
>
>     Attachment: SOLR-139-ModifyInputDocuments.patch
>
> applies with trunk
>
>> Support updateable/modifiable documents
>> ---------------------------------------
>>
>>                 Key: SOLR-139
>>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>>             Project: Solr
>>          Issue Type: Improvement
>>          Components: update
>>            Reporter: Ryan McKinley
>>            Assignee: Ryan McKinley
>>         Attachments: getStoredFields.patch, getStoredFields.patch,  
>> getStoredFields.patch, getStoredFields.patch,  
>> getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch,  
>> SOLR-139-IndexDocumentCommand.patch, SOLR-139- 
>> IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,  
>> SOLR-139-IndexDocumentCommand.patch, SOLR-139- 
>> IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,  
>> SOLR-139-IndexDocumentCommand.patch, SOLR-139- 
>> IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch,  
>> SOLR-139-IndexDocumentCommand.patch, SOLR-139- 
>> ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch,  
>> SOLR-139-ModifyInputDocuments.patch, SOLR-139- 
>> ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269 
>> +139-ModifiableDocumentUpdateProcessor.patch
>>
>>
>> It would be nice to be able to update some fields on a document  
>> without having to insert the entire document.
>> Given the way lucene is structured, (for now) one can only modify  
>> stored fields.
>> While we are at it, we can support incrementing an existing value  
>> - I think this only makes sense for numbers.
>> for background, see:
>> http://www.nabble.com/loading-many-documents-by-ID- 
>> tf3145666.html#a8722293
>
> -- 
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-ModifyInputDocuments.patch

applies with trunk

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512613 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

> 
> The update handler could call the processor when it was time to do the manipulation too.
> 

What are you thinking?  Adding the processor as a parameter to AddUpdateCommand?

> ... ParallelReader, where some fields are in one sub-index  ...

the processor would ask the updateHandler for the existing document - the updateHandler deals with getting it to/from the right place.

we could add something like:
  Document getDocumentFromPendingOrCommited( String indexId )
to UpdateHandler and then that is taken care of.

Other then extracting the old document, what needs to be done that cant be done in the processor?  

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616564#action_12616564 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

There are many approaches to make this work -- i don't think there will be a one-size-fits all approach though.  Storing all fields in the lucene index may be fine.  Perhaps we want to write the xml to disk when it is indexed, then reload it when the file is 'updated', perhaps the content should be stored in a SQL db.

David - storing all data in the search index can be a problem because it can get BIG.  Imagine if nutch stored the raw content in the lucene index?  (I may be wrong on this) even with Lazy loading, there is a query time cost to having stored fields.

In general, I think we just need an API that will allow for a variety of storage mechanisms.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Jörg Kiegeland (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12544856 ] 

Jörg Kiegeland commented on SOLR-139:
-------------------------------------

A useful feature would be "update based on query", so that documents matching the query condition will all be modified in the same way on the given update fields. 
This feature would correspond to SQL's UPDATE command, so that Solr would now cover all the basic commands SQL provides.  (While this is a theoretic motivation, I just missed this feature for my Solr projects..)

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515306 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

OOM still happens from the command line also after lucene updates to 2.2.
Looks like it's time for old-school instrumentation (printfs, etc).

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: Eriks-ModifiableDocument.patch

updated to work with trunk.  no real changes.

The final version of this will need to move updating logic out of the processor into the UpdateHandler

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509615 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------


So you are suggesting pulling this out of the UpdateHandler and managing the document merging in the UpdateRequestProcessor?  (this might makes sense - It was not an option when the patch started in feb)  

How can the UpdateHandler get access to pending documents?  should it just use req.getSearcher()? 


> example1:  a userTag field that represents tags on objects of the form user#tagstring.
> If user==member, then add tagstring to the indexed-only ownerTags field, else
> add the tagstring to the socialTags field.
> 
> example2: an UpdateRequestProcessor is used to encode the value of a field with rot13... this should obviously only be done for new field values, and not values that are just being re-stored, so the UpdateRequestProcessor needs to be able to distinguish between the two.
> 

1 & 2 seem pretty straightforwad

> example3: some field values are pulled from a database when missing rather than being stored values.
> 

Do you mean as input or output?  The UpdateRequestProcessor could not affect if a field is stored or not, it could augment a document with more fields *before* it is indexed.  To add fields from a database rather then storing them, we would need a hook at the end.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-ModifyInputDocuments.patch

Updated to use Yonik's 'getStoredFields'.  The one change to that is to have UpdateHandler addFields() load fields as the Object (not external string) and to skip copy fields:

void addFields(Document luceneDoc, SolrDocument solrDoc) {
    for (Fieldable f : (List<Fieldable>)luceneDoc.getFields()) {
      SchemaField sf = schema.getField( f.name() );
      if( !schema.isCopyFieldTarget( sf ) ) {
        Object externalVal = sf.getType().toObject(f);
        solrDoc.addField(f.name(), externalVal);  
      }
    }
  }

- - - -

This still implements modifiable documents as a RequestProcessor.


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521906 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

I agree in the gut feel to how copyFields and overwrite should work, but what about the situation where the client is sending in values for that field also?   If it got completely regenerated during a modify operation, data would be lost.  No?

Thanks for the note about field-per-user strategy.  In our case, even thousands of users is on down the road.   The simplicity of having fields per user is mighty alluring and is working nicely in the prototype for now. 

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616762#action_12616762 ] 

Noble Paul commented on SOLR-139:
---------------------------------

* If your index is really big most likely you may have a master/slave deployment. In that case only master needs to store the data copy. The slaves do not have to pay the 'update tax'

bq. Perhaps we want to write the xml to disk when it is indexed, then reload it when the file is 'updated', perhaps the content should be stored in a SQL db. 

Writing the xml may not be an option if I use DIH for indexing (there is no xml). And what if I use CSV. That means multiple storage formats. RDDBMS storage is a problem because of incompatibility of Lecene/RDBMS data structures. Creating a schema will be extremely hard because of dynamic fields

I guess we can have multiple solutions. I can provide a simple DuplicateIndexUpdateProcessor (for lack of a better name) which can store all the data in a duplicate index. Let the user decide what he wants

bq.It's possible to recover unstored fields, if the purpose of such recovery is to make a copy of the document and update other fields.
It is not wise to invest our time to do 'very clever' things because it is error prone. Unless Lucene gives us a clean API to do so

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Marcus Herou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702778#action_12702778 ] 

Marcus Herou commented on SOLR-139:
-----------------------------------

It would make sense of adding ParallelReader functionality so a core can read from several index-dirs. 
Guess it complicates things a little since you would need to have support for adding data as well to more than one index.

Suggestion:
/update/coreX/index1 - Uses schema1.xml
/update/coreX/index2 - Uses schema2.xml
/select/coreX - Uses all schemas e.g. A ParallelReader.

Seing quite a lot questions on the mailinglist about users that want to be able to update a single field while maintaining the rest of the index intact (not reindex).



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.5
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: Eriks-ModifiableDocument.patch

updated patch to work with trunk

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522310 ] 

ehatcher edited comment on SOLR-139 at 8/23/07 4:59 PM:
------------------------------------------------------------

There is a bug in the last patch that allows an update to a non-existent document to create a new document. 

I've corrected this by adding this else clause in ModifyExistingDocumentProcessor:

      if( existing != null ) {
        cmd.solrDoc = ModifyDocumentUtils.modifyDocument(existing, cmd.solrDoc, modes, schema );
      } else {
        throw new SolrException(SolrException.ErrorCode.BAD_REQUEST,
            "Cannot update non-existent document: " + id);
      }

I can't generate a patch, yet, that is clean addition of just that bit of code along with the other changes.


      was (Author: ehatcher):
    FYI - I'm looking into tracking it down, but there is a bug in the latest patch that allows an update to a non-existent document to create a new document.  More when I figure out the cause.
  
> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564896#action_12564896 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

I'm having second thoughts if this is a good enough approach to really put in core Solr.
Requiring that all fields be stored is a really large drawback, esp for large indicies with really large documents.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "David Smiley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616467#action_12616467 ] 

David Smiley commented on SOLR-139:
-----------------------------------

It's unclear to me how your suggestion, Paul, is better.  If at the end of the day, all the fields must be stored on disk somewhere (and that is the case), then why complicate matters and split out the storage of the fields into a separate index?  In my case, nearly all of my fields are stored so this would be redundant.  AFAIK, having your "main" index "small" only matters as far as index storage, not stored-field storage.  So I don't get the point in this.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Mike Klaas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616729#action_12616729 ] 

Mike Klaas commented on SOLR-139:
---------------------------------

[quote]David - storing all data in the search index can be a problem because it can get BIG. Imagine if nutch stored the raw content in the lucene index? (I may be wrong on this) even with Lazy loading, there is a query time cost to having stored fields.[/quote]

Splitting it out into another store is much better at scale.  A distinct lucene index works relatively well.



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment: getStoredFields.patch

updated getStoredFields.patch to use synchronization instead of the write lock since you can't upgrade a read lock to a write lock.  I could have started out with the write lock and downgraded it to a read lock, but that would probably lead to much worse concurrency since it couldn't proceed in parallel with other operations such as adds.

It would be good if someone could review this for threading issues.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522320 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

Some thoughts on the update request, how about using <update> instead of <add>?   And put the modes as attributes to <update>...

   <update overwrite="title" distinct="cat">
      <field name="id">ID</field>
      <field name="title">new title</field>
      <field name="cat">NewCategory</field>
    </update>

Using <add> has the wrong implication for this operation.   Thoughts?

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515366 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

I disabled logging on all of "org.apache.solr" via a filter, and voila, OOM problems are gone.
Perhaps the logger could not keep up with the number of records and they piled up over time time (does any component of the logging framework use another thread that might be getting starved?)

Anyway, it doesn't look like Solr has a memory leak.
On to the next issue.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

updated with trunk

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-ModifyInputDocuments.patch

Updated patch to work with SOLR-269 UpdateRequestProcessors.

One thing I think is weird about this is that it uses parameters to say the mode rather then the add command.  That is, to modify a documetn you have to send:

/update?mode=OVERWRITE,count:INCREMENT
<add>
 <doc>
  <field name="id">1</field>
  <field name="count">5</field>
 </doc>
</add>

rather then:
<add mode="OVERWRITE,count:INCREMENT">
 <doc>
  <field name="id">1</field>
  <field name="count">5</field>
 </doc>
</add>

This is fine, but it makes it hard to have an example 'modify' xml document.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12513355 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

FYI, I'm working on a loadStoredFields() for UpdateHandler now.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12564910#action_12564910 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

that is part of why i thought having it in an update request processor makes sense -- it can easily be subclassed to pull the existing fields from whereever it needs.  Even if it is directly in the UpdateHandler, there could be some interface to _loadExistingFields( id )_ or something similar.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

* Updated for new SolrDocument implementation.
* Testing via solrj.

I think this should be committed soon.  It should only affect performance if you specify a 'mode' in the add command.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470064 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

I browsed the code really quick, looking for the tricky part... It's here:
+      openSearcher();
+      Term t = new Term( uniqueKey.getName(), uniqueKey.getType().toInternal( id.toString() ) );
+      int docID = searcher.getFirstMatch( t );

When you overwrite a document, it is really just adds another instance... so the index contains multiple copies.  When we "commit", deletes of the older versions are performed.  So you really want the *last* doc matching a term, not the first.

Also, to need to make sure that the searcher you are using can actually "see" the last document (once a searcher is opened, it sees documents that were added since the last IndexWriter close().

So a quick fix would be to do a commit() first that would close the writer, and then delete any old copies of docments.
Opening and closing readers and writers is *very* expensive though.
You can get slightly more sophisticated by checking the pset (set of pending documents), and skip the commit() if the doc you are updating isn't in there (so you know an older searcher will still have the freshest doc for that id).

We might be able to get more efficient yet in the future by leveraging NewIndexModifier: SOLR-124


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment: getStoredFields.patch

Here's a patch that adds getStoredFields to updateHandler.
Changes include leaving the searcher open as long as possible (and in conjunction with the writer, if the reader has no unflushed deletions).  This would allow reuse of a single index searcher when modifying multiple documents that are not in the pending set.

No tests for getStoredFields() yet... I plan on a comprehensive update handler test that uses multiple threads to try and flush out any possible concurrency bugs. 

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522310 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

FYI - I'm looking into tracking it down, but there is a bug in the latest patch that allows an update to a non-existent document to create a new document.  More when I figure out the cause.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521900 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

One thing to note about overwrite and copyFields is that to keep a purely copyFielded field in sync you must basically remove it (overwrite without providing a value).   

For example, my schema:

    <dynamicField name="*_tag" type="string" indexed="true" stored="true" multiValued="true"/>
    <field name="tag" type="string" indexed="true" stored="true" multiValued="true"/>

and then this:
  <copyField source="*_tag" dest="tag"/>

The client never provides a value for "tag" only ever <username>_tag values.   I was seeing old values in the tag field after doing overwrites of <username>_tag expecting "tag" to get rewritten entirely.   Saying mode=tag:OVERWRITE does the trick.    This is understandable, but confusing, as the client then needs to know about purely copyFielded fields that it never sends directly. 

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515154 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

Weird... I put a profiler on it to try and figure out the OOM issue, and I never saw heap usage growing over time (stayed between 20M and 30M), right up until I got an OOM (it never registered on the profiler).

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Issue Comment Edited: (SOLR-139) Support updateable/modifiable documents

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616399#action_12616399 ] 

noble.paul edited comment on SOLR-139 at 7/24/08 2:14 AM:
----------------------------------------------------------

How about maintaining a separate index at store.index and write all the documents to that index also. 

In the store.index , all the fields must be stored and none will be indexed. This index will not have any copy fields . It will blindly dump the data as it is.

Updating can be done by reading data from this.  Deletion must also be done on both the indices

This way the original index will be small and the users do not have to make all fields stored

      was (Author: noble.paul):
    How about maintaining a separate index at store.index and write all the documents to that index also. 

In the store.index , all the fields must be stored and none will be indexed. Updating can be done by reading data from this. 

This way the original index will be small and the users do not have to make all fields stored
  
> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: Eriks-ModifiableDocument.patch

applies to /trunk

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616725#action_12616725 ] 

Andrzej Bialecki  commented on SOLR-139:
----------------------------------------

It's possible to recover unstored fields, if the purpose of such recovery is to make a copy of the document and update other fields. The process is time-consuming, because you need to traverse all postings for all terms, so it might be impractical for larger indexes. Furthermore, such recovered content may be incomplete - tokens may have been changed or skipped/added by analyzers, positionIncrement gaps may have been introduced, etc, etc.

Most of this functionality is implemented in Luke "Restore & Edit" function. Perhaps it's possible to implement a new low-level Lucene API to do it more efficiently.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

This is a minor change that keeps the OVERWRITE property if it is specified.  The previous version ignored the update mode if everthing was OVERWRITE.


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12702883#action_12702883 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

ParallelReader assumes you have two indexes that "line up" so the internal docids match.  Maintaining something like that would currently be pretty hard or impractical.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.5
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521913 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

I think there are a number of restrictions for using this feature:
1) all source fields (not copyField targets) need to be stored, or included in modify commands
2) copyField targets should either be unstored, or if stored, should never be explicitly set by the user

What tags would you send in that aren't user tags?  If they are some sort of global tags, then you could do global_tag=foo (reusing your dynamic user_tag field), or create a new globalTags field and an additional copyField to the tag field.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591323#action_12591323 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

{quote}
For non-stored fields that need to be retained (Otis & Yoniks concern)... I wonder what Lucene exposes about the indexed data for a non-stored field. We'd just want to copy this low-level data over to a new document, basically.
{quote}

Lucene maintains an inverted index, so the "indexed" part is spread over the entire index (terms point to documents).  Copying an indexed field would require looping over every indexed term (all documents with that field).  It would be *very* slow once an index got large.


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470080 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

If modes are in a single param, a ":" syntax might be nicer because you could cut-n-paste it into a URL w/o escaping.

> I'm using 'REMOVE' to say "remove the previous value of this field before doing anything." Essentially, this makes sure you new 
> document does not start with a value for 'sku'.

That wouldn't work for multi-valued fields though, right?
If we keep this option, perhaps we should find a better name... to me "remove" suggests that the given value should be removed.  Think adding / removing tag values from a document.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Fix Version/s:     (was: 1.3)

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470109 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

I added a new version of SOLR-139-IndexDocumentCommand.patch that:
* gets rid of 'REMOVE' option
* uses a separate searcher to search for existing docs
* includes the XmlUpdateRequestHandler
* moves general code from XmlUpdateRequestHandler to SolrPluginUtils
* adds a few more tests

Can someone with a better lucene understanding look into re-useint the existing searcher as Yonik suggests above - I don't quite understand the other DUH2 implications.

I moved the part that parses (and validates) field mode parsing into SolrPluginUtils.  This could be used by other RequestHandlers to parse the mode map.

The XmlUpdateRequestHandler in this patch should support all legacy calls *except* cases where overwritePending != overwriteCommitted.  There are no existing tests with this case, so it is not a problem from the testing standpoint.  I don't know if anyone is using this (mis?) feature.



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12521932 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> One thing I noticed is that all fields sent in the update must be stored, but that doesn't really need to be the case with fields being overwritten - perhaps that restriction should be lifted and only applies when the stored data is needed.

+1

> Perhaps a "text" field that could be optionally set by the client, and is also the destination of copyField's of title, author, etc.

Seems like things of this form can always be refactored by adding a field "settableText" and adding another copyField from that to "text"

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Erik Hatcher (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12522920 ] 

Erik Hatcher commented on SOLR-139:
-----------------------------------

Added DELETE support in ModifyDocumentUtils like this:

      case DELETE:
        if( field != null ) {
          Collection<Object> collection = existing.getFieldValues(name);
          if (collection != null) {
            collection.remove(field.getValue());
            if (collection.isEmpty()) {
              existing.removeField(name);
            } else {
              existing.setField(name, collection, field.getBoost());
            }
          }
        }
        // TODO: if field is null, should the field be deleted?
        break;


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "David Smiley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616570#action_12616570 ] 

David Smiley commented on SOLR-139:
-----------------------------------

Ryan, I know of course the index can get big because one needs to store all the data for re-indexing; but due to Lucene's fundamental limitations, we can't get around that fact.  Moving the data off to another place (a DB of some sort or whatever) doesn't change the fundamental problem.  If one is unwilling to store a copy somewhere convenient due to data scalability issues then we simply cannot support the feature because Lucene doesn't have an underlying update capability.

If the schema's field isn't stored, then it may be useful to provide an API that can fetch un-stored fields for a given document.  I don't think it'd be common to use the API and definitely wouldn't be worth providing a default implementation.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12516598 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------


> 
> So I think that we need a modifyDocument() call on updateHandler, and perhaps a ModifyUpdateCommand to go along with it.
> 
> I'm not sure yet what this means for request processors.  Perhaps another method that handles the reloaded storedFields?
>

Another option might be some sort of transaction or locking model.  Could it block other requests while there is an open transaction/lock?

Consider the case where we need the same atomic protection for fields loaded from non-stored fields loaded from a SQL database.  In this case, it may be nice to have locking/blocking happen at the processor level.

I don't know synchronized well enough to know if this works or is a bad idea, but what about something like:

class ModifyExistingDocumentProcessor {
  void processAdd(AddUpdateCommand cmd) {
    String id = cmd.getIndexedId(schema);
    synchronized( updateHandler ) {
       SolrDocument existing = updateHandler.loadStoredFields( id );
       myCustomHelper.loadTagsFromDB( existing );
       cmd.solrDoc = ModifyDocumentUtils.modifyDocument( ... );
       
       // This eventually calls: updateHandler.addDoc(cmd);
       super.processAdd(cmd);
    }
  }
}

This type of approach would need to make sure everyone modifying fields was locking on the same updateHandler.

- - - -

I'm not against adding a ModifyUpdateCommand, I just like having the modify logic sit outside the UpdateHandler.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623383#action_12623383 ] 

Noble Paul commented on SOLR-139:
---------------------------------

Shall I open another issue on my idea keeping another index for just stored fields (in an UpdateRequestProcessor). 

Is it a good idea to have multiple approaches for the same feature? 
Or should I post the patch in this issue only?

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512602 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> I like having the the actual document manipulation happening in the Processor because it is an easy 
> place to put in other things like grabbing stuff from a SQL database.

The update handler could call the processor when it was time to do the manipulation too.

Consider the case of future support for ParallelReader, where some fields are in one sub-index and some fields are in another sub-index.  I'm not sure if our current processor interfaces will be able to handle that scenario well (but I'm not sure if we should worry about it too much right now either).

There is another way to perhaps mitigate the problem of a few frequently modified documents causing thrashing: a document cache in the update handler.


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Fix Version/s: 1.3
       Issue Type: New Feature  (was: Improvement)

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.3
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616399#action_12616399 ] 

Noble Paul commented on SOLR-139:
---------------------------------

How about maintaining a separate index at store.index and write all the documents to that index also. 

In the store.index , all the fields must be stored and none will be indexed. Updating can be done by reading data from this. 

This way the original index will be small and the users do not have to make all fields stored

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470082 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------


> 
> That wouldn't work for multi-valued fields though, right?

'REMOVE' on a mult-valued field would clear the old fields before adding the new ones.  It is essentially the same as OVERWRITE, but you may or may not pass in a new value on top of the old one.

> If we keep this option, perhaps we should find a better name... 

how about 'IGNORE' or 'CLEAR'  It is awkward because it refers to what was in in the document before, not what you are passing in.

The more i think about it, I think we should drop the 'REMOVE' option.  You can get the same effect using 'OVERWRITE' and passing in a null value.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: Eriks-ModifiableDocument.patch

Updated Erik's patch to /trunk and added back the solrj modifiable document tests.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12601614#action_12601614 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

the biggest reason this patch won't work is that with SOLR-559, the DirectUpdateHandler2 does not keep track of pending updates -- to get this to work again, it will need to maintain a list somewhere.

Also, we need to make sure the API lets you grab stored fields from somewhere else -- as is, it forces you to store all fields for all documents.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Mike Klaas (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515424 ] 

Mike Klaas commented on SOLR-139:
---------------------------------

Darn, you're right: writer.addDocument() is outside of the synchronized block.

We could do as you suggested, downgrading to a read lock from commit.  It should only reduce concurrently when the document is in pending state.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512553 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

Some general issues w/ update processors and modifiable documents, and keeping this stuff out of the update handler is that the udpate handler knows much more about the index than we do outside, and it constrains implementation (and performance optimizations).

For example, if modifiable documents were implemented in the update handler, and the old version of the document hasn't been committed yet, the update handler could buffer the complete modify command to be done at a later time (the *much* slower alternative is closing the writer and opening the reader to get the latest stored fields), then closing the reader and re-opening the writer.


> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12525955 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

I'm back and have some time to focus on this...  Internally, I need to move the document modification to the update handler, but before I get going it would be nice to agree what we want the external interface to look like.

- - -

Should we deprecate the AddUpdateCommand and replace it with something else? Do we want one command to do Add/Update/Modify?  Two?

- - - 

what should happen if you "modify" a non-existent document?
a. error -- makes sense in a 'tagging' context
b. treat it as an add -- makes sense in a "keep these fields up to date" context (i don't want to check if the document already exists or not)

I happen to be working with context b, but I suspect 'a' makes more sense.

- - - 

Should we have different xml syntax for document modification vs add?  Erik suggested:

<update overwrite="title" distinct="cat">
 ...
</update>

since 'update' is already used in a few ways, maybe <modify>?

Should we put the id as an xml attribute?  this would make it possible to change the unique key.
  <modify id="ID">
   <field name="id">new id</field>
  </modify>
That may look weird if the uniqueKeyField is not called "id"

Assuming we put the modes as attributes, I guess multiple fields would be comma delimited?
 <modify distinct="cat,keyword">

Do you like the default mode called "default" or "mode"?
<modify id="ID" default="overwrite">?
<modify id="ID" mode="overwrite">?











> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12512628 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

> 
> ... avoid touching the less-modified/bigger fields ...
> 

aaah, perhaps a future updateHandler getDocument() function could take a list of fields it should extract.  Still problems with what to do when you add it.. maybe it checks if anything has changed in the less-modified index?  I see your point.

>> What are you thinking? Adding the processor as a parameter to AddUpdateCommand?
> 
> I didn't have a clear alternative... I was just pointing out the future pitfalls of assuming too much implementation knowledge.
> 

I am fine either way -- in the UpdateHandler or the Processors.  

Request plumbing-wise, it feels the most natural in a processor.  But if we rework the AddUpdateCommand it could fit there too.  I don't know if it is an advantage or disadvantage to have the 'modify' parameters tied to the command or the parameters.  either way has its +-, with no real winner (or loser) IMO

In the end, I want to make sure that I never need a custom UpdateHandler (80% is greek to me), but can easily change the 'modify' logic.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-139:
---------------------------------------

    Fix Version/s: 1.4

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>             Fix For: 1.4
>
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12515406 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

The locking logic for getStoredFields() is indeed flawed.
closing the writer inside the sync block of getStoredFields() doesn't project callers of addDoc() from concurrently using that writer.  The commit lock aquire will be needed after all... no getting around it I think.

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12623385#action_12623385 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

to me the key is getting an interface that would allow for the existing fields to be stored a number of ways:
 * within the index itself
 * within an independent index (as you suggest)
 * within SQL
 * on the file system
 * ...



> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, Eriks-ModifiableDocument.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, getStoredFields.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch, SOLR-269+139-ModifiableDocumentUpdateProcessor.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ryan McKinley updated SOLR-139:
-------------------------------

    Attachment: SOLR-139-IndexDocumentCommand.patch

* Updated to work with trunk.
* moved MODE enum to common

This could be added without affecting any call to the XmlUpdateHandler -- (for now) this only works with the StaxUpdateHandler.  Adding it will help clarify some of the DocumentBuilder needs/issues...

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yonik Seeley updated SOLR-139:
------------------------------

    Attachment:     (was: UpdateProcessor.patch)

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-ModifyInputDocuments.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509622 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> So you are suggesting [...]
 
I don't have a concrete implementation idea, I'm just going over all the things I know people will want to do (and many of these I have an immediate use for).

> Do you mean as input or output?

Input, for index-only fields.  Normally field values need to be stored for an "update" to work, but we could also allow the user to get these field values from an external source.

> we would need a hook at the end.

Yes, it might make sense to have more than one callback method per UpdateRequestProcessor

Of course now that I finally look at the code, UpdateRequestProcessor isn't quite what I expected.
I was originally thinking more along the lines of DocumentMutator(s) that manipulate a document, not that actually initiate the add/delete/udpate calls.  But there is a certain greater power to what you are exposing/allowing too (as long as you don't need multiple of them).

In UpdateRequestProcessor , instead of 
  protected final NamedList<Object> response;
Why not just expose SolrQueryRequest, SolrQueryResponse?




> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>            Assignee: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Ryan McKinley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470076 ] 

Ryan McKinley commented on SOLR-139:
------------------------------------

> 
> If the field modes were parameters, they could be reused for other update handlers like SQL or CSV
> Perhaps something like:
> /update/xml?modify=true&f.price.mode=increment,f.features.mode=append
> 

Yes.  I think we should decide a standard 'modify modes' syntax that can be used across handlers.  In this example, I am using the string:

mode=cat=DISTINCT,features=APPEND,price=INCREMENT,sku=REMOVE,OVERWRITE

and passing it to 'parseFieldModes' in XmlUpdateRequestHandler.

Personally, I think all the modes should be specified in a single param rather then a list of them.  I vote for a syntax like:

<lst name="params">
 <str name="mode">cat=DISTINCT,features=APPEND,price=INCREMENT,sku=REMOVE,OVERWRITE</str>
</lst>
 or:
<lst name="params">
 <str name="mode">cat:DISTINCT,features:APPEND,price:INCREMENT,sku:REMOVE,OVERWRITE</str>
</lst>

rather then:

<lst name="params">
 <str name="f.cat.mode">DISTINCT</str>
 <str name="f.features.mode">APPEND</str>
 <str name="f.price.mode">INCREMENT</str>
 <str name="f.sku.mode">REMOVE</str>
 <str name="modify.default.mode">OVERWRITE</str>
</lst>



>> sku=REMOVE is required because sku is a stored field that is written to with copyField.
> I'm not sure I quite grok what REMOVE means yet, and how it fixes the copyField problem.
> 

I'm using 'REMOVE' to say "remove the previous value of this field before doing anything."  Essentially, this makes sure you new document does not start with a value for 'sku'.


> Another way to work around copyField is to only collect stored fields that aren't copyField targets.  

I just implemented this.  It is the most normal case, so it should be the default.  It can be overridden by setting the mode for a copyField explicitly.




> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-139) Support updateable/modifiable documents

Posted by "Yonik Seeley (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12470084 ] 

Yonik Seeley commented on SOLR-139:
-----------------------------------

> I added a second searcher to DirectUpdatehandler2 that is only closed when you call commit();

That's OK for right now, but longer term we should re-use the other searcher.
Previously, the searcher was only used for deletions, hence we always had to close it before we opened a writer.
If it's going to be used for doc retrieval now, we should change closeSearcher() to flushDeletes() and only really close the searcher if there had been deletes pending (from our current deleteByQuery  implementation).

> Support updateable/modifiable documents
> ---------------------------------------
>
>                 Key: SOLR-139
>                 URL: https://issues.apache.org/jira/browse/SOLR-139
>             Project: Solr
>          Issue Type: Improvement
>          Components: update
>            Reporter: Ryan McKinley
>         Attachments: SOLR-139-IndexDocumentCommand.patch, SOLR-139-IndexDocumentCommand.patch, SOLR-139-XmlUpdater.patch
>
>
> It would be nice to be able to update some fields on a document without having to insert the entire document.
> Given the way lucene is structured, (for now) one can only modify stored fields.
> While we are at it, we can support incrementing an existing value - I think this only makes sense for numbers.
> for background, see:
> http://www.nabble.com/loading-many-documents-by-ID-tf3145666.html#a8722293

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.