Posted to dev@lucene.apache.org by "Per Steffensen (Created) (JIRA)" <ji...@apache.org> on 2012/02/28 15:39:46 UTC

[jira] [Created] (SOLR-3173) Database semantics - insert and update

Database semantics - insert and update
--------------------------------------

                 Key: SOLR-3173
                 URL: https://issues.apache.org/jira/browse/SOLR-3173
             Project: Solr
          Issue Type: New Feature
          Components: update
    Affects Versions: 3.5
         Environment: All
            Reporter: Per Steffensen
             Fix For: 4.0


In order to increase the ability of Solr to be used as a NoSQL database (lots of concurrent inserts, updates, deletes and queries throughout the lifetime of the index) instead of just a search index (first everything is indexed in one thread, afterwards only queries are run), I would like Solr to support the following features, inspired by RDBMSs and other NoSQL databases.
* Given a Solr core with a schema containing a uniqueKey field "uniqueField" and a document Dold, when trying to INSERT a new document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want a DocumentAlreadyExists error. If no such document Dold exists, I want Dnew indexed into the Solr core.
* Given a Solr core with a schema containing a uniqueKey field "uniqueField" and a document Dold, when trying to UPDATE a document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want Dold deleted from and Dnew added to the index (just as it is today). If no such document Dold exists, I want nothing to happen (Dnew is not added to the index).
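
For illustration only, the two intents can be sketched in Python with a toy in-memory map standing in for a Solr core. The Core class and its insert/update methods are hypothetical names, not Solr APIs; only the DocumentAlreadyExists error name is taken from the description above.

```python
class DocumentAlreadyExists(Exception):
    """Raised when an insert targets a uniqueKey value that is already indexed."""


class Core:
    """Toy in-memory stand-in for a Solr core, keyed on the uniqueKey field."""

    def __init__(self, unique_field="uniqueField"):
        self.unique_field = unique_field
        self.docs = {}  # uniqueKey value -> document

    def insert(self, doc):
        # INSERT intent: fail if a document with the same key already exists.
        key = doc[self.unique_field]
        if key in self.docs:
            raise DocumentAlreadyExists(key)
        self.docs[key] = doc

    def update(self, doc):
        # UPDATE intent: replace the old document, or do nothing if there is none.
        key = doc[self.unique_field]
        if key not in self.docs:
            return False
        self.docs[key] = doc
        return True
```

In this sketch update() returns a boolean so that a caller can tell whether anything was replaced; the description itself only requires that nothing is added.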

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219104#comment-13219104 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

Thanks for the great input, Yonik Seeley. I believe you are already commenting on the related Jira issue that I haven't created yet :-) This issue, SOLR-3173, is not so much about versioning; it is basically only about being able to state your intent (insert or update) on Solr updates, failing on "insert" intent if the document already exists, and doing nothing on "update" intent if the document does not already exist.

I want to reply a little to your comment here anyway:
I stumbled upon the _version_ field and wanted to ask what it is used for. Can you point me to a (Wiki) description explaining more exactly what it is used for and how? Otherwise I will need to read the code to make sure that I agree that we can just use that version number, or to what degree we can use it.

I am not sure I ALWAYS like optimistic locking on a per-request basis. It then depends "too much" on the clients using it "correctly", so it comes down to how much control you have over your clients. I have different thoughts on this area though. Those thoughts will soon be reflected in a change to the description of this issue and in the upcoming related issue (the one actually dealing with versions). Please be patient.

More information about _version_ would be greatly appreciated now, though :-)
                
> Database semantics - insert and update
> --------------------------------------
>
>                 Key: SOLR-3173
>                 URL: https://issues.apache.org/jira/browse/SOLR-3173
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, nosql, uniqueKey, update
>             Fix For: 4.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order to increase the ability of Solr to be used as a NoSQL database (lots of concurrent inserts, updates, deletes and queries throughout the lifetime of the index) instead of just a search index (first everything is indexed in one thread, afterwards only queries are run), I would like Solr to support the following features, inspired by RDBMSs and other NoSQL databases.
> * Given a Solr core with a schema containing a uniqueKey field "uniqueField" and a document Dold, when trying to INSERT a new document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want a DocumentAlreadyExists error. If no such document Dold exists, I want Dnew indexed into the Solr core.
> * Given a Solr core with a schema containing a uniqueKey field "uniqueField" and a document Dold, when trying to UPDATE a document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want Dold deleted from and Dnew added to the index (just as it is today). If no such document Dold exists, I want nothing to happen (Dnew is not added to the index).



[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218270#comment-13218270 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

I would like to set myself as "Assignee". How do I do that?
                


[jira] [Assigned] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Assigned) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Per Steffensen reassigned SOLR-3173:
------------------------------------

    Assignee: Per Steffensen
    


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260741#comment-13260741 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

See patch attached to SOLR-3178
                


[jira] [Issue Comment Edited] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219202#comment-13219202 ] 

Per Steffensen edited comment on SOLR-3173 at 2/29/12 1:49 PM:
---------------------------------------------------------------

bq. Optimistic locking as a superset to insert/update:
bq.
bq. What I already had in mind:
bq. - update only a specific version of the document by specifying its exact version:  _version_=12345
bq. - add a document only if it doesn't already exist (i.e. insert): _version_=-1
bq. - add a document regardless: don't specify a version

I still need a little time to evaluate to what extent _version_ can be used.

bq. So now that I look at it again, it looks like what's missing is your "UPDATE" semantics which would only replace the record if it already existed (a weaker form of the first case... any positive version is OK).  But I really wonder how useful those semantics are (only add a doc if it's overwriting an existing doc, regardless of what version or what data it contains?)
bq. If there are use cases, we certainly should be able to do it.

We need the only-insert-if-not-exists behavior. The only-update-if-exists behavior is mostly for consistency with what we know from RDBMSs. It basically simulates what happens when you run the following SQL statements against a table with a unique constraint on the id column: 1) fails with a unique-key constraint error if the document already exists, and 2) does not create the row/doc if it does not already exist.
1) INSERT INTO docs (id, column2, column3,...) VALUES (id-value, value2, value3,...)
and
2) UPDATE docs SET column2=value2, column3=value3, ... WHERE id=id-value
RDBMS people are used to an update operation that does not create a row/document if it has already been deleted. I will consider dropping that feature, since it is only there to give an experience consistent with what you are used to from RDBMSs. Still, seen from a distance, I think an "update" operation that creates things that do not exist is illogical (it simply does not follow from the word "update").
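
As a small illustration of the two SQL behaviors described above, here is a sketch using Python's built-in sqlite3 module. The docs table, its column names and the sample values are made up for the example and are not part of the proposal.

```python
import sqlite3

# In-memory database with a unique (primary key) constraint on the id column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id TEXT PRIMARY KEY, column2 TEXT, column3 TEXT)")
conn.execute("INSERT INTO docs (id, column2, column3) VALUES ('a', 'v2', 'v3')")

# 1) A second INSERT with the same id violates the unique constraint.
try:
    conn.execute("INSERT INTO docs (id, column2, column3) VALUES ('a', 'x2', 'x3')")
    insert_failed = False
except sqlite3.IntegrityError:
    insert_failed = True

# 2) An UPDATE for an id that does not exist changes no rows and creates nothing.
cursor = conn.execute(
    "UPDATE docs SET column2 = 'y2', column3 = 'y3' WHERE id = 'missing'"
)
rows_changed = cursor.rowcount
row_count = conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]
```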

Right now I believe the solution will be to offer the following URL extensions:
a) .../solr/.../update, the one that already exists in Solr, with unchanged semantics.
b) .../solr/.../database/update, which updates the document if it already exists and does nothing if it does not. When versioning is activated (SOLR-3178), it only updates if the correct version is given, and gives a VersionConflict error if the document exists but the version is not correct.
c) .../solr/.../database/insert, which creates a new document if the document does not already exist, and fails with a DocumentAlreadyExists error if it does.
Then you can keep using Solr exactly as you are used to, and you can start using the new "database semantics" features if you want them. I might create an optional config for DirectUpdateHandler2 where you can deactivate the stuff behind a). This can be used when you don't trust clients to use a) correctly in a setup where you want to ensure consistency under high concurrent load.

bq. As far as what _version_ is, it's new and used for SolrCloud to handle reorders of updates to replicas (among other things).
bq. The leader shard decides what the version of a document should be (versions only increase), and forwards the doc with the version to the replicas.
bq. If a replica receives the same doc with a lower version, it knows that it can safely drop it because it already has a newer version.

Cool. I understand a little better now. So no (Wiki) documentation written yet?
                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219175#comment-13219175 ] 

Yonik Seeley commented on SOLR-3173:
------------------------------------

bq. I believe you are already commenting on the related Jira issue that I haven't created yet

Optimistic locking as a superset to insert/update:

What I already had in mind:
- update only a specific version of the document by specifying its exact version:  _version_=12345
- add a document only if it doesn't already exist (i.e. insert): _version_=-1
- add a document regardless: don't specify a version
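
A minimal Python sketch of these three cases, assuming a toy in-memory store rather than any real Solr code. VersionedStore, its add() method and the simple counter used to hand out versions are hypothetical; the VersionConflict name is borrowed from the discussion in this issue.

```python
class VersionConflict(Exception):
    """Raised when the supplied version does not match the stored document."""


class VersionedStore:
    """Toy store mapping a document id to a (version, document) pair."""

    def __init__(self):
        self.docs = {}
        self.clock = 0  # assumption: a counter standing in for real version numbers

    def add(self, doc_id, doc, version=None):
        current = self.docs.get(doc_id)
        if version == -1:
            # Insert: only add if the document does not already exist.
            if current is not None:
                raise VersionConflict(f"{doc_id} already exists")
        elif version is not None:
            # Update one specific version: it must exist with exactly that version.
            if current is None or current[0] != version:
                raise VersionConflict(f"{doc_id} is not at version {version}")
        # version is None: add regardless of what is stored.
        self.clock += 1
        self.docs[doc_id] = (self.clock, doc)
        return self.clock
```

A caller would keep the version returned by add() and pass it back on the next update of the same document.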

So now that I look at it again, it looks like what's missing is your "UPDATE" semantics which would only replace the record if it already existed (a weaker form of the first case... any positive version is OK).  But I really wonder how useful those semantics are (only add a doc if it's overwriting an existing doc, regardless of what version or what data it contains?)

If there are use cases, we certainly should be able to do it.

As far as what _version_ is, it's new and used for SolrCloud to handle reorders of updates to replicas (among other things).
The leader shard decides what the version of a document should be (versions only increase), and forwards the doc with the version to the replicas.
If a replica receives the same doc with a lower version, it knows that it can safely drop it because it already has a newer version.
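
The replica-side rule can be sketched roughly as follows. This is an illustrative Python model only; the Replica class and its receive() method are invented for the example and do not correspond to actual SolrCloud classes.

```python
class Replica:
    """Toy replica that keeps, per document id, the highest version it has seen."""

    def __init__(self):
        self.docs = {}  # doc id -> (version, document)

    def receive(self, doc_id, version, doc):
        current = self.docs.get(doc_id)
        if current is not None and version < current[0]:
            # A newer version is already stored, so this reordered update is dropped.
            return False
        self.docs[doc_id] = (version, doc)
        return True
```

With this rule, two updates forwarded by the leader leave the replica in the same state whichever order they arrive in.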


                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218260#comment-13218260 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

See thread "Unique key constraint and optimistic locking (versioning)" on solr-user mailing list (started 21.02.2012)
                


[jira] [Updated] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Per Steffensen updated SOLR-3173:
---------------------------------

    Description: 
In order to increase the ability of Solr to be used as a NoSQL database (lots of concurrent inserts, updates, deletes and queries throughout the lifetime of the index) instead of just a search index (first everything is indexed in one thread, afterwards only queries are run), I would like Solr to support the following features, inspired by RDBMSs and other NoSQL databases.
* Given a Solr core with a schema containing a uniqueKey field "uniqueField" and a document Dold, when trying to INSERT a new document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want a DocumentAlreadyExists error. If no such document Dold exists, I want Dnew indexed into the Solr core.
* Given a Solr core with a schema containing a uniqueKey field "uniqueField" and a document Dold, when trying to UPDATE a document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want Dold deleted from and Dnew added to the index (just as it is today). If no such document Dold exists, I want nothing to happen (Dnew is not added to the index).

The essence of this issue is to be able to state your intent (insert or update) and get slightly different semantics (different from each other and from the existing update) depending on your intent.

The functionality provided by this issue is only really meaningful when you run with "updateLog" activated.

This issue might be solved at more or less the same time as SOLR-3178, and a single SVN patch might be provided to cover both issues.

> Database semantics - insert and update
> --------------------------------------
>
>                 Key: SOLR-3173
>                 URL: https://issues.apache.org/jira/browse/SOLR-3173
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, nosql, uniqueKey, update
>             Fix For: 4.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


[jira] [Issue Comment Edited] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Issue Comment Edited) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218254#comment-13218254 ] 

Per Steffensen edited comment on SOLR-3173 at 2/28/12 3:22 PM:
---------------------------------------------------------------

Impl thoughts:
* In solrconfig.xml, turn this feature on by adding a <databaseSemantics> tag (probably inside <updateHandler>). When this "flag" is not turned on, "update"-requests work as always (add documents if they do not exist). With this "flag" turned on, "update"-requests do not create a document if it does not already exist. Also, when this "flag" is turned on, "insert"-requests become possible - they do what "update"-requests do today, except that they return a DocumentAlreadyExist error if the document already exists.
* Proper concurrency handling
* Be careful using this feature unless you are running with updateLog turned on (and therefore will never "lose" already accepted updates/deletes on crash)

                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221008#comment-13221008 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

Hi again

Have most of it coded by now. No tests yet though. I simply wasn't able to create tests before I knew what kind of changes to make and where to make them.

Yonik Seeley, I really hope for your help on the best way (quickest), from inside DirectUpdateHandler2 (e.g. addDoc method), to realtime-get the newest _version_ number of the document in cmd (using its idField) - basically getRealtimeVersion(id) in comment above. Please help with some code or a few hints.
I would also really like you to confirm or correct me on a) through d) in comment above.

I have been playing a little with some code to get the newest document with same uniqueKey (idField) value as the document in cmd:
          SearchComponent realTimeGetComponent = core.getSearchComponent(RealTimeGetComponent.COMPONENT_NAME);
          SolrQueryRequest req = new SolrQueryRequestBase(core, ???) {};
          ResponseBuilder rb = ???; 
          realTimeGetComponent.prepare(rb);
          realTimeGetComponent.process(rb);
          long currentVersion = rb.???;
But I am in doubt about several things: Is there a more direct way than getting the SearchComponent from the core and using that? Exactly what should I put in as SolrParams? How do I create a ResponseBuilder from req? Am I allowed to just call prepare and process on a SearchComponent, or does that have to be handled by some framework that does more? How do I get the currentVersion from rb after processing the query? In general, how exactly should this be done?
I will make it work, but you can really save me some time and help me get it right and as efficient as possible on the first go.
Thanks in advance!

Regards, Per Steffensen

                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13219202#comment-13219202 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------


bq. Optimistic locking as a superset to insert/update:
bq.
bq. What I already had in mind:
bq. - update only a specific version of the document by specifying it's exact version:  _version_=12345
bq. - add a document only if it doesn't already exist (i.e. insert): _version_=-1
bq. - add a document regardless: don't specify a version

I still need a little time to evaluate to what extent _version_ can be used.

bq. So now that I look at it again, it looks like what's missing is your "UPDATE" semantics which would only replace the record if it already existed (a weaker form of the first case... any positive version is OK).  But I really wonder how useful those semantics are (only add a doc if it's overwriting an existing doc, regardless of what version or what data it contains?)
bq. If there are usecases, we certainly should be able to do it.

We need the only-insert-if-not-exists semantics. The only-update-if-exists semantics is mostly for consistency with what we know from RDBMSs. Basically it simulates what happens when you do the following in SQL with a unique constraint on the id column: 1) will fail with a unique-key constraint error if the row already exists, and 2) will not create the row/doc if it does not already exist.
1) INSERT INTO docs (id, column2, column3,...) VALUES (id-value, value2, value3,...)
and
2) UPDATE docs SET column2=value2, column3=value3, ... WHERE id=id-value
RDBMS people are used to an update operation that does not create a row/document if it has already been deleted. I will consider not making that feature - it is only there to give an experience consistent with what you are used to from RDBMSs. Actually, seen from a distance, I think an "update"-operation that creates stuff if it does not exist is not logical (it simply does not follow from the word "update").

Right now I believe the solution will be the following URL extensions:
a) .../solr/.../update, the one already existing in Solr, with unchanged semantics
b) .../solr/.../database/update, which updates if the document already exists and does nothing if it does not. When versioning is activated (SOLR-3178), it only updates if the correct version is given, and gives a VersionConflict error if the document exists but the version is not correct.
c) .../solr/.../database/insert, which creates a new document if the document does not already exist, and fails with a DocumentAlreadyExists error if it does.
Then you can keep using Solr exactly as you are used to, and you can start using the new "database semantics" features if you want. I might create an optional config for DirectUpdateHandler2 where you can deactivate the stuff behind a). This can be used when you don't trust clients to use a) correctly in a setup where you want to ensure consistency under high concurrent load.

bq. As far as what \_version\_ is, it's new and used for solrcloud to handle reorders of updates to replicas (among other things).
bq. The leader shard decides what the version of a document should be (versions only increase), and forwards the doc with the version to the replicas.
bq. If a replica receives the same doc with a lower version, it knows that it can safely drop it because it already has a newer version.

Cool. I understand a little better now. So no (Wiki) documentation written yet?
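
The reordering rule quoted above (a replica drops a document when it already holds a newer version) can be sketched roughly as follows. This is a toy model, not SolrCloud code: the class ReplicaSketch and its methods are hypothetical, and treating an equal version as a duplicate to be dropped is an assumption that the quoted text does not state.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy replica that applies forwarded updates in whatever order they arrive. Hypothetical. */
class ReplicaSketch {
    private final Map<String, Long> versions = new HashMap<>();
    private final Map<String, String> bodies = new HashMap<>();

    /**
     * Applies an update forwarded by the leader.
     * Returns false when the update is dropped because a version at least as new is already held.
     */
    synchronized boolean apply(String id, long version, String body) {
        Long current = versions.get(id);
        if (current != null && current >= version) {
            return false; // stale or duplicate update: safe to ignore
        }
        versions.put(id, version);
        bodies.put(id, body);
        return true;
    }

    synchronized String get(String id) {
        return bodies.get(id);
    }
}
```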
                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220062#comment-13220062 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

Believe we will be able to use _version_ if:
a) There is a "realtime" way of getting the _version_ corresponding to a given id (or whatever you use as uniqueKey). Lets call this getRealtimeVersion(id)
b) The _version_ for a given id returned by getRealtimeVersion(id) never changes unless changes have been made to the document with that id (created, updated or deleted)
c) getRealtimeVersion(id) will immediately return the new _version_ as soon as a change has been made - no soft- or hard-commit necessary. Well, that is the realtime part :-)
d) I will always get a negative number (hopefully always -1) from getRealtimeVersion(id) when calling with an id, where there is no corresponding document in the solr-core. No matter if there has never been such a document or if it has been there but has been deleted.

Can you please confirm or correct me on the above bullets, Yonik? It would also be very helpful if you would provide the code for getRealtimeVersion(id), assuming that I am in the DirectUpdateHandler2. Thanks a lot!
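
To make conditions a) through d) concrete, here is a toy in-memory model of the contract I am hoping for. It says nothing about how Solr actually implements this: the class RealtimeVersionSketch and the put/delete methods are hypothetical, and only the name getRealtimeVersion comes from the bullets above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the hoped-for getRealtimeVersion(id) contract. Hypothetical, not Solr code. */
class RealtimeVersionSketch {
    /** d) value returned when no document exists for the id. */
    static final long NO_DOCUMENT = -1L;

    private final Map<String, Long> liveVersions = new HashMap<>();
    private long lastVersion = 0L;

    /** Create or replace a document; it gets a new, strictly higher version. */
    synchronized long put(String id) {
        long version = ++lastVersion;
        liveVersions.put(id, version);
        return version;
    }

    /** Delete a document; afterwards the id looks as if it never existed. */
    synchronized void delete(String id) {
        liveVersions.remove(id);
    }

    /**
     * a) realtime lookup by id,
     * b) stable until the document changes,
     * c) reflects a change immediately (no commit step exists in this model),
     * d) negative when there is no document, whether never added or deleted.
     */
    synchronized long getRealtimeVersion(String id) {
        return liveVersions.getOrDefault(id, NO_DOCUMENT);
    }
}
```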

Guess this version-checking stuff is only necessary on primary (or master or whatever you call it) shards and not on replica (or slave). How do I know in DirectUpdateHandler2 if I am primary/master- or replica/slave-shard?

Regret a little bit the idea about different URLs stated in comment above. Guess I would just like to state info about the wanted semantics in the query in some other way. I guess it would be nice with a "semantics" URL-param with the possible values "db-insert", "db-update", "db-update-version-checked" and "classic-solr-update":
- semantics=db-insert: Index document doc if and only if getRealtimeVersion(doc.id) returns -1. Else return DocumentAlreadyExist error
- semantics=db-update: Replace existing document if it exists, else return DocumentDoesNotExist error
- semantics=db-update-version-checked: As db-update but if _version_ on the provided document does not correspond to existing getRealtimeVersion(doc.id) return VersionConflict error
- semantics=classic-solr-update: Do exactly as update does today in Solr
"classic-solr-update" will be used if "semantics" is not specified in update request - it is the default. In solrconfig.xml you will be able to change default semantics plus provide a list of semantics that are not allowed. 
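
As a rough sketch of how the four values could map onto decisions: the parameter values and error names are the ones listed above, while the Java type names (UpdateSemantics, SemanticsOutcome, SemanticsCheck) are made up for illustration. The sketch assumes the current version comes from something like getRealtimeVersion(doc.id) and that any negative value means "no document".

```java
/** The proposed values of the "semantics" URL-param. Type name is hypothetical. */
enum UpdateSemantics {
    DB_INSERT("db-insert"),
    DB_UPDATE("db-update"),
    DB_UPDATE_VERSION_CHECKED("db-update-version-checked"),
    CLASSIC_SOLR_UPDATE("classic-solr-update");

    final String paramValue;

    UpdateSemantics(String paramValue) {
        this.paramValue = paramValue;
    }

    /** A missing param falls back to the default, classic-solr-update. */
    static UpdateSemantics fromParam(String value) {
        if (value == null) {
            return CLASSIC_SOLR_UPDATE;
        }
        for (UpdateSemantics s : values()) {
            if (s.paramValue.equals(value)) {
                return s;
            }
        }
        throw new IllegalArgumentException("Unknown semantics: " + value);
    }
}

/** Possible results of the check; the error names follow the list above. */
enum SemanticsOutcome { INDEXED, DOCUMENT_ALREADY_EXISTS, DOCUMENT_DOES_NOT_EXIST, VERSION_CONFLICT }

final class SemanticsCheck {
    private SemanticsCheck() {}

    /**
     * currentVersion: realtime version of the stored document, negative if none exists (assumption).
     * suppliedVersion: _version_ on the incoming document, only used for the version-checked case.
     */
    static SemanticsOutcome decide(UpdateSemantics semantics, long currentVersion, long suppliedVersion) {
        boolean exists = currentVersion >= 0;
        switch (semantics) {
            case DB_INSERT:
                return exists ? SemanticsOutcome.DOCUMENT_ALREADY_EXISTS : SemanticsOutcome.INDEXED;
            case DB_UPDATE:
                return exists ? SemanticsOutcome.INDEXED : SemanticsOutcome.DOCUMENT_DOES_NOT_EXIST;
            case DB_UPDATE_VERSION_CHECKED:
                if (!exists) {
                    return SemanticsOutcome.DOCUMENT_DOES_NOT_EXIST;
                }
                return suppliedVersion == currentVersion
                        ? SemanticsOutcome.INDEXED
                        : SemanticsOutcome.VERSION_CONFLICT;
            default:
                // classic-solr-update: add or overwrite regardless
                return SemanticsOutcome.INDEXED;
        }
    }
}
```

The configurable default and the list of disallowed semantics from solrconfig.xml are left out of the sketch.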

Regards, Per Steffensen
                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Steven Rowe (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218327#comment-13218327 ] 

Steven Rowe commented on SOLR-3173:
-----------------------------------

Hi Per, I've added you to the JIRA Solr contributor group, which enables you to be assigned to JIRA issues.
                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218254#comment-13218254 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

Impl thoughts:
* In solrconfig.xml, turn this feature on by adding a <databaseSemantics> tag (probably inside <updateHandler>). When this "flag" is not turned on, "update"-requests work as always (create if not exists). With this "flag" turned on, "update"-requests do not create a document if it does not already exist. Also, when this "flag" is turned on, "insert"-requests become possible - they do what "update"-requests do today, except that they return a DocumentAlreadyExist error if the document already exists.
* Proper concurrency handling
* Be careful using this feature unless you are running with updateLog turned on (and therefore will never "lose" already accepted updates/deletes on crash)

                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Yonik Seeley (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13218571#comment-13218571 ] 

Yonik Seeley commented on SOLR-3173:
------------------------------------

Here's an idea: we already have a _version_ field for documents (that can also be passed in the URL for other things like deletes), we simply reuse that.  Positive versions are adds, negative versions are deletes.

If a document comes into the shard leader and already has a _version_, then it's considered an optimistic concurrency request... the document should be replacing an existing document with exactly that version.  If the _version_ passed is negative, then the document should not already exist (all deleted documents are considered equal).

No new config needed, and optimistic concurrency can be selected on a per-request basis to the same handler.
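
A minimal sketch of that rule, as a toy in-memory leader rather than actual Solr code. The class OptimisticStoreSketch, its exception type and the use of a null argument for "no version specified" are all assumptions made for illustration; a passed version of 0 is treated like "not specified", which the comment above does not define.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy leader that assigns increasing versions and checks a requested version. Hypothetical. */
class OptimisticStoreSketch {

    /** Hypothetical error for a failed optimistic concurrency check. */
    static class VersionConflictException extends RuntimeException {
        VersionConflictException(String message) {
            super(message);
        }
    }

    private final Map<String, Long> versions = new HashMap<>();
    private long nextVersion = 1L;

    /**
     * requestedVersion == null: add regardless.
     * requestedVersion  < 0   : the document must not already exist.
     * requestedVersion  > 0   : the existing document must have exactly that version.
     * Returns the newly assigned version.
     */
    synchronized long add(String id, Long requestedVersion) {
        Long current = versions.get(id);
        if (requestedVersion != null) {
            if (requestedVersion < 0 && current != null) {
                throw new VersionConflictException("Document already exists: " + id);
            }
            if (requestedVersion > 0 && !requestedVersion.equals(current)) {
                throw new VersionConflictException("Expected version " + requestedVersion + " but found " + current);
            }
        }
        long assigned = nextVersion++;
        versions.put(id, assigned);
        return assigned;
    }

    /** After a delete the id behaves as if the document never existed. */
    synchronized void delete(String id) {
        versions.remove(id);
    }
}
```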
                


[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267474#comment-13267474 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

New patch available as part of SOLR-3173_3178_3382_3428_plus.patch on SOLR-3178
                
> Database semantics - insert and update
> --------------------------------------
>
>                 Key: SOLR-3173
>                 URL: https://issues.apache.org/jira/browse/SOLR-3173
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, nosql, uniqueKey, update
>             Fix For: 4.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order to increase the ability of Solr to be used as a NoSQL database (lots of concurrent inserts, updates, deletes and queries in the entire lifetime of the index) instead of just a search index (first: everything indexed (in one thread), after: only queries), I would like Solr to support the following features inspired by RDBMSs and other NoSQL databases.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" and a document Dold, when trying to INSERT a new document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, then I want a DocumentAlreadyExists error. If no such document Dold exists, I want Dnew indexed into the solr-core.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" and a document Dold, when trying to UPDATE with a document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want Dold deleted from and Dnew added to the index (just as it is today). If no such document Dold exists, I want nothing to happen (Dnew is not added to the index).
> The essence of this issue is to be able to state your intent (insert or update) and have slightly different semantics (from each other and from the existing update) depending on your intent.
> The functionality provided by this issue is only really meaningful when you run with "updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3178, and only one single SVN patch might be given to cover both issues.
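
The insert and update semantics requested in the description can be illustrated with a minimal sketch. It is only an in-memory model, not Solr code and not the patch: the class TinyIndex, its methods, and the exception class are hypothetical names chosen for illustration, and the sketch ignores concurrency, versioning, and the updateLog.

```python
class DocumentAlreadyExists(Exception):
    """Hypothetical error: an insert hit a unique key that is already present."""


class TinyIndex:
    """Hypothetical in-memory stand-in for a solr-core, keyed by its uniqueKey field."""

    def __init__(self, unique_field="uniqueField"):
        self.unique_field = unique_field
        self.docs = {}  # unique key value -> document (a plain dict)

    def insert(self, doc):
        """Add doc only if no document with the same unique key exists."""
        key = doc[self.unique_field]
        if key in self.docs:
            raise DocumentAlreadyExists(key)
        self.docs[key] = doc

    def update(self, doc):
        """Replace an existing document; do nothing if there is none.

        Returns True if a document was replaced, False otherwise.
        """
        key = doc[self.unique_field]
        if key not in self.docs:
            return False  # Dnew is not added to the index
        self.docs[key] = doc  # Dold is dropped, Dnew takes its place
        return True
```

With this model, inserting a second document with the same uniqueField value raises the error, while updating a key that was never inserted leaves the index unchanged.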



[jira] [Updated] (SOLR-3173) Database semantics - insert and update

Posted by "Mark Miller (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller updated SOLR-3173:
------------------------------

    Fix Version/s:     (was: 4.0)
                   5.0
                   4.1
    
> Database semantics - insert and update
> --------------------------------------
>
>                 Key: SOLR-3173
>                 URL: https://issues.apache.org/jira/browse/SOLR-3173
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, nosql, uniqueKey, update
>             Fix For: 4.1, 5.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order to increase the ability of Solr to be used as a NoSQL database (lots of concurrent inserts, updates, deletes and queries in the entire lifetime of the index) instead of just a search index (first: everything indexed (in one thread), after: only queries), I would like Solr to support the following features inspired by RDBMSs and other NoSQL databases.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" and a document Dold, when trying to INSERT a new document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, then I want a DocumentAlreadyExists error. If no such document Dold exists, I want Dnew indexed into the solr-core.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" and a document Dold, when trying to UPDATE with a document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want Dold deleted from and Dnew added to the index (just as it is today). If no such document Dold exists, I want nothing to happen (Dnew is not added to the index).
> The essence of this issue is to be able to state your intent (insert or update) and have slightly different semantics (from each other and from the existing update) depending on your intent.
> The functionality provided by this issue is only really meaningful when you run with "updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3178, and only one single SVN patch might be given to cover both issues.



[jira] [Commented] (SOLR-3173) Database semantics - insert and update

Posted by "Per Steffensen (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13226442#comment-13226442 ] 

Per Steffensen commented on SOLR-3173:
--------------------------------------

More detailed descriptions of the added features in SOLR-3173 and SOLR-3178 here: http://wiki.apache.org/solr/Per%20Steffensen/Update%20semantics
                
> Database semantics - insert and update
> --------------------------------------
>
>                 Key: SOLR-3173
>                 URL: https://issues.apache.org/jira/browse/SOLR-3173
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 3.5
>         Environment: All
>            Reporter: Per Steffensen
>            Assignee: Per Steffensen
>              Labels: RDBMS, insert, nosql, uniqueKey, update
>             Fix For: 4.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In order to increase the ability of Solr to be used as a NoSQL database (lots of concurrent inserts, updates, deletes and queries in the entire lifetime of the index) instead of just a search index (first: everything indexed (in one thread), after: only queries), I would like Solr to support the following features inspired by RDBMSs and other NoSQL databases.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" and a document Dold, when trying to INSERT a new document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, then I want a DocumentAlreadyExists error. If no such document Dold exists, I want Dnew indexed into the solr-core.
> * Given a solr-core with a schema containing a uniqueKey-field "uniqueField" and a document Dold, when trying to UPDATE with a document Dnew where Dold.uniqueField is equal to Dnew.uniqueField, I want Dold deleted from and Dnew added to the index (just as it is today). If no such document Dold exists, I want nothing to happen (Dnew is not added to the index).
> The essence of this issue is to be able to state your intent (insert or update) and have slightly different semantics (from each other and from the existing update) depending on your intent.
> The functionality provided by this issue is only really meaningful when you run with "updateLog" activated.
> This issue might be solved more or less at the same time as SOLR-3178, and only one single SVN patch might be given to cover both issues.
