Posted to solr-dev@lucene.apache.org by "Simon Lachinger (JIRA)" <ji...@apache.org> on 2010/01/07 15:57:54 UTC

[jira] Created: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Allowing import / update of a specific document using the data import handler
-----------------------------------------------------------------------------

                 Key: SOLR-1708
                 URL: https://issues.apache.org/jira/browse/SOLR-1708
             Project: Solr
          Issue Type: New Feature
          Components: contrib - DataImportHandler
    Affects Versions: 1.4
            Reporter: Simon Lachinger
         Attachments: 02-single-update.patch

There is a need for changes or new documents to be added to the Solr index immediately. This could easily be done via the update-handler; however, when using the DataImportHandler, it shouldn't be necessary to specify the data extraction for the DataImportHandler and also do it by feeding it into the update-handler. It should be centralized.

Having to run a delta query to identify the changes, when the IDs of the updated documents are already known to the application, is a rather costly way (in terms of database load) to solve this.

The attached patch allows specifying one or more query parameters named 'root-pk' for the delta-import command, which identify the document(s) to be updated or added.
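
As a rough illustration of the request shape described above, the following Python sketch builds a delta-import URL that repeats the 'root-pk' parameter once per document. The endpoint is a placeholder, the helper name build_delta_import_url is made up for this sketch, and 'root-pk' is assumed to have an effect only on a Solr build with the attached patch applied.

```python
from urllib.parse import urlencode


def build_delta_import_url(dataimport_url, root_pks):
    """Return a delta-import URL carrying one 'root-pk' parameter per key."""
    params = [("command", "delta-import")]
    # Repeat the parameter for each document instead of joining the keys.
    params.extend(("root-pk", str(pk)) for pk in root_pks)
    return dataimport_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder endpoint; adjust host, port and core name to your setup.
    print(build_delta_import_url(
        "http://localhost:8080/solr/locations-eng/dataimport", [967, 968]))
```

The application that already knows which IDs changed would issue a GET request to the resulting URL instead of waiting for the next scheduled delta-import.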

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Simon Lachinger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799170#action_12799170 ] 

Simon Lachinger commented on SOLR-1708:
---------------------------------------

Still missing: Tests.



[jira] Commented: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Simon Lachinger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799169#action_12799169 ] 

Simon Lachinger commented on SOLR-1708:
---------------------------------------

The last_index_time is no longer updated when using the delta-import command like this:
http://localhost:8080/solr/locations-eng/dataimport?command=delta-import&root-pk=967&indent=on

Thus, other changes will still be taken into account on the next regular delta-import.
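
To make the reasoning concrete, here is a toy Python model (not DIH code) of why leaving the timestamp untouched matters. The class ToyImporter, its fields, and the integer "timestamps" are all invented for this sketch; the only behaviour taken from the comment above is that a targeted import does not advance the last index time.

```python
from dataclasses import dataclass, field


@dataclass
class ToyImporter:
    """Toy stand-in for an importer that tracks a last-index timestamp."""
    rows: dict                      # pk -> last-modified time (plain integers)
    last_index_time: int = 0
    indexed: dict = field(default_factory=dict)  # pk -> modified time when indexed

    def import_by_pk(self, pks):
        # Targeted import: index only the given keys, leave the timestamp alone.
        for pk in pks:
            self.indexed[pk] = self.rows[pk]

    def delta_import(self, now):
        # Regular delta: index everything modified since the last run, then advance.
        changed = sorted(pk for pk, modified in self.rows.items()
                         if modified > self.last_index_time)
        for pk in changed:
            self.indexed[pk] = self.rows[pk]
        self.last_index_time = now
        return changed


if __name__ == "__main__":
    importer = ToyImporter(rows={967: 5, 968: 7}, last_index_time=3)
    importer.import_by_pk([967])
    # Row 968 was also modified after time 3 and is still found by the next delta.
    print(importer.delta_import(now=10))
```

Had the targeted import advanced the timestamp past 7, the change to row 968 would have been skipped by the next regular run in this model.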




[jira] Commented: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Simon Lachinger (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799184#action_12799184 ] 

Simon Lachinger commented on SOLR-1708:
---------------------------------------

Well, the solution with ${dih.request.root-pk} generates duplicate configurations if users want to use full-import, delta-import, and imports with root-pk. And I doubt it will handle the last_index_time correctly. This _kind_ of problem actually seems to be a pretty basic requirement. All the workarounds I could google or think of required creating and maintaining duplicates of the import queries, which is quite bad.

I guess it would be better to have an additional command to allow these actions?



[jira] Updated: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Simon Lachinger (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Lachinger updated SOLR-1708:
----------------------------------

    Description: 
There is a need for changes or new documents to be added to the Solr index immediately. This could easily be done via the update-handler; however, when using the DataImportHandler, it shouldn't be necessary to specify the data extraction for the DataImportHandler and also for feeding it into the update-handler. It should be centralized.

Having to run a delta query to identify the changes, when the IDs of the updated documents are already known to the application, is a rather costly way (in terms of database load) to solve this.

The attached patch allows specifying one or more query parameters named 'root-pk' for the delta-import command, which identify the document(s) to be updated or added.

  was:
There is a need for changes or new documents to be added to the Solr index immediately. This could easily be done via the update-handler; however, when using the DataImportHandler, it shouldn't be necessary to specify the data extraction for the DataImportHandler and also do it by feeding it into the update-handler. It should be centralized.

Having to run a delta query to identify the changes, when the IDs of the updated documents are already known to the application, is a rather costly way (in terms of database load) to solve this.

The attached patch allows specifying one or more query parameters named 'root-pk' for the delta-import command, which identify the document(s) to be updated or added.




[jira] Updated: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Simon Lachinger (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Lachinger updated SOLR-1708:
----------------------------------

    Attachment: 02-single-update.patch

Fixed: last_index_time is no longer updated for updates made via root-pk.



[jira] Commented: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12799179#action_12799179 ] 

Noble Paul commented on SOLR-1708:
----------------------------------

This does not seem to be a good way to solve your problem. You can pass your root-pk as a request parameter and use it in your queries directly:

${dih.request.root-pk}

We cannot change the core of DIH for this kind of problem.
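
To show what using the parameter "in your queries directly" amounts to, here is a small Python sketch that mimics the textual substitution of a ${dih.request.NAME} placeholder. It is not DIH's actual variable resolver; the function name, the table name locations and the column name id are hypothetical and chosen only for illustration.

```python
import re

# Matches placeholders of the form ${dih.request.NAME} and captures NAME.
PLACEHOLDER = re.compile(r"\$\{dih\.request\.([^}]+)\}")


def resolve_request_params(query_template, request_params):
    """Replace ${dih.request.NAME} with the matching request parameter value."""
    def substitute(match):
        # Unknown parameters resolve to an empty string in this sketch.
        return str(request_params.get(match.group(1), ""))
    return PLACEHOLDER.sub(substitute, query_template)


if __name__ == "__main__":
    # Hypothetical table and column names, for illustration only.
    template = "SELECT * FROM locations WHERE id = '${dih.request.root-pk}'"
    print(resolve_request_params(template, {"root-pk": "967"}))
```

Because the value is spliced into the query as plain text in this sketch, the calling application would need to validate it before sending the request.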



[jira] Updated: (SOLR-1708) Allowing import / update of a specific document using the data import handler

Posted by "Simon Lachinger (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-1708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Lachinger updated SOLR-1708:
----------------------------------

    Attachment: 02-single-update.patch
