Posted to solr-dev@lucene.apache.org by "Wojtek Piaseczny (JIRA)" <ji...@apache.org> on 2009/01/21 18:59:59 UTC

[jira] Created: (SOLR-974) DataImportHandler should not commit if no data has been updated

DataImportHandler should not commit if no data has been updated
---------------------------------------------------------------

                 Key: SOLR-974
                 URL: https://issues.apache.org/jira/browse/SOLR-974
             Project: Solr
          Issue Type: Improvement
          Components: contrib - DataImportHandler
    Affects Versions: 1.3
            Reporter: Wojtek Piaseczny
            Priority: Minor
             Fix For: 1.4


The DataImportHandler always finishes an import with a commit, even if it retrieved no data from its data source. Add a short circuit to not commit if no data was imported.

Related discussion:
http://www.nabble.com/Performance-Hit-for-Zero-Record-Dataimport-td21572935.html

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-974:
---------------------------------------

    Attachment: SOLR-974.patch

Changes
# If the command is delta-import and the 'clean' parameter is false or not specified, commit is not called when no documents were created and none were identified for deletion.
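
The rule in this first patch can be read as a single predicate. The following is only a minimal sketch of that reading, not code from the attached patch: the class name CommitDecision, the method shouldCommit, and its parameters are all hypothetical.

```java
// Hypothetical sketch; none of these names are taken from the SOLR-974 patch.
final class CommitDecision {

    /**
     * Returns false only in the case described above: a delta-import with
     * clean=false (or clean not specified) that created no documents and
     * identified none for deletion. Every other case still commits.
     */
    static boolean shouldCommit(String command, Boolean clean,
                                long docsCreated, long docsToDelete) {
        boolean isDelta = "delta-import".equals(command);
        boolean cleanRequested = clean != null && clean; // unset is treated like false
        boolean nothingChanged = docsCreated == 0 && docsToDelete == 0;
        return !(isDelta && !cleanRequested && nothingChanged);
    }

    public static void main(String[] args) {
        System.out.println(shouldCommit("delta-import", null, 0, 0));  // false
        System.out.println(shouldCommit("delta-import", null, 3, 0));  // true
        System.out.println(shouldCommit("full-import", false, 0, 0));  // true
    }
}
```

Under this rule a full-import always commits, even when it imported nothing.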



[jira] Commented: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667600#action_12667600 ] 

Shalin Shekhar Mangar commented on SOLR-974:
--------------------------------------------

> I guess the best thing is to expose the 'stats' as a variable in DIH. This can also be exposed through the Context#getStats()

I like this idea. I'll give a patch.



[jira] Updated: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar updated SOLR-974:
---------------------------------------

    Attachment: SOLR-974.patch

Changed to skip commit if no documents were created.

Note -- the onImportEnd event listener is still invoked even if no documents were created and commit was skipped. I think that is alright.
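
To illustrate the ordering described in this note, here is a minimal sketch assuming a simplified import run. ImportRun, finish, and the recorded event names are hypothetical and do not correspond to DIH classes; the sketch only shows that the commit is conditional while the end-of-import notification is not.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: ImportRun and its members are illustrative names, not DIH classes.
final class ImportRun {
    // Records what happened at the end of the run, in order.
    final List<String> events = new ArrayList<>();

    /** Finishes an import that created docsCreated documents. */
    void finish(long docsCreated) {
        if (docsCreated > 0) {
            events.add("commit");      // commit only when something was created
        }
        events.add("onImportEnd");     // the listener is notified either way
    }

    public static void main(String[] args) {
        ImportRun empty = new ImportRun();
        empty.finish(0);
        System.out.println(empty.events);    // [onImportEnd]

        ImportRun nonEmpty = new ImportRun();
        nonEmpty.finish(12);
        System.out.println(nonEmpty.events); // [commit, onImportEnd]
    }
}
```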



[jira] Commented: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Wojtek Piaseczny (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667306#action_12667306 ] 

Wojtek Piaseczny commented on SOLR-974:
---------------------------------------

Why only if the command is delta-import? I'm managing my updates within my DB, so I'm always using the full-import command.



[jira] Commented: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Noble Paul (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667578#action_12667578 ] 

Noble Paul commented on SOLR-974:
---------------------------------

I guess the best thing is to expose the 'stats' as a variable in DIH. This can also be exposed through the Context#getStats()
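
A rough sketch of what this suggestion could look like from a listener's point of view, under stated assumptions: StatsContext is a hypothetical stand-in for the DIH Context, and the "docCount" key is an assumed name for the number of documents created. Neither is confirmed by this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; StatsContext stands in for a Context that exposes stats.
final class StatsAwareListener {

    /** Minimal stand-in for a context that exposes import statistics. */
    interface StatsContext {
        Map<String, Object> getStats();
    }

    /** True when the stats report at least one created document. */
    static boolean importedAnything(StatsContext ctx) {
        Object count = ctx.getStats().get("docCount"); // key name is an assumption
        return count instanceof Number && ((Number) count).longValue() > 0;
    }

    public static void main(String[] args) {
        Map<String, Object> stats = new HashMap<>();
        stats.put("docCount", 0L);
        StatsContext ctx = () -> stats;
        System.out.println(importedAnything(ctx)); // false

        stats.put("docCount", 5L);
        System.out.println(importedAnything(ctx)); // true
    }
}
```

A listener written this way could return early when nothing was imported, which addresses the concern about onImportEnd firing after an empty import.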




[jira] Commented: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667322#action_12667322 ] 

Shalin Shekhar Mangar commented on SOLR-974:
--------------------------------------------

No, nothing right now. The XML response would say that no documents were created. However, one can add a postCommit or newSearcher listener if a commit is all you are interested in.



[jira] Commented: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Kay Kay (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667314#action_12667314 ] 

Kay Kay commented on SOLR-974:
------------------------------

> Note -- the onImportEnd event listener is still invoked even if no documents were created and commit was skipped. I think that is alright.

Is there anything in the Context object that says that no documents were created and the commit was skipped? Otherwise, the onImportEnd listener would continue to execute even when no documents were actually imported, which is not so useful.



[jira] Assigned: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar reassigned SOLR-974:
------------------------------------------

    Assignee: Shalin Shekhar Mangar



[jira] Resolved: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shalin Shekhar Mangar resolved SOLR-974.
----------------------------------------

    Resolution: Fixed

Committed revision 738020.

Thanks Wojtek!

Kay, we can work on exposing the statistics through context with SOLR-989. With this change, one can easily detect if any documents were created or not.



[jira] Commented: (SOLR-974) DataImportHandler should not commit if no data has been updated

Posted by "Shalin Shekhar Mangar (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/SOLR-974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12667307#action_12667307 ] 

Shalin Shekhar Mangar commented on SOLR-974:
--------------------------------------------

Fair enough. We can extend this to full-import if the user specifies clean=false. I'll update the patch.
