Posted to dev@nutch.apache.org by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/08/01 15:49:03 UTC

[jira] [Created] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)

Ferdy Galema created NUTCH-1444:
-----------------------------------

             Summary: Indexing should not create temporary files (do not extend from FileOutputFormat)
                 Key: NUTCH-1444
                 URL: https://issues.apache.org/jira/browse/NUTCH-1444
             Project: Nutch
          Issue Type: Bug
            Reporter: Ferdy Galema
             Fix For: 2.1


Creating the tmp files is a leftover from the past, when they were needed to build Lucene indices. For the SolrIndexer this is no longer needed. I have changed the indexer so that it no longer extends FileOutputFormat. This greatly simplifies the code (and makes room for ElasticIndexerJob, which I am about to add to the codebase).
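
To illustrate the idea only: the sketch below shows a writer that pushes documents straight to a remote index instead of staging them in temporary files. It is not taken from NUTCH-1444.patch. Every name in it (DirectIndexSketch, IndexSink, InMemorySink, DirectIndexWriter, the batch size) is a hypothetical stand-in, and it deliberately uses no Hadoop, Nutch, or Solr classes so that it runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: all types here are hypothetical stand-ins,
// not Hadoop, Nutch, or Solr classes.
public class DirectIndexSketch {

    /** Stand-in for a remote index such as a Solr server. */
    interface IndexSink {
        void add(List<String> batch);
        void commit();
    }

    /** Test double that simply records what it receives. */
    static class InMemorySink implements IndexSink {
        final List<String> docs = new ArrayList<>();
        int commits = 0;

        @Override
        public void add(List<String> batch) {
            docs.addAll(batch);
        }

        @Override
        public void commit() {
            commits++;
        }
    }

    /**
     * Buffers documents in memory and sends them to the sink in batches.
     * It has no output path and never touches the filesystem.
     */
    static class DirectIndexWriter implements AutoCloseable {
        private final IndexSink sink;
        private final int batchSize;
        private final List<String> buffer = new ArrayList<>();

        DirectIndexWriter(IndexSink sink, int batchSize) {
            this.sink = sink;
            this.batchSize = batchSize;
        }

        void write(String doc) {
            buffer.add(doc);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        private void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            // Hand over a copy so clearing the buffer does not affect the sink.
            sink.add(new ArrayList<>(buffer));
            buffer.clear();
        }

        /** Sends any remaining documents, then commits once. */
        @Override
        public void close() {
            flush();
            sink.commit();
        }
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        try (DirectIndexWriter writer = new DirectIndexWriter(sink, 2)) {
            writer.write("doc-1");
            writer.write("doc-2");
            writer.write("doc-3");
        }
        System.out.println(sink.docs.size() + " docs, " + sink.commits + " commit(s)");
    }
}
```

Because the writer in this sketch only needs a sink, nothing has to be configured or cleaned up on disk, which is the kind of simplification described above.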

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Closed] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)

Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdy Galema closed NUTCH-1444.
-------------------------------

    Resolution: Fixed

committed
                


        

[jira] [Commented] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)

Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429986#comment-13429986 ] 

Ferdy Galema commented on NUTCH-1444:
-------------------------------------

Just to add:

The following exception is fixed by this issue.

java.lang.NullPointerException
        at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.getOutputPath(FileOutputFormat.java:160)
        at org.apache.nutch.indexer.solr.SolrIndexerJob.indexSolr(SolrIndexerJob.java:74)
        at org.apache.nutch.indexer.solr.SolrIndexerJob.run(SolrIndexerJob.java:90)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.nutch.indexer.solr.SolrIndexerJob.main(SolrIndexerJob.java:99)
                


        

[jira] [Updated] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)

Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ferdy Galema updated NUTCH-1444:
--------------------------------

    Attachment: NUTCH-1444.patch
    
