Posted to dev@nutch.apache.org by "Ferdy Galema (JIRA)" <ji...@apache.org> on 2012/08/01 15:49:03 UTC
[jira] [Created] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)
Ferdy Galema created NUTCH-1444:
-----------------------------------
Summary: Indexing should not create temporary files (do not extend from FileOutputFormat)
Key: NUTCH-1444
URL: https://issues.apache.org/jira/browse/NUTCH-1444
Project: Nutch
Issue Type: Bug
Reporter: Ferdy Galema
Fix For: 2.1
The creation of the tmp files is a leftover from the past, when they were needed to create Lucene indices. For the SolrIndexer this is no longer needed. I have changed the indexer so that it no longer extends FileOutputFormat. This greatly simplifies the code (and makes room for ElasticIndexerJob, which I am about to add to the codebase).
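For illustration only, here is a minimal sketch of the idea of writing documents straight to an indexing backend without any output path or temporary files. It is not the attached patch and does not use the Hadoop or Solr APIs. The names DirectIndexSketch and DirectIndexWriter, the batch size, and the Consumer-based backend (a stand-in for something like a Solr client) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: none of these names come from Nutch, Hadoop or Solr.
class DirectIndexSketch {

    /**
     * Buffers documents in memory and hands them to a backend in batches.
     * It holds no output path and never touches the filesystem.
     */
    static final class DirectIndexWriter implements AutoCloseable {
        // Stand-in for an indexing client; receives one batch per call.
        private final Consumer<List<Map<String, String>>> backend;
        private final int batchSize;
        private final List<Map<String, String>> buffer = new ArrayList<>();

        DirectIndexWriter(Consumer<List<Map<String, String>>> backend, int batchSize) {
            if (batchSize < 1) {
                throw new IllegalArgumentException("batchSize must be >= 1");
            }
            this.backend = backend;
            this.batchSize = batchSize;
        }

        /** Adds one document and flushes once the batch is full. */
        void write(Map<String, String> doc) {
            buffer.add(doc);
            if (buffer.size() >= batchSize) {
                flush();
            }
        }

        /** Sends a copy of the buffered documents, then clears the buffer. */
        private void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            backend.accept(new ArrayList<>(buffer));
            buffer.clear();
        }

        /** Sends whatever is still buffered. */
        @Override
        public void close() {
            flush();
        }
    }

    public static void main(String[] args) {
        // Collect the batches in a list instead of calling a real backend.
        List<List<Map<String, String>>> sent = new ArrayList<>();
        try (DirectIndexWriter writer = new DirectIndexWriter(sent::add, 2)) {
            writer.write(Map.of("id", "a"));
            writer.write(Map.of("id", "b"));
            writer.write(Map.of("id", "c"));
        }
        System.out.println(sent.size() + " batches sent"); // prints "2 batches sent"
    }
}
```

In this sketch the writer depends only on the backend it is given, so a second backend could be plugged in the same way without any file-based output machinery.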
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Closed] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)
Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ferdy Galema closed NUTCH-1444.
-------------------------------
Resolution: Fixed
committed
[jira] [Commented] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)
Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429986#comment-13429986 ]
Ferdy Galema commented on NUTCH-1444:
-------------------------------------
Just to add: this issue fixes the following exception.
java.lang.NullPointerException
    at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.getOutputPath(FileOutputFormat.java:160)
    at org.apache.nutch.indexer.solr.SolrIndexerJob.indexSolr(SolrIndexerJob.java:74)
    at org.apache.nutch.indexer.solr.SolrIndexerJob.run(SolrIndexerJob.java:90)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.indexer.solr.SolrIndexerJob.main(SolrIndexerJob.java:99)
[jira] [Updated] (NUTCH-1444) Indexing should not create temporary files (do not extend from FileOutputFormat)
Posted by "Ferdy Galema (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ferdy Galema updated NUTCH-1444:
--------------------------------
Attachment: NUTCH-1444.patch