Posted to dev@nutch.apache.org by "Andrzej Bialecki (JIRA)" <ji...@apache.org> on 2009/03/04 11:53:56 UTC

[jira] Created: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Indexer failing after upgrade to Hadoop 0.19.1
----------------------------------------------

                 Key: NUTCH-711
                 URL: https://issues.apache.org/jira/browse/NUTCH-711
             Project: Nutch
          Issue Type: Bug
            Reporter: Andrzej Bialecki 


After the upgrade to Hadoop 0.19.1, the Reducer is initialized in a different order than before (see http://svn.apache.org/viewvc?view=rev&revision=736239). IndexingFilters populate the current JobConf with field options that IndexerOutputFormat requires to function properly. However, the filters are instantiated in Reducer.configure(), which is now called after the OutputFormat is initialized rather than before it, as was previously the case.

The workaround for now is to instantiate IndexingFilters once again inside IndexerOutputFormat. This issue should be revisited before 1.1 in order to find a better solution.
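
The ordering problem and the workaround described above can be illustrated with a small, self-contained sketch. It does not use the real Hadoop or Nutch classes: Conf, Filters, Reducer and OutputFormat are hypothetical stand-ins for JobConf, IndexingFilters, the indexing Reducer and IndexerOutputFormat, and the "field.options" key, its value, and the exception thrown when it is missing are assumptions made purely for illustration.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: hypothetical stand-ins, not the real Hadoop/Nutch classes. */
class InitOrderSketch {

    /** Stand-in for JobConf: a mutable key/value job configuration. */
    static class Conf {
        private final Map<String, String> values = new HashMap<>();
        void set(String key, String value) { values.put(key, value); }
        String get(String key) { return values.get(key); }
    }

    /** Stand-in for IndexingFilters: constructing it writes field options into the conf. */
    static class Filters {
        Filters(Conf conf) {
            // Key and value are made up for this sketch.
            conf.set("field.options", "title:STORED,content:TOKENIZED");
        }
    }

    /** Stand-in for the Reducer: the filters are created in configure(). */
    static class Reducer {
        void configure(Conf conf) { new Filters(conf); }
    }

    /** Stand-in for IndexerOutputFormat: it needs the field options at init time. */
    static class OutputFormat {
        private final boolean applyWorkaround;

        OutputFormat(boolean applyWorkaround) { this.applyWorkaround = applyWorkaround; }

        String init(Conf conf) {
            if (applyWorkaround) {
                // Workaround: instantiate the filters once again here, so the
                // options are present no matter when Reducer.configure() runs.
                new Filters(conf);
            }
            String options = conf.get("field.options");
            if (options == null) {
                throw new IllegalStateException("field options missing from job configuration");
            }
            return options;
        }
    }

    /** Runs both initialization steps in the requested order and returns what the output format saw. */
    static String run(boolean reducerFirst, boolean applyWorkaround) {
        Conf conf = new Conf();
        Reducer reducer = new Reducer();
        OutputFormat out = new OutputFormat(applyWorkaround);
        if (reducerFirst) {
            reducer.configure(conf);      // earlier order: filters populate the conf first
            return out.init(conf);
        }
        String seen = out.init(conf);     // later order: output format is initialized first
        reducer.configure(conf);
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("reducer first: " + run(true, false));
        try {
            run(false, false);
        } catch (IllegalStateException e) {
            System.out.println("output format first: " + e.getMessage());
        }
        System.out.println("output format first, with workaround: " + run(false, true));
    }
}
```

In this sketch the workaround costs a second construction of the filters, which is why it reads as a stopgap rather than a proper fix: the output format still depends on a side effect of building another component.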

See this thread for more information: http://www.lucidimagination.com/search/document/7c62c625c7ea17fe/problem_with_crawling_using_the_latest_1_0_trunk

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Re: [jira] Resolved: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by Sami Siren <ss...@gmail.com>.
Alternatively, you could create another issue to track the proper fix and
let this one be closed during the release process.

--
 Sami Siren

Andrzej Bialecki (JIRA) wrote:
>      [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Andrzej Bialecki  resolved NUTCH-711.
> -------------------------------------
>
>     Resolution: Fixed
>
> Applied the patch in rev. 750037. I'm not closing this issue, because this needs to be solved in a better way after 1.0.


[jira] Resolved: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  resolved NUTCH-711.
-------------------------------------

    Resolution: Fixed

Applied the patch in rev. 750037. I'm not closing this issue, because this needs to be solved in a better way after 1.0.

> Indexer failing after upgrade to Hadoop 0.19.1
> ----------------------------------------------
>
>                 Key: NUTCH-711
>                 URL: https://issues.apache.org/jira/browse/NUTCH-711
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.0.0
>            Reporter: Andrzej Bialecki 
>            Assignee: Andrzej Bialecki 
>            Priority: Blocker
>             Fix For: 1.0.0
>
>         Attachments: patch.txt
>
>
> After the upgrade to Hadoop 0.19.1, the Reducer is initialized in a different order than before (see http://svn.apache.org/viewvc?view=rev&revision=736239). IndexingFilters populate the current JobConf with field options that IndexerOutputFormat requires to function properly. However, the filters are instantiated in Reducer.configure(), which is now called after the OutputFormat is initialized rather than before it, as was previously the case.
> The workaround for now is to instantiate IndexingFilters once again inside IndexerOutputFormat. This issue should be revisited before 1.1 in order to find a better solution.
> See this thread for more information: http://www.lucidimagination.com/search/document/7c62c625c7ea17fe/problem_with_crawling_using_the_latest_1_0_trunk



[jira] Updated: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-711:
------------------------------------

    Priority: Minor  (was: Blocker)

Lowering priority because it works now; it just needs a cleanup.




[jira] Commented: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by "Sami Siren (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12678691#action_12678691 ] 

Sami Siren commented on NUTCH-711:
----------------------------------

+1




[jira] Commented: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679064#action_12679064 ] 

Hudson commented on NUTCH-711:
------------------------------

Integrated in Nutch-trunk #743 (See [http://hudson.zones.apache.org/hudson/job/Nutch-trunk/743/])
     - Indexer failing after upgrade to Hadoop 0.19.1. This is a temporary fix, to be revisited later.





[jira] Updated: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-711:
------------------------------------

             Priority: Blocker  (was: Major)
    Affects Version/s: 1.0.0
        Fix Version/s: 1.0.0
             Assignee: Andrzej Bialecki 




[jira] Updated: (NUTCH-711) Indexer failing after upgrade to Hadoop 0.19.1

Posted by "Andrzej Bialecki (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/NUTCH-711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrzej Bialecki  updated NUTCH-711:
------------------------------------

    Attachment: patch.txt

This patch instantiates IndexingFilters in IndexerOutputFormat, and thus fixes the issue. If there are no objections, I will commit it shortly.

