Posted to common-dev@hadoop.apache.org by "David Savage (JIRA)" <ji...@apache.org> on 2007/09/06 14:58:32 UTC

[jira] Created: (HADOOP-1844) Allow hadoop to run in an osgi container

Allow hadoop to run in an osgi container
----------------------------------------

                 Key: HADOOP-1844
                 URL: https://issues.apache.org/jira/browse/HADOOP-1844
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: David Savage
         Attachments: classpath.patch, tasktracker.patch

I have been testing Hadoop inside an OSGi environment (specifically the Newton framework). This has uncovered a number of minor bugs that appear when mapred classes are instantiated from an entry point other than their main methods.

I have created a number of patches, which I'll attach, that solve these issues. The patches could possibly be dealt with as separate issues, but all of them are required to resolve the OSGi issue. Happy to split them up if that is easier to manage.

classpath.patch: this rearranges the classloader hierarchy for Task objects so that a Task can resolve API classes when those classes are no longer loaded from the system classloader.

tasklog.patch: this ensures the log files can be resolved when the child process is launched from a different directory than the parent process.

taskrunner.patch: this enables the TaskRunner to find a log dir when the parent JVM is not launched by the hadoop scripts. It also allows a client to specify a substitute main class (which delegates to TaskTracker$Child). Here that is used for resolving OSGi classpaths, although it could perhaps be made more general. Finally, it adds some extra logging for cases where things go wrong.

tasktracker.patch: this allows the parent to pass configuration through to the child TaskRunner (specifically, in this case, to pass the classpath and launcher to the TaskRunner).
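
As a rough illustration of the classloader arrangement described for classpath.patch (a minimal sketch only, not the patch itself): the loader for job code is given the loader that defined the API classes as its parent, rather than assuming the system classloader. The names TaskClassLoaderSketch, ApiClass and newTaskLoader are made up for this example and do not exist in Hadoop.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class TaskClassLoaderSketch {

    /** Hypothetical stand-in for an API class that a container, not the JVM classpath, provides. */
    public static class ApiClass {}

    /**
     * Builds a loader for job code whose parent is whichever loader defined
     * the API classes, instead of ClassLoader.getSystemClassLoader().
     */
    public static URLClassLoader newTaskLoader(URL[] jobClasspath, Class<?> apiAnchor) {
        return new URLClassLoader(jobClasspath, apiAnchor.getClassLoader());
    }

    public static void main(String[] args) throws Exception {
        // An empty job classpath is enough to show the parent delegation.
        try (URLClassLoader taskLoader = newTaskLoader(new URL[0], ApiClass.class)) {
            Class<?> resolved = taskLoader.loadClass(ApiClass.class.getName());
            // The task loader resolves the API class through its parent,
            // so it sees the very same Class object.
            System.out.println(resolved == ApiClass.class); // prints "true"
        }
    }
}
```

Inside an OSGi container the parent would presumably be the bundle's classloader; the sketch only shows that the lookup goes through whatever loader actually owns the API classes.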

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "David Savage (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Savage updated HADOOP-1844:
---------------------------------

    Attachment: tasklog.patch



[jira] Assigned: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "Christophe Taton (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Christophe Taton reassigned HADOOP-1844:
----------------------------------------

    Assignee: Christophe Taton



[jira] Commented: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "Christophe Taton (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12557818#action_12557818 ] 

Christophe Taton commented on HADOOP-1844:
------------------------------------------

After playing with Hadoop inside OSGi containers for some time, here are some additional comments:
- There is an issue with the web UI: resources inside the Hadoop jars are referred to with OSGi-specific URLs (e.g. jar:bundle://<bundle-id>/path/to/resource), which the embedded Jetty is unable to use.
- I am thinking Map/Reduce jobs could be packaged as OSGi bundles too: dependencies (such as 3rd party libraries) would then be handled directly by the container.




[jira] Commented: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12549507 ] 

Doug Cutting commented on HADOOP-1844:
--------------------------------------

A single patch for this is probably best.  Some comments:

- indentation does not follow the Hadoop standard (2 spaces per level)
- non-existent files in the classpath should not throw exceptions, should they?
- some unit tests would be good, to ensure that these changes are maintained
- patches should not include patch-specific comments
- I don't like modifying the child's job configuration.  Can't this be implemented by using 'final' parameters in the tasktracker's configuration, so that jobs cannot override them?
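
To make the last point concrete, here is a minimal sketch of the override behaviour that 'final' parameters imply: once a value has been applied as final, a later setting for the same key is ignored. This is not Hadoop's Configuration class; FinalParamsSketch, its methods and the property name example.child.launcher are all hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class FinalParamsSketch {
    private final Map<String, String> values = new HashMap<>();
    private final Set<String> finalKeys = new HashSet<>();

    /**
     * Applies one setting. Returns false, and changes nothing,
     * if the key was previously applied as final.
     */
    public boolean apply(String key, String value, boolean isFinal) {
        if (finalKeys.contains(key)) {
            return false;
        }
        values.put(key, value);
        if (isFinal) {
            finalKeys.add(key);
        }
        return true;
    }

    public String get(String key) {
        return values.get(key);
    }

    public static void main(String[] args) {
        FinalParamsSketch conf = new FinalParamsSketch();
        // Tasktracker-side setting, marked final (hypothetical property name).
        conf.apply("example.child.launcher", "org.example.OsgiLauncher", true);
        // A job-side attempt to override the same key is rejected.
        boolean accepted = conf.apply("example.child.launcher", "org.example.Other", false);
        System.out.println(accepted + " " + conf.get("example.child.launcher"));
        // prints "false org.example.OsgiLauncher"
    }
}
```

With this ordering the tasktracker's values are applied first and win, so the child's job configuration never needs to be modified.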




[jira] Updated: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "David Savage (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Savage updated HADOOP-1844:
---------------------------------

    Attachment: classpath.patch



[jira] Updated: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "David Savage (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Savage updated HADOOP-1844:
---------------------------------

    Attachment: taskrunner.patch



[jira] Updated: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "David Savage (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Savage updated HADOOP-1844:
---------------------------------

    Attachment: tasktracker.patch



[jira] Commented: (HADOOP-1844) Allow hadoop to run in an osgi container

Posted by "Christophe Taton (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HADOOP-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560292#action_12560292 ] 

Christophe Taton commented on HADOOP-1844:
------------------------------------------

Embedded web applications will need to be packaged as war files for Jetty6 to run correctly under OSGi: Jetty is only able to use OSGi-specific URLs when reading from a jar file (and a war file is a jar file).
