Posted to dev@accumulo.apache.org by "Edward J. Yoon (Created) (JIRA)" <ji...@apache.org> on 2012/04/13 09:26:24 UTC

[jira] [Created] (ACCUMULO-532) Add BSP input formats to client package

Add BSP input formats to client package
---------------------------------------

                 Key: ACCUMULO-532
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
             Project: Accumulo
          Issue Type: New Feature
          Components: client
            Reporter: Edward J. Yoon
            Assignee: Billie Rinaldi


I've just written basic BSP input formats and their unit tests.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Keith Turner (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276161#comment-13276161 ] 

Keith Turner commented on ACCUMULO-532:
---------------------------------------

Can this be closed or resolved?
                
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 1.5.0
>
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


Re: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon" <ed...@apache.org>.
Or, just get rid of these conditions.

On Tue, Apr 24, 2012 at 10:33 AM, Edward J. Yoon <ed...@apache.org> wrote:
> According to the CHANGE log, I added the input/output system to the
> BSP framework. However, I don't know exactly why we need to check the
> (conf.get("bsp.input (or output).dir") != null) conditions when
> initializing the record reader/writer objects.
>
> If there are no objections or other opinions, I'd like to change it like this:
>
> Index: core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
> ===================================================================
> --- core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java     (revision 1329523)
> +++ core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java     (working copy)
> @@ -185,8 +185,7 @@
>
>     initInput();
>
> -    // just output something when the user configured it
> -    if (conf.get("bsp.output.dir") != null) {
> +    if (!bspJob.getOutputFormat().getClass().equals(NullOutputFormat.class)) {
>       Path outdir = new Path(conf.get("bsp.output.dir"),
>           Task.getOutputName(partition));
>       outWriter = bspJob.getOutputFormat().getRecordWriter(fs, bspJob,
> @@ -204,8 +203,7 @@
>
>   @SuppressWarnings("unchecked")
>   public final void initInput() throws IOException {
> -    // just read input if the user defined one
> -    if (conf.get("bsp.input.dir") != null) {
> +    if (!bspJob.getInputFormat().getClass().equals(NullInputFormat.class)) {
>       InputSplit inputSplit = null;
>       // reinstantiate the split
>       try {
>
>
> On Tue, Apr 24, 2012 at 6:39 AM, Edward J. Yoon <ed...@apache.org> wrote:
>> FYI,
>>
>> "Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set."
>>
>> Sent from my iPad
>>
>> Begin forwarded message:
>>
>>> From: "Billie Rinaldi (JIRA)" <ji...@apache.org>
>>> Date: April 24, 2012 2:54:35 AM GMT+09:00
>>> To: dev@accumulo.apache.org
>>> Subject: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package
>>> Reply-To: dev@accumulo.apache.org
>>>
>>> Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon" <ed...@apache.org>.
According to the CHANGE log, I added the input/output system to the
BSP framework. However, I don't know exactly why we need to check the
(conf.get("bsp.input (or output).dir") != null) conditions when
initializing the record reader/writer objects.

If there are no objections or other opinions, I'd like to change it like this:

Index: core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
===================================================================
--- core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java	(revision 1329523)
+++ core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java	(working copy)
@@ -185,8 +185,7 @@

     initInput();

-    // just output something when the user configured it
-    if (conf.get("bsp.output.dir") != null) {
+    if (!bspJob.getOutputFormat().getClass().equals(NullOutputFormat.class)) {
       Path outdir = new Path(conf.get("bsp.output.dir"),
           Task.getOutputName(partition));
       outWriter = bspJob.getOutputFormat().getRecordWriter(fs, bspJob,
@@ -204,8 +203,7 @@

   @SuppressWarnings("unchecked")
   public final void initInput() throws IOException {
-    // just read input if the user defined one
-    if (conf.get("bsp.input.dir") != null) {
+    if (!bspJob.getInputFormat().getClass().equals(NullInputFormat.class)) {
       InputSplit inputSplit = null;
       // reinstantiate the split
       try {
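
For illustration only (not part of the patch): the difference between the two guards in the diff above can be sketched in a few lines. The types below (InputFormat, NullInputFormat, TableInputFormat, GuardSketch) are hypothetical stand-ins written for this example, not the real Hama classes, and the configuration is reduced to a plain map. The sketch only shows that a format which reads from a table rather than a path is skipped by the config-key guard but initialized by the format-class guard.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration; these are NOT the real Hama types.
interface InputFormat {}

// Marker format meaning "this job reads no input".
final class NullInputFormat implements InputFormat {}

// A format that reads from a table, so it never needs an input directory.
final class TableInputFormat implements InputFormat {}

final class GuardSketch {
  // Old guard: initialize the reader only when an input directory is configured.
  static boolean shouldInitByConfig(Map<String, String> conf) {
    return conf.get("bsp.input.dir") != null;
  }

  // Proposed guard: initialize the reader unless the job uses the null format.
  static boolean shouldInitByFormat(InputFormat format) {
    return !format.getClass().equals(NullInputFormat.class);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>(); // "bsp.input.dir" is not set
    InputFormat format = new TableInputFormat();
    System.out.println("config guard initializes reader: " + shouldInitByConfig(conf));
    System.out.println("format guard initializes reader: " + shouldInitByFormat(format));
  }
}
```

With the config-key guard, such a job has to set a dummy input path just to get a record reader; with the format-class guard it does not.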


On Tue, Apr 24, 2012 at 6:39 AM, Edward J. Yoon <ed...@apache.org> wrote:
> FYI,
>
> "Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set."
>
> Sent from my iPad
>
> Begin forwarded message:
>
>> From: "Billie Rinaldi (JIRA)" <ji...@apache.org>
>> Date: April 24, 2012 2:54:35 AM GMT+09:00
>> To: dev@accumulo.apache.org
>> Subject: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package
>> Reply-To: dev@accumulo.apache.org
>>
>> Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Fwd: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon" <ed...@apache.org>.
FYI,

"Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set."

Sent from my iPad

Begin forwarded message:

> From: "Billie Rinaldi (JIRA)" <ji...@apache.org>
> Date: April 24, 2012 2:54:35 AM GMT+09:00
> To: dev@accumulo.apache.org
> Subject: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package
> Reply-To: dev@accumulo.apache.org
> 
> Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.

[jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Billie Rinaldi (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billie Rinaldi reassigned ACCUMULO-532:
---------------------------------------

    Assignee: Edward J. Yoon  (was: Billie Rinaldi)

Ed,

I've added you to the Accumulo contributors list in JIRA and checked in a modified patch to contrib/trunk/bsp as a module of a new accumulo-contrib project.

I modified the input and output formats to have the following form because I didn't want to have so much duplicate code.  This way, if the MR i/o code is changed, the BSP i/o formats will benefit from it directly.

{noformat}
public class AccumuloInputFormat extends org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat implements org.apache.hama.bsp.InputFormat<Key,Value>

public class AccumuloOutputFormat extends org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat implements org.apache.hama.bsp.OutputFormat<Text,Mutation>
{noformat}
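
As a rough sketch of the "extend the existing class, implement the other framework's interface" pattern described above: the classes below are hypothetical stand-ins made up for this example (MrStyleInputFormat, BspStyleInputFormat, BspAdapterInputFormat), not the Accumulo or Hama types, and the range-splitting logic is only a placeholder for whatever the shared code really does. The point is that the adapter adds only the interface method and inherits everything else, so a fix to the base class reaches both.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the existing MapReduce-side format that owns the shared logic.
class MrStyleInputFormat {
  // Shared logic: cut the range [0, total) into at most `parts` contiguous pieces.
  protected List<int[]> computeRanges(int total, int parts) {
    if (parts <= 0) {
      throw new IllegalArgumentException("parts must be positive");
    }
    List<int[]> ranges = new ArrayList<>();
    int step = Math.max(1, (total + parts - 1) / parts); // ceiling division
    for (int start = 0; start < total; start += step) {
      ranges.add(new int[] {start, Math.min(total, start + step)});
    }
    return ranges;
  }
}

// Hypothetical stand-in for the interface the second framework expects.
interface BspStyleInputFormat {
  List<int[]> getSplits(int total, int numTasks);
}

// The adapter inherits the shared logic and only supplies the interface method.
class BspAdapterInputFormat extends MrStyleInputFormat implements BspStyleInputFormat {
  @Override
  public List<int[]> getSplits(int total, int numTasks) {
    return computeRanges(total, numTasks);
  }
}

final class AdapterSketch {
  public static void main(String[] args) {
    BspStyleInputFormat format = new BspAdapterInputFormat();
    for (int[] range : format.getSplits(10, 3)) {
      System.out.println("[" + range[0] + ", " + range[1] + ")");
    }
  }
}
```

The trade-off is that the adapter is tied to the base class's protected surface, so it depends on that surface staying stable.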

Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.

{noformat}
bspjob.setInputPath(new Path("test"));
bspjob.setOutputPath(new Path("test"));
{noformat}
                
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 1.5.0
>
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Updated] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated ACCUMULO-532:
------------------------------------

          Component/s:     (was: client)
                       contrib
    Affects Version/s:     (was: 1.5.0)
        Fix Version/s:     (was: 1.5.0)

HAMA is about to graduate to TLP status. Please wait until we graduate and release a new version. Then I'll resolve these issues (in/out formats, BSP example, and documentation).
                
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: contrib
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Updated] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated ACCUMULO-532:
------------------------------------

    Summary: Add BSP input/output formats to client package  (was: Add BSP input formats to client package)
    
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>            Reporter: Edward J. Yoon
>            Assignee: Billie Rinaldi
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Commented] (ACCUMULO-532) Add BSP input formats to client package

Posted by "Edward J. Yoon (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13253199#comment-13253199 ] 

Edward J. Yoon commented on ACCUMULO-532:
-----------------------------------------

I can't attach the file here. It seems to be a problem with the JIRA configuration.

I've uploaded the patch to my server.
It is available at http://udanax.org/bsp.patch

{code}
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] accumulo .......................................... SUCCESS [10.074s]
[INFO] cloudtrace ........................................ SUCCESS [4.911s]
[INFO] accumulo-start .................................... SUCCESS [20.856s]
[INFO] accumulo-core ..................................... SUCCESS [53.895s]
[INFO] accumulo-server ................................... SUCCESS [22.076s]
[INFO] accumulo-examples ................................. SUCCESS [0.407s]
[INFO] examples-simple ................................... SUCCESS [3.784s]
[INFO] accumulo-wikisearch ............................... SUCCESS [0.032s]
[INFO] wikisearch-ingest ................................. SUCCESS [18.361s]
[INFO] wikisearch-query .................................. SUCCESS [13.170s]
[INFO] wikisearch-query-war .............................. SUCCESS [10.451s]
[INFO] accumulo-assemble ................................. SUCCESS [0.854s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 2:40.944s
[INFO] Finished at: Fri Apr 13 16:31:04 KST 2012
[INFO] Final Memory: 117M/278M
[INFO] ------------------------------------------------------------------------
{code}
                
> Add BSP input formats to client package
> ---------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>            Reporter: Edward J. Yoon
>            Assignee: Billie Rinaldi
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Commented] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260203#comment-13260203 ] 

Edward J. Yoon commented on ACCUMULO-532:
-----------------------------------------

+1 for reusing org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat, I don't see any problem.

I've fixed the initialization problem with the record reader/writer objects (HAMA-562) for the next version.
                
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 1.5.0
>
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Updated] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated ACCUMULO-532:
------------------------------------

        Fix Version/s: 1.5.0
    Affects Version/s: 1.5.0
               Status: Patch Available  (was: Open)
    
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Billie Rinaldi
>             Fix For: 1.5.0
>
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Commented] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Billie Rinaldi (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13276811#comment-13276811 ] 

Billie Rinaldi commented on ACCUMULO-532:
-----------------------------------------

I restructured the contrib so the modules can be versioned separately.  The new location is https://svn.apache.org/repos/asf/accumulo/contrib/bsp/trunk/.
                
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: contrib
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


[jira] [Updated] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward J. Yoon updated ACCUMULO-532:
------------------------------------

    Attachment: bsp.patch
    
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Billie Rinaldi
>             Fix For: 1.5.0
>
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.
