Posted to dev@accumulo.apache.org by "Billie Rinaldi (JIRA)" <ji...@apache.org> on 2012/04/23 19:54:35 UTC

[jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

     [ https://issues.apache.org/jira/browse/ACCUMULO-532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Billie Rinaldi reassigned ACCUMULO-532:
---------------------------------------

    Assignee: Edward J. Yoon  (was: Billie Rinaldi)

Ed,

I've added you to the Accumulo contributors list in JIRA and checked in a modified patch to contrib/trunk/bsp as a module of a new accumulo-contrib project.

I modified the input and output formats to extend the existing MapReduce formats, as shown below, because I didn't want so much duplicated code.  This way, if the MR I/O code changes, the BSP I/O formats pick up the change directly.

{noformat}
public class AccumuloInputFormat extends org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat implements org.apache.hama.bsp.InputFormat<Key,Value>

public class AccumuloOutputFormat extends org.apache.accumulo.core.client.mapreduce.AccumuloOutputFormat implements org.apache.hama.bsp.OutputFormat<Text,Mutation>
{noformat}
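
The extend-and-implement shape above can be illustrated with a minimal, self-contained sketch. Every type below is a hypothetical stand-in (none of them are the real Accumulo or Hama classes, and the method names are invented for illustration); the point is only that the shared logic lives once in the superclass and the BSP-side class adds a thin adapter method.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the MapReduce-side format that already owns the shared logic.
class MapReduceFormatSketch {
    // Shared logic is written once here; subclasses inherit any later fixes.
    protected List<String> computeSplits(String table) {
        List<String> splits = new ArrayList<>();
        splits.add(table + "-split-0");
        splits.add(table + "-split-1");
        return splits;
    }
}

// Hypothetical stand-in for the BSP-side interface the framework expects.
interface BspInputFormatSketch {
    List<String> getSplits(String table);
}

// Extends the MR class and implements the BSP interface: only an adapter method is new.
class BspAdapterSketch extends MapReduceFormatSketch implements BspInputFormatSketch {
    @Override
    public List<String> getSplits(String table) {
        // Delegate to the inherited implementation instead of duplicating it.
        return computeSplits(table);
    }

    public static void main(String[] args) {
        BspInputFormatSketch format = new BspAdapterSketch();
        System.out.println(format.getSplits("mytable"));
    }
}
```

If `computeSplits` in the stand-in superclass changes, `BspAdapterSketch` picks up the new behaviour without any edits of its own, which is the benefit described above.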

Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.

{noformat}
bspjob.setInputPath(new Path("test"));
bspjob.setOutputPath(new Path("test"));
{noformat}
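
The behaviour described above can be modelled with a small sketch. It assumes, based only on the observation in this comment, that the reader and writer are created when "bsp.input.dir" and "bsp.output.dir" are non-null; the class name, the method names, and the use of a plain Map in place of the job configuration are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the observed guard; not the actual BSP framework code.
class PathGuardSketch {
    // The reader is only set up when an input directory has been configured.
    static boolean readerInitialized(Map<String, String> conf) {
        return conf.get("bsp.input.dir") != null;
    }

    // The writer is only set up when an output directory has been configured.
    static boolean writerInitialized(Map<String, String> conf) {
        return conf.get("bsp.output.dir") != null;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        // A table-backed format has no real path, so nothing is set at first.
        System.out.println(readerInitialized(conf)); // prints "false"

        // Setting a placeholder path is enough to satisfy the guard.
        conf.put("bsp.input.dir", "test");
        conf.put("bsp.output.dir", "test");
        System.out.println(readerInitialized(conf) && writerInitialized(conf)); // prints "true"
    }
}
```

Under this model the path value itself is never used by a table-backed format; it only has to be present, which is why a fake path such as "test" is sufficient.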
                
> Add BSP input/output formats to client package
> ----------------------------------------------
>
>                 Key: ACCUMULO-532
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-532
>             Project: Accumulo
>          Issue Type: New Feature
>          Components: client
>    Affects Versions: 1.5.0
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 1.5.0
>
>         Attachments: bsp.patch
>
>
> I've just written basic BSP input formats and their unit tests.


        

Re: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon" <ed...@apache.org>.
Or, just get rid of these conditions.

On Tue, Apr 24, 2012 at 10:33 AM, Edward J. Yoon <ed...@apache.org> wrote:
> According to the change log, I added the input/output system to the BSP
> framework. However, I don't know exactly why we need to check the
> (conf.get("bsp.input.dir") != null) and (conf.get("bsp.output.dir") != null)
> conditions when initializing the record reader/writer objects.
>
> If there are no objections or other opinions, I'd like to change it like this:
>
> Index: core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
> ===================================================================
> --- core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java     (revision 1329523)
> +++ core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java     (working copy)
> @@ -185,8 +185,7 @@
>
>     initInput();
>
> -    // just output something when the user configured it
> -    if (conf.get("bsp.output.dir") != null) {
> +    if (!bspJob.getOutputFormat().getClass().equals(NullOutputFormat.class)) {
>       Path outdir = new Path(conf.get("bsp.output.dir"),
>           Task.getOutputName(partition));
>       outWriter = bspJob.getOutputFormat().getRecordWriter(fs, bspJob,
> @@ -204,8 +203,7 @@
>
>   @SuppressWarnings("unchecked")
>   public final void initInput() throws IOException {
> -    // just read input if the user defined one
> -    if (conf.get("bsp.input.dir") != null) {
> +    if (!bspJob.getInputFormat().getClass().equals(NullInputFormat.class)) {
>       InputSplit inputSplit = null;
>       // reinstantiate the split
>       try {
>
>
>
> --
> Best Regards, Edward J. Yoon
> @eddieyoon



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Re: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon" <ed...@apache.org>.
According to the change log, I added the input/output system to the BSP
framework. However, I don't know exactly why we need to check the
(conf.get("bsp.input.dir") != null) and (conf.get("bsp.output.dir") != null)
conditions when initializing the record reader/writer objects.

If there are no objections or other opinions, I'd like to change it like this:

Index: core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java
===================================================================
--- core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java	(revision 1329523)
+++ core/src/main/java/org/apache/hama/bsp/BSPPeerImpl.java	(working copy)
@@ -185,8 +185,7 @@

     initInput();

-    // just output something when the user configured it
-    if (conf.get("bsp.output.dir") != null) {
+    if (!bspJob.getOutputFormat().getClass().equals(NullOutputFormat.class)) {
       Path outdir = new Path(conf.get("bsp.output.dir"),
           Task.getOutputName(partition));
       outWriter = bspJob.getOutputFormat().getRecordWriter(fs, bspJob,
@@ -204,8 +203,7 @@

   @SuppressWarnings("unchecked")
   public final void initInput() throws IOException {
-    // just read input if the user defined one
-    if (conf.get("bsp.input.dir") != null) {
+    if (!bspJob.getInputFormat().getClass().equals(NullInputFormat.class)) {
       InputSplit inputSplit = null;
       // reinstantiate the split
       try {
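
As a rough illustration of the guard proposed in the patch, the sketch below decides whether to create a writer from the configured format class rather than from a path setting. All types here are hypothetical stand-ins, not the real Hama classes.

```java
// Hypothetical stand-ins for the framework's output format types.
interface OutputFormatSketch {}

class NullOutputFormatSketch implements OutputFormatSketch {}

class TableOutputFormatSketch implements OutputFormatSketch {}

class FormatGuardSketch {
    // Create a writer for every format except the explicit "null" format.
    static boolean shouldCreateWriter(OutputFormatSketch format) {
        // Exact class comparison: a subclass of the null format would NOT match.
        return !format.getClass().equals(NullOutputFormatSketch.class);
    }

    public static void main(String[] args) {
        System.out.println(shouldCreateWriter(new TableOutputFormatSketch())); // prints "true"
        System.out.println(shouldCreateWriter(new NullOutputFormatSketch()));  // prints "false"
    }
}
```

With a guard of this shape, a table-backed format would get its writer without any placeholder path being set. Because the comparison uses exact class equality, an `instanceof` check would be the alternative if subclasses of the null format should also be skipped.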


On Tue, Apr 24, 2012 at 6:39 AM, Edward J. Yoon <ed...@apache.org> wrote:
> FYI,
>
> "Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set."
>
> Sent from my iPad
>
> Begin forwarded message:
>
>> From: "Billie Rinaldi (JIRA)" <ji...@apache.org>
>> Date: April 24, 2012 2:54:35 AM GMT+09:00
>> To: dev@accumulo.apache.org
>> Subject: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package
>> Reply-To: dev@accumulo.apache.org
>>
>> Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.



-- 
Best Regards, Edward J. Yoon
@eddieyoon

Fwd: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package

Posted by "Edward J. Yoon" <ed...@apache.org>.
FYI,

"Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set."

Sent from my iPad

Begin forwarded message:

> From: "Billie Rinaldi (JIRA)" <ji...@apache.org>
> Date: April 24, 2012 2:54:35 AM GMT+09:00
> To: dev@accumulo.apache.org
> Subject: [jira] [Assigned] (ACCUMULO-532) Add BSP input/output formats to client package
> Reply-To: dev@accumulo.apache.org
> 
> Let me know if you see any issues with this.  It could probably use some more testing.  I was able to get the unit tests working (even the part commented out in the patch) but I had to set fake input and output paths.  It seems that BSP doesn't initialize the RecordReader and RecordWriter unless the configuration options "bsp.input.dir" and "bsp.output.dir" are set.