Posted to dev@apex.apache.org by Chaitanya Chebolu <ch...@datatorrent.com> on 2016/04/29 10:39:38 UTC

NFS Input Module

Hi All,

  I am proposing an NFS Input Module. The use case is to read large files
from NFS in parallel.

 Design of NFS input module:

   There is a common interface, "FSInputModule", in Malhar for input
modules. The NFS Input Module extends FSInputModule and can be built
using the FSFileSplitter and BlockReader operators.
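
A minimal sketch of the hierarchy described above, assuming an abstract base with two factory methods, createFileSplitter() and createBlockReader(). The operator types here are empty stand-ins so that the snippet compiles on its own; they are not the real Malhar classes, and the actual FSInputModule may look different.

```java
// Stand-in operator types: hypothetical placeholders, not the Malhar classes.
class FSFileSplitter {}
class BlockReader {}

// Sketch of the abstract base: subclasses decide which operators to create.
abstract class FSInputModule {
  abstract FSFileSplitter createFileSplitter();
  abstract BlockReader createBlockReader();

  // Names the two operators the module would wire together, in order.
  String describe() {
    return createFileSplitter().getClass().getSimpleName()
        + " -> " + createBlockReader().getClass().getSimpleName();
  }
}

// The proposed NFS module: returns the generic operators unchanged.
class NFSInputModule extends FSInputModule {
  @Override
  FSFileSplitter createFileSplitter() { return new FSFileSplitter(); }

  @Override
  BlockReader createBlockReader() { return new BlockReader(); }
}

class NFSInputModuleDemo {
  public static void main(String[] args) {
    // Prints: FSFileSplitter -> BlockReader
    System.out.println(new NFSInputModule().describe());
  }
}
```

In this sketch the subclass adds a name but no NFS-specific behavior.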

  Please share your thoughts on this.

Regards,
Chaitanya

Re: NFS Input Module

Posted by Amol Kekre <am...@datatorrent.com>.
Chaitanya,
Are you creating an NFS module for naming purposes, so that users can find
it? At a very high level, is the idea to have the following command return
something in the Malhar directory?

find . | grep -i nfs | grep java

We do need commands like the above to return something for common sources,
destinations, and protocols, to get folks in. If so, this could be as simple
as changing default values/properties, or a particular attribute (say
partitioning), in the Module that in your view makes it better suited for
NFS. On the somewhat more complex side, it could be Java code.
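
One way to picture the simple end of that spectrum is a thin subclass whose only job is to carry different defaults. The sketch below is hypothetical: the class names, the two properties (block size and reader count), and every value are made up for illustration and are not taken from Malhar.

```java
// Hypothetical generic module; property names and defaults are illustrative only.
class GenericFSInputModule {
  private long blockSizeBytes = 128L * 1024 * 1024;
  private int readerCount = 1;

  final long getBlockSizeBytes() { return blockSizeBytes; }
  final void setBlockSizeBytes(long blockSizeBytes) { this.blockSizeBytes = blockSizeBytes; }

  final int getReaderCount() { return readerCount; }
  final void setReaderCount(int readerCount) { this.readerCount = readerCount; }
}

// NFS-named variant: no new logic, only different (made-up) starting values.
class NfsFlavouredInputModule extends GenericFSInputModule {
  NfsFlavouredInputModule() {
    setBlockSizeBytes(16L * 1024 * 1024);
    setReaderCount(4);
  }
}

class DefaultsDemo {
  public static void main(String[] args) {
    GenericFSInputModule module = new NfsFlavouredInputModule();
    System.out.println(module.getBlockSizeBytes() + " bytes per block, "
        + module.getReaderCount() + " readers");
  }
}
```

The same effect could presumably be had by setting these properties on the generic module, so the subclass mostly buys a searchable name.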

Thks
Amol


On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
chaitanya@datatorrent.com> wrote:

> Hi Chandni,
>
>   It's a good point. I created the hierarchy from the user's perspective,
> especially for non-Java users. If I return FileSplitter and BlockReader
> from the FS Input Module, then this module works for NFS. But from the
> user's perspective it would be difficult to tell whether this module works
> for NFS or any other file system.
>
> Regards,
> Chaitanya
>
> On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <ch...@datatorrent.com>
> wrote:
>
> > I am sorry, Chaitanya, but I have more questions about this.
> >
> > 1. Why is the FS Input Module abstract when by default it can return
> > FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
> > These implementations are not specific to NFS.
> >
> > 2. In the NFS module that you have suggested creating, what is specific
> > to NFS?
> >
> > Please note: I have created a ticket, APEXMALHAR-2081, to remove
> > FSFileSplitter from the library and move its feature to the base operator.
> >
> > Thanks,
> > Chandni
> >
> > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
> > chaitanya@datatorrent.com> wrote:
> >
> > > FSFileSplitter & BlockReader are available in com.datatorrent.lib.io.fs
> > > package.
> > >
> > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <singh.chandni@gmail.com>
> > > wrote:
> > >
> > > > Ok. What is specific about the fileSplitter and blockReader returned
> > > > by this implementation?
> > > >
> > > >
> > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <chaitanya@datatorrent.com>
> > > > wrote:
> > > >
> > > > > Hi Chandni,
> > > > >
> > > > > Property-wise, nothing specific. The FS Input Module is an abstract
> > > > > module, and the NFS Module implements the abstract methods
> > > > > createFileSplitter() and createBlockReader().
> > > > >
> > > > > Regards,
> > > > > Chaitanya
> > > > >
> > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <singh.chandni@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Chaitanya,
> > > > > >
> > > > > > What will be specific to the NFS Input Module that is not provided
> > > > > > by the FS Input Module?
> > > > > >
> > > > > > Thanks,
> > > > > > Chandni
> > > > > >
> > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <amol@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > Thks
> > > > > > > Amol
> > > > > > >
> > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh
> > > > > > > <sandeep@datatorrent.com> wrote:
> > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Sandeep
> > > > > > > >
> > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani
> > > > > > > > <mohit@datatorrent.com> wrote:
> > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > > Mohit
> > > > > > > > >
> > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu
> > > > > > > > > <chaitanya@datatorrent.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi All,
> > > > > > > > > >
> > > > > > > > > >   I am proposing an NFS Input Module. The use case is to
> > > > > > > > > > read large files from NFS in parallel.
> > > > > > > > > >
> > > > > > > > > >  Design of NFS input module:
> > > > > > > > > >
> > > > > > > > > >    There is a common interface, "FSInputModule", in Malhar
> > > > > > > > > > for input modules. The NFS Input Module extends
> > > > > > > > > > FSInputModule and can be built using the FSFileSplitter
> > > > > > > > > > and BlockReader operators.
> > > > > > > > > >
> > > > > > > > > >   Please share your thoughts on this.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Chaitanya
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: NFS Input Module

Posted by Chandni Singh <si...@gmail.com>.
Hi,

I see HDFSFileCopyModule and HDFSFileMerger in the library as well. Since
we are so close to the release and I am not sure whether these classes are
specific to HDFS, I am going to mark them Evolving so that we can address
this afterwards and change the names if that is suitable.

Thanks,
Chandni

On Sat, May 7, 2016 at 2:17 PM, Chandni Singh <si...@gmail.com>
wrote:

> I can help Dev.
>
> Thanks,
> Chandni
>
> On Sat, May 7, 2016 at 1:23 PM, Amol Kekre <am...@datatorrent.com> wrote:
>
>> We do have docs on apache.org. Would love to see a very extensive and deep
>> doc on this topic.
>>
>> Should we add "How to ..." sections?
>>
>> @dev, thks for volunteering. Any more volunteers?
>>
>> Thks,
>> Amol
>>
>>
>> On Sat, May 7, 2016 at 12:20 PM, Devendra Tagare <
>> devendrat@datatorrent.com>
>> wrote:
>>
>> > @Thomas,@Amol I would like to contribute/collaborate on this.
>> >
>> > Will create a ticket for the same.
>> >
>> > Thanks,
>> > Dev
>> >
>> > On Sat, May 7, 2016 at 11:04 AM, Thomas Weise <th...@datatorrent.com>
>> > wrote:
>> >
>> > > The documentation is here and is indexed:
>> > >
>> > > http://apex.apache.org/docs/malhar/
>> > >
>> > > I think this is a matter of enhancing it.
>> > >
>> > >
>> > > On Sat, May 7, 2016 at 9:18 AM, Amol Kekre <am...@datatorrent.com>
>> > > wrote:
>> > >
>> > > > Thomas and I talked. Both of us agree that a white paper is due to
>> > > > get going. Google index clearly beats "find . | grep ..." in this
>> > > > day and age.
>> > > >
>> > > > The white paper would walk through and have data on HDFS, FTP, NFS,
>> > > > S3, maybe even example apps (could be app properties) accompanying
>> > > > this.
>> > > >
>> > > > So any volunteers?
>> > > >
>> > > > Thks
>> > > > Amol
>> > > >
>> > > >
>> > > > On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <thomas@datatorrent.com>
>> > > > wrote:
>> > > >
>> > > > > Do we have other projects that create dummy classes for every
>> > > > > possible mounted file system just so that the user knows that's
>> > > > > possible? The capability that matters here from the app
>> > > > > perspective is the local file system, and every developer in the
>> > > > > Hadoop ecosystem should understand that.
>> > > > >
>> > > > > If the operator doesn't have anything specific to NFS then there
>> > > > > is no place for it in the library (it would be confusing, not
>> > > > > helpful).
>> > > > >
>> > > > > There should be a different approach for pre-configured operators
>> > > > > that doesn't involve writing Java code.
>> > > > >
>> > > > > Thomas
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <am...@datatorrent.com>
>> > > > > wrote:
>> > > > >
>> > > > > > I am not suggesting duplicating code; extend the operators. Just
>> > > > > > add something (may not even be a function) that can be viewed as
>> > > > > > specific to a particular source. Say for NFS, it may be as simple
>> > > > > > as changing a default. A file with NFS in its name helps a great
>> > > > > > deal with adoption.
>> > > > > >
>> > > > > > Thks
>> > > > > > Amol
>> > > > > >
>> > > > > >
>> > > > > > On Thu, May 5, 2016 at 11:45 AM, Chandni Singh
>> > > > > > <singh.chandni@gmail.com> wrote:
>> > > > > >
>> > > > > > > IMO this is not a good idea.
>> > > > > > >
>> > > > > > > We are proposing to add additional Java code which is generic
>> > > > > > > (works with HDFS, NFS, local FS) but just calling it something
>> > > > > > > specific - NFS. IMO this is much more confusing to users.
>> > > > > > >
>> > > > > > > If we want to make it easier for users to find out that the FS
>> > > > > > > Module supports writing to NFS then maybe we need to improve
>> > > > > > > documentation or highlight it somewhere else.
>> > > > > > >
>> > > > > > > Adding java classes means more maintenance overhead and here
>> > > > > > > these classes are not doing anything additional.
>> > > > > > >
>> > > > > > > Thanks,
>> > > > > > > Chandni
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani
>> > > > > > > <mohit@datatorrent.com> wrote:
>> > > > > > >
>> > > > > > > > +1 on Sandeep's suggestion. This would make an end user's
>> > > > > > > > life a lot easier!
>> > > > > > > >
>> > > > > > > > Regards,
>> > > > > > > > Mohit
>> > > > > > > >
>> > > > > > > > On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh
>> > > > > > > > <sandeep@datatorrent.com> wrote:
>> > > > > > > >
>> > > > > > > > > I do agree with Amol on having clear and explicit modules.
>> > > > > > > > > This is more from an end user perspective. For someone who
>> > > > > > > > > is new to Apex, having separate NFS, HDFS, FTP, etc. modules
>> > > > > > > > > would make a lot more sense than one generic FS module.
>> > > > > > > > > However small the changes in these modules may be, like
>> > > > > > > > > just a couple of small functions, I would like to have them
>> > > > > > > > > separate for the end user.
>> > > > > > > > >
>> > > > > > > > > It is finally about the perspective and the user experience
>> > > > > > > > > :)
>> > > > > > > > >
>> > > > > > > > > Regards,
>> > > > > > > > > Sandeep
>> > > > > > > > >
>> > > > > > > > > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise
>> > > > > > > > > <thomas@datatorrent.com> wrote:
>> > > > > > > > >
>> > > > > > > > > > I don't think we should name something NFS* when it isn't
>> > > > > > > > > > specific to NFS. It is just like any other local FS for
>> > > > > > > > > > this purpose and that's already covered by the Hadoop
>> > > > > > > > > > file system abstraction.
>> > > > > > > > > >
>> > > > > > > > > > Why can't a single FS Input module accommodate all of
>> > > > > > > > > > this? Once you know the FS URL, you can automatically
>> > > > > > > > > > optimize the configuration, if appropriate.
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > > Thomas

Re: NFS Input Module

Posted by Chandni Singh <si...@gmail.com>.
I can help Dev.

Thanks,
Chandni


Re: NFS Input Module

Posted by Amol Kekre <am...@datatorrent.com>.
We do have docs on apache.org. Would love to have a very extensive and deep
doc on this topic.

Should we add "How to ..." sections?

@Dev, thks for volunteering. Any more volunteers?

Thks,
Amol


On Sat, May 7, 2016 at 12:20 PM, Devendra Tagare <de...@datatorrent.com>
wrote:

> @Thomas, @Amol, I would like to contribute/collaborate on this.
>
> Will create a ticket for the same.
>
> Thanks,
> Dev

Re: NFS Input Module

Posted by Devendra Tagare <de...@datatorrent.com>.
@Thomas, @Amol, I would like to contribute/collaborate on this.

Will create a ticket for the same.

Thanks,
Dev

On Sat, May 7, 2016 at 11:04 AM, Thomas Weise <th...@datatorrent.com>
wrote:

> The documentation is here and is indexed:
>
> http://apex.apache.org/docs/malhar/
>
> I think this is a matter of enhancing it.

Re: NFS Input Module

Posted by Thomas Weise <th...@datatorrent.com>.
The documentation is here and is indexed:

http://apex.apache.org/docs/malhar/

I think this is a matter of enhancing it.


On Sat, May 7, 2016 at 9:18 AM, Amol Kekre <am...@datatorrent.com> wrote:

> Thomas and I talked. Both of us agree that a white paper is due to get
> going. Google index clearly beats "find . | grep ..." in this day and age.
>
> The white paper would walk through and have data on HDFS, FTP, NFS, S3,
> maybe even example apps (could be app properties) accompanying this.
>
> So any volunteers?
>
> Thks
> Amol
>
>
> On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <th...@datatorrent.com>
> wrote:
>
> > Do we have other projects that create dummy classes for every possible
> > mounted file system just so that the user knows that's possible? The
> > capability that matters here from app perspective is local file system
> > and every developer in the Hadoop ecosystem should understand that.
> >
> > If the operator doesn't have anything specific to NFS then there is no
> > place for it in the library (it would be confusing, not helpful).
> >
> > There should be a different approach for pre-configured operators that
> > doesn't involve writing Java code.
> >
> > Thomas
> >
> >
> >
> > On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <am...@datatorrent.com> wrote:
> >
> > > I am not suggesting duplicating code; extend the operators. Just add
> > > something (may not even be a function) that can be viewed as specific
> > > to a particular source. Say for NFS, it may be as simple as changing
> > > a default. A file with NFS in its name helps a great deal with adoption.
> > >
> > > Thks
> > > Amol
> > >
> > >
> > > On Thu, May 5, 2016 at 11:45 AM, Chandni Singh <singh.chandni@gmail.com>
> > > wrote:
> > >
> > > > IMO this is not a good idea.
> > > >
> > > > We are proposing to add additional Java code which is generic (works
> > > > with HDFS, NFS, local FS) but just calling it something specific - NFS.
> > > > IMO this is much more confusing to users.
> > > >
> > > > If we want to make it easier for users to find out that the FS Module
> > > > supports writing to NFS then maybe we need to improve documentation
> > > > or highlight it somewhere else.
> > > >
> > > > Adding java classes means more maintenance overhead and here these
> > > > classes are not doing anything additional.
> > > >
> > > > Thanks,
> > > > Chandni
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani <mohit@datatorrent.com>
> > > > wrote:
> > > >
> > > > > +1 on Sandeep's suggestion. This would make an end user's life a lot
> > > > > easier!
> > > > >
> > > > > Regards,
> > > > > Mohit
> > > > >
> > > > > On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh
> > > > > <sandeep@datatorrent.com> wrote:
> > > > >
> > > > > > I do agree with Amol on having clear and explicit modules. This is
> > > > > > more from an end user perspective. For someone who is new to Apex,
> > > > > > having separate NFS, HDFS, FTP, etc. modules would make a lot more
> > > > > > sense than one generic FS module. However small a change these
> > > > > > modules may have, like just a couple of small functions, I would
> > > > > > like to have them separate for the end user.
> > > > > >
> > > > > > It is finally about the perspective and the user experience :)
> > > > > >
> > > > > > Regards,
> > > > > > Sandeep
> > > > > >
> > > > > > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <thomas@datatorrent.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I don't think we should name something NFS* when it isn't specific
> > > > > > > to NFS. It is just like any other local FS for this purpose and
> > > > > > > that's already covered by the Hadoop file system abstraction.
> > > > > > >
> > > > > > > Why can't a single FS Input module accommodate all of this? Once
> > > > > > > you know the FS URL, you can automatically optimize the
> > > > > > > configuration, if appropriate.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Thomas
> > > > > > >
> > > > > > >
> > > > > > > On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
> > > > > > > chaitanya@datatorrent.com> wrote:
> > > > > > >
> > > > > > > > Hi Chandni,
> > > > > > > >
> > > > > > > >   It's a good point. I created the hierarchy based on the user
> > > > > > > > perspective and especially for non-Java users. If I return
> > > > > > > > FileSplitter and BlockReader from FS Input Module, then this
> > > > > > > > module works for NFS. But from the user's perspective it would be
> > > > > > > > difficult to tell whether this module works for NFS or any other
> > > > > > > > file system.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Chaitanya
> > > > > > > >
> > > > > > > > On Thu, May 5, 2016 at 11:05 AM, Chandni Singh
> > > > > > > > <chandni@datatorrent.com> wrote:
> > > > > > > >
> > > > > > > > > I am sorry Chaitanya but I have more questions about this
> > > > > > > > >
> > > > > > > > > 1. why is the FS Input Module abstract when by default it can
> > > > > > > > > return FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
> > > > > > > > >  These implementations are not specific to NFS.
> > > > > > > > >
> > > > > > > > > 2. In the NFS module that you have suggested to create, what is
> > > > > > > > > specific to NFS?
> > > > > > > > >
> > > > > > > > > Please note: I have created a ticket APEXMALHAR-2081 to remove
> > > > > > > > > FSFileSplitter from library and move its feature to the base
> > > > > > > > > operator.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Chandni
> > > > > > > > >
> > > > > > > > > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
> > > > > > > > > chaitanya@datatorrent.com> wrote:
> > > > > > > > >
> > > > > > > > > > FSFileSplitter & BlockReader are available in
> > > > > > > > > > com.datatorrent.lib.io.fs package.
> > > > > > > > > >
> > > > > > > > > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh
> > > > > > > > > > <singh.chandni@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Ok. What is specific about the fileSplitter and blockReader
> > > > > > > > > > > returned by this implementation?
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu"
> > > > > > > > > > > <chaitanya@datatorrent.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Chandni,
> > > > > > > > > > > >
> > > > > > > > > > > > Properties wise nothing specific. FS Input Module is an
> > > > > > > > > > > > abstract Module and NFS Module implements the abstract
> > > > > > > > > > > > methods - createFileSplitter() and createBlockReader().
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > Chaitanya
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <
> > > > > > > > > singh.chandni@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Chaitanya,
> > > > > > > > > > > > >
> > > > > > > > > > > > > What will be specific in NFS Input Module that is
> not
> > > > > > provided
> > > > > > > by
> > > > > > > > > FS
> > > > > > > > > > > > Input
> > > > > > > > > > > > > Module?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Chandni
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <
> > > > > > > amol@datatorrent.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thks
> > > > > > > > > > > > > > Amol
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep
> Deshmukh <
> > > > > > > > > > > > > sandeep@datatorrent.com
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > Sandeep
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani
> <
> > > > > > > > > > > > mohit@datatorrent.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > Mohit
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya
> > > Chebolu
> > > > <
> > > > > > > > > > > > > > > > chaitanya@datatorrent.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >   I am proposing NFS Input Module. Use case
> > is
> > > to
> > > > > > read
> > > > > > > > > large
> > > > > > > > > > > > files
> > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > NFS
> > > > > > > > > > > > > > > > > in parallel.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >  Design of NFS input module:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >    There is a common interface
> > "FSInputModule"
> > > in
> > > > > > > Malhar
> > > > > > > > > for
> > > > > > > > > > > the
> > > > > > > > > > > > > > input
> > > > > > > > > > > > > > > > > Modules. NFS input Module extends from
> > > > > FSInputModule
> > > > > > > and
> > > > > > > > > can
> > > > > > > > > > be
> > > > > > > > > > > > > > > achieved
> > > > > > > > > > > > > > > > by
> > > > > > > > > > > > > > > > > using FSFileSplitter and BlockReader
> > operators.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >   Please share your thoughts on this.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > Chaitanya
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: NFS Input Module

Posted by Amol Kekre <am...@datatorrent.com>.
Thomas and I talked. Both of us agree that a white paper is due to get
going. Google index clearly beats "find . | grep ..." in this day and age.

The white paper would walk through HDFS, FTP, NFS, and S3 with data on each,
and could be accompanied by example apps (possibly just app properties).

So any volunteers?

Thks
Amol


On Thu, May 5, 2016 at 5:10 PM, Thomas Weise <th...@datatorrent.com> wrote:

> Do we have other projects that create dummy classes for every possible
> mounted file system just so that the user knows that's possible? The
> capability that matters here from the app's perspective is the local file
> system, and every developer in the Hadoop ecosystem should understand that.
>
> If the operator doesn't have anything specific to NFS then there is no
> place for it in the library (it would be confusing, not helpful).
>
> There should be a different approach for pre-configured operators that
> doesn't involve writing Java code.
>
> Thomas

Re: NFS Input Module

Posted by Thomas Weise <th...@datatorrent.com>.
Do we have other projects that create dummy classes for every possible
mounted file system just so that the user knows that's possible? The
capability that matters here from the app's perspective is the local file
system, and every developer in the Hadoop ecosystem should understand that.

If the operator doesn't have anything specific to NFS then there is no
place for it in the library (it would be confusing, not helpful).

There should be a different approach for pre-configured operators that
doesn't involve writing Java code.

Thomas



On Thu, May 5, 2016 at 3:10 PM, Amol Kekre <am...@datatorrent.com> wrote:

> I am not suggesting duplicating code; extend the operators. Just add
> something (may not even be a function) that can be viewed as specific to a
> particular source. Say for NFS, it may be as simple as changing a default.
> A file with NFS in its name helps a great deal with adoption.
>
> Thks
> Amol

Re: NFS Input Module

Posted by Amol Kekre <am...@datatorrent.com>.
I am not suggesting duplicating code; extend the operators. Just add
something (may not even be a function) that can be viewed as specific to a
particular source. Say for NFS, it may be as simple as changing a default.
A file with NFS in its name helps a great deal with adoption.

Thks
Amol


On Thu, May 5, 2016 at 11:45 AM, Chandni Singh <si...@gmail.com>
wrote:

> IMO this is not a good idea.
>
> We are proposing to add additional Java code which is generic (works with
> HDFS, NFS, local FS) but just calling it something specific - NFS. IMO this
> is much more confusing to users.
>
> If we want to make it easier for users to find out that the FS Module
> supports reading from NFS then maybe we need to improve documentation or
> highlight it somewhere else.
>
> Adding Java classes means more maintenance overhead and here these classes
> are not doing anything additional.
>
> Thanks,
> Chandni

Re: NFS Input Module

Posted by Chandni Singh <si...@gmail.com>.
IMO this is not a good idea.

We are proposing to add additional Java code which is generic (works with
HDFS, NFS, local FS) but just calling it something specific - NFS. IMO this
is much more confusing to users.

If we want to make it easier for users to find out that the FS Module
supports reading from NFS then maybe we need to improve documentation or
highlight it somewhere else.

Adding Java classes means more maintenance overhead and here these classes
are not doing anything additional.

Thanks,
Chandni
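
The point about generic code can be illustrated with a minimal sketch. It
assumes a hypothetical helper, FsInputDefaults, which is not part of Malhar,
and the block sizes are illustrative values only. Since an NFS mount appears to
the application as an ordinary local path, one code path can serve NFS, the
local FS, and HDFS, with any per-source tuning keyed off the URI scheme rather
than a separately named class.

```java
import java.net.URI;

// Hypothetical helper, not part of Malhar: one generic code path for all sources.
public class FsInputDefaults {

    // Pick an illustrative default block size from the URI scheme.
    // The numbers are assumptions for this sketch, not Malhar defaults.
    public static long defaultBlockSize(URI input) {
        // A bare path such as /mnt/nfs/data has no scheme; treat it as local.
        String scheme = input.getScheme() == null ? "file" : input.getScheme();
        switch (scheme) {
            case "hdfs":
                return 128L * 1024 * 1024;
            case "file": // local disk or an NFS mount point
                return 64L * 1024 * 1024;
            default:
                return 32L * 1024 * 1024;
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultBlockSize(URI.create("/mnt/nfs/data/big.log")));
        System.out.println(defaultBlockSize(URI.create("hdfs://namenode:8020/data/big.log")));
    }
}
```

Nothing in the sketch is specific to NFS: a path under an NFS mount point takes
the same branch as any other local path.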





On Thu, May 5, 2016 at 11:24 AM, Mohit Jotwani <mo...@datatorrent.com>
wrote:

> +1 on Sandeep's suggestion. This would make an end user's life a lot
> easier!
>
> Regards,
> Mohit
>
> On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh
> <sandeep@datatorrent.com> wrote:
>
> > I do agree with Amol on having clear and explicit modules. This is more
> > from an end user's perspective. For someone who is new to Apex, having
> > separate NFS, HDFS, FTP, etc. modules would make a lot more sense than one
> > generic FS module. However small the differences between these modules,
> > even just a couple of small functions, I would like to have them separate
> > for the end user.
> >
> > It is finally about the perspective and the user experience :)
> >
> > Regards,
> > Sandeep
> >
> > On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <th...@datatorrent.com>
> > wrote:
> >
> > > I don't think we should name something NFS* when it isn't specific to
> > > NFS. It is just like any other local FS for this purpose and that's
> > > already covered by the Hadoop file system abstraction.
> > >
> > > Why can't a single FS Input module accommodate all of this? Once you
> > > know the FS URL, you can automatically optimize the configuration, if
> > > appropriate.
> > >
> > > Thanks,
> > > Thomas
> > >
> > >
> > > On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
> > > chaitanya@datatorrent.com> wrote:
> > >
> > > > Hi Chandni,
> > > >
> > > >   It's a good point. I created the hierarchy from the user's
> > > > perspective, especially for non-Java users. If I return FileSplitter
> > > > and BlockReader from FS Input Module, then this module works for NFS.
> > > > But from the user's perspective it would be difficult to tell whether
> > > > this module works for NFS or any other file system.
> > > >
> > > > Regards,
> > > > Chaitanya
> > > >
> > > > On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <
> > chandni@datatorrent.com>
> > > > wrote:
> > > >
> > > > > I am sorry Chaitanya but I have more questions about this
> > > > >
> > > > > 1. why is the FS Input Module abstract when by default it can
> return
> > > > > FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
> > > > >  These implementations are not specific to NFS.
> > > > >
> > > > > 2. In the NFS module that you have suggested to create, what is
> > > specific
> > > > to
> > > > > NFS?
> > > > >
> > > > > Please note: I have created a ticket APEXMALHAR-2081 to remove
> > > > > FSFileSplitter from library and move its feature to the base
> > operator.
> > > > >
> > > > > Thanks,
> > > > > Chandni
> > > > >
> > > > > On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
> > > > > chaitanya@datatorrent.com> wrote:
> > > > >
> > > > > > FSFileSplitter & BlockReader are available in
> > > com.datatorrent.lib.io.fs
> > > > > > package.
> > > > > >
> > > > > > On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <
> > > > singh.chandni@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Ok. What is specific about the fileSplitter and blockReader
> > > returned
> > > > by
> > > > > > > this implementation?
> > > > > > >
> > > > > > >
> > > > > > > On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <
> > > > chaitanya@datatorrent.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Chandni,
> > > > > > > >
> > > > > > > > Properties wise nothing specific. FS Input Module is an
> > abstract
> > > > > Module
> > > > > > > and
> > > > > > > > NFS Module implements the abstract methods -
> > createFileSplitter()
> > > > and
> > > > > > > > createBlockReader().
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Chaitanya
> > > > > > > >
> > > > > > > > On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <
> > > > > singh.chandni@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Chaitanya,
> > > > > > > > >
> > > > > > > > > What will be specific in NFS Input Module that is not
> > provided
> > > by
> > > > > FS
> > > > > > > > Input
> > > > > > > > > Module?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Chandni
> > > > > > > > >
> > > > > > > > > On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <
> > > amol@datatorrent.com
> > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Thks
> > > > > > > > > > Amol
> > > > > > > > > >
> > > > > > > > > > On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh <
> > > > > > > > > sandeep@datatorrent.com
> > > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > +1
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > > Sandeep
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani <
> > > > > > > > mohit@datatorrent.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > +1
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > > Mohit
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu <
> > > > > > > > > > > > chaitanya@datatorrent.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > >
> > > > > > > > > > > > >   I am proposing NFS Input Module. Use case is to
> > read
> > > > > large
> > > > > > > > files
> > > > > > > > > > from
> > > > > > > > > > > > NFS
> > > > > > > > > > > > > in parallel.
> > > > > > > > > > > > >
> > > > > > > > > > > > >  Design of NFS input module:
> > > > > > > > > > > > >
> > > > > > > > > > > > >    There is a common interface "FSInputModule" in
> > > Malhar
> > > > > for
> > > > > > > the
> > > > > > > > > > input
> > > > > > > > > > > > > Modules. NFS input Module extends from
> FSInputModule
> > > and
> > > > > can
> > > > > > be
> > > > > > > > > > > achieved
> > > > > > > > > > > > by
> > > > > > > > > > > > > using FSFileSplitter and BlockReader operators.
> > > > > > > > > > > > >
> > > > > > > > > > > > >   Please share your thoughts on this.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > Chaitanya
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: NFS Input Module

Posted by Mohit Jotwani <mo...@datatorrent.com>.
+1 on Sandeep's suggestion. This would make an end user's life a lot
easier!

Regards,
Mohit

On Thu, May 5, 2016 at 11:51 PM, Sandeep Deshmukh <sa...@datatorrent.com>
wrote:

> I do agree with Amol on having clear and explicit modules. This is more
> from an end user perspective. For someone who is new to Apex, having
> separate NFS, HDFS, FTP, etc would make lot more sense than one generic FS
> module. However small change these modules may have, like just couple of
> small functions, I would like to have them separate for the end user.
>
> It is finally about the perspective and the user experience :)
>
> Regards,
> Sandeep

Re: NFS Input Module

Posted by Sandeep Deshmukh <sa...@datatorrent.com>.
I do agree with Amol on having clear and explicit modules. This is more
from an end-user perspective. For someone who is new to Apex, having
separate NFS, HDFS, FTP, etc. modules would make a lot more sense than one
generic FS module. However small the differences between these modules may
be, even just a couple of small functions, I would like to have them
separate for the end user.

In the end, it is about the perspective and the user experience :)

Regards,
Sandeep

On Thu, May 5, 2016 at 8:48 PM, Thomas Weise <th...@datatorrent.com> wrote:

> I don't think we should name something NFS* when it isn't specific to NFS.
> It is just like any other local FS for this purpose and that's already
> covered by the Hadoop file system abstraction.
>
> Why can't a single FS Input module accommodate all of this. Once you know
> the FS URL, you can automatically optimize the configuration, if
> appropriate.
>
> Thanks,
> Thomas

Re: NFS Input Module

Posted by Thomas Weise <th...@datatorrent.com>.
I don't think we should name something NFS* when it isn't specific to NFS.
It is just like any other local FS for this purpose and that's already
covered by the Hadoop file system abstraction.

Why can't a single FS Input module accommodate all of this? Once you know
the FS URL, you can automatically optimize the configuration, if
appropriate.
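
A minimal sketch of the idea above, not Malhar code: a single module could
look at the scheme of the FS URL and choose its own defaults. The class name
FsDefaultsSketch, the method profileFor, and the profile labels are all
hypothetical and exist only for illustration; java.net.URI is the only real
API used. An NFS mount is assumed to appear as an ordinary local path.

```java
import java.net.URI;
import java.util.Locale;

// Hypothetical helper for illustration; not part of Apex or Malhar.
class FsDefaultsSketch {

    // Maps an FS URL to the name of a (made-up) default configuration profile.
    static String profileFor(String fsUrl) {
        String scheme = URI.create(fsUrl).getScheme();
        if (scheme == null) {
            // A bare path such as /mnt/nfs/data has no scheme; treat it as local.
            scheme = "file";
        }
        switch (scheme.toLowerCase(Locale.ROOT)) {
            case "hdfs":
                return "hdfs-defaults";
            case "file":
                // Local disk, or an NFS mount seen through a local path.
                return "local-defaults";
            default:
                return "generic-defaults";
        }
    }

    public static void main(String[] args) {
        System.out.println(profileFor("hdfs://namenode:8020/data"));
        System.out.println(profileFor("file:///mnt/nfs/data"));
    }
}
```

With this shape the user supplies only the URL, and no file-system-specific
class is needed to get file-system-specific defaults.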

Thanks,
Thomas


On Thu, May 5, 2016 at 12:08 AM, Chaitanya Chebolu <
chaitanya@datatorrent.com> wrote:

> Hi Chandni,
>
>   Its a good point. I created the hierarchy based on user perspective and
> especially for non Java users. If I return FileSplitter and BlockReader
> from FS Input Module, then this module works for NFS. But, for users
> perspective it would be difficult, whether this module works for NFS or any
> other fileSystem.
>
> Regards,
> Chaitanya

Re: NFS Input Module

Posted by Chaitanya Chebolu <ch...@datatorrent.com>.
Hi Chandni,

  It's a good point. I created the hierarchy from the user's perspective,
especially for non-Java users. If I return FileSplitter and BlockReader
from FS Input Module, then this module works for NFS. But from a user's
perspective, it would be difficult to tell whether this module works for NFS
or any other file system.

Regards,
Chaitanya

On Thu, May 5, 2016 at 11:05 AM, Chandni Singh <ch...@datatorrent.com>
wrote:

> I am sorry Chaitanya but I have more questions about this
>
> 1. why is the FS Input Module abstract when by default it can return
> FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
>  These implementations are not specific to NFS.
>
> 2. In the NFS module that you have suggested to create, what is specific to
> NFS?
>
> Please note: I have created a ticket APEXMALHAR-2081 to remove
> FSFileSplitter from library and move its feature to the base operator.
>
> Thanks,
> Chandni

Re: NFS Input Module

Posted by Chandni Singh <ch...@datatorrent.com>.
I am sorry, Chaitanya, but I have more questions about this.

1. Why is the FS Input Module abstract when by default it can return the
FileSplitter & BlockReader in com.datatorrent.lib.io.fs?
These implementations are not specific to NFS.

2. In the NFS module that you have suggested to create, what is specific to
NFS?

Please note: I have created a ticket APEXMALHAR-2081 to remove
FSFileSplitter from library and move its feature to the base operator.
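
To make question 1 concrete, here is a rough sketch of the alternative: a
concrete base module whose factory methods already return generic defaults,
so no per-file-system subclass is required. SplitterStub, ReaderStub, and
GenericInputModuleSketch are hypothetical stand-ins written for this
illustration; they are not the Malhar classes and do not reflect their
actual signatures.

```java
// Hypothetical stand-ins for the operators discussed in this thread.
class SplitterStub {
    String name() {
        return "FileSplitter";
    }
}

class ReaderStub {
    String name() {
        return "BlockReader";
    }
}

// Concrete base: usable as-is; subclasses override only if they need to.
class GenericInputModuleSketch {

    SplitterStub createFileSplitter() {
        return new SplitterStub();
    }

    ReaderStub createBlockReader() {
        return new ReaderStub();
    }

    // Describes the operator chain the module would wire together.
    String describe() {
        return createFileSplitter().name() + " -> " + createBlockReader().name();
    }
}
```

A subclass would only be justified here if it overrode one of the factory
methods with behaviour that is actually specific to a file system.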

Thanks,
Chandni

On Wed, May 4, 2016 at 10:29 PM, Chaitanya Chebolu <
chaitanya@datatorrent.com> wrote:

> FSFileSplitter & BlockReader are available in com.datatorrent.lib.io.fs
> package.

Re: NFS Input Module

Posted by Chaitanya Chebolu <ch...@datatorrent.com>.
FSFileSplitter & BlockReader are available in the com.datatorrent.lib.io.fs
package.

On Thu, May 5, 2016 at 10:47 AM, Chandni Singh <si...@gmail.com>
wrote:

> Ok. What is specific about the fileSplitter and blockReader returned by
> this implementation?

Re: NFS Input Module

Posted by Chandni Singh <si...@gmail.com>.
Ok. What is specific about the fileSplitter and blockReader returned by
this implementation?


On May 4, 2016 9:43 PM, "Chaitanya Chebolu" <ch...@datatorrent.com>
wrote:

> Hi Chandni,
>
> Properties wise nothing specific. FS Input Module is an abstract Module and
> NFS Module implements the abstract methods - createFileSplitter() and
> createBlockReader().
>
> Regards,
> Chaitanya

Re: NFS Input Module

Posted by Chaitanya Chebolu <ch...@datatorrent.com>.
Hi Chandni,

Property-wise, nothing specific. FS Input Module is an abstract module, and
NFS Module implements its abstract methods - createFileSplitter() and
createBlockReader().
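
Roughly, the hierarchy described here has the following shape. This is only
a sketch under assumptions: the method names createFileSplitter() and
createBlockReader() come from this thread, but FileSplitterStandIn,
BlockReaderStandIn, AbstractInputModuleSketch, and NfsInputModuleSketch are
hypothetical stand-ins, not the Malhar classes or their real signatures.

```java
// Hypothetical stand-ins; the real operators live in Malhar and are not reproduced here.
class FileSplitterStandIn {
    final String label;

    FileSplitterStandIn(String label) {
        this.label = label;
    }
}

class BlockReaderStandIn {
    final String label;

    BlockReaderStandIn(String label) {
        this.label = label;
    }
}

// The abstract module leaves the choice of operators to subclasses.
abstract class AbstractInputModuleSketch {

    abstract FileSplitterStandIn createFileSplitter();

    abstract BlockReaderStandIn createBlockReader();

    // Describes the operator chain the module would wire together.
    String wiring() {
        return createFileSplitter().label + " -> " + createBlockReader().label;
    }
}

// The NFS variant only supplies the two operators; it adds no properties.
class NfsInputModuleSketch extends AbstractInputModuleSketch {

    @Override
    FileSplitterStandIn createFileSplitter() {
        return new FileSplitterStandIn("FSFileSplitter");
    }

    @Override
    BlockReaderStandIn createBlockReader() {
        return new BlockReaderStandIn("BlockReader");
    }
}
```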

Regards,
Chaitanya

On Wed, May 4, 2016 at 9:45 PM, Chandni Singh <si...@gmail.com>
wrote:

> Hi Chaitanya,
>
> What will be specific in NFS Input Module that is not provided by FS Input
> Module?
>
> Thanks,
> Chandni

Re: NFS Input Module

Posted by Chandni Singh <si...@gmail.com>.
Hi Chaitanya,

What will be specific to the NFS Input Module that is not already provided by
the FS Input Module?

Thanks,
Chandni

On Wed, May 4, 2016 at 7:12 AM, Amol Kekre <am...@datatorrent.com> wrote:

> +1
>
> Thks
> Amol

Re: NFS Input Module

Posted by Amol Kekre <am...@datatorrent.com>.
+1

Thks
Amol

On Tue, May 3, 2016 at 10:06 PM, Sandeep Deshmukh <sa...@datatorrent.com>
wrote:

> +1
>
> Regards,
> Sandeep

Re: NFS Input Module

Posted by Sandeep Deshmukh <sa...@datatorrent.com>.
+1

Regards,
Sandeep

On Fri, Apr 29, 2016 at 3:26 PM, Mohit Jotwani <mo...@datatorrent.com>
wrote:

> +1
>
> Regards,
> Mohit

Re: NFS Input Module

Posted by Mohit Jotwani <mo...@datatorrent.com>.
+1

Regards,
Mohit

On Fri, Apr 29, 2016 at 2:09 PM, Chaitanya Chebolu <
chaitanya@datatorrent.com> wrote:

> Hi All,
>
>   I am proposing NFS Input Module. Use case is to read large files from NFS
> in parallel.
>
>  Design of NFS input module:
>
>    There is a common interface "FSInputModule" in Malhar for the input
> Modules. NFS input Module extends from FSInputModule and can be achieved by
> using FSFileSplitter and BlockReader operators.
>
>   Please share your thoughts on this.
>
> Regards,
> Chaitanya
>