Posted to dev@drill.apache.org by Edmon Begoli <eb...@gmail.com> on 2015/09/14 16:11:00 UTC

Design and Implementation Question - related to DRILL-3738

I want to implement support for Excel files, both .xls and .xlsx, through
Apache POI, which will give me access to Java objects including column names
and column values, and will expose the set of types it supports.

(I recorded this issue here:
https://issues.apache.org/jira/browse/DRILL-3738)

In other words, I will be doing file I/O and column r/w through POI, so I
will need to write an adapter in Drill for storage and read/write.

What is the best way to do this?

Which classes should I extend or interfaces implement in order to support
Excel files in Drill?

I could probably still go with the easy storage approach, because POI makes
it easy and the data is tabular, just like CSV, but with extended types
(ints, floats, numerics).

Please give me some guidance in the form of "you need to implement x, y, and
z to make this work".

Thank you in advance,
Edmon

Re: Design and Implementation Question - related to DRILL-3738

Posted by Abdel Hakim Deneche <ad...@maprtech.com>.
Hi Edmon,

So basically you want to read/write data from/to Excel files. I guess you'd
have to add a Format Plugin that extends EasyFormatPlugin. I don't think we
have proper documentation to explain how to do it, but you can look at how
JSONFormatPlugin is implemented.
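
To make the reader side of such a plugin a bit more concrete, here is a minimal sketch of a batch-oriented reader with a setup / next / close lifecycle. Everything in it is an assumption for illustration: SketchRecordReader, InMemorySheetReader, and ExcelReaderSketch are hypothetical names, the interface is a simplified stand-in and not Drill's actual API, and an in-memory list of rows takes the place of a POI sheet.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a batch-oriented reader contract.
// This is NOT Drill's interface; names and signatures are illustrative only.
interface SketchRecordReader extends AutoCloseable {
    // Discover column names, e.g. from a header row.
    List<String> setup();

    // Append up to maxRows rows to batch; return how many were added (0 = done).
    int next(List<Object[]> batch, int maxRows);

    @Override
    void close();
}

// Reads rows from an in-memory table standing in for a spreadsheet sheet.
// A real reader would pull rows and cells from POI instead.
final class InMemorySheetReader implements SketchRecordReader {
    private final List<Object[]> rows;
    private Iterator<Object[]> cursor;

    InMemorySheetReader(List<Object[]> rows) {
        this.rows = rows;
    }

    @Override
    public List<String> setup() {
        cursor = rows.iterator();
        List<String> columns = new ArrayList<>();
        if (cursor.hasNext()) {
            // Treat the first row as the header row.
            for (Object cell : cursor.next()) {
                columns.add(String.valueOf(cell));
            }
        }
        return columns;
    }

    @Override
    public int next(List<Object[]> batch, int maxRows) {
        int count = 0;
        while (count < maxRows && cursor.hasNext()) {
            batch.add(cursor.next());
            count++;
        }
        return count;
    }

    @Override
    public void close() {
        cursor = null;
    }
}

final class ExcelReaderSketch {
    // Drains a reader in fixed-size batches, the way a scan loop might drive it.
    // The caller must have invoked setup() first.
    static List<Object[]> readAll(SketchRecordReader reader, int batchSize) {
        List<Object[]> out = new ArrayList<>();
        while (reader.next(out, batchSize) > 0) {
            // keep pulling batches until the reader reports no more rows
        }
        return out;
    }

    public static void main(String[] args) {
        List<Object[]> sheet = Arrays.asList(
            new Object[] {"id", "price"},
            new Object[] {1.0, 9.5},
            new Object[] {2.0, 12.25});
        try (InMemorySheetReader reader = new InMemorySheetReader(sheet)) {
            System.out.println("columns: " + reader.setup());
            System.out.println("data rows: " + readAll(reader, 1).size());
        }
    }
}
```

The point of the sketch is only the division of labor: schema discovery happens once in setup, and data is handed over in bounded batches rather than row by row. The real method names and arguments should be taken from the JSONFormatPlugin code.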

Thanks


-- 

Abdelhakim Deneche

Software Engineer


Re: Design and Implementation Question - related to DRILL-3738

Posted by Edmon Begoli <eb...@gmail.com>.
(I know it is a busy time with a release, but please don't forget to address
this inquiry once the release rush is over.)
I am working on Excel and EDI pieces, and given that Excel is easier, I
really want to start moving forward and contribute a plug-in.

I just need some serious direction. We can also discuss tomorrow in a
Hangout.

Thank you,
Edmon
