Posted to dev@carbondata.apache.org by Divya Gupta <di...@knoldus.com> on 2017/07/03 10:54:53 UTC

[Discussion] CarbonOutputFormat Implementation

CarbonData has implemented CarbonInputFormat, which enables applications
using Hive, Presto and other similar tools to read data from CarbonData.

Similarly, there should be an implementation of CarbonOutputFormat. This
will enable Hive, Presto or similar applications that use CarbonData as a
data source to write and load data into CarbonData files.

Regards
Divya Gupta

Re: [Discussion] CarbonOutputFormat Implementation

Posted by Liang Chen <ch...@gmail.com>.
Hi

+1 for supporting OutputFormat.

Regards
Liang


--
View this message in context: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-CarbonOutputFormat-Implementation-tp17113p17393.html
Sent from the Apache CarbonData Dev Mailing List archive mailing list archive at Nabble.com.

Re: [Discussion] CarbonOutputFormat Implementation

Posted by Divya Gupta <di...@knoldus.com>.
Thanks Jacky and Venkata for the suggestions. I am working on the design
and will post to this thread if I have any questions. I will share
the design soon.

Regards
Divya Gupta
Project Lead


*Knoldus Software LLP <http://www.knoldus.com/>*
India <http://www.knoldus.in/> - US <http://www.knoldus.com/> - Canada
<http://www.knoldus.ca/>
<http://www.knoldus.com/>
Blog <http://blog.knoldus.com/> | Twitter <https://twitter.com/knolspeak> |
FB <https://www.facebook.com/KnoldusSoftware> | LinkedIn
<http://www.linkedin.com/company/knoldus-software-llp->


Re: [Discussion] CarbonOutputFormat Implementation

Posted by Jacky Li <ja...@qq.com>.
+1.

For CarbonData files, I think there should be at least two OutputFormats:
1) FileOutputFormat, which does no sorting and only writes the carbondata file. This will be used for the GLOBAL_SORT option.
2) TableOutputFormat, which sorts according to the SORT_SCOPE option and uses Single Pass to load.

I also think the dictionary should be a separate OutputFormat,
so that users can combine the dictionary output format with the carbondata file output format.

I suggest first checking the usage scenarios and then deciding on the class hierarchy for this feature.
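
To make the proposed split easier to discuss, here is a minimal Java sketch of one possible hierarchy. It is only an illustration under assumptions: every type name below (RowOutput, FileOutputSketch, TableOutputSketch, DictionaryOutputSketch) is hypothetical, rows are plain strings, the sort scope is reduced to two values, and nothing here uses the Hadoop or CarbonData APIs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// All type names below are hypothetical; none are CarbonData or Hadoop classes.

/** Common contract: take rows, return them in the order they would be written. */
interface RowOutput {
    List<String> write(List<String> rows);
}

/** File-level output: no sorting, rows are written exactly as they arrive. */
class FileOutputSketch implements RowOutput {
    @Override
    public List<String> write(List<String> rows) {
        return new ArrayList<>(rows);
    }
}

/** Table-level output: sorts according to a sort scope, then delegates to the file level. */
class TableOutputSketch implements RowOutput {
    /** Simplified stand-in for the SORT_SCOPE option. */
    enum SortScope { NO_SORT, LOCAL_SORT }

    private final SortScope scope;
    private final RowOutput fileOutput = new FileOutputSketch();

    TableOutputSketch(SortScope scope) {
        this.scope = scope;
    }

    @Override
    public List<String> write(List<String> rows) {
        List<String> copy = new ArrayList<>(rows);
        if (scope == SortScope.LOCAL_SORT) {
            Collections.sort(copy); // stand-in for sorting on the table's sort columns
        }
        return fileOutput.write(copy);
    }
}

/** Dictionary output kept separate so it can be combined with either output above. */
class DictionaryOutputSketch {
    /** Assigns surrogate keys 1, 2, 3, ... to distinct values in first-seen order. */
    Map<String, Integer> build(List<String> values) {
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        for (String value : values) {
            dictionary.putIfAbsent(value, dictionary.size() + 1);
        }
        return dictionary;
    }
}

class OutputFormatSketchDemo {
    public static void main(String[] args) {
        List<String> rows = Arrays.asList("delhi", "beijing", "delhi", "austin");
        System.out.println(new FileOutputSketch().write(rows));
        System.out.println(
            new TableOutputSketch(TableOutputSketch.SortScope.LOCAL_SORT).write(rows));
        System.out.println(new DictionaryOutputSketch().build(rows));
    }
}
```

In this sketch the table-level output reuses the file-level one after sorting, so a caller whose input is already sorted elsewhere could use the file-level output directly. Whether the real classes should be related this way is exactly the kind of question the usage scenarios would need to settle.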

Regards,
Jacky


Re: [Discussion] CarbonOutputFormat Implementation

Posted by Venkata Gollamudi <g....@gmail.com>.
+1
The OutputFormat should be based on single pass and use job
configurations similar to those of CarbonInputFormat.
Please share an initial design and code skeleton for review before
proceeding with the implementation.
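
As a starting point for such a skeleton, the job configuration side might be sketched as below. This is only an assumption about the shape: a plain Map stands in for the Hadoop job Configuration, and the class name, key names and methods are all invented for illustration, not taken from CarbonInputFormat.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: a plain Map stands in for the Hadoop job Configuration,
// and every key and method name here is invented for illustration.
class OutputJobConfigSketch {
    static final String TABLE_PATH = "sketch.carbon.output.tablepath";
    static final String SINGLE_PASS = "sketch.carbon.output.singlepass";

    /** Records the target table location in the job configuration. */
    static void setTablePath(Map<String, String> conf, String path) {
        conf.put(TABLE_PATH, path);
    }

    /** Reads the table location back; fails early if the job never set it. */
    static String getTablePath(Map<String, String> conf) {
        String path = conf.get(TABLE_PATH);
        if (path == null) {
            throw new IllegalStateException("table path is not set");
        }
        return path;
    }

    /** Records whether the load should run in single pass. */
    static void setSinglePass(Map<String, String> conf, boolean enabled) {
        conf.put(SINGLE_PASS, Boolean.toString(enabled));
    }

    /** Defaults to true in this sketch, since the output is meant to be single pass. */
    static boolean isSinglePass(Map<String, String> conf) {
        return Boolean.parseBoolean(conf.getOrDefault(SINGLE_PASS, "true"));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        setTablePath(conf, "/tmp/carbon/store/default/t1");
        System.out.println(getTablePath(conf) + " singlePass=" + isSinglePass(conf));
    }
}
```

Keeping the settings behind static setters and getters on the format class means the writer side and the job setup side agree on key names without sharing anything else.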


Re: [Discussion] CarbonOutputFormat Implementation

Posted by Kumar Vishal <ku...@gmail.com>.
+1
It's a long pending task. 
-Regards 
Kumar Vishal


Re: [Discussion] CarbonOutputFormat Implementation

Posted by Erlu Chen <ch...@gmail.com>.
Thanks very much.

Once you have raised a PR, we can start the review.


Regards.
Chenerlu.




Re: [Discussion] CarbonOutputFormat Implementation

Posted by Divya Gupta <di...@knoldus.com>.
Thanks for the quick reply Chenerlu.

I would surely like to contribute this feature and will start working
towards CARBONDATA-729.

Regards
Divya Gupta
Project Lead



Re: [Discussion] CarbonOutputFormat Implementation

Posted by Erlu Chen <ch...@gmail.com>.
Hi Divya

Thanks for your suggestion.

CarbonData may support it in the near future.

If you would like to contribute this feature, I think it would benefit the
community a lot.


Regards.
Chenerlu.


