Posted to user@beam.apache.org by Kenneth Knowles <ke...@apache.org> on 2018/10/08 14:08:25 UTC

Re: mapping 1k + columns to BQ row

Hi Satyasheel,

It may help if you provide a small piece of example XML, what the POJO
looks like, and the row that you want to write to BigQuery.

Kenn

On Mon, Oct 8, 2018 at 6:39 AM Satya Sheel <sa...@ymail.com> wrote:

> Hi All,
>
> I am Satyasheel, working on a project which uses Beam (with Dataflow as
> the runner) to process streaming data from Pub/Sub.
>
> My question might sound noob here, so please bear with me. I am parsing
> an XML with more than 1000 unique tags, and after some usual processing
> we sink it to BQ. The schema we have in BQ is nested, and some fields are
> repeated as well. The question I am asking might be more related to Java.
>
> What I am doing right now is reading XML --> POJO --> some processing
> --> BQ table row. While converting PCollections to BQ table rows I am
> writing a huge number of *setters*, which is quite manual and not
> efficient. I was wondering if the community has some trick for this. I
> know there is a transform called *JsonToRow* but I am struggling to find
> a working example.
>
> Any help is appreciated.
>
> Regards,
> Satyasheel
>
>
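[Editor's note: one generic way around writing a setter per column is a single loop over the parsed JSON fields. A minimal sketch in plain Java, using nested `Map`/`List` values as a stand-in for both the parsed JSON and a BigQuery `TableRow`; the class and field names here are hypothetical, not from the thread.]

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class JsonFieldsToRow {
    // Copy every field of a parsed-JSON value into a row value, recursing
    // into nested records (Map) and repeated fields (List), so no
    // per-column setter calls are needed even with 1000+ columns.
    static Object toRowValue(Object value) {
        if (value instanceof Map) {
            Map<String, Object> record = new LinkedHashMap<>();
            for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
                record.put(String.valueOf(e.getKey()), toRowValue(e.getValue()));
            }
            return record;
        }
        if (value instanceof List) {
            List<Object> repeated = new ArrayList<>();
            for (Object item : (List<?>) value) {
                repeated.add(toRowValue(item));
            }
            return repeated;
        }
        return value; // leaf value: copied as-is
    }
}
```

With the real BigQuery client classes, the same loop would call `TableRow.set(name, value)` in place of `Map.put`; the recursion already matches nested RECORD and REPEATED fields in the BQ schema.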

Re: mapping 1k + columns to BQ row

Posted by Satya Sheel <sa...@ymail.com>.
Thank you for the response and the pointer; that makes sense.

Thanks,
Satyasheel 
 



Re: mapping 1k + columns to BQ row

Posted by Kenneth Knowles <ke...@apache.org>.
I think you are correct that your question is more related to Java. If you
don't want to author all of those POJO classes, you may want to drop that
layer entirely and do your processing on some JSON representation directly
(so you can just loop over the fields, etc.). If you do want the classes,
you can still loop over the fields by using Java reflection. It is too much
to say more about these approaches here, so I hope this helps you find what
you want.

Kenn
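[Editor's note: the second approach named above, looping over a POJO's fields with Java reflection, can be sketched as below. This assumes public fields and uses a `Map<String, Object>` as a stand-in for `TableRow`; the `Order` POJO is hypothetical.]

```java
import java.lang.reflect.Field;
import java.util.LinkedHashMap;
import java.util.Map;

public class ReflectPojoToRow {
    // Hypothetical POJO standing in for one of the generated classes.
    public static class Order {
        public String id = "A-1";
        public long quantity = 3;
    }

    // Build a column-name -> value map from any POJO's public fields,
    // replacing one hand-written setter call per column with a loop.
    static Map<String, Object> toRow(Object pojo) throws IllegalAccessException {
        Map<String, Object> row = new LinkedHashMap<>();
        for (Field f : pojo.getClass().getFields()) {
            row.put(f.getName(), f.get(pojo));
        }
        return row;
    }
}
```

For private fields with getters, the same loop can walk `getDeclaredFields()` after `setAccessible(true)` instead of `getFields()`.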


Re: mapping 1k + columns to BQ row

Posted by Satya Sheel <sa...@ymail.com>.
Hi Kenn, 

Thank you for the quick reply. What we are doing is first converting the XML to JSON and then doing the processing on that. So I am attaching a zip file which contains a small sample JSON, the POJO for that JSON, and a BQ row for the same.


Thanks, 
Satyasheel 


