Posted to user@flink.apache.org by Tamara Mendt <ta...@gmail.com> on 2015/05/28 12:24:58 UTC
JSON data source for Flink Job
Hello,
I have a JSON file containing multiple JSON objects and wish to use this as
a data source for a Flink Job.
What is the best way to do this?
Cheers,
Tamara
Re: JSON data source for Flink Job
Posted by Tamara Mendt <ta...@gmail.com>.
Ok great, thanks a lot =)
On Thu, May 28, 2015 at 12:39 PM, Stephan Ewen <se...@apache.org> wrote:
> Hi!
>
> This depends a bit on how the JSON is formatted.
>
> If you want the source to be parallelizable, you need a way of
> splitting the file at object boundaries. Is there a character on which you
> can split? If yes, you can use the TextInputFormat (with a custom line break
> character), take the resulting strings, and parse them to JSON with your
> favorite library (such as Jackson).
>
> Stephan
--
Tamara Mendt
Re: JSON data source for Flink Job
Posted by Stephan Ewen <se...@apache.org>.
Hi!
This depends a bit on how the JSON is formatted.
If you want the source to be parallelizable, you need a way of
splitting the file at object boundaries. Is there a character on which you
can split? If yes, you can use the TextInputFormat (with a custom line break
character), take the resulting strings, and parse them to JSON with your
favorite library (such as Jackson).
Stephan
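For illustration, here is a minimal sketch of the split-then-parse idea
described above. It is plain Python using only the standard library, not
Flink code, and it assumes the file holds one compact JSON object per line,
so that a newline never occurs inside an object. The helper names
(split_records, parse_records) are made up for this example. In a Flink job,
the splitting step would presumably be done by the TextInputFormat with the
chosen delimiter, and the parsing step by a map over the resulting strings.

```python
import json


def split_records(text, delimiter="\n"):
    """Cut raw text into record strings at the delimiter.

    Assumes the delimiter never appears inside a JSON object;
    blank chunks (e.g. after a trailing delimiter) are dropped.
    """
    return [chunk for chunk in text.split(delimiter) if chunk.strip()]


def parse_records(text, delimiter="\n"):
    """Parse each record string on its own.

    Because every chunk is a complete JSON object, each one can be
    parsed independently of the others, which is what makes the
    source parallelizable.
    """
    return [json.loads(chunk) for chunk in split_records(text, delimiter)]


# Hypothetical sample input: two objects, one per line.
raw = '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n'
records = parse_records(raw)

for record in records:
    print(record["id"], record["name"])
```

This only works if the delimiter really marks object boundaries. Valid JSON
cannot contain a raw newline inside a string value, so compact
one-object-per-line files are safe to split on newlines, while pretty-printed
JSON that spreads one object over several lines is not.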