Posted to user@flink.apache.org by pietro <pi...@gmail.com> on 2015/03/16 16:57:16 UTC

Most convenient data structure for unspecified length objects

I have to implement a Flink-based program that processes some records.

The peculiarity of those records is that it is not possible to know at
compile time how many fields they contain. Therefore, I cannot use a simple
TupleN data type.

The solution I came up with is to use a tuple with this structure:

(mandatory_field2, ..., mandatory_fieldM, Array[Int], Array[Double],
Array[String])

where the three arrays store the optional fields.
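
As a rough illustration of the layout described above, here is a minimal sketch in plain Scala with no Flink dependency. The names Record, id, name, optInts, optDoubles, optStrings and optionalCount are hypothetical and only stand in for the mandatory and optional fields; a case class is used instead of a raw tuple purely to give the positions names.

```scala
// Hypothetical record type: `id` and `name` stand in for the mandatory fields,
// the three arrays hold however many optional values a record happens to have.
final case class Record(
    id: Int,
    name: String,
    optInts: Array[Int],
    optDoubles: Array[Double],
    optStrings: Array[String]
)

object RecordDemo {
  // Total number of optional values carried by one record.
  def optionalCount(r: Record): Int =
    r.optInts.length + r.optDoubles.length + r.optStrings.length

  // A record with two optional ints, no optional doubles, one optional string.
  val sample: Record =
    Record(1, "a", Array(10, 20), Array.empty[Double], Array("x"))
}
```

One thing to keep in mind with this sketch: Scala arrays compare by reference, so two Record values with identical array contents are not equal under ==.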

It worked, but is it a good way to do that?
Can I do it better?

Many thanks,
regards.




--
View this message in context: http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Most-convenient-data-structure-for-unspecified-length-objects-tp859.html
Sent from the Apache Flink (Incubator) User Mailing List archive at Nabble.com.

Re: Most convenient data structure for unspecified length objects

Posted by Stephan Ewen <se...@apache.org>.
Ah, okay. Then how about using a List of Strings?
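
A minimal sketch of that suggestion, in plain Scala without any Flink dependency. Entry, id, tags and describe are hypothetical names chosen for illustration; the point is only that one List[String] field covers zero, one, or many strings.

```scala
// Hypothetical record: `id` stands in for the mandatory fields,
// `tags` holds zero or more strings.
final case class Entry(id: Int, tags: List[String])

object EntryDemo {
  val none: Entry = Entry(1, Nil)                  // no strings at all
  val many: Entry = Entry(2, List("a", "b", "c"))  // three strings

  // The same code path handles any number of strings.
  def describe(e: Entry): String =
    s"${e.id}: ${e.tags.length} string(s) [${e.tags.mkString(", ")}]"
}
```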

On Mon, Mar 16, 2015 at 5:34 PM, pietro <pi...@gmail.com> wrote:

> Hi Stephan, thanks for the reply!
>
> My problem is that I cannot know whether I will have 0, 1, 2, ... or more
> strings, so Option is not going to help in my case :(

Re: Most convenient data structure for unspecified length objects

Posted by pietro <pi...@gmail.com>.
Hi Stephan, thanks for the reply!

My problem is that I cannot know whether I will have 0, 1, 2, ... or more
strings, so Option is not going to help in my case :(




Re: Most convenient data structure for unspecified length objects

Posted by Stephan Ewen <se...@apache.org>.
Hi!

If you are programming in Scala, you can always use "Option[String]" for an
optional String field.

Stephan
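
For reference, a minimal sketch of an Option[String] field in plain Scala. Person, name, nickname and label are hypothetical names, not part of the thread; the sketch only shows a field that is either present once or absent.

```scala
// Hypothetical record: `name` is mandatory, `nickname` may be absent.
final case class Person(name: String, nickname: Option[String])

object PersonDemo {
  // Use the nickname when it is present, otherwise fall back to the name.
  def label(p: Person): String = p.nickname.getOrElse(p.name)
}
```

Option models "zero or one" value, so it fits a single optional field rather than a variable number of them.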



Re: Most convenient data structure for unspecified length objects

Posted by Vinh June <ho...@gmail.com>.
I ran into the same problem.
I think it depends on the input data; in my case it is a CSV of unknown size.
My solution is to read it as text, then process each line and add the values
to a Map or an Array of type Any.
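
A minimal sketch of that approach in plain Scala, applied to a single line that has already been read as text. CsvDemo, parseField, parseLine and toMap are hypothetical helpers, and the parsing rules (comma separator, try Int, then Double, otherwise keep the String) are assumptions made for illustration; quoted fields and escaping are not handled.

```scala
import scala.util.Try

object CsvDemo {
  // Assumed typing rule: try Int first, then Double, otherwise keep the String.
  def parseField(s: String): Any =
    Try(s.toInt: Any).orElse(Try(s.toDouble: Any)).getOrElse(s)

  // Split one line on commas; the -1 limit keeps trailing empty fields.
  def parseLine(line: String): Array[Any] =
    line.split(",", -1).map(f => parseField(f.trim))

  // Same values, keyed by column position.
  def toMap(line: String): Map[Int, Any] =
    parseLine(line).zipWithIndex.map { case (v, i) => i -> v }.toMap
}
```

With Any as the element type, the compiler can no longer check field types, so each consumer has to pattern match on the values it reads.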


