Posted to user@flink.apache.org by Alexander Alexandrov <al...@gmail.com> on 2015/01/29 12:04:05 UTC

TypeSerializerInputFormat cannot determine its type automatically

I am trying to use the TypeSerializer IO formats to write temp data to
disk. A gist with a minimal example can be found here:

https://gist.github.com/aalexandrov/90bf21f66bf604676f37

However, with the current setting I get the following error with the
TypeSerializerInputFormat:

Exception in thread "main" org.apache.flink.api.common.InvalidProgramException: The type returned by the input format could not be automatically determined. Please specify the TypeInformation of the produced type explicitly.
    at org.apache.flink.api.java.ExecutionEnvironment.readFile(ExecutionEnvironment.java:341)
    at SerializedFormatExample$.main(SerializedFormatExample.scala:48)
    at SerializedFormatExample.main(SerializedFormatExample.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:134)

I think that the typeInformation instance at line 43 should somehow be
passed to the TypeSerializerInputFormat, but I cannot find a way to do it.

Any suggestions?

Thanks,
A.

Re: Fwd: TypeSerializerInputFormat cannot determine its type automatically

Posted by Timo Walther <tw...@apache.org>.
You don't have to close the PR. The change makes sense anyways.

You are right, the exception message could be improved at this point.





Re: Fwd: TypeSerializerInputFormat cannot determine its type automatically

Posted by Alexander Alexandrov <al...@gmail.com>.
Alright, thanks for the hint.

I suggest closing PR 349 and refining the exception with a hint on HOW exactly
to pass the TypeInformation instance, e.g.

The type returned by the input format could not be automatically
determined. Please pass the TypeInformation of the produced type explicitly
via 'env.createInput(...)'.

I knew what I had to do, but I couldn't find the right point of entry to do
it because the IO system is so generic.



Re: Fwd: TypeSerializerInputFormat cannot determine its type automatically

Posted by Timo Walther <tw...@apache.org>.
Hey Alexander,

I have looked into your issue. You can simply use
env.createInput(InputFormat,TypeInformation) instead of env.readFile().
That way you can pass the TypeInformation manually without implementing
ResultTypeQueryable.

Regards,
Timo
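
The two ways of supplying the produced type discussed in this thread can be sketched in plain Java. This is only an illustration of the pattern, not Flink code: Format, TypeQueryable, RawFormat, TypedFormat and Env are hypothetical stand-ins, and a plain Class<T> takes the place of Flink's TypeInformation. In Flink itself the explicit entry point is env.createInput(InputFormat, TypeInformation), as described above.

```java
import java.util.Objects;

// All types below are simplified, hypothetical stand-ins; none are Flink classes.
interface Format<T> { }

// Mirrors the idea behind ResultTypeQueryable: the format reports its own produced type.
interface TypeQueryable<T> {
    Class<T> producedType();
}

// A format that knows nothing about its element type at runtime.
final class RawFormat<T> implements Format<T> { }

// A format that carries its element type and can be queried for it.
final class TypedFormat<T> implements Format<T>, TypeQueryable<T> {
    private final Class<T> type;

    TypedFormat(Class<T> type) {
        this.type = Objects.requireNonNull(type);
    }

    @Override
    public Class<T> producedType() {
        return type;
    }
}

final class Env {

    // Inference-only entry point: succeeds only if the format can report its type.
    static <T> Class<?> resolveType(Format<T> format) {
        if (format instanceof TypeQueryable) {
            return ((TypeQueryable<?>) format).producedType();
        }
        throw new IllegalStateException(
            "The type returned by the input format could not be automatically determined.");
    }

    // Explicit entry point: the caller passes the produced type alongside the format.
    static <T> Class<T> resolveType(Format<T> format, Class<T> producedType) {
        return Objects.requireNonNull(producedType);
    }

    public static void main(String[] args) {
        System.out.println(resolveType(new TypedFormat<String>(String.class)));
        System.out.println(resolveType(new RawFormat<String>(), String.class));
        try {
            resolveType(new RawFormat<String>());
        } catch (IllegalStateException e) {
            System.out.println("inference failed: " + e.getMessage());
        }
    }
}
```

The explicit overload leaves the format class untouched, which is why it does not require an API change to the format itself.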





Re: TypeSerializerInputFormat cannot determine its type automatically

Posted by Alexander Alexandrov <al...@gmail.com>.
As a quick fix, I implemented ResultTypeQueryable for the
TypeSerializerInputFormat. A PR for the 0.8 branch can be found here:

https://github.com/apache/flink/pull/349

Please check and let me know if there is a way to fix the problem without
breaking the 0.8 line API.


Fwd: TypeSerializerInputFormat cannot determine its type automatically

Posted by Alexander Alexandrov <al...@gmail.com>.
The problem seems to be that the reflection analysis cannot determine the
type of the TypeSerializerInputFormat.

One possible solution is to add the ResultTypeQueryable interface and force
clients to explicitly set the TypeInformation.

This might break code which relies on automatic type inference, but at the
moment I cannot find any other usages of the TypeSerializerInputFormat
except for the unit test.
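
A likely reason reflection comes up empty here is Java type erasure. The sketch below uses only the JDK; GenericFormat, StringFormat and ErasureDemo are hypothetical names, not Flink classes, and Flink's real type extraction is more involved than this single lookup. It shows that a type argument given at the instantiation site is not recorded on the object, while one fixed in a subclass declaration can be read back.

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

// Hypothetical stand-in for a generic input format; not a Flink class.
class GenericFormat<T> { }

// Subclass that fixes the type argument in its class declaration.
class StringFormat extends GenericFormat<String> { }

final class ErasureDemo {

    // Returns the type argument recorded on the generic superclass,
    // or null if the class declaration does not record one.
    static Type elementTypeOf(Object format) {
        Type superType = format.getClass().getGenericSuperclass();
        if (superType instanceof ParameterizedType) {
            return ((ParameterizedType) superType).getActualTypeArguments()[0];
        }
        return null;
    }

    public static void main(String[] args) {
        // Type argument given only at the instantiation site: erased at runtime.
        System.out.println(elementTypeOf(new GenericFormat<String>())); // prints "null"

        // Type argument fixed in the subclass declaration: still visible.
        System.out.println(elementTypeOf(new StringFormat())); // prints "class java.lang.String"
    }
}
```

A format instantiated directly with a type parameter falls into the first case, so the type has to be supplied by some other route.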

