Posted to dev@flink.apache.org by Ronny Bräunlich <r....@gmail.com> on 2015/06/10 11:28:45 UTC

Apache Flink 0.9 ALS API

Hello everybody,

For a university project we use the current implementation of ALS in Flink 0.9, and we were wondering about the API of predict() and fit(), which require a DataSet[(Int, Int)] and a DataSet[(Int, Int, Double)] respectively, because the range of Int is quite limited.
That is why we wanted to ask you if it wouldn’t be advantageous to change Int to Long, to allow more values.
Please let me know what you think about it.

Cheers,
Ronny
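
For illustration, the range concern raised above can be sketched in plain Scala without any Flink dependency. The object IdRange and the helper toIntId are hypothetical names introduced only for this sketch and are not part of the ALS API; the point is merely that an ID above Int.MaxValue cannot be passed through an Int-typed field without wrapping around.

```scala
// Sketch only: plain Scala, no Flink dependency.
// IdRange and toIntId are hypothetical names used for illustration.
object IdRange {
  // Narrows a Long ID to Int, returning None when the value does not fit.
  def toIntId(id: Long): Option[Int] =
    if (id >= Int.MinValue && id <= Int.MaxValue) Some(id.toInt) else None

  def main(args: Array[String]): Unit = {
    println(Int.MaxValue)    // 2147483647
    println(Long.MaxValue)   // 9223372036854775807

    val bigId = 3000000000L  // a plausible ID in a large data set
    println(bigId.toInt)     // -1294967296: a plain cast wraps around silently
    println(toIntId(bigId))  // None: the checked conversion refuses the value
    println(toIntId(42L))    // Some(42)
  }
}
```

With Long IDs no such narrowing step would be needed for values of this size.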

Re: Apache Flink 0.9 ALS API

Posted by Till Rohrmann <tr...@apache.org>.
+1 for longs as IDs.

Not so much in favour of Strings for the user ID because the row index
could also denote the actual item ID if you swap the indices. Furthermore,
you can always add a transformer which assigns unique IDs to names.

Cheers,
Till
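
A minimal sketch of the "transformer which assigns unique IDs to names" idea, written against plain Scala collections rather than a Flink DataSet. NameIndexer, buildIndex and encode are hypothetical names, and the in-memory approach is an assumption made for brevity; a distributed version would need a different mechanism for handing out IDs.

```scala
// Sketch only: in-memory Scala collections stand in for a DataSet.
// NameIndexer, buildIndex and encode are hypothetical names.
object NameIndexer {
  // Assigns each distinct name a unique Long ID, in order of first appearance.
  def buildIndex(names: Seq[String]): Map[String, Long] =
    names.distinct.zipWithIndex.map { case (name, i) => name -> i.toLong }.toMap

  // Rewrites (userName, itemId, rating) triples into (userId, itemId, rating)
  // and returns the name-to-ID index so that predictions can be mapped back.
  def encode(ratings: Seq[(String, Long, Double)])
      : (Map[String, Long], Seq[(Long, Long, Double)]) = {
    val index = buildIndex(ratings.map(_._1))
    val encoded = ratings.map { case (user, item, rating) =>
      (index(user), item, rating)
    }
    (index, encoded)
  }

  def main(args: Array[String]): Unit = {
    val ratings = Seq(("alice", 10L, 4.0), ("bob", 10L, 3.5), ("alice", 11L, 5.0))
    val (index, encoded) = encode(ratings)
    println(index)
    println(encoded)
  }
}
```

Keeping the index alongside the encoded triples is what lets the numeric IDs stay an internal detail: the ALS input remains purely numeric, while names can be restored after prediction.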

On Sat, Jun 13, 2015 at 3:34 PM Chiwan Park <ch...@icloud.com> wrote:

> +1 for generalisation.
>
> @Ronny: Could you create a JIRA issue related to this?
>
> Regards,
> Chiwan Park
>
> > On Jun 13, 2015, at 9:07 PM, Felix Neutatz <ne...@googlemail.com> wrote:
> >
> > Hi Ronny,
> >
> > I agree with you and I would go even further and generalize it overall, so
> > that the movieID could be of type Long or Int and the userID of type String.
> >
> > This would increase usability of the ALS implementation :)
> >
> > Best regards,
> > Felix
> >
> > 2015-06-10 11:28 GMT+02:00 Ronny Bräunlich <r....@gmail.com>:
> >
> >> Hello everybody,
> >>
> >> for a university project we use the current implementation of ALS in Flink
> >> 0.9 and we were wondering about the API of predict() and fit() requiring a
> >> DataSet[(Int, Int)] or DataSet[(Int, Int, Double)] respectively, because
> >> the range of Int is quite limited.
> >> That is why we wanted to ask you if it wouldn’t be advantageous to change
> >> Int to Long, to allow more values.
> >> Please let me know what you think about it.
> >>
> >> Cheers,
> >> Ronny

Re: Apache Flink 0.9 ALS API

Posted by Chiwan Park <ch...@icloud.com>.
+1 for generalisation.

@Ronny: Could you create a JIRA issue related to this?

Regards,
Chiwan Park

> On Jun 13, 2015, at 9:07 PM, Felix Neutatz <ne...@googlemail.com> wrote:
> 
> Hi Ronny,
> 
> I agree with you and I would go even further and generalize it overall, so
> that the movieID could be of type Long or Int and the userID of type String.
> 
> This would increase usability of the ALS implementation :)
> 
> Best regards,
> Felix
> 
> 2015-06-10 11:28 GMT+02:00 Ronny Bräunlich <r....@gmail.com>:
> 
>> Hello everybody,
>> 
>> for a university project we use the current implementation of ALS in Flink
>> 0.9 and we were wondering about the API of predict() and fit() requiring a
>> DataSet[(Int, Int)] or DataSet[(Int, Int, Double)] respectively, because
>> the range of Int is quite limited.
>> That is why we wanted to ask you if it wouldn’t be advantageous to change
>> Int to Long, to allow more values.
>> Please let me know what you think about it.
>> 
>> Cheers,
>> Ronny

Re: Apache Flink 0.9 ALS API

Posted by Felix Neutatz <ne...@googlemail.com>.
Hi Ronny,

I agree with you and I would go even further and generalize it overall, so
that the movieID could be of type Long or Int and the userID of type String.

This would increase usability of the ALS implementation :)

Best regards,
Felix

2015-06-10 11:28 GMT+02:00 Ronny Bräunlich <r....@gmail.com>:

> Hello everybody,
>
> for a university project we use the current implementation of ALS in Flink
> 0.9 and we were wondering about the API of predict() and fit() requiring a
> DataSet[(Int, Int)] or DataSet[(Int, Int, Double)] respectively, because
> the range of Int is quite limited.
> That is why we wanted to ask you if it wouldn’t be advantageous to change
> Int to Long, to allow more values.
> Please let me know what you think about it.
>
> Cheers,
> Ronny