Posted to dev@mahout.apache.org by Dmitriy Lyubimov <dl...@gmail.com> on 2011/12/23 19:33:39 UTC

Ternary vs. uniform vs. normal

Ted,

Is a ternary matrix somehow better than a uniform or normal one for
random projection, or is it just a flop-saving technique?

The original study actually implied unit Gaussian vectors, which I
think is not quite what we are doing with either technique. I think it
is important that the vectors have unit length, so that the projection
better captures the subspace with the major variance.

Re: Ternary vs. uniform vs. normal

Posted by Dmitriy Lyubimov <dl...@gmail.com>.

OK, thank you. That's what I thought; thanks for confirming.

On Fri, Dec 23, 2011 at 10:51 AM, Ted Dunning <te...@gmail.com> wrote:
> All of the math is done initially using normal distributions, but I doubt
> that there is any detectable difference in practice between normal and
> uniform.  To be precise, the only time I would expect to be able to see a
> difference between uniform and normal is in cases that have fewer than a
> dozen non-zeros per row on average in A.  That is pretty pathological.
>
> Having a non-zero mean could have bad effects if there were a huge number
> of non-zero elements, but probably has little practical impact.
>
> The ternary distribution is just a flop-saver as you suggest.  A sparse
> ternary distribution worries me a bit, but the convergence guarantees are
> pretty compelling.  A very sparse ternary distribution is probably very bad
> for the data we focus on because you have a significant chance of not using
> any element from some rows of A.

Re: Ternary vs. uniform vs. normal

Posted by Ted Dunning <te...@gmail.com>.

All of the math is done initially using normal distributions, but I doubt
that there is any detectable difference in practice between normal and
uniform.  To be precise, the only time I would expect to be able to see a
difference between uniform and normal is in cases that have fewer than a
dozen non-zeros per row on average in A.  That is pretty pathological.

Having a non-zero mean could have bad effects if there were a huge number
of non-zero elements, but probably has little practical impact.

The ternary distribution is just a flop-saver as you suggest.  A sparse
ternary distribution worries me a bit, but the convergence guarantees are
pretty compelling.  A very sparse ternary distribution is probably very bad
for the data we focus on because you have a significant chance of not using
any element from some rows of A.
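
As a rough illustration of that last point, here is a minimal sketch of the
chance that a sparse ternary projection ignores a row of A entirely. It
assumes each entry of the projection matrix is non-zero independently with
probability `density`, and that the row of A has a fixed number of non-zeros.
The function name and its parameters are hypothetical and not taken from
Mahout.

```python
def miss_probability(nonzeros_in_row: int, density: float, columns: int = 1) -> float:
    """Chance that a sparse ternary projection uses no element of a row of A.

    Assumption: every entry of the projection matrix is non-zero
    independently with probability `density`. A row of A with
    `nonzeros_in_row` non-zeros is ignored by `columns` projection vectors
    when all nonzeros_in_row * columns matching entries happen to be zero.
    """
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must lie in [0, 1]")
    if nonzeros_in_row < 0 or columns < 1:
        raise ValueError("need nonzeros_in_row >= 0 and columns >= 1")
    return (1.0 - density) ** (nonzeros_in_row * columns)


if __name__ == "__main__":
    # One projection vector, a row of A with 10 non-zeros.
    print(f"{miss_probability(10, 1.0 / 3.0):.3f}")  # prints 0.017
    print(f"{miss_probability(10, 0.01):.3f}")       # prints 0.904
```

Under these assumptions, a ternary vector with one third of its entries
non-zero rarely misses such a row, while one with 1% non-zeros misses it about
nine times out of ten.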

Re: Ternary vs. uniform vs. normal

Posted by Dmitriy Lyubimov <dl...@gmail.com>.

Or, rather, that they are all of the same length. Although vectors
drawn from any random distribution should eventually converge on the
same length for sufficiently high-dimensional vectors.

Also, a normal distribution would tend to keep vectors closer to the
subspaces spanned by the axes, thus probably ensuring a better
orthogonality guarantee.
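
The remark about lengths converging can be checked with a small experiment.
This is only a sketch under stated assumptions: each distribution is rescaled
to zero mean and unit variance, the ternary one takes the values +sqrt(3), 0,
-sqrt(3) with probabilities 1/6, 2/3, 1/6, and all function names are
hypothetical.

```python
import math
import random


def unit_variance_sampler(kind: str, rng: random.Random):
    """Return a zero-mean, unit-variance sampler for the named distribution."""
    root3 = math.sqrt(3.0)
    if kind == "normal":
        return lambda: rng.gauss(0.0, 1.0)
    if kind == "uniform":
        # Uniform on [-sqrt(3), sqrt(3)] has variance 1.
        return lambda: rng.uniform(-root3, root3)
    if kind == "ternary":
        # Assumed parameterization: +sqrt(3) and -sqrt(3) with
        # probability 1/6 each, 0 with probability 2/3.
        def draw() -> float:
            u = rng.random()
            if u < 1.0 / 6.0:
                return root3
            if u < 1.0 / 3.0:
                return -root3
            return 0.0
        return draw
    raise ValueError(f"unknown distribution: {kind}")


def scaled_length(kind: str, dim: int, rng: random.Random) -> float:
    """Euclidean length of one random vector, divided by sqrt(dim)."""
    draw = unit_variance_sampler(kind, rng)
    return math.sqrt(sum(draw() ** 2 for _ in range(dim)) / dim)


if __name__ == "__main__":
    rng = random.Random(0)
    for dim in (10, 1000, 100000):
        lengths = {k: scaled_length(k, dim, rng) for k in ("normal", "uniform", "ternary")}
        print(dim, {k: round(v, 3) for k, v in lengths.items()})
```

With these scalings the printed values should approach 1 for all three
distributions as `dim` grows, which is the sense in which the vectors end up
with the same length in high dimensions.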
