Posted to user@mahout.apache.org by Fernando Fernández <fe...@gmail.com> on 2013/08/01 12:15:40 UTC

Why is Lanczos deprecated?

Hi everyone,

Sorry if I'm duplicating the question, but I've been looking for an answer and
I haven't found an explanation other than that it's not being used (together
with some other algorithms). If it's been discussed in depth before, maybe you
can point me to a link with the discussion.

I have successfully used Lanczos in several projects, and it's been a
surprise to me to find that the main reason (according to what I've read,
which might not be the full story) is that it's not being used. At the
beginning I supposed it was because SSVD is supposed to be much faster with
similar results, but after running some tests I have found that running
times are similar or even worse than Lanczos for some configurations (I
have tried several combinations of parameters, given child processes enough
memory, etc., and had no success in running SSVD in even 3/4 of the time
Lanczos takes, though there may be some combinations of parameters I have
still not tried). It seems to be quite tricky to find a good combination of
parameters for SSVD, and I have also seen a precision loss in some examples
that makes me not confident in migrating from Lanczos to SSVD from now on (how
far can I trust results from a combination of parameters that runs in
significantly less time, or at least in a good time?).

Can someone convince me that SSVD is actually a better option than Lanczos?
(I'm totally willing to be convinced... :) )

Thank you very much in advance.

Fernando.

Re: Why is Lanczos deprecated?

Posted by Jake Mannix <ja...@gmail.com>.
On Thu, Aug 1, 2013 at 7:08 AM, Sebastian Schelter <ss...@apache.org> wrote:

> IIRC the main reasons for deprecating Lanczos was that in contrast to
> SSVD, it does not use a constant number of MapReduce jobs and that our
> implementation has the constraint that all the resulting vectors have to
> fit into the memory of the driver machine.
>

While it's true that Lanczos does not use a constant number of MR iterations,
the phrase "our implementation" is key to the claim that we have to hold all
the output vectors in memory. That constraint wasn't even an integral part of
our implementation: it's fairly simple to compute the linear combinations of
the Ritz vectors, after the iterations are complete, as an operation that
keeps only three vectors in memory at a time; we just never made that
optimization.
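Jake's optimization can be sketched in a few lines (an illustrative numpy prototype, not Mahout's actual code; the basis-vector iterator and coefficient matrix are hypothetical stand-ins for vectors re-read from HDFS and the Ritz eigenvectors of the small tridiagonal problem):

```python
import numpy as np

def combine_ritz_streaming(basis_pass, coeffs):
    """Form each output vector y_j = sum_i coeffs[i, j] * v_i while holding
    only one basis vector and one accumulator in memory at a time.

    basis_pass: callable returning a fresh iterator over the Lanczos basis
    vectors v_i (e.g. a sequential re-read from disk on every pass).
    coeffs: (m, k) matrix of Ritz eigenvector coefficients.
    """
    m, k = coeffs.shape
    outputs = []
    for j in range(k):                       # one output eigenvector per pass
        acc = None
        for i, v in enumerate(basis_pass()):
            term = coeffs[i, j] * v          # scale the current basis vector
            acc = term if acc is None else acc + term
        outputs.append(acc)
    return outputs
```

The trade-off is k sequential passes over the stored basis instead of one in-memory matrix product, which is exactly the kind of change that keeps the driver's memory footprint constant.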




-- 

  -jake

Re: Why is Lanczos deprecated?

Posted by Sebastian Schelter <ss...@apache.org>.
IIRC the main reasons for deprecating Lanczos was that in contrast to
SSVD, it does not use a constant number of MapReduce jobs and that our
implementation has the constraint that all the resulting vectors have to
fit into the memory of the driver machine.

Best,
Sebastian


Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <
fernando.fernandez.gonzalez@gmail.com> wrote:

> The definition of "so" is: Mahout Lanczos and R yielding eigenvalues like (I'm
> inventing the numbers here because I don't remember the exact figures) 1834.58,
> 756.34, 325.67, 125.67 and providing very good recommendations in the
> recommender system, while SSVD gives eigenvalues (invented numbers again) of
> 723.56, 354.67, 111.67, 101.46 and provides nonsense recommendations...
> that's why I suspect there might be a bug in the input code. Small changes in
> decimal places, and even in units, like 723.56 to 730.78, would be
> reasonable. 1834 to 723 is not. I'm putting these numbers in quarantine until I
> determine everything's OK with the input code.
>
> Thanks for the link to Halko's dissertation. I know it's a nice piece of
> work and a reference, and I had already given it a look, but I always have to
> do my own experiments: I have found so often that things don't
> work as expected on certain real cases that I always try to at least
> validate that what is in papers and dissertations also applies to my data.
>
> I'm aware SSVD is non-deterministic; I always check this kind of algorithm
> with several runs. Here are some results on the MovieLens 100k data using the
> R implementation of SSVD provided here
>
> https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
> (I hope there are no significant differences between the results of this
> implementation and Mahout's):
>

There are a few differences there.

(1) The prototype takes normally distributed values for the random vectors,
and that is not quite the same as the random unit vectors suggested in the
paper (methinks). Mahout actually uses uniform U(-1,1) instead, which methinks
helps with numerical stability errors much more than using rnorm(), which
would tend to generate smaller values. Or maybe not.
(2) The execution plan is the same, but the numerical stability properties of
the decompositions are likely not quite the same. Mahout's version does Givens
QR, which is thought to be very stable. The eigendecomposition is likely also
different, but not in a way that would matter compared to everything else.
(3) The R prototype is always dense, which means it doesn't account for the
degenerate nature of 0 elements and runs more flops than a sparse Mahout
input would. Which again means technically less numerical stability.
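Point (1) is easy to make concrete. Below is a minimal randomized range finder (a generic Halko-style sketch, not Mahout's or the R prototype's exact code), where the only difference between the two implementations under discussion is the distribution of the random test matrix:

```python
import numpy as np

def randomized_range_finder(a, k, p=10, dist="uniform", seed=0):
    """Return an orthonormal basis Q whose range approximates range(a).

    dist="uniform" draws the test matrix from U(-1, 1), as Mahout reportedly
    does; dist="normal" draws it from N(0, 1), as the R prototype does.
    """
    rng = np.random.default_rng(seed)
    n = a.shape[1]
    if dist == "uniform":
        omega = rng.uniform(-1.0, 1.0, size=(n, k + p))
    else:
        omega = rng.standard_normal((n, k + p))
    y = a @ omega                  # random sample of the range of a
    q, _ = np.linalg.qr(y)         # orthonormalize (numpy uses Householder QR
                                   # here; Mahout uses Givens rotations)
    return q
```

For a matrix whose rank is at most k+p, both choices recover the range to roundoff; differences, if any, show up in the rounding behavior on large ill-conditioned inputs.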

If your dataset is of significant size, it may (in theory) be the case that
numerical stability issues start to trump stochastic errors (since power
iterations run the figures through more computation). The theoretical analysis
says that power iterations cannot worsen the result, numerical stability
issues due to rounding error aside. You can verify how bad the numerical
stability gets fairly easily (if it is computationally feasible for your
dataset) by requesting a full-rank SSVD (k+p = min(m,n)). When full rank is
requested, SSVD's stochastic error is zero, and everything left is purely due
to accumulated rounding errors in both SVD and SSVD (surely, there are
rounding errors in a full-rank SVD too).
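That full-rank sanity check can be sketched with a dense prototype (a plain numpy stand-in for the Mahout job; `rand_svd` below is a hypothetical helper, not a Mahout API): with k+p = min(m,n) the random projection loses nothing, so any remaining gap versus a direct SVD is rounding error alone.

```python
import numpy as np

def rand_svd(a, k, p=0, q=0, seed=0):
    """Bare-bones randomized SVD prototype in the Halko et al. style."""
    rng = np.random.default_rng(seed)
    omega = rng.uniform(-1.0, 1.0, size=(a.shape[1], k + p))
    y = a @ omega
    for _ in range(q):                        # optional power iterations
        y = a @ (a.T @ y)
    qmat, _ = np.linalg.qr(y)
    b = qmat.T @ a                            # small projected matrix
    ub, s, vt = np.linalg.svd(b, full_matrices=False)
    return (qmat @ ub)[:, :k], s[:k], vt[:k]

m, n = 40, 25
a = np.random.default_rng(2).standard_normal((m, n))
s_exact = np.linalg.svd(a, compute_uv=False)
_, s_full, _ = rand_svd(a, k=min(m, n))       # k+p = min(m,n): no stochastic error
max_gap = np.max(np.abs(s_full - s_exact))    # whatever remains is rounding error
```

If max_gap is tiny but truncated runs disagree wildly with Lanczos on the same input, the discrepancy is stochastic or, as suspected in this thread, an input problem rather than accumulated roundoff.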


> The first line is the 10 first eigenvalues computed with R's svd. The next
> three are computed with ssvd.svd with q=0, and the next three with q=1:
>
> > svd.r$d[1:10]
> [1] 640.63362 244.83635 217.84622 159.15360 158.21191 145.87261 126.57977 121.90770 106.82918  99.74794
>
> [1] "three runs with q=0"
> [1] 640.63362 244.83613 217.84493 159.14512 158.20471 145.82572 126.42295 121.79764 105.99973  98.99649
> [1] 640.63362 244.83592 217.84568 159.13914 158.19299 145.84226 126.46651 121.73629 106.22892  99.11622
> [1] 640.63362 244.83590 217.84482 159.12955 158.19675 145.81728 126.47135 121.79920 106.45790  99.01242
>
> [1] "three runs with q=1"
> [1] 640.63259 244.75889 217.66362 158.40002 157.61954 145.26448 125.25675 119.74266 104.16382  95.43547
> [1] 640.6327 244.7559 217.6805 158.6019 157.4059 144.9223 124.2859 119.1194 103.9104  96.6282
> [1] 640.63313 244.62599 217.67781 158.72475 157.13394 145.08462 125.33024 120.20984 102.45867  95.37994
>
>
> I have repeated the runs several times with the same results... Maybe I'm
> still missing something else, but given these results I can't apply the rule
> that q=1 improves accuracy. At least I have to experiment; my guess is it
> does depend on the dataset. I would also like to repeat this comparison with
> Mahout's SSVD and my dataset and see what happens.
>
> Dmitriy, thank you very much for your attention and sharing your thoughts
> with me. I really appreciate it.
>
> Best,
> Fernando.
>
>
> 2013/8/3 Dmitriy Lyubimov <dl...@gmail.com>
>
> > On Fri, Aug 2, 2013 at 3:08 PM, Dmitriy Lyubimov <dl...@gmail.com>
> > wrote:
> >
> > >
> > >
> > >
> > > On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
> > > fernando.fernandez.gonzalez@gmail.com> wrote:
> > >
> > >> I don't agree with k>10 being unlikely meaningful. I've used SVD in text
> > >> mining problems where k~150 yielded the best results (not only was it a
> > >> good choice based on plotting the eigenvalues and seeing the elbow in the
> > >> decay near 150, but checking results with different k's showed that
> > >> around 150 made much more sense). Currently I'm working on a recommender
> > >> system and already have Lanczos running with k~50 producing the best
> > >> results, again based on visual exploration of the eigenvalues and on
> > >> exploring results one by one and seeing they were more meaningful.
> > >> Current tests with SSVD are based on the latter, and when I say I'm not
> > >> getting good results I mean that Lanczos is working properly on the same
> > >> problem (I've explored eigenvalues up to 150 and have a good decay) and
> > >> SSVD is not. But as I said, this might be caused by some bug in the input
> > >> process; it seems too strange to me that the results are so different, so...
> > >>
> > >
> > > Depends on how you define "so". But again, in that respect all I can
> > > point to is the accuracy study by N. Halko, out of the published work.
> > >
> > I guess I can save you digging through the Mahout wiki; here is the reference:
> > http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf
> > Specifically, look at the eigenvalue chart comparison on page 179. It is
> > run on Mahout's Lanczos and SSVD neck-to-neck. The order of accuracy for the
> > first 40 values is claimed as "Order of accuracy is q = 3; q = 2; q = 1,
> > lanczos, q = 0" (see the source for details of the accuracy assessment).
> >
> > One thing I did not understand there is why Lanczos showed such an
> > uncharacteristic fall-off for the values between 40 and 60. I have
> > always assumed q=1 was showing something much closer to reality after the
> > first 40 values as well.
> >
> >
> > >
> > >> I'll get back to this discussion when I figure it out :) . If you are
> > >> curious about the numbers: 1MM rows by 150k columns for the text mining
> > >> case, and 18MM rows by 80k columns for the recommender.
> > >>
> > >> About p and q: I have been playing around with the MovieLens 100k
> > >> dataset and found that q>0 actually worsens results in terms of precision
> > >> (nothing severe, but it happens) and it's better to increase p a little
> > >> in that particular case, so my guess is it depends a lot on the dataset,
> > >> though I don't know how.
> > >>
> > >
> > > This again sounds very strange. The algorithm is non-deterministic, which
> > > means the errors you get in one run will be different from the errors in
> > > another run, but honestly, you would be the first to report that power
> > > iterations worsen the expectation of the error. All theoretical work and
> > > practical estimates did not confirm that observation; in fact, quite a bit
> > > to the contrary.
> > >
> > >
> > >>
> > >> 2013/8/2 Dmitriy Lyubimov <dl...@gmail.com>
> > >>
> > >> > The only time you would not get good results is if the spectrum does
> > >> > not have a good decay, which is equivalent to mostly the same variance
> > >> > in most of the original basis directions. This problem is similar to
> > >> > the problem that arises with PCA when you try to do dimensionality
> > >> > reduction retaining a certain %-age of variance: in the case of flat
> > >> > spectrum decay, you'd need a much bigger k to retain the same amount of
> > >> > variance in the dimensionally reduced projection. In that sense the
> > >> > SSVD solution for a given k is as good as PCA gets for the same k.
> > >> > Also, I believe (but am not 100% sure) that "problems too small"
> > >> > exhibit higher errors due to the law of large numbers.
> > >> >
> > >> >
> > >> > On Fri, Aug 2, 2013 at 10:41 AM, Dmitriy Lyubimov <
> dlieu.7@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > If you use k > 40 you are already beating Lanczos for larger
> > >> > > datasets. k>10 is unlikely meaningful. p need not be more than 15% of
> > >> > > k (the default is 15). Use q=1; q>1 does not yield tangible
> > >> > > improvements in the real world. Again, see Nathan Halko's
> > >> > > dissertation on the accuracy comparison.
> > >> > >
> > >> > >
> > >> > >
> > >> > > On Fri, Aug 2, 2013 at 4:17 AM, Fernando Fernández <
> > >> > > fernando.fernandez.gonzalez@gmail.com> wrote:
> > >> > >
> > >> > >> Keeping Lanczos would be nice. Like I said, it's currently being
> > >> > >> used in some projects with good results, and I think it's easier to
> > >> > >> tune, so it would be my first choice for future developments. I
> > >> > >> still need to further test SSVD, especially because in the current
> > >> > >> example I'm working on it yields very different results from
> > >> > >> Lanczos. We are investigating whether it can be due to a bug when
> > >> > >> loading the data (though the dimensions of the output seem OK), or
> > >> > >> whether it's a question of increasing the p or q parameters. If it's
> > >> > >> a question of increasing p and q, I think the running times would
> > >> > >> make SSVD not viable. I hope to be able to provide some comparison
> > >> > >> figures in terms of precision and running time in a month or so.
> > >> > >>
> > >> > >> I hope that other users read this and say whether they are using
> > >> > >> Lanczos.
> > >> > >>
> > >> > >> Best,
> > >> > >> Fernando.
> > >> > >>
> > >> > >> 2013/8/2 Sebastian Schelter <ss...@apache.org>
> > >> > >>
> > >> > >> > I would also be fine with keeping if there is demand. I just
> > >> proposed
> > >> > to
> > >> > >> > deprecate it and nobody voted against that at that point in
> time.
> > >> > >> >
> > >> > >> > --sebastian
> > >> > >> >
> > >> > >> >
> > >> > >> > On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> > >> > >> > > There's a part of Nathan Halko's dissertation, referenced on the
> > >> > >> > > algorithm page, running this comparison. In particular, he was
> > >> > >> > > not able to compute more than 40 eigenvectors with Lanczos on
> > >> > >> > > the Wikipedia dataset. You may refer to that study.
> > >> > >> > >
> > >> > >> > > On the accuracy part, it was not observed to be a problem,
> > >> > >> > > assuming a high level of random noise is not the case, at least
> > >> > >> > > not in the LSA-like application used there.
> > >> > >> > >
> > >> > >> > > That said, I am all for diversity of tools; I would actually be
> > >> > >> > > +0 on deprecating Lanczos, it is not like we are lacking support
> > >> > >> > > for it. SSVD could use improvements too.

Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Aug 3, 2013 3:06 AM, "Fernando Fernández" <
fernando.fernandez.gonzalez@gmail.com> wrote:
>
> The definition of "so" is: Mahout Lanczos and R yielding eigenvalues like (I'm
> inventing the numbers here because I don't remember the exact figures) 1834.58,
> 756.34, 325.67, 125.67 and providing very good recommendations in the
> recommender system, while SSVD gives eigenvalues (invented numbers again) of
> 723.56, 354.67, 111.67, 101.46 and provides nonsense recommendations...
> that's why I suspect there might be a bug in the input code. Small changes in
> decimal places, and even in units, like 723.56 to 730.78, would be
> reasonable. 1834 to 723 is not. I'm putting these numbers in quarantine until I
> determine everything's OK with the input code.
Yes, there's definitely something fishy. q=0 usually results in no more than
5% error on a real-life dataset, and even then never for the 1st value. You
need a well-controlled experiment where you provide exactly the same input.
I thought both methods accept exactly the same DRM format, so you could just
feed the same thing to both?
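The controlled experiment can be set up as a tiny harness (an illustrative sketch; the two solver callables are placeholders for, e.g., a Lanczos run and an SSVD run over the identical input matrix):

```python
import numpy as np

def compare_spectra(a, solver_a, solver_b, k):
    """Feed the identical matrix to both solvers and return the relative
    per-value discrepancy of the top-k singular values."""
    sa = np.sort(solver_a(a, k))[::-1]
    sb = np.sort(solver_b(a, k))[::-1]
    return np.abs(sa - sb) / np.maximum(sa, 1e-12)

# Placeholder solvers: a direct SVD versus a crude one-pass random sketch.
def exact_topk(a, k):
    return np.linalg.svd(a, compute_uv=False)[:k]

def sketched_topk(a, k, p=10, seed=0):
    rng = np.random.default_rng(seed)
    y = a @ rng.uniform(-1.0, 1.0, size=(a.shape[1], k + p))
    q, _ = np.linalg.qr(y)
    return np.linalg.svd(q.T @ a, compute_uv=False)[:k]
```

On the same input, a q=0 run should land within a few percent per value; a first value off by a factor of ~2.5 (1834 vs. 723) points at the input pipeline, not the algorithm.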

Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Aug 4, 2013 4:32 AM, "Fernando Fernández" <
fernando.fernandez.gonzalez@gmail.com> wrote:
>
> > I thought both methods accept exactly the same drm format so u could
just
> feed the same thing to them?
>
> That's exactly what I'm doing, (or I think I'm doing at least... I will
> look deeply into this in three or four weeks when I get back to work).
>
> As for the movielens example I will try to replicate the tests with
> mahout's ssvd and lanczos and see what happens. I will try also to do the
> opposite and run my actual data in R's prototype with a big enough
machine,
> and run it also with IRLBA package which seems to work pretty well in
terms
> of precision.
>
> Also, maybe it's easy to adapt SSVD prototype in R to use uniform vectors,
> right?
Yes, it is easy, one just needs to substitute rnorm with runif over -1..1,
but it is much more likely that any numeric stability issues in the
prototype's power iterations would appear more pronounced, because of the
dense AB' multiplication as well as the qr step.
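
The rnorm -> runif swap is mechanical; here is a minimal pure-Python sketch
(not the R prototype itself; the helper names are made up) of why the
distribution of the random test vectors barely matters for the
range-finding step:

```python
import random

random.seed(0)

def random_test_vector(n, dist):
    """One column of the random test matrix Omega.
    dist="gauss" mirrors R's rnorm; dist="unif" mirrors runif(n, -1, 1)."""
    if dist == "gauss":
        return [random.gauss(0.0, 1.0) for _ in range(n)]
    return [random.uniform(-1.0, 1.0) for _ in range(n)]

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

# Rank-1 matrix A = u v^T: any test vector not orthogonal to v projects
# A onto span(u), whichever distribution Omega is drawn from.
u = [1.0, 2.0, 3.0]
v = [4.0, 5.0]
A = [[ui * vj for vj in v] for ui in u]

for dist in ("gauss", "unif"):
    y = matvec(A, random_test_vector(2, dist))
    ratio = y[0] / u[0]  # y is parallel to u up to this scalar
    assert all(abs(yi - ratio * ui) < 1e-9 for yi, ui in zip(y, u))
```

Either distribution captures the range of A; the differences show up only
in the constants of the error bounds, not in the mechanics.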
>
> Thanks again,
>
> Fernando.
>
>
> 2013/8/3 Ted Dunning <te...@gmail.com>
>
> > On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <
> > fernando.fernandez.gonzalez@gmail.com> wrote:
> >
> > > > svd.r$d[1:10] [1] 640.63362 244.83635 217.84622 159.15360 158.21191
> > > 145.87261 126.57977 121.90770 106.82918  99.74794[1] "three runs with
> > q=0"
> > > [1] 640.63362 244.83613 217.84493 159.14512 158.20471 145.82572
126.42295
> > > 121.79764 105.99973  98.99649 [1] 640.63362 244.83592 217.84568
159.13914
> > > 158.19299 145.84226 126.46651 121.73629 106.22892  99.11622 [1]
640.63362
> > > 244.83590 217.84482 159.12955 158.19675 145.81728 126.47135 121.79920
> > > 106.45790  99.01242
> > >
> > > [1] "three runs with q=1" [1] 640.63259 244.75889 217.66362 158.40002
> > > 157.61954 145.26448 125.25675 119.74266 104.16382  95.43547 [1]
> > > 640.6327 244.7559 217.6805 158.6019 157.4059 144.9223 124.2859
> > > 119.1194 103.9104  96.6282 [1] 640.63313 244.62599 217.67781 158.72475
> > > 157.13394 145.08462 125.33024 120.20984 102.45867  95.37994
> > >
> > >
> > > I have repeated the runs several times with the same results... Maybe
I'm
> > > still missing something else but given these results I can't apply the
> > rule
> > > of q=1 improves accuracy. At least I have to experiment, my guess is
it
> > do
> > > depends on the dataset. I would like also to repeat this comparison
with
> > > Mahout's SSVD and my dataset and see what happens.
> > >
> > > Dmitriy, thank you very much for your attention and sharing your
thoughts
> > > with me. I really appreciate it.
> > >
> >
> > That is interesting.
> >
> > The results for q=0 and q=1 are remarkably similar which I wouldn't
expect.
> >

Re: Why is Lanczos deprecated?

Posted by Ted Dunning <te...@gmail.com>.
It is very easy but I expect to see no difference.  

Sent from my iPhone

On Aug 4, 2013, at 4:31, Fernando Fernández <fe...@gmail.com> wrote:

> 
> Also, maybe it's easy to adapt SSVD prototype in R to use uniform vectors,
> right?

Re: Why is Lanczos deprecated?

Posted by Fernando Fernández <fe...@gmail.com>.
> I thought both methods accept exactly the same drm format so u could just
feed the same thing to them?

That's exactly what I'm doing (or I think I'm doing, at least... I will
look deeply into this in three or four weeks, when I get back to work).

As for the movielens example, I will try to replicate the tests with
mahout's ssvd and lanczos and see what happens. I will also try the
opposite: run my actual data in the R prototype on a big enough machine,
and also run it with the IRLBA package, which seems to work pretty well in
terms of precision.

Also, maybe it's easy to adapt SSVD prototype in R to use uniform vectors,
right?

Thanks again,

Fernando.


2013/8/3 Ted Dunning <te...@gmail.com>

> On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <
> fernando.fernandez.gonzalez@gmail.com> wrote:
>
> > > svd.r$d[1:10] [1] 640.63362 244.83635 217.84622 159.15360 158.21191
> > 145.87261 126.57977 121.90770 106.82918  99.74794[1] "three runs with
> q=0"
> > [1] 640.63362 244.83613 217.84493 159.14512 158.20471 145.82572 126.42295
> > 121.79764 105.99973  98.99649 [1] 640.63362 244.83592 217.84568 159.13914
> > 158.19299 145.84226 126.46651 121.73629 106.22892  99.11622 [1] 640.63362
> > 244.83590 217.84482 159.12955 158.19675 145.81728 126.47135 121.79920
> > 106.45790  99.01242
> >
> > [1] "three runs with q=1" [1] 640.63259 244.75889 217.66362 158.40002
> > 157.61954 145.26448 125.25675 119.74266 104.16382  95.43547 [1]
> > 640.6327 244.7559 217.6805 158.6019 157.4059 144.9223 124.2859
> > 119.1194 103.9104  96.6282 [1] 640.63313 244.62599 217.67781 158.72475
> > 157.13394 145.08462 125.33024 120.20984 102.45867  95.37994
> >
> >
> > I have repeated the runs several times with the same results... Maybe I'm
> > still missing something else but given these results I can't apply the
> rule
> > of q=1 improves accuracy. At least I have to experiment, my guess is it
> do
> > depends on the dataset. I would like also to repeat this comparison with
> > Mahout's SSVD and my dataset and see what happens.
> >
> > Dmitriy, thank you very much for your attention and sharing your thoughts
> > with me. I really appreciate it.
> >
>
> That is interesting.
>
> The results for q=0 and q=1 are remarkably similar which I wouldn't expect.
>

Re: Why is Lanczos deprecated?

Posted by Ted Dunning <te...@gmail.com>.
On Sat, Aug 3, 2013 at 3:05 AM, Fernando Fernández <
fernando.fernandez.gonzalez@gmail.com> wrote:

> > svd.r$d[1:10] [1] 640.63362 244.83635 217.84622 159.15360 158.21191
> 145.87261 126.57977 121.90770 106.82918  99.74794[1] "three runs with q=0"
> [1] 640.63362 244.83613 217.84493 159.14512 158.20471 145.82572 126.42295
> 121.79764 105.99973  98.99649 [1] 640.63362 244.83592 217.84568 159.13914
> 158.19299 145.84226 126.46651 121.73629 106.22892  99.11622 [1] 640.63362
> 244.83590 217.84482 159.12955 158.19675 145.81728 126.47135 121.79920
> 106.45790  99.01242
>
> [1] "three runs with q=1" [1] 640.63259 244.75889 217.66362 158.40002
> 157.61954 145.26448 125.25675 119.74266 104.16382  95.43547 [1]
> 640.6327 244.7559 217.6805 158.6019 157.4059 144.9223 124.2859
> 119.1194 103.9104  96.6282 [1] 640.63313 244.62599 217.67781 158.72475
> 157.13394 145.08462 125.33024 120.20984 102.45867  95.37994
>
>
> I have repeated the runs several times with the same results... Maybe I'm
> still missing something else but given these results I can't apply the rule
> of q=1 improves accuracy. At least I have to experiment, my guess is it do
> depends on the dataset. I would like also to repeat this comparison with
> Mahout's SSVD and my dataset and see what happens.
>
> Dmitriy, thank you very much for your attention and sharing your thoughts
> with me. I really appreciate it.
>

That is interesting.

The results for q=0 and q=1 are remarkably similar which I wouldn't expect.

Re: Why is Lanczos deprecated?

Posted by Fernando Fernández <fe...@gmail.com>.
My definition of "so": Mahout Lanczos and R yielding eigenvalues like (I'm
inventing the numbers here because I don't remember the exact figures)
1834.58, 756.34, 325.67, 125.67 and providing very good recommendations in
the recommender system, while SSVD gives eigenvalues (invented numbers
again) of 723.56, 354.67, 111.67, 101.46 and provides nonsense
recommendations... that's why I suspect there might be a bug in the input
code. Small changes in decimal places, or even in units, like 723.56 to
730.78, would be reasonable; 1834 to 723 is not. I'm putting these numbers
in quarantine until I determine everything is ok with the input code.

Thanks for the link to Halko's dissertation. I know it's a nice piece of
work and a reference, and I had already given it a look, but I always have
to do my own experiments: I have found so often that things don't work as
expected on certain real cases that I always try to at least validate that
what is in papers and dissertations also applies to my data.

I'm aware SSVD is non-deterministic; I always check this kind of algorithm
with several runs. Here are some results on the movielens 100k data using
the R implementation of SSVD provided here
https://cwiki.apache.org/confluence/download/attachments/27832158/ssvd.R?version=1&modificationDate=1323358453000
(I hope there are no significant differences between the results of this
implementation and Mahout's):

The first line shows the first 10 eigenvalues computed with R's svd. The
next three are computed with ssvd.svd with q=0, and the next three with
q=1:

> svd.r$d[1:10]
 [1] 640.63362 244.83635 217.84622 159.15360 158.21191 145.87261 126.57977 121.90770 106.82918  99.74794
[1] "three runs with q=0"
 [1] 640.63362 244.83613 217.84493 159.14512 158.20471 145.82572 126.42295 121.79764 105.99973  98.99649
 [1] 640.63362 244.83592 217.84568 159.13914 158.19299 145.84226 126.46651 121.73629 106.22892  99.11622
 [1] 640.63362 244.83590 217.84482 159.12955 158.19675 145.81728 126.47135 121.79920 106.45790  99.01242

[1] "three runs with q=1"
 [1] 640.63259 244.75889 217.66362 158.40002 157.61954 145.26448 125.25675 119.74266 104.16382  95.43547
 [1] 640.6327 244.7559 217.6805 158.6019 157.4059 144.9223 124.2859 119.1194 103.9104  96.6282
 [1] 640.63313 244.62599 217.67781 158.72475 157.13394 145.08462 125.33024 120.20984 102.45867  95.37994


I have repeated the runs several times with the same results... Maybe I'm
still missing something else, but given these results I can't apply the
rule that q=1 improves accuracy. At least I have to experiment; my guess
is that it does depend on the dataset. I would also like to repeat this
comparison with Mahout's SSVD and my dataset and see what happens.
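
The accuracy effect of q that this thread keeps coming back to can be
reproduced on a toy matrix. A minimal pure-Python sketch (k = 1, diagonal
test matrix; this illustrates the power-iteration idea, not Mahout's or
the R prototype's actual code):

```python
import math
import random

random.seed(42)

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def top_singular_estimate(A, omega, q):
    """Rank-1 randomized estimate of sigma_1: y = (A A^T)^q A omega,
    then sigma ~= ||A^T y|| / ||y||."""
    At = transpose(A)
    y = matvec(A, omega)
    for _ in range(q):
        y = matvec(A, matvec(At, y))
        n = norm(y)
        y = [v / n for v in y]  # renormalize for numeric stability
    return norm(matvec(At, y)) / norm(y)

# Singular values 3, 1, 0.5 -- a clearly decaying spectrum.
A = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]
omega = [random.gauss(0.0, 1.0) for _ in range(3)]

s0 = top_singular_estimate(A, omega, 0)  # q = 0: noisy underestimate
s2 = top_singular_estimate(A, omega, 2)  # q = 2: nearly exact

assert s0 <= 3.0 + 1e-9 and s2 <= 3.0 + 1e-9  # never exceeds sigma_1
assert abs(3.0 - s2) < abs(3.0 - s0)           # power iterations tighten it
```

On a flat spectrum the q = 0 estimate is already close and extra
iterations buy little, which may be why the movielens runs above look so
similar across q.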

Dmitriy, thank you very much for your attention and sharing your thoughts
with me. I really appreciate it.

Best,
Fernando.


2013/8/3 Dmitriy Lyubimov <dl...@gmail.com>

> On Fri, Aug 2, 2013 at 3:08 PM, Dmitriy Lyubimov <dl...@gmail.com>
> wrote:
>
> >
> >
> >
> > On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
> > fernando.fernandez.gonzalez@gmail.com> wrote:
> >
> >> I don't agree with k>10 being unlikely meaningful. I've used SVD in text
> >> mining problems where k~150 yielded best results (not only a good choice
> >> based on plotting eigenvalues and seeing elbow in decay was near 150 but
> >> checking results with different k's and seeing around 150 made much more
> >> sense). Currently I'm working in a recommender system and already have
> >> Lanczos running with k~50 producing best results, again, based on visual
> >> exploration of eigenvalues and exploring results one by one and seeing
> >> they
> >> were more meaningful. Current tests with SSVD are based on the latter
> and
> >> when I say I'm not getting good results I mean Lanczos is working
> properly
> >> on the same problem (I've explored eigenvalues up to 150 and have a good
> >> decay) and SSVD is not, but as I said, this might be caused by some bug
> in
> >> the input process, seems to strange to me that results are so different
> so
> >>
> >
> > Depends on how you define "so". But again, in that respect all i can
> point
> > to is to the accuracy study by N. Halko, out of published work.
> >
> I guess i can save you digging thru Mahout wiki, here is the reference
> http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf
> .
> Specifically, look at eigen values chart comparison at page  179. This is
> run on Mahout's Lanczos and SSVD neck-to-neck. The order of accuracy for
> first 40 values is claimed as "Order of accuracy is q = 3; q = 2; q = 1,
> lanczos, q = 0." (see source for details of accuracy assessment).
>
> One thing i did not understand there is why Lanczos showed such
> uncharacteristic values fall-off for values between 40 and 60. I have
> always assumed -q=1 was showing something much closer to reality after
> first 40 values as well.
>
>
> >
> >> I'll get back to this discussions when I figure it out :) . If you are
> >> curious about the numbers: 1MM rows by 150k columns for text mining case
> >> and 18 MM rows by 80k columns for recommender.
> >>
> >> About p and q, I have been playing around with movielens 100k dataset
> and
> >> found q>0 actually worsens results in terms of precision (nothing severe
> >> though, but it happens) and its better to increase p a little in that
> >> particular case, so my guess is it depends a lot on the dataset though I
> >> don't know how.
> >>
> >
> > This again sounds very strange.  The algorithm is non-deterministic,
> which
> > means errors you get in one run, will be different from errors in another
> > run, but honesly, you would be the first to report that power iterations
> > worsen expectation of an error. All theoretical work and practical
> > estimates did not confirm that observation; in fact, quite a bit to the
> > contrary.
> >
> >
> >>
> >> 2013/8/2 Dmitriy Lyubimov <dl...@gmail.com>
> >>
> >> > the only time you would not get good results is if spectrum does not
> >> have a
> >> > good decay. Which is equivalent to mostly same variance in most of
> >> original
> >> > basis directions. This problem is similar to problem that arises with
> >> PCA
> >> > when you try to do dimensionality reduction with retaining certain
> >> %-tage
> >> > of variance. in case of flat spectrum decay, you'd need much bigger k
> to
> >> > retain same amount of variance in dimensionally reduced projection. In
> >> that
> >> > sense SSVD solution for a given k is as good as PCA gets for the same
> k.
> >> > Also, i believe (but not 100% sure) "problems too small" exhibit
> higher
> >> > errors due to the law of large numbers.
> >> >
> >> >
> >> > On Fri, Aug 2, 2013 at 10:41 AM, Dmitriy Lyubimov <dl...@gmail.com>
> >> > wrote:
> >> >
> >> > > if you use k > 40 you are already beating Lanczos for larger
> datasets.
> >> > > k>10 is unlikely meaninful. p need not be more than 15% of k
> (default
> >> is
> >> > > 15). use q=1, q>1 does not yield tangible improvements in real
> world.
> >> > >  Again, see Nathan Halko's dissertation on accuracy comparison.
> >> > >
> >> > >
> >> > >
> >> > > On Fri, Aug 2, 2013 at 4:17 AM, Fernando Fernández <
> >> > > fernando.fernandez.gonzalez@gmail.com> wrote:
> >> > >
> >> > >> Keeping Lanczos would be nice, Like I said, it's currently being
> >> used in
> >> > >> some projects with good results and I think it's easier to tune so
> it
> >> > >> would
> >> > >> be my first choice for future developments. I still need to further
> >> test
> >> > >> SSVD, specially because in the current example I'm working it
> yields
> >> > very
> >> > >> different results from Lanczos. We are investigating if it can be
> due
> >> > to a
> >> > >> bug when loading the data, though dimensions of the ouptut seem ok,
> >> or
> >> > if
> >> > >> it's a question of increasing p or q parameters. If it's a question
> >> of
> >> > >> increasing p and q I think running times would make SSVD not
> viable.
> >> I
> >> > >> hope
> >> > >> to be able to provide some comparison figures in terms of precision
> >> and
> >> > >> running time in a month or so.
> >> > >>
> >> > >> I hope that other users reads this and say wether they are using
> >> > Lanczos.
> >> > >>
> >> > >> Best,
> >> > >> Fernando.
> >> > >>
> >> > >> 2013/8/2 Sebastian Schelter <ss...@apache.org>
> >> > >>
> >> > >> > I would also be fine with keeping if there is demand. I just
> >> proposed
> >> > to
> >> > >> > deprecate it and nobody voted against that at that point in time.
> >> > >> >
> >> > >> > --sebastian
> >> > >> >
> >> > >> >
> >> > >> > On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
> >> > >> > > There's a part of Nathan Halko's dissertation referenced on
> >> > algorithm
> >> > >> > page
> >> > >> > > running comparison.  In particular, he was not able to compute
> >> more
> >> > >> than
> >> > >> > 40
> >> > >> > > eigenvectors with Lanczos on wikipedia dataset. You may refer
> to
> >> > that
> >> > >> > > study.
> >> > >> > >
> >> > >> > > On the accuracy part, it was not observed that it was a
> problem,
> >> > >> assuming
> >> > >> > > high level of random noise is not the case, at least not in
> >> LSA-like
> >> > >> > > application used there.
> >> > >> > >
> >> > >> > > That said, i am all for diversity of tools, I would actually be
> >> +0
> >> > on
> >> > >> > > deprecating Lanczos, it is not like we are lacking support for
> >> it.
> >> > >> SSVD
> >> > >> > > could use improvements too.
> >> > >> > >
> >> > >> > >
> >> > >> > > On Thu, Aug 1, 2013 at 3:15 AM, Fernando Fernández <
> >> > >> > > fernando.fernandez.gonzalez@gmail.com> wrote:
> >> > >> > >
> >> > >> > >> Hi everyone,
> >> > >> > >>
> >> > >> > >> Sorry if I duplicate the question but I've been looking for an
> >> > answer
> >> > >> > and I
> >> > >> > >> haven't found an explanation other than it's not being used
> >> > (together
> >> > >> > with
> >> > >> > >> some other algorithms). If it's been discussed in depth before
> >> > maybe
> >> > >> you
> >> > >> > >> can point me to some link with the discussion.
> >> > >> > >>
> >> > >> > >> I have successfully used Lanczos in several projects and it's
> >> been
> >> > a
> >> > >> > >> surprise to me finding that the main reason (according to what
> >> I've
> >> > >> read
> >> > >> > >> that might not be the full story) is that it's not being used.
> >> At
> >> > the
> >> > >> > >> begining I supposed it was because SSVD is supposed to be much
> >> > faster
> >> > >> > with
> >> > >> > >> similar results, but after making some tests I have found that
> >> > >> running
> >> > >> > >> times are similar or even worse than lanczos for some
> >> > configurations
> >> > >> (I
> >> > >> > >> have tried several combinations of parameters, given child
> >> > processes
> >> > >> > enough
> >> > >> > >> memory, etc. and had no success in running SSVD at least in
> 3/4
> >> of
> >> > >> time
> >> > >> > >> Lanczos runs, thouh they might be some combinations of
> >> parameters I
> >> > >> have
> >> > >> > >> still not tried). It seems to be quite tricky to find a good
> >> > >> > combination of
> >> > >> > >> parameters for SSVD and I have seen also a precision loss in
> >> some
> >> > >> > examples
> >> > >> > >> that makes me not confident in migrating Lanczos to SSVD from
> >> now
> >> > on
> >> > >> > (How
> >> > >> > >> far can I trust results from a combination of parameters that
> >> runs
> >> > in
> >> > >> > >> significant less time, or at least a good time?).
> >> > >> > >>
> >> > >> > >> Can someone convince me that SSVD is actually a better option
> >> than
> >> > >> > Lanczos?
> >> > >> > >> (I'm totally willing to be convinced... :) )
> >> > >> > >>
> >> > >> > >> Thank you very much in advance.
> >> > >> > >>
> >> > >> > >> Fernando.
> >> > >> > >>
> >> > >> > >
> >> > >> >
> >> > >> >
> >> > >>
> >> > >
> >> > >
> >> >
> >>
> >
> >
>

Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Fri, Aug 2, 2013 at 3:08 PM, Dmitriy Lyubimov <dl...@gmail.com> wrote:

>
>
>
> On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
> fernando.fernandez.gonzalez@gmail.com> wrote:
>
>> I don't agree with k>10 being unlikely meaningful. I've used SVD in text
>> mining problems where k~150 yielded best results (not only a good choice
>> based on plotting eigenvalues and seeing elbow in decay was near 150 but
>> checking results with different k's and seeing around 150 made much more
>> sense). Currently I'm working in a recommender system and already have
>> Lanczos running with k~50 producing best results, again, based on visual
>> exploration of eigenvalues and exploring results one by one and seeing
>> they
>> were more meaningful. Current tests with SSVD are based on the latter and
>> when I say I'm not getting good results I mean Lanczos is working properly
>> on the same problem (I've explored eigenvalues up to 150 and have a good
>> decay) and SSVD is not, but as I said, this might be caused by some bug in
>> the input process, seems to strange to me that results are so different so
>>
>
> Depends on how you define "so". But again, in that respect all i can point
> to is to the accuracy study by N. Halko, out of published work.
>
I guess I can save you digging thru the Mahout wiki; here is the reference:
http://amath.colorado.edu/faculty/martinss/Pubs/2012_halko_dissertation.pdf.
Specifically, look at the eigenvalue chart comparison at page 179. This is
run on Mahout's Lanczos and SSVD neck-to-neck. The order of accuracy for
the first 40 values is claimed as "Order of accuracy is q = 3; q = 2;
q = 1, lanczos, q = 0." (see the source for details of the accuracy
assessment).

One thing I did not understand there is why Lanczos showed such an
uncharacteristic fall-off for values between 40 and 60. I have always
assumed -q=1 was showing something much closer to reality after the first
40 values as well.


>
>> I'll get back to this discussions when I figure it out :) . If you are
>> curious about the numbers: 1MM rows by 150k columns for text mining case
>> and 18 MM rows by 80k columns for recommender.
>>
>> About p and q, I have been playing around with movielens 100k dataset and
>> found q>0 actually worsens results in terms of precision (nothing severe
>> though, but it happens) and its better to increase p a little in that
>> particular case, so my guess is it depends a lot on the dataset though I
>> don't know how.
>>
>
> This again sounds very strange.  The algorithm is non-deterministic, which
> means errors you get in one run, will be different from errors in another
> run, but honesly, you would be the first to report that power iterations
> worsen expectation of an error. All theoretical work and practical
> estimates did not confirm that observation; in fact, quite a bit to the
> contrary.
>
>
>>
>> 2013/8/2 Dmitriy Lyubimov <dl...@gmail.com>
>>
>> > the only time you would not get good results is if spectrum does not
>> have a
>> > good decay. Which is equivalent to mostly same variance in most of
>> original
>> > basis directions. This problem is similar to problem that arises with
>> PCA
>> > when you try to do dimensionality reduction with retaining certain
>> %-tage
>> > of variance. in case of flat spectrum decay, you'd need much bigger k to
>> > retain same amount of variance in dimensionally reduced projection. In
>> that
>> > sense SSVD solution for a given k is as good as PCA gets for the same k.
>> > Also, i believe (but not 100% sure) "problems too small" exhibit higher
>> > errors due to the law of large numbers.
>> >
>> >
>> > On Fri, Aug 2, 2013 at 10:41 AM, Dmitriy Lyubimov <dl...@gmail.com>
>> > wrote:
>> >
>> > > if you use k > 40 you are already beating Lanczos for larger datasets.
>> > > k>10 is unlikely meaninful. p need not be more than 15% of k (default
>> is
>> > > 15). use q=1, q>1 does not yield tangible improvements in real world.
>> > >  Again, see Nathan Halko's dissertation on accuracy comparison.
>> > >
>> > >
>> > >
>> > > On Fri, Aug 2, 2013 at 4:17 AM, Fernando Fernández <
>> > > fernando.fernandez.gonzalez@gmail.com> wrote:
>> > >
>> > >> Keeping Lanczos would be nice, Like I said, it's currently being
>> used in
>> > >> some projects with good results and I think it's easier to tune so it
>> > >> would
>> > >> be my first choice for future developments. I still need to further
>> test
>> > >> SSVD, specially because in the current example I'm working it yields
>> > very
>> > >> different results from Lanczos. We are investigating if it can be due
>> > to a
>> > >> bug when loading the data, though dimensions of the ouptut seem ok,
>> or
>> > if
>> > >> it's a question of increasing p or q parameters. If it's a question
>> of
>> > >> increasing p and q I think running times would make SSVD not viable.
>> I
>> > >> hope
>> > >> to be able to provide some comparison figures in terms of precision
>> and
>> > >> running time in a month or so.
>> > >>
>> > >> I hope that other users reads this and say wether they are using
>> > Lanczos.
>> > >>
>> > >> Best,
>> > >> Fernando.
>> > >>
>> > >> 2013/8/2 Sebastian Schelter <ss...@apache.org>
>> > >>
>> > >> > I would also be fine with keeping if there is demand. I just
>> proposed
>> > to
>> > >> > deprecate it and nobody voted against that at that point in time.
>> > >> >
>> > >> > --sebastian
>> > >> >
>> > >> >
>> > >> > On 02.08.2013 03:12, Dmitriy Lyubimov wrote:
>> > >> > > There's a part of Nathan Halko's dissertation referenced on
>> > algorithm
>> > >> > page
>> > >> > > running comparison.  In particular, he was not able to compute
>> more
>> > >> than
>> > >> > 40
>> > >> > > eigenvectors with Lanczos on wikipedia dataset. You may refer to
>> > that
>> > >> > > study.
>> > >> > >
>> > >> > > On the accuracy part, it was not observed that it was a problem,
>> > >> assuming
>> > >> > > high level of random noise is not the case, at least not in
>> LSA-like
>> > >> > > application used there.
>> > >> > >
>> > >> > > That said, i am all for diversity of tools, I would actually be
>> +0
>> > on
>> > >> > > deprecating Lanczos, it is not like we are lacking support for
>> it.
>> > >> SSVD
>> > >> > > could use improvements too.
>> > >> > >
>> > >> > >
>> > >> > > On Thu, Aug 1, 2013 at 3:15 AM, Fernando Fernández <
>> > >> > > fernando.fernandez.gonzalez@gmail.com> wrote:
>> > >> > >
>> > >> > >> Hi everyone,
>> > >> > >>
>> > >> > >> Sorry if I duplicate the question but I've been looking for an
>> > answer
>> > >> > and I
>> > >> > >> haven't found an explanation other than it's not being used
>> > (together
>> > >> > with
>> > >> > >> some other algorithms). If it's been discussed in depth before
>> > maybe
>> > >> you
>> > >> > >> can point me to some link with the discussion.
>> > >> > >>
>> > >> > >> I have successfully used Lanczos in several projects and it's
>> been
>> > a
>> > >> > >> surprise to me finding that the main reason (according to what
>> I've
>> > >> read
>> > >> > >> that might not be the full story) is that it's not being used.
>> At
>> > the
>> > >> > >> begining I supposed it was because SSVD is supposed to be much
>> > faster
>> > >> > with
>> > >> > >> similar results, but after making some tests I have found that
>> > >> running
>> > >> > >> times are similar or even worse than lanczos for some
>> > configurations
>> > >> (I
>> > >> > >> have tried several combinations of parameters, given child
>> > processes
>> > >> > enough
>> > >> > >> memory, etc. and had no success in running SSVD at least in 3/4
>> of
>> > >> time
>> > >> > >> Lanczos runs, thouh they might be some combinations of
>> parameters I
>> > >> have
>> > >> > >> still not tried). It seems to be quite tricky to find a good
>> > >> > combination of
>> > >> > >> parameters for SSVD and I have seen also a precision loss in
>> some
>> > >> > examples
>> > >> > >> that makes me not confident in migrating Lanczos to SSVD from
>> now
>> > on
>> > >> > (How
>> > >> > >> far can I trust results from a combination of parameters that
>> runs
>> > in
>> > >> > >> significant less time, or at least a good time?).
>> > >> > >>
>> > >> > >> Can someone convince me that SSVD is actually a better option
>> than
>> > >> > Lanczos?
>> > >> > >> (I'm totally willing to be convinced... :) )
>> > >> > >>
>> > >> > >> Thank you very much in advance.
>> > >> > >>
>> > >> > >> Fernando.
>> > >> > >>
>> > >> > >
>> > >> >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>

Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
On Fri, Aug 2, 2013 at 2:52 PM, Fernando Fernández <
fernando.fernandez.gonzalez@gmail.com> wrote:

> I don't agree with k>10 being unlikely meaningful. I've used SVD in text
> mining problems where k~150 yielded best results (not only a good choice
> based on plotting eigenvalues and seeing elbow in decay was near 150 but
> checking results with different k's and seeing around 150 made much more
> sense). Currently I'm working in a recommender system and already have
> Lanczos running with k~50 producing best results, again, based on visual
> exploration of eigenvalues and exploring results one by one and seeing they
> were more meaningful. Current tests with SSVD are based on the latter and
> when I say I'm not getting good results I mean Lanczos is working properly
> on the same problem (I've explored eigenvalues up to 150 and have a good
> decay) and SSVD is not, but as I said, this might be caused by some bug in
> the input process, seems to strange to me that results are so different so
>

Depends on how you define "so". But again, in that respect all I can point
to, out of published work, is the accuracy study by N. Halko.


> I'll get back to this discussions when I figure it out :) . If you are
> curious about the numbers: 1MM rows by 150k columns for text mining case
> and 18 MM rows by 80k columns for recommender.
>
> About p and q, I have been playing around with movielens 100k dataset and
> found q>0 actually worsens results in terms of precision (nothing severe
> though, but it happens) and its better to increase p a little in that
> particular case, so my guess is it depends a lot on the dataset though I
> don't know how.
>

This again sounds very strange. The algorithm is non-deterministic, which
means the errors you get in one run will be different from the errors in
another run, but honestly, you would be the first to report that power
iterations worsen the expectation of the error. All theoretical work and
practical estimates did not confirm that observation; in fact, quite a bit
to the contrary.
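
A small Monte Carlo sketch of this point (pure Python, made-up helper
names, not Mahout code): single runs fluctuate, but the expected error
with q = 1 stays below q = 0 on a matrix with a decaying spectrum.

```python
import math
import random

random.seed(1)

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def norm(x):
    return math.sqrt(sum(v * v for v in x))

def estimate_sigma1(A, q):
    """One randomized rank-1 trial: y = (A A^T)^q A omega,
    sigma ~= ||A^T y|| / ||y||, with a fresh random omega each call."""
    At = transpose(A)
    y = matvec(A, [random.gauss(0.0, 1.0) for _ in A[0]])
    for _ in range(q):
        y = matvec(A, matvec(At, y))
        m = norm(y)
        y = [v / m for v in y]
    return norm(matvec(At, y)) / norm(y)

A = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.5]]  # sigma = 3, 1, 0.5
trials = 200
err_q0 = sum(3.0 - estimate_sigma1(A, 0) for _ in range(trials)) / trials
err_q1 = sum(3.0 - estimate_sigma1(A, 1) for _ in range(trials)) / trials

# Individual runs differ (non-determinism), but on average q = 1 is
# strictly more accurate than q = 0: it does not worsen the expectation.
assert 0.0 <= err_q1 < err_q0
```

A single pair of runs can still come out the "wrong" way, which is why a
one-off q = 0 vs q = 1 comparison on one dataset is weak evidence either
way.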



Re: Why is Lanczos deprecated?

Posted by Fernando Fernández <fe...@gmail.com>.
I don't agree with k>10 being unlikely meaningful. I've used SVD in text
mining problems where k~150 yielded the best results (not only was it a good
choice based on plotting the eigenvalues and seeing the elbow in the decay
near 150, but checking results with different k's showed that around 150 they
made much more sense). Currently I'm working on a recommender system and
already have Lanczos running with k~50 producing the best results, again
based on visual exploration of the eigenvalues and on examining the results
one by one and seeing they were more meaningful. The current tests with SSVD
are based on the latter, and when I say I'm not getting good results I mean
that Lanczos is working properly on the same problem (I've explored
eigenvalues up to 150 and have a good decay) and SSVD is not. As I said,
though, this might be caused by some bug in the input process; it seems too
strange to me that the results are so different, so I'll get back to this
discussion when I figure it out :) . If you are curious about the numbers:
1MM rows by 150k columns for the text mining case and 18MM rows by 80k
columns for the recommender.

About p and q, I have been playing around with the MovieLens 100k dataset
and found that q>0 actually worsens results in terms of precision (nothing
severe, but it happens) and that it's better to increase p a little in that
particular case, so my guess is that it depends a lot on the dataset, though
I don't know how.



Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
The only time you would not get good results is if the spectrum does not
have a good decay, which is equivalent to having mostly the same variance in
most of the original basis directions. This problem is similar to the one
that arises with PCA when you try to do dimensionality reduction while
retaining a certain percentage of variance: in case of flat spectrum decay,
you'd need a much bigger k to retain the same amount of variance in the
dimensionally reduced projection. In that sense the SSVD solution for a
given k is as good as PCA gets for the same k. Also, I believe (but am not
100% sure) that problems that are too small exhibit higher errors due to the
law of large numbers.
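The PCA analogy can be made concrete with a small sketch (plain NumPy, not
Mahout code; `k_for_variance` is a name invented for this example): for a
flat spectrum, the k needed to retain a fixed fraction of variance is far
larger than for a decaying one.

```python
import numpy as np

def k_for_variance(sigmas, frac=0.9):
    """Smallest k such that a rank-k truncation retains `frac` of the
    total variance (sum of squared singular values)."""
    var = np.sort(sigmas)[::-1] ** 2
    cum = np.cumsum(var) / var.sum()
    return int(np.searchsorted(cum, frac) + 1)

n = 1000
decaying = 0.95 ** np.arange(n)   # good spectral decay
flat = np.full(n, 1.0)            # no decay at all

print(k_for_variance(decaying), k_for_variance(flat))
```

With the decaying spectrum a few dozen components suffice for 90% of the
variance; with the flat one you need 90% of all components.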



Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
If you use k > 40 you are already beating Lanczos for larger datasets. k>10
is unlikely meaningful. p need not be more than 15% of k (the default is
15). Use q=1; q>1 does not yield tangible improvements in the real world.
Again, see Nathan Halko's dissertation for an accuracy comparison.
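For reference, an invocation following these rules of thumb might look like
the sketch below (paths are placeholders and the flag names are recalled
from the Mahout SSVD documentation of this era; verify them against
`mahout ssvd --help` before use):

```shell
# hypothetical example: rank 50, oversampling 15, one power iteration
mahout ssvd \
  -i /path/to/input-matrix \
  -o /path/to/ssvd-output \
  -k 50 \
  -p 15 \
  -q 1 \
  --reduceTasks 10
```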




Re: Why is Lanczos deprecated?

Posted by Fernando Fernández <fe...@gmail.com>.
Keeping Lanczos would be nice. Like I said, it's currently being used in
some projects with good results, and I think it's easier to tune, so it
would be my first choice for future developments. I still need to further
test SSVD, especially because in the current example I'm working on it
yields very different results from Lanczos. We are investigating whether
this can be due to a bug when loading the data (though the dimensions of the
output seem ok) or whether it's a question of increasing the p or q
parameters. If it's a question of increasing p and q, I think the running
times would make SSVD not viable. I hope to be able to provide some
comparison figures in terms of precision and running time in a month or so.
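One simple way to produce such precision figures, sketched here in plain
NumPy (an illustration, not Mahout code; `relative_reconstruction_error` is
a name made up for this example), is the relative Frobenius reconstruction
error of each solver's factors:

```python
import numpy as np

def relative_reconstruction_error(A, U, s, Vt):
    """|| A - U diag(s) V^T ||_F / ||A||_F for a truncated factorization."""
    return np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)

# sanity check on a random matrix: the exact truncated SVD attains the
# best possible rank-k error, sqrt(sum of the squared tail singular values)
rng = np.random.default_rng(0)
A = rng.standard_normal((80, 60))
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 20
e = relative_reconstruction_error(A, U[:, :k], s[:k], Vt[:k])
print(e)
```

The same function applied to the U, s, V produced by Lanczos and by SSVD for
the same k gives directly comparable numbers.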

I hope that other users read this and say whether they are using Lanczos.

Best,
Fernando.


Re: Why is Lanczos deprecated?

Posted by Sebastian Schelter <ss...@apache.org>.
I would also be fine with keeping it if there is demand. I just proposed to
deprecate it, and nobody voted against that at that point in time.

--sebastian




Re: Why is Lanczos deprecated?

Posted by Dmitriy Lyubimov <dl...@gmail.com>.
There's a part of Nathan Halko's dissertation, referenced on the algorithm
page, that runs this comparison. In particular, he was not able to compute
more than 40 eigenvectors with Lanczos on the Wikipedia dataset. You may
refer to that study.

On the accuracy side, it was not observed to be a problem, assuming a high
level of random noise is not present, at least not in the LSA-like
application used there.

That said, I am all for diversity of tools. I would actually be +0 on
deprecating Lanczos; it is not like we are lacking support for it. SSVD
could use improvements too.
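The root of the Lanczos limitation is numerical: in finite precision the
Lanczos basis loses orthogonality as soon as Ritz values converge, so
computing many eigenvectors requires reorthogonalizing against all previous
basis vectors, which is expensive in a MapReduce setting. A minimal NumPy
sketch of the effect (illustration only, not Mahout's implementation;
`lanczos_basis` and the test matrix are invented for this example):

```python
import numpy as np

def lanczos_basis(A, m, reorth=False, seed=0):
    """Run m steps of symmetric Lanczos; return the basis Q (n x m)."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    Q = np.zeros((n, m))
    beta, q_prev = 0.0, np.zeros(n)
    for j in range(m):
        Q[:, j] = q
        w = A @ q - beta * q_prev            # three-term recurrence
        alpha = q @ w
        w -= alpha * q
        if reorth:                           # full reorthogonalization
            w -= Q[:, :j + 1] @ (Q[:, :j + 1].T @ w)
        beta = np.linalg.norm(w)
        q_prev, q = q, w / beta
    return Q

# symmetric matrix with a few well-separated large eigenvalues: once the
# extreme Ritz values converge, orthogonality of the plain basis collapses
evals = np.concatenate([np.linspace(1, 2, 195), [100, 200, 300, 400, 500]])
A = np.diag(evals)

def orth_err(reorth):
    Q = lanczos_basis(A, m=60, reorth=reorth)
    return np.linalg.norm(Q.T @ Q - np.eye(60))

print(orth_err(False), orth_err(True))
```

Without reorthogonalization the deviation from orthogonality grows by orders
of magnitude once the dominant eigenvalues converge, which is why a
distributed Lanczos that wants many eigenvectors gets expensive.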

