You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@mahout.apache.org by Pavan K Narayanan <pa...@gmail.com> on 2014/01/10 10:11:00 UTC

travelling salesman on Mahout

Hi

Has anyone tried solving travelling salesman problem using Mahout?
(could be any version) . If yes, I would like to know,

1. What is the format of input data? is it txt file or csv file or any
other format.
2. do you have to divide the input data into so chunks? for example,
if your tsp has 10k cities and you divided that into two 5k cities
problems. (I understand running on hadoop will divide the data into
chunks of 64mb or any other user defined, but was there any external
division.

I have not solved TSP using Mahout. Would appreciate if anyone could
walk me thro the process of solving tsp.

thanks

Re: travelling salesman on Mahout

Posted by Ted Dunning <te...@gmail.com>.
I haven't seen that part of the cookbook yet either.  But the package that
it depends on has been removed from Mahout.

Evolutionary algorithms will generally be much better implemented on a
framework that supports iteration such as Giraph or Spark.  For TSP, the
natural representation would be to distribute members of the potential
solution population across the cluster.

If you choose to go with the evolving partial solution approach, you will
need to work out some way of compensating for length.  You want to have
different versions of partial solutions competing against each other, but
you also want long solutions to compete well against short solutions.  One
way to do that is to have "sponsors".  These are long solutions whose
points are a super set of a shorter solution.  Sponsors are used to
estimate the cost of adding the missing cities from the shorter solution so
that long and short solutions have a roughly equal playing field.




On Sun, Jan 12, 2014 at 1:20 AM, Pavan K Narayanan <
pavan.narayanan@gmail.com> wrote:

> Thanks Ted for your response. Any use cases where the evolutionary
> algorithm used apart from tsp ?  I got to know about a "mahout
> cookbook" that has a receipe walkthrough on implementing TSP on
> mahout. The book is not released in my country yet, but I would like
> to find out:
>
> which version of Mahout is being used?
> is there any results available on the internet for solving benchmark
> TSP problems using Mahout?
>
> I am not depending on Mahout to solve planning problems but I would
> like to know how the algorithm works in mapreduce paradigm and
> specifically whether the node arc incidence matrix has been split for
> purpose of mapreduce during run time. (I am not concerned about the
> obtaining optimal solutions from Mahout)
>
>
>
> On 11 January 2014 00:46, Ted Dunning <te...@gmail.com> wrote:
> > TSP is generally solved using a number of heuristics guiding a randomized
> > search.  Mahout has essentially no provision for helping with this.
> >
> > If you want a quick and dirty solution, I would recommend something like
> an
> > evolutionary algorithm in which you have segments that self-assemble or
> > split with the parameters controlling the assembly and splitting subject
> to
> > auto-mutation.
> >
> > Conventional genetic algorithms should also work reasonably well, but you
> > will almost certainly have to include some auto-evolution.
> >
> > Mahout does have a directed-step evolutionary algorithm, but it is not
> > suitable for discrete problems such as TSP.
> >
> >
> >
> > On Fri, Jan 10, 2014 at 1:11 AM, Pavan K Narayanan <
> > pavan.narayanan@gmail.com> wrote:
> >
> >> Hi
> >>
> >> Has anyone tried solving travelling salesman problem using Mahout?
> >> (could be any version) . If yes, I would like to know,
> >>
> >> 1. What is the format of input data? is it txt file or csv file or any
> >> other format.
> >> 2. do you have to divide the input data into so chunks? for example,
> >> if your tsp has 10k cities and you divided that into two 5k cities
> >> problems. (I understand running on hadoop will divide the data into
> >> chunks of 64mb or any other user defined, but was there any external
> >> division.
> >>
> >> I have not solved TSP using Mahout. Would appreciate if anyone could
> >> walk me thro the process of solving tsp.
> >>
> >> thanks
> >>
>

RE: travelling salesman on Mahout

Posted by si...@bt.com.
Hi all - there is a project at MIT called FlexGP that has done more work on this. 

http://groups.csail.mit.edu/EVO-DesignOpt/groupWebSite/index.php?n=Site.FlexGP

Unfortunately I can't find a download for the code so I suppose that it's not opensource, however you might like to contact these folks and see if they can help you get a leg up.

Best, 

Simon 

-----Original Message-----
From: Ted Dunning [mailto:ted.dunning@gmail.com] 
Sent: 13 January 2014 18:00
To: user@mahout.apache.org
Subject: Re: travelling salesman on Mahout

On Mon, Jan 13, 2014 at 8:42 AM, Pavan K Narayanan < pavan.narayanan@gmail.com> wrote:

>
>
> Please may I ask why TSP has been removed from Mahout.


It was the Genetic Algorithms that were removed.

The implementation was unmaintained and not scalable and thus not appropriate for Mahout.


> Its just that I
> see discussions about distributed Genetic Algorithm and other 
> evolutionary algorithms being implemented in distributed environment 
> and feel it would be possible to implement in Mahout as well.
>

Indeed, it is quite possible.  Most evolutionary algorithms are trivially parallel.  But somebody has to step up to maintain the code.


>
> For example, the initial population generation and evaluating the best 
> initial route could be done in 'n' nodes using 'n' map phases and we 
> could have a reduce phase where the best of initial route from 'n'
> nodes could be taken for further treatment like mutation, etc... What 
> are your thoughts on this?
>

I think that it would be better done using something like Giraph or Spark.
 This isn't a natural fit to map-reduce due to the iteration.

Also, TSP itself is not a particularly interesting problem for Mahout.  The Concorde system is very mature and already scales to over a hundred nodes using MPI.  Mahout is not going touch it and has very little to offer.

Re: travelling salesman on Mahout

Posted by Ted Dunning <te...@gmail.com>.
On Mon, Jan 13, 2014 at 8:42 AM, Pavan K Narayanan <
pavan.narayanan@gmail.com> wrote:

>
>
> Please may I ask why TSP has been removed from Mahout.


It was the Genetic Algorithms that were removed.

The implementation was unmaintained and not scalable and thus not
appropriate for Mahout.


> Its just that I
> see discussions about distributed Genetic Algorithm and other
> evolutionary algorithms being implemented in distributed environment
> and feel it would be possible to implement in Mahout as well.
>

Indeed, it is quite possible.  Most evolutionary algorithms are trivially
parallel.  But somebody has to step up to maintain the code.


>
> For example, the initial population generation and evaluating the best
> initial route could be done in 'n' nodes using 'n' map phases and we
> could have a reduce phase where the best of initial route from 'n'
> nodes could be taken for further treatment like mutation, etc... What
> are your thoughts on this?
>

I think that it would be better done using something like Giraph or Spark.
 This isn't a natural fit to map-reduce due to the iteration.

Also, TSP itself is not a particularly interesting problem for Mahout.  The
Concorde system is very mature and already scales to over a hundred nodes
using MPI.  Mahout is not going touch it and has very little to offer.

Re: travelling salesman on Mahout

Posted by Pavan K Narayanan <pa...@gmail.com>.
Thanks again Ted, for the link -- Just read a few pages and it is very
interesting.

Please may I ask why TSP has been removed from Mahout. Its just that I
see discussions about distributed Genetic Algorithm and other
evolutionary algorithms being implemented in distributed environment
and feel it would be possible to implement in Mahout as well.

For example, the initial population generation and evaluating the best
initial route could be done in 'n' nodes using 'n' map phases and we
could have a reduce phase where the best of initial route from 'n'
nodes could be taken for further treatment like mutation, etc... What
are your thoughts on this?

Regards,
Pavan

On 13 January 2014 07:24, Ted Dunning <te...@gmail.com> wrote:
> For what it is worth, R has some really nice interfaces for the standard
> TSP solvers and several sample data sets.  That can really help you in your
> testing.
>
> See http://cran.r-project.org/web/packages/TSP/vignettes/TSP.pdf
>
>
>
> On Sun, Jan 12, 2014 at 1:20 AM, Pavan K Narayanan <
> pavan.narayanan@gmail.com> wrote:
>
>> Thanks Ted for your response. Any use cases where the evolutionary
>> algorithm used apart from tsp ?  I got to know about a "mahout
>> cookbook" that has a receipe walkthrough on implementing TSP on
>> mahout. The book is not released in my country yet, but I would like
>> to find out:
>>
>> which version of Mahout is being used?
>> is there any results available on the internet for solving benchmark
>> TSP problems using Mahout?
>>
>> I am not depending on Mahout to solve planning problems but I would
>> like to know how the algorithm works in mapreduce paradigm and
>> specifically whether the node arc incidence matrix has been split for
>> purpose of mapreduce during run time. (I am not concerned about the
>> obtaining optimal solutions from Mahout)
>>
>>
>>
>> On 11 January 2014 00:46, Ted Dunning <te...@gmail.com> wrote:
>> > TSP is generally solved using a number of heuristics guiding a randomized
>> > search.  Mahout has essentially no provision for helping with this.
>> >
>> > If you want a quick and dirty solution, I would recommend something like
>> an
>> > evolutionary algorithm in which you have segments that self-assemble or
>> > split with the parameters controlling the assembly and splitting subject
>> to
>> > auto-mutation.
>> >
>> > Conventional genetic algorithms should also work reasonably well, but you
>> > will almost certainly have to include some auto-evolution.
>> >
>> > Mahout does have a directed-step evolutionary algorithm, but it is not
>> > suitable for discrete problems such as TSP.
>> >
>> >
>> >
>> > On Fri, Jan 10, 2014 at 1:11 AM, Pavan K Narayanan <
>> > pavan.narayanan@gmail.com> wrote:
>> >
>> >> Hi
>> >>
>> >> Has anyone tried solving travelling salesman problem using Mahout?
>> >> (could be any version) . If yes, I would like to know,
>> >>
>> >> 1. What is the format of input data? is it txt file or csv file or any
>> >> other format.
>> >> 2. do you have to divide the input data into so chunks? for example,
>> >> if your tsp has 10k cities and you divided that into two 5k cities
>> >> problems. (I understand running on hadoop will divide the data into
>> >> chunks of 64mb or any other user defined, but was there any external
>> >> division.
>> >>
>> >> I have not solved TSP using Mahout. Would appreciate if anyone could
>> >> walk me thro the process of solving tsp.
>> >>
>> >> thanks
>> >>
>>

Re: travelling salesman on Mahout

Posted by Ted Dunning <te...@gmail.com>.
For what it is worth, R has some really nice interfaces for the standard
TSP solvers and several sample data sets.  That can really help you in your
testing.

See http://cran.r-project.org/web/packages/TSP/vignettes/TSP.pdf



On Sun, Jan 12, 2014 at 1:20 AM, Pavan K Narayanan <
pavan.narayanan@gmail.com> wrote:

> Thanks Ted for your response. Any use cases where the evolutionary
> algorithm used apart from tsp ?  I got to know about a "mahout
> cookbook" that has a receipe walkthrough on implementing TSP on
> mahout. The book is not released in my country yet, but I would like
> to find out:
>
> which version of Mahout is being used?
> is there any results available on the internet for solving benchmark
> TSP problems using Mahout?
>
> I am not depending on Mahout to solve planning problems but I would
> like to know how the algorithm works in mapreduce paradigm and
> specifically whether the node arc incidence matrix has been split for
> purpose of mapreduce during run time. (I am not concerned about the
> obtaining optimal solutions from Mahout)
>
>
>
> On 11 January 2014 00:46, Ted Dunning <te...@gmail.com> wrote:
> > TSP is generally solved using a number of heuristics guiding a randomized
> > search.  Mahout has essentially no provision for helping with this.
> >
> > If you want a quick and dirty solution, I would recommend something like
> an
> > evolutionary algorithm in which you have segments that self-assemble or
> > split with the parameters controlling the assembly and splitting subject
> to
> > auto-mutation.
> >
> > Conventional genetic algorithms should also work reasonably well, but you
> > will almost certainly have to include some auto-evolution.
> >
> > Mahout does have a directed-step evolutionary algorithm, but it is not
> > suitable for discrete problems such as TSP.
> >
> >
> >
> > On Fri, Jan 10, 2014 at 1:11 AM, Pavan K Narayanan <
> > pavan.narayanan@gmail.com> wrote:
> >
> >> Hi
> >>
> >> Has anyone tried solving travelling salesman problem using Mahout?
> >> (could be any version) . If yes, I would like to know,
> >>
> >> 1. What is the format of input data? is it txt file or csv file or any
> >> other format.
> >> 2. do you have to divide the input data into so chunks? for example,
> >> if your tsp has 10k cities and you divided that into two 5k cities
> >> problems. (I understand running on hadoop will divide the data into
> >> chunks of 64mb or any other user defined, but was there any external
> >> division.
> >>
> >> I have not solved TSP using Mahout. Would appreciate if anyone could
> >> walk me thro the process of solving tsp.
> >>
> >> thanks
> >>
>

Re: travelling salesman on Mahout

Posted by Pavan K Narayanan <pa...@gmail.com>.
Thanks Ted for your response. Any use cases where the evolutionary
algorithm used apart from tsp ?  I got to know about a "mahout
cookbook" that has a receipe walkthrough on implementing TSP on
mahout. The book is not released in my country yet, but I would like
to find out:

which version of Mahout is being used?
is there any results available on the internet for solving benchmark
TSP problems using Mahout?

I am not depending on Mahout to solve planning problems but I would
like to know how the algorithm works in mapreduce paradigm and
specifically whether the node arc incidence matrix has been split for
purpose of mapreduce during run time. (I am not concerned about the
obtaining optimal solutions from Mahout)



On 11 January 2014 00:46, Ted Dunning <te...@gmail.com> wrote:
> TSP is generally solved using a number of heuristics guiding a randomized
> search.  Mahout has essentially no provision for helping with this.
>
> If you want a quick and dirty solution, I would recommend something like an
> evolutionary algorithm in which you have segments that self-assemble or
> split with the parameters controlling the assembly and splitting subject to
> auto-mutation.
>
> Conventional genetic algorithms should also work reasonably well, but you
> will almost certainly have to include some auto-evolution.
>
> Mahout does have a directed-step evolutionary algorithm, but it is not
> suitable for discrete problems such as TSP.
>
>
>
> On Fri, Jan 10, 2014 at 1:11 AM, Pavan K Narayanan <
> pavan.narayanan@gmail.com> wrote:
>
>> Hi
>>
>> Has anyone tried solving travelling salesman problem using Mahout?
>> (could be any version) . If yes, I would like to know,
>>
>> 1. What is the format of input data? is it txt file or csv file or any
>> other format.
>> 2. do you have to divide the input data into so chunks? for example,
>> if your tsp has 10k cities and you divided that into two 5k cities
>> problems. (I understand running on hadoop will divide the data into
>> chunks of 64mb or any other user defined, but was there any external
>> division.
>>
>> I have not solved TSP using Mahout. Would appreciate if anyone could
>> walk me thro the process of solving tsp.
>>
>> thanks
>>

Re: travelling salesman on Mahout

Posted by Ted Dunning <te...@gmail.com>.
TSP is generally solved using a number of heuristics guiding a randomized
search.  Mahout has essentially no provision for helping with this.

If you want a quick and dirty solution, I would recommend something like an
evolutionary algorithm in which you have segments that self-assemble or
split with the parameters controlling the assembly and splitting subject to
auto-mutation.

Conventional genetic algorithms should also work reasonably well, but you
will almost certainly have to include some auto-evolution.

Mahout does have a directed-step evolutionary algorithm, but it is not
suitable for discrete problems such as TSP.



On Fri, Jan 10, 2014 at 1:11 AM, Pavan K Narayanan <
pavan.narayanan@gmail.com> wrote:

> Hi
>
> Has anyone tried solving travelling salesman problem using Mahout?
> (could be any version) . If yes, I would like to know,
>
> 1. What is the format of input data? is it txt file or csv file or any
> other format.
> 2. do you have to divide the input data into so chunks? for example,
> if your tsp has 10k cities and you divided that into two 5k cities
> problems. (I understand running on hadoop will divide the data into
> chunks of 64mb or any other user defined, but was there any external
> division.
>
> I have not solved TSP using Mahout. Would appreciate if anyone could
> walk me thro the process of solving tsp.
>
> thanks
>