Posted to dev@gobblin.apache.org by Abhishek Tiwari <ab...@apache.org> on 2018/03/23 18:21:43 UTC

gobblin-standalone.sh vs gobblin-mapreduce.sh

(moved from Gitter)
bryancjacobs @bryancjacobs Mar 16 21:38

> Could someone offer some guidance on when to choose:
> ./bin/gobblin-standalone.sh
> VS
> ./bin/gobblin-mapreduce.sh


Shirshanka Das @shirshanka Mar 19 13:33

> @bryancjacobs : gobblin-standalone runs the Gobblin job in a single
> process (with a thread pool for parallelism). gobblin-mapreduce launches the
> Gobblin job as a Hadoop MR job (with mappers for parallelism).
> gobblin-standalone is definitely the easiest way to get started with
> writing and running Gobblin jobs as a developer. If your data sizes are
> small, you can even get by with using it as your production deployment.
> However, most larger data movement use cases will benefit from the
> parallelism and fault tolerance of either the MapReduce mode or the
> cluster mode.
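
To make the "single process with a thread pool" idea concrete, here is a
minimal Java sketch. It is not Gobblin code and does not use any Gobblin API:
the class name SingleProcessSketch, the method runAll, and the notion of a
numbered "work unit" are all hypothetical, chosen only for illustration. It
shows independent tasks sharing one JVM and a fixed-size pool, which is
roughly the parallelism model described for standalone mode. In MapReduce
mode the same kind of work would instead be spread across mappers on a
Hadoop cluster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical illustration only; none of these names come from Gobblin.
public class SingleProcessSketch {

    // Runs `workUnits` independent tasks on a fixed pool of `threads` threads
    // inside this one process and returns how many tasks completed.
    static int runAll(int threads, int workUnits) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < workUnits; i++) {
                final int id = i;
                // Stand-in for real work (extract, convert, write a slice of data).
                Callable<Integer> task = () -> id;
                futures.add(pool.submit(task));
            }
            int completed = 0;
            for (Future<Integer> f : futures) {
                f.get(); // blocks until this task finishes; rethrows its failure
                completed++;
            }
            return completed;
        } catch (Exception e) {
            // One failed task fails the whole run: all tasks share this process,
            // so there is no cluster-level retry in this sketch.
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("completed " + runAll(4, 10) + " work units");
    }
}
```

Everything here lives in one JVM, so throughput is bounded by one machine
and a crash of that process stops every task. That is the trade-off behind
the advice to move larger workloads to the MapReduce or cluster mode.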