Posted to dev@giraph.apache.org by Sergey Edunov <ed...@gmail.com> on 2014/06/27 22:49:00 UTC
Review Request 23140: Fix checkpointing
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
-----------------------------------------------------------
Review request for giraph.
Repository: giraph-git
Description
-------
This fix merely makes checkpointing work again.
Diffs
-----
giraph-core/src/main/java/org/apache/giraph/aggregators/Aggregator.java 514e470
giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorHandler.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
giraph-core/src/main/java/org/apache/giraph/aggregators/BasicAggregator.java 07a4100
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2
giraph-core/src/main/java/org/apache/giraph/comm/aggregators/AllAggregatorServerData.java 177e738
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045
giraph-core/src/main/java/org/apache/giraph/master/DefaultMasterCompute.java bfb6f0e
giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f
giraph-core/src/main/java/org/apache/giraph/master/MasterCompute.java d77a9b5
giraph-core/src/main/java/org/apache/giraph/master/WritableMasterAggregatorUsage.java PRE-CREATION
giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a
giraph-core/src/main/java/org/apache/giraph/worker/WorkerAggregatorHandler.java 9bfd7b5
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db
giraph-core/src/main/java/org/apache/giraph/worker/WorkerThreadAggregatorUsage.java 194127e
giraph-core/src/main/java/org/apache/giraph/worker/WritableWorkerAggregatorUsage.java PRE-CREATION
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b
Diff: https://reviews.apache.org/r/23140/diff/
Testing
-------
I tested it by running multiple different jobs. I ran PageRank on 2*10^9 vertices with 200 workers, and it seems to work just fine. Saving a checkpoint takes only 2 minutes.
Thanks,
Sergey Edunov
Re: Review Request 23140: Fix checkpointing
Posted by Sergey Edunov <ed...@gmail.com>.
> On July 2, 2014, 1:53 a.m., Maja Kabiljo wrote:
> > giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java, line 19
> > <https://reviews.apache.org/r/23140/diff/2/?file=622266#file622266line19>
> >
> > Why did you move this file?
> On July 2, 2014, 1:53 a.m., Maja Kabiljo wrote:
> > giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java, lines 817-818
> > <https://reviews.apache.org/r/23140/diff/2/?file=622249#file622249line817>
> >
> > Interesting, where do we rely on this?
I don't remember right now; I'll run some experiments later.
- Sergey
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47169
-----------------------------------------------------------
Re: Review Request 23140: Fix checkpointing
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47169
-----------------------------------------------------------
Thanks, much shorter now. Should we add some tests to make sure things don't get broken again?
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java
<https://reviews.apache.org/r/23140/#comment82778>
Why ignore superstep 0? For example, there might be a lot of filtering going on during the input superstep, in which case it's cheaper to restart from a checkpoint than to read all the data again.
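The diff under review is not shown in this thread, so the following is only a hypothetical sketch of the trade-off raised here, not the actual logic in BspService. The class name, the method, and the `skipSuperstepZero` flag are all assumptions made for illustration: with the flag set, a job with expensive input filtering gets no checkpoint until the first later superstep that matches the frequency.

```java
// Hypothetical sketch; none of these names come from Giraph.
class CheckpointSchedule {
  /**
   * Decides whether a checkpoint is written at the given superstep.
   * frequency <= 0 disables checkpointing entirely.
   */
  static boolean shouldCheckpoint(long superstep, int frequency,
      boolean skipSuperstepZero) {
    if (frequency <= 0 || superstep < 0) {
      return false;
    }
    if (skipSuperstepZero && superstep == 0) {
      // The loaded (and possibly filtered) input is not saved here.
      return false;
    }
    return superstep % frequency == 0;
  }

  public static void main(String[] args) {
    for (long s = 0; s < 5; s++) {
      System.out.println(s + ": skip0=" + shouldCheckpoint(s, 2, true)
          + " keep0=" + shouldCheckpoint(s, 2, false));
    }
  }
}
```

Under these assumptions, keeping superstep 0 costs one extra checkpoint but lets a restart avoid re-reading the input.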
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java
<https://reviews.apache.org/r/23140/#comment82781>
Interesting, where do we rely on this?
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java
<https://reviews.apache.org/r/23140/#comment82777>
Nice bug ;-)
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java
<https://reviews.apache.org/r/23140/#comment82787>
This is what the output threads are called; please name these differently.
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java
<https://reviews.apache.org/r/23140/#comment82775>
We are not using Serializable - what's transient here for?
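As background for this question, here is a minimal sketch of why `transient` matters only for `java.io.Serializable`. The class `TransientDemo` is a hypothetical stand-in, not Giraph's WorkerContext, and its `write`/`readFields` methods only imitate the hand-written style of serialization; they are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical stand-in class; this is not Giraph's WorkerContext.
class TransientDemo implements Serializable {
  private static final long serialVersionUID = 1L;

  int kept = 1;
  transient int skipped = 2;

  // Explicit serialization: fields are written by hand, so the
  // transient modifier plays no role here.
  void write(DataOutput out) throws IOException {
    out.writeInt(kept);
    out.writeInt(skipped);
  }

  void readFields(DataInput in) throws IOException {
    kept = in.readInt();
    skipped = in.readInt();
  }

  // Round trip through java.io serialization, which skips transient fields.
  static TransientDemo viaJavaSerialization(TransientDemo src) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
        out.writeObject(src);
      }
      try (ObjectInputStream in = new ObjectInputStream(
          new ByteArrayInputStream(bytes.toByteArray()))) {
        return (TransientDemo) in.readObject();
      }
    } catch (IOException | ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Round trip through the hand-written write/readFields methods.
  static TransientDemo viaExplicitMethods(TransientDemo src) {
    try {
      ByteArrayOutputStream bytes = new ByteArrayOutputStream();
      try (DataOutputStream out = new DataOutputStream(bytes)) {
        src.write(out);
      }
      TransientDemo copy = new TransientDemo();
      try (DataInputStream in = new DataInputStream(
          new ByteArrayInputStream(bytes.toByteArray()))) {
        copy.readFields(in);
      }
      return copy;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    TransientDemo src = new TransientDemo();
    src.skipped = 7;
    System.out.println(viaJavaSerialization(src).skipped); // 0: field was skipped
    System.out.println(viaExplicitMethods(src).skipped);   // 7: field was written
  }
}
```

When a class is serialized only through explicit write/read methods, the modifier is at most a hint to human readers.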
giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java
<https://reviews.apache.org/r/23140/#comment82772>
Why did you move this file?
- Maja Kabiljo
Re: Review Request 23140: Fix checkpointing
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47838
-----------------------------------------------------------
Looks great, a few final comments about the test.
giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java
<https://reviews.apache.org/r/23140/#comment84071>
I'm a bit concerned that this test would have passed even if the restart from checkpoint didn't actually happen and the app simply ran from the beginning. Can we somehow ensure that it did?
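To illustrate the concern, here is a toy model of a superstep loop (not TestCheckpointing, and not Giraph code; every name in it is hypothetical). A deterministic job gives the same final value whether it resumes from a checkpoint or starts over, so a test that compares only results cannot tell the two apart. Recording the first superstep that actually executed can.

```java
// Toy model with hypothetical names; it does not use any Giraph API.
class CheckpointRestartDemo {
  /** Outcome of one run: final value and first superstep actually executed. */
  static final class Run {
    final long value;
    final long firstSuperstep;

    Run(long value, long firstSuperstep) {
      this.value = value;
      this.firstSuperstep = firstSuperstep;
    }
  }

  /** Executes supersteps [startSuperstep, endSuperstep), adding each index. */
  static Run run(long startSuperstep, long startValue, long endSuperstep) {
    long value = startValue;
    for (long s = startSuperstep; s < endSuperstep; s++) {
      value += s;
    }
    return new Run(value, startSuperstep);
  }

  public static void main(String[] args) {
    Run fromScratch = run(0, 0, 6);
    // Pretend a checkpoint at superstep 4 saved the value 0 + 1 + 2 + 3 = 6.
    Run resumed = run(4, 6, 6);
    System.out.println(fromScratch.value == resumed.value);  // true
    System.out.println(resumed.firstSuperstep);              // 4
  }
}
```

A real test would need some equivalent signal from the restarted job, for example the first superstep number seen by a worker-side hook; how to obtain that in Giraph is left open here.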
giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java
<https://reviews.apache.org/r/23140/#comment84066>
Can you reuse the same conf and just add one setting (or at least create a method that creates the conf)?
giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java
<https://reviews.apache.org/r/23140/#comment84068>
You can extend DefaultWorkerContext to avoid overriding empty methods.
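The suggestion is an instance of the no-op base class pattern. The sketch below uses hypothetical stand-ins (`Context`, `DefaultContext`, `CountingContext`) rather than Giraph's real classes, and the hook names are assumptions chosen for illustration.

```java
// Hypothetical stand-ins; these are not Giraph's WorkerContext classes.
abstract class Context {
  abstract void preApplication();
  abstract void postApplication();
  abstract void preSuperstep();
  abstract void postSuperstep();
}

// No-op defaults, so subclasses override only the hooks they care about.
class DefaultContext extends Context {
  @Override void preApplication() { }
  @Override void postApplication() { }
  @Override void preSuperstep() { }
  @Override void postSuperstep() { }
}

// A test context that needs a single hook and inherits the rest.
class CountingContext extends DefaultContext {
  int supersteps;

  @Override
  void preSuperstep() {
    supersteps++;
  }

  public static void main(String[] args) {
    CountingContext ctx = new CountingContext();
    ctx.preApplication();
    for (int i = 0; i < 3; i++) {
      ctx.preSuperstep();
      ctx.postSuperstep();
    }
    ctx.postApplication();
    System.out.println(ctx.supersteps); // 3
  }
}
```

Extending the abstract class directly would force the test to spell out all four methods, three of them empty.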
- Maja Kabiljo
Re: Review Request 23140: Fix checkpointing
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47902
-----------------------------------------------------------
Ship it!
Thanks Sergey, +1, I'll commit it!
- Maja Kabiljo
Re: Review Request 23140: Fix checkpointing
Posted by Sergey Edunov <ed...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
-----------------------------------------------------------
(Updated July 16, 2014, 3:59 a.m.)
Review request for giraph.
Repository: giraph-git
Description
-------
This fix merely makes checkpointing work again.
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395
giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java PRE-CREATION
Diff: https://reviews.apache.org/r/23140/diff/
Testing
-------
I tested it by running multiple different jobs. I ran PageRank on 2*10^9 vertices with 200 workers, and it seems to work just fine. Saving a checkpoint takes only 2 minutes.
Thanks,
Sergey Edunov
Re: Review Request 23140: Fix checkpointing
Posted by Sergey Edunov <ed...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
-----------------------------------------------------------
(Updated July 15, 2014, 11:33 p.m.)
Review request for giraph.
Repository: giraph-git
Description
-------
This fix merely makes checkpointing work again.
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395
giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
giraph-examples/src/test/java/org/apache/giraph/TestCheckpointing.java PRE-CREATION
Diff: https://reviews.apache.org/r/23140/diff/
Testing
-------
I tested it by running multiple different jobs. I ran PageRank on 2*10^9 vertices with 200 workers, and it seems to work just fine. Saving a checkpoint takes only 2 minutes.
Thanks,
Sergey Edunov
Re: Review Request 23140: Fix checkpointing
Posted by Sergey Edunov <ed...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
-----------------------------------------------------------
(Updated July 15, 2014, 9:08 p.m.)
Review request for giraph.
Changes
-------
Fixed CR issues
Repository: giraph-git
Description
-------
This fix merely makes checkpointing work again.
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java 85bfe04
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java ab0570f
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java 0275395
giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 2c4606f
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java de7af28
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 29835c5
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
Diff: https://reviews.apache.org/r/23140/diff/
Testing
-------
I tested it by running multiple different jobs. I ran PageRank on 2*10^9 vertices with 200 workers, and it seems to work just fine. Saving a checkpoint takes only 2 minutes.
Thanks,
Sergey Edunov
Re: Review Request 23140: Fix checkpointing
Posted by Sergey Edunov <ed...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/
-----------------------------------------------------------
(Updated July 2, 2014, 12:57 a.m.)
Review request for giraph.
Changes
-------
I removed aggregator serialization from MasterCompute and the workers.
Repository: giraph-git
Description
-------
This fix merely makes checkpointing work again.
Diffs (updated)
-----
giraph-core/src/main/java/org/apache/giraph/aggregators/AggregatorWrapper.java 9613805
giraph-core/src/main/java/org/apache/giraph/bsp/BspService.java 2e35373
giraph-core/src/main/java/org/apache/giraph/comm/ServerData.java f0ecca2
giraph-core/src/main/java/org/apache/giraph/conf/GiraphConstants.java 7d7ceb2
giraph-core/src/main/java/org/apache/giraph/master/BspServiceMaster.java ad7e045
giraph-core/src/main/java/org/apache/giraph/master/MasterAggregatorHandler.java 325d91f
giraph-core/src/main/java/org/apache/giraph/partition/BasicPartitionOwner.java 545d1af
giraph-core/src/main/java/org/apache/giraph/partition/HashMasterPartitioner.java 240687e
giraph-core/src/main/java/org/apache/giraph/partition/HashWorkerPartitioner.java d833895
giraph-core/src/main/java/org/apache/giraph/partition/MasterGraphPartitioner.java 50c750a
giraph-core/src/main/java/org/apache/giraph/partition/PartitionBalancer.java 3454d62
giraph-core/src/main/java/org/apache/giraph/partition/PartitionOwner.java 0ac74da
giraph-core/src/main/java/org/apache/giraph/partition/SimpleMasterPartitioner.java f128f34
giraph-core/src/main/java/org/apache/giraph/partition/SimpleWorkerPartitioner.java 3c0de44
giraph-core/src/main/java/org/apache/giraph/partition/WorkerGraphPartitioner.java 004ea81
giraph-core/src/main/java/org/apache/giraph/utils/InternalVertexRunner.java 09dd46d
giraph-core/src/main/java/org/apache/giraph/utils/io/ExtendedDataInputOutput.java af45426
giraph-core/src/main/java/org/apache/giraph/worker/BspServiceWorker.java 8dcf19a
giraph-core/src/main/java/org/apache/giraph/worker/WorkerContext.java 17347db
giraph-core/src/test/java/org/apache/giraph/partition/SimpleRangePartitionFactoryTest.java 96bd5d7
giraph-examples/src/test/java/org/apache/giraph/aggregators/TestAggregatorsHandling.java e2b611b
giraph-examples/src/test/java/org/apache/giraph/master/TestAggregatorsHandling.java PRE-CREATION
Diff: https://reviews.apache.org/r/23140/diff/
Testing
-------
I tested it by running multiple different jobs. I ran PageRank on 2*10^9 vertices with 200 workers, and it seems to work just fine. Saving a checkpoint takes only 2 minutes.
Thanks,
Sergey Edunov
Re: Review Request 23140: Fix checkpointing
Posted by Maja Kabiljo <ma...@fb.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23140/#review47023
-----------------------------------------------------------
I see that a lot of the changes are related to aggregators, and that you now write them from the master, the workers, and MasterCompute. Can't we write them just once and go through the normal path of distributing them at the beginning of the superstep?
- Maja Kabiljo