Posted to commits@beam.apache.org by dkulp <gi...@git.apache.org> on 2016/09/29 16:36:40 UTC
[GitHub] incubator-beam pull request #1025: [BEAM-674] Gridfs Source refactoring
GitHub user dkulp opened a pull request:
https://github.com/apache/incubator-beam/pull/1025
[BEAM-674] Gridfs Source refactoring
Refactor of the GridFS based Source based on feedback from @jkff
BoundedSource is now a source of ObjectIds, and a separate DoFn is used to convert/parse each GridFSDBFile into usable chunks.
Testcase for splitting added.
Variables not needed by the Source are pulled out and stuck on the transform instead.
Optimized the non-split case a bit by not querying all the ObjectIds up front.
Optimized unit tests by setting up test data per class instead of per test.
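For readers who do not have the diff at hand, the shape of the refactoring described above can be sketched roughly as follows. This is an illustration only, not the code in the pull request: it uses plain Java with no Beam or MongoDB dependencies, and the names GridFsSplitSketch, IdSource, split, read and parseAll are hypothetical stand-ins. Ids are plain strings here rather than real ObjectIds, and the parser is an ordinary function rather than a DoFn.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class GridFsSplitSketch {

    // Hypothetical stand-in for the source: it only knows about file ids
    // and has no knowledge of how file contents are parsed.
    static final class IdSource {
        private final List<String> ids;

        IdSource(List<String> ids) {
            this.ids = ids;
        }

        // Partition the ids into smaller sources of at most bundleSize ids each.
        List<IdSource> split(int bundleSize) {
            List<IdSource> bundles = new ArrayList<>();
            for (int i = 0; i < ids.size(); i += bundleSize) {
                int end = Math.min(i + bundleSize, ids.size());
                bundles.add(new IdSource(new ArrayList<>(ids.subList(i, end))));
            }
            return bundles;
        }

        // Hand out the ids of this source one at a time.
        Iterator<String> read() {
            return ids.iterator();
        }
    }

    // Hypothetical stand-in for the separate parsing step: each id is turned
    // into zero or more records by a caller-supplied parser function.
    static <T> List<T> parseAll(Iterator<String> ids, Function<String, List<T>> parser) {
        List<T> out = new ArrayList<>();
        while (ids.hasNext()) {
            out.addAll(parser.apply(ids.next()));
        }
        return out;
    }

    public static void main(String[] args) {
        IdSource source = new IdSource(Arrays.asList("a", "b", "c"));
        // Toy parser: pretend every file contains two records.
        Function<String, List<String>> parser = id -> Arrays.asList(id + "-1", id + "-2");

        System.out.println(source.split(2).size());
        System.out.println(parseAll(source.read(), parser));
    }
}
```

The point of the separation is that the source only has to enumerate and partition ids, while everything specific to the file format lives in the parser, which can be swapped without touching the source.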
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dkulp/incubator-beam gridfs-t2
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-beam/pull/1025.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1025
----
commit 5aad971bcd1d32ba06cec9d4870e7aa9e9dc17f5
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T02:44:37Z
Split BoundedSource into a BoundedSource<ObjectID> and a DoFn<...>
commit 2fc219cdd33e89d65d457dd3767bd378ff1111c0
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T13:03:31Z
Optimize reading for non-split case
commit e58fc61868988cc40c325d913fca37b26e3db99c
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T13:18:17Z
Use objectId timestamp
commit ed73d77b21651d6ef1d8cf2892dc267794d52d10
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T13:57:44Z
Pull parser out of BoundedSource, add maxSkew
commit 277667527cf0a23704b3ae3d05b2c8e2c2bcea3c
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T14:48:42Z
Add test case for the split
commit db30aabac4629ae167e4ede73de79257b4a93336
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T15:00:44Z
Don't need the generic on the Source and Reader
commit 1cdb2ce716b7e020c5306494b414b5bb136abb24
Author: Daniel Kulp <dk...@apache.org>
Date: 2016-09-29T16:29:51Z
Rename maxSkew to allowedTimestampSkew to match other DoFn's
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] incubator-beam pull request #1025: [BEAM-674] Gridfs Source refactoring
Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:
https://github.com/apache/incubator-beam/pull/1025