Posted to notifications@fluo.apache.org by keith-turner <gi...@git.apache.org> on 2017/04/21 18:26:46 UTC

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

GitHub user keith-turner opened a pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128

    Updated ExportQ and CFM to use new ObserverProvider API

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/keith-turner/fluo-recipes observer-factory

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-fluo-recipes/pull/128.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #128
    
----
commit e9daabcb5008618d100774a39f4a356c227c2e0f
Author: Keith Turner <kt...@apache.org>
Date:   2017-03-08T23:46:28Z

    Updated ExportQ and CFM to use new ObserverProvider API

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by mikewalch <gi...@git.apache.org>.
Github user mikewalch commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113274323
  
    --- Diff: docs/cfm.md ---
    @@ -52,85 +46,93 @@ value could also be placed on an export queue to update an external database.
     
     ### Buckets
     
    -A simple implementation of this recipe would be to have an update queue for
    -each key.  However the implementation does something slightly more complex.
    -Each update queue is in a bucket and transactions that process updates, process
    -all of the updates in a bucket.  This allows more efficient processing of
    -updates for the following reasons :
    +A simple implementation of this recipe would have an update queue for each key.  However the
    +implementation is slightly more complex.  Each update queue is in a bucket and transactions process
    +all of the updates in a bucket.  This allows more efficient processing of updates for the following
    +reasons :
     
      * When updates are queued, notifications are made per bucket(instead of per a key).
    - * The transaction doing the update can scan the entire bucket reading updates, this avoids a seek for each key being updated.  
    + * The transaction doing the update can scan the entire bucket reading updates, this avoids a seek for each key being updated.
      * Also the transaction can request a batch lookup to get the current value of all the keys being updated.
      * Any additional actions taken on update (like adding something to an export queue) can also be batched.
      * Data is organized to make reading exiting values for keys in a bucket more efficient.
     
    -Which bucket a key goes to is decided using hash and modulus so that multiple
    -updates for the same key always go to the same bucket.
    +Which bucket a key goes to is decided using hash and modulus so that multiple updates for a key go
    +to the same bucket.
     
    -The initial number of tablets to create when applying table optimizations can be
    -controlled by setting the buckets per tablet option when configuring a Collision
    -Free Map.  For example if you have 20 tablet servers and 1000 buckets and want
    -2 tablets per tserver initially then set buckets per tablet to 1000/(2*20)=25.
    +The initial number of tablets to create when applying table optimizations can be controlled by
    +setting the buckets per tablet option when configuring a Collision Free Map.  For example if you
    +have 20 tablet servers and 1000 buckets and want 2 tablets per tserver initially then set buckets
    +per tablet to 1000/(2*20)=25.
     
     ## Example Use
     
    -The following code snippets show how to setup and use this recipe for
    -wordcount.  The first step in using this recipe is to configure it before
    -initializing Fluo.  When initializing an ID will need to be provided.  This ID
    -is used in two ways.  First, the ID is used as a row prefix in the table.
    -Therefore nothing else should use that row range in the table.  Second, the ID
    -is used in generating configuration keys associated with the instance of the
    -Collision Free Map.
    +The following code snippets show how to use this recipe for wordcount.  The first step is to
    +configure it before initializing Fluo.  When initializing an ID is needed.  This ID is used in two
    +ways.  First, the ID is used as a row prefix in the table.  Therefore nothing else should use that
    +row range in the table.  Second, the ID is used in generating configuration keys.
     
    -The following snippet shows how to setup a collision free map.  
    +The following snippet shows how to configure a collision free map.
     
     ```java
       FluoConfiguration fluoConfig = ...;
     
       int numBuckets = 119;
    +  int numTablets = 20;
     
    -  WordCountMap.configure(fluoConfig, 119);
    +  String mapId = WcObserverProvider.ID;
     
    -  //initialize Fluo using fluoConfig
    +  // Create a Java Object that encapsulates the configuration
    +  CollisionFreeMap.Options cfmOpts =
    +      new CollisionFreeMap.Options(mapId, String.class, Long.class, numBuckets)
    +          .setBucketsPerTablet(numBuckets / numTablets);
    +
    +  // Set application properties for the collision free map.  These properties are read later by
    +  // observers.
    +  CollisionFreeMap.configure(fluoConfig, cfmOpts);
    --- End diff --
    
    `CollisionFreeMap` could be named `CombineQueue` or `Combiner`
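
The hash-and-modulus bucket assignment and the buckets-per-tablet arithmetic described in the `docs/cfm.md` diff above can be sketched as follows. This is a minimal illustration only: the class `BucketSketch`, its method names, and the choice of hash function are assumptions made for the example, not the recipe's actual implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; none of these names come from fluo-recipes.
final class BucketSketch {

  // Maps a key to a bucket with hash and modulus, so repeated updates
  // for the same key always land in the same bucket.
  static int bucketFor(String key, int numBuckets) {
    int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
    // floorMod keeps the result in [0, numBuckets) even for negative hashes.
    return Math.floorMod(hash, numBuckets);
  }

  // Buckets per tablet for a desired initial number of tablets per server.
  static int bucketsPerTablet(int numBuckets, int numTabletServers, int tabletsPerServer) {
    return numBuckets / (tabletsPerServer * numTabletServers);
  }

  public static void main(String[] args) {
    // The example from the docs: 1000 buckets, 20 tservers, 2 tablets each.
    System.out.println(bucketsPerTablet(1000, 20, 2));
    // The same key maps to the same bucket on every call.
    System.out.println(bucketFor("apple", 119) == bucketFor("apple", 119));
  }
}
```

With the numbers from the docs, `bucketsPerTablet(1000, 20, 2)` evaluates to 1000/(2*20)=25.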


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by keith-turner <gi...@git.apache.org>.
Github user keith-turner closed the pull request at:

    https://github.com/apache/incubator-fluo-recipes/pull/128


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by mikewalch <gi...@git.apache.org>.
Github user mikewalch commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113275022
  
    --- Diff: modules/core/src/main/java/org/apache/fluo/recipes/core/map/ICombiner.java ---
    @@ -0,0 +1,46 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
    + * agreements. See the NOTICE file distributed with this work for additional information regarding
    + * copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance with the License. You may obtain a
    + * copy of the License at
    + * 
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing, software distributed under the License
    + * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
    + * or implied. See the License for the specific language governing permissions and limitations under
    + * the License.
    + */
    +
    +package org.apache.fluo.recipes.core.map;
    --- End diff --
    
    This could be in `org.apache.fluo.recipes.core.combiner`


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by keith-turner <gi...@git.apache.org>.
Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113224356
  
    --- Diff: docs/accumulo-export-queue.md ---
    @@ -19,8 +19,8 @@ limitations under the License.
     ## Background
     
     The [Export Queue Recipe][1] provides a generic foundation for building export mechanism to any
    -external data store. The [AccumuloExporter] provides an implementation of this recipe for
    -Accumulo. The [AccumuloExporter] is located the `fluo-recipes-accumulo` module and provides the
    +external data store. The [AccumuloConsumer] provides an export consumer for writing to
    --- End diff --
    
    A bit of background.  The name I like most is AccumuloExporter. I like this name because it implies sending things out from Fluo.  However this name is currently taken by a class that's deprecated in this PR.
    
    The super type for AccumuloConsumer is ExportConsumer (which is a new type in this PR).  


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by ctubbsii <gi...@git.apache.org>.
Github user ctubbsii commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113216479
  
    --- Diff: docs/accumulo-export-queue.md ---
    @@ -19,8 +19,8 @@ limitations under the License.
     ## Background
     
     The [Export Queue Recipe][1] provides a generic foundation for building export mechanism to any
    -external data store. The [AccumuloExporter] provides an implementation of this recipe for
    -Accumulo. The [AccumuloExporter] is located the `fluo-recipes-accumulo` module and provides the
    +external data store. The [AccumuloConsumer] provides an export consumer for writing to
    --- End diff --
    
    Maybe `AccumuloReceiver` or `AccumuloIngester`?


---

[GitHub] incubator-fluo-recipes issue #128: Updated ExportQ and CFM to use new Observ...

Posted by keith-turner <gi...@git.apache.org>.
Github user keith-turner commented on the issue:

    https://github.com/apache/incubator-fluo-recipes/pull/128
  
    I am going to resubmit this PR with the CollisionFreeMap deprecated and renamed to CombineQueue.  The CombineQueue will only support the new Observer API.


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by mikewalch <gi...@git.apache.org>.
Github user mikewalch commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113275421
  
    --- Diff: modules/core/src/main/java/org/apache/fluo/recipes/core/map/ValueObserver.java ---
    @@ -0,0 +1,37 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
    + * agreements. See the NOTICE file distributed with this work for additional information regarding
    + * copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance with the License. You may obtain a
    + * copy of the License at
    + * 
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing, software distributed under the License
    + * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
    + * or implied. See the License for the specific language governing permissions and limitations under
    + * the License.
    + */
    +
    +package org.apache.fluo.recipes.core.map;
    --- End diff --
    
    This could be in a `combiner` or `combine` package


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by mikewalch <gi...@git.apache.org>.
Github user mikewalch commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113265960
  
    --- Diff: docs/accumulo-export-queue.md ---
    @@ -19,8 +19,8 @@ limitations under the License.
     ## Background
     
     The [Export Queue Recipe][1] provides a generic foundation for building export mechanism to any
    -external data store. The [AccumuloExporter] provides an implementation of this recipe for
    -Accumulo. The [AccumuloExporter] is located the `fluo-recipes-accumulo` module and provides the
    +external data store. The [AccumuloConsumer] provides an export consumer for writing to
    --- End diff --
    
    You could keep the AccumuloExporter name but move it to a new package


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by mikewalch <gi...@git.apache.org>.
Github user mikewalch commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113274668
  
    --- Diff: docs/cfm.md ---
    @@ -156,93 +158,65 @@ public class DocumentObserver extends TypedObserver {
     
         return changes;
       }
    -
     }
     ```
     
    -Each collision free map has two extension points, a combiner and an update
    -observer.  These two extension points are defined below as `WordCountCombiner`
    -and  `WordCountObserver`.  The collision free map configures a Fluo observer that
    -will process queued updates.  When processing these queued updates the two
    -extension points are called.  In this example `WordCountCombiner` is called to
    -combine updates that were queued by `DocumentObserver`. The collision free map
    -will process a batch of keys, calling the combiner for each key.  When finished
    -processing a batch, it will call the update observer `WordCountObserver`.
    +Each collision free map has two extension points, a [combiner][ICombiner] and a [value
    --- End diff --
    
    Could be `ValueCombiner` and `ValueObserver`


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by mikewalch <gi...@git.apache.org>.
Github user mikewalch commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113275106
  
    --- Diff: modules/core/src/main/java/org/apache/fluo/recipes/core/map/ICombiner.java ---
    @@ -0,0 +1,46 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more contributor license
    + * agreements. See the NOTICE file distributed with this work for additional information regarding
    + * copyright ownership. The ASF licenses this file to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance with the License. You may obtain a
    + * copy of the License at
    + * 
    + * http://www.apache.org/licenses/LICENSE-2.0
    + * 
    + * Unless required by applicable law or agreed to in writing, software distributed under the License
    + * is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express
    + * or implied. See the License for the specific language governing permissions and limitations under
    + * the License.
    + */
    +
    +package org.apache.fluo.recipes.core.map;
    +
    +import java.util.Iterator;
    +import java.util.Optional;
    +import java.util.stream.Stream;
    +
    +/**
    + * This class was created as an alternative to {@link Combiner}. It supports easy and efficient use
    + * of java streams when implementing combiners using lambdas.
    + * 
    + * @since 1.1.0
    + */
    +public interface ICombiner<K, V> {
    --- End diff --
    
    Could be called `ValueCombiner`


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by keith-turner <gi...@git.apache.org>.
Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113245202
  
    --- Diff: docs/accumulo-export-queue.md ---
    @@ -19,8 +19,8 @@ limitations under the License.
     ## Background
     
     The [Export Queue Recipe][1] provides a generic foundation for building export mechanism to any
    -external data store. The [AccumuloExporter] provides an implementation of this recipe for
    -Accumulo. The [AccumuloExporter] is located the `fluo-recipes-accumulo` module and provides the
    +external data store. The [AccumuloConsumer] provides an export consumer for writing to
    --- End diff --
    
    A bit more background.  Many of these changes were motivated by the desire to use lambdas, which recent changes in Fluo made possible.  To support lambdas, the following happened.
    
     * Exporter was an abstract class and could not support lambdas, so it was deprecated and replaced with ExportConsumer.  ExportConsumer is a functional interface, so lambdas can be assigned to it.
     * AccumuloExporter extended Exporter and was deprecated.  It was replaced with AccumuloConsumer (which implements ExportConsumer) and AccumuloTranslator.  AccumuloTranslator is a functional interface, so lambdas can be assigned to it.
    
    I really like the names Exporter and AccumuloExporter, but they can't support lambdas.
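
The distinction drawn above, that a lambda can be assigned to a functional interface but not to an abstract class, can be illustrated with a small sketch. The types `AbstractExporterSketch` and `ExportConsumerSketch` are hypothetical stand-ins invented for this example; they are not the real `Exporter` or `ExportConsumer` types, and their method signatures are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; these types only mimic the shape of the discussion.
final class LambdaSketch {

  // Abstract-class style: Java does not allow a lambda to target this type,
  // even though it has a single abstract method.
  abstract static class AbstractExporterSketch<K, V> {
    abstract void export(K key, V value);
  }

  // Functional-interface style: exactly one abstract method on an interface,
  // so a lambda can be assigned to it.
  @FunctionalInterface
  interface ExportConsumerSketch<K, V> {
    void accept(K key, V value);
  }

  static List<String> run() {
    List<String> sink = new ArrayList<>();

    // The abstract class requires an anonymous subclass.
    AbstractExporterSketch<String, Long> oldStyle =
        new AbstractExporterSketch<String, Long>() {
          @Override
          void export(String key, Long value) {
            sink.add("old:" + key + "=" + value);
          }
        };

    // The functional interface accepts a lambda directly.
    ExportConsumerSketch<String, Long> newStyle =
        (key, value) -> sink.add("new:" + key + "=" + value);

    oldStyle.export("a", 1L);
    newStyle.accept("b", 2L);
    return sink;
  }

  public static void main(String[] args) {
    System.out.println(run());
  }
}
```

Writing `AbstractExporterSketch<String, Long> e = (k, v) -> {};` would be rejected by the compiler, which is the limitation that motivated the new types.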


---

[GitHub] incubator-fluo-recipes pull request #128: Updated ExportQ and CFM to use new...

Posted by keith-turner <gi...@git.apache.org>.
Github user keith-turner commented on a diff in the pull request:

    https://github.com/apache/incubator-fluo-recipes/pull/128#discussion_r113202528
  
    --- Diff: docs/accumulo-export-queue.md ---
    @@ -19,8 +19,8 @@ limitations under the License.
     ## Background
     
     The [Export Queue Recipe][1] provides a generic foundation for building export mechanism to any
    -external data store. The [AccumuloExporter] provides an implementation of this recipe for
    -Accumulo. The [AccumuloExporter] is located the `fluo-recipes-accumulo` module and provides the
    +external data store. The [AccumuloConsumer] provides an export consumer for writing to
    --- End diff --
    
    I do not like the name AccumuloConsumer.  Any suggestions?


---

[GitHub] incubator-fluo-recipes issue #128: Updated ExportQ and CFM to use new Observ...

Posted by keith-turner <gi...@git.apache.org>.
Github user keith-turner commented on the issue:

    https://github.com/apache/incubator-fluo-recipes/pull/128
  
    This depends on apache/incubator-fluo-recipes#128


---