You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@apex.apache.org by ilooner <gi...@git.apache.org> on 2016/04/06 08:32:11 UTC

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

GitHub user ilooner opened a pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232

    APEXMALHAR-2046 #resolve #comment adding spillable data structures fa…

    …ctory interface.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilooner/incubator-apex-malhar APEXMALHAR-2046

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-apex-malhar/pull/232.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #232
    
----
commit 3ebee28bf7f06b02c302045565a29427d6844593
Author: Timothy Farkas <ti...@datatorrent.com>
Date:   2016-04-06T06:30:38Z

    APEXMALHAR-2046 #resolve #comment adding spillable data structures factory interface.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58657541
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/SpillableFactory.java ---
    @@ -0,0 +1,171 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Queue;
    +
    +import com.google.common.collect.ListMultimap;
    +
    +import com.datatorrent.api.Component;
    +import com.datatorrent.api.Context.OperatorContext;
    +
    +/**
    + * This is a factory for spillable data structures. This should be used as
    + * a component inside {@link com.datatorrent.api.Operator}s.
    + */
    +public interface SpillableFactory extends Component<OperatorContext>
    +{
    +  /**
    +   * This signals that the parent {@link com.datatorrent.api.Operator}'s
    +   * {@link com.datatorrent.api.Operator#beginWindow(long)} method has been called.
    +   * @param windowId The next windowId of the parent operator.
    +   */
    +  public void beginWindow(long windowId);
    +
    +  /**
    +   * This signals that the parent {@link com.datatorrent.api.Operator}'s
    +   * {@link com.datatorrent.api.Operator#endWindow()} method has been called.
    +   */
    +  public void endWindow();
    +
    +  /**
    +   * This is a factory method for creating a {@link SpillableArrayList}.
    +   * @param <T> The type of data stored in the {@link SpillableArrayList}.
    +   * @param bucket The bucket that this {@link SpillableArrayList} will be spilled too.
    +   * @param serde The Serializer/Deserializer to use for data stored in the {@link SpillableArrayList}.
    +   * @return A {@link SpillableArrayList}.
    +   */
    +  public <T> SpillableArrayList<T> newIndexedList(long bucket, Serde<T, byte[]> serde);
    --- End diff --
    
    If this doesn't need a prefix, will it create a default identifier?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58775045
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/SpillableComponentManager.java ---
    @@ -0,0 +1,122 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import org.apache.apex.malhar.lib.spillable.Spillable.SpillableComponent;
    +
    +import com.datatorrent.api.Component;
    +import com.datatorrent.api.Context.OperatorContext;
    +
    +/**
    + * This is a factory for spillable data structures. This should be used as
    + * a component inside {@link com.datatorrent.api.Operator}s.
    + */
    +public interface SpillableComponentManager extends Component<OperatorContext>, SpillableComponent
    --- End diff --
    
    Where is SplillableComponent?
    
    Documentation is outdated. It is not the factory. Is this the Spillable Composite Component ?  If yes, then can we call it a spillable composite component (or something to that effect) and not a spillable component manger


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58657819
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/SpillableFactory.java ---
    @@ -0,0 +1,171 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Queue;
    +
    +import com.google.common.collect.ListMultimap;
    +
    +import com.datatorrent.api.Component;
    +import com.datatorrent.api.Context.OperatorContext;
    +
    +/**
    + * This is a factory for spillable data structures. This should be used as
    + * a component inside {@link com.datatorrent.api.Operator}s.
    + */
    +public interface SpillableFactory extends Component<OperatorContext>
    +{
    +  /**
    +   * This signals that the parent {@link com.datatorrent.api.Operator}'s
    +   * {@link com.datatorrent.api.Operator#beginWindow(long)} method has been called.
    +   * @param windowId The next windowId of the parent operator.
    +   */
    +  public void beginWindow(long windowId);
    +
    +  /**
    +   * This signals that the parent {@link com.datatorrent.api.Operator}'s
    +   * {@link com.datatorrent.api.Operator#endWindow()} method has been called.
    +   */
    +  public void endWindow();
    +
    +  /**
    +   * This is a factory method for creating a {@link SpillableArrayList}.
    +   * @param <T> The type of data stored in the {@link SpillableArrayList}.
    +   * @param bucket The bucket that this {@link SpillableArrayList} will be spilled too.
    +   * @param serde The Serializer/Deserializer to use for data stored in the {@link SpillableArrayList}.
    +   * @return A {@link SpillableArrayList}.
    +   */
    +  public <T> SpillableArrayList<T> newIndexedList(long bucket, Serde<T, byte[]> serde);
    +
    +  /**
    +   * This is a factory method for creating a {@link SpillableArrayList}.
    +   * @param <T> The type of data stored in the {@link SpillableArrayList}.
    +   * @param prefix The identifier for this {@link SpillableArrayList}.
    +   * @param bucket The bucket that this {@link SpillableArrayList} will be spilled too.
    +   * @param serde The Serializer/Deserializer to use for data stored in the {@link SpillableArrayList}.
    +   * @return A {@link SpillableArrayList}.
    +   */
    +  public <T> SpillableArrayList<T> newIndexedList(byte[] prefix, long bucket, Serde<T, byte[]> serde);
    +
    +  /**
    +   * This is a factory method for creating a {@link SpillableByteIndexedMap}.
    +   * @param <K> The type of the keys.
    +   * @param <V> The type of the values.
    +   * @param bucket The bucket that this {@link SpillableByteIndexedMap} will be spilled too.
    +   * @param serdeKey The Serializer/Deserializer to use for the map's keys.
    +   * @param serdeValue The Serializer/Deserializer to use for the map's values.
    +   * @return A {@link SpillableByteIndexedMap}.
    +   */
    +  public <K, V> SpillableByteIndexedMap<K, V> newMap(long bucket, Serde<K, byte[]> serdeKey,
    --- End diff --
    
    what is a ByteIndexedMap?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58658309
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/Serde.java ---
    @@ -0,0 +1,42 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import org.apache.commons.lang3.mutable.MutableInt;
    +
    +/**
    + * This is an interface for a Serializer/Deserializer class.
    + * @param <OBJ> The type of the object to Serialize and Deserialize.
    + * @param <SER> The type to Serialize an Object to.
    + */
    +public interface Serde<OBJ, SER>
    +{
    +  /**
    +   * Serialized the given object.
    +   * @param object The object to serialize.
    +   * @return The serialized representation of the object.
    +   */
    +  public SER serializeObject(OBJ object);
    +
    +  /**
    +   * Deserializes the given serialized representation of an object.
    +   * @param object The serialized representation of an object.
    +   * @param offset An offset in the serialized reprsentation of the object.
    +   * @return The deserialized object.
    +   */
    +  public OBJ deserializeObject(SER object, MutableInt offset);
    --- End diff --
    
    why MutableInt?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58657579
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/SpillableFactory.java ---
    @@ -0,0 +1,171 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Queue;
    +
    +import com.google.common.collect.ListMultimap;
    +
    +import com.datatorrent.api.Component;
    +import com.datatorrent.api.Context.OperatorContext;
    +
    +/**
    + * This is a factory for spillable data structures. This should be used as
    + * a component inside {@link com.datatorrent.api.Operator}s.
    + */
    +public interface SpillableFactory extends Component<OperatorContext>
    +{
    +  /**
    +   * This signals that the parent {@link com.datatorrent.api.Operator}'s
    +   * {@link com.datatorrent.api.Operator#beginWindow(long)} method has been called.
    +   * @param windowId The next windowId of the parent operator.
    +   */
    +  public void beginWindow(long windowId);
    +
    +  /**
    +   * This signals that the parent {@link com.datatorrent.api.Operator}'s
    +   * {@link com.datatorrent.api.Operator#endWindow()} method has been called.
    +   */
    +  public void endWindow();
    +
    +  /**
    +   * This is a factory method for creating a {@link SpillableArrayList}.
    +   * @param <T> The type of data stored in the {@link SpillableArrayList}.
    +   * @param bucket The bucket that this {@link SpillableArrayList} will be spilled too.
    +   * @param serde The Serializer/Deserializer to use for data stored in the {@link SpillableArrayList}.
    +   * @return A {@link SpillableArrayList}.
    +   */
    +  public <T> SpillableArrayList<T> newIndexedList(long bucket, Serde<T, byte[]> serde);
    +
    +  /**
    +   * This is a factory method for creating a {@link SpillableArrayList}.
    +   * @param <T> The type of data stored in the {@link SpillableArrayList}.
    +   * @param prefix The identifier for this {@link SpillableArrayList}.
    +   * @param bucket The bucket that this {@link SpillableArrayList} will be spilled too.
    +   * @param serde The Serializer/Deserializer to use for data stored in the {@link SpillableArrayList}.
    +   * @return A {@link SpillableArrayList}.
    +   */
    +  public <T> SpillableArrayList<T> newIndexedList(byte[] prefix, long bucket, Serde<T, byte[]> serde);
    --- End diff --
    
    why ```prefix```? why not ```identifier```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58658808
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/Serde.java ---
    @@ -0,0 +1,42 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import org.apache.commons.lang3.mutable.MutableInt;
    +
    +/**
    + * This is an interface for a Serializer/Deserializer class.
    + * @param <OBJ> The type of the object to Serialize and Deserialize.
    + * @param <SER> The type to Serialize an Object to.
    + */
    +public interface Serde<OBJ, SER>
    +{
    +  /**
    +   * Serialized the given object.
    +   * @param object The object to serialize.
    +   * @return The serialized representation of the object.
    +   */
    +  public SER serializeObject(OBJ object);
    +
    +  /**
    +   * Deserializes the given serialized representation of an object.
    +   * @param object The serialized representation of an object.
    +   * @param offset An offset in the serialized reprsentation of the object.
    +   * @return The deserialized object.
    +   */
    +  public OBJ deserializeObject(SER object, MutableInt offset);
    --- End diff --
    
    if this eases the following as @ilooner  mentioned 
    ```
    Object obj;
    MutableInt mi;
     someObj1 = deserialize(obj, mi);
     someObj2 = deserialize(obj, mi);
    So the implementation is expected to modify the offset, then this needs to be documented.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58658226
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/Serde.java ---
    @@ -0,0 +1,42 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import org.apache.commons.lang3.mutable.MutableInt;
    +
    +/**
    + * This is an interface for a Serializer/Deserializer class.
    + * @param <OBJ> The type of the object to Serialize and Deserialize.
    + * @param <SER> The type to Serialize an Object to.
    + */
    +public interface Serde<OBJ, SER>
    +{
    +  /**
    +   * Serialized the given object.
    +   * @param object The object to serialize.
    +   * @return The serialized representation of the object.
    +   */
    +  public SER serializeObject(OBJ object);
    --- End diff --
    
    Isn't ```serialize``` sufficient than ```serializeObject```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-apex-malhar/pull/232


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-apex-malhar pull request: APEXMALHAR-2046 #resolve #comm...

Posted by chandnisingh <gi...@git.apache.org>.
Github user chandnisingh commented on a diff in the pull request:

    https://github.com/apache/incubator-apex-malhar/pull/232#discussion_r58659020
  
    --- Diff: library/src/main/java/org/apache/apex/malhar/lib/spillable/SpillableFactory.java ---
    @@ -0,0 +1,171 @@
    +/*
    + * Copyright 2016 Apache Software Foundation.
    + *
    + * Licensed under the Apache License, Version 2.0 (the "License");
    + * you may not use this file except in compliance with the License.
    + * You may obtain a copy of the License at
    + *
    + *      http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.apex.malhar.lib.spillable;
    +
    +import java.util.List;
    +import java.util.Map;
    +import java.util.Queue;
    +
    +import com.google.common.collect.ListMultimap;
    +
    +import com.datatorrent.api.Component;
    +import com.datatorrent.api.Context.OperatorContext;
    +
    +/**
    + * This is a factory for spillable data structures. This should be used as
    + * a component inside {@link com.datatorrent.api.Operator}s.
    + */
    +public interface SpillableFactory extends Component<OperatorContext>
    --- End diff --
    
    IMO the spillable data-structures should be a component not the factory itself.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---