You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@quickstep.apache.org by hbdeshmukh <gi...@git.apache.org> on 2017/04/18 20:36:59 UTC

[GitHub] incubator-quickstep pull request #233: Topological sort functionality in DAG

GitHub user hbdeshmukh opened a pull request:

    https://github.com/apache/incubator-quickstep/pull/233

    Topological sort functionality in DAG

    - Implemented a very simple Kahn's algorithm for topological sorting of nodes
      in the DAG class.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hbdeshmukh/incubator-quickstep topo-sort-dag

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/233.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #233
    
----
commit 4616b140b015b163f5c9d385bcd82d62b83ae900
Author: Harshad Deshmukh <hb...@apache.org>
Date:   2017-04-18T20:34:16Z

    Topological sort functionality in DAG
    
    - Implemented a very simple Kahn's algorithm for topological sorting of nodes
      in the DAG class.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #233: QUICKSTEP-88 Topological sort functionality ...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/233
  
    LGTM


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep issue #233: QUICKSTEP-88 Topological sort functionality ...

Posted by hbdeshmukh <gi...@git.apache.org>.
Github user hbdeshmukh commented on the issue:

    https://github.com/apache/incubator-quickstep/pull/233
  
    Hi @hakanmemisoglu 
    
    Can you please review this PR? Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #233: QUICKSTEP-88 Topological sort functio...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/233#discussion_r112067899
  
    --- Diff: utility/DAG.hpp ---
    @@ -489,6 +495,40 @@ bool DAG<T, LinkMetadataT>::hasCycleHelper(const typename DAG<T, LinkMetadataT>:
       return false;
     }
     
    +template <class T, class LinkMetadataT>
    +std::vector<typename DAG<T, LinkMetadataT>::size_type_nodes> DAG<T, LinkMetadataT>::getTopologicalSorting() const {
    +  // We implement "Kahn's algorithm" for the sorting.
    +  DCHECK(!hasCycle());
    +  std::unique_ptr<std::vector<typename DAG<T, LinkMetadataT>::size_type_nodes>>
    +      sorted_list(new std::vector<size_type_nodes>());
    +  sorted_list->reserve(this->size());
    +  // Key = node ID, value = # incoming edges for this node.
    +  // NOTE(harshad) - We modify the "values" in this map as we go along.
    +  std::unordered_map<typename DAG<T, LinkMetadataT>::size_type_nodes, std::size_t> num_incoming_edges;
    +  std::queue<typename DAG<T, LinkMetadataT>::size_type_nodes> nodes_with_no_children;
    +  for (auto node_id = 0u; node_id < this->size(); ++node_id) {
    +    if (nodes_[node_id].getDependencies().empty()) {
    +      nodes_with_no_children.emplace(node_id);
    +    }
    +    num_incoming_edges[node_id] = nodes_[node_id].getDependencies().size();
    +  }
    +  while (!nodes_with_no_children.empty()) {
    --- End diff --
    
    I think this is the place where Kahn's algorithm starts to work. Can you separate methods into logical pieces with comments or by refactoring?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #233: QUICKSTEP-88 Topological sort functio...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-quickstep/pull/233


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #233: QUICKSTEP-88 Topological sort functio...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/233#discussion_r112069450
  
    --- Diff: utility/tests/DAG_unittest.cpp ---
    @@ -490,6 +490,51 @@ TEST(DAGTest, LinkMetadataBoolTest) {
       EXPECT_FALSE(dag_.getLinkMetadata(1, 0));
     }
     
    +TEST(DAGTest, TopoSortTest) {
    --- End diff --
    
    One way to check the correctness is by first creating `reachable` mapping, and then checking for each node `a`, for each node `b` that come after the node `a` in `topological_sorted_list`,  there must be no `reachable` path from `b` to `a`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #233: QUICKSTEP-88 Topological sort functio...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/233#discussion_r112284625
  
    --- Diff: utility/tests/DAG_unittest.cpp ---
    @@ -490,6 +490,88 @@ TEST(DAGTest, LinkMetadataBoolTest) {
       EXPECT_FALSE(dag_.getLinkMetadata(1, 0));
     }
     
    +TEST(DAGTest, TopoSortTest) {
    +  const int kNodeSize = 5;
    +  DAG<DummyPayload, int> dag_;
    +
    +  for (int node_index = 0; node_index < kNodeSize; ++node_index) {
    +    ASSERT_EQ(static_cast<std::size_t>(node_index),
    +              dag_.createNode(new DummyPayload(node_index)));
    +  }
    +
    +  /*
    +   *    0
    +   *   / \
    +   *  v   v
    +   *  1   2
    +   *   \ /
    +   *    v
    +   *    3
    +   *    |
    +   *    v
    +   *    4
    +   *
    +   */
    +
    +  dag_.createLink(0, 1, 2);
    +  dag_.createLink(0, 2, 1);
    +  dag_.createLink(1, 3, 1);
    +  dag_.createLink(2, 3, 1);
    +  dag_.createLink(3, 4, 1);
    +
    +  vector<DAG<DummyPayload, int>::size_type_nodes> sorted_list =
    +      dag_.getTopologicalSorting();
    +  std::unordered_map<DAG<DummyPayload, int>::size_type_nodes, bool> node_exists;
    +  // First check if the ordering is a legal sequence of nodes, i.e. every node
    +  // appears exactly once.
    +  for (auto node_id = 0u; node_id < dag_.size(); ++node_id) {
    +    node_exists[node_id] = false;
    +  }
    +  for (auto i : sorted_list) {
    +    node_exists[i] = true;
    +  }
    +  for (auto node_id = 0u; node_id < dag_.size(); ++node_id) {
    +    ASSERT_TRUE(node_exists[node_id]);
    +  }
    +  // We apply the following condition for verifying if we have obtained a valid
    +  // topological sorting.
    +  // If there is an edge i->j between two nodes i and j, then i comes before j
    +  // in the topologically sorted order.
    +  // We use a map to store the position of a given node in the sorted order.
    +  // Key = node ID, value = position of the node in the sorted order.
    +  std::unordered_map<std::size_t, std::size_t> position_in_sorted_order;
    +  for (std::size_t i = 0; i < sorted_list.size(); ++i) {
    +    position_in_sorted_order[sorted_list[i]] = i;
    +  }
    +  std::vector<std::tuple<std::size_t, std::size_t>> edges;
    +  // Populate the list of edges.
    +  edges.emplace_back(0, 1);
    +  edges.emplace_back(0, 2);
    +  edges.emplace_back(1, 3);
    +  edges.emplace_back(2, 3);
    +  edges.emplace_back(3, 4);
    +  for (auto curr_edge : edges) {
    --- End diff --
    
    I suppose we need to check each reachable pair instead of checking each pair that has an edge between them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-quickstep pull request #233: QUICKSTEP-88 Topological sort functio...

Posted by hakanmemisoglu <gi...@git.apache.org>.
Github user hakanmemisoglu commented on a diff in the pull request:

    https://github.com/apache/incubator-quickstep/pull/233#discussion_r112285916
  
    --- Diff: utility/tests/DAG_unittest.cpp ---
    @@ -490,6 +490,88 @@ TEST(DAGTest, LinkMetadataBoolTest) {
       EXPECT_FALSE(dag_.getLinkMetadata(1, 0));
     }
     
    +TEST(DAGTest, TopoSortTest) {
    +  const int kNodeSize = 5;
    +  DAG<DummyPayload, int> dag_;
    +
    +  for (int node_index = 0; node_index < kNodeSize; ++node_index) {
    +    ASSERT_EQ(static_cast<std::size_t>(node_index),
    +              dag_.createNode(new DummyPayload(node_index)));
    +  }
    +
    +  /*
    +   *    0
    +   *   / \
    +   *  v   v
    +   *  1   2
    +   *   \ /
    +   *    v
    +   *    3
    +   *    |
    +   *    v
    +   *    4
    +   *
    +   */
    +
    +  dag_.createLink(0, 1, 2);
    +  dag_.createLink(0, 2, 1);
    +  dag_.createLink(1, 3, 1);
    +  dag_.createLink(2, 3, 1);
    +  dag_.createLink(3, 4, 1);
    +
    +  vector<DAG<DummyPayload, int>::size_type_nodes> sorted_list =
    +      dag_.getTopologicalSorting();
    +  std::unordered_map<DAG<DummyPayload, int>::size_type_nodes, bool> node_exists;
    +  // First check if the ordering is a legal sequence of nodes, i.e. every node
    +  // appears exactly once.
    +  for (auto node_id = 0u; node_id < dag_.size(); ++node_id) {
    +    node_exists[node_id] = false;
    +  }
    +  for (auto i : sorted_list) {
    +    node_exists[i] = true;
    +  }
    +  for (auto node_id = 0u; node_id < dag_.size(); ++node_id) {
    +    ASSERT_TRUE(node_exists[node_id]);
    +  }
    +  // We apply the following condition for verifying if we have obtained a valid
    +  // topological sorting.
    +  // If there is an edge i->j between two nodes i and j, then i comes before j
    +  // in the topologically sorted order.
    +  // We use a map to store the position of a given node in the sorted order.
    +  // Key = node ID, value = position of the node in the sorted order.
    +  std::unordered_map<std::size_t, std::size_t> position_in_sorted_order;
    +  for (std::size_t i = 0; i < sorted_list.size(); ++i) {
    +    position_in_sorted_order[sorted_list[i]] = i;
    +  }
    +  std::vector<std::tuple<std::size_t, std::size_t>> edges;
    +  // Populate the list of edges.
    +  edges.emplace_back(0, 1);
    +  edges.emplace_back(0, 2);
    +  edges.emplace_back(1, 3);
    +  edges.emplace_back(2, 3);
    +  edges.emplace_back(3, 4);
    +  for (auto curr_edge : edges) {
    --- End diff --
    
    My bad! There is a check already.
    I am merging the branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---