You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/11/18 04:57:50 UTC

[GitHub] [kafka] ableegoldman opened a new pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

ableegoldman opened a new pull request #9609:
URL: https://github.com/apache/kafka/pull/9609


   Followup to https://github.com/apache/kafka/pull/9582
   
   Will leave the ability to create multiple KTables from the same source topic as followup work. Similarly, creating a KStream and a KTable from the same topic can be tackled later if need be
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#issuecomment-736766888


   Merged to trunk


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cadonna commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
cadonna commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r530196629



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       I like your changes. What I meant is that we could change the constructor of `SourceGraphNode` to:
   
   ```
   public SourceGraphNode(final String nodeName,
                          final Set<String> topicNames,
                          final ConsumedInternal<K, V> consumedInternal)
   ``` 
   
   and the one of `StreamSourceNode` to:
   
   ```
   public StreamSourceNode(final String nodeName,
                           final Set<String> topicNames,
                           final ConsumedInternal<K, V> consumedInternal)
   ```
   
   In this way, we have set of topics as soon as possible in the code path from the public API. I think this makes it clearer that it is not possible to have duplicates of topics internally.
   
   To keep this PR small, I would propose to just do the changes for `SourceGraphNode`, and do the other changes in a separate PR. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cadonna commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
cadonna commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r530187379



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {

Review comment:
       I have never said you need to do it in this PR 😉 . Jokes apart, I think in general it would be better to do such things in a separate PR, but when I wrote my comment, I completely forgot about it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r533876280



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       Ah ok you meant making it a Set vs a Collection -- I do agree with the principle, and I did push the Set-ification of the topics up one level so that the actual class field is a Set. But I don't think it's really worth it to push it up another layer and Set-ify the constructor argument. For one thing we would just have to do the same conversion to a Set but in more places, and more importantly, the actual callers of the constructor don't care at all whether it's a Set or any other Collection. So I think it actually does make sense to convert to a Set inside the constructor body




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#issuecomment-733408778


   @mjsax it's not a bugfix PR, this is only going to trunk. Well technically it is fixing a bug, but that bug was only merged to trunk a few weeks ago. Otherwise I would agree, that would be merge hell 🔥 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cadonna commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
cadonna commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r530196629



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       I like your changes. What I meant is that we could change the constructor of `SourceGraphNode` to:
   
   ```
   public SourceGraphNode(final String nodeName,
                          final Set<String> topicNames,
                          final ConsumedInternal<K, V> consumedInternal)
   ``` 
   
   and the one of `StreamSourceNode` to:
   
   ```
   public StreamSourceNode(final String nodeName,
                           final Set<String> topicNames,
                           final ConsumedInternal<K, V> consumedInternal)
   ```
   
   In this way, we have a set of topics as soon as possible in the code path from the public API. I think this makes it clearer that it is not possible to have duplicates of topics internally.
   
   To keep this PR small, I would propose to just do the changes for `SourceGraphNode`, and do the other changes in a separate PR. 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mjsax edited a comment on pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
mjsax edited a comment on pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#issuecomment-733353791


   I don't want to spoil the party, but as this is a bug-fix PR, that we want to back-port, we should not to massive renaming... Happy, do rename in follow up for `trunk` (and 2.7). -- Or do a new PR without the renaming for 2.6.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mjsax commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
mjsax commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r530028617



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/CogroupedKStreamImpl.java
##########
@@ -129,7 +129,7 @@
             subTopologySourceNodes,
             name,
             aggregateBuilder,
-            streamsGraphNode,
+                                                         graphNode,

Review comment:
       nit: indention




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#issuecomment-733283434


   Also just FYI since the suggested renaming touched on a lot of files, the only logical changes are to the SourceGraphNode,  StreamSourceNode, and StreamsBuilderTest classes


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cadonna commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
cadonna commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r533990330



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       Fair enough and I think that this is nothing urgent or absolute necessary. However, I would like to explain my line of thoughts. I think an interface of a class should also describe the constraints on the the object and as far as I see it does not make any sense to pass the same topic name multiple times to a source node. 
   I do not see an issue with making the same conversion in more places and actually this is even not true because the only place we would do a conversion is in `StreamsBuilder#stream()`. All other dependent calls create a singleton collection which can be easily replaced with a singleton set. Actually, I do not understand why `StreamsBuilder#stream()` takes a collection instead of a set.
   
   I am not sure I can follow your other argument
   
   >  the actual callers of the constructor don't care at all whether it's a Set or any other Collection
    
   Do you refer to the creation of the singleton collection in the callers?
   
   As I said, I do not say we need to follow my proposal. I just wanted to argue in favor of a cleaner and more descriptive interface.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r530053365



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/CogroupedKStreamImpl.java
##########
@@ -129,7 +129,7 @@
             subTopologySourceNodes,
             name,
             aggregateBuilder,
-            streamsGraphNode,
+                                                         graphNode,

Review comment:
       wth happened here 🤔 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman merged pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman merged pull request #9609:
URL: https://github.com/apache/kafka/pull/9609


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mjsax commented on pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
mjsax commented on pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#issuecomment-733353791


   I don't want to spoil the party, but as this is a bug-fix PR, that we want to back-port, we should not to massive renaming... Happy, do rename in follow up for `trunk` only.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cadonna commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
cadonna commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r528847393



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {

Review comment:
       Can we also rename `StreamsGraphNode` to `GraphNode`? The `Streams` prefix is a bit confusing, IMO, because `StreamSourceNode` and `StreamsGraphNode` seem really similar although they are quite different.

##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       Should we already demand a set of topics in the constructors of  `SourceGraphNode()` and its children?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r534424621



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       > All other dependent calls create a singleton collection which can be easily replaced with a singleton set
   
   Ah ok, I didn't notice that. I guess I only looked at `StreamsBuilder#stream`
   
   > Actually, I do not understand why StreamsBuilder#stream() takes a collection instead of a set.
   This I totally agree with. I suspect the intention was just for convenience, so users don't have to do a list->set conversion themselves, but I personally don't find that to be a very strong argument. It doesn't seem worth doing a KIP over, but maybe if we rewrite some large parts of the DSL in the future, we can fix this as well
   
   By "callers" I meant the method body of `StreamsBuilder#stream`, which doesn't really care whether there are duplicates in the collection because it's only job is to pass the topics straight from the user to this source node.
   
   But I see your point. If I touch on some related code in a future PR I can fix this on the side, or I'd be happy to review a PR if you want to submit one




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r534424621



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       > All other dependent calls create a singleton collection which can be easily replaced with a singleton set
   
   Ah ok, I didn't notice that. I guess I only looked at `StreamsBuilder#stream`
   
   > Actually, I do not understand why StreamsBuilder#stream() takes a collection instead of a set.
   
   This I totally agree with. I suspect the intention was just for convenience, so users don't have to do a list->set conversion themselves, but I personally don't find that to be a very strong argument. It doesn't seem worth doing a KIP over, but maybe if we rewrite some large parts of the DSL in the future, we can fix this as well
   
   By "callers" I meant the method body of `StreamsBuilder#stream`, which doesn't really care whether there are duplicates in the collection because its only job is to pass the topics straight from the user to this source node.
   
   But I see your point. If I touch on some related code in a future PR I can fix this on the side, or I'd be happy to review a PR if you want to submit one. Thanks for the discussion




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r529971103



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,
+                            final ConsumedInternal<K, V> consumedInternal) {
+        super(nodeName);
+
+        this.topicNames = topicNames;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public SourceGraphNode(final String nodeName,
+                            final Pattern topicPattern,
+                            final ConsumedInternal<K, V> consumedInternal) {
+
+        super(nodeName);
+
+        this.topicPattern = topicPattern;
+        this.consumedInternal = consumedInternal;
+    }
+
+    public Set<String> topicNames() {
+        return new HashSet<>(topicNames);

Review comment:
       I'm not sure I understand exactly what you're asking, but I made a few changes to this topic collection/method. Please lmk if it hasn't addressed your question




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r529974090



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {

Review comment:
       Ok, but don't come crying when the PR blows up in length 😉 (but yeah that makes sense to me)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] ableegoldman commented on pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
ableegoldman commented on pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#issuecomment-733282299


   @mjsax @cadonna I addressed your feedback, lmk if there's anything else (to answer your question, Matthias, this fixes the problem because we use `instanceof StreamsSourceNode` to determine which nodes to merge. Reclassifying things makes this pass over the table nodes, and imo makes the class hierarchy easier to understand anyway


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] cadonna commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
cadonna commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r530187379



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {

Review comment:
       I have never said you need to do it in this PR 😉 . Jokes apart, I think in general it would be better to do such things in a separate PR, but when I wrote my comment, I completely forgot about it. Sorry about that!




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] mjsax commented on a change in pull request #9609: KAFKA-6687: restrict DSL to allow only Streams from the same source topics

Posted by GitBox <gi...@apache.org>.
mjsax commented on a change in pull request #9609:
URL: https://github.com/apache/kafka/pull/9609#discussion_r528250579



##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;

Review comment:
       Can me make those field `final`?

##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;

Review comment:
       `final`?

##########
File path: streams/src/main/java/org/apache/kafka/streams/kstream/internals/graph/SourceGraphNode.java
##########
@@ -0,0 +1,71 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.kafka.streams.kstream.internals.graph;
+
+import java.util.Collection;
+import java.util.HashSet;
+import java.util.Set;
+import java.util.regex.Pattern;
+import org.apache.kafka.common.serialization.Serde;
+import org.apache.kafka.streams.kstream.internals.ConsumedInternal;
+
+abstract public class SourceGraphNode<K, V> extends StreamsGraphNode {
+
+    private Collection<String> topicNames;
+    private Pattern topicPattern;
+    private final ConsumedInternal<K, V> consumedInternal;
+
+    public SourceGraphNode(final String nodeName,
+                            final Collection<String> topicNames,

Review comment:
       nit: fix indention (same below in other constructor)




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org