Posted to issues@paimon.apache.org by "zhourui999 (via GitHub)" <gi...@apache.org> on 2024/04/05 07:38:36 UTC

[PR] fix: query values with the same primary key [paimon]

zhourui999 opened a new pull request, #3158:
URL: https://github.com/apache/paimon/pull/3158

   ### Purpose
   
   Linked issue: #3157
   
   ### Tests
   
   ### API and Format
   
   ### Documentation
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [core] Fix duplicate primary keys in query results [paimon]

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi merged PR #3158:
URL: https://github.com/apache/paimon/pull/3158




Re: [PR] [core] Fix duplicate primary keys in query results [paimon]

Posted by "zhourui999 (via GitHub)" <gi...@apache.org>.
zhourui999 commented on code in PR #3158:
URL: https://github.com/apache/paimon/pull/3158#discussion_r1554797974


##########
paimon-core/src/main/java/org/apache/paimon/table/source/snapshot/SnapshotReaderImpl.java:
##########
@@ -289,6 +283,12 @@ private List<DataSplit> generateSplits(
                                         .orElse(null)
                                 : null;
                 for (SplitGenerator.SplitGroup splitGroup : splitGroups) {
+                    DataSplit.Builder builder =
+                            DataSplit.builder()
+                                    .withSnapshot(snapshotId)
+                                    .withPartition(partition)
+                                    .withBucket(bucket)
+                                    .isStreaming(isStreaming);
                     List<DataFileMeta> dataFiles = splitGroup.files;
                     builder.withDataFiles(dataFiles);
                     if (splitGroup.rawConvertible) {

Review Comment:
   Okay, I think your suggestion is better. I'll make some changes.





Re: [PR] [core] Fix duplicate primary keys in query results [paimon]

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on code in PR #3158:
URL: https://github.com/apache/paimon/pull/3158#discussion_r1554796101


##########
paimon-core/src/main/java/org/apache/paimon/table/source/snapshot/SnapshotReaderImpl.java:
##########
@@ -289,6 +283,12 @@ private List<DataSplit> generateSplits(
                                         .orElse(null)
                                 : null;
                 for (SplitGenerator.SplitGroup splitGroup : splitGroups) {
+                    DataSplit.Builder builder =
+                            DataSplit.builder()
+                                    .withSnapshot(snapshotId)
+                                    .withPartition(partition)
+                                    .withBucket(bucket)
+                                    .isStreaming(isStreaming);
                     List<DataFileMeta> dataFiles = splitGroup.files;
                     builder.withDataFiles(dataFiles);
                     if (splitGroup.rawConvertible) {

Review Comment:
   It can also be modified this way.
   I think the current implementation is okay too. What do you think, @JingsongLi?
   
   ```
   builder.rawFiles(
           splitGroup.rawConvertible
                   ? convertToRawFiles(partition, bucket, dataFiles)
                   : Collections.emptyList());
   ```
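
The hunk above moves builder creation inside the loop, and the snippet is an alternative that sets the raw files on every iteration. The thread does not state the root cause, so the following is only a minimal sketch of one plausible reading: a mutable builder shared across iterations can carry a value set for one split group into the next. `BuilderReuseSketch`, `Group`, `Split`, and `SplitBuilder` are hypothetical stand-ins, not Paimon's `DataSplit` API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class BuilderReuseSketch {

    /** Hypothetical file group; rawConvertible mirrors the flag seen in the diff. */
    static final class Group {
        final List<String> files;
        final boolean rawConvertible;

        Group(List<String> files, boolean rawConvertible) {
            this.files = files;
            this.rawConvertible = rawConvertible;
        }
    }

    /** Hypothetical immutable result of a build. */
    static final class Split {
        final List<String> dataFiles;
        final List<String> rawFiles;

        Split(List<String> dataFiles, List<String> rawFiles) {
            this.dataFiles = dataFiles;
            this.rawFiles = rawFiles;
        }
    }

    /** Hypothetical mutable builder: build() snapshots whatever was set last. */
    static final class SplitBuilder {
        private List<String> dataFiles = Collections.emptyList();
        private List<String> rawFiles = Collections.emptyList();

        SplitBuilder withDataFiles(List<String> files) { this.dataFiles = files; return this; }

        SplitBuilder rawFiles(List<String> files) { this.rawFiles = files; return this; }

        Split build() { return new Split(new ArrayList<>(dataFiles), new ArrayList<>(rawFiles)); }
    }

    /** One builder for all groups: rawFiles set for an earlier group is never cleared. */
    static List<Split> sharedBuilder(List<Group> groups) {
        SplitBuilder builder = new SplitBuilder();
        List<Split> splits = new ArrayList<>();
        for (Group g : groups) {
            builder.withDataFiles(g.files);
            if (g.rawConvertible) {
                builder.rawFiles(g.files);
            }
            splits.add(builder.build());
        }
        return splits;
    }

    /** A new builder per group, as in the hunk: no state survives an iteration. */
    static List<Split> freshBuilder(List<Group> groups) {
        List<Split> splits = new ArrayList<>();
        for (Group g : groups) {
            SplitBuilder builder = new SplitBuilder().withDataFiles(g.files);
            if (g.rawConvertible) {
                builder.rawFiles(g.files);
            }
            splits.add(builder.build());
        }
        return splits;
    }

    /** Shared builder, but rawFiles is always assigned, as in the suggested snippet. */
    static List<Split> sharedBuilderAlwaysSet(List<Group> groups) {
        SplitBuilder builder = new SplitBuilder();
        List<Split> splits = new ArrayList<>();
        for (Group g : groups) {
            builder.withDataFiles(g.files);
            builder.rawFiles(g.rawConvertible ? g.files : Collections.<String>emptyList());
            splits.add(builder.build());
        }
        return splits;
    }

    static List<Group> sampleGroups() {
        return Arrays.asList(
                new Group(Arrays.asList("f1"), true),
                new Group(Arrays.asList("f2", "f3"), false));
    }

    public static void main(String[] args) {
        List<Group> groups = sampleGroups();
        System.out.println(sharedBuilder(groups).get(1).rawFiles);          // [f1] (stale)
        System.out.println(freshBuilder(groups).get(1).rawFiles);           // []
        System.out.println(sharedBuilderAlwaysSet(groups).get(1).rawFiles); // []
    }
}
```

In this sketch both variants avoid the stale value. The per-iteration builder also guards against any other field added to the builder later, while the always-assign form relies on every such field being reset by hand.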





Re: [PR] [core] Fix duplicate primary keys in query results [paimon]

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi commented on PR #3158:
URL: https://github.com/apache/paimon/pull/3158#issuecomment-2041282426

   +1




Re: [PR] [core] Fix duplicate primary keys in query results [paimon]

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on PR #3158:
URL: https://github.com/apache/paimon/pull/3158#issuecomment-2041271537

   Thanks for pointing it out!

