Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/04/27 12:03:37 UTC

[GitHub] [incubator-doris] tianhui5 opened a new pull request, #9264: [Enhancement][Statistics] Store last visited time of partition

tianhui5 opened a new pull request, #9264:
URL: https://github.com/apache/incubator-doris/pull/9264

   # Proposed changes
   
   When we start doing data governance in Doris, it is hard to tell warm data from cold data.
   I think it would be better to store the last visited time of each partition as an indicator of data heat.
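
   As a rough illustration of how a last visited time could serve as a heat signal, here is a minimal sketch. `HeatClassifier`, `Heat`, and the age threshold are hypothetical names and parameters chosen for this example; they are not part of the PR or of Doris.

```java
import java.time.Duration;

// Hypothetical helper: labels a partition by how long ago it was last visited.
final class HeatClassifier {
    enum Heat { WARM, COLD }

    private final long coldAfterMs;

    // coldAfter is an assumed tuning knob, e.g. 30 days.
    HeatClassifier(Duration coldAfter) {
        this.coldAfterMs = coldAfter.toMillis();
    }

    // A partition is COLD once its last visit is at least coldAfter in the past.
    Heat classify(long lastVisitMs, long nowMs) {
        return (nowMs - lastVisitMs) >= coldAfterMs ? Heat.COLD : Heat.WARM;
    }
}
```

   A governance job could call `classify` with each partition's stored timestamp and the current time; where the timestamp is stored is the open question discussed in the review below.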
   
   ## Problem Summary:
   
   We cannot tell warm data from cold data.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (No)
   2. Has unit tests been added: (No Need)
   3. Has document been added or modified: (No Need)
   4. Does it need to update dependencies: (No)
   5. Are there any changes that cannot be rolled back: (Yes)
   
   ## Further comments
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #9264: [Enhancement][Statistics] Store last visited time of partition

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9264:
URL: https://github.com/apache/doris/pull/9264#issuecomment-1301511155

   We're closing this PR because it hasn't been updated in a while.
   This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and feel free to ask a maintainer to remove the Stale tag!




[GitHub] [incubator-doris] morrySnow commented on a diff in pull request #9264: [Enhancement][Statistics] Store last visited time of partition

Posted by GitBox <gi...@apache.org>.
morrySnow commented on code in PR #9264:
URL: https://github.com/apache/incubator-doris/pull/9264#discussion_r861433781


##########
fe/fe-core/src/main/java/org/apache/doris/persist/VisitPartitionsInfo.java:
##########
@@ -0,0 +1,89 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.persist;
+
+import org.apache.doris.common.io.Text;
+import org.apache.doris.common.io.Writable;
+import org.apache.doris.persist.gson.GsonUtils;
+import com.google.gson.annotations.SerializedName;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.util.Collection;
+
+/**
+ * Used to batch-store partition visit info when a query arrives.
+ */
+
+public class VisitPartitionsInfo implements Writable {
+    @SerializedName(value = "dbId")
+    private long dbId;
+    @SerializedName(value = "tableId")
+    private long tableId;
+    @SerializedName(value = "partitionIds")
+    private Collection<Long> partitionIds;
+    @SerializedName(value = "timestamp")
+    private long timestamp;
+
+    public VisitPartitionsInfo(long dbId, long tableId, Collection<Long> partitionIds, long timestamp) {
+        this.dbId = dbId;
+        this.tableId = tableId;
+        this.partitionIds = partitionIds;
+        this.timestamp = timestamp;
+    }
+
+    @Override
+    public void write(DataOutput out) throws IOException {
+        Text.writeString(out, GsonUtils.GSON.toJson(this));
+    }
+
+    public static VisitPartitionsInfo read(DataInput in) throws IOException {
+        String json = Text.readString(in);
+        return GsonUtils.GSON.fromJson(json, VisitPartitionsInfo.class);
+    }
+
+    @Override
+    public boolean equals(Object other) {

Review Comment:
   We need to override hashCode whenever we override equals.
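
   The contract behind this comment can be sketched as follows. `VisitRecord` is a hypothetical stand-in written for this example, not the PR's `VisitPartitionsInfo`; it only shows `equals` and `hashCode` being derived from the same fields so that equal objects always share a hash code.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical value class with the same four fields as the class under review.
final class VisitRecord {
    private final long dbId;
    private final long tableId;
    private final Set<Long> partitionIds;
    private final long timestamp;

    VisitRecord(long dbId, long tableId, Collection<Long> partitionIds, long timestamp) {
        this.dbId = dbId;
        this.tableId = tableId;
        // Copy into a Set so equality does not depend on collection type or order.
        this.partitionIds = new HashSet<>(partitionIds);
        this.timestamp = timestamp;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof VisitRecord)) {
            return false;
        }
        VisitRecord that = (VisitRecord) other;
        return dbId == that.dbId
                && tableId == that.tableId
                && timestamp == that.timestamp
                && partitionIds.equals(that.partitionIds);
    }

    // Uses exactly the fields compared in equals.
    @Override
    public int hashCode() {
        return Objects.hash(dbId, tableId, partitionIds, timestamp);
    }
}
```

   Without the matching `hashCode`, two equal instances could land in different buckets of a `HashSet` or `HashMap` and a lookup would miss.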



##########
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java:
##########
@@ -597,11 +599,17 @@ private void computePartitionInfo() throws AnalysisException {
         }
         selectedPartitionNum = selectedPartitionIds.size();
 
+        String dbName = desc.getRef().getName().getDb();
+        long dbId = analyzer.getCatalog().getDb(dbName).map(Database::getId).orElse(0L);
+        long timestamp = System.currentTimeMillis();
+        VisitPartitionsInfo info = new VisitPartitionsInfo(dbId, olapTable.getId(), selectedPartitionIds, timestamp);
+        Catalog.getCurrentCatalog().getEditLog().logVisitPartition(info);

Review Comment:
   BDBJE cannot write the log on a non-master node. So if we want to log this info, we must either forward all log requests to the master or redirect all queries to the master.





[GitHub] [doris] github-actions[bot] closed pull request #9264: [Enhancement][Statistics] Store last visited time of partition

Posted by GitBox <gi...@apache.org>.
github-actions[bot] closed pull request #9264: [Enhancement][Statistics] Store last visited time of partition
URL: https://github.com/apache/doris/pull/9264




[GitHub] [incubator-doris] tianhui5 commented on a diff in pull request #9264: [Enhancement][Statistics] Store last visited time of partition

Posted by GitBox <gi...@apache.org>.
tianhui5 commented on code in PR #9264:
URL: https://github.com/apache/incubator-doris/pull/9264#discussion_r865537451


##########
fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java:
##########
@@ -597,11 +599,17 @@ private void computePartitionInfo() throws AnalysisException {
         }
         selectedPartitionNum = selectedPartitionIds.size();
 
+        String dbName = desc.getRef().getName().getDb();
+        long dbId = analyzer.getCatalog().getDb(dbName).map(Database::getId).orElse(0L);
+        long timestamp = System.currentTimeMillis();
+        VisitPartitionsInfo info = new VisitPartitionsInfo(dbId, olapTable.getId(), selectedPartitionIds, timestamp);
+        Catalog.getCurrentCatalog().getEditLog().logVisitPartition(info);

Review Comment:
   Since we can't redirect all queries to the master, should I forward all log requests to the master? Would that result in too many RPCs?
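
   One way to bound the RPC count, sketched here under assumptions, is to coalesce visits in memory on each frontend and forward only the newest timestamp per partition in periodic batches. `VisitTimeBuffer` and its methods are hypothetical names for this example; the actual forwarding call to the master is deliberately left out because the thread does not specify it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical buffer: keeps only the newest visit time per partition id.
final class VisitTimeBuffer {
    private final ConcurrentHashMap<Long, Long> lastVisit = new ConcurrentHashMap<>();

    // Called on the query path; merges atomically per key.
    void record(Iterable<Long> partitionIds, long timestampMs) {
        for (Long id : partitionIds) {
            lastVisit.merge(id, timestampMs, Long::max);
        }
    }

    // Called by a periodic task; removes and returns the pending entries
    // so that one batch can be sent instead of one request per query.
    Map<Long, Long> drain() {
        Map<Long, Long> batch = new HashMap<>();
        for (Long id : lastVisit.keySet()) {
            Long ts = lastVisit.remove(id);
            if (ts != null) {
                batch.put(id, ts);
            }
        }
        return batch;
    }
}
```

   With this shape the number of forwarded requests depends on the flush interval rather than on the query rate, at the cost of losing unflushed visit times if the frontend restarts.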





[GitHub] [incubator-doris] tianhui5 commented on a diff in pull request #9264: [Enhancement][Statistics] Store last visited time of partition

Posted by GitBox <gi...@apache.org>.
tianhui5 commented on code in PR #9264:
URL: https://github.com/apache/incubator-doris/pull/9264#discussion_r865536869


##########
fe/fe-core/src/main/java/org/apache/doris/persist/VisitPartitionsInfo.java:
##########
@@ -0,0 +1,89 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.persist;
+
+import org.apache.doris.common.io.Text;
+import org.apache.doris.common.io.Writable;
+import org.apache.doris.persist.gson.GsonUtils;
+import com.google.gson.annotations.SerializedName;
+
+import java.io.DataInput;
+import java.io.DataOutput;
+import java.io.IOException;
+import java.util.Collection;
+
+/**
+ * Used to batch-store partition visit info when a query arrives.
+ */
+
+public class VisitPartitionsInfo implements Writable {
+    @SerializedName(value = "dbId")
+    private long dbId;
+    @SerializedName(value = "tableId")
+    private long tableId;
+    @SerializedName(value = "partitionIds")
+    private Collection<Long> partitionIds;
+    @SerializedName(value = "timestamp")
+    private long timestamp;
+
+    public VisitPartitionsInfo(long dbId, long tableId, Collection<Long> partitionIds, long timestamp) {
+        this.dbId = dbId;
+        this.tableId = tableId;
+        this.partitionIds = partitionIds;
+        this.timestamp = timestamp;
+    }
+
+    @Override
+    public void write(DataOutput out) throws IOException {
+        Text.writeString(out, GsonUtils.GSON.toJson(this));
+    }
+
+    public static VisitPartitionsInfo read(DataInput in) throws IOException {
+        String json = Text.readString(in);
+        return GsonUtils.GSON.fromJson(json, VisitPartitionsInfo.class);
+    }
+
+    @Override
+    public boolean equals(Object other) {

Review Comment:
   Actually, this function is never used, so I'll delete it.





[GitHub] [incubator-doris] tianhui5 commented on pull request #9264: [Enhancement][Statistics] Store last visited time of partition

Posted by GitBox <gi...@apache.org>.
tianhui5 commented on PR #9264:
URL: https://github.com/apache/incubator-doris/pull/9264#issuecomment-1119443043

   Hi, @morrySnow 
   Since storing the visit time in metadata every time a query arrives wastes too many resources, I now write the visited-partitions info to the audit log instead.
   Please review it again, thanks for your time.
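
   To make the audit-log approach concrete, here is a minimal sketch of rendering the visited partitions as one log entry. `AuditLineSketch`, the field names, and the pipe-separated layout are assumptions made for this example; they are not Doris's actual audit log format or API.

```java
import java.util.Collection;
import java.util.stream.Collectors;

// Hypothetical formatter; the entry layout is illustrative only.
final class AuditLineSketch {
    // Builds one entry listing the partitions a query touched, in sorted order.
    static String format(long timestampMs, long dbId, long tableId, Collection<Long> partitionIds) {
        String ids = partitionIds.stream()
                .sorted()
                .map(id -> Long.toString(id))
                .collect(Collectors.joining(","));
        return "timestamp=" + timestampMs
                + "|dbId=" + dbId
                + "|tableId=" + tableId
                + "|visitPartitions=[" + ids + "]";
    }
}
```

   Because the audit log is written locally by whichever frontend serves the query, this avoids the master-only write restriction raised in the review; the last visited time per partition then has to be derived offline from the log.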

