Posted to pr@cassandra.apache.org by GitBox <gi...@apache.org> on 2023/01/04 17:14:49 UTC

[GitHub] [cassandra] isaacreath commented on a diff in pull request #2058: [CASSANDRA-16325] Update streaming metrics incrementally

isaacreath commented on code in PR #2058:
URL: https://github.com/apache/cassandra/pull/2058#discussion_r1061709122


##########
src/java/org/apache/cassandra/streaming/StreamSession.java:
##########
@@ -1027,9 +1022,30 @@ public void receive(IncomingStreamMessage message)
     public void progress(String filename, ProgressInfo.Direction direction, long bytes, long total)
     {
         ProgressInfo progress = new ProgressInfo(peer, index, filename, direction, bytes, total);
+        updateMetricsOnProgress(progress);
         streamResult.handleProgress(progress);
     }
 
+    private void updateMetricsOnProgress(ProgressInfo progress)
+    {
+        ProgressInfo.Direction direction = progress.direction;
+        long lastSeenBytesStreamedForProgress = lastSeenBytesStreamed.getOrDefault(progress, 0L);
+        long newBytesStreamed = progress.currentBytes - lastSeenBytesStreamedForProgress;
+        if (direction == ProgressInfo.Direction.OUT)
+        {
+            StreamingMetrics.totalOutgoingBytes.inc(newBytesStreamed);
+            metrics.outgoingBytes.inc(newBytesStreamed);
+        }
+
+        else if (direction == ProgressInfo.Direction.IN)
+        {
+            StreamingMetrics.totalIncomingBytes.inc(newBytesStreamed);
+            metrics.incomingBytes.inc(newBytesStreamed);
+        }
+
+        lastSeenBytesStreamed.put(progress, lastSeenBytesStreamedForProgress + newBytesStreamed);

Review Comment:
   In practice, we will add one progress object to the map for every file streamed in each direction by this `StreamSession` (see: [ProgressInfo::hashCode](https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/streaming/ProgressInfo.java#L98)). In the worst case, the total number of entries for each `StreamSession` equals the number of files on the local node * the number of files on the remote node.
   
   A simple optimization I can add here is to remove the progress object from the map once its file has finished streaming, that is, when `progress.currentBytes == progress.totalBytes`. This would clean up entries as each file completes and would probably improve memory utilization in the average case. It would not help in the worst case, where all files finish streaming at the same time.
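
   A minimal sketch of that cleanup, written as a standalone class rather than as a patch to `StreamSession`. Everything here is an assumption made for illustration: `ProgressTracker`, `FileKey`, and `onProgress` are hypothetical names, the `AtomicLong` counters stand in for the real metrics counters, and the key uses only the file name and direction instead of the fields `ProgressInfo` actually hashes.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified stand-in for the per-session bookkeeping in the diff above.
class ProgressTracker
{
    // Hypothetical map key: identifies one file in one direction.
    static final class FileKey
    {
        final String fileName;
        final boolean outgoing;

        FileKey(String fileName, boolean outgoing)
        {
            this.fileName = fileName;
            this.outgoing = outgoing;
        }

        @Override
        public boolean equals(Object o)
        {
            if (this == o) return true;
            if (!(o instanceof FileKey)) return false;
            FileKey other = (FileKey) o;
            return outgoing == other.outgoing && fileName.equals(other.fileName);
        }

        @Override
        public int hashCode()
        {
            return Objects.hash(fileName, outgoing);
        }
    }

    private final Map<FileKey, Long> lastSeenBytesStreamed = new ConcurrentHashMap<>();
    private final AtomicLong outgoingBytes = new AtomicLong(); // stand-in for a metrics counter
    private final AtomicLong incomingBytes = new AtomicLong(); // stand-in for a metrics counter

    void onProgress(String fileName, boolean outgoing, long currentBytes, long totalBytes)
    {
        FileKey key = new FileKey(fileName, outgoing);
        long lastSeen = lastSeenBytesStreamed.getOrDefault(key, 0L);

        // Count only the bytes that arrived since the previous update for this file.
        long newBytes = currentBytes - lastSeen;
        (outgoing ? outgoingBytes : incomingBytes).addAndGet(newBytes);

        if (currentBytes == totalBytes)
            lastSeenBytesStreamed.remove(key);            // file finished: drop its entry
        else
            lastSeenBytesStreamed.put(key, currentBytes); // file in flight: remember position
    }

    int trackedFiles()
    {
        return lastSeenBytesStreamed.size();
    }

    long getOutgoingBytes()
    {
        return outgoingBytes.get();
    }

    long getIncomingBytes()
    {
        return incomingBytes.get();
    }
}
```

   One caveat with this sketch: if another progress update for a file arrives after its entry has been removed, the lookup falls back to 0 and those bytes are counted a second time. Whether that can happen depends on how progress events are delivered, which this example does not model.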



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: pr-unsubscribe@cassandra.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

