You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2020/12/17 03:26:49 UTC

[GitHub] [skywalking-satellite] hanahmily commented on a change in pull request #10: mmap-queue-plugin

hanahmily commented on a change in pull request #10:
URL: https://github.com/apache/skywalking-satellite/pull/10#discussion_r544746659



##########
File path: dist/licenses/LICENSE-go-cmp
##########
@@ -0,0 +1,27 @@
+Copyright (c) 2017 The Go Authors. All rights reserved.

Review comment:
       The file and others located at `dist/licenses` should be appended with the `.txt` suffix as indicated in `dist/LICENSE`.

##########
File path: docs/configuration/queue.md
##########
@@ -0,0 +1,11 @@
+# queue configuration
+
+|  Type   | Param  | DefaultValue| Meaning| 
+|  ----  | ----  |----  | ----  |
+| mmap-queue  | segment_size | 131072 | The size of each segment. The unit is Byte.

Review comment:
       Should mention that it has to be more than system memory page size

##########
File path: go.mod
##########
@@ -3,8 +3,9 @@ module github.com/apache/skywalking-satellite
 go 1.14
 
 require (
+	github.com/google/go-cmp v0.5.4

Review comment:
       After issuing `go mod tidy`, I found this file's content is updated. My preference to run the tidy command before committing, in order to keep go.mod's content consistent. A better practice to add a task to CI to check it as https://github.com/apache/skywalking-swck/blob/master/.github/workflows/go.yml#L42 did

##########
File path: plugins/queue/mmap/queue.go
##########
@@ -0,0 +1,192 @@
+// Licensed to Apache Software Foundation (ASF) under one or more contributor
+// license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright
+// ownership. Apache Software Foundation (ASF) licenses this file to you under
+// the Apache License, Version 2.0 (the "License"); you may
+// not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package mmap
+
+import (
+	"bytes"
+	"context"
+	"fmt"
+	"os"
+	"sync"
+
+	"github.com/grandecola/mmap"
+
+	"github.com/apache/skywalking-satellite/internal/pkg/event"
+	"github.com/apache/skywalking-satellite/internal/pkg/log"
+	"github.com/apache/skywalking-satellite/plugins/queue/api"
+	"github.com/apache/skywalking-satellite/plugins/queue/mmap/meta"
+)
+
+// Queue is a memory mapped queue to store the input data.
+type Queue struct {
+	sync.Mutex
+	// config
+	SegmentSize           int    `mapstructure:"segment_size"`            // The size of each segment. The unit is byte.
+	MaxInMemSegments      int    `mapstructure:"max_in_mem_segments"`     // The max num of segments in memory.
+	QueueCapacitySegments int    `mapstructure:"queue_capacity_segments"` // The capacity of Queue = segment_size * queue_capacity_segments.
+	FlushPeriod           int    `mapstructure:"flush_period"`            // The period flush time. The unit is ms.
+	FlushCeilingNum       int    `mapstructure:"flush_ceiling_num"`       // The max number in one flush time.
+	MaxEventSize          int    `mapstructure:"max_event_size"`          // The max size of the input event.
+	QueueDir              string `mapstructure:"queue_dir"`               // Contains all files in the queue.
+
+	// running components
+	meta                   *meta.Metadata // The metadata file.
+	segments               []*mmap.File   // The data files.

Review comment:
       In flush goroutine, you opt to the mutex to make it visible to flush operation. Could you use the `atomic` package to improve the performance?

##########
File path: docs/configuration/queue.md
##########
@@ -0,0 +1,11 @@
+# queue configuration
+
+|  Type   | Param  | DefaultValue| Meaning| 
+|  ----  | ----  |----  | ----  |
+| mmap-queue  | segment_size | 131072 | The size of each segment. The unit is Byte.
+| mmap-queue  | max_in_mem_segments | 10 | The max num of segments in memory.

Review comment:
       Could you mention it could be more than 3?

##########
File path: docs/plugins/queue/mmap/README.md
##########
@@ -0,0 +1,66 @@
+# Design
+The mmap-queue is a big, fast and persistent queue based on the memory mapped files. One mmap-queue has a directory to store the whole data. The Queue directory is made up with many segments and 1 meta file. 
+
+- Segment: Segment is the real data store center, that provides large-space storage and does not reduce read and write performance as much as possible by using mmap. And we will avoid deleting files by reusing them.
+- Meta: The purpose of meta is to find the data that the consumer needs.
+
+## Meta
+Metadata only needs 80B to store the Metadata for the pipe. But for memory alignment, it takes at least one memory page size, which is generally 4K.
+```
+[    8Bit   ][  8Bit ][  8Bit ][  8Bit ][  8Bit ][  8Bit ][  8Bit ][  8Bit ][  8Bit ][  8Bit  ]
+[metaVersion][  ID   ][ offset][  ID   ][ offset][  ID   ][ offset][  ID   ][ offset][capacity]
+[metaVersion][writing   offset][watermark offset][committed offset][reading   offset][capacity]
+
+```
+### Transforming
+
+![](https://skywalking.apache.org/blog/2020-11-25-skywalking-satellite-0.1.0-design/offset-convert.jpg)
+
+## Configuration
+[Configuration Params](../../../configuration/queue.md)
+
+## Segment
+Segments are a series of files of the same size. Each input data would cost `8Bit+Len(data)Bit` to store the raw bytes. The first 8Bit is equal to `Len(data)` for finding the ending position. 
+### Swapper
+For the performance and resources thinking, we define a page replacement policy.
+
+- Keep the reading and writing segments on the memory.
+- When the mmapcount is greater or equals to the max_in_mem_segments, we first scan the read scope and then scan the written scope to swap the segments to promise the reading or writing segments are always in memory.
+    - Read scope: [reading_offset - max_in_mem_segments,reading_offset - 1]
+    - Written scope: [writing_offset - max_in_mem_segments,writing_offset - 1]
+    - Each displacement operation guarantees at least `max_in_mem_segments/2-1` capacity available. Subtract operation to subtract the amount of memory that must always exist.
+
+## BenchmarkTest

Review comment:
       The result looks great. Does it contain some scenarios under memory pressure, that means swap is triggered.

##########
File path: plugins/queue/mmap/queue.go
##########
@@ -0,0 +1,192 @@
+// Licensed to Apache Software Foundation (ASF) under one or more contributor
+// license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright
+// ownership. Apache Software Foundation (ASF) licenses this file to you under
+// the Apache License, Version 2.0 (the "License"); you may
+// not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package mmap
+
+import (
+	"bytes"
+	"context"
+	"fmt"
+	"os"
+	"sync"
+
+	"github.com/grandecola/mmap"
+
+	"github.com/apache/skywalking-satellite/internal/pkg/event"
+	"github.com/apache/skywalking-satellite/internal/pkg/log"
+	"github.com/apache/skywalking-satellite/plugins/queue/api"
+	"github.com/apache/skywalking-satellite/plugins/queue/mmap/meta"
+)
+
+// Queue is a memory mapped queue to store the input data.
+type Queue struct {
+	sync.Mutex
+	// config
+	SegmentSize           int    `mapstructure:"segment_size"`            // The size of each segment. The unit is byte.
+	MaxInMemSegments      int    `mapstructure:"max_in_mem_segments"`     // The max num of segments in memory.
+	QueueCapacitySegments int    `mapstructure:"queue_capacity_segments"` // The capacity of Queue = segment_size * queue_capacity_segments.
+	FlushPeriod           int    `mapstructure:"flush_period"`            // The period flush time. The unit is ms.
+	FlushCeilingNum       int    `mapstructure:"flush_ceiling_num"`       // The max number in one flush time.
+	MaxEventSize          int    `mapstructure:"max_event_size"`          // The max size of the input event.
+	QueueDir              string `mapstructure:"queue_dir"`               // Contains all files in the queue.
+
+	// running components
+	meta                   *meta.Metadata // The metadata file.
+	segments               []*mmap.File   // The data files.
+	mmapCount              int            // The number of the memory mapped files.

Review comment:
       it's shared in different goroutines, swapper contains reading and misusing operations but push take cares of adding. How to sync this shared variable?

##########
File path: plugins/queue/mmap/meta/meta.go
##########
@@ -0,0 +1,153 @@
+// Licensed to Apache Software Foundation (ASF) under one or more contributor
+// license agreements. See the NOTICE file distributed with
+// this work for additional information regarding copyright
+// ownership. Apache Software Foundation (ASF) licenses this file to you under
+// the Apache License, Version 2.0 (the "License"); you may
+// not use this file except in compliance with the License.
+// You may obtain a copy of the License at
+//
+//     http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package meta
+
+import (
+	"fmt"
+	"syscall"
+
+	"path/filepath"

Review comment:
       Imports are not sorted




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org