Posted to users@pulsar.apache.org by Apache Pulsar Slack <ap...@gmail.com> on 2018/01/19 23:36:02 UTC

Slack digest for #general - 2018-01-19

2018-01-19 03:02:35 UTC - Jaebin Yoon: @Matteo Merli @Sijie Guo I'm setting up bookies on AWS d2.4xlarge instances (16 cores, 122G memory, 12x2TB RAID-0 HDDs). Do you have any recommendations on memory configuration for this kind of setup, i.e. for settings like the Java heap, direct memory, and dbStorage_writeCacheMaxSizeMb, dbStorage_readAheadCacheMaxSizeMb, dbStorage_rocksDB_blockCacheSize?
BTW, I'm going to use journalSyncData=false since we cannot recover machines once they shut down, so no fsync is required for every message.
----
2018-01-19 03:14:43 UTC - Matteo Merli: Since the VM has a lot of RAM you can increase a lot of the defaults and leave the rest to the page cache. For the JVM heap I'd say ~24g. WriteCacheMaxSize and ReadAheadCacheMaxSize both come out of JVM direct memory; I'd say to start with 16g @ 16g. For the RocksDB block cache, which is allocated in JNI and so sits completely outside the JVM configuration, ideally you want to cache most of the indexes; I'd say 4gb should be enough to index all the data in the 24TB storage space.
----
2018-01-19 03:19:39 UTC - Jaebin Yoon: alright, thanks @Matteo Merli for the quick response! Let me try that. And I'm going to use m4.2xlarge for brokers (8 cores, 32G).
----
2018-01-19 03:27:45 UTC - Matteo Merli: No prob. If you post the final settings I can take a look as well
----
2018-01-19 03:28:12 UTC - Jaebin Yoon: sounds good. thanks!
----
2018-01-19 03:38:42 UTC - YANGLiiN: @YANGLiiN has joined the channel
----
2018-01-19 04:43:26 UTC - Jaebin Yoon: @Matteo Merli here is the bookie configuration I'm going to use on d2.4xlarge:

```
== Bookie JVM options

-server
-Dsnappy.bufferSize=32768
-Dlog4j.configuration=file:///apps/pulsarbookie/conf/log4j.properties
-XX:+UseCompressedOops
-XX:+DisableExplicitGC
-Xms24g
-Xmx24g
-XX:MaxDirectMemorySize=16g
-verbose:gc
-Xloggc:$GCLOG
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=30
-XX:GCLogFileSize=10M
-XX:+PreserveFramePointer
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintTenuringDistribution
-Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Djava.awt.headless=true
-XX:+UseG1GC
-XX:MaxGCPauseMillis=10
-XX:+ParallelRefProcEnabled
-XX:+UnlockExperimentalVMOptions
-XX:+AggressiveOpts
-XX:+DoEscapeAnalysis
-XX:ParallelGCThreads=32
-XX:ConcGCThreads=32
-XX:G1NewSizePercent=50
-XX:-ResizePLAB
-Djute.maxbuffer=10485760
-Djava.net.preferIPv4Stack=true
-Dio.netty.leakDetectionLevel=disabled
-Dio.netty.recycler.maxCapacity.default=1000
-Dio.netty.recycler.linkCapacity=1024


== Bookie Configuration

dbStorage_writeCacheMaxSizeMb=4096
dbStorage_readAheadCacheMaxSizeMb=4096
dbStorage_rocksDB_blockCacheSize=4294967296

readBufferSizeBytes=4096
writeBufferSizeBytes=65536

journalSyncData=false

# defaults
ledgerStorageClass=org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage
entryLogFilePreallocationEnabled=true
logSizeLimit=2147483648
minorCompactionThreshold=0.2
minorCompactionInterval=3600
majorCompactionThreshold=0.5
majorCompactionInterval=86400
compactionMaxOutstandingRequests=100000
compactionRate=1000
isThrottleByBytes=false
compactionRateByEntries=1000
compactionRateByBytes=1000000
journalMaxSizeMB=2048
journalMaxBackups=5
journalPreAllocSizeMB=16
journalWriteBufferSizeKB=64
journalRemoveFromPageCache=true
journalAdaptiveGroupWrites=true
journalMaxGroupWaitMSec=1
journalAlignmentSize=4096
journalBufferedWritesThreshold=524288
journalFlushWhenQueueEmpty=false
numJournalCallbackThreads=8
rereplicationEntryBatchSize=5000
gcWaitTime=900000
gcOverreplicatedLedgerWaitTime=86400000
flushInterval=60000
bookieDeathWatchInterval=1000
zkTimeout=30000
serverTcpNoDelay=true
openFileLimit=0
pageLimit=0
readOnlyModeEnabled=true
diskUsageThreshold=0.95
diskCheckInterval=10000
auditorPeriodicCheckInterval=604800
auditorPeriodicBookieCheckInterval=86400
numAddWorkerThreads=0
numReadWorkerThreads=8
maxPendingReadRequestsPerThread=2500
useHostNameAsBookieID=false

dbStorage_readAheadCacheBatchSize=1000
dbStorage_rocksDB_writeBufferSizeMB=64
dbStorage_rocksDB_sstSizeInMB=64
dbStorage_rocksDB_blockSize=65536
dbStorage_rocksDB_bloomFilterBitsPerKey=10
dbStorage_rocksDB_numLevels=-1
dbStorage_rocksDB_numFilesInLevel0=4
dbStorage_rocksDB_maxSizeInLevel1MB=256```
----
2018-01-19 07:21:31 UTC - DengJian: @DengJian has joined the channel
----
2018-01-19 17:56:08 UTC - Jaebin Yoon: When there are multiple consumers for a topic, does the broker read once from bookies and send the messages to all consumers from some buffer? Or does it go fetch from bookies every time, for each consumer?
----
2018-01-19 17:56:47 UTC - Matteo Merli: In general, all dispatching is done directly from broker memory
----
2018-01-19 17:57:13 UTC - Matteo Merli: we only read from bookies when consumers are falling behind
----
2018-01-19 17:57:39 UTC - Jaebin Yoon: ah i see. that's great. I'm trying to understand the network implications of the high fan-out case.
----
2018-01-19 17:58:08 UTC - Matteo Merli: in that case it depends on the broker cache; if they’re reading close together they’ll probably be cached anyway. Otherwise we go back to the bookies
----
2018-01-19 17:58:29 UTC - Jaebin Yoon: is the broker cache size configurable?
----
2018-01-19 17:58:41 UTC - Matteo Merli: yes
----
2018-01-19 17:59:17 UTC - Jaebin Yoon: which configuration is it? definitely this is something i need to tweak for the high fan-out case.
----
2018-01-19 18:00:02 UTC - Matteo Merli: @Matteo Merli uploaded a file: <https://apache-pulsar.slack.com/files/U680ZCXA5/F8WGEBS30/-.sh|Untitled>
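(The uploaded file itself isn't preserved in this digest. As a hedged pointer, the broker's entry cache is sized in broker.conf; the key names below exist in Pulsar's broker.conf, but the values are illustrative placeholders, not necessarily what was shared:)

```
# broker.conf -- cache of recent entries used when dispatching to consumers
managedLedgerCacheSizeMB=1024
# start evicting entries once the cache crosses this fraction of its size
managedLedgerCacheEvictionWatermark=0.9
```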
----
2018-01-19 18:04:46 UTC - Matteo Merli: Yes, the config looks good. One other “improvement”, since you’re disabling the fsync, could be to reduce `journalMaxGroupWaitMSec` to either 0 or 0.1, to avoid the “minimal” group-commit latency.
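(As a sketch, that tweak relative to the config posted above:)

```
journalSyncData=false
# was 1; with fsync disabled, drop the group-commit wait (Matteo: 0 or 0.1)
journalMaxGroupWaitMSec=0
```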
----
2018-01-19 19:08:18 UTC - Fred Monroe: hi everyone, i apologize if this is a very noob question: is there a simple example and client library somewhere for publishing to Apache Pulsar from Go (the programming language), similar to the examples for Python?
----
2018-01-19 19:09:14 UTC - Ali Ahmed: @Fred Monroe there is no official go client at this time
----
2018-01-19 19:09:41 UTC - Fred Monroe: ok thanks, i looked around a little, just wanted to make sure i wasn’t missing something
----
2018-01-19 19:21:17 UTC - Jaebin Yoon: Yeah, I'm interested in a Go client too. Since the C++ client is available, it would be relatively easy to have a wrapper for Go.
----
2018-01-19 19:25:13 UTC - Matteo Merli: There was some effort toward a pure Go client some time back. I cannot vouch for its completeness/stability though: <https://github.com/t2y/go-pulsar>
----
2018-01-19 19:26:02 UTC - Matteo Merli: Though yeah, my preference would be to have a C++-based wrapper. That would ensure we start from a mature library and have all features available.
----
2018-01-19 19:26:32 UTC - Matteo Merli: I don’t know how bad it is to distribute Go libraries with native components
----
2018-01-19 19:29:27 UTC - Jaebin Yoon: yeah, that would be challenging. Users might have to compile it in their environment. The Kafka Go client takes that approach: <https://github.com/confluentinc/confluent-kafka-go>
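(For context, confluent-kafka-go is a cgo binding: the Go package links against a native library installed on the system. A minimal sketch of that pattern follows; every `pulsar_shim_*` name is a hypothetical C shim invented for illustration, not a real Pulsar API. It only shows the wrapping pattern and why distribution gets awkward: the native library must be present wherever the Go program is built and run.)

```
package pulsar

/*
// Hypothetical C shim, for illustration only -- NOT a real Pulsar API.
// It stands in for the native library a cgo wrapper would link against,
// the way confluent-kafka-go links against librdkafka.
#cgo LDFLAGS: -lpulsarshim
#include <stdlib.h>

typedef struct shim_client shim_client;
shim_client *pulsar_shim_connect(const char *service_url);
int pulsar_shim_send(shim_client *c, const char *topic,
                     const void *payload, int len);
void pulsar_shim_close(shim_client *c);
*/
import "C"

import (
	"errors"
	"unsafe"
)

// Client wraps the native handle.
type Client struct{ h *C.shim_client }

// Connect opens a connection through the native library.
func Connect(serviceURL string) (*Client, error) {
	curl := C.CString(serviceURL)
	defer C.free(unsafe.Pointer(curl))
	h := C.pulsar_shim_connect(curl)
	if h == nil {
		return nil, errors.New("pulsar: connect failed")
	}
	return &Client{h: h}, nil
}

// Send publishes one message; the shim is assumed to copy the bytes.
func (c *Client) Send(topic string, payload []byte) error {
	ctopic := C.CString(topic)
	defer C.free(unsafe.Pointer(ctopic))
	cpayload := C.CBytes(payload)
	defer C.free(cpayload)
	if C.pulsar_shim_send(c.h, ctopic, cpayload, C.int(len(payload))) != 0 {
		return errors.New("pulsar: send failed")
	}
	return nil
}

// Close releases the native handle.
func (c *Client) Close() { C.pulsar_shim_close(c.h) }
```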
----
2018-01-19 19:32:00 UTC - Matteo Merli: is that basically requiring the client library to be installed on your system? So there’s no “embedding” of sorts
----
2018-01-19 19:34:10 UTC - Matteo Merli: on that front, Python wheel files are nicer!
----
2018-01-19 22:58:53 UTC - Allen Wang: Hello: what configuration should we use if we want the BookKeeper ensemble for a namespace to be all bookies in the cluster? We want the ensemble to grow as the bookie cluster grows.
----
2018-01-19 22:59:54 UTC - Matteo Merli: but you still want to write 2 (or 3) copies of the data, right?
----
2018-01-19 23:00:00 UTC - Allen Wang: Yes
----
2018-01-19 23:00:27 UTC - Matteo Merli: the default ensemble size (2) is good then
----
2018-01-19 23:00:44 UTC - Matteo Merli: that just refers to a particular “ledger”
----
2018-01-19 23:01:44 UTC - Allen Wang: I thought the number of copies is controlled by bookkeeper-write-quorum
----
2018-01-19 23:01:57 UTC - Matteo Merli: defaults are `ensemble=2`, `write-quorum=2`, `ack-quorum=2`. This means: 
for a new ledger, pick any 2 available bookies and write 2 copies and wait for 2 acks
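(For reference, a sketch of how these are set per namespace with pulsar-admin; the command and flags are pulsar-admin's `namespaces set-persistence`, while the `my-prop/my-cluster/my-ns` path and the values are placeholders:)

```
bin/pulsar-admin namespaces set-persistence my-prop/my-cluster/my-ns \
  --bookkeeper-ensemble 2 \
  --bookkeeper-write-quorum 2 \
  --bookkeeper-ack-quorum 2 \
  --ml-mark-delete-max-rate 0
```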
----
2018-01-19 23:02:44 UTC - Matteo Merli: in this same scenario, if you increase the ensemble size, you would be enabling “striping” when writing into a specific ledger
----
2018-01-19 23:03:24 UTC - Matteo Merli: eg: `e=5 w=2 a=2` -&gt; Picks 5 bookies and write 2 copies of the data, striping in round-robin across the 5 bookies
----
2018-01-19 23:12:39 UTC - Matteo Merli: Again, this is only for a particular ledger (a segment of a topic); over time, the data for a topic will be assigned to many bookies. Every ledger's ensemble pick is unrelated to the previous ones.
----