Posted to dev@pinot.apache.org by Pinot Slack Email Digest <sn...@apache.org> on 2020/10/17 02:00:19 UTC

Apache Pinot Daily Email Digest (2020-10-16)

### _#general_

  
 **@chundong.wang:** @chundong.wang has joined the channel  
 **@anshu.jalan:** @anshu.jalan has joined the channel  
 **@snlee727:** @snlee727 has joined the channel  
 **@jiapengtao0:** @jiapengtao0 has joined the channel  

###  _#random_

  
 **@chundong.wang:** @chundong.wang has joined the channel  
 **@anshu.jalan:** @anshu.jalan has joined the channel  
 **@snlee727:** @snlee727 has joined the channel  
 **@jiapengtao0:** @jiapengtao0 has joined the channel  

###  _#troubleshooting_

  
 **@snlee727:** @snlee727 has joined the channel  

###  _#pinot-dev_

  
 **@chundong.wang:** @chundong.wang has joined the channel  

###  _#metadata-push-api_

  
 **@snlee727:** @snlee727 has joined the channel  

###  _#pinot-realtime-table-rebalance_

  
 **@tingchen:** @tingchen has joined the channel  
 **@tingchen:** @tingchen set the channel purpose: Discussion about Pinot
table rebalance  
 **@yupeng:** @yupeng has joined the channel  
 **@ujwala.tulshigiri:** @ujwala.tulshigiri has joined the channel  
 **@jackie.jxt:** @jackie.jxt has joined the channel  
 **@npawar:** @npawar has joined the channel  
 **@ssubrama:** @ssubrama has joined the channel  
 **@tingchen:** Created this channel to discuss the Pinot rebalance issues we
faced earlier at Uber.  
 **@tingchen:** We have a heavily used Pinot tenant which had 6 servers to
begin with.  
 **@tingchen:** I added 3 new servers with identical specs to that tenant. The
problem is that the 3 new servers showed twice as much CPU load as the
other 6.  
 **@tingchen:** The 3 new servers also have 50% more documents than the other
6. So it seems that table rebalance does not distribute the data evenly among
all servers?  
 **@tingchen:** Another related question: is Replica Group available for
realtime LLC query routing?  
**@tingchen:** the examples above use OFFLINE tables. @jackie.jxt @npawar  
 **@npawar:** replica groups for realtime:  
**@npawar:** it does distribute evenly, but it is possible that some servers
have 1 more consuming partition than the others  
 **@npawar:** does the ideal state show a bigger imbalance?  
 **@tingchen:** yes  
 **@tingchen:**
```
tingchen@streampinot-prod40-dca8:~$ upinot-admin.sh ShowIdealState storeindex_search_history | grep CONSUMING
"Server_streampinot-prod04-dca8_7090": "CONSUMING",
"Server_streampinot-prod05-dca8_7090": "CONSUMING",
"Server_streampinot-prod06-dca8_7090": "CONSUMING"
"Server_streampinot-prod164-dca8_7090": "CONSUMING",
"Server_streampinot-prod165-dca8_7090": "CONSUMING",
"Server_streampinot-prod166-dca8_7090": "CONSUMING"
"Server_streampinot-prod167-dca8_7090": "CONSUMING",
"Server_streampinot-prod168-dca8_7090": "CONSUMING",
"Server_streampinot-prod169-dca8_7090": "CONSUMING"
"Server_streampinot-prod04-dca8_7090": "CONSUMING",
"Server_streampinot-prod05-dca8_7090": "CONSUMING",
"Server_streampinot-prod06-dca8_7090": "CONSUMING"
```
 **@npawar:** this looks fine and balanced, right?  
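
A quick way to tally the CONSUMING entries per server from output like the
above, as a hedged sketch reusing the admin wrapper shown in the transcript
plus standard shell tools:

```
# Count CONSUMING segment entries per server from the ideal state dump.
upinot-admin.sh ShowIdealState storeindex_search_history \
  | grep CONSUMING \
  | grep -o 'Server_[^"]*' \
  | sort | uniq -c | sort -rn
```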
 **@jackie.jxt:** Seems the problem is that there are 12 Kafka partitions
(streaming partitions * replication), but only 9 servers  
 **@jackie.jxt:** So 3 servers will have 2 partitions to consume, the other 6
have 1 partition  
 **@tingchen:** yes. so the 9 servers' perf is in fact worse than 6 servers.  
 **@yupeng:** no. this topic has 4 partitions  
 **@yupeng:** but a replication factor of 3  
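
Both descriptions give the same total; a small worked check of the placement
(assuming 4 partitions and a replication factor of 3, as stated above):

```
echo $(( 4 * 3 ))    # 12 consuming segment replicas to place
echo $(( 12 / 9 ))   # 1 -> each of the 9 servers holds at least one
echo $(( 12 % 9 ))   # 3 -> 3 servers carry a second consuming partition
```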
 **@tingchen:** I also observed the document distribution is not even -- the
latter 3 servers have 50% more docs than the original 6.  
 **@tingchen:** is that expected?  
 **@jackie.jxt:** The difference between LLC and realtime is that all the
segments for one partition will be hosted on the same server, so think of a
partition as the smallest unit of the table  
**@npawar:** won't the completed segments get distributed evenly, though, if
you run rebalance?  
**@jackie.jxt:** No, unless you configure the COMPLETED segment assignment  
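
For reference, one way to configure COMPLETED segment assignment is through
the table config's `instanceAssignmentConfigMap`, as described in the Pinot
instance assignment docs. A minimal sketch only: the tenant tag is a
placeholder, and the exact fields should be checked against your Pinot
version:

```
# Sketch: a COMPLETED entry lets rebalance spread completed segments
# across all servers with the given tag, instead of pinning them to the
# server that consumed the partition. Tag value is a placeholder.
cat > /tmp/completed-assignment.json <<'EOF'
{
  "instanceAssignmentConfigMap": {
    "COMPLETED": {
      "tagPoolConfig": {
        "tag": "myTenant_REALTIME"
      },
      "replicaGroupPartitionConfig": {
        "replicaGroupBased": false
      }
    }
  }
}
EOF
```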
 **@tingchen:** what do you mean by the diff between LLC and realtime? I
thought LLC is realtime?  
 **@jackie.jxt:** Sorry, LLC and offline  
 **@jackie.jxt:** Performance-wise, 9 servers should be similar to 6 servers
because the server load on the 3 new servers is the same as before  
 **@jackie.jxt:** Do you use partitioning or replica-group routing for this
table?  
 **@yupeng:** we use default  
 **@tingchen:** we want to use replica-group routing for this tenant (right
now it has 12 servers)  
 **@tingchen:** otherwise each query gets fanned out to all 12 now -- so
latency can depend on the slowest server.  
 **@jackie.jxt:** For an LLC table, because of the nature of streaming
partitions, the segments are already assigned into replica-groups  
 **@tingchen:**
```
{
  "tableName": "pinotTable",
  "tableType": "REALTIME",
  "routing": {
    "instanceSelectorType": "replicaGroup"
  },
  ..
}
```
**@jackie.jxt:** Simply enabling replica-group routing should do the trick  
 **@jackie.jxt:** Yes, correct  
 **@tingchen:** so we just need to add the above and restart broker?  
 **@jackie.jxt:** Let me check, I think we have an API to avoid restarting
broker  
 **@jackie.jxt:** You can use the broker rebuild routing API to enable it:
```
@PUT
@Produces(MediaType.TEXT_PLAIN)
@Path("/routing/{tableName}")
```
**@jackie.jxt:** (table name here is the full table name, e.g.
`pinotTable_REALTIME`)  
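
A hedged example of calling that endpoint: the host is a placeholder, 8099 is
the default broker port, and the call would be repeated on each broker:

```
# Rebuild the routing for the realtime table on one broker.
curl -X PUT "http://broker-host:8099/routing/pinotTable_REALTIME"
```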
 **@npawar:** though this won't solve your issue of some servers seeing twice
the load. 1 server in each replica group is still going to have the same
behavior  
 **@jackie.jxt:** I think they already scaled up the cluster to 12 servers  
 **@npawar:** o okay  
 **@tingchen:** yes.  
 **@tingchen:** looks like 6->9 was not a good idea. 6->12 is.  
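
The same placement arithmetic shows why (again assuming 4 partitions and a
replication factor of 3):

```
echo $(( 4 * 3 % 9 ))    # 3 -> with 9 servers, 3 get a 2nd consuming partition
echo $(( 4 * 3 % 12 ))   # 0 -> with 12 servers, each gets exactly 1
```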
 **@yupeng:** Thanks @npawar @jackie.jxt for the help  