Posted to reviews@iotdb.apache.org by GitBox <gi...@apache.org> on 2020/11/03 14:14:15 UTC

[GitHub] [iotdb] JackieTien97 commented on issue #1833: I want to know the detail design for tag inverted index

JackieTien97 commented on issue #1833:
URL: https://github.com/apache/iotdb/issues/1833#issuecomment-721052984


   > Thank you for your detailed description!
   > 
   > From your comment above and the system design article on the IoTDB official website, I think you have done a great job with the timeseries data format to achieve the best compression ratio and query and write performance. But I still doubt whether the current tag index design will work well:
   > 
   > * The tag index is just a hashmap, which could cause many problems. If multiple timeseries are created at once, all of them have to wait for the write lock, which hurts concurrent write performance.
   > * If series or devices are queried in sequential order, a hashmap cannot serve that well.
   > * Have you tested performance when a large number of devices write to the database, say a billion devices? The hashmap may have to expand many times, and the same concern applies to queries that include a where clause.
   > * Keeping the whole index in memory may make the system unstable and require expensive machines. Older versions of InfluxDB kept the entire tag index in memory, which seems to have been a bad attempt (although they still do badly now, too). It also means the MManager component needs a lot of memory. Since IoTDB does not seem to have a distributed implementation yet, this is even worse.
   > * I guess that once the number of devices grows large, it will take IoTDB a long time to restart and recover.
   > * And so on.
   > 
   > In our experience maintaining TSDBs, the tag index is a very important part, and developers and users should design and use it carefully. For example, if a user writes a tag with a long text string, the index uses more memory and system availability drops.
   > 
   > Our company now has a billion devices, and we are planning to test TDengine and IoTDB to verify whether they can meet our requirements. TDengine is written in C, which makes it a little hard for us to maintain, and IoTDB seems to have some problems too (it is not a distributed system, and the index seems simple).
   > 
   > If you have any advice, please tell us, thanks!
   > 
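   > By the way, is it possible to talk directly with your developers? Your replies on GitHub seem a bit slow, and the Apache mailing list is not very active either. The official website mentions DingTalk, but I don't want to use DingTalk.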
   
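   That is right. Our current implementation of the tag inverted index for time series is fairly naive, and it will run into problems with large data volumes.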
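
To make the concern concrete, here is a minimal sketch of what a hashmap-based tag inverted index guarded by a single lock might look like. It is only an illustration and is not taken from the IoTDB code base: the class name `TagInvertedIndex`, its methods, and the example series paths are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical illustration only; this is NOT IoTDB's actual implementation.
class TagInvertedIndex {
    // tag key -> tag value -> sorted set of series paths carrying that tag
    private final Map<String, Map<String, Set<String>>> index = new HashMap<>();
    // One lock for the whole index: every series creation serializes here.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    // Register a series under each of its tags (takes the global write lock).
    void addSeries(String seriesPath, Map<String, String> tags) {
        lock.writeLock().lock();
        try {
            for (Map.Entry<String, String> tag : tags.entrySet()) {
                index.computeIfAbsent(tag.getKey(), k -> new HashMap<>())
                     .computeIfAbsent(tag.getValue(), v -> new TreeSet<>())
                     .add(seriesPath);
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Exact-match lookup; returns a copy so callers never see later mutation.
    Set<String> lookup(String tagKey, String tagValue) {
        lock.readLock().lock();
        try {
            Set<String> result = new TreeSet<>();
            Map<String, Set<String>> byValue = index.get(tagKey);
            if (byValue != null) {
                Set<String> hits = byValue.get(tagValue);
                if (hits != null) {
                    result.addAll(hits);
                }
            }
            return result;
        } finally {
            lock.readLock().unlock();
        }
    }

    public static void main(String[] args) {
        TagInvertedIndex idx = new TagInvertedIndex();
        Map<String, String> tags = new HashMap<>();
        tags.put("region", "north");
        idx.addSeries("root.sg.d1.s1", tags);
        idx.addSeries("root.sg.d2.s1", tags);
        System.out.println(idx.lookup("region", "north"));
    }
}
```

In a structure like this, every `addSeries` call contends on the same write lock, the whole index lives on the heap, and only exact-match lookups are cheap because a `HashMap` keeps no ordering over tag values. Those are the same limitations raised in the list above.
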
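   BTW, our distributed version is currently in internal testing on the cluster_new branch. The corresponding design documents are also available on that branch.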
   
   My WeChat ID is JackieTien. If you have any questions, you will get a faster response on WeChat.

