Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2019/09/26 12:53:00 UTC

[GitHub] [incubator-doris] imay opened a new issue #1891: Release Notes 0.11.0

URL: https://github.com/apache/incubator-doris/issues/1891
 
 
   ## Highlight
   
   ### Storage Engine Refactor
   
   We refactored the storage engine. The main changes are:
   
   1. Refactored BE to clarify the code structure.
   2. A unique id now identifies each rowset.
      Naming a rowset by tablet_id and version led to
      many conflicts among compaction, clone, and restore.
   3. Extracted a rowset interface to encapsulate rowsets
      with different formats.
   
   Because of this work, we can support more storage file formats.
   We are now working on BetaRowset, which will introduce a more
   effective compression method for string types. We will support
   an inverted index based on BetaRowset; it should be released in the
   next version.
   
   ### Support Bitmap
   
   We support a bitmap type and a bitmap union operation on it. Users can
   use this to compute exact distinct counts quickly.
   For example, users can map visitor ids to this type and get the distinct
   number of visitors through the bitmap union operation.
   
   #1610
   #1721
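
To illustrate why a bitmap union gives an exact distinct count, here is a minimal conceptual sketch in Python. It is not Doris's implementation: the helper names `to_bitmap`, `bitmap_union` and `bitmap_count` are chosen for this example only, the bitmap is stored as a plain Python integer, and the visitor ids are made up.

```python
def to_bitmap(ids):
    """Build a bitmap (stored as a Python int) with one bit set per id."""
    bitmap = 0
    for i in ids:
        bitmap |= 1 << i
    return bitmap


def bitmap_union(bitmaps):
    """OR the bitmaps together; an id seen in several inputs stays a single bit."""
    result = 0
    for b in bitmaps:
        result |= b
    return result


def bitmap_count(bitmap):
    """Number of set bits, i.e. the exact number of distinct ids."""
    return bin(bitmap).count("1")


# Hypothetical visitor ids per day; visitors 7 and 42 appear on both days.
day1 = to_bitmap([1, 7, 42])
day2 = to_bitmap([7, 42, 100])
distinct_visitors = bitmap_count(bitmap_union([day1, day2]))
```

Because the union is a bitwise OR, per-day bitmaps can be pre-aggregated and merged later without double counting.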
   
   ### More documents
   
   We unified all types of our documents. Previously, we had to write several
   copies of the documentation for one function. Now only one copy
   needs to be written, and it is used in many places,
   such as our website and our help command.
   
   We also support English documents and an English website in this release.
   
   #1586
   #1518
   #1719
   #1729
   
   ### Support Load Parquet Format
   
   Users can now load Parquet format files through broker load. In addition,
   Doris can extract column content from the file path. This makes
   integration with Spark and Hive easier.
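
As a rough illustration of getting column values from a file path, the Python sketch below parses Hive-style `key=value` directory segments. The path layout, the function name `columns_from_path` and the example path are assumptions made for this example; they are not taken from the Doris code or documentation.

```python
def columns_from_path(path, wanted):
    """Extract the requested `key=value` directory segments from a file path."""
    found = {}
    for segment in path.split("/"):
        key, sep, value = segment.partition("=")
        if sep and key in wanted:
            found[key] = value
    missing = [c for c in wanted if c not in found]
    if missing:
        raise ValueError(f"columns not found in path: {missing}")
    return found


# Hypothetical path written by a Hive or Spark job with two partition columns.
path = "hdfs://example-host/warehouse/tbl/dt=2019-09-26/city=beijing/part-0.parquet"
cols = columns_from_path(path, ["dt", "city"])
```

Values found this way are the same for every row in the file, which is why they can be read from the path instead of the file content.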
   
   ## Enhancement
   
   ### Support Timezone
   
   We now support time zones: users can specify a time zone when querying or loading.
   
   #1587
   #1598
   #1631
   
   ### Add ChunkAllocator
   
   We added a chunk allocator to reduce large memory allocation and free operations,
   which improves performance under highly concurrent requests.
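
The general idea of reusing freed chunks instead of returning them to the system can be sketched as follows. This is a simplified Python model, not the actual ChunkAllocator in BE: the class name `ChunkPool`, the power-of-two size classes and the counter are assumptions made for this example.

```python
class ChunkPool:
    """Reuse freed buffers by power-of-two size class instead of reallocating."""

    def __init__(self):
        self._free = {}              # size class -> list of idle bytearrays
        self.fresh_allocations = 0   # how often a new buffer had to be created

    @staticmethod
    def _size_class(size):
        # Round up to the next power of two (minimum 1).
        return 1 << max(size - 1, 0).bit_length()

    def allocate(self, size):
        cls = self._size_class(size)
        idle = self._free.get(cls)
        if idle:
            return idle.pop()        # reuse a previously freed chunk
        self.fresh_allocations += 1
        return bytearray(cls)

    def free(self, chunk):
        # Keep the chunk for later reuse; its length is already a size class.
        self._free.setdefault(len(chunk), []).append(chunk)


pool = ChunkPool()
first = pool.allocate(3000)    # rounded up to a 4096-byte chunk
pool.free(first)
second = pool.allocate(4000)   # same size class, so the freed chunk is reused
```

Rounding to size classes wastes some space per chunk, but it lets requests of slightly different sizes share the same free list.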
   
   ### Support MIN/MAX for char/varchar
   
   Users can create MIN and MAX aggregate columns of char or varchar type.
   
   ### Others
   
   Remove query status report from BE when query is cancelled normally (#1489)
   Optimize the load performance for large file (#1798)
   Free olap scanner out of lock (#1733)
   Add exchange in MemPool to reduce alloc/free operation (#1732)
   Shuffle partitioned instance to avoid skew (#1744)
   Reduce unnecessary memory allocation and copying in OlapScanNode (#1742)
   Improve LRUCache to get better performance (#1826)
   Split channel close operation into two phase (#1830)
   Make http server and thrift server backlog num configurable (#1638)
   Add timeout on snapshot of data (#1672)
   Make the max recursion depth of distribution pruner configurable (#1709)
   Limit the disk usage to avoid running out of disk capacity (#1702)
   Refactor DateLiteral class in FE (#1644)
   Add limit to show tablet stmt (#1547)
   Unify the msg of 'Memory exceed limit' (#1737)
   Encapsulate HLL logic (#1756)
   Make CpuInfo::get_current_core work (#1773)
   Refactor alter job (#1695)
   Add parallel_exchange_instance_num to set parallel after exchange (#1788)
   Resolve reduce/reduce conflict in our syntax (#1811)
   Limit the max version to cumulative compaction (#1813)
   Check file descriptor number is larger than 65536 upon start (#1819)
   Check buckets limit: buckets > 0 when adding partition (#1855)
   
   ## SQL Support
   
   ### Multiple Columns Partition
   
   When creating a table with the OLAP engine, users can specify multiple partition columns.
   For example:
   ```
   PARTITION BY RANGE(`date`, `id`)
   (
       PARTITION `p201701_1000` VALUES LESS THAN ("2017-02-01", "1000"),
       PARTITION `p201702_2000` VALUES LESS THAN ("2017-03-01", "2000"),
       PARTITION `p201703_all`  VALUES LESS THAN ("2017-04-01")
   )
   ```
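
One way to read the partition bounds above is as tuples that are compared column by column. The Python sketch below routes a `(date, id)` key under that reading. It rests on two assumptions that the notes do not state: each `VALUES LESS THAN` bound is an exclusive upper bound, and an omitted value (as in `p201703_all`) behaves like the minimum value of that column. The names `MIN_ID` and `route` are made up for this example.

```python
import bisect

# Stand-in for the minimum value of the `id` column (assumption of this sketch).
MIN_ID = float("-inf")

# Exclusive upper bounds from the example, in ascending order.
PARTITIONS = [
    ("p201701_1000", ("2017-02-01", 1000)),
    ("p201702_2000", ("2017-03-01", 2000)),
    ("p201703_all", ("2017-04-01", MIN_ID)),
]
UPPER_BOUNDS = [bound for _, bound in PARTITIONS]


def route(date, row_id):
    """Return the partition name for a (date, id) key, or None if out of range."""
    # Tuples compare column by column; ISO dates compare correctly as strings.
    index = bisect.bisect_right(UPPER_BOUNDS, (date, row_id))
    return PARTITIONS[index][0] if index < len(PARTITIONS) else None
```

Under these assumptions the second column only matters when the first column equals a bound: `route("2017-01-15", 5000)` still lands in `p201701_1000`, while `route("2017-02-01", 1000)` moves on to `p201702_2000`.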
   
   ### More Push Down
   
   Support pushing predicates down past aggregation, window, and sort operators.
   This filters data as early as possible, which improves query performance.
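
The effect of pushing a predicate below an aggregation can be shown with a small Python model. It is a conceptual sketch only, not the planner's logic: the rows, the `sum_by_key` helper and the filter are made up, and the rewrite shown is only valid because the predicate refers to the grouping key.

```python
from collections import defaultdict


def sum_by_key(rows):
    """Group (key, value) rows by key and sum the values."""
    totals = defaultdict(int)
    for key, value in rows:
        totals[key] += value
    return dict(totals)


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
wanted = {"a", "b"}

# Plan 1: aggregate every row, then filter the resulting groups.
filter_after = {k: v for k, v in sum_by_key(rows).items() if k in wanted}

# Plan 2: filter rows first (predicate pushed below the aggregation).
filter_before = sum_by_key([r for r in rows if r[0] in wanted])
```

Both plans return the same groups, but the second one never aggregates rows for key `c`, so less data flows through the aggregation.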
   
   ### Others
   
   Support TIME type and timediff function (#1505)
   Show load statement support offset (#1531)
   Fix wrong results from the <=> operator and the IN operator
   Add PreAgg Hint (#1617)
   Bug-fix: error result of union stmt (#1758)
   Fix bug: unknown column from the inline view (#1770)
   Support table comment and column comment for view (#1799)
   Support granting the GRANT privilege on a database or table (#1472)
   Fix bug: Remove conjuncts for empty set node (#1840)
   Add an ALTER operation to change distribution type from RANDOM to HASH (#1823)
   Support cast datetime to decimal (#1849)
   Enable StringLiteral cast to Varchar (#1846)
   Support hll_empty function (#1825)
   Fix NPE error when creating table with bool column (#1864)
   
   ## Load
   
   ### Function and Where in Broker Load
   
   Users can specify a function map and a where clause in Broker Load.
   
   ### Others
   
   Support timeout in stream load (#1480)
   Add more profile for OlapTableSink (#1487)
   Fix the duplicated request bug of mini load (#1504)
   Add more logs and metrics to trace the broker load process (#1530)
   Fix Bug: Load fail when we don't specify format type. (#1538)
   Allow the null default in insert into stmt (#1556)
   Fix Broker load hang when rpc failed (#1567)
   Fix handling of Parquet directories that contain empty files (#1593)
   Support Decimal type when loading Parquet files (#1595)
   Broker load supports function (#1592)
   Keep INSERT SELECT statement semantics consistent with MySQL (#1626) (#1628)
   Support read kafka partition from start (#1642)
   Support checking error data row when doing INSERT (#1597)
   Enable parsing columns from file path for Broker Load (#1582) (#1635)
   Support setting timeout for stream load (#1670)
   Reduce the number of partition info in BrokerScanNode param (#1675)
   Add strict mode in Routine load, Stream load and Mini load (#1677)
   Fix bug that two identical stream load jobs may both execute successfully (#1690)
   Add loaded rows to the SHOW LOAD result (#1686)
   Error check about column which has no default value (#1728)
   Optimize some kinds of load jobs (#1762)
   Commit kafka offset (#1734)
   Fix bug that routine load may mistakenly skip some data (#1832)
   Support setting timezone for stream load and routine load (#1831)
   
   ## Bug Fixes
   
   Fix bug that replicas of a tablet may be located on same host (#1517)
   Fix bug that bad replica can not be synchronized when report (#1634)
   Fix bug that failed to get enough normal replica because path hash is not set. (#1714)
   Take segments in singleton rowset into consideration upon cumulative compaction (#1866)
   Fix BE crash when doing rollup (#1502)
   Fix wrong partition type returned for non-partitioned tables (#1503)
   Fix bug that BE may crash when closing OlapTableSink (#1507)
   Fix bug that WrapperField does not consider HLL column type when creating (#1514)
   Fix variable arguments bug in UDAF (#1523)
   Fix bug that user with LOAD_PRIV can see load job by SHOW LOAD stmt (#1528)
   Fix Bug: Load fail when we don't specify format type. (#1538)
   Fix the null pointer exception when ReplayOnAborted of txn in broker load (#1543)
   Fix bug that makes BE crash when loading HLL type (#1552)
   Fix bug that getting compatible type for TIME with other types fails (#1544)
   Fix bugs of Broker load (#1546)
   Fix bug that a replica cannot be deleted if its version is missing (#1585)
   Fix errors when ES username and passwd are empty (#1601)
   Fix bug that cluster balance may cause load jobs to fail (#1581)
   Fix bug: localtime is not thread-safe; changed to localtime_r (#1614)
   Fix "No more data to read" error encountered when accessing broker (#1621)
   Fix tablet restore api in BE (#1623) (#1624)
   Fix bug: "SHOW DATA" or "SHOW PARTITIONS" shows a DATA-SIZE less than 0 (#1680)
   Fix bug that failed to create a new partition when no partition in a table (#1688)
   Fix bug that the calculation of disk usage percent is wrong (#1791)
   Fix tablet meta tool command argument bug (#1810)
   Seek block when starting a ScanKey (#1828)
   Fix two digit year bug in to_days function (#1839)
   Fix bug: compare column with equals rather than == (#1850)
   Collect scanner's status when es_http_scan_node close (#1861)
   
   ## Thirdparty
   
   Bump thirdparty's BZIP2 version to 1.0.8 (#1559)
   
   # Credits
   
   Thanks to everyone who contributed to this release!
   
   @DDDDDDouble
   @EmmyMiao87
   @HangyuanLiu
   @WingsGo
   @Youngwb
   @acelyc111
   @chaoyli
   @chenhao7253886
   @cquptEthan
   @gaodayue
   @imay
   @kangkaisen
   @kangpinghuang
   @lenmom
   @liutang123
   @lxqfy
   @manannan2017
   @morningman
   @shgxwxl
   @wangbo
   @wkhappy1
   @worker24h
   @wubiaoi
   @wuyunfeng
   @xionglei0
   @xy720
   @yangzhg
   @yiguolei
   @yuanlihan
