Posted to dev@carbondata.apache.org by Raghunandan S <ca...@gmail.com> on 2019/10/25 03:22:08 UTC
[ANNOUNCE] Apache CarbonData 1.6.1 release
Hi All,
The Apache CarbonData community is pleased to announce the release of
version 1.6.1 in The Apache Software Foundation (ASF).
CarbonData is a high-performance data solution that supports various data
analytic scenarios, including BI analysis, ad-hoc SQL query, fast filter
lookup on detail record, streaming analytics, and so on. CarbonData has
been deployed in many enterprise production environments. In one of the
largest scenarios, it supports queries on a single table with 3PB of data
(more than 5 trillion records) with a response time of less than 3 seconds!
We encourage you to use the release
https://dist.apache.org/repos/dist/release/carbondata/1.6.1/, and to send
feedback through the CarbonData user mailing list <us...@carbondata.apache.org>!
This release note provides information on the new features, improvements,
and bug fixes of this release.
What’s New in CarbonData Version 1.6.1?
The intention of CarbonData 1.6.1 was to move closer to unified analytics
and improve stability. In this version of CarbonData, around 40 JIRA
tickets related to improvements and bugs have been resolved. The following
is a summary.
Index Server performance improvements for Full Scan and TPCH Queries
Carbon currently prunes and caches all block/blocklet datamap index
information in the driver. If the cache grows very large (70-80% of the
driver memory), excessive GC in the driver can slow down queries, and the
driver may even run out of memory. Moving the indexes out to a separate
JDBCServer (the Index Server) reduced the overhead on the primary
JDBCServer, but introduced a delay in fetching the bulk list of pruned
blocks from the Index Server. This is improved in this release, and
performance is now the same as running without the Index Server.
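For readers new to the idea, here is a rough, purely illustrative sketch of
min/max pruning served from a separate index holder. It is not CarbonData
code: the names BlockletIndex, prune, IndexServer and get_pruned_blocklets
are hypothetical, and the in-process class only stands in for what would be
a separate server in a real deployment.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockletIndex:
    """Hypothetical min/max index entry for one blocklet (illustration only)."""
    path: str
    min_value: int
    max_value: int


def prune(index: List[BlockletIndex], lo: int, hi: int) -> List[str]:
    """Keep only blocklets whose [min, max] range overlaps the filter [lo, hi]."""
    return [b.path for b in index if b.max_value >= lo and b.min_value <= hi]


class IndexServer:
    """Stand-in for a separate process that owns the index cache.

    The driver asks it for the pruned list instead of caching every
    index entry in its own memory.
    """

    def __init__(self, index: List[BlockletIndex]) -> None:
        self._index = list(index)

    def get_pruned_blocklets(self, lo: int, hi: int) -> List[str]:
        return prune(self._index, lo, hi)


# Made-up index entries for three blocklets of one table.
index = [
    BlockletIndex("part-0", 0, 99),
    BlockletIndex("part-1", 100, 199),
    BlockletIndex("part-2", 200, 299),
]
server = IndexServer(index)

# A filter such as "col BETWEEN 150 AND 220" only needs two of the blocklets.
selected = server.get_pruned_blocklets(150, 220)
print(selected)
```

The point of the sketch is only the division of labour: the index entries
live with the server, and the driver receives just the pruned list, which is
the transfer whose delay this release addresses.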
Behaviour Change
None
Please find the detailed JIRA list:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12320220&version=12345993
Sub-task
- [CARBONDATA-3454
<https://issues.apache.org/jira/browse/CARBONDATA-3454>] - Optimize the
performance of select count(*) for index server
- [CARBONDATA-3462
<https://issues.apache.org/jira/browse/CARBONDATA-3462>] - Add usage and
deployment document for index server
Bug
- [CARBONDATA-3452
<https://issues.apache.org/jira/browse/CARBONDATA-3452>] - select query
failure when substring on dictionary column with join
- [CARBONDATA-3474
<https://issues.apache.org/jira/browse/CARBONDATA-3474>] - Fix validate
mvQuery having filter expression and correct error message
- [CARBONDATA-3476
<https://issues.apache.org/jira/browse/CARBONDATA-3476>] - Read time and
scan time stats shown wrong in executor log for filter query
- [CARBONDATA-3477
<https://issues.apache.org/jira/browse/CARBONDATA-3477>] - Throw out
exception when use sql: 'update table select\n...'
- [CARBONDATA-3478
<https://issues.apache.org/jira/browse/CARBONDATA-3478>] - Fix
ArrayIndexOutOfBoundsException issue on compaction after alter rename
operation
- [CARBONDATA-3480
<https://issues.apache.org/jira/browse/CARBONDATA-3480>] - Remove
Modified MDT and make relation refresh only when schema file is modified.
- [CARBONDATA-3481
<https://issues.apache.org/jira/browse/CARBONDATA-3481>] - Multi-thread
pruning fails when datamaps count is just near numOfThreadsForPruning
- [CARBONDATA-3482
<https://issues.apache.org/jira/browse/CARBONDATA-3482>] - Null pointer
exception when concurrent select queries are executed from different
beeline terminals.
- [CARBONDATA-3483
<https://issues.apache.org/jira/browse/CARBONDATA-3483>] - Can not run
horizontal compaction when execute update sql
- [CARBONDATA-3485
<https://issues.apache.org/jira/browse/CARBONDATA-3485>] - data loading
is failed from S3 to hdfs table having ~2K carbonfiles
- [CARBONDATA-3486
<https://issues.apache.org/jira/browse/CARBONDATA-3486>] -
Serialization/ deserialization issue with Datatype
- [CARBONDATA-3487
<https://issues.apache.org/jira/browse/CARBONDATA-3487>] - wrong Input
metrics (size/record) displayed in spark UI during insert into
- [CARBONDATA-3490
<https://issues.apache.org/jira/browse/CARBONDATA-3490>] - Concurrent
data load failure with carbondata FileNotFound exception
- [CARBONDATA-3493
<https://issues.apache.org/jira/browse/CARBONDATA-3493>] - Carbon query
fails when enable.query.statistics is true in specific scenario.
- [CARBONDATA-3494
<https://issues.apache.org/jira/browse/CARBONDATA-3494>] - Nullpointer
exception in case of drop table
- [CARBONDATA-3495
<https://issues.apache.org/jira/browse/CARBONDATA-3495>] - Insert into
Complex data type of Binary fails with Carbon & SparkFileFormat
- [CARBONDATA-3499
<https://issues.apache.org/jira/browse/CARBONDATA-3499>] - Fix insert
failure with customFileProvider
- [CARBONDATA-3502
<https://issues.apache.org/jira/browse/CARBONDATA-3502>] - Select query
fails with UDF having Match expression inside IN expression
- [CARBONDATA-3505
<https://issues.apache.org/jira/browse/CARBONDATA-3505>] - Fixed drop
database cascade issue when 2 database point to same location.
- [CARBONDATA-3506
<https://issues.apache.org/jira/browse/CARBONDATA-3506>] - Alter table
add, drop, rename and datatype change fails with hive compatible property
- [CARBONDATA-3507
<https://issues.apache.org/jira/browse/CARBONDATA-3507>] - Create Table
As Select Fails in Spark-2.3
- [CARBONDATA-3508
<https://issues.apache.org/jira/browse/CARBONDATA-3508>] - Select query
fails when the cg datamap is dropped concurrently while running the select
query on filter column on which datamap is created
- [CARBONDATA-3513
<https://issues.apache.org/jira/browse/CARBONDATA-3513>] - can not run
major compaction when using hive partition table
- [CARBONDATA-3520
<https://issues.apache.org/jira/browse/CARBONDATA-3520>] - CTAS should
fail if select query contains duplicate columns
- [CARBONDATA-3526
<https://issues.apache.org/jira/browse/CARBONDATA-3526>] - Cache issue
and select query failure with multiple updates
- [CARBONDATA-3527
<https://issues.apache.org/jira/browse/CARBONDATA-3527>] - Throw 'String
length cannot exceed 32000 characters' exception when load data with
'GLOBAL_SORT' from csv which include big complex type data
Improvement
- [CARBONDATA-3488
<https://issues.apache.org/jira/browse/CARBONDATA-3488>] - Check the
file size after move local file to carbon path
- [CARBONDATA-3489
<https://issues.apache.org/jira/browse/CARBONDATA-3489>] - Optimizing
the performance of sorting
- [CARBONDATA-3491
<https://issues.apache.org/jira/browse/CARBONDATA-3491>] - Return
updated/deleted rows count when execute update/delete sql
- [CARBONDATA-3501
<https://issues.apache.org/jira/browse/CARBONDATA-3501>] - Support to
execute update sql on table with long_string field (Not update long_string
field)
- [CARBONDATA-3511
<https://issues.apache.org/jira/browse/CARBONDATA-3511>] - Query time
improvement by reducing the number of NameNode calls while having
carbonindex files in the store
- [CARBONDATA-3515
<https://issues.apache.org/jira/browse/CARBONDATA-3515>] - Limit local
dictionary size to 10% of allowed blocklet size
- [CARBONDATA-3523
<https://issues.apache.org/jira/browse/CARBONDATA-3523>] - Should store
file size into index file
- [CARBONDATA-3524
<https://issues.apache.org/jira/browse/CARBONDATA-3524>] - support
compaction by GLOBAL_SORT
- [CARBONDATA-3528
<https://issues.apache.org/jira/browse/CARBONDATA-3528>] - refactor java
checkstyle rules
- [CARBONDATA-3540
<https://issues.apache.org/jira/browse/CARBONDATA-3540>] - Delete all
external segments when dropping table
- [CARBONDATA-3544
<https://issues.apache.org/jira/browse/CARBONDATA-3544>] - CLI should
support an option to show statistics for all columns