Posted to dev@carbondata.apache.org by xm_zzc <44...@qq.com> on 2018/10/16 02:07:27 UTC
java.lang.NegativeArraySizeException occurred when compact
Hi:
I encountered a 'java.lang.NegativeArraySizeException' error with CarbonData
1.3.1 + Spark 2.2.
When I ran the compact command to compact 8 level-1 segments into a level-2
segment, the following 'java.lang.NegativeArraySizeException' error occurred:
java.lang.NegativeArraySizeException
    at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimesionDataChunkStore.getRow(UnsafeVariableLengthDimesionDataChunkStore.java:172)
    at org.apache.carbondata.core.datastore.chunk.impl.AbstractDimensionDataChunk.getChunkData(AbstractDimensionDataChunk.java:46)
    at org.apache.carbondata.core.scan.result.AbstractScannedResult.getNoDictionaryKeyArray(AbstractScannedResult.java:431)
    at org.apache.carbondata.core.scan.result.impl.NonFilterQueryScannedResult.getNoDictionaryKeyArray(NonFilterQueryScannedResult.java:67)
    at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.scanResultAndGetData(RawBasedResultCollector.java:83)
    at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.collectData(RawBasedResultCollector.java:58)
    at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51)
    at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
    at org.apache.carbondata.core.scan.result.iterator.RawResultIterator.hasNext(RawResultIterator.java:72)
    at org.apache.carbondata.processing.merger.RowResultMergerProcessor.execute(RowResultMergerProcessor.java:131)
    at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:228)
    at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:84)
    at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
I traced the code of 'UnsafeVariableLengthDimesionDataChunkStore.getRow' and
found that the root cause is that the value of length is negative when the
byte array is created: 'byte[] data = new byte[length];'. The values of some
parameters when the error occurred are below.
When 'rowId < numberOfRows - 1':
this.dataLength=192000
currentDataOffset=2
rowId=0
OffsetOfNextdata=-12173 (why?)
length=-12177
Otherwise:
this.dataLength=320000
currentDataOffset=263702
rowId=31999
length=-9238
The value of (320000 - 263702) is 56298, which exceeds the range of a short.
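The reported values are consistent with the length being narrowed to a 2-byte short before the array is allocated. The following is a minimal sketch of that arithmetic only. The class and method names are hypothetical, and the cast to short and the 2-byte length prefix are assumptions inferred from the numbers above, not a quote of the CarbonData source.

```java
public class ShortLengthOverflow {
    // Assumed size of the length prefix stored before each value (2 bytes).
    static final int SHORT_SIZE_IN_BYTES = 2;

    // Assumption: the last row's length is (dataLength - currentDataOffset), narrowed to short.
    static short lengthForLastRow(int dataLength, int currentDataOffset) {
        return (short) (dataLength - currentDataOffset);
    }

    // Assumption: an inner row's length is the gap to the next offset minus the length prefix.
    static short lengthForInnerRow(int offsetOfNextData, int currentDataOffset) {
        return (short) (offsetOfNextData - (currentDataOffset + SHORT_SIZE_IN_BYTES));
    }

    public static void main(String[] args) {
        // Last row: 320000 - 263702 = 56298, which does not fit in a short (max 32767).
        System.out.println(lengthForLastRow(320000, 263702)); // -9238
        // Inner row: the next offset is already negative, so the length is negative too.
        System.out.println(lengthForInnerRow(-12173, 2));     // -12177
        try {
            byte[] data = new byte[lengthForLastRow(320000, 263702)];
            System.out.println(data.length);
        } catch (NegativeArraySizeException e) {
            System.out.println("NegativeArraySizeException");
        }
    }
}
```

Both computed values match the lengths reported above (-9238 and -12177), which is why a short overflow looks like a plausible explanation for the second case.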
I applied the patch from PR#2796 (https://github.com/apache/carbondata/pull/2796), but
the error still occurred.
Finally, my test steps were as follows. For example, with 4 level-1 compacted
segments 1.1, 2.1, 3.1, 4.1:
1. run the compact command; it failed;
2. delete segment 1.1 and run the compact command again; it failed;
3. delete segment 2.1 and run the compact command again; it failed;
4. delete segment 3.1 and run the compact command again; it succeeded.
So I think one of the 8 level-1 compacted segments may have a problem,
but I don't know how to find out which one.
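One way to narrow this down is to read each segment on its own and record which ones fail, instead of deleting them one at a time. The sketch below shows only that loop. The class name, the `readFully` callback, and the simulated failure of segment 3.1 are all hypothetical stand-ins; a real check would scan the segment's files with whatever reader is available.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

public class SegmentIsolation {
    // Runs the supplied full-read check on each segment alone and collects
    // the ids of the segments whose check throws.
    static List<String> findUnreadable(List<String> segmentIds, Consumer<String> readFully) {
        List<String> bad = new ArrayList<>();
        for (String id : segmentIds) {
            try {
                readFully.accept(id);
            } catch (RuntimeException e) {
                bad.add(id);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        List<String> segments = Arrays.asList("1.1", "2.1", "3.1", "4.1");
        // Hypothetical stand-in for a real scan of one segment's files;
        // segment 3.1 is made to fail here purely for illustration.
        Consumer<String> readFully = id -> {
            if (id.equals("3.1")) {
                throw new IllegalStateException("cannot decode segment " + id);
            }
        };
        System.out.println(findUnreadable(segments, readFully)); // [3.1]
    }
}
```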
--
Sent from: http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by xm_zzc <44...@qq.com>.
Hi Kunal Kapoor, Babu:
My streaming app has run for a few days and the issue no longer occurs, so
PR#2796 works. Thanks.
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by Kunal Kapoor <ku...@gmail.com>.
Okay, sure. Please let us know the result.
On Thu, Oct 18, 2018, 11:01 AM xm_zzc <44...@qq.com> wrote:
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by xm_zzc <44...@qq.com>.
Hi Kunal Kapoor:
I have patched PR#2796 into 1.3.1 and am running the streaming app again. This
issue does not happen often, so I will run it for a few days to check whether
the patch works.
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by Kunal Kapoor <ku...@gmail.com>.
Right.
Can you try to write the segments after cherry picking #2796?
On Wed, Oct 17, 2018 at 3:21 PM xm_zzc <44...@qq.com> wrote:
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by xm_zzc <44...@qq.com>.
Hi Kunal Kapoor:
1. No;
2. No, the query fails. I also used the Carbon SDK Reader to read the bad
segment, and it failed too.
3. The schema of the table:
| rt | string
| timestamp_1min | bigint
| timestamp_5min | bigint
| timestamp_1hour | bigint
| customer_id | bigint
| transport_id | bigint
| transport_code | string
| tcp_udp | int
| pre_hdt_id | string
| hdt_id | string
| status | int
| is_end_user | int
| transport_type | string
| transport_type_nam | string
| fcip | string
| host | string
| cip | string
| code | int
| conn_status | int
| recv | bigint
| send | bigint
| msec | bigint
| dst_prefix | string
| next_type | int
| next | string
| hdt_sid | string
| from_endpoint_type | int
| to_endpoint_type | int
| fcip_view | string
| fcip_country | string
| fcip_province | string
| fcip_city | string
| fcip_longitude | string
| fcip_latitude | string
| fcip_node_name | string
| fcip_node_name_cn | string
| host_view | string
| host_country | string
| host_province | string
| host_city | string
| host_longitude | string
| host_latitude | string
| cip_view | string
| cip_country | string
| cip_province | string
| cip_city | string
| cip_longitude | string
| cip_latitude | string
| cip_node_name | string
| cip_node_name_cn | string
| dtp_send | string
| client_port | int
| server_ip | string
| server_port | int
| state | string
| response_code | int
| access_domain | string
| valid | int
| min_batch_time | bigint
| update_time | bigint
| |
| ##Detailed Table Information |
| Database Name | hdt_sys
| Table Name | transport_access_log
| CARBON Store Path | hdfs://hdtcluster/carbon_store
| Comment |
| Table Block Size | 512 MB
| Table Data Size | 777031634135
| Table Index Size | 72894232
| Last Update Time | 1539769299990
| SORT_SCOPE | local_sort
| Streaming | true
| MAJOR_COMPACTION_SIZE | 4096
| AUTO_LOAD_MERGE | true
| COMPACTION_LEVEL_THRESHOLD | 2,8
| |
| ##Detailed Column property |
| ADAPTIVE |
| SORT_COLUMNS | is_end_user,status,customer_id,access_domain,transport_id,timestamp_1hour,timestamp_1min,conn_status,pre_hdt_id,tcp_udp,transport_code,fcip,cip
I think the data may have been written incorrectly when the segment was
generated, because of this issue: the MemoryBlock is cleaned by some other
thread, which results in wrong data being written.
I have now deleted the bad segment, applied PR#2796, and will continue to
run the streaming app. If the exception no longer occurs, that indicates that
PR#2796 works. Right?
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by Kunal Kapoor <ku...@gmail.com>.
Hi,
I have a few questions regarding this exception:
1. Does the table have any string columns in which the length of the data
exceeds 32k characters?
2. Are you able to query (select *) the table successfully?
3. Can you share the schema of the table?
Meanwhile, I am looking into the possibility of some other thread clearing
the MemoryBlock.
Regards
Kunal Kapoor
On Tue, Oct 16, 2018 at 1:31 PM xm_zzc <44...@qq.com> wrote:
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by xm_zzc <44...@qq.com>.
Hi Babu:
Thanks for your reply.
I set enable.unsafe.in.query.processing=false and
enable.unsafe.columnpage=false, and the test still failed.
I think the issue I met is not related to the MemoryBlock being cleaned by
some other thread. Following the test steps I mentioned above, I copied the
bad segment and used SDKReader to read its data. It failed too, with the
following error message:
java.lang.RuntimeException: java.lang.IllegalArgumentException
    at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkWithOutCache(DimensionRawColumnChunk.java:120)
    at org.apache.carbondata.core.scan.result.BlockletScannedResult.fillDataChunks(BlockletScannedResult.java:355)
    at org.apache.carbondata.core.scan.result.BlockletScannedResult.hasNext(BlockletScannedResult.java:559)
    at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:137)
    at org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:109)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
    at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:1)
    at org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:58)
    at org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:104)
    at org.apache.carbondata.sdk.file.CarbonReader.hasNext(CarbonReader.java:71)
    at cn.xm.zzc.carbonsdktest.CarbonSDKTest.main(CarbonSDKTest.java:68)
Caused by: java.lang.IllegalArgumentException
    at java.nio.Buffer.position(Buffer.java:244)
    at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimensionDataChunkStore.putArray(UnsafeVariableLengthDimensionDataChunkStore.java:97)
    at org.apache.carbondata.core.datastore.chunk.impl.VariableLengthDimensionColumnPage.<init>(VariableLengthDimensionColumnPage.java:58)
    at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.CompressedDimensionChunkFileBasedReaderV3.decodeDimensionLegacy(CompressedDimensionChunkFileBasedReaderV3.java:325)
    at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.CompressedDimensionChunkFileBasedReaderV3.decodeDimension(CompressedDimensionChunkFileBasedReaderV3.java:266)
    at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.CompressedDimensionChunkFileBasedReaderV3.decodeColumnPage(CompressedDimensionChunkFileBasedReaderV3.java:224)
    at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkWithOutCache(DimensionRawColumnChunk.java:118)
    ... 11 more
When the error occurred, the values of some parameters in
UnsafeVariableLengthDimensionDataChunkStore.putArray were as follows:
buffer.limit=192000
buffer.cap=192000
startOffset=300289
numberOfRows=32000
this.dataPointersOffsets=288000
startOffset is larger than buffer.limit, so the error occurred.
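This matches the documented behaviour of java.nio.Buffer.position, which throws IllegalArgumentException when the new position is larger than the current limit. A minimal sketch using the values reported above follows; the class and method names are hypothetical, and the buffer is a plain heap buffer rather than the one CarbonData builds.

```java
import java.nio.ByteBuffer;

public class BufferPositionCheck {
    // Returns true when moving a fresh buffer of the given capacity to startOffset is rejected.
    static boolean positionRejected(int capacity, int startOffset) {
        ByteBuffer buffer = ByteBuffer.allocate(capacity); // limit == capacity after allocate
        try {
            buffer.position(startOffset);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Reported values: limit 192000, startOffset 300289.
        System.out.println(positionRejected(192000, 300289)); // true
        // An offset inside the buffer is accepted.
        System.out.println(positionRejected(192000, 1000));   // false
    }
}
```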
Re: java.lang.NegativeArraySizeException occurred when compact
Posted by BabuLal <ba...@gmail.com>.
Hi:
It seems that the MemoryBlock is cleaned by some other thread. I will
investigate this. Meanwhile, you can continue by setting the parameters below
in carbon.properties:
enable.unsafe.in.query.processing=false
enable.unsafe.columnpage=false
Thanks
Babu