Posted to dev@carbondata.apache.org by xm_zzc <44...@qq.com> on 2018/10/16 02:07:27 UTC

java.lang.NegativeArraySizeException occurred when compact

Hi:
  I encountered a 'java.lang.NegativeArraySizeException' error with CarbonData
1.3.1 + Spark 2.2.
  When I ran the compact command to compact 8 level-1 segments into a level-2
segment, the following 'java.lang.NegativeArraySizeException' error occurred:
java.lang.NegativeArraySizeException
        at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimesionDataChunkStore.getRow(UnsafeVariableLengthDimesionDataChunkStore.java:172)
        at org.apache.carbondata.core.datastore.chunk.impl.AbstractDimensionDataChunk.getChunkData(AbstractDimensionDataChunk.java:46)
        at org.apache.carbondata.core.scan.result.AbstractScannedResult.getNoDictionaryKeyArray(AbstractScannedResult.java:431)
        at org.apache.carbondata.core.scan.result.impl.NonFilterQueryScannedResult.getNoDictionaryKeyArray(NonFilterQueryScannedResult.java:67)
        at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.scanResultAndGetData(RawBasedResultCollector.java:83)
        at org.apache.carbondata.core.scan.collector.impl.RawBasedResultCollector.collectData(RawBasedResultCollector.java:58)
        at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:51)
        at org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.next(DataBlockIteratorImpl.java:32)
        at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
        at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
        at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:31)
        at org.apache.carbondata.core.scan.result.iterator.RawResultIterator.hasNext(RawResultIterator.java:72)
        at org.apache.carbondata.processing.merger.RowResultMergerProcessor.execute(RowResultMergerProcessor.java:131)
        at org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.<init>(CarbonMergerRDD.scala:228)
        at org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:84)
        at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
        at org.apache.spark.scheduler.Task.run(Task.scala:109)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

I traced the code of 'UnsafeVariableLengthDimesionDataChunkStore.getRow' and
found that the root cause is that the value of 'length' is negative when the
byte array is created: 'byte[] data = new byte[length];'. The values of some
parameters when the error occurred are below:

when 'rowId < numberOfRows - 1':
this.dataLength=192000
currentDataOffset=2
rowId=0
OffsetOfNextdata=-12173  (why?)
length=-12177

otherwise:

this.dataLength=320000
currentDataOffset=263702
rowId=31999
length=-9238

The difference (320000 - 263702) = 56298, which exceeds the range of a signed
short (Short.MAX_VALUE is 32767).
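
For illustration only (this is plain Java arithmetic, not the CarbonData
source; the demo class name is made up): an offset or length larger than 32767
wraps to a negative number when it is stored or read back as a signed 16-bit
short, which matches both negative values reported above.

public class ShortOverflowDemo {
  public static void main(String[] args) {
    // values observed when the error occurred
    int dataLength = 320000;
    int currentDataOffset = 263702;

    int length = dataLength - currentDataOffset;    // 56298 > Short.MAX_VALUE (32767)
    System.out.println((short) length);             // prints -9238, the reported 'length'

    // likewise, an offset of 53363 (> 32767) read back as a signed short:
    System.out.println((short) 53363);              // prints -12173, the reported 'OffsetOfNextdata'
  }
}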

I patched PR#2796 (https://github.com/apache/carbondata/pull/2796), but the
error still occurred.

Finally, my test steps were:

For example, there were 4 level-1 compacted segments: 1.1, 2.1, 3.1, 4.1:
1. run the compact command, it failed;
2. delete segment 1.1, run the compact command again, it failed;
3. delete segment 2.1, run the compact command again, it failed;
4. delete segment 3.1, run the compact command again, it succeeded;

So I think one of the 8 level-1 compacted segments may have some problem, but I
don't know how to find out which one.
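
One way to probe a single segment is to copy it out and scan it with the
Carbon SDK reader; this is only a rough sketch (the path and the '_temp' table
name are placeholders, and the builder API is assumed from the SDK guide):

import org.apache.carbondata.sdk.file.CarbonReader;

public class SegmentProbe {
  public static void main(String[] args) throws Exception {
    // hypothetical location of one copied segment; adjust to your store layout
    String segmentPath = "/tmp/copied_segment";
    CarbonReader reader = CarbonReader.builder(segmentPath, "_temp").build();
    long rows = 0;
    try {
      // hasNext()/readNextRow() throw once a corrupt blocklet is decoded
      while (reader.hasNext()) {
        reader.readNextRow();
        rows++;
      }
      System.out.println("segment looks OK, rows=" + rows);
    } catch (Exception e) {
      System.out.println("segment is corrupt, failed after " + rows + " rows: " + e);
    } finally {
      reader.close();
    }
  }
}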




Re: java.lang.NegativeArraySizeException occurred when compact

Posted by xm_zzc <44...@qq.com>.
Hi Kunal Kapoor, Babu:
  My stream app has run for a few days and the issue no longer occurs; PR#2796
works. Thanks.




Re: java.lang.NegativeArraySizeException occurred when compact

Posted by Kunal Kapoor <ku...@gmail.com>.
Okay, sure. Please let us know the result.

On Thu, Oct 18, 2018, 11:01 AM xm_zzc <44...@qq.com> wrote:

> Hi Kunal Kapoor:
>   I have patched PR#2796 into 1.3.1 and run the stream app again. This issue
> does not happen often, so I will run it for a few days to check whether the
> patch works.
>

Re: java.lang.NegativeArraySizeException occurred when compact

Posted by xm_zzc <44...@qq.com>.
Hi Kunal Kapoor: 
  I have patched PR#2796 into 1.3.1 and run the stream app again. This issue
does not happen often, so I will run it for a few days to check whether the
patch works.




Re: java.lang.NegativeArraySizeException occurred when compact

Posted by Kunal Kapoor <ku...@gmail.com>.
Right.
Can you try to write the segments after cherry-picking #2796?

On Wed, Oct 17, 2018 at 3:21 PM xm_zzc <44...@qq.com> wrote:


Re: java.lang.NegativeArraySizeException occurred when compact

Posted by xm_zzc <44...@qq.com>.
Hi Kunal Kapoor:
  1.  No;
  2.  The query is unsuccessful; I used the Carbon SDK Reader to read that
wrong segment and it failed too.
  3.  The schema of the table:

| rt                                    | string                            
| timestamp_1min                        | bigint                            
| timestamp_5min                        | bigint                            
| timestamp_1hour                       | bigint                            
| customer_id                           | bigint                            
| transport_id                          | bigint                            
| transport_code                        | string                            
| tcp_udp                               | int                               
| pre_hdt_id                            | string                            
| hdt_id                                | string                            
| status                                | int                               
| is_end_user                           | int                               
| transport_type                        | string                            
| transport_type_nam                    | string                            
| fcip                                  | string                            
| host                                  | string                            
| cip                                   | string                            
| code                                  | int                               
| conn_status                           | int                               
| recv                                  | bigint                            
| send                                  | bigint                            
| msec                                  | bigint                            
| dst_prefix                            | string                            
| next_type                             | int                               
| next                                  | string                            
| hdt_sid                               | string                            
| from_endpoint_type                    | int                               
| to_endpoint_type                      | int                               
| fcip_view                             | string                            
| fcip_country                          | string                            
| fcip_province                         | string                            
| fcip_city                             | string                            
| fcip_longitude                        | string                            
| fcip_latitude                         | string                            
| fcip_node_name                        | string                            
| fcip_node_name_cn                     | string                            
| host_view                             | string                            
| host_country                          | string                            
| host_province                         | string                            
| host_city                             | string                            
| host_longitude                        | string                            
| host_latitude                         | string                            
| cip_view                              | string                            
| cip_country                           | string                            
| cip_province                          | string                            
| cip_city                              | string                            
| cip_longitude                         | string                            
| cip_latitude                          | string                            
| cip_node_name                         | string                            
| cip_node_name_cn                      | string                            
| dtp_send                              | string                            
| client_port                           | int                               
| server_ip                             | string                            
| server_port                           | int                               
| state                                 | string                            
| response_code                         | int                               
| access_domain                         | string                            
| valid                                 | int                               
| min_batch_time                        | bigint                            
| update_time                           | bigint                            
|                                       |                                   
| ##Detailed Table Information          |                                   
| Database Name                         | hdt_sys                           
| Table Name                            | transport_access_log              
| CARBON Store Path                     | hdfs://hdtcluster/carbon_store    
| Comment                               |                                   
| Table Block Size                      | 512 MB                            
| Table Data Size                       | 777031634135                      
| Table Index Size                      | 72894232                          
| Last Update Time                      | 1539769299990                     
| SORT_SCOPE                            | local_sort                        
| Streaming                             | true                              
| MAJOR_COMPACTION_SIZE                 | 4096                              
| AUTO_LOAD_MERGE                       | true                              
| COMPACTION_LEVEL_THRESHOLD            | 2,8                               
|                                       |                                   
| ##Detailed Column property            |                                   
| ADAPTIVE                              |                                   
| SORT_COLUMNS                          | is_end_user,status,customer_id,access_domain,transport_id,timestamp_1hour,timestamp_1min,conn_status,pre_hdt_id,tcp_udp,transport_code,fcip,cip


I think maybe the data were written wrongly when the segment was generated,
because of the suspected issue: a MemoryBlock is cleaned by some other thread,
resulting in wrong data being written.
Now I have deleted the wrong segment, patched in PR#2796, and will continue to
run the stream app. If the exception no longer occurs, that proves PR#2796
works. Right?




Re: java.lang.NegativeArraySizeException occurred when compact

Posted by Kunal Kapoor <ku...@gmail.com>.
Hi,
I have a few questions regarding this exception:
1. Does the table have a string column for which the length of the data
exceeds 32k characters?
2. Are you able to query (select *) on the table successfully?
3. Can you share the schema of the table?

Meanwhile I am looking into the possibility of some other thread clearing
the MemoryBlock.

Regards
Kunal Kapoor

On Tue, Oct 16, 2018 at 1:31 PM xm_zzc <44...@qq.com> wrote:


Re: java.lang.NegativeArraySizeException occurred when compact

Posted by xm_zzc <44...@qq.com>.
Hi Babu:
  Thanks for your reply.
  I set enable.unsafe.in.query.processing=false and
enable.unsafe.columnpage=false, and the test still failed.
  I think the issue I hit is not related to a MemoryBlock being cleaned by
some other thread. As in the test steps I mentioned above, I copied the wrong
segment and used the SDK Reader to read its data; it failed too. The error
message follows:
java.lang.RuntimeException: java.lang.IllegalArgumentException
	at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkWithOutCache(DimensionRawColumnChunk.java:120)
	at org.apache.carbondata.core.scan.result.BlockletScannedResult.fillDataChunks(BlockletScannedResult.java:355)
	at org.apache.carbondata.core.scan.result.BlockletScannedResult.hasNext(BlockletScannedResult.java:559)
	at org.apache.carbondata.core.scan.collector.impl.DictionaryBasedResultCollector.collectResultInRow(DictionaryBasedResultCollector.java:137)
	at org.apache.carbondata.core.scan.processor.DataBlockIterator.next(DataBlockIterator.java:109)
	at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.getBatchResult(DetailQueryResultIterator.java:49)
	at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:41)
	at org.apache.carbondata.core.scan.result.iterator.DetailQueryResultIterator.next(DetailQueryResultIterator.java:1)
	at org.apache.carbondata.core.scan.result.iterator.ChunkRowIterator.hasNext(ChunkRowIterator.java:58)
	at org.apache.carbondata.hadoop.CarbonRecordReader.nextKeyValue(CarbonRecordReader.java:104)
	at org.apache.carbondata.sdk.file.CarbonReader.hasNext(CarbonReader.java:71)
	at cn.xm.zzc.carbonsdktest.CarbonSDKTest.main(CarbonSDKTest.java:68)
Caused by: java.lang.IllegalArgumentException
	at java.nio.Buffer.position(Buffer.java:244)
	at org.apache.carbondata.core.datastore.chunk.store.impl.unsafe.UnsafeVariableLengthDimensionDataChunkStore.putArray(UnsafeVariableLengthDimensionDataChunkStore.java:97)
	at org.apache.carbondata.core.datastore.chunk.impl.VariableLengthDimensionColumnPage.<init>(VariableLengthDimensionColumnPage.java:58)
	at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.CompressedDimensionChunkFileBasedReaderV3.decodeDimensionLegacy(CompressedDimensionChunkFileBasedReaderV3.java:325)
	at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.CompressedDimensionChunkFileBasedReaderV3.decodeDimension(CompressedDimensionChunkFileBasedReaderV3.java:266)
	at org.apache.carbondata.core.datastore.chunk.reader.dimension.v3.CompressedDimensionChunkFileBasedReaderV3.decodeColumnPage(CompressedDimensionChunkFileBasedReaderV3.java:224)
	at org.apache.carbondata.core.datastore.chunk.impl.DimensionRawColumnChunk.convertToDimColDataChunkWithOutCache(DimensionRawColumnChunk.java:118)
	... 11 more

When the error occurred, the values of some parameters in
UnsafeVariableLengthDimensionDataChunkStore.putArray were as follows:

buffer.limit=192000
buffer.cap=192000
startOffset=300289
numberOfRows=32000
this.dataPointersOffsets=288000

startOffset is bigger than buffer.limit, so the error occurred.
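
For illustration (this is plain JDK behavior, not CarbonData code; the demo
class name is made up): java.nio.Buffer.position rejects any position beyond
the buffer's limit, which is exactly what happens when a corrupt offset points
past the end of the 192000-byte page.

import java.nio.ByteBuffer;

public class BufferPositionDemo {
  public static void main(String[] args) {
    // values observed when the error occurred
    ByteBuffer buffer = ByteBuffer.allocate(192000);  // limit == capacity == 192000
    buffer.position(300289);  // startOffset > limit -> java.lang.IllegalArgumentException
  }
}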





Re: java.lang.NegativeArraySizeException occurred when compact

Posted by BabuLal <ba...@gmail.com>.
Hi:
It seems that the MemoryBlock is cleaned by some other thread. I will
investigate this; meanwhile you can continue by setting the parameters below in
carbon.properties.

enable.unsafe.in.query.processing=false
enable.unsafe.columnpage=false
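
For reference, a minimal sketch of setting the same switches programmatically
(assuming the CarbonProperties API in org.apache.carbondata.core.util; this
must run before the first Carbon query is executed):

import org.apache.carbondata.core.util.CarbonProperties;

public class DisableUnsafe {
  public static void main(String[] args) {
    // equivalent to the two carbon.properties entries above
    CarbonProperties.getInstance()
        .addProperty("enable.unsafe.in.query.processing", "false");
    CarbonProperties.getInstance()
        .addProperty("enable.unsafe.columnpage", "false");
  }
}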


Thanks
Babu


