Posted to issues@carbondata.apache.org by "Jarck (JIRA)" <ji...@apache.org> on 2017/03/06 02:08:32 UTC

[jira] [Created] (CARBONDATA-748) "between and" filter query is very slow

Jarck created CARBONDATA-748:
--------------------------------

             Summary: "between and" filter query is very slow
                 Key: CARBONDATA-748
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-748
             Project: CarbonData
          Issue Type: Improvement
            Reporter: Jarck


Hi,

Currently, for include and exclude filters, a linear search is performed
when the dimension column does not have an inverted index. We can add a
binary search when the data for that column is sorted. To find out, we can
check in the carbon table whether the user selected "no inverted index" for
that column. If the user selected no inverted index when creating the
column, the current code is fine; otherwise the data will be sorted, so we
can add a binary search, which will improve performance.
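
A minimal sketch of the binary search suggested above, not the CarbonData implementation. It assumes the column data is a flat byte array of fixed-width entries sorted in unsigned byte order, and that the row position can be used directly as the bit index. The names SortedChunkFilter, compareRow, lowerBound and filterSorted are hypothetical.

```java
import java.util.BitSet;

// Hypothetical helper, not part of CarbonData.
final class SortedChunkFilter {

  /** Compares the fixed-width entry at the given row with key, byte by byte, as unsigned values. */
  static int compareRow(byte[] data, int row, byte[] key) {
    int offset = row * key.length;
    for (int i = 0; i < key.length; i++) {
      int a = data[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return 0;
  }

  /** Returns the first row whose entry is >= key, or numberOfRows if there is none. */
  static int lowerBound(byte[] data, int numberOfRows, byte[] key) {
    int low = 0;
    int high = numberOfRows;
    while (low < high) {
      int mid = (low + high) >>> 1;
      if (compareRow(data, mid, key) < 0) {
        low = mid + 1;
      } else {
        high = mid;
      }
    }
    return low;
  }

  /** Sets a bit for every row equal to one of the filter values (assumes sorted data). */
  static BitSet filterSorted(byte[] data, int numberOfRows, byte[][] filterValues) {
    BitSet bitSet = new BitSet(numberOfRows);
    for (byte[] key : filterValues) {
      // Binary search to the first candidate row, then walk the run of equal entries.
      int row = lowerBound(data, numberOfRows, key);
      while (row < numberOfRows && compareRow(data, row, key) == 0) {
        bitSet.set(row);
        row++;
      }
    }
    return bitSet;
  }
}
```

With n rows and m filter values, this does about m * log(n) comparisons plus one per matching row, instead of n * m.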

Please raise a Jira for this improvement.

-Regards
Kumar Vishal


On Fri, Mar 3, 2017 at 7:42 PM, 马云 <si...@163.com> wrote:

Hi Dev,


I used CarbonData version 0.2 on my local machine and found that the
"between and" filter query is very slow.
The root cause is the code below, from IncludeFilterExecuterImpl.java.
It takes about 20s in my test.
The code's time complexity is O(n*m), where n is the number of rows and m
is the number of filter values. I think it needs to be optimized; please
confirm. Thanks.





  private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
      int numerOfRows) {
    BitSet bitSet = new BitSet(numerOfRows);
    if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
      FixedLengthDimensionDataChunk fixedDimensionChunk =
          (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
      byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();

      long start = System.currentTimeMillis();
      // For every filter value, scan every row:
      // filterValues.length * numerOfRows comparisons in total.
      for (int k = 0; k < filterValues.length; k++) {
        for (int j = 0; j < numerOfRows; j++) {
          if (ByteUtil.UnsafeComparer.INSTANCE
              .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
                  filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
            bitSet.set(j);
          }
        }
      }
      System.out.println("loop time: " + (System.currentTimeMillis() - start));
    }
    return bitSet;
  }
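
One possible way to reduce the O(n*m) cost when the rows are not sorted is sketched below; it is not the CarbonData fix. It scans the rows once and looks each row up in a sorted copy of the filter values with a binary search. The sketch assumes all filter values have the same width as one row entry. The names UnsortedChunkFilter, compareUnsigned, containsKey and filter are hypothetical.

```java
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical helper, not part of CarbonData.
final class UnsortedChunkFilter {

  /** Compares key.length bytes of a, starting at offset, with key, as unsigned values. */
  static int compareUnsigned(byte[] a, int offset, byte[] key) {
    for (int i = 0; i < key.length; i++) {
      int diff = (a[offset + i] & 0xFF) - (key[i] & 0xFF);
      if (diff != 0) {
        return diff;
      }
    }
    return 0;
  }

  /** Binary search for the entry at data[offset..] among the sorted keys. */
  static boolean containsKey(byte[][] sortedKeys, byte[] data, int offset) {
    int low = 0;
    int high = sortedKeys.length - 1;
    while (low <= high) {
      int mid = (low + high) >>> 1;
      int cmp = compareUnsigned(data, offset, sortedKeys[mid]);
      if (cmp == 0) {
        return true;
      } else if (cmp < 0) {
        high = mid - 1;
      } else {
        low = mid + 1;
      }
    }
    return false;
  }

  /** Sets a bit for every row whose entry equals one of the filter values. */
  static BitSet filter(byte[] data, int numberOfRows, byte[][] filterValues) {
    BitSet bitSet = new BitSet(numberOfRows);
    if (filterValues.length == 0) {
      return bitSet;
    }
    // Sort a copy so the caller's array is left untouched.
    byte[][] sortedKeys = filterValues.clone();
    Arrays.sort(sortedKeys, (x, y) -> compareUnsigned(x, 0, y));
    int width = sortedKeys[0].length;
    for (int row = 0; row < numberOfRows; row++) {
      if (containsKey(sortedKeys, data, row * width)) {
        bitSet.set(row);
      }
    }
    return bitSet;
  }
}
```

This does about n * log(m) comparisons plus the cost of sorting the m filter values, which matters most when a "between and" filter expands to many values.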






--
This message was sent by Atlassian JIRA
(v6.3.15#6346)