Posted to issues@carbondata.apache.org by "Ravindra Pesala (JIRA)" <ji...@apache.org> on 2017/03/11 01:49:04 UTC

[jira] [Resolved] (CARBONDATA-748) "between and" filter query is very slow

     [ https://issues.apache.org/jira/browse/CARBONDATA-748?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravindra Pesala resolved CARBONDATA-748.
----------------------------------------
       Resolution: Fixed
         Assignee: Jarck
    Fix Version/s: 1.0.1-incubating

> "between and" filter query is very slow
> ---------------------------------------
>
>                 Key: CARBONDATA-748
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-748
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Jarck
>            Assignee: Jarck
>             Fix For: 1.0.1-incubating
>
>          Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Hi,
> Currently, in the include and exclude filter cases, when a dimension column
> does not have an inverted index, the filter does a linear search. We can add
> a binary search when the data for that column is sorted. To find this out, we
> can check in the carbon table whether the user has selected no inverted index
> for that column. If the user selected no inverted index when creating the
> column, the current code is fine; if not, the data will be sorted, so we can
> add a binary search, which will improve performance.
> Please raise a Jira for this improvement.
> -Regards
> Kumar Vishal
> On Fri, Mar 3, 2017 at 7:42 PM, 马云 <si...@163.com> wrote:
> Hi Dev,
> I used CarbonData version 0.2 on my local machine and found that the
> "between and" filter query is very slow.
> The root cause is the code below in IncludeFilterExecuterImpl.java.
> It takes about 20s in my test.
> The code's time complexity is O(n*m). I think it needs to be optimized;
> please confirm. Thanks.
>  private BitSet setFilterdIndexToBitSet(DimensionColumnDataChunk dimensionColumnDataChunk,
>      int numerOfRows) {
>    BitSet bitSet = new BitSet(numerOfRows);
>    if (dimensionColumnDataChunk instanceof FixedLengthDimensionDataChunk) {
>      FixedLengthDimensionDataChunk fixedDimensionChunk =
>          (FixedLengthDimensionDataChunk) dimensionColumnDataChunk;
>      byte[][] filterValues = dimColumnExecuterInfo.getFilterKeys();
>      long start = System.currentTimeMillis();
>      for (int k = 0; k < filterValues.length; k++) {
>        for (int j = 0; j < numerOfRows; j++) {
>          if (ByteUtil.UnsafeComparer.INSTANCE
>              .compareTo(fixedDimensionChunk.getCompleteDataChunk(), j * filterValues[k].length,
>                  filterValues[k].length, filterValues[k], 0, filterValues[k].length) == 0) {
>            bitSet.set(j);
>          }
>        }
>      }
>      System.out.println("loop time: " + (System.currentTimeMillis() - start));
>    }
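
For illustration only, the binary-search idea suggested in the quoted message could look roughly like the sketch below. It is not the change that was committed for this issue. It assumes the column data is one flat byte array of fixed-width rows, sorted in ascending unsigned byte order, that every filter key has the same width as a row, and that the bit positions refer to positions in that sorted array. The class and method names (SortedColumnFilterSketch, compareRow, lowerBound, includeFilter) are hypothetical and are not CarbonData APIs.

```java
import java.util.BitSet;

public class SortedColumnFilterSketch {

  // Unsigned lexicographic comparison of one fixed-width row against a key.
  // Assumption: every row is exactly key.length bytes wide.
  static int compareRow(byte[] data, int row, byte[] key) {
    int offset = row * key.length;
    for (int i = 0; i < key.length; i++) {
      int a = data[offset + i] & 0xFF;
      int b = key[i] & 0xFF;
      if (a != b) {
        return a - b;
      }
    }
    return 0;
  }

  // Binary search for the first row that is >= key.
  // Returns numberOfRows when every row is smaller than the key.
  static int lowerBound(byte[] data, int numberOfRows, byte[] key) {
    int low = 0;
    int high = numberOfRows;
    while (low < high) {
      int mid = (low + high) >>> 1;
      if (compareRow(data, mid, key) < 0) {
        low = mid + 1;
      } else {
        high = mid;
      }
    }
    return low;
  }

  // Include filter over sorted data: one binary search per filter key,
  // followed by a short forward scan over the run of equal rows.
  static BitSet includeFilter(byte[] data, int numberOfRows, byte[][] filterKeys) {
    BitSet bitSet = new BitSet(numberOfRows);
    for (byte[] key : filterKeys) {
      int first = lowerBound(data, numberOfRows, key);
      int end = first;
      while (end < numberOfRows && compareRow(data, end, key) == 0) {
        end++;
      }
      bitSet.set(first, end); // no-op when the key is absent (first == end)
    }
    return bitSet;
  }

  public static void main(String[] args) {
    // Five 2-byte rows, already sorted: [0,1] [0,3] [0,3] [0,7] [1,0]
    byte[] data = {0, 1, 0, 3, 0, 3, 0, 7, 1, 0};
    byte[][] keys = {{0, 3}, {1, 0}, {0, 5}};
    System.out.println(includeFilter(data, 5, keys)); // prints {1, 2, 4}
  }
}
```

With m filter keys and n rows, this does about m * log(n) comparisons plus the number of matching rows, instead of the n * m comparisons in the nested loop of the quoted method. It is only valid when the column is stored sorted; for a column created with no inverted index, the linear scan is still needed.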



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)