You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@carbondata.apache.org by ravipesala <gi...@git.apache.org> on 2017/11/06 10:04:58 UTC

[GitHub] carbondata pull request #1471: [WIP] Datamap FineGrain implementation

GitHub user ravipesala opened a pull request:

    https://github.com/apache/carbondata/pull/1471

    [WIP] Datamap FineGrain implementation

    Be sure to do all of the following checklist to help us incorporate 
    your contribution quickly and easily:
    
     Implemented interfaces for FG datamap and integrated to filterscanner to use the pruned bitset from FG datamap.
    FG Query flow as follows.
     1.The user can add FG datamap to any table and implement there interfaces.
     2. Any filter query which hits the table with datamap will call prune method of FGdatamap. 
     3. The prune method of FGDatamap return list FineGrainBlocklet , these blocklets contain the information of block, blocklet, page and rowids information as well.
     4. The pruned  blocklets are internally wriitten to file and returns only the block , blocklet and filepath  information as part of Splits.
    5. Based on the splits scanrdd schedule the tasks. 
    6. In filterscanner we check the datamapwriterpath from split and reNoteads the bitset if exists. And pass this bitset as input to it.
    
    NOTE : This PR depends on https://github.com/apache/carbondata/pull/1410
    
     - [X] Any interfaces changed?
           Few changes to datamap interfaces only.
     
     - [X] Any backward compatibility impacted? NO
     
     - [X] Document update required? 
         Yes, Required to add to dev guide about interfaces
    
     - [X] Testing done
            Tests are added for both CG and FG
           
     - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ravipesala/incubator-carbondata datamap-fg-impl

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/carbondata/pull/1471.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1471
    
----
commit 8d727ffbc465191040274318a55dd0a69063bdfd
Author: ravipesala <ra...@gmail.com>
Date:   2017-10-11T13:37:22Z

    Added FG interfaces

commit 0cc36e4b28d2747728013116cdacecb80beb0239
Author: ravipesala <ra...@gmail.com>
Date:   2017-10-17T12:23:09Z

    Refactored code to remove path from blocklet

commit 3c204c2d2e69512f21f630a091df756255ccd6e8
Author: ravipesala <ra...@gmail.com>
Date:   2017-10-31T09:55:57Z

    fgdatamap implement

commit 717ba3e41a6f3d1936ff14857a9b992cff13c8a6
Author: ravipesala <ra...@gmail.com>
Date:   2017-11-03T16:52:33Z

    Added test for FGDatamap

commit e6093540c800c2e86c472787603c884a72e1d129
Author: ravipesala <ra...@gmail.com>
Date:   2017-11-04T06:21:21Z

    Added testcase

----


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1700/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1250/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151830437
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -687,16 +689,17 @@ protected Expression getFilterPredicates(Configuration configuration) {
         // get tokens for all the required FileSystem for table path
         TokenCache.obtainTokensForNamenodes(job.getCredentials(),
             new Path[] { new Path(absoluteTableIdentifier.getTablePath()) }, job.getConfiguration());
    -
    -    TableDataMap blockletMap = DataMapStoreManager.getInstance()
    -        .getDataMap(absoluteTableIdentifier, BlockletDataMap.NAME,
    -            BlockletDataMapFactory.class.getName());
    +    boolean distributedCG = Boolean.parseBoolean(CarbonProperties.getInstance()
    +        .getProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +            CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT));
    +    TableDataMap blockletMap =
    +        DataMapStoreManager.getInstance().chooseDataMap(absoluteTableIdentifier);
         DataMapJob dataMapJob = getDataMapJob(job.getConfiguration());
         List<ExtendedBlocklet> prunedBlocklets;
    -    if (dataMapJob != null) {
    +    if (distributedCG || blockletMap.getDataMapFactory().getDataMapType() == DataMapType.FG) {
    --- End diff --
    
    In case of FG datamap the rowid information would be written to temp files and writepath is given to FGBlocklet so it does not contain any row information when it returns to the driver.
    While executing filter query executor reads rowids from disk and pass to the scanner. 


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1468/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151830168
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -463,20 +426,24 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException {
        * @throws CarbonDataWriterException
        */
       protected void closeExecutorService() throws CarbonDataWriterException {
    -    executorService.shutdown();
         try {
    -      executorService.awaitTermination(2, TimeUnit.HOURS);
    -    } catch (InterruptedException e) {
    -      throw new CarbonDataWriterException(e.getMessage());
    -    }
    -    for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +      listener.finish();
    +      executorService.shutdown();
           try {
    -        executorServiceSubmitList.get(i).get();
    +        executorService.awaitTermination(2, TimeUnit.HOURS);
           } catch (InterruptedException e) {
             throw new CarbonDataWriterException(e.getMessage());
    -      } catch (ExecutionException e) {
    -        throw new CarbonDataWriterException(e.getMessage());
           }
    +      for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +        executorServiceSubmitList.get(i).get();
    +      }
    +    } catch (InterruptedException e) {
    +      throw new CarbonDataWriterException(e.getMessage());
    +    } catch (ExecutionException e) {
    --- End diff --
    
    OK


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151598459
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
    @@ -0,0 +1,110 @@
    +/*
    --- End diff --
    
    Can you explain , why change "DataMapWriter.java" to "AbstractDataMapWriter.java",  for easier supporting uses to customize other type of datamapwriter?


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1100/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151637482
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
    @@ -0,0 +1,110 @@
    +/*
    --- End diff --
    
    Changed to Abstract class to enforce the user to pass the needed parameters through the constructor. And also the concrete method `commitFile` is added to this class.


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1087/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1582/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1107/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827371
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -463,20 +426,24 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException {
        * @throws CarbonDataWriterException
        */
       protected void closeExecutorService() throws CarbonDataWriterException {
    -    executorService.shutdown();
         try {
    -      executorService.awaitTermination(2, TimeUnit.HOURS);
    -    } catch (InterruptedException e) {
    -      throw new CarbonDataWriterException(e.getMessage());
    -    }
    -    for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +      listener.finish();
    +      executorService.shutdown();
           try {
    -        executorServiceSubmitList.get(i).get();
    +        executorService.awaitTermination(2, TimeUnit.HOURS);
           } catch (InterruptedException e) {
             throw new CarbonDataWriterException(e.getMessage());
    -      } catch (ExecutionException e) {
    -        throw new CarbonDataWriterException(e.getMessage());
           }
    +      for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +        executorServiceSubmitList.get(i).get();
    +      }
    +    } catch (InterruptedException e) {
    +      throw new CarbonDataWriterException(e.getMessage());
    +    } catch (ExecutionException e) {
    +      throw new CarbonDataWriterException(e.getMessage());
    +    } catch (IOException e) {
    +      LOGGER.error(e, "Error while writing datamap");
    +      throw new CarbonDataWriterException(e.getMessage());
    --- End diff --
    
    merge with line 440


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151831374
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---
    @@ -75,7 +79,12 @@ public TableDataMap(AbsoluteTableIdentifier identifier, String dataMapName,
         SegmentProperties segmentProperties;
         for (String segmentId : segmentIds) {
           List<Blocklet> pruneBlocklets = new ArrayList<>();
    -      List<DataMap> dataMaps = dataMapFactory.getDataMaps(segmentId);
    +      List<DataMap> dataMaps;
    +      if (blockletDetailsFetcher instanceof DataMapFactory && filterExp == null) {
    +        dataMaps = ((DataMapFactory)blockletDetailsFetcher).getDataMaps(segmentId);
    --- End diff --
    
    I have added here but placing generic assignment all places is not possible because we are deriving two types of datamaps through generic .


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1723/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    retest this please


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1159/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1200/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1156/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1155/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala closed the pull request at:

    https://github.com/apache/carbondata/pull/1471


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827164
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/FGDataMapTestCase.scala ---
    @@ -0,0 +1,434 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.spark.testsuite.datamap
    +
    +import java.io.{ByteArrayInputStream, DataOutputStream, ObjectInputStream, ObjectOutputStream}
    +
    +import scala.collection.JavaConverters._
    +import scala.collection.mutable.ArrayBuffer
    +
    +import com.sun.xml.internal.messaging.saaj.util.ByteOutputStream
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +import org.apache.carbondata.core.datamap.dev.fgdatamap.{AbstractFineGrainDataMap, AbstractFineGrainDataMapFactory}
    +import org.apache.carbondata.core.datamap.dev.{AbstractDataMapWriter, DataMapModel}
    +import org.apache.carbondata.core.datamap.{DataMapDistributable, DataMapMeta, DataMapStoreManager}
    +import org.apache.carbondata.core.datastore.FileHolder
    +import org.apache.carbondata.core.datastore.block.SegmentProperties
    +import org.apache.carbondata.core.datastore.compression.SnappyCompressor
    +import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, CarbonFileFilter}
    +import org.apache.carbondata.core.datastore.impl.FileFactory
    +import org.apache.carbondata.core.datastore.page.ColumnPage
    +import org.apache.carbondata.core.indexstore.FineGrainBlocklet
    +import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapDistributable
    +import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata}
    +import org.apache.carbondata.core.scan.expression.Expression
    +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression
    +import org.apache.carbondata.core.scan.filter.intf.ExpressionType
    +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf
    +import org.apache.carbondata.core.util.ByteUtil
    +import org.apache.carbondata.core.util.path.CarbonTablePath
    +import org.apache.carbondata.events.Event
    +import org.apache.carbondata.spark.testsuite.datacompaction.CompactionSupportGlobalSortBigFileTest
    +
    +class FGDataMapFactory extends AbstractFineGrainDataMapFactory {
    +  var identifier: AbsoluteTableIdentifier = _
    +  var dataMapName: String = _
    +
    +  /**
    +   * Initialization of Datamap factory with the identifier and datamap name
    +   */
    +  override def init(identifier: AbsoluteTableIdentifier,
    --- End diff --
    
    Please add return type in all functions in this file and CGDataMapTestCase.scala also


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Merged, please close this PR


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151615805
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
    @@ -0,0 +1,110 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.core.datamap.dev;
    +
    +import java.io.IOException;
    +
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.datastore.page.ColumnPage;
    +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
    +import org.apache.carbondata.core.util.CarbonUtil;
    +import org.apache.carbondata.core.util.path.CarbonTablePath;
    +
    +/**
    + * Data Map writer
    + */
    +public abstract class AbstractDataMapWriter {
    +
    +  protected AbsoluteTableIdentifier identifier;
    +
    +  protected String segmentId;
    +
    +  protected String writeDirectoryPath;
    +
    +  public AbstractDataMapWriter(AbsoluteTableIdentifier identifier, String segmentId,
    +      String writeDirectoryPath) {
    +    this.identifier = identifier;
    +    this.segmentId = segmentId;
    +    this.writeDirectoryPath = writeDirectoryPath;
    +  }
    +
    +  /**
    +   * Start of new block notification.
    +   *
    +   * @param blockId file name of the carbondata file
    +   */
    +  public abstract void onBlockStart(String blockId);
    +
    +  /**
    +   * End of block notification
    +   */
    +  public abstract void onBlockEnd(String blockId);
    +
    +  /**
    +   * Start of new blocklet notification.
    +   *
    +   * @param blockletId sequence number of blocklet in the block
    +   */
    +  public abstract void onBlockletStart(int blockletId);
    +
    +  /**
    +   * End of blocklet notification
    +   *
    +   * @param blockletId sequence number of blocklet in the block
    +   */
    +  public abstract void onBlockletEnd(int blockletId);
    +
    +  /**
    +   * Add the column pages row to the datamap, order of pages is same as `indexColumns` in
    +   * DataMapMeta returned in DataMapFactory.
    +   * Implementation should copy the content of `pages` as needed, because `pages` memory
    +   * may be freed after this method returns, if using unsafe column page.
    +   */
    +  public abstract void onPageAdded(int blockletId, int pageId, ColumnPage[] pages);
    +
    +  /**
    +   * This is called during closing of writer.So after this call no more data will be sent to this
    +   * class.
    +   */
    +  public abstract void finish();
    +
    +  /**
    +   * It copies the file from temp folder to actual folder
    +   *
    +   * @param dataMapFile
    +   * @throws IOException
    +   */
    +  protected void commitFile(String dataMapFile) throws IOException {
    --- End diff --
    
    What if anything failed inside this function, who will catch IOException and handle it?


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    LGTM


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    retest this please


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151828034
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/BlockletSerializer.java ---
    @@ -0,0 +1,57 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.core.datamap.dev;
    +
    +import java.io.DataInputStream;
    +import java.io.DataOutputStream;
    +import java.io.IOException;
    +
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.indexstore.FineGrainBlocklet;
    +
    +public class BlockletSerializer {
    +
    +  /**
    +   * Serialize and write blocklet to the file.
    +   * @param grainBlocklet
    +   * @param writePath
    +   * @throws IOException
    +   */
    +  public void serializeBlocklet(FineGrainBlocklet grainBlocklet, String writePath)
    --- End diff --
    
    Can we move this to FineGrainBlocklet itself? So that we do not need to keep writePath in many places, like in TableBlockInfo


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827453
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -687,16 +689,17 @@ protected Expression getFilterPredicates(Configuration configuration) {
         // get tokens for all the required FileSystem for table path
         TokenCache.obtainTokensForNamenodes(job.getCredentials(),
             new Path[] { new Path(absoluteTableIdentifier.getTablePath()) }, job.getConfiguration());
    -
    -    TableDataMap blockletMap = DataMapStoreManager.getInstance()
    -        .getDataMap(absoluteTableIdentifier, BlockletDataMap.NAME,
    -            BlockletDataMapFactory.class.getName());
    +    boolean distributedCG = Boolean.parseBoolean(CarbonProperties.getInstance()
    +        .getProperty(CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP,
    +            CarbonCommonConstants.USE_DISTRIBUTED_DATAMAP_DEFAULT));
    +    TableDataMap blockletMap =
    +        DataMapStoreManager.getInstance().chooseDataMap(absoluteTableIdentifier);
         DataMapJob dataMapJob = getDataMapJob(job.getConfiguration());
         List<ExtendedBlocklet> prunedBlocklets;
    -    if (dataMapJob != null) {
    +    if (distributedCG || blockletMap.getDataMapFactory().getDataMapType() == DataMapType.FG) {
    --- End diff --
    
    It seems distributedCG and FG behave the same, right?
    Earlier I though FG datamap will be called in ScanRDD.compute, but it seems not?
    If we collect all pruned blocklet in driver, will it be too many for driver? 


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1152/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1135/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151830137
  
    --- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datamap/FGDataMapTestCase.scala ---
    @@ -0,0 +1,434 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.spark.testsuite.datamap
    +
    +import java.io.{ByteArrayInputStream, DataOutputStream, ObjectInputStream, ObjectOutputStream}
    +
    +import scala.collection.JavaConverters._
    +import scala.collection.mutable.ArrayBuffer
    +
    +import com.sun.xml.internal.messaging.saaj.util.ByteOutputStream
    +import org.apache.spark.sql.test.util.QueryTest
    +import org.scalatest.BeforeAndAfterAll
    +
    +import org.apache.carbondata.core.datamap.dev.fgdatamap.{AbstractFineGrainDataMap, AbstractFineGrainDataMapFactory}
    +import org.apache.carbondata.core.datamap.dev.{AbstractDataMapWriter, DataMapModel}
    +import org.apache.carbondata.core.datamap.{DataMapDistributable, DataMapMeta, DataMapStoreManager}
    +import org.apache.carbondata.core.datastore.FileHolder
    +import org.apache.carbondata.core.datastore.block.SegmentProperties
    +import org.apache.carbondata.core.datastore.compression.SnappyCompressor
    +import org.apache.carbondata.core.datastore.filesystem.{CarbonFile, CarbonFileFilter}
    +import org.apache.carbondata.core.datastore.impl.FileFactory
    +import org.apache.carbondata.core.datastore.page.ColumnPage
    +import org.apache.carbondata.core.indexstore.FineGrainBlocklet
    +import org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapDistributable
    +import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata}
    +import org.apache.carbondata.core.scan.expression.Expression
    +import org.apache.carbondata.core.scan.expression.conditional.EqualToExpression
    +import org.apache.carbondata.core.scan.filter.intf.ExpressionType
    +import org.apache.carbondata.core.scan.filter.resolver.FilterResolverIntf
    +import org.apache.carbondata.core.util.ByteUtil
    +import org.apache.carbondata.core.util.path.CarbonTablePath
    +import org.apache.carbondata.events.Event
    +import org.apache.carbondata.spark.testsuite.datacompaction.CompactionSupportGlobalSortBigFileTest
    +
    +class FGDataMapFactory extends AbstractFineGrainDataMapFactory {
    +  var identifier: AbsoluteTableIdentifier = _
    +  var dataMapName: String = _
    +
    +  /**
    +   * Initialization of Datamap factory with the identifier and datamap name
    +   */
    +  override def init(identifier: AbsoluteTableIdentifier,
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    retest sdv please


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/835/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1278/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827748
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/TableDataMap.java ---
    @@ -75,7 +79,12 @@ public TableDataMap(AbsoluteTableIdentifier identifier, String dataMapName,
         SegmentProperties segmentProperties;
         for (String segmentId : segmentIds) {
           List<Blocklet> pruneBlocklets = new ArrayList<>();
    -      List<DataMap> dataMaps = dataMapFactory.getDataMaps(segmentId);
    +      List<DataMap> dataMaps;
    +      if (blockletDetailsFetcher instanceof DataMapFactory && filterExp == null) {
    +        dataMaps = ((DataMapFactory)blockletDetailsFetcher).getDataMaps(segmentId);
    --- End diff --
    
    Please add correct generic assignment, it is hard to check whether the type is correct or not


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1090/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1706/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Success , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1688/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827225
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -463,20 +426,24 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException {
        * @throws CarbonDataWriterException
        */
       protected void closeExecutorService() throws CarbonDataWriterException {
    -    executorService.shutdown();
         try {
    -      executorService.awaitTermination(2, TimeUnit.HOURS);
    -    } catch (InterruptedException e) {
    -      throw new CarbonDataWriterException(e.getMessage());
    -    }
    -    for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +      listener.finish();
    +      executorService.shutdown();
           try {
    --- End diff --
    
    This try is not required, catch in line 440 is enough


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151827362
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -463,20 +426,24 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException {
        * @throws CarbonDataWriterException
        */
       protected void closeExecutorService() throws CarbonDataWriterException {
    -    executorService.shutdown();
         try {
    -      executorService.awaitTermination(2, TimeUnit.HOURS);
    -    } catch (InterruptedException e) {
    -      throw new CarbonDataWriterException(e.getMessage());
    -    }
    -    for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +      listener.finish();
    +      executorService.shutdown();
           try {
    -        executorServiceSubmitList.get(i).get();
    +        executorService.awaitTermination(2, TimeUnit.HOURS);
           } catch (InterruptedException e) {
             throw new CarbonDataWriterException(e.getMessage());
    -      } catch (ExecutionException e) {
    -        throw new CarbonDataWriterException(e.getMessage());
           }
    +      for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +        executorServiceSubmitList.get(i).get();
    +      }
    +    } catch (InterruptedException e) {
    +      throw new CarbonDataWriterException(e.getMessage());
    +    } catch (ExecutionException e) {
    --- End diff --
    
    merge with line 440


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1162/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1466/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151636955
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapMeta.java ---
    @@ -19,15 +19,15 @@
     
     import java.util.List;
     
    -import org.apache.carbondata.core.indexstore.schema.FilterType;
    +import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
     
     public class DataMapMeta {
     
       private List<String> indexedColumns;
     
    -  private FilterType optimizedOperation;
    +  private List<ExpressionType> optimizedOperation;
    --- End diff --
    
    Currently, the like expression is converted to greater than and less than equal to filter. so there is no LIKE expression in the types. 


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151615573
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -755,7 +758,8 @@ private CarbonInputSplit convertToCarbonInputSplit(ExtendedBlocklet blocklet)
             org.apache.carbondata.hadoop.CarbonInputSplit.from(blocklet.getSegmentId(),
                 new FileSplit(new Path(blocklet.getPath()), 0, blocklet.getLength(),
                     blocklet.getLocations()),
    -            ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()));
    +            ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()),
    +            blocklet.getDataMapWriterPath());
    --- End diff --
    
    indentation  not correct


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1691/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1743/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/841/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/968/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151831837
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/BlockletSerializer.java ---
    @@ -0,0 +1,57 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.core.datamap.dev;
    +
    +import java.io.DataInputStream;
    +import java.io.DataOutputStream;
    +import java.io.IOException;
    +
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.indexstore.FineGrainBlocklet;
    +
    +public class BlockletSerializer {
    +
    +  /**
    +   * Serialize and write blocklet to the file.
    +   * @param grainBlocklet
    +   * @param writePath
    +   * @throws IOException
    +   */
    +  public void serializeBlocklet(FineGrainBlocklet grainBlocklet, String writePath)
    --- End diff --
    
    writePath is part of ExtendedBlocklet and FineGrainBlocklet only used till data is written in TableDataMap. And also we need to add writepath to TableBlockinfo as it does not have FineGrainBlocklet. 


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1474/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151639114
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/dev/AbstractDataMapWriter.java ---
    @@ -0,0 +1,110 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.carbondata.core.datamap.dev;
    +
    +import java.io.IOException;
    +
    +import org.apache.carbondata.core.datastore.impl.FileFactory;
    +import org.apache.carbondata.core.datastore.page.ColumnPage;
    +import org.apache.carbondata.core.metadata.AbsoluteTableIdentifier;
    +import org.apache.carbondata.core.util.CarbonUtil;
    +import org.apache.carbondata.core.util.path.CarbonTablePath;
    +
    +/**
    + * Data Map writer
    + */
    +public abstract class AbstractDataMapWriter {
    +
    +  protected AbsoluteTableIdentifier identifier;
    +
    +  protected String segmentId;
    +
    +  protected String writeDirectoryPath;
    +
    +  public AbstractDataMapWriter(AbsoluteTableIdentifier identifier, String segmentId,
    +      String writeDirectoryPath) {
    +    this.identifier = identifier;
    +    this.segmentId = segmentId;
    +    this.writeDirectoryPath = writeDirectoryPath;
    +  }
    +
    +  /**
    +   * Start of new block notification.
    +   *
    +   * @param blockId file name of the carbondata file
    +   */
    +  public abstract void onBlockStart(String blockId);
    +
    +  /**
    +   * End of block notification
    +   */
    +  public abstract void onBlockEnd(String blockId);
    +
    +  /**
    +   * Start of new blocklet notification.
    +   *
    +   * @param blockletId sequence number of blocklet in the block
    +   */
    +  public abstract void onBlockletStart(int blockletId);
    +
    +  /**
    +   * End of blocklet notification
    +   *
    +   * @param blockletId sequence number of blocklet in the block
    +   */
    +  public abstract void onBlockletEnd(int blockletId);
    +
    +  /**
    +   * Add the column pages row to the datamap, order of pages is same as `indexColumns` in
    +   * DataMapMeta returned in DataMapFactory.
    +   * Implementation should copy the content of `pages` as needed, because `pages` memory
    +   * may be freed after this method returns, if using unsafe column page.
    +   */
    +  public abstract void onPageAdded(int blockletId, int pageId, ColumnPage[] pages);
    +
    +  /**
    +   * This is called during closing of writer.So after this call no more data will be sent to this
    +   * class.
    +   */
    +  public abstract void finish();
    +
    +  /**
    +   * It copies the file from temp folder to actual folder
    +   *
    +   * @param dataMapFile
    +   * @throws IOException
    +   */
    +  protected void commitFile(String dataMapFile) throws IOException {
    --- End diff --
    
    Basically, this method should be used inside DataMapWriter to the files once they finish writing in it. It is used for copying from temp location to store. If the error occurs here then it will throw to the DataMapwriter implementation. And writer implementation should handle it otherwise load fails because of the error if it thrown to fact writer


---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1132/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by jackylk <gi...@git.apache.org>.
Github user jackylk commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151615263
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -574,7 +482,9 @@ private CopyThread(String fileName) {
          * @throws Exception if unable to compute a result
          */
         @Override public Void call() throws Exception {
    -      copyCarbonDataFileToCarbonStorePath(fileName);
    +      CarbonUtil.copyCarbonDataFileToCarbonStorePath(fileName,
    --- End diff --
    
    move parameter to next line


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    SDV Build Fail , Please check CI http://144.76.159.231:8080/job/ApacheSDVTests/1733/



---

[GitHub] carbondata issue #1471: [WIP][CARBONDATA-1544][Datamap] Datamap FineGrain im...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Failed  with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/833/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1230/



---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by CarbonDataQA <gi...@git.apache.org>.
Github user CarbonDataQA commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    Build Success with Spark 2.1.0, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder1/1264/



---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151638182
  
    --- Diff: hadoop/src/main/java/org/apache/carbondata/hadoop/api/CarbonTableInputFormat.java ---
    @@ -755,7 +758,8 @@ private CarbonInputSplit convertToCarbonInputSplit(ExtendedBlocklet blocklet)
             org.apache.carbondata.hadoop.CarbonInputSplit.from(blocklet.getSegmentId(),
                 new FileSplit(new Path(blocklet.getPath()), 0, blocklet.getLength(),
                     blocklet.getLocations()),
    -            ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()));
    +            ColumnarFormatVersion.valueOf((short) blocklet.getDetailInfo().getVersionNumber()),
    +            blocklet.getDataMapWriterPath());
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by chenliang613 <gi...@git.apache.org>.
Github user chenliang613 commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151595971
  
    --- Diff: core/src/main/java/org/apache/carbondata/core/datamap/DataMapMeta.java ---
    @@ -19,15 +19,15 @@
     
     import java.util.List;
     
    -import org.apache.carbondata.core.indexstore.schema.FilterType;
    +import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
     
     public class DataMapMeta {
     
       private List<String> indexedColumns;
     
    -  private FilterType optimizedOperation;
    +  private List<ExpressionType> optimizedOperation;
    --- End diff --
    
    in ExpressionType,    no "like" expression.


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151830196
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -463,20 +426,24 @@ protected void writeIndexFile() throws IOException, CarbonDataWriterException {
        * @throws CarbonDataWriterException
        */
       protected void closeExecutorService() throws CarbonDataWriterException {
    -    executorService.shutdown();
         try {
    -      executorService.awaitTermination(2, TimeUnit.HOURS);
    -    } catch (InterruptedException e) {
    -      throw new CarbonDataWriterException(e.getMessage());
    -    }
    -    for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +      listener.finish();
    +      executorService.shutdown();
           try {
    -        executorServiceSubmitList.get(i).get();
    +        executorService.awaitTermination(2, TimeUnit.HOURS);
           } catch (InterruptedException e) {
             throw new CarbonDataWriterException(e.getMessage());
    -      } catch (ExecutionException e) {
    -        throw new CarbonDataWriterException(e.getMessage());
           }
    +      for (int i = 0; i < executorServiceSubmitList.size(); i++) {
    +        executorServiceSubmitList.get(i).get();
    +      }
    +    } catch (InterruptedException e) {
    +      throw new CarbonDataWriterException(e.getMessage());
    +    } catch (ExecutionException e) {
    +      throw new CarbonDataWriterException(e.getMessage());
    +    } catch (IOException e) {
    +      LOGGER.error(e, "Error while writing datamap");
    +      throw new CarbonDataWriterException(e.getMessage());
    --- End diff --
    
    ok


---

[GitHub] carbondata pull request #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain ...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/1471#discussion_r151637852
  
    --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/writer/AbstractFactDataWriter.java ---
    @@ -574,7 +482,9 @@ private CopyThread(String fileName) {
          * @throws Exception if unable to compute a result
          */
         @Override public Void call() throws Exception {
    -      copyCarbonDataFileToCarbonStorePath(fileName);
    +      CarbonUtil.copyCarbonDataFileToCarbonStorePath(fileName,
    --- End diff --
    
    ok


---

[GitHub] carbondata issue #1471: [CARBONDATA-1544][Datamap] Datamap FineGrain impleme...

Posted by ravipesala <gi...@git.apache.org>.
Github user ravipesala commented on the issue:

    https://github.com/apache/carbondata/pull/1471
  
    retest this please


---