You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/03/29 05:26:16 UTC

[GitHub] [incubator-pinot] mqliang commented on pull request #6710: DataTable V3 implementation and measure data table serialization cost on server

mqliang commented on pull request #6710:
URL: https://github.com/apache/incubator-pinot/pull/6710#issuecomment-809076119


   @Jackie-Jiang @siddharthteotia @mcvsubbu Comment addressed and are ready for review now. I split the change into 5 commits:
   * 1st commit:
      * Rename TrailerKeys as MetadataKeys
      * Associate ID/Name with MetadataKeys
      * V2->V3 convert instead of construct V3 from V2 bytes
      * ASCII layout of V3 datatable
      * Address a TODO in DataTableBuilder: store bytes data into variable size data section, instead of String
     
   * 2nd commit: Address a TODO in DataTableBuilder: fix float size issue at DataTableBuilder
   * 3rd commit:  Address a TODO in DataTableBuilder: use one Map to map a String to Integer for all columns in V3.
   * 4th commit:  fix a bug at BrokerReduceService, which bring down integration test.
   * 5th commit:  Log `responseSerializationCpuTimeNs` at QueryScheduler and emit a broker gauge; put "executionThreadCpuTimeNs" and "responseSerializationCpuTimeNs" into metadata so that they can be sent to broker
   
   
   There is still one more TODO in DataTableBuilder: Given a data schema, write all values one by one instead of using rowId and colId to position (save time). It will not impact the serialized bytes layout of data table, it's just some implementation optimization. Which means it does not require a version bumping up, so can be done in a separate PR, I create a issue: https://github.com/apache/incubator-pinot/issues/6720 to track this. And a preliminary benchmark shows that the optimization is quite speculative -- there is no improvement by writing all values one by one without using rowId and colId to position, for more details, see the benchmark result at: https://github.com/apache/incubator-pinot/issues/6720
   
   There is one more thing need to be done: change the interface of `DataTable.getMetadata()` returns a `Map<MetadataKeys, String>`, instead of `Map<String, String>`. This PR is already quite large,  I wanner address it in a separate PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org