Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/02/11 19:28:13 UTC

[GitHub] [druid] jihoonson opened a new issue #9352: Broken feature: appending linearly partitioned segments into a hash partitioned datasource

URL: https://github.com/apache/druid/issues/9352
 
 
   ### Affected Version
   
   0.16, 0.17, master
   
   ### Description
   
   Before 0.16, Druid allowed you to create a datasource with the `HashedPartitionsSpec` and then run a task that appends to that datasource with linear partitioning (using `maxRowsPerSegment`). This was possible because segments created with `HashedPartitionsSpec` have the `HashBasedNumberedShardSpec`, which extends `NumberedShardSpec`, the shard spec used for linearly partitioned segments (see https://github.com/apache/druid/blob/0.15.1-incubating/server/src/main/java/org/apache/druid/metadata/IndexerSQLMetadataStorageCoordinator.java#L691-L700). 
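
To illustrate why the inheritance relationship made the append possible, here is a minimal sketch. The classes are simplified stand-ins, not Druid's real ones, and `AppendSketch.allocateNext` is a hypothetical helper. It assumes the allocation path only checks whether the highest existing shard spec is a `NumberedShardSpec`, which a `HashBasedNumberedShardSpec` satisfies through subclassing.

```java
// Simplified stand-ins for illustration only; these are NOT Druid's real classes.
class NumberedShardSpec {
    final int partitionNum;    // position of this segment within its interval
    final int corePartitions;  // number of segments created by the initial task

    NumberedShardSpec(int partitionNum, int corePartitions) {
        this.partitionNum = partitionNum;
        this.corePartitions = corePartitions;
    }
}

class HashBasedNumberedShardSpec extends NumberedShardSpec {
    HashBasedNumberedShardSpec(int partitionNum, int corePartitions) {
        super(partitionNum, corePartitions);
    }
}

class AppendSketch {
    // Hypothetical helper: choose the shard spec for the next appended segment.
    // Assumption: the check is a plain instanceof test on the highest existing
    // shard spec, so a hash-based spec passes because it is a subclass.
    static NumberedShardSpec allocateNext(Object maxExistingShardSpec) {
        if (maxExistingShardSpec instanceof NumberedShardSpec) {
            NumberedShardSpec max = (NumberedShardSpec) maxExistingShardSpec;
            // The appended segment gets a plain (linear) NumberedShardSpec.
            return new NumberedShardSpec(max.partitionNum + 1, max.corePartitions);
        }
        throw new IllegalStateException("Cannot append to shard spec: " + maxExistingShardSpec);
    }
}
```

Under these assumptions, appending to a datasource whose last segment is hash partitioned yields a new segment with a plain `NumberedShardSpec`, which is the mixed state described above.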
   
   This feature was broken in #7547, and the breakage should be regarded as a bug. However, I'm wondering whether we really want to support this in the future, for the reasons below.
   
   - Allowing mixed partitioning methods for one datasource is confusing and not very useful.
   - This feature introduces an ambiguous concept of "core partitions". Only a hash partitioned datasource has core partitions, which are the set of segments created by the initial task. All segments in the core partitions should have the same `HashBasedNumberedShardSpec`, while all other segments should have the `NumberedShardSpec`. In timeline management, a hash partitioned datasource is regarded as visible in brokers once all segments in the core partitions become available in historicals, no matter how many segments in the non-core partitions are still unavailable. I think this concept is not that useful but makes things complicated.
   - This feature allows you to append only _linearly_ partitioned segments to a _hash_ partitioned datasource. Other combinations or directions are not allowed.
   - Finally, https://github.com/apache/druid/issues/9241 was recently proposed which seems more promising.
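
The "core partitions" visibility rule from the list above can be sketched as follows. This is a simplified illustration, not Druid's timeline code: `CorePartitionSketch` and `isVisible` are hypothetical names, and the sketch assumes the core partitions are exactly the partition numbers `0` to `corePartitions - 1`.

```java
import java.util.Set;

// Hypothetical illustration of the visibility rule; not Druid's actual timeline logic.
class CorePartitionSketch {
    // Returns true once every core partition is available.
    // Assumption: core partitions are numbered 0 .. corePartitions - 1, and any
    // higher partition number belongs to an appended (non-core) segment.
    static boolean isVisible(int corePartitions, Set<Integer> availablePartitionNums) {
        for (int i = 0; i < corePartitions; i++) {
            if (!availablePartitionNums.contains(i)) {
                return false; // a core segment is still missing
            }
        }
        return true; // non-core segments are not consulted at all
    }
}
```

With two core partitions, the set `{0, 1}` is enough for visibility even if appended partitions 2 and 3 are not loaded yet, while `{0, 2, 3}` is not, because core partition 1 is missing. That asymmetry is what makes the concept hard to reason about.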
   
   I would like to promote https://github.com/apache/druid/issues/9241 rather than fix this bug. Any thoughts are welcome.
