Posted to issues@drill.apache.org by "Volodymyr Vysotskyi (JIRA)" <ji...@apache.org> on 2019/06/07 10:47:00 UTC
[jira] [Updated] (DRILL-7271) Refactor Metadata interfaces and classes to contain all needed information for the File based Metastore
[ https://issues.apache.org/jira/browse/DRILL-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Volodymyr Vysotskyi updated DRILL-7271:
---------------------------------------
Description:
1. Merge info from metadataStatistics + statisticsKinds into one holder: Map<String, StatisticsHolder>.
2. Rename hasStatistics to hasDescriptiveStatistics
3. Remove drill-file-metastore-plugin
4. Move org.apache.drill.exec.physical.base.AbstractGroupScanWithMetadata.MetadataLevel to metadata module, rename to MetadataType and add new value: DIRECTORY.
5. Add JSON ser/de for ColumnStatistics, StatisticsHolder.
6. Add new info classes:
{noformat}
class TableInfo {
String storagePlugin;
String workspace;
String name;
String type;
String owner;
}
class MetadataInfo {
public static final String GENERAL_INFO_KEY = "GENERAL_INFO";
public static final String DEFAULT_PARTITION_KEY = "DEFAULT_PARTITION";
MetadataType type (enum);
String key;
String identifier;
}
{noformat}
7. Modify existing metadata classes:
org.apache.drill.metastore.FileTableMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace, tableType -> will be covered by TableInfo class
metadataType, metadataKey -> will be covered by MetadataInfo class
interestingColumns
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Set<String> partitionKeys; -> Map<String, String>
{noformat}
org.apache.drill.metastore.PartitionMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace -> will be covered by TableInfo class
metadataType, metadataKey, metadataIdentifier -> will be covered by MetadataInfo class
partitionValues (List<String>)
location (String) (for directory level metadata) - directory location
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Set<Path> location; -> locations
{noformat}
org.apache.drill.metastore.FileMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace -> will be covered by TableInfo class
metadataType, metadataKey, metadataIdentifier -> will be covered by MetadataInfo class
path - path to file
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Path location; - should contain directory to which file belongs
{noformat}
org.apache.drill.metastore.RowGroupMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace -> will be covered by TableInfo class
metadataType, metadataKey, metadataIdentifier -> will be covered by MetadataInfo class
path - path to file
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Path location; - should contain directory to which file belongs
{noformat}
8. Remove org.apache.drill.exec package from metastore module.
9. Rename ColumnStatisticsImpl class.
10. Separate existing classes in org.apache.drill.metastore package into sub-packages.
11. Rename FileTableMetadata -> BaseTableMetadata
12. TableMetadataProvider.getNonInterestingColumnsMeta() -> getNonInterestingColumnsMetadata
13. Introduce segment-level metadata class:
{noformat}
class SegmentMetadata {
TableInfo tableInfo;
MetadataInfo metadataInfo;
SchemaPath column;
TupleMetadata schema;
String location;
Map<SchemaPath, ColumnStatistics> columnsStatistics;
Map<String, StatisticsHolder> statistics;
List<String> partitionValues;
List<String> locations;
long lastModifiedTime;
}
{noformat}
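A minimal sketch of what item 1 could look like, assuming the two existing maps (statistic values and statisticsKinds) are keyed by the same statistic name. The StatisticsKind and StatisticsHolder types below are simplified, hypothetical stand-ins written for this illustration, not the actual Drill classes, and the merge helper is likewise an assumption rather than the implemented API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class StatisticsMergeSketch {

  // Hypothetical stand-in for Drill's StatisticsKind: only a name and an
  // "exact" flag are modeled here.
  static final class StatisticsKind {
    final String name;
    final boolean exact;

    StatisticsKind(String name, boolean exact) {
      this.name = name;
      this.exact = exact;
    }
  }

  // Hypothetical holder that keeps a statistic value together with its kind,
  // so that a single Map<String, StatisticsHolder> can replace both maps.
  static final class StatisticsHolder {
    final Object value;
    final StatisticsKind kind;

    StatisticsHolder(Object value, StatisticsKind kind) {
      this.value = value;
      this.kind = kind;
    }
  }

  // Merges the value map and the kind map into one map keyed by statistic name.
  // Assumption: every value has a kind registered under the same key.
  static Map<String, StatisticsHolder> merge(Map<String, Object> values,
                                             Map<String, StatisticsKind> kinds) {
    Map<String, StatisticsHolder> merged = new LinkedHashMap<>();
    for (Map.Entry<String, Object> entry : values.entrySet()) {
      StatisticsKind kind = kinds.get(entry.getKey());
      if (kind == null) {
        throw new IllegalArgumentException(
            "No StatisticsKind for statistic: " + entry.getKey());
      }
      merged.put(entry.getKey(), new StatisticsHolder(entry.getValue(), kind));
    }
    return merged;
  }

  public static void main(String[] args) {
    Map<String, Object> values = new LinkedHashMap<>();
    values.put("rowCount", 100L);

    Map<String, StatisticsKind> kinds = new LinkedHashMap<>();
    kinds.put("rowCount", new StatisticsKind("rowCount", true));

    Map<String, StatisticsHolder> merged = merge(values, kinds);
    StatisticsHolder holder = merged.get("rowCount");
    System.out.println(holder.kind.name + " = " + holder.value);
  }
}
```

Keeping the value and its kind in one holder means a reader of the map never has to look up a second map to interpret a statistic, which is presumably what makes a single JSON representation (item 5) straightforward.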
was:
1. Merge info from metadataStatistics + statisticsKinds into one holder: Map<String, StatisticsHolder>.
2. Rename hasStatistics to hasDescriptiveStatistics
3. Remove drill-file-metastore-plugin
4. Move org.apache.drill.exec.physical.base.AbstractGroupScanWithMetadata.MetadataLevel to metadata module, rename to MetadataType and add new value: DIRECTORY.
5. Add JSON ser/de for ColumnStatistics, StatisticsHolder.
6. Add new info classes:
{noformat}
class TableInfo {
String storagePlugin;
String workspace;
String name;
String type;
String owner;
}
class MetadataInfo {
public static final String GENERAL_INFO_KEY = "GENERAL_INFO";
public static final String DEFAULT_PARTITION_KEY = "DEFAULT_PARTITION";
MetadataType type (enum);
String key;
String identifier;
}
{noformat}
7. Modify existing metadata classes:
org.apache.drill.metastore.FileTableMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace, tableType -> will be covered by TableInfo class
metadataType, metadataKey -> will be covered by MetadataInfo class
interestingColumns
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Set<String> partitionKeys; -> Map<String, String>
{noformat}
org.apache.drill.metastore.PartitionMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace -> will be covered by TableInfo class
metadataType, metadataKey, metadataIdentifier -> will be covered by MetadataInfo class
partitionValues (List<String>)
location (String) (for directory level metadata) - directory location
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Set<Path> location; -> locations
{noformat}
org.apache.drill.metastore.FileMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace -> will be covered by TableInfo class
metadataType, metadataKey, metadataIdentifier -> will be covered by MetadataInfo class
path - path to file
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Path location; - should contain directory to which file belongs
{noformat}
org.apache.drill.metastore.RowGroupMetadata
{noformat}
missing fields
------------------
storagePlugin, workspace -> will be covered by TableInfo class
metadataType, metadataKey, metadataIdentifier -> will be covered by MetadataInfo class
path - path to file
fields to modify
----------------
private final Map<String, Object> tableStatistics;
private final Map<String, StatisticsKind> statisticsKinds;
private final Path location; - should contain directory to which file belongs
{noformat}
8. Remove org.apache.drill.exec package from metastore module.
9. Rename ColumnStatisticsImpl class.
10. Separate existing classes in org.apache.drill.metastore package into sub-packages.
11. Rename FileTableMetadata -> BaseTableMetadata
12. TableMetadataProvider.getNonInterestingColumnsMeta() -> getNonInterestingColumnsMetadata
> Refactor Metadata interfaces and classes to contain all needed information for the File based Metastore
> -------------------------------------------------------------------------------------------------------
>
> Key: DRILL-7271
> URL: https://issues.apache.org/jira/browse/DRILL-7271
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Arina Ielchiieva
> Assignee: Volodymyr Vysotskyi
> Priority: Major
> Fix For: 1.17.0
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)