Posted to dev@drill.apache.org by GitBox <gi...@apache.org> on 2019/10/16 16:49:36 UTC

[GitHub] [drill] paul-rogers commented on issue #1749: DRILL-7177: Format Plugin for Excel Files

URL: https://github.com/apache/drill/pull/1749#issuecomment-542792797
 
 
   One additional thought: how are column types handled? Do we require all cells within a column to have the same type?
   
   Name | Balance
   ----| -----
   Fred | 123.45
   Barney | 556.78
   
   If so, then it might make sense to hold an array of column handler objects, like in the earlier version of the Regex plugin. Each handler holds the column accessor for one column and performs any required type conversions. This would be cleaner and faster than doing a `switch` per column on every row.
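
A rough sketch of the handler-array idea, in plain Java. All names here (`ColumnSink`, `ColumnHandler`, `TextHandler`, `NumberHandler`, `RowLoader`) are made up for illustration and are not Drill or Regex plugin classes; `ColumnSink` only stands in for whatever column accessor the reader actually uses, and cells are assumed to arrive as strings.

```java
// Hypothetical stand-in for a per-column writer; NOT the Drill accessor API.
interface ColumnSink {
    void set(Object value);
}

// One handler per column. The type decision is made once, when the
// handler is created, so the per-row loop needs no switch.
abstract class ColumnHandler {
    protected final ColumnSink sink;

    ColumnHandler(ColumnSink sink) {
        this.sink = sink;
    }

    abstract void load(String rawCell);
}

// Passes text cells through unchanged.
class TextHandler extends ColumnHandler {
    TextHandler(ColumnSink sink) {
        super(sink);
    }

    @Override
    void load(String rawCell) {
        sink.set(rawCell);
    }
}

// Converts the raw cell to a double before writing it.
class NumberHandler extends ColumnHandler {
    NumberHandler(ColumnSink sink) {
        super(sink);
    }

    @Override
    void load(String rawCell) {
        sink.set(Double.parseDouble(rawCell.trim()));
    }
}

// Holds the handler array and dispatches each cell to its column's handler.
class RowLoader {
    private final ColumnHandler[] handlers;

    RowLoader(ColumnHandler[] handlers) {
        this.handlers = handlers;
    }

    void loadRow(String[] cells) {
        for (int i = 0; i < handlers.length; i++) {
            handlers[i].load(cells[i]);
        }
    }
}
```

With a `TextHandler` for Name and a `NumberHandler` for Balance, `loadRow(new String[] {"Fred", "123.45"})` writes one value to each column without inspecting the column type in the loop.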
   
   Can a column type vary?
   
   Name | Balance
   ----| -----
   Fred | 123.45
   Barney | "Not sure"
   
   If so, then the code has to handle conversions. The above would generate an error, since "Not sure" is not a number, but would we allow a numeric value stored as a string:
   
   Name | Balance
   ----| -----
   Fred | 123.45
   Barney | "556.78"
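
If we do allow that last case, the conversion for a numeric column could look roughly like the sketch below. `CellConversion.toDouble` is a hypothetical helper, not part of the plugin, and it assumes a cell arrives as either a `Number` or a `String`. Note that `Double.parseDouble` is fairly permissive (it accepts forms such as "NaN" and "1e3"), so a real reader may want stricter rules.

```java
// Hypothetical helper, for illustration only.
final class CellConversion {
    private CellConversion() {}

    // Returns the cell as a double, accepting numbers and numeric strings.
    // Throws IllegalArgumentException for anything else, e.g. "Not sure".
    static double toDouble(Object cell) {
        if (cell instanceof Number) {
            return ((Number) cell).doubleValue();
        }
        if (cell instanceof String) {
            try {
                return Double.parseDouble(((String) cell).trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                    "Cannot convert \"" + cell + "\" to a number", e);
            }
        }
        throw new IllegalArgumentException("Unsupported cell value: " + cell);
    }
}
```

Under these assumptions both `123.45` and `"556.78"` load into the Balance column, while `"Not sure"` fails with an error that names the offending value.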
   
   Can a column start as empty (null) so that we have to defer type selection?
   
   Name | Balance
   ----| -----
   Fred | 
   Barney | 556.78
   
   If so, then we need a way to defer column type selection until we see the first non-null value. The (not yet merged) new JSON reader handles the deferred-type case; I can share that logic if you need it.
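
To make the deferred-type idea concrete, here is a minimal sketch. It is not the JSON reader's logic, just one simple way to do it: count the leading nulls, pick the type at the first non-null cell, then back-fill the nulls. `DeferredColumn` and its `Type` enum are hypothetical names, and only two types are modeled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical column buffer that defers its type decision.
class DeferredColumn {
    enum Type { NUMBER, TEXT }

    private Type type;          // null means "not decided yet"
    private int pendingNulls;   // nulls seen before the type was known
    private final List<Object> values = new ArrayList<>();

    void add(Object cell) {
        if (cell == null) {
            if (type == null) {
                pendingNulls++;      // remember the null, decide later
            } else {
                values.add(null);
            }
            return;
        }
        if (type == null) {
            // First non-null value: choose the type, then back-fill.
            type = (cell instanceof Number) ? Type.NUMBER : Type.TEXT;
            for (int i = 0; i < pendingNulls; i++) {
                values.add(null);
            }
            pendingNulls = 0;
        }
        if (type == Type.NUMBER) {
            if (!(cell instanceof Number)) {
                throw new IllegalArgumentException(
                    "Expected a number but found: " + cell);
            }
            values.add(((Number) cell).doubleValue());
        } else {
            values.add(cell.toString());
        }
    }

    // Remains null if every cell seen so far was null.
    Type type() {
        return type;
    }

    List<Object> values() {
        return values;
    }
}
```

For the table above, adding `null` (Fred) and then `556.78` (Barney) resolves the column to `NUMBER` with one back-filled null. A column that is null in every row never resolves, so the reader would still need a default type for that case.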

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services