Posted to user@hive.apache.org by Mich Talebzadeh <mi...@peridale.co.uk> on 2016/01/16 23:58:36 UTC

A simple question

We usually assume that Hive's strength comes from its query language being
very close to ANSI SQL, which lets people write queries with a gentle
learning curve.

 

Additionally, we like to assume that Hive's strength is schema on read.
However, after some discussion, it seems we still need to know the table's
composition (columns, types, etc.) before we can create a table for raw data.

 

In short, it does not make sense to store data in Hive without knowing
something about its structure. For example, we cannot ingest data from an
Excel sheet without understanding its format.

 

A simple example: if I have a currency column in Excel, I can do aggregates
on that column in Excel. In a relational database I can get away with storing
the value as a number in the format 999,999,999.99, with another column for
the currency code, CCY (USD, GBP, EUR, etc.).

 

In Excel I can display $999,999,999.99 and do operations on it. In Hive I
will need to find out how to interpret the whole string. What other options
do I have?
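
To make the problem concrete, here is a minimal sketch in Python (not Hive)
of the interpretation step described above: splitting a display string into
a currency code and a numeric amount that can be aggregated. The names
parse_money and SYMBOL_TO_CCY and the sample rows are made up for
illustration, and the sketch assumes a leading symbol, comma thousands
separators and a dot as the decimal point.

```python
from decimal import Decimal
import re

# Hypothetical mapping from display symbol to currency code (CCY).
SYMBOL_TO_CCY = {"$": "USD", "£": "GBP", "€": "EUR"}

# Symbol, optional space, then digits with or without thousands separators.
MONEY_RE = re.compile(
    r"([$£€])\s*(\d{1,3}(?:,\d{3})*(?:\.\d+)?|\d+(?:\.\d+)?)"
)

def parse_money(raw):
    """Split a string such as '$1,234.50' into (ccy, Decimal amount)."""
    match = MONEY_RE.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"unrecognised money format: {raw!r}")
    symbol, digits = match.groups()
    return SYMBOL_TO_CCY[symbol], Decimal(digits.replace(",", ""))

# Made-up raw values, as they might arrive from a spreadsheet export.
rows = ["$1,234.50", "$999,999,999.99", "£10.00"]

# Aggregate per currency once each string has been interpreted.
totals = {}
for raw in rows:
    ccy, amount = parse_money(raw)
    totals[ccy] = totals.get(ccy, Decimal("0")) + amount
```

In Hive the same idea would presumably be expressed with string functions
such as regexp_replace and a cast to DECIMAL at query time, but the point
stands: the format has to be known before the column is usable.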

 

I don't know if this makes sense. We can of course store Excel data in Hive
as a heap. However, some columns end up being difficult to interpret when the
schema is applied at read time.

 

 

Thanks

 

 

Dr Mich Talebzadeh

 

