You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@sqoop.apache.org by Santosh Achhra <sa...@gmail.com> on 2013/04/07 12:23:36 UTC

Sqoop converting timestamp field to string

Hello,

I am using hive version 0.9 which supports timestamp fields. When I import
data using sqoop using hive-import option, sqoop converts timestamp fields
to string.

sqoop version is Sqoop 1.4.1-cdh4.1.0


Good wishes,always !
Santosh

Re: Sqoop converting timestamp field to string

Posted by Santosh Achhra <sa...@gmail.com>.
Hi Felix,

Many thanks for your inputs. Yes using  --map-column-hive did work for me
but as you said it better to go with string columns.



Good wishes,always !
Santosh


On Tue, Apr 9, 2013 at 5:49 AM, Felix GV <fe...@mate1inc.com> wrote:

> Actually, what we needed to use inside of the Hive date functions to use
> our timestamp columns was this:
>
> from_unixtime(bigint unixtime[, string format])
>
> The rest of the points I brought up remain the same, IIRC...
>
> --
> Felix
>
>
> On Mon, Apr 8, 2013 at 5:44 PM, Felix GV <fe...@mate1inc.com> wrote:
>
>> You can use --map-column-hive to manually specify which columns should
>> be mapped to a timestamp instead of a string. See the sqoop guide<http://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html#_importing_data_into_hive>for more info.
>>
>> A couple caveats:
>>
>>    - I don't know of any way of changing the default sqoop behavior so
>>    you don't need to manually name all relevant columns in every table you're
>>    sqooping.
>>    - MySQL's date columns break when converted with this mapping
>>    (datetime and timestamp work, but not date). In order to map a MySQL date
>>    column, I had to use a custom query that converted the column to a datetime
>>    (with 00:00:00 time values) inside of MySQL so that sqoop doesn't choke on
>>    it. Obviously, this is even more annoying than having to set
>>    --map-column-hive, since it makes each sqoop command even more ad-hoc.
>>    - In the end, although all of that worked (in a very ad-hoc manner),
>>    we decided to drop it all and just go with string columns in Hive, since
>>    all (or most) of the Hive date-related functions<https://cwiki.apache.org/Hive/languagemanual-udf.html#LanguageManualUDF-DateFunctions>take only strings as input parameters, and not timestamps, so having
>>    timestamp columns meant we had to use to_date(string timestamp) inside of
>>    each Hive date function, which made all of our queries a bit unwieldy, and
>>    also (I assume) precluded us from gaining any speed advantages from having
>>    timestamps instead of strings.
>>
>> The last point makes me think Hive timestamps are not really ready
>> for prime time yet... They seem to be tacked on top of the system in a way
>> that's fully well integrated.
>>
>> Please let me know if you think I'm drawing the wrong conclusions from
>> this :) ...!
>>
>> --
>> Felix
>>
>>
>> On Sun, Apr 7, 2013 at 6:23 AM, Santosh Achhra <sa...@gmail.com>wrote:
>>
>>> Hello,
>>>
>>> I am using hive version 0.9 which supports timestamp fields. When I
>>> import data using sqoop using hive-import option, sqoop converts timestamp
>>> fields to string.
>>>
>>> sqoop version is Sqoop 1.4.1-cdh4.1.0
>>>
>>>
>>> Good wishes,always !
>>> Santosh
>>>
>>
>>
>

Re: Sqoop converting timestamp field to string

Posted by Felix GV <fe...@mate1inc.com>.
Actually, what we needed to use inside of the Hive date functions to use
our timestamp columns was this:

from_unixtime(bigint unixtime[, string format])

The rest of the points I brought up remain the same, IIRC...

--
Felix


On Mon, Apr 8, 2013 at 5:44 PM, Felix GV <fe...@mate1inc.com> wrote:

> You can use --map-column-hive to manually specify which columns should be
> mapped to a timestamp instead of a string. See the sqoop guide<http://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html#_importing_data_into_hive>for more info.
>
> A couple caveats:
>
>    - I don't know of any way of changing the default sqoop behavior so
>    you don't need to manually name all relevant columns in every table you're
>    sqooping.
>    - MySQL's date columns break when converted with this mapping
>    (datetime and timestamp work, but not date). In order to map a MySQL date
>    column, I had to use a custom query that converted the column to a datetime
>    (with 00:00:00 time values) inside of MySQL so that sqoop doesn't choke on
>    it. Obviously, this is even more annoying than having to set
>    --map-column-hive, since it makes each sqoop command even more ad-hoc.
>    - In the end, although all of that worked (in a very ad-hoc manner),
>    we decided to drop it all and just go with string columns in Hive, since
>    all (or most) of the Hive date-related functions<https://cwiki.apache.org/Hive/languagemanual-udf.html#LanguageManualUDF-DateFunctions>take only strings as input parameters, and not timestamps, so having
>    timestamp columns meant we had to use to_date(string timestamp) inside of
>    each Hive date function, which made all of our queries a bit unwieldy, and
>    also (I assume) precluded us from gaining any speed advantages from having
>    timestamps instead of strings.
>
> The last point makes me think Hive timestamps are not really ready
> for prime time yet... They seem to be tacked on top of the system in a way
> that's fully well integrated.
>
> Please let me know if you think I'm drawing the wrong conclusions from
> this :) ...!
>
> --
> Felix
>
>
> On Sun, Apr 7, 2013 at 6:23 AM, Santosh Achhra <sa...@gmail.com>wrote:
>
>> Hello,
>>
>> I am using hive version 0.9 which supports timestamp fields. When I
>> import data using sqoop using hive-import option, sqoop converts timestamp
>> fields to string.
>>
>> sqoop version is Sqoop 1.4.1-cdh4.1.0
>>
>>
>> Good wishes,always !
>> Santosh
>>
>
>

Re: Sqoop converting timestamp field to string

Posted by Felix GV <fe...@mate1inc.com>.
You can use --map-column-hive to manually specify which columns should be
mapped to a timestamp instead of a string. See the sqoop
guide<http://sqoop.apache.org/docs/1.4.1-incubating/SqoopUserGuide.html#_importing_data_into_hive>for
more info.

A couple caveats:

   - I don't know of any way of changing the default sqoop behavior so you
   don't need to manually name all relevant columns in every table you're
   sqooping.
   - MySQL's date columns break when converted with this mapping (datetime
   and timestamp work, but not date). In order to map a MySQL date column, I
   had to use a custom query that converted the column to a datetime (with
   00:00:00 time values) inside of MySQL so that sqoop doesn't choke on it.
   Obviously, this is even more annoying than having to set --map-column-hive,
   since it makes each sqoop command even more ad-hoc.
   - In the end, although all of that worked (in a very ad-hoc manner), we
   decided to drop it all and just go with string columns in Hive, since all
   (or most) of the Hive date-related
functions<https://cwiki.apache.org/Hive/languagemanual-udf.html#LanguageManualUDF-DateFunctions>take
only strings as input parameters, and not timestamps, so having
   timestamp columns meant we had to use to_date(string timestamp) inside of
   each Hive date function, which made all of our queries a bit unwieldy, and
   also (I assume) precluded us from gaining any speed advantages from having
   timestamps instead of strings.

The last point makes me think Hive timestamps are not really ready
for prime time yet... They seem to be tacked on top of the system in a way
that's fully well integrated.

Please let me know if you think I'm drawing the wrong conclusions from this
:) ...!

--
Felix


On Sun, Apr 7, 2013 at 6:23 AM, Santosh Achhra <sa...@gmail.com>wrote:

> Hello,
>
> I am using hive version 0.9 which supports timestamp fields. When I import
> data using sqoop using hive-import option, sqoop converts timestamp fields
> to string.
>
> sqoop version is Sqoop 1.4.1-cdh4.1.0
>
>
> Good wishes,always !
> Santosh
>