Posted to user@spark.apache.org by Brad Heller <br...@gmail.com> on 2014/04/20 23:50:17 UTC
Hung inserts?
Hey list,
I've got some CSV data I'm importing from S3. I can create the external
table well enough, and I can also do a CREATE TABLE ... AS SELECT ... from
it to pull the data internal to Spark.
Here's the HQL for my external table:
https://gist.github.com/bradhe/11126024
Now I'd like to add partitioning and clustering to my permanent table. So,
I create a new table and try to do an INSERT ... SELECT.
Here's the HQL for my internal, partitioned table and the insert select:
https://gist.github.com/bradhe/11126047
Oddly, the query is scheduled...but it never makes any progress!
http://i.imgur.com/vXvgpzD.png
Is this a bug? Am I doing something dumb?
Thanks,
Brad Heller
Re: Hung inserts?
Posted by Brad Heller <br...@gmail.com>.
So after a little more investigation, it turns out this issue happens
specifically when I interact with Shark server. If I log in to the master
and start a Shark session (./bin/shark), everything works as expected.
I'm starting Shark server with the following upstart script. Am I doing
something wrong? https://gist.github.com/bradhe/11159123
On Mon, Apr 21, 2014 at 3:31 PM, Brad Heller <br...@gmail.com> wrote:
> I tried removing the CLUSTERED directive and get the same results :( I
> also removed SORTED, same deal.
>
> I'm going to try removing partitioning altogether for now.
Re: Hung inserts?
Posted by Brad Heller <br...@gmail.com>.
I tried removing the CLUSTERED directive and get the same results :( I also
removed SORTED, same deal.
I'm going to try removing partitioning altogether for now.
On Mon, Apr 21, 2014 at 4:58 AM, Mayur Rustagi <ma...@gmail.com> wrote:
> Clustering is not supported. Can you remove that & give it a go?
Re: Hung inserts?
Posted by Mayur Rustagi <ma...@gmail.com>.
Clustering is not supported. Can you remove that & give it a go?
Mayur Rustagi
Ph: +1 (760) 203 3257
http://www.sigmoidanalytics.com
@mayur_rustagi <https://twitter.com/mayur_rustagi>