You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@spark.apache.org by Benjamin Kim <bb...@gmail.com> on 2016/10/08 16:42:17 UTC

Inserting New Primary Keys

I have a table with data already in it that has primary keys generated by the function monotonicallyIncreasingId. Now, I want to insert more data into it with primary keys that will auto-increment from where the existing data left off. How would I do this? There is no argument I can pass into the function monotonicallyIncreasingId to seed it.

Thanks,
Ben


---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscribe@spark.apache.org


Re: Inserting New Primary Keys

Posted by Benjamin Kim <bb...@gmail.com>.
Jean,

I see your point. For the incremental data, which is very small, I should make sure that the PARTITION BY in the OVER(PARTITION BY ...) is left out so that all the data will be in one partition when assigned a row number. The query below should avoid any problems.

“SELECT ROW_NUMBER() OVER() + b.id_max AS id, a.* FROM source a CROSS JOIN (SELECT COALESCE(MAX(id),0) AS id_max FROM tmp_destination) b”.

But initially, I’ll use the monotonicallyIncreasingId function when I first load the data.

Thanks,
Ben


> On Oct 10, 2016, at 8:36 AM, Jean Georges Perrin <jg...@jgp.net> wrote:
> 
> Is there only one process adding rows? because this seems a little risky if you have multiple threads doing that… 
> 
>> On Oct 8, 2016, at 1:43 PM, Benjamin Kim <bbuild11@gmail.com <ma...@gmail.com>> wrote:
>> 
>> Mich,
>> 
>> After much searching, I found and am trying to use “SELECT ROW_NUMBER() OVER() + b.id_max AS id, a.* FROM source a CROSS JOIN (SELECT COALESCE(MAX(id),0) AS id_max FROM tmp_destination) b”. I think this should do it.
>> 
>> Thanks,
>> Ben
>> 
>> 
>>> On Oct 8, 2016, at 9:48 AM, Mich Talebzadeh <mich.talebzadeh@gmail.com <ma...@gmail.com>> wrote:
>>> 
>>> can you get the max value from the current  table and start from MAX(ID) + 1 assuming it is a numeric value (it should be)?
>>> 
>>> HTH
>>> 
>>> HTH
>>> 
>>> Dr Mich Talebzadeh
>>>  
>>> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>>>  
>>> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>>> 
>>> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>>>  
>>> 
>>> On 8 October 2016 at 17:42, Benjamin Kim <bbuild11@gmail.com <ma...@gmail.com>> wrote:
>>> I have a table with data already in it that has primary keys generated by the function monotonicallyIncreasingId. Now, I want to insert more data into it with primary keys that will auto-increment from where the existing data left off. How would I do this? There is no argument I can pass into the function monotonicallyIncreasingId to seed it.
>>> 
>>> Thanks,
>>> Ben
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org <ma...@spark.apache.org>
>>> 
>>> 
>> 
> 


Re: Inserting New Primary Keys

Posted by Jean Georges Perrin <jg...@jgp.net>.
Is there only one process adding rows? because this seems a little risky if you have multiple threads doing that… 

> On Oct 8, 2016, at 1:43 PM, Benjamin Kim <bb...@gmail.com> wrote:
> 
> Mich,
> 
> After much searching, I found and am trying to use “SELECT ROW_NUMBER() OVER() + b.id_max AS id, a.* FROM source a CROSS JOIN (SELECT COALESCE(MAX(id),0) AS id_max FROM tmp_destination) b”. I think this should do it.
> 
> Thanks,
> Ben
> 
> 
>> On Oct 8, 2016, at 9:48 AM, Mich Talebzadeh <mich.talebzadeh@gmail.com <ma...@gmail.com>> wrote:
>> 
>> can you get the max value from the current  table and start from MAX(ID) + 1 assuming it is a numeric value (it should be)?
>> 
>> HTH
>> 
>> HTH
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>>  
>> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>> 
>> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>>  
>> 
>> On 8 October 2016 at 17:42, Benjamin Kim <bbuild11@gmail.com <ma...@gmail.com>> wrote:
>> I have a table with data already in it that has primary keys generated by the function monotonicallyIncreasingId. Now, I want to insert more data into it with primary keys that will auto-increment from where the existing data left off. How would I do this? There is no argument I can pass into the function monotonicallyIncreasingId to seed it.
>> 
>> Thanks,
>> Ben
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: user-unsubscribe@spark.apache.org <ma...@spark.apache.org>
>> 
>> 
> 


Re: Inserting New Primary Keys

Posted by Benjamin Kim <bb...@gmail.com>.
Mich,

After much searching, I found and am trying to use “SELECT ROW_NUMBER() OVER() + b.id_max AS id, a.* FROM source a CROSS JOIN (SELECT COALESCE(MAX(id),0) AS id_max FROM tmp_destination) b”. I think this should do it.

Thanks,
Ben


> On Oct 8, 2016, at 9:48 AM, Mich Talebzadeh <mi...@gmail.com> wrote:
> 
> can you get the max value from the current  table and start from MAX(ID) + 1 assuming it is a numeric value (it should be)?
> 
> HTH
> 
> HTH
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>  
> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>  
> 
> On 8 October 2016 at 17:42, Benjamin Kim <bbuild11@gmail.com <ma...@gmail.com>> wrote:
> I have a table with data already in it that has primary keys generated by the function monotonicallyIncreasingId. Now, I want to insert more data into it with primary keys that will auto-increment from where the existing data left off. How would I do this? There is no argument I can pass into the function monotonicallyIncreasingId to seed it.
> 
> Thanks,
> Ben
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org <ma...@spark.apache.org>
> 
> 


Re: Inserting New Primary Keys

Posted by Mich Talebzadeh <mi...@gmail.com>.
can you get the max value from the current  table and start from MAX(ID) +
1 assuming it is a numeric value (it should be)?

HTH

HTH

Dr Mich Talebzadeh



LinkedIn * https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
<https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>*



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 8 October 2016 at 17:42, Benjamin Kim <bb...@gmail.com> wrote:

> I have a table with data already in it that has primary keys generated by
> the function monotonicallyIncreasingId. Now, I want to insert more data
> into it with primary keys that will auto-increment from where the existing
> data left off. How would I do this? There is no argument I can pass into
> the function monotonicallyIncreasingId to seed it.
>
> Thanks,
> Ben
>
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: user-unsubscribe@spark.apache.org
>
>