Posted to dev@atlas.apache.org by "Nallapati, Sreenivasulu" <Sr...@intuit.com> on 2019/07/23 13:45:59 UTC
Atlas import taking huge amount of time
Hello folks,
We are trying to export our existing data and import it into a different Atlas system.
We have around 10000 entities in the exported zip file. The export takes around 2-3 minutes.
The total zip file size is 14 MB. The largest file in the zip is around 7 MB and has almost 1000 relationshipAttributes in it.
When we try to import this, the import runs for more than 25 hours. Is this expected behaviour? Is there any way to speed up this process?
Export command
curl -igk -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
"itemsToExport": [
{ "typeName": "kafka_topic" }
],
"options": {
"matchType": "forType"
}
}' "http://localhost:21000/api/atlas/admin/export" > /tmp/kafka_topic.zip
Import command
curl -ikg -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@kafka_topic.zip "http://localhost:21000/api/atlas/admin/import"
Please let us know if we are missing anything in the import process.
---
Regards,
Sreeni
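For reference, a minimal sketch of the same export and import calls wrapped in shell functions. The endpoints, credentials, and type name are the ones from the post above; the function names and the ATLAS_URL, ATLAS_AUTH, and RUN_ATLAS variables are hypothetical, introduced only for illustration. One deliberate difference: curl's -i flag writes the response headers into the output, so this sketch leaves it out and uses -o, so that only the zip body is saved. The explicit multipart header is also dropped, since -F makes curl send multipart/form-data on its own.

```shell
#!/bin/sh
# Assumed defaults, taken from the commands in the post.
ATLAS_URL="${ATLAS_URL:-http://localhost:21000}"
ATLAS_AUTH="${ATLAS_AUTH:-admin:admin}"

# Build the export request body for one entity type.
export_payload() {
    # $1 = Atlas type name, e.g. kafka_topic
    printf '{"itemsToExport":[{"typeName":"%s"}],"options":{"matchType":"forType"}}' "$1"
}

# Export all entities of a type into a zip file.
atlas_export() {
    # $1 = type name, $2 = output zip path
    curl -sk -X POST -u "$ATLAS_AUTH" \
        -H "Content-Type: application/json" \
        -H "Cache-Control: no-cache" \
        -d "$(export_payload "$1")" \
        -o "$2" \
        "$ATLAS_URL/api/atlas/admin/export"
}

# Import a previously exported zip file.
atlas_import() {
    # $1 = zip path
    curl -sk -X POST -u "$ATLAS_AUTH" \
        -H "Cache-Control: no-cache" \
        -F "data=@$1" \
        "$ATLAS_URL/api/atlas/admin/import"
}

# Only talk to a server when explicitly asked to.
if [ "${RUN_ATLAS:-0}" = "1" ]; then
    atlas_export kafka_topic /tmp/kafka_topic.zip
    atlas_import /tmp/kafka_topic.zip
fi
```

Using the same path for the export output and the import input avoids having to remember which directory the zip was written to.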
Re: Atlas import taking huge amount of time
Posted by Ashutosh Mestry <am...@cloudera.com>.
Right now this implementation is only for Import.
~ ashutosh
Ashutosh Mestry <ma...@cloudera.com> . Software Engineer . Cloudera, Inc. . +1-310-988 0670
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
On 8/8/19, 12:04 PM, "Bolke de Bruin" <bd...@gmail.com> wrote:
Will this solve / improve the issue with extremely slow Kafka consumption as well?
Cheers
Bolke
Re: Atlas import taking huge amount of time
Posted by Bolke de Bruin <bd...@gmail.com>.
Will this solve / improve the issue with extremely slow Kafka consumption as well?
Cheers
Bolke
Re: Atlas import taking huge amount of time
Posted by "Nallapati, Sreenivasulu" <Sr...@intuit.com>.
Sure Ashutosh,
Please let me know once it is done.
---
Regards,
Sreeni
Re: Atlas import taking huge amount of time
Posted by Ashutosh Mestry <am...@cloudera.com>.
There are a few dependent patches. I will try to put out a patch for version 2.0. It will take me a few days to get this done. Please bear with me.
~ ashutosh
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
Re: Atlas import taking huge amount of time
Posted by "Nallapati, Sreenivasulu" <Sr...@intuit.com>.
Hi Ashutosh,
Thanks for your reply.
We are currently using Atlas 2.0.0 and I am not able to apply this patch. It produces a lot of compilation errors.
Do you have a patch for 2.0.0?
---
Regards,
Sreeni
Re: Atlas import taking huge amount of time
Posted by Ashutosh Mestry <am...@cloudera.com.INVALID>.
Hi
The existing import processes one entity at a time, so the time taken grows linearly with the number of entities. There is a JIRA that improves the situation. It is being tested right now.
Please take a look at:
JIRA: https://issues.apache.org/jira/browse/ATLAS-3320
Review: https://reviews.apache.org/r/71025/ (latest patch is here)
Best regards,
~ ashutosh
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
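To put the linear behaviour in numbers, here is a rough back-of-the-envelope sketch using the figures from the original post (about 10000 entities, import running for more than 25 hours). It assumes the time is spread evenly across entities, which the thread does not confirm, and because the import had already run for more than 25 hours the per-entity figure is a lower bound. The function names and the 50000-entity projection are hypothetical.

```shell
#!/bin/sh
# Figures reported in the original post.
entities=10000
elapsed_hours=25

# Average seconds per entity, assuming an even (linear) spread.
seconds_per_entity() {
    # $1 = elapsed hours, $2 = entity count
    echo $(( $1 * 3600 / $2 ))
}

# Hours needed for a given entity count at a fixed per-entity rate.
projected_hours() {
    # $1 = seconds per entity, $2 = entity count
    echo $(( $1 * $2 / 3600 ))
}

rate=$(seconds_per_entity "$elapsed_hours" "$entities")
echo "at least ${rate}s per entity"
echo "50000 entities would need about $(projected_hours "$rate" 50000) hours at that rate"
```

With these inputs the rate works out to 9 seconds per entity, so five times the entities means roughly five times the wall-clock time.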