Posted to dev@atlas.apache.org by "Nallapati, Sreenivasulu" <Sr...@intuit.com> on 2019/07/23 13:45:59 UTC

Atlas import taking huge amount of time

Hello folks,

We are trying to export existing data from one Atlas system and import it into a different one.
The exported zip file contains around 10,000 entities. The export takes around 2-3 minutes.
The total zip file size is 14 MB. The largest file in the zip is around 7 MB and has almost 1000 relationshipAttributes in it.
When we try to import this file, the import runs for more than 25 hours. Is this expected behaviour? Is there any way to speed up this process?
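
As a rough back-of-the-envelope check on the figures above, the average time per entity can be worked out directly. This is only a sketch: it assumes the entities are handled strictly one after another, and the function name is illustrative.

```python
def seconds_per_entity(total_hours: float, entity_count: int) -> float:
    """Average wall-clock seconds per entity, assuming strictly serial processing."""
    return total_hours * 3600 / entity_count


# Figures reported in this message: more than 25 hours for around 10,000 entities.
print(seconds_per_entity(25, 10_000))  # 9.0
```

Under that assumption, each entity takes about 9 seconds or more on average during import, while the whole export finishes in 2-3 minutes.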

Export command
curl -igk -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "itemsToExport": [
       { "typeName": "kafka_topic" }
    ],
    "options": {
        "matchType": "forType"
    }
}' "http://localhost:21000/api/atlas/admin/export" > /tmp/kafka_topic.zip
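
For anyone scripting the same export call rather than using curl, here is a minimal sketch using only the Python standard library. The endpoint, payload, headers and credentials are copied from the command above; the function name is hypothetical, and the part that actually sends the request is left commented out because it needs a reachable Atlas server.

```python
import base64
import json
import urllib.request


def build_export_request(base_url: str, type_name: str,
                         user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the export endpoint."""
    payload = {
        "itemsToExport": [{"typeName": type_name}],
        "options": {"matchType": "forType"},
    }
    # HTTP basic auth, the equivalent of curl's -u user:password.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/api/atlas/admin/export",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Cache-Control": "no-cache",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_export_request("http://localhost:21000", "kafka_topic", "admin", "admin")
    # Sending needs a running Atlas instance, so it stays commented out here:
    # with urllib.request.urlopen(req) as resp, open("/tmp/kafka_topic.zip", "wb") as out:
    #     out.write(resp.read())
    print(req.get_method(), req.full_url)
```

This sketch writes only the response body to the zip file. With curl, the -i option also writes the response headers to the output, so it may be worth dropping it when redirecting to a file.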


Import command
curl -ikg -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@kafka_topic.zip "http://localhost:21000/api/atlas/admin/import"
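
Since one file in the zip is around 7 MB, it may help to check that the archive is intact and see which entries dominate it before starting a long import. This is a minimal sketch using Python's standard zipfile module; the function name and the path in the usage comment are illustrative.

```python
import zipfile


def largest_entries(zip_path: str, top_n: int = 5) -> list[tuple[str, int]]:
    """Return the top_n (name, uncompressed size in bytes) pairs, largest first."""
    with zipfile.ZipFile(zip_path) as archive:
        # testzip() returns the name of the first corrupt member, or None if all are fine.
        bad_member = archive.testzip()
        if bad_member is not None:
            raise ValueError(f"corrupt entry in archive: {bad_member}")
        sizes = [(info.filename, info.file_size) for info in archive.infolist()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top_n]


# Example usage (path taken from the export command in this message):
# for name, size in largest_entries("/tmp/kafka_topic.zip"):
#     print(f"{size:>12} {name}")
```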

Please let us know if we are missing something in the import process.


---
Regards,
Sreeni

Re: Atlas import taking huge amount of time

Posted by Ashutosh Mestry <am...@cloudera.com>.
Right now this implementation is only for Import.

~ ashutosh
Ashutosh Mestry <ma...@cloudera.com> . Software Engineer . Cloudera, Inc. . +1-310-988 0670
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
 

On 8/8/19, 12:04 PM, "Bolke de Bruin" <bd...@gmail.com> wrote:

    Will this solve / improve the issue with extremely slow Kafka consumption as well? 
    
    Cheers
    Bolke
    

Re: Atlas import taking huge amount of time

Posted by Bolke de Bruin <bd...@gmail.com>.
Will this solve / improve the issue with extremely slow Kafka consumption as well? 

Cheers
Bolke


Re: Atlas import taking huge amount of time

Posted by "Nallapati, Sreenivasulu" <Sr...@intuit.com>.
Sure Ashutosh,

Please let me know once it is done. 

 
 
---
Regards,
Sreeni 


Re: Atlas import taking huge amount of time

Posted by Ashutosh Mestry <am...@cloudera.com>.
There are a few dependent patches. I will try to put out a patch for version 2.0. It will take me a few days to get this done. Please bear with me.

~ ashutosh
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
 


Re: Atlas import taking huge amount of time

Posted by "Nallapati, Sreenivasulu" <Sr...@intuit.com>.
Hi Ashutosh,

Thanks for your reply. 

We are currently using Atlas 2.0.0 and I am not able to apply this patch. It produces a lot of compilation errors.

Do you have a patch for 2.0.0?

 
---
Regards,
Sreeni 


Re: Atlas import taking huge amount of time

Posted by Ashutosh Mestry <am...@cloudera.com.INVALID>.
Hi

The existing import processes one entity at a time, so the time taken grows linearly with the number of entities. There is a JIRA that improves the situation; the change is being tested right now.

Please take a look at: 
JIRA: https://issues.apache.org/jira/browse/ATLAS-3320
Review: https://reviews.apache.org/r/71025/ (latest patch is here)
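
To illustrate why one-entity-at-a-time processing scales linearly, here is a toy cost model. It is only a sketch. The 9 seconds per entity is simply 25 hours divided by 10,000 entities, the figures reported in the original question. The workers parameter is a hypothetical illustration of how splitting the work would change the estimate, not a description of what the ATLAS-3320 patch does.

```python
import math


def estimated_import_seconds(entity_count: int, seconds_per_entity: float,
                             workers: int = 1) -> float:
    """Toy estimate: each worker handles its share of entities one at a time."""
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return math.ceil(entity_count / workers) * seconds_per_entity


# One entity at a time: doubling the entity count doubles the estimate.
print(estimated_import_seconds(10_000, 9.0))  # 90000.0
print(estimated_import_seconds(20_000, 9.0))  # 180000.0
# Hypothetical: the same work split evenly across 4 workers.
print(estimated_import_seconds(10_000, 9.0, workers=4))  # 22500.0
```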

Best regards,

~ ashutosh
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
 

