Posted to user@atlas.apache.org by Bolke de Bruin <bd...@gmail.com> on 2019/08/08 19:04:48 UTC

Re: Atlas import taking huge amount of time

Will this solve / improve the issue with extremely slow Kafka consumption as well? 

Cheers
Bolke

Sent from my iPhone

> On 25 Jul 2019, at 09:54, Nallapati, Sreenivasulu <Sr...@intuit.com> wrote:
> 
> Sure Ashutosh,
> 
> Please let me know once it is done. 
> 
> 
> 
> ---
> Regards,
> Sreeni 
> 
> On 25/07/19, 4:52 AM, "Ashutosh Mestry" <am...@cloudera.com> wrote:
> 
>    This email is from an external sender.
> 
> 
>    There are few dependent patches. I will try to put out a patch on version 2.0. It will take me few days to get this. Please bear with me.
> 
>    ~ ashutosh
>    .......
>    No hurry, no pause. – Tim Ferriss, Life Hacker, Author
> 
> 
>    On 7/24/19, 4:31 AM, "Nallapati, Sreenivasulu" <Sr...@intuit.com> wrote:
> 
>        Hi Ashutosh,
> 
>        Thanks for your reply.
> 
>        Currently we are using Atlas 2.0.0 and I am not able to apply this patch. It has lot of compilation errors.
> 
>        Do you have a patch for 2.0.0?
> 
> 
>        ---
>        Regards,
>        Sreeni
> 
>        On 23/07/19, 11:36 PM, "Ashutosh Mestry" <am...@cloudera.com> wrote:
> 
>            This email is from an external sender.
> 
> 
>            Hi
> 
>            The existing import processes one entity at a time, so the time taken grows linearly with the number of entities. There is a JIRA that improves the situation. It is being tested right now.
> 
>            Please take a look at:
>            JIRA: https://issues.apache.org/jira/browse/ATLAS-3320
>            Review: https://reviews.apache.org/r/71025/ (latest patch is here)
> 
>            Best regards,
> 
>            ~ ashutosh
>            .......
>            No hurry, no pause. – Tim Ferriss, Life Hacker, Author
> 
> 
>            On 7/23/19, 6:46 AM, "Nallapati, Sreenivasulu" <Sr...@intuit.com> wrote:
> 
>                Hello folks,
> 
>                We are trying to export the existing data and import it into a different Atlas system.
>                We have around 10000 entities in the exported zip file. The export is taking around 2-3 mins.
>                Total zip file size is 14 MB. The largest file in the zip is around 7 MB which has almost 1000 relationshipAttributes in it.
>                When we try to import this, the import is running for more than 25 hours. Is this expected behaviour? Is there any way to speed up this process?
> 
>                Export command
>                curl -igk -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
>                    "itemsToExport": [
>                       { "typeName": "kafka_topic" }
>                    ],
>                    "options": {
>                        "matchType": "forType"
>                    }
>                }' "http://localhost:21000/api/atlas/admin/export" > /tmp/kafka_topic.zip
> 
> 
>                Import command
>                curl -ikg -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@kafka_topic.zip "http://localhost:21000/api/atlas/admin/import"
> 
>                Let us know if we are missing anything in the import process.
> 
> 
>                ---
>                Regards,
>                Sreeni
> 
> 
> 
> 
> 
> 
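
For reference, the export and import calls quoted above can be wrapped in a small script. This is only a sketch, not an official Atlas client: the function names and the ATLAS_URL / ATLAS_AUTH variables are made up for illustration, while the endpoints, credentials and request body are the ones from the thread. It leaves out -i, since that flag makes curl write the response headers into the output alongside the zip data, and it lets -F set the multipart Content-Type itself.

```shell
#!/bin/sh
# Hypothetical settings (not from the thread apart from the default values).
ATLAS_URL="${ATLAS_URL:-http://localhost:21000}"
ATLAS_AUTH="${ATLAS_AUTH:-admin:admin}"

# Print the export request body for one type name.
# Same JSON as in the thread, with the type name passed as $1.
export_request() {
    printf '{"itemsToExport":[{"typeName":"%s"}],"options":{"matchType":"forType"}}' "$1"
}

# Export all entities of type $1 into the zip file $2.
# -g turns off curl URL globbing, -k skips TLS certificate checks (as in the thread).
atlas_export() {
    curl -gk -X POST -u "$ATLAS_AUTH" \
        -H "Content-Type: application/json" \
        -H "Cache-Control: no-cache" \
        -d "$(export_request "$1")" \
        -o "$2" \
        "$ATLAS_URL/api/atlas/admin/export"
}

# Import the zip file $1. curl adds the multipart Content-Type for -F on its own.
atlas_import() {
    curl -gk -X POST -u "$ATLAS_AUTH" \
        -H "Cache-Control: no-cache" \
        -F "data=@$1" \
        "$ATLAS_URL/api/atlas/admin/import"
}
```

Possible usage, assuming the script has been sourced into the current shell:

    atlas_export kafka_topic /tmp/kafka_topic.zip
    time atlas_import /tmp/kafka_topic.zip

Running the import under time gives a figure to compare before and after applying a patch.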

Re: Atlas import taking huge amount of time

Posted by Ashutosh Mestry <am...@cloudera.com>.
Right now this implementation is only for Import.

~ ashutosh
Ashutosh Mestry <ma...@cloudera.com> . Software Engineer . Cloudera, Inc. .  +1-310-988 0670 <tel:%2B1-310-988%200670>
.......
No hurry, no pause. – Tim Ferriss, Life Hacker, Author
 

On 8/8/19, 12:04 PM, "Bolke de Bruin" <bd...@gmail.com> wrote:

    Will this solve / improve the issue with extremely slow Kafka consumption as well? 
    
    Cheers
    Bolke
    
    Sent from my iPhone
    
    > On 25 Jul 2019, at 09:54, Nallapati, Sreenivasulu <Sr...@intuit.com> wrote:
    > 
    > Sure Ashutosh,
    > 
    > Please let me know once it is done. 
    > 
    > 
    > 
    > ---
    > Regards,
    > Sreeni 
    > 
    > On 25/07/19, 4:52 AM, "Ashutosh Mestry" <am...@cloudera.com> wrote:
    > 
    >    This email is from an external sender.
    > 
    > 
    >    There are few dependent patches. I will try to put out a patch on version 2.0. It will take me few days to get this. Please bear with me.
    > 
    >    ~ ashutosh
    >    .......
    >    No hurry, no pause. – Tim Ferriss, Life Hacker, Author
    > 
    > 
    >    On 7/24/19, 4:31 AM, "Nallapati, Sreenivasulu" <Sr...@intuit.com> wrote:
    > 
    >        Hi Ashutosh,
    > 
    >        Thanks for your reply.
    > 
    >        Currently we are using Atlas 2.0.0 and I am not able to apply this patch. It has lot of compilation errors.
    > 
    >        Do you have a patch for 2.0.0?
    > 
    > 
    >        ---
    >        Regards,
    >        Sreeni
    > 
    >        On 23/07/19, 11:36 PM, "Ashutosh Mestry" <am...@cloudera.com> wrote:
    > 
    >            This email is from an external sender.
    > 
    > 
    >            Hi
    > 
    >            The existing import processes one entity at a time, so the time taken grows linearly with the number of entities. There is a JIRA that improves the situation. It is being tested right now.
    > 
    >            Please take a look at:
    >            JIRA: https://issues.apache.org/jira/browse/ATLAS-3320
    >            Review: https://reviews.apache.org/r/71025/ (latest patch is here)
    > 
    >            Best regards,
    > 
    >            ~ ashutosh
    >            .......
    >            No hurry, no pause. – Tim Ferriss, Life Hacker, Author
    > 
    > 
    >            On 7/23/19, 6:46 AM, "Nallapati, Sreenivasulu" <Sr...@intuit.com> wrote:
    > 
    >                Hello folks,
    > 
    >                We are trying to export the existing data and import it into a different Atlas system.
    >                We have around 10000 entities in the exported zip file. The export is taking around 2-3 mins.
    >                Total zip file size is 14 MB. The largest file in the zip is around 7 MB which has almost 1000 relationshipAttributes in it.
    >                When we try to import this, the import is running for more than 25 hours. Is this expected behaviour? Is there any way to speed up this process?
    > 
    >                Export command
    >                curl -igk -X POST -u admin:admin -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    >                    "itemsToExport": [
    >                       { "typeName": "kafka_topic" }
    >                    ],
    >                    "options": {
    >                        "matchType": "forType"
    >                    }
    >                }' "http://localhost:21000/api/atlas/admin/export" > /tmp/kafka_topic.zip
    > 
    > 
    >                Import command
    >                curl -ikg -X POST -u admin:admin -H "Content-Type: multipart/form-data" -H "Cache-Control: no-cache" -F data=@kafka_topic.zip "http://localhost:21000/api/atlas/admin/import"
    > 
    >                Let us know if we are missing anything in the import process.
    > 
    > 
    >                ---
    >                Regards,
    >                Sreeni
    > 
    > 
    > 
    > 
    > 
    > 
    
