Posted to dev@atlas.apache.org by Ashutosh Mestry <am...@hortonworks.com> on 2018/03/20 23:14:01 UTC
Review Request 66184: Migration Utility: Branch 0.8: Performance Improvement
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66184/
-----------------------------------------------------------
Review request for atlas, Madhan Neethiraj, Ruchi Solani, and Sarath Subramanian.
Bugs: ATLAS-2461
https://issues.apache.org/jira/browse/ATLAS-2461
Repository: atlas
Description
-------
**Background**
The migration utility committed earlier has a couple of shortcomings:
- Relies on Export service.
- Needs _export-options.json_ to be specified.
- Exporting everything means meticulously updating the options file. A specification is likely to be missed, which leads to less data being migrated.
- Suffers from performance problems for large data sets.
**Approach**
The new approach uses _Titan's_ _GraphSON_ writer. This is configured to export all data in _EXTENDED_ format.
The _EXTENDED_ format separates _vertices_ and _edges_. This opens other interesting avenues for import.
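For readers unfamiliar with the format, the vertex/edge separation can be pictured with a small, self-contained Java sketch. It does not call Titan or its GraphSON writer; the class `GraphSonShapeSketch`, its methods, and the exact field names (`mode`, `_id`, `_type`, `_outV`, `_inV`, `_label`) are assumptions modeled loosely on Blueprints-style GraphSON, and the real output may differ in detail.

```java
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Illustrative only: mimics the general shape of an EXTENDED-mode GraphSON
// document. All names here are hypothetical and not taken from Exporter.java.
class GraphSonShapeSketch {

    // Wrap a string in double quotes, escaping backslashes and quotes.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // In this sketch, every property value carries an explicit type tag.
    static String typed(Object value) {
        String type = (value instanceof Integer) ? "integer"
                    : (value instanceof Long) ? "long"
                    : (value instanceof Boolean) ? "boolean"
                    : "string";
        String rendered = "string".equals(type)
                ? quote(String.valueOf(value))
                : String.valueOf(value);
        return "{\"type\":" + quote(type) + ",\"value\":" + rendered + "}";
    }

    // A vertex: its typed properties followed by reserved bookkeeping fields.
    static String vertex(String id, Map<String, Object> properties) {
        StringJoiner fields = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> p : properties.entrySet()) {
            fields.add(quote(p.getKey()) + ":" + typed(p.getValue()));
        }
        fields.add("\"_id\":" + quote(id));
        fields.add("\"_type\":\"vertex\"");
        return fields.toString();
    }

    // An edge refers to its endpoints by vertex id only.
    static String edge(String id, String label, String outV, String inV) {
        return "{\"_id\":" + quote(id) + ",\"_type\":\"edge\""
             + ",\"_outV\":" + quote(outV)
             + ",\"_inV\":" + quote(inV)
             + ",\"_label\":" + quote(label) + "}";
    }

    // Vertices and edges are kept in two separate top-level arrays.
    static String graph(List<String> vertices, List<String> edges) {
        return "{\"mode\":\"EXTENDED\",\"vertices\":[" + String.join(",", vertices)
             + "],\"edges\":[" + String.join(",", edges) + "]}";
    }
}
```

Because the two arrays are independent in this layout, an importer could, for instance, load all vertices first and then add the edges that connect them.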
**Implementation**
- Modified _Exporter_ to use _AtlasTypeRegistry_ and _GraphSONWriter_.
- Produced files:
- _atlas-typedef.json_: Contains type definitions of all types.
- _atlas-migration-data.json_: Contains data from the database.
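As a rough sketch of the two-file layout listed above (not the code in _Exporter.java_), the following assumes a hypothetical `JsonSource` callback that streams JSON to an output stream. Only the two file names come from this review; the class name, method names, and callback are illustrative.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: writes the two export files into one output directory.
class ExportLayoutSketch {
    static final String TYPEDEF_FILE = "atlas-typedef.json";
    static final String DATA_FILE = "atlas-migration-data.json";

    // Stand-in for whatever produces JSON (type registry, graph writer, ...).
    interface JsonSource {
        void writeTo(OutputStream out) throws IOException;
    }

    static void export(Path outputDir, JsonSource typeDefs, JsonSource graphData)
            throws IOException {
        Files.createDirectories(outputDir);
        write(outputDir.resolve(TYPEDEF_FILE), typeDefs);
        write(outputDir.resolve(DATA_FILE), graphData);
    }

    // Stream straight to disk so the full payload need not be held in memory.
    private static void write(Path file, JsonSource source) throws IOException {
        try (OutputStream out = new BufferedOutputStream(Files.newOutputStream(file))) {
            source.writeTo(out);
        }
    }
}
```

Passing the producers as callbacks keeps the file layout separate from how the JSON is generated, so a streaming writer can be plugged in without changing this part.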
Diffs
-----
tools/atlas-migration-exporter/pom.xml 5c6c61ee
tools/atlas-migration-exporter/src/main/java/org/apache/atlas/migration/Exporter.java a9873df0
Diff: https://reviews.apache.org/r/66184/diff/1/
Testing
-------
**Functional tests**
Export from repositories with:
- Custom types.
- Complex lineages.
- Hive entities created via beeline.
- Imported data.
**Gremlin Shell**
- Used _Gremlin_ shell to perform export operation.
Thanks,
Ashutosh Mestry
Re: Review Request 66184: Migration Utility: Branch 0.8: Performance Improvement
Posted by Madhan Neethiraj <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/66184/#review199992
-----------------------------------------------------------
Ship it!
- Madhan Neethiraj
On March 26, 2018, 7:38 p.m., Ashutosh Mestry wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/66184/
> -----------------------------------------------------------
>
> (Updated March 26, 2018, 7:38 p.m.)
>
>
> Review request for atlas, Madhan Neethiraj, Ruchi Solani, and Sarath Subramanian.
>
>
> Bugs: ATLAS-2461
> https://issues.apache.org/jira/browse/ATLAS-2461
>
>
> Repository: atlas
>
>
> Description
> -------
>
> **Background**
> The migration utility committed earlier has a couple of shortcomings:
> - Relies on Export service.
> - Needs _export-options.json_ to be specified.
> - Exporting everything means meticulously updating the options file. A specification is likely to be missed, which leads to less data being migrated.
> - Suffers from performance problems for large data sets.
>
> **Approach**
> The new approach uses _Titan's_ _GraphSON_ writer. This is configured to export all data in _EXTENDED_ format.
>
> The _EXTENDED_ format separates _vertices_ and _edges_. This opens other interesting avenues for import.
>
> **Implementation**
> - Modified _Exporter_ to use _AtlasTypeRegistry_ and _GraphSONWriter_.
> - Produced files:
> - _atlas-typedef.json_: Contains type definitions of all types.
> - _atlas-migration-data.json_: Contains data from the database.
>
>
> Diffs
> -----
>
> distro/src/main/assemblies/migration-exporter.xml 8f751ff9
> tools/atlas-migration-exporter/pom.xml 5c6c61ee
> tools/atlas-migration-exporter/src/main/java/org/apache/atlas/migration/Exporter.java a9873df0
> tools/atlas-migration-exporter/src/main/resources/README 2f2bf3e1
> tools/atlas-migration-exporter/src/main/resources/atlas-log4j.xml PRE-CREATION
> tools/atlas-migration-exporter/src/main/resources/atlas_migration.py 199cde28
> tools/atlas-migration-exporter/src/main/resources/migration-export-request.json 64002aff
>
>
> Diff: https://reviews.apache.org/r/66184/diff/3/
>
>
> Testing
> -------
>
> **Functional tests**
> Export from repositories with:
> - Custom types.
> - Complex lineages.
> - Hive entities created via beeline.
> - Imported data.
>
> **Gremlin Shell**
> - Used _Gremlin_ shell to perform export operation.
>
>
> Thanks,
>
> Ashutosh Mestry
>
>
Re: Review Request 66184: Migration Utility: Branch 0.8: Performance Improvement
Posted by Ashutosh Mestry <am...@hortonworks.com>.
(Updated March 26, 2018, 7:38 p.m.)
Review request for atlas, Madhan Neethiraj, Ruchi Solani, and Sarath Subramanian.
Changes
-------
Updates include:
- Separate log4j xml for migration log configuration.
- Updated atlas_migration.py to display the log on screen.
- Minor changes to README.
Bugs: ATLAS-2461
https://issues.apache.org/jira/browse/ATLAS-2461
Repository: atlas
Description
-------
**Background**
The migration utility committed earlier has a couple of shortcomings:
- Relies on Export service.
- Needs _export-options.json_ to be specified.
- Exporting everything means meticulously updating the options file. A specification is likely to be missed, which leads to less data being migrated.
- Suffers from performance problems for large data sets.
**Approach**
The new approach uses _Titan's_ _GraphSON_ writer. This is configured to export all data in _EXTENDED_ format.
The _EXTENDED_ format separates _vertices_ and _edges_. This opens other interesting avenues for import.
**Implementation**
- Modified _Exporter_ to use _AtlasTypeRegistry_ and _GraphSONWriter_.
- Produced files:
- _atlas-typedef.json_: Contains type definitions of all types.
- _atlas-migration-data.json_: Contains data from the database.
Diffs (updated)
-----
distro/src/main/assemblies/migration-exporter.xml 8f751ff9
tools/atlas-migration-exporter/pom.xml 5c6c61ee
tools/atlas-migration-exporter/src/main/java/org/apache/atlas/migration/Exporter.java a9873df0
tools/atlas-migration-exporter/src/main/resources/README 2f2bf3e1
tools/atlas-migration-exporter/src/main/resources/atlas-log4j.xml PRE-CREATION
tools/atlas-migration-exporter/src/main/resources/atlas_migration.py 199cde28
tools/atlas-migration-exporter/src/main/resources/migration-export-request.json 64002aff
Diff: https://reviews.apache.org/r/66184/diff/3/
Changes: https://reviews.apache.org/r/66184/diff/2-3/
Testing
-------
**Functional tests**
Export from repositories with:
- Custom types.
- Complex lineages.
- Hive entities created via beeline.
- Imported data.
**Gremlin Shell**
- Used _Gremlin_ shell to perform export operation.
Thanks,
Ashutosh Mestry
Re: Review Request 66184: Migration Utility: Branch 0.8: Performance Improvement
Posted by Ashutosh Mestry <am...@hortonworks.com>.
(Updated March 20, 2018, 11:15 p.m.)
Review request for atlas, Madhan Neethiraj, Ruchi Solani, and Sarath Subramanian.
Changes
-------
Updates include:
- Updated README file.
- Removed _migration-export-options.json_ file.
Bugs: ATLAS-2461
https://issues.apache.org/jira/browse/ATLAS-2461
Repository: atlas
Description
-------
**Background**
The migration utility committed earlier has a couple of shortcomings:
- Relies on Export service.
- Needs _export-options.json_ to be specified.
- Exporting everything means meticulously updating the options file. A specification is likely to be missed, which leads to less data being migrated.
- Suffers from performance problems for large data sets.
**Approach**
The new approach uses _Titan's_ _GraphSON_ writer. This is configured to export all data in _EXTENDED_ format.
The _EXTENDED_ format separates _vertices_ and _edges_. This opens other interesting avenues for import.
**Implementation**
- Modified _Exporter_ to use _AtlasTypeRegistry_ and _GraphSONWriter_.
- Produced files:
- _atlas-typedef.json_: Contains type definitions of all types.
- _atlas-migration-data.json_: Contains data from the database.
Diffs (updated)
-----
tools/atlas-migration-exporter/pom.xml 5c6c61ee
tools/atlas-migration-exporter/src/main/java/org/apache/atlas/migration/Exporter.java a9873df0
tools/atlas-migration-exporter/src/main/resources/README 2f2bf3e1
tools/atlas-migration-exporter/src/main/resources/migration-export-request.json 64002aff
Diff: https://reviews.apache.org/r/66184/diff/2/
Changes: https://reviews.apache.org/r/66184/diff/1-2/
Testing
-------
**Functional tests**
Export from repositories with:
- Custom types.
- Complex lineages.
- Hive entities created via beeline.
- Imported data.
**Gremlin Shell**
- Used _Gremlin_ shell to perform export operation.
Thanks,
Ashutosh Mestry