
Posted to commits@hugegraph.apache.org by GitBox <gi...@apache.org> on 2022/09/16 09:45:30 UTC

[GitHub] [incubator-hugegraph-toolchain] haohao0103 opened a new pull request, #334: Schema cache optimize

haohao0103 opened a new pull request, #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334

   Initialize SchemaCache on creation and, after the schema is queried, put the latest schema into SchemaCache
   
   I have read the CLA Document and I hereby sign the CLA


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hugegraph.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250916642

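   > > @haohao0103 I see what you mean now. However, `HugeGraphLoader` also has schema-creation logic, `createSchema`. SparkLoader does not have this logic yet, but it needs it as well, and loading the schema cache after the schema has been created seems more reasonable. Also, if necessary, the initPartition method can simply be made public, or another more general method can be defined.
   > 
   > Got it, thanks.
   
   Do I need to submit a new PR?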



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1251804234

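   > > Remember to also check whether `flink-loader` needs the same change.
   > > If it needs to be public, feel free to change it.
   > 
   > It does need the change: context.updateSchemaCache() has to be called in the initBuilders() method.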
   
    private Map<ElementBuilder, List<String>> initBuilders() {
        LoadContext loadContext = new LoadContext(this.loadOptions);
        Map<ElementBuilder, List<String>> builders = new HashMap<>();
        for (VertexMapping vertexMapping : this.struct.vertices()) {
            builders.put(new VertexBuilder(loadContext, this.struct, vertexMapping),
                         new ArrayList<>());
        }
        for (EdgeMapping edgeMapping : this.struct.edges()) {
            builders.put(new EdgeBuilder(loadContext, this.struct, edgeMapping),
                         new ArrayList<>());
        }
        // TODO
        // loadContext.updateSchemaCache();
        return builders;
    }



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250796686

   > hi @haohao0103 ,thanks for your contribution. I think maybe it's better to updateSchemaCache after `new LoadContext` just like `HugeGraphLoader`, what do you think? @imbajin @haohao0103 https://github.com/apache/incubator-hugegraph-toolchain/blob/master/hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/spark/HugeGraphSparkLoader.java#L120
   > 
   > ```java
   >     private LoadContext initPartition(
   >             LoadOptions loadOptions, InputStruct struct) {
   >         LoadContext context = new LoadContext(loadOptions);
   >         for (VertexMapping vertexMapping : struct.vertices()) {
   >             this.builders.put(
   >                     new VertexBuilder(context, struct, vertexMapping),
   >                     new ArrayList<>());
   >         }
   >         for (EdgeMapping edgeMapping : struct.edges()) {
   >             this.builders.put(new EdgeBuilder(context, struct, edgeMapping),
   >                               new ArrayList<>());
   >         }
   >         context.updateSchemaCache();
   >         return context;
   >     }
   > ```
   
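   I think putting it here works; that is where we fixed it the first time. But initPartition is private to the HugeGraphSparkLoader class, and it feels a bit off that a LoadContext object can only be used correctly once the initPartition initialization method has been called after it is created. Just my humble opinion.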



[GitHub] [incubator-hugegraph-toolchain] imbajin commented on a diff in pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

imbajin commented on code in PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#discussion_r973844260


##########
hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/builder/SchemaCache.java:
##########
@@ -46,6 +46,7 @@ public SchemaCache(HugeClient client) {
         this.propertyKeys = new HashMap<>();
         this.vertexLabels = new HashMap<>();
         this.edgeLabels = new HashMap<>();
+        updateAll();

Review Comment:
   Why do we need to clear the cache when we first create it?




[GitHub] [incubator-hugegraph-toolchain] imbajin commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

imbajin commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250538035

   Two small questions:
   1. Why does it only affect spark-loader now?
   2. Should we flush the cache periodically?



[GitHub] [incubator-hugegraph-toolchain] simon824 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

simon824 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250854391

   > > hi @haohao0103 ,thanks for your contribution. I think maybe it's better to updateSchemaCache after `new LoadContext` just like `HugeGraphLoader`, what do you think? @imbajin @haohao0103 https://github.com/apache/incubator-hugegraph-toolchain/blob/master/hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/spark/HugeGraphSparkLoader.java#L120
   > > ```java
   > >     private LoadContext initPartition(
   > >             LoadOptions loadOptions, InputStruct struct) {
   > >         LoadContext context = new LoadContext(loadOptions);
   > >         for (VertexMapping vertexMapping : struct.vertices()) {
   > >             this.builders.put(
   > >                     new VertexBuilder(context, struct, vertexMapping),
   > >                     new ArrayList<>());
   > >         }
   > >         for (EdgeMapping edgeMapping : struct.edges()) {
   > >             this.builders.put(new EdgeBuilder(context, struct, edgeMapping),
   > >                               new ArrayList<>());
   > >         }
   > >         context.updateSchemaCache();
   > >         return context;
   > >     }
   > > ```
   > 
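   > I think putting it here works; that is where we fixed it the first time. But initPartition is private to the HugeGraphSparkLoader class, and it feels a bit off that initPartition must be called to create and initialize a LoadContext object correctly. Just my humble opinion.
   
   I don't follow. Which part do you think is inappropriate? Shouldn't a LoadContext only need to be created when a partition is initialized?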



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1251803681

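   > Remember to also check whether `flink-loader` needs the same change.
   > 
   > If it needs to be public, feel free to change it.
   
   It does need the change: context.updateSchemaCache() has to be called in the initBuilders() method.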



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1251865239

   > Remember to also check whether `flink-loader` needs the same change.
   > 
   > If it needs to be public, feel free to change it.
   
   Can I still push more code?



[GitHub] [incubator-hugegraph-toolchain] simon824 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

simon824 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1251720614

   @haohao0103 You can push new commits directly to this PR to replace the earlier code.



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1249164761

   fix #333
   Fix bug: initialize SchemaCache on creation and, after the schema is queried, put the latest schema into SchemaCache



[GitHub] [incubator-hugegraph-toolchain] imbajin commented on a diff in pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

imbajin commented on code in PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#discussion_r973844260


##########
hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/builder/SchemaCache.java:
##########
@@ -46,6 +46,7 @@ public SchemaCache(HugeClient client) {
         this.propertyKeys = new HashMap<>();
         this.vertexLabels = new HashMap<>();
         this.edgeLabels = new HashMap<>();
+        updateAll();

Review Comment:
   Why do we need to clear the cache when we first create it? (Can you give an example?)




[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1251878788

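   > > Remember to also check whether `flink-loader` needs the same change.
   > > If it needs to be public, feel free to change it.
   > 
   > Can I still push more code?
   
   @imbajin @simon824 The flink-loader code has been fixed as well. Please review. Thanks.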



[GitHub] [incubator-hugegraph-toolchain] simon824 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

simon824 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250914177

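   @haohao0103 I see what you mean now. However, `HugeGraphLoader` also has schema-creation logic, `createSchema`. SparkLoader does not have this logic yet, but it needs it as well, and loading the schema cache after the schema has been created seems more reasonable. Also, if necessary, the initPartition method can simply be made public, or another more general method can be defined.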



[GitHub] [incubator-hugegraph-toolchain] codecov[bot] commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

codecov[bot] commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250538550

   # [Codecov](https://codecov.io/gh/apache/incubator-hugegraph-toolchain/pull/334?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#334](https://codecov.io/gh/apache/incubator-hugegraph-toolchain/pull/334?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (22a8c88) into [master](https://codecov.io/gh/apache/incubator-hugegraph-toolchain/commit/9d34f4aaa9debf51f1f8b433d579d6fe8bd9b110?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (9d34f4a) will **decrease** coverage by `0.06%`.
   > The diff coverage is `50.00%`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master     #334      +/-   ##
   ============================================
   - Coverage     67.49%   67.42%   -0.07%     
   - Complexity      877      878       +1     
   ============================================
     Files            86       86              
     Lines          4024     4031       +7     
     Branches        475      477       +2     
   ============================================
   + Hits           2716     2718       +2     
   - Misses         1104     1108       +4     
   - Partials        204      205       +1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/incubator-hugegraph-toolchain/pull/334?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...om/baidu/hugegraph/loader/builder/SchemaCache.java](https://codecov.io/gh/apache/incubator-hugegraph-toolchain/pull/334/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVnZWdyYXBoLWxvYWRlci9zcmMvbWFpbi9qYXZhL2NvbS9iYWlkdS9odWdlZ3JhcGgvbG9hZGVyL2J1aWxkZXIvU2NoZW1hQ2FjaGUuamF2YQ==) | `46.77% <25.00%> (-1.51%)` | :arrow_down: |
   | [...gegraph/loader/reader/file/OrcFileLineFetcher.java](https://codecov.io/gh/apache/incubator-hugegraph-toolchain/pull/334/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-aHVnZWdyYXBoLWxvYWRlci9zcmMvbWFpbi9qYXZhL2NvbS9iYWlkdS9odWdlZ3JhcGgvbG9hZGVyL3JlYWRlci9maWxlL09yY0ZpbGVMaW5lRmV0Y2hlci5qYXZh) | `87.71% <66.66%> (-3.03%)` | :arrow_down: |
   
   



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1251762816

   > @haohao0103 You can push new commits directly to this PR to replace the earlier code.
   
   The code has been pushed.



[GitHub] [incubator-hugegraph-toolchain] imbajin merged pull request #334: fix: schema not cached in spark/flink loader

Posted by GitBox <gi...@apache.org>.

imbajin merged PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250541098

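   > Two small questions:
   > 
   > 1. Why does it only affect `spark-loader` now?
   > 2. Should we flush the cache periodically?
   
   1. The load() method in HugeGraphLoader calls this.context.updateSchemaCache() to refresh the metadata.
   2. It could be changed to initialize the cache on first creation, or to update it after the first query; I think keeping just one of the two is enough.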
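
The second option mentioned above (update the cache after the first query) could look roughly like the sketch below. It is only an illustration: `LazyLabelCache`, `remoteLookup`, and `remoteCalls` are hypothetical names, and the lookup function stands in for whatever call actually fetches a label from the server. It is not the real `SchemaCache` code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a schema cache; not the real SchemaCache class.
class LazyLabelCache {
    private final Map<String, String> labels = new HashMap<>();
    // Assumed remote lookup, e.g. a query to the graph server by label name
    private final Function<String, String> remoteLookup;
    private int remoteCalls = 0;

    LazyLabelCache(Function<String, String> remoteLookup) {
        this.remoteLookup = remoteLookup;
    }

    String get(String name) {
        String label = this.labels.get(name);
        if (label == null) {
            // Cache miss: query once, then keep the latest result
            this.remoteCalls++;
            label = this.remoteLookup.apply(name);
            if (label == null) {
                throw new IllegalArgumentException("Unknown label: " + name);
            }
            this.labels.put(name, label);
        }
        return label;
    }

    int remoteCalls() {
        return this.remoteCalls;
    }
}
```

With this shape, repeated lookups of the same label reach the server only once, which is why keeping either eager initialization or fill-on-query is likely sufficient.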



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on a diff in pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on code in PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#discussion_r973847060


##########
hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/builder/SchemaCache.java:
##########
@@ -46,6 +46,7 @@ public SchemaCache(HugeClient client) {
         this.propertyKeys = new HashMap<>();
         this.vertexLabels = new HashMap<>();
         this.edgeLabels = new HashMap<>();
+        updateAll();

Review Comment:
   The HugeGraphSparkLoader does not get the real metadata information when SchemaCache is first created
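
As a rough illustration of what the added `updateAll()` line is meant to achieve, the sketch below fills a cache from the server as soon as it is constructed. `EagerLabelCache` and the `server` supplier are hypothetical stand-ins, and the clear-then-copy behaviour of `updateAll()` is an assumption made for the example, not a description of the real method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in; the real SchemaCache is built from a HugeClient.
class EagerLabelCache {
    private final Map<String, String> vertexLabels;
    private final Supplier<Map<String, String>> server;

    EagerLabelCache(Supplier<Map<String, String>> server) {
        this.server = server;
        this.vertexLabels = new HashMap<>();
        // Counterpart of the added updateAll() line: load once on creation
        this.updateAll();
    }

    // Assumed behaviour: drop stale entries, then copy the server's view
    void updateAll() {
        this.vertexLabels.clear();
        this.vertexLabels.putAll(this.server.get());
    }

    boolean contains(String name) {
        return this.vertexLabels.containsKey(name);
    }
}
```

On a freshly created cache the clear step has nothing to remove, so the call only populates the maps; without it, a caller that never triggers a refresh would see an empty cache.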




[GitHub] [incubator-hugegraph-toolchain] simon824 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

simon824 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250683001

   hi @haohao0103 ,thanks for your contribution.
   I think maybe it's better to updateSchemaCache after `new LoadContext` just like `HugeGraphLoader`, what do you think? @imbajin @haohao0103 
   https://github.com/apache/incubator-hugegraph-toolchain/blob/master/hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/spark/HugeGraphSparkLoader.java#L120
   
   ``` java
       private LoadContext initPartition(
               LoadOptions loadOptions, InputStruct struct) {
           LoadContext context = new LoadContext(loadOptions);
           for (VertexMapping vertexMapping : struct.vertices()) {
               this.builders.put(
                       new VertexBuilder(context, struct, vertexMapping),
                       new ArrayList<>());
           }
           for (EdgeMapping edgeMapping : struct.edges()) {
               this.builders.put(new EdgeBuilder(context, struct, edgeMapping),
                                 new ArrayList<>());
           }
           context.updateSchemaCache();
           return context;
       }
   ```
   
   
   



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250915980

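   > @haohao0103 I see what you mean now. However, `HugeGraphLoader` also has schema-creation logic, `createSchema`. SparkLoader does not have this logic yet, but it needs it as well, and loading the schema cache after the schema has been created seems more reasonable. Also, if necessary, the initPartition method can simply be made public, or another more general method can be defined.
   
   Got it, thanks.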



[GitHub] [incubator-hugegraph-toolchain] haohao0103 commented on pull request #334: Schema cache optimize

Posted by GitBox <gi...@apache.org>.

haohao0103 commented on PR #334:
URL: https://github.com/apache/incubator-hugegraph-toolchain/pull/334#issuecomment-1250892881

   > > > hi @haohao0103 ,thanks for your contribution. I think maybe it's better to updateSchemaCache after `new LoadContext` just like `HugeGraphLoader`, what do you think? @imbajin @haohao0103 https://github.com/apache/incubator-hugegraph-toolchain/blob/master/hugegraph-loader/src/main/java/com/baidu/hugegraph/loader/spark/HugeGraphSparkLoader.java#L120
   > > > ```java
   > > >     private LoadContext initPartition(
   > > >             LoadOptions loadOptions, InputStruct struct) {
   > > >         LoadContext context = new LoadContext(loadOptions);
   > > >         for (VertexMapping vertexMapping : struct.vertices()) {
   > > >             this.builders.put(
   > > >                     new VertexBuilder(context, struct, vertexMapping),
   > > >                     new ArrayList<>());
   > > >         }
   > > >         for (EdgeMapping edgeMapping : struct.edges()) {
   > > >             this.builders.put(new EdgeBuilder(context, struct, edgeMapping),
   > > >                               new ArrayList<>());
   > > >         }
   > > >         context.updateSchemaCache();
   > > >         return context;
   > > >     }
   > > > ```
   > > 
   > > 
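   > > I think putting it here works; that is where we fixed it the first time. But initPartition is private to the HugeGraphSparkLoader class, and it feels a bit off that initPartition must be called to create and initialize a LoadContext object correctly. Just my humble opinion.
   > 
   > I don't follow. Which part do you think is inappropriate? Shouldn't a LoadContext only need to be created when a partition is initialized?
   
   What I mean is: wouldn't this restrict LoadContext to being correctly initialized only inside the HugeGraphSparkLoader class, which limits extensibility? When we develop the Bulkload code, we need to add new classes that use a LoadContext object, and those classes cannot call initPartition. When a LoadContext is built in another class, updateSchemaCache() has to be called explicitly again to make sure the LoadContext can be used correctly. That is just my understanding; I am not an expert in code design, so please take it only as a suggestion.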
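
One way to picture the "more general method" discussed in this thread is a factory that any loader can call, which creates the schema first and refreshes the cache afterwards. The sketch below is only an assumption-laden illustration: `SketchLoadContext`, `createReady`, and `knows` are hypothetical names, and the `server` supplier stands in for the real client. It is not the code that was merged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical stand-in for LoadContext; names below are illustrative only.
class SketchLoadContext {
    private final Map<String, String> schemaCache = new HashMap<>();
    private final Supplier<Map<String, String>> server;

    private SketchLoadContext(Supplier<Map<String, String>> server) {
        this.server = server;
    }

    // Any loader can call this, so no caller depends on a private
    // initPartition-style method to obtain a usable context.
    static SketchLoadContext createReady(Runnable createSchema,
                                         Supplier<Map<String, String>> server) {
        SketchLoadContext context = new SketchLoadContext(server);
        createSchema.run();            // create the schema first
        context.updateSchemaCache();   // then load it into the cache
        return context;
    }

    void updateSchemaCache() {
        this.schemaCache.clear();
        this.schemaCache.putAll(this.server.get());
    }

    boolean knows(String label) {
        return this.schemaCache.containsKey(label);
    }
}
```

Because the constructor is private, callers cannot obtain a context whose cache was never refreshed, which addresses the extensibility concern without exposing loader internals.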

