Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/12/07 20:13:50 UTC

[GitHub] [spark] baganokodo2022 commented on pull request #38922: [SPARK-41396][SQL][PROTOBUF] OneOf field support and recursion checks

baganokodo2022 commented on PR #38922:
URL: https://github.com/apache/spark/pull/38922#issuecomment-1341534552

   Hi @SandishKumarHN,
   
   For the `recursionDepth` option, could we consider naming it `CircularReferenceTolerance` or `CircularReferenceDepth` for clarity?
   For instance, -1 (the default) would error out on any circular reference, 0 would drop any circular reference field, 1 would allow the same field to be entered twice, and so on.
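
To make the proposed values concrete, here is a minimal sketch of one possible reading of them. It is not Spark's implementation: the toy schema, the `expand` function, and the choice to count entries per fully-qualified field name along the current path are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy schema: message type -> [(field name, field type)].
# A field type that is not a key of the schema is treated as a scalar.
Schema = Dict[str, List[Tuple[str, str]]]

PERSON_SCHEMA: Schema = {
    "Person": [("name", "string"), ("friend", "Person")],
}


def expand(schema: Schema, msg_type: str, tolerance: int,
           seen: Optional[Dict[str, int]] = None) -> dict:
    """Expand msg_type into a nested dict, bounding circular references."""
    seen = seen or {}
    out: dict = {}
    for fname, ftype in schema[msg_type]:
        if ftype not in schema:
            out[fname] = ftype  # scalar field, always kept
            continue
        key = f"{msg_type}.{fname}"  # fully-qualified field name (assumed form)
        entries = seen.get(key, 0)   # times this field was entered on this path
        if entries > 0:              # re-entering the field: circular reference
            if tolerance < 0:
                raise ValueError(f"circular reference at {key}")
            if entries > tolerance:
                continue             # tolerance exhausted: drop the field
        child_seen = dict(seen)      # copy so sibling branches stay independent
        child_seen[key] = entries + 1
        out[fname] = expand(schema, ftype, tolerance, child_seen)
    return out
```

Under this reading, `expand(PERSON_SCHEMA, "Person", 0)` keeps `friend` once and drops its re-entry, a tolerance of 1 lets `friend` be entered twice, and -1 raises as soon as `friend` is reached a second time.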
    
   Besides, could we also support a `CircularReferenceType` option with an enum value of `[FIELD_NAME, FIELD_TYPE]`? The reason is that navigation can go very deep before the same **fully-qualified** `FIELD_NAME` is encountered again, while `FIELD_TYPE` stops recursive navigation much sooner. We could make `FIELD_NAME` the default. In my test cases, with `FIELD_TYPE` a circular reference can repeat 3 times before the executor hits OOM, while with `FIELD_NAME` the executor hits OOM when `CircularReferenceTolerance` is set to 1.
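
The difference between the two modes can be sketched on a toy schema in which one message type is reachable through two different fields. This is an illustration under assumptions, not the PR's code: `NODE_SCHEMA` and `count_structs` are hypothetical, and treating the root type as already visited in `FIELD_TYPE` mode is one possible interpretation.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy schema: message type -> [(field name, field type)].
Schema = Dict[str, List[Tuple[str, str]]]

NODE_SCHEMA: Schema = {
    "Node": [("id", "int32"), ("left", "Node"), ("right", "Node")],
}


def count_structs(schema: Schema, msg_type: str, tolerance: int, mode: str,
                  seen: Optional[Dict[str, int]] = None) -> int:
    """Count the message structs produced when expanding msg_type."""
    if seen is None:
        # Assumption: in FIELD_TYPE mode the root type counts as visited once.
        seen = {msg_type: 1} if mode == "FIELD_TYPE" else {}
    total = 1  # the struct for msg_type itself
    for fname, ftype in schema[msg_type]:
        if ftype not in schema:
            continue  # scalar fields add no nested struct
        # FIELD_TYPE keys on the message type, FIELD_NAME on the qualified field.
        key = ftype if mode == "FIELD_TYPE" else f"{msg_type}.{fname}"
        entries = seen.get(key, 0)
        if entries > 0:  # key already on the current path: circular reference
            if tolerance < 0:
                raise ValueError(f"circular reference at {key}")
            if entries > tolerance:
                continue  # tolerance exhausted: drop the field
        child_seen = dict(seen)
        child_seen[key] = entries + 1
        total += count_structs(schema, ftype, tolerance, mode, child_seen)
    return total
```

With the same tolerance, `count_structs(NODE_SCHEMA, "Node", 1, "FIELD_NAME")` yields a noticeably larger count than the `"FIELD_TYPE"` call, because each distinct field name gets its own budget while all fields of type `Node` share one.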
   
   Please let me know your thoughts.
   
   cc @rangadi 
   
   Thank you
   
   Xinyu Liu
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org