Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/03/24 15:56:23 UTC

[GitHub] [beam] mosche edited a comment on pull request #17172: [BEAM-14166] Performance / Cache improvements to RowWithGetter

mosche edited a comment on pull request #17172:
URL: https://github.com/apache/beam/pull/17172#issuecomment-1077772316


   @reuvenlax that's certainly a valid question, and I'm happy to discuss what's worth caching.
   
   The key point, though, is that a cache is already in place for array, iterable and map types, and the way these types are processed is currently not ideal. For instance, for an array field, `getValue` performs the following steps:
   
   1) Invoke the getter; the generated bytecode wraps the array in a list
   2) Lazily transform the list elements (`Lists.transform`) and cache the result
   
   The real problem shows up on the second access: step 1) is repeated, but step 2) returns the cached result, discarding all the work done in step 1).
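   
   To make the two steps concrete, here is a minimal, self-contained sketch. It is not the actual `RowWithGetter` code: `Row`, `invokeGetter`, `lazyTransform`, `CONVERT` and both `getValue...` methods are hypothetical stand-ins, and the small `AbstractList` view only imitates what `Lists.transform` does. The cache-first variant is just one possible way to avoid the repeated work, not necessarily what this PR does.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class GetterCacheSketch {
  /** Hypothetical stand-in for a row whose array field is read through a generated getter. */
  public static class Row {
    // Hypothetical element conversion applied lazily to each list element.
    private static final Function<String, String> CONVERT = s -> s.toUpperCase();

    private final String[] field = {"a", "b"};
    private final Map<Integer, List<String>> cache = new HashMap<>();
    public int getterCalls = 0;

    // Step 1: the getter wraps the array in a list on every invocation.
    private List<String> invokeGetter() {
      getterCalls++;
      return Arrays.asList(field);
    }

    // Step 2: a lazy element-wise view, similar in spirit to Guava's Lists.transform.
    private static <A, B> List<B> lazyTransform(List<A> src, Function<A, B> fn) {
      return new AbstractList<B>() {
        @Override
        public B get(int i) {
          return fn.apply(src.get(i));
        }

        @Override
        public int size() {
          return src.size();
        }
      };
    }

    // Behaviour described above: the getter always runs, then the cache is consulted.
    public List<String> getValueGetterFirst(int idx) {
      List<String> wrapped = invokeGetter(); // wasted work on every cache hit
      return cache.computeIfAbsent(idx, k -> lazyTransform(wrapped, CONVERT));
    }

    // One possible alternative: consult the cache before invoking the getter.
    public List<String> getValueCacheFirst(int idx) {
      List<String> cached = cache.get(idx);
      if (cached == null) {
        cached = lazyTransform(invokeGetter(), CONVERT);
        cache.put(idx, cached);
      }
      return cached;
    }
  }

  public static void main(String[] args) {
    Row a = new Row();
    a.getValueGetterFirst(0);
    a.getValueGetterFirst(0);
    System.out.println("getter-first calls: " + a.getterCalls);

    Row b = new Row();
    b.getValueCacheFirst(0);
    b.getValueCacheFirst(0);
    System.out.println("cache-first calls: " + b.getterCalls);
  }
}
```

   In this sketch, two reads of the same field invoke the getter twice on the getter-first path and once on the cache-first path, while both paths return the same transformed list.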
   
   This PR mostly addresses some of the inconsistencies in how the existing cache is used, and reduces the cache's overhead by choosing better-suited data structures to store cached values.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org