Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/05/24 18:34:59 UTC

[GitHub] [iceberg] jasonhughes248 commented on pull request #2622: Docs: add catalog and metadata files to metadata structure diagram

jasonhughes248 commented on pull request #2622:
URL: https://github.com/apache/iceberg/pull/2622#issuecomment-847247618


   @rdblue yeah, I'd be happy to make these changes in the original drawing if you want to share that with me. if it's Lucidchart, which it looks like, my username is jason@dremio.com
   
   I think the diagram in that linked deck has some good additional aspects:
   1. info on the content/purpose for each file type
   2. visually distinguishing catalog vs metadata vs data tiers
   3. the view of how the high-level physical representation of the table changes over time
   
   but I think the Lucidchart diagram looks more polished/official, which is probably better for the docs site, so I think it'd be good to try to merge the two: use the current diagram as the base, and add the aspects of the one you linked (plus the change that shows both snapshots in the metadata file). the only thing I worry about is the size of the boxes, and of the diagram overall, once the file content/purpose is in them, since they'll all be visible at the same time in this one-image usage vs. the multi-slide build in the deck. but let's see. I can create a copy of the diagram and try it out.
   
   on point #3, I think a diagram like this can be helpful in two related but different situations: an introductory overview for new folks that gives a high-level understanding of the layout and types of files, and a deeper view of how the layout and underlying structure change as the dataset changes.
   
   to me, the point-in-time diagram works really well in its current spot in the docs, since it's a good introduction for new users on that page. then there could be another section covering the changes-over-time content, for users who want a more detailed view of how the layout changes as changes are made to the table.
   
   I actually did a presentation internally where a subset has that structure ("layout overview and file types, then how it changes over time") and it worked well. I'm also doing it publicly via webinar on Thursday and making a written version of it shortly thereafter, so that could help here too. the changes-over-time view in that content currently shows the files and their location in the filesystem, with color-coding of file types, but I think a diagram view like the one you have here is a better way to tell that story (or maybe both). I can try to address point #3 with multiple versions of this new merged diagram in that content and share them here to see if it works well.
   
   what do you think? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org