Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/08/05 12:39:10 UTC

[GitHub] [arrow] nealrichardson commented on issue #13803: Get the table schema as json?

nealrichardson commented on issue #13803:
URL: https://github.com/apache/arrow/issues/13803#issuecomment-1206399291

   The trouble is: what do you expect to do with this JSON? Round-trip it and re-create the Schema from the JSON? That gets tricky with things like nested types, extension types, metadata, etc. At that point it is really an Arrow specification question, not a JSON serialization problem. So if you want to pursue it, you should raise it on the dev mailing list. I seem to recall this idea coming up in the past and not getting much support, but maybe I misremember, or maybe times have changed.
   
   Naively, in R you can get a JSON string version of the Schema with some manipulation. Here's an ugly one-liner (base R plus `jsonlite`), just to illustrate:
   
   ```
   > s
   Schema
   name: string
   height: int32
   mass: double
   hair_color: string
   skin_color: string
   eye_color: string
   birth_year: double
   sex: string
   gender: string
   homeworld: string
   species: string
   films: list<item: string>
   vehicles: list<item: string>
   starships: list<item: string>
   
   > cat(jsonlite::toJSON(setNames(lapply(s, function (x) x$type$ToString()), s$names), pretty = TRUE, auto_unbox = TRUE))
   {
     "name": "string",
     "height": "int32",
     "mass": "double",
     "hair_color": "string",
     "skin_color": "string",
     "eye_color": "string",
     "birth_year": "double",
     "sex": "string",
     "gender": "string",
     "homeworld": "string",
     "species": "string",
     "films": "list<item: string>",
     "vehicles": "list<item: string>",
     "starships": "list<item: string>"
   }
   ```
   
   But if you wanted to create a Schema from that, you're still in the world of parsing things like `"list<item: string>"`.
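   
   To give a sense of what that parsing involves, here is a minimal base-R sketch. `parse_type_string()` is a hypothetical helper, not something in the arrow package, and it only understands plain type names and `list<item: ...>`; structs, maps, nullability markers, and everything else would need more work, which is rather the point.
   
   ```r
   # Hypothetical helper (not part of arrow): turn a type string such as
   # "list<item: string>" into a nested R list. Only plain type names and
   # list<item: ...> are handled; any other nested syntax is rejected.
   parse_type_string <- function(x) {
     x <- trimws(x)
     prefix <- "list<item: "
     if (startsWith(x, prefix) && endsWith(x, ">")) {
       # Strip the "list<item: " prefix and the closing ">" and recurse
       inner <- substr(x, nchar(prefix) + 1L, nchar(x) - 1L)
       return(list(type = "list", item = parse_type_string(inner)))
     }
     if (grepl("[<>:]", x)) {
       stop("unsupported type string: ", x)
     }
     list(type = x)
   }
   
   str(parse_type_string("list<item: string>"))
   str(parse_type_string("int32"))
   ```
   
   Even then, you would still have to map each parsed name back to an Arrow type constructor yourself.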
   
   On the R side, there is this nice code-generating utility that Romain added recently:
   
   ```
   > s$code()
   schema(name = utf8(), height = int32(), mass = float64(), hair_color = utf8(), 
       skin_color = utf8(), eye_color = utf8(), birth_year = float64(), 
       sex = utf8(), gender = utf8(), homeworld = utf8(), species = utf8(), 
       films = list_of(utf8()), vehicles = list_of(utf8()), starships = list_of(utf8()))
   ```
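   
   If the goal is just to round-trip within R, that generated code can be stored as text and evaluated later. This is only a sketch: it assumes `code()` returns an unevaluated call (which is what the printed output above suggests) and an arrow version recent enough to have it. I would not expect schema metadata or extension types to survive, so check before relying on it, and only `eval()` text that you trust.
   
   ```r
   library(arrow)
   
   # A small schema to illustrate; the field names are arbitrary
   s <- schema(name = utf8(), height = int32(), films = list_of(utf8()))
   
   # Assumption: s$code() is a call object, so deparse() gives its source text
   code_text <- paste(deparse(s$code()), collapse = "\n")
   
   # Later, or in another session with arrow loaded: rebuild the Schema
   s2 <- eval(parse(text = code_text))
   
   # Compare field names and types of the original and the rebuilt Schema
   s$Equals(s2)
   ```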

