Posted to dev@arrow.apache.org by Xander Dunn <xa...@xander.ai.INVALID> on 2020/10/04 00:51:48 UTC

Defining Compression Type in Arrow GLib?

I am writing a Swift language interface to the Apache Arrow GLib C bindings.

I can currently put Swift arrays into GArrowArrays, those into GArrowTables, and those into Feather files on disk:

func saveGTableToFeather(_ gTable: UnsafeMutablePointer<GArrowTable>, outputPath: String) throws {
   var error: UnsafeMutablePointer<GError>? = nil
   // TODO: How do I turn on compression?
   let properties = garrow_feather_write_properties_new()
   // Release the properties object on every exit path.
   defer { g_object_unref(properties) }
   let path = outputPath.cString(using: .utf8)
   let output = garrow_file_output_stream_new(path, 0, &error)
   if let error = error {
       // By GError convention output is NULL when an error is set, so there is no stream to unref.
       let errorString: String = String(cString: error.pointee.message)
       g_error_free(error)
       throw ArrowError.invalidTableCreation(errorString)
   }
   // The stream exists from here on; release it on every later exit path.
   defer { g_object_unref(output) }
   let result: gboolean = garrow_table_write_as_feather(gTable, GARROW_OUTPUT_STREAM(output), properties, &error)
   if result == 0 {
       var errorString = ""
       if let error = error {
           errorString = String(cString: error.pointee.message)
           // Free the GError only when one was set; g_error_free(NULL) logs a critical warning.
           g_error_free(error)
       }
       throw ArrowError.invalidTableCreation(errorString)
   }
}

When I run file myData.feather on Ubuntu 18.04, it doesn’t indicate any kind of compression, so the files I’m saving may not be compressed at all.

You can see I’ve created a new GArrowFeatherWriteProperties with garrow_feather_write_properties_new(), but how do I set the compression type? According to the docs ( https://arrow.apache.org/docs/c_glib/arrow-glib/GArrowTable.html#GArrowFeatherWriteProperties--compression ), it is meant to have a "compression" property. When I attempt to set properties.compression (in Swift this is likely properties.pointee.compression), I get a compiler error saying it has no such property. Do I need to make a cast? Is there any example code that sets the compression on a GArrowFeatherWriteProperties?

Thank you,

Xander

Re: Defining Compression Type in Arrow GLib?

Posted by Sutou Kouhei <ko...@clear-code.com>.
Hi,

You can use g_object_set():

  g_object_set(G_OBJECT(properties),
               "compression", GARROW_COMPRESSION_TYPE_ZSTD,
               NULL);

See also:

  * g_object_set()'s reference manual:
    https://developer.gnome.org/gobject/stable/gobject-The-Base-Object-Type.html#g-object-set

  * "compression" property's reference manual:
    https://arrow.apache.org/docs/c_glib/arrow-glib/GArrowTable.html#GArrowFeatherWriteProperties--compression


Thanks,
--
kou

In <kf...@we.are.superhuman.com>
  "Defining Compression Type in Arrow GLib?" on Sun, 04 Oct 2020 00:51:48 +0000,
  "Xander Dunn" <xa...@xander.ai.INVALID> wrote:

> I am writing a Swift language interface to the Apache Arrow GLib C bindings.
> 
> I am currently successfully putting Swift arrays into GArrays into GTables into feather files on disk:
> 
> func saveGTableToFeather(_ gTable: UnsafeMutablePointer<GArrowTable>, outputPath: String) throws {
>    var error: UnsafeMutablePointer<GError>? = nil
>    // TODO: How do I turn on compression?
>    let properties = garrow_feather_write_properties_new()
>    let path = outputPath.cString(using: .utf8)
>    let output = garrow_file_output_stream_new(path, 0, &error)
>    if let error = error {
>        let errorString: String = String(cString: error.pointee.message)
>        g_error_free(error)
>        g_object_unref(output)
>        g_object_unref(properties)
>        throw ArrowError.invalidTableCreation(errorString)
>    }
>    let result: gboolean = garrow_table_write_as_feather(gTable, GARROW_OUTPUT_STREAM(output), properties, &error)
>    if result == 0 {
>        let errorString: String = error != nil ? String(cString: error!.pointee.message) : ""
>        g_error_free(error)
>        g_object_unref(output)
>        g_object_unref(properties)
>        throw ArrowError.invalidTableCreation(errorString)
>    }
>    g_object_unref(output)
>    g_object_unref(properties)
> }
> 
> When I run file myData.feather on Ubuntu 18.04, it doesn’t indicate any kind of compression, so the files I’m saving may not be compressed at all.
> 
> You can see I’ve created a new GArrowFeatherWriteProperties with garrow_feather_write_properties_new , but how do I set the compression level? I see it is meant to have a compression property based on the docs here ( https://arrow.apache.org/docs/c_glib/arrow-glib/GArrowTable.html#GArrowFeatherWriteProperties--compression ). When I attempt to set properties.compression (In Swift this is likely properties.pointee.compression ), I get a compiler error that it has no such property. Do I need to make a cast? Is there any example code of setting the compression on a GArrowFeatherWriteProperties ?
> 
> Thank you,
> 
> Xander