Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/11/24 13:01:54 UTC

[GitHub] [arrow] phpsxg opened a new issue, #14728: The metadata parameter of pa.filed is not valid

phpsxg opened a new issue, #14728:
URL: https://github.com/apache/arrow/issues/14728

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   I set the `metadata` parameter on `pa.field` and then print `schema.metadata`, but the column metadata does not appear in the output. How do I correctly display the metadata of the columns?
   
   
   
   ```
   import pyarrow as pa


   class Dimension:
       INSTRUMENT_TYPE = 'instrument_type'
       INSTRUMENT_CODE = 'instrument_code'
       INSTRUMENT_NAME = 'instrument_name'


   schema = pa.schema(
       [
           # Field-level (column) metadata is attached to each pa.field.
           pa.field(Dimension.INSTRUMENT_CODE, pa.string(), metadata={b"table_filed": b"FUND_CODE"}),
           pa.field(Dimension.INSTRUMENT_NAME, pa.string(), metadata={b"table_filed": b"FUND_NAME"}),
           pa.field(Dimension.INSTRUMENT_TYPE, pa.string()),
       ],
       # Schema-level (table) metadata.
       metadata={
           Dimension.INSTRUMENT_CODE: 'code',
           Dimension.INSTRUMENT_NAME: 'name',
           Dimension.INSTRUMENT_TYPE: 'type',
       },
   )

   # Excerpt from a classmethod: `cls`, `sql`, and `InstrumentType` are not shown here.
   df = cls.query(sql)
   df.rename(columns=cls.etl_rename_dict, inplace=True)
   df[Dimension.INSTRUMENT_TYPE] = InstrumentType.FUND

   table = pa.Table.from_pandas(df, schema=cls.schema)
   # pq.write_table(table, 'test_parquet')
   table_schema = table.schema
   print(table_schema.metadata)
   ```
   **Printed output**
   ```
   {b'instrument_code': b'code', b'instrument_name': b'name', b'instrument_type': b'type', b'pandas': b'{"index_columns": [], "column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", "numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": [{"name": "instrument_code", "field_name": "instrument_code", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "instrument_name", "field_name": "instrument_name", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}, {"name": "instrument_type", "field_name": "instrument_type", "pandas_type": "unicode", "numpy_type": "object", "metadata": null}], "creator": {"library": "pyarrow", "version": "10.0.0"}, "pandas_version": "1.5.1"}'}
   
   ```
   
   
   
   
   
   - python: 3.10
   - pyarrow: 10.0.0
   
   ### Component(s)
   
   Parquet, Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] phpsxg closed issue #14728: The metadata parameter of pa.filed is not valid

Posted by GitBox <gi...@apache.org>.
phpsxg closed issue #14728: The metadata parameter of pa.filed is not valid
URL: https://github.com/apache/arrow/issues/14728




[GitHub] [arrow] phpsxg commented on issue #14728: The metadata parameter of pa.filed is not valid

Posted by GitBox <gi...@apache.org>.
phpsxg commented on issue #14728:
URL: https://github.com/apache/arrow/issues/14728#issuecomment-1327189854

   > `pa.field` is going to assign **column** metadata. You are printing **table** metadata.
   > 
   > Try:
   > 
   > ```
   > for field in table_schema:
   >   print(field.metadata)
   > ```
   
   Great, that solved my problem. Thanks!




[GitHub] [arrow] westonpace commented on issue #14728: The metadata parameter of pa.filed is not valid

Posted by GitBox <gi...@apache.org>.
westonpace commented on issue #14728:
URL: https://github.com/apache/arrow/issues/14728#issuecomment-1327002022

   `pa.field` is going to assign **column** metadata.  You are printing **table** metadata.
   
   Try:
   
   ```
   for field in table_schema:
     print(field.metadata)
   ```

