Posted to user@tika.apache.org by Susan Borda <sb...@umich.edu> on 2018/06/26 15:28:24 UTC

Text extraction for FITS similar to NetCDF?

Hi-
I'm working with NetCDF and FITS files. I have Tika extracting the header
text from NetCDF files, but for FITS files I can only get basic file
metadata. Does header text extraction not work on FITS files?

I followed this for FITS:
https://wiki.apache.org/tika/TikaGDAL
but I am only seeing the basic file metadata, not the actual text from the
header.

This is what I'm using for NetCDF files (I also used tika --gui to see the
header text):
curl -X PUT --data-binary @age4_timeseries.nc http://localhost:9998/tika --header "Content-type: text/-t"
curl -T age4_timeseries.nc http://localhost:9998/tika --header "Accept: text/plain"
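
For comparison, here is a minimal sketch of the text request above next to a metadata-only request, wrapped as shell functions. It assumes Tika Server is listening on localhost:9998 as in the commands above. The function names, the TIKA_URL variable, and the example.fits filename are made up for illustration; /meta is the Tika Server metadata endpoint.

```shell
# Base URL of the Tika Server (assumption: same host and port as above).
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# Ask Tika Server for the extracted text of a file (PUT via curl -T).
tika_text() {
  curl -s -T "$1" "$TIKA_URL/tika" --header "Accept: text/plain"
}

# Ask Tika Server for the metadata of a file, returned as JSON.
tika_meta() {
  curl -s -T "$1" "$TIKA_URL/meta" --header "Accept: application/json"
}
```

Running tika_text and tika_meta on both age4_timeseries.nc and a FITS file (for example, tika_meta example.fits) should show whether the FITS header ends up in the text output, in the metadata, or in neither.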

I've looked through the Tika Jira and found a reference from 2012:
https://issues.apache.org/jira/browse/TIKA-874

Any advice would be appreciated.

Thanks,
susan