Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2014/12/05 22:51:13 UTC
[jira] [Comment Edited] (TIKA-1423) Build a parser to extract data from GRIB formats
[ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236174#comment-14236174 ]
Tyler Palsulich edited comment on TIKA-1423 at 12/5/14 9:50 PM:
----------------------------------------------------------------
Thanks [~lewismc]. If you look at http://www.unidata.ucar.edu/software/thredds/v4.5/netcdf-java/reference/JarDependencies.html, in order to parse GRIB files we'll need the grib module. After removing it, I get the following error:
{code}
Caused by: java.io.IOException: Cant read /var/folders/bk/9dn54j3s2y14_ktr0jg89nh00000gn/T/apache-tika-1935265831405715273.tmp: not a valid CDM file.
at ucar.nc2.NetcdfFile.open(NetcdfFile.java:811)
at ucar.nc2.NetcdfFile.open(NetcdfFile.java:428)
{code}
We don't need the cdm module. Also, I was mistaken when I said all tests pass: some tika-bundle tests fail, and I don't know why.
was (Author: tpalsulich):
Thanks [~lewismc]. If you look at http://www.unidata.ucar.edu/software/thredds/v4.5/netcdf-java/reference/JarDependencies.html, in order to parse GRIB files we'll need the grib module. After removing it, I get the following error:
{code}
Caused by: java.io.IOException: Cant read /var/folders/bk/9dn54j3s2y14_ktr0jg89nh00000gn/T/apache-tika-1935265831405715273.tmp: not a valid CDM file.
at ucar.nc2.NetcdfFile.open(NetcdfFile.java:811)
at ucar.nc2.NetcdfFile.open(NetcdfFile.java:428)
{code}
We don't need the cdm module. But, I'm not sure. Also, I was mistaken when I said all tests pass. For some reason, some tika-bundle tests fail... I don't know why.
> Build a parser to extract data from GRIB formats
> ------------------------------------------------
>
> Key: TIKA-1423
> URL: https://issues.apache.org/jira/browse/TIKA-1423
> Project: Tika
> Issue Type: New Feature
> Components: metadata, mime, parser
> Affects Versions: 1.6
> Reporter: Vineet Ghatge
> Assignee: Vineet Ghatge
> Priority: Critical
> Labels: features, newbie
> Fix For: 1.8
>
> Attachments: GRIBParsertest.java, GribParser.java, NLDAS_FORA0125_H.A20130112.1200.002.grb, fileName.html, gdas1.forecmwf.2014062612.grib2
>
>
> The Arctic dataset contains files in a format called GRIB (General Regularly-distributed Information in Binary form), http://en.wikipedia.org/wiki/GRIB . GRIB is a concise data format commonly used in meteorology to store historical and forecast weather data.
> There are several editions of the format (GRIB 0, GRIB 1, and GRIB 2). The focus will be on GRIB 2, which is the most prevalent.
> Each GRIB record intended for either transmission or storage contains a single parameter with values located at an array of grid points, or represented as a set of spectral coefficients, for a single level (or layer), encoded as a continuous bit stream. Logical divisions of the record are designated as "sections", each of which provides control information and/or data. A GRIB record consists of six sections, two of which are optional:
>
> (0) Indicator Section
> (1) Product Definition Section (PDS)
> (2) Grid Description Section (GDS) optional
> (3) Bit Map Section (BMS) optional
> (4) Binary Data Section (BDS)
> (5) '7777' (ASCII Characters)
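The framing described above (an Indicator Section at the start and the ASCII characters '7777' at the end) can be checked cheaply before handing a file to a full GRIB library. The following is a minimal sketch, not part of Tika or of the attached GribParser.java: the class name GribFraming and its methods are hypothetical, and the octet offsets are assumptions based on my reading of the WMO layouts (edition 1: "GRIB", 3-octet total length, 1-octet edition; edition 2: "GRIB", 2 reserved octets, discipline, edition, 8-octet total length).

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: inspects only the outer framing of a GRIB message. */
public class GribFraming {
    private static final byte[] MAGIC = "GRIB".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] END = "7777".getBytes(StandardCharsets.US_ASCII);

    /** Returns the edition number (octet 8), or -1 if the data does not start with "GRIB". */
    public static int edition(byte[] data) {
        if (data.length < 8 || !matchesAt(data, 0, MAGIC)) {
            return -1;
        }
        return data[7] & 0xFF;
    }

    /** Returns the total message length declared in the Indicator Section, or -1 if unknown. */
    public static long messageLength(byte[] data) {
        int ed = edition(data);
        if (ed == 1) {
            // Assumed edition 1 layout: octets 5-7 hold a 3-octet big-endian length.
            return ((data[4] & 0xFFL) << 16) | ((data[5] & 0xFFL) << 8) | (data[6] & 0xFFL);
        }
        if (ed == 2 && data.length >= 16) {
            // Assumed edition 2 layout: octets 9-16 hold an 8-octet big-endian length.
            long len = 0;
            for (int i = 8; i < 16; i++) {
                len = (len << 8) | (data[i] & 0xFFL);
            }
            return len;
        }
        return -1;
    }

    /** True if the message starts with "GRIB" and ends with "7777" at the declared length. */
    public static boolean isWellFramed(byte[] data) {
        long len = messageLength(data);
        if (len < 12 || len > data.length) {
            return false;
        }
        return matchesAt(data, (int) len - 4, END);
    }

    private static boolean matchesAt(byte[] data, int offset, byte[] expected) {
        if (offset < 0 || offset + expected.length > data.length) {
            return false;
        }
        for (int i = 0; i < expected.length; i++) {
            if (data[offset + i] != expected[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Synthetic 12-octet edition 1 frame (no real data sections), for illustration only.
        byte[] frame = {'G', 'R', 'I', 'B', 0, 0, 12, 1, '7', '7', '7', '7'};
        System.out.println("edition=" + edition(frame)
                + " length=" + messageLength(frame)
                + " wellFramed=" + isWellFramed(frame));
    }
}
```

A check like this only validates the outer envelope; decoding the PDS, GDS, BMS, and BDS would still be left to a library such as netcdf-java with its grib module.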
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)