Posted to dev@tika.apache.org by "Lewis John McGibbney (JIRA)" <ji...@apache.org> on 2015/01/29 08:54:35 UTC

[jira] [Comment Edited] (TIKA-1423) Build a parser to extract data from GRIB formats

    [ https://issues.apache.org/jira/browse/TIKA-1423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14296541#comment-14296541 ] 

Lewis John McGibbney edited comment on TIKA-1423 at 1/29/15 7:54 AM:
---------------------------------------------------------------------

Attached is a patch for trunk that passes all tests and addresses the issues experienced with the bundle module. This required some investigative work, as well as publishing the [Unidata dependencies|http://search.maven.org/#search|ga|1|ucar] to Maven Central and updating our [wiki documentation|https://wiki.apache.org/tika/ThirdPartySonaType].
Please place the attached .grib2 test file at
{code}
tika-parsers/src/test/resources/test-documents/gdas1.forecmwf.2014062612.grib2
{code}


was (Author: lewismc):
Patch for trunk which passes all tests including issues experienced with bundle module. Some investigative work was required here as well as publishing [Unidata dependencies|http://search.maven.org/#search|ga|1|ucar] to Maven central and updating our [https://wiki.apache.org/tika/ThirdPartySonaType|wiki documentation]. 

> Build a parser to extract data from GRIB formats
> ------------------------------------------------
>
>                 Key: TIKA-1423
>                 URL: https://issues.apache.org/jira/browse/TIKA-1423
>             Project: Tika
>          Issue Type: New Feature
>          Components: metadata, mime, parser
>    Affects Versions: 1.6
>            Reporter: Vineet Ghatge
>            Assignee: Vineet Ghatge
>            Priority: Critical
>              Labels: features, newbie
>             Fix For: 1.8
>
>         Attachments: GRIBParsertest.java, GribParser.java, NLDAS_FORA0125_H.A20130112.1200.002.grb, TIKA-1423.palsulich.120614.patch, TIKA-1423.patch, TIKA-1423v2.patch, fileName.html, gdas1.forecmwf.2014062612.grib2
>
>
> The Arctic dataset contains a MIME format called GRIB - General 
> Regularly-distributed Information in Binary form http://en.wikipedia.org/wiki/GRIB . GRIB is a well-known, concise data format used in meteorology to store historical and 
> forecast weather data. There are 2 different types of the format - GRIB 0, GRIB 2. The focus will be on GRIB 2, which is the most prevalent. Each GRIB record intended for either transmission or storage contains a single parameter with values located at an array of grid points, or represented as a set of spectral coefficients, for a single level (or layer), encoded as a continuous bit stream. Logical divisions of the record are designated as "sections", each of which provides control information and/or data. A GRIB record consists of six sections, two of which are optional: 
>  
> (0) Indicator Section 
> (1) Product Definition Section (PDS) 
> (2) Grid Description Section (GDS) - optional 
> (3) Bit Map Section (BMS) - optional 
> (4) Binary Data Section (BDS) 
> (5) '7777' (ASCII Characters)
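
For illustration only, the framing described above (an Indicator Section at the start and the ASCII characters '7777' at the end) can be checked with a few lines of plain Java. This is a minimal sketch and not the parser in the attached patch: the class and method names are hypothetical, the record is synthetic, and reading the edition number from octet 8 is an assumption based on my understanding of the WMO layout rather than anything stated in this issue.

```java
import java.nio.charset.StandardCharsets;

public class GribFramingSketch {

    /** Edition number from octet 8 if the record starts with "GRIB", otherwise -1. */
    static int edition(byte[] record) {
        if (record.length < 8) {
            return -1;
        }
        String magic = new String(record, 0, 4, StandardCharsets.US_ASCII);
        // Assumption: octet 8 (index 7) of the Indicator Section holds the edition number.
        return magic.equals("GRIB") ? (record[7] & 0xFF) : -1;
    }

    /** True if the record ends with the four ASCII characters "7777". */
    static boolean hasEndMarker(byte[] record) {
        if (record.length < 4) {
            return false;
        }
        String tail = new String(record, record.length - 4, 4, StandardCharsets.US_ASCII);
        return tail.equals("7777");
    }

    /** Hypothetical helper: a synthetic 12-byte record with no real sections in between. */
    static byte[] syntheticRecord(int edition) {
        byte[] record = new byte[12];
        System.arraycopy("GRIB".getBytes(StandardCharsets.US_ASCII), 0, record, 0, 4);
        record[7] = (byte) edition; // octets 5-7 are left as zero in this sketch
        System.arraycopy("7777".getBytes(StandardCharsets.US_ASCII), 0, record, 8, 4);
        return record;
    }

    public static void main(String[] args) {
        byte[] record = syntheticRecord(2);
        System.out.println("edition = " + edition(record));
        System.out.println("end marker present = " + hasEndMarker(record));
    }
}
```

A check like this only confirms the outer framing of a record; extracting the PDS, GDS, BMS and BDS contents is what the Unidata libraries mentioned in the comment are for.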



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)