You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tika.apache.org by Chris Mattmann <ma...@apache.org> on 2014/10/10 08:22:53 UTC

Review Request 26542: Tika GDAL parser

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26542/
-----------------------------------------------------------

Review request for tika, Lewis McGibbney and Tyler Palsulich.


Bugs: TIKA-605
    https://issues.apache.org/jira/browse/TIKA-605


Repository: tika


Description
-------

This is a patch that wraps the Geospatial Data Abstraction Library (GDAL) tool that parses literally hundreds of geospatial formats as a Tika parser.


Diffs
-----

  ./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java PRE-CREATION 
  ./trunk/tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser 1629918 
  ./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java PRE-CREATION 

Diff: https://reviews.apache.org/r/26542/diff/


Testing
-------

Tested via unit tests, and ran locally.


Thanks,

Chris Mattmann


Re: Review Request 26542: Tika GDAL parser

Posted by Tyler Palsulich <tp...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26542/#review56290
-----------------------------------------------------------



./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java
<https://reviews.apache.org/r/26542/#comment96589>

    Small comment, should indent inside the if, else.



./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java
<https://reviews.apache.org/r/26542/#comment96590>

    Same indentation comment. Trying to stay under 80 lines?



./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java
<https://reviews.apache.org/r/26542/#comment96591>

    Would it be better to include this directly in ExternalParser? TesseractOCRParser has a very similar method.



./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java
<https://reviews.apache.org/r/26542/#comment96592>

    Same comment, move this to ExternalParser? Make all external Parsers (e.g. OCR and GDAL) in a subpackage of o.a.t.parsers.external and make this method protected.



./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java
<https://reviews.apache.org/r/26542/#comment96593>

    Should close the xhtml inside the finally, in case read throws an exception.



./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java
<https://reviews.apache.org/r/26542/#comment96594>

    Need an assumeTrue. Causes test to fail when gdalinfo not installed.



./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java
<https://reviews.apache.org/r/26542/#comment96595>

    Should remove the whitespace here.



./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java
<https://reviews.apache.org/r/26542/#comment96598>

    Need another assumeTrue(canRun()).



./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java
<https://reviews.apache.org/r/26542/#comment96596>

    Same, whitespace.



./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java
<https://reviews.apache.org/r/26542/#comment96597>

    Whitespace.


- Tyler Palsulich


On Oct. 11, 2014, 3:16 p.m., Chris Mattmann wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/26542/
> -----------------------------------------------------------
> 
> (Updated Oct. 11, 2014, 3:16 p.m.)
> 
> 
> Review request for tika, Lewis McGibbney and Tyler Palsulich.
> 
> 
> Bugs: TIKA-605
>     https://issues.apache.org/jira/browse/TIKA-605
> 
> 
> Repository: tika
> 
> 
> Description
> -------
> 
> This is a patch that wraps the Geospatial Data Abstraction Library (GDAL) tool that parses literally hundreds of geospatial formats as a Tika parser.
> 
> 
> Diffs
> -----
> 
>   ./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java PRE-CREATION 
>   ./trunk/tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser 1629918 
>   ./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/26542/diff/
> 
> 
> Testing
> -------
> 
> Tested via unit tests, and ran locally.
> 
> 
> Thanks,
> 
> Chris Mattmann
> 
>


Re: Review Request 26542: Tika GDAL parser

Posted by Chris Mattmann <ma...@apache.org>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/26542/
-----------------------------------------------------------

(Updated Oct. 11, 2014, 3:16 p.m.)


Review request for tika, Lewis McGibbney and Tyler Palsulich.


Changes
-------

Added a test to parse FITS files. Tested now with TikaApp from the CLI. Works great! Fixed some bugs related to getting the ContentHandler metadata output correct. Had to move ExternalParser functionality inside of the class, see comment notes inside.


Bugs: TIKA-605
    https://issues.apache.org/jira/browse/TIKA-605


Repository: tika


Description
-------

This is a patch that wraps the Geospatial Data Abstraction Library (GDAL) tool that parses literally hundreds of geospatial formats as a Tika parser.


Diffs (updated)
-----

  ./trunk/tika-parsers/src/main/java/org/apache/tika/parser/gdal/GDALParser.java PRE-CREATION 
  ./trunk/tika-parsers/src/main/resources/META-INF/services/org.apache.tika.parser.Parser 1629918 
  ./trunk/tika-parsers/src/test/java/org/apache/tika/parser/gdal/TestGDALParser.java PRE-CREATION 

Diff: https://reviews.apache.org/r/26542/diff/


Testing
-------

Tested via unit tests, and ran locally.


Thanks,

Chris Mattmann