You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tika.apache.org by Nick Burch <ni...@apache.org> on 2014/08/07 09:30:59 UTC
Re: svn commit: r1616295 [1/2] - in /tika/trunk: ./ tika-app/src/main/java/org/apache/tika/cli/
tika-app/src/test/java/org/apache/tika/cli/ tika-core/src/main/java/org/apache/tika/detect/
tika-core/src/main/java/org/apache/tika/io/ tika-core/src/main/java/...
On 06/08/14 19:16, tpalsulich@apache.org wrote:
> Author: tpalsulich
> Date: Wed Aug 6 18:16:27 2014
> New Revision: 1616295
>
> URL: http://svn.apache.org/r1616295
> Log:
> Fix for TIKA-1387 (thanks Uwe Schindler). Adding the Maven forbidden-apis plugin and fixing identified errors.
Minor thing, but any chance you could change your IDE to use explicit
imports rather than wildcard ones? (Sadly your patch seems to have
re-written some)
I notice in a few places you've used things like Locale.getDefaultLocale
and Charset.getDefaultCharset. Are we sure they're correct? For example:
---
tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java
(original)
+++
tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java
Wed Aug 6 18:16:27 2014
@@ -245,7 +245,7 @@ public class ImageMetadataExtractor {
}
static class ExifHandler implements DirectoryHandler {
- private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new
SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
+ private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new
SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.getDefault());
That looks to be formatting to ISO-8859-1 format, so should probably be
using a standard locale not the system default - ISO-8859-1 is the same
everywhere!
Otherwise, looks a pretty epic patch for a short period of time!
Nick
Re: svn commit: r1616295 [1/2] - in /tika/trunk: ./
tika-app/src/main/java/org/apache/tika/cli/ tika-app/src/test/java/org/apache/tika/cli/
tika-core/src/main/java/org/apache/tika/detect/ tika-core/src/main/java/org/apache/tika/io/
tika-core/src/main/java/...
Posted by Tyler Palsulich <tp...@gmail.com>.
Hi Nick,
Good catch! I'll fix the imports now.
No, I'm not sure about the default Locale, Charset, etc. See the discussion
at TIKA-1387 and the corresponding review board.
Thanks for keeping an eye out!
Tyler
On Aug 7, 2014 12:31 AM, "Nick Burch" <ni...@apache.org> wrote:
> On 06/08/14 19:16, tpalsulich@apache.org wrote:
>
>> Author: tpalsulich
>> Date: Wed Aug 6 18:16:27 2014
>> New Revision: 1616295
>>
>> URL: http://svn.apache.org/r1616295
>> Log:
>> Fix for TIKA-1387 (thanks Uwe Schindler). Adding the Maven forbidden-apis
>> plugin and fixing identified errors.
>>
>
> Minor thing, but any chance you could change your IDE to use explicit
> imports rather than wildcard ones? (Sadly your patch seems to have
> re-written some)
>
> I notice in a few places you've used things like Locale.getDefaultLocale
> and Charset.getDefaultCharset. Are we sure they're correct? For example:
>
> --- tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java
> (original)
> +++ tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java
> Wed Aug 6 18:16:27 2014
> @@ -245,7 +245,7 @@ public class ImageMetadataExtractor {
> }
>
> static class ExifHandler implements DirectoryHandler {
> - private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new
> SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
> + private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new
> SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.getDefault());
>
> That looks to be formatting to ISO-8859-1 format, so should probably be
> using a standard locale not the system default - ISO-8859-1 is the same
> everywhere!
>
>
> Otherwise, looks a pretty epic patch for a short period of time!
>
> Nick
>