You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tika.apache.org by Nick Burch <ni...@apache.org> on 2014/08/07 09:30:59 UTC

Re: svn commit: r1616295 [1/2] - in /tika/trunk: ./ tika-app/src/main/java/org/apache/tika/cli/ tika-app/src/test/java/org/apache/tika/cli/ tika-core/src/main/java/org/apache/tika/detect/ tika-core/src/main/java/org/apache/tika/io/ tika-core/src/main/java/...

On 06/08/14 19:16, tpalsulich@apache.org wrote:
> Author: tpalsulich
> Date: Wed Aug  6 18:16:27 2014
> New Revision: 1616295
>
> URL: http://svn.apache.org/r1616295
> Log:
> Fix for TIKA-1387 (thanks Uwe Schindler). Adding the Maven forbidden-apis plugin and fixing identified errors.

Minor thing, but any chance you could change your IDE to use explicit 
imports rather than wildcard ones? (Sadly your patch seems to have 
re-written some)

I notice in a few places you've used things like Locale.getDefaultLocale 
and Charset.getDefaultCharset. Are we sure they're correct? For example:

--- 
tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java 
(original)
+++ 
tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java 
Wed Aug  6 18:16:27 2014
@@ -245,7 +245,7 @@ public class ImageMetadataExtractor {
      }

      static class ExifHandler implements DirectoryHandler {
-        private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new 
SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
+        private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new 
SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.getDefault());

That looks to be formatting to ISO-8859-1 format, so should probably be 
using a standard locale not the system default - ISO-8859-1 is the same 
everywhere!


Otherwise, looks a pretty epic patch for a short period of time!

Nick

Re: svn commit: r1616295 [1/2] - in /tika/trunk: ./ tika-app/src/main/java/org/apache/tika/cli/ tika-app/src/test/java/org/apache/tika/cli/ tika-core/src/main/java/org/apache/tika/detect/ tika-core/src/main/java/org/apache/tika/io/ tika-core/src/main/java/...

Posted by Tyler Palsulich <tp...@gmail.com>.
Hi Nick,

Good catch! I'll fix the imports now.

No, I'm not sure about the default Locale, Charset, etc. See the discussion
at TIKA-1387 and the corresponding review board.

Thanks for keeping an eye out!
Tyler
On Aug 7, 2014 12:31 AM, "Nick Burch" <ni...@apache.org> wrote:

> On 06/08/14 19:16, tpalsulich@apache.org wrote:
>
>> Author: tpalsulich
>> Date: Wed Aug  6 18:16:27 2014
>> New Revision: 1616295
>>
>> URL: http://svn.apache.org/r1616295
>> Log:
>> Fix for TIKA-1387 (thanks Uwe Schindler). Adding the Maven forbidden-apis
>> plugin and fixing identified errors.
>>
>
> Minor thing, but any chance you could change your IDE to use explicit
> imports rather than wildcard ones? (Sadly your patch seems to have
> re-written some)
>
> I notice in a few places you've used things like Locale.getDefaultLocale
> and Charset.getDefaultCharset. Are we sure they're correct? For example:
>
> --- tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java
> (original)
> +++ tika/trunk/tika-parsers/src/main/java/org/apache/tika/parser/image/ImageMetadataExtractor.java
> Wed Aug  6 18:16:27 2014
> @@ -245,7 +245,7 @@ public class ImageMetadataExtractor {
>      }
>
>      static class ExifHandler implements DirectoryHandler {
> -        private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new
> SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
> +        private static final SimpleDateFormat DATE_UNSPECIFIED_TZ = new
> SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss", Locale.getDefault());
>
> That looks to be formatting to ISO-8859-1 format, so should probably be
> using a standard locale not the system default - ISO-8859-1 is the same
> everywhere!
>
>
> Otherwise, looks a pretty epic patch for a short period of time!
>
> Nick
>