Posted to dev@tika.apache.org by "Luís Filipe Nassif (Jira)" <ji...@apache.org> on 2022/07/11 21:50:00 UTC
[jira] [Created] (TIKA-3815) Inconsistent Date/Time information extracted from Exif data
Luís Filipe Nassif created TIKA-3815:
----------------------------------------
Summary: Inconsistent Date/Time information extracted from Exif data
Key: TIKA-3815
URL: https://issues.apache.org/jira/browse/TIKA-3815
Project: Tika
Issue Type: Bug
Components: parser
Affects Versions: 2.4.1
Reporter: Luís Filipe Nassif
Attachments: IMG_20220616_111848_HDR.jpg
Running tika-app-2.4.1.jar on the attached image, this metadata is returned:
Exif IFD0:Date/Time: 2022:06:16 11:18:49
Exif SubIFD:Date/Time Digitized: 2022:06:16 11:18:49
Exif SubIFD:Date/Time Original: 2022:06:16 11:18:49
Exif SubIFD:Time Zone: -03:00
Exif SubIFD:Time Zone Digitized: -03:00
Exif SubIFD:Time Zone Original: -03:00
File Modified Date: Thu Jun 16 11:18:50 -03:00 2022
GPS:GPS Date Stamp: 2022:06:16
GPS:GPS Time-Stamp: 14:18:47.000 UTC
dcterms:created: 2022-06-16T08:18:49
dcterms:modified: 2022-06-16T08:18:49
exif:DateTimeOriginal: 2022-06-16T08:18:49
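The correct value is 2022-06-16T14:18:49Z, i.e. the Exif local time 11:18:49 at offset -03:00, which also agrees with the GPS time-stamp (14:18:47 UTC) to within a couple of seconds. The last 3 fields (dcterms:created, dcterms:modified and exif:DateTimeOriginal) instead report 08:18:49 with no timezone, which is three hours earlier than the local capture time. Although no timezone is specified for some of the raw values, I think it makes no sense to convert them to any timezone other than GMT or the one in which the picture was taken (-03:00), so Tika may be applying an incorrect timezone conversion to those 3 fields.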
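For reference, the expected value can be reproduced with plain java.time by combining the Exif local date/time with the Exif offset. This is only a minimal sketch of the arithmetic, not Tika's code: the class name ExifTimeSketch and the method toUtc are hypothetical, and the input strings are copied from the metadata above.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not part of Tika.
final class ExifTimeSketch {

    // Exif stores date/time as "yyyy:MM:dd HH:mm:ss" with no zone information.
    private static final DateTimeFormatter EXIF_FORMAT =
            DateTimeFormatter.ofPattern("uuuu:MM:dd HH:mm:ss");

    // Interprets the Exif local time at the given offset (e.g. "-03:00")
    // and returns the corresponding UTC instant.
    static Instant toUtc(String exifDateTime, String exifOffset) {
        LocalDateTime local = LocalDateTime.parse(exifDateTime, EXIF_FORMAT);
        return local.atOffset(ZoneOffset.of(exifOffset)).toInstant();
    }

    public static void main(String[] args) {
        // Values taken from "Exif SubIFD:Date/Time Original" and
        // "Exif SubIFD:Time Zone Original" in the report.
        System.out.println(toUtc("2022:06:16 11:18:49", "-03:00"));
    }
}
```

With the values from the attached image this prints 2022-06-16T14:18:49Z, not the 08:18:49 shown in the last 3 fields.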
--
This message was sent by Atlassian Jira
(v8.20.10#820010)