Posted to dev@tika.apache.org by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2021/06/15 17:11:00 UTC
[jira] [Created] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents
Nicholas DiPiazza created TIKA-3446:
---------------------------------------
Summary: OneNote - look into adding support for OneNote 365 documents
Key: TIKA-3446
URL: https://issues.apache.org/jira/browse/TIKA-3446
Project: Tika
Issue Type: New Feature
Components: parser
Affects Versions: 1.27
Reporter: Nicholas DiPiazza
Assignee: Nicholas DiPiazza
While doing some parsing of OneNote documents, I was investigating a slew of them that did not seem to parse very well.
When I did some digging, I found out that these documents were generated from SharePoint Online.
I had hoped that these online-generated OneNote documents would use the same format as on-premises OneNote documents from 2016, 2019, etc.
But it turns out this is NOT the case.
I checked the Microsoft specification MS-ONESTORE and found that the documents do not match the published specification.
Opened a community post: [Looking for the MS spec for OneNote 365 version - Microsoft Q&A|https://docs.microsoft.com/en-us/answers/questions/436943/looking-for-the-ms-spec-for-onenote-365-version-1.html]
I also opened an internal ticket with Microsoft.
They will be responding soon with an analysis of my issue, and we'll see if there is anything we can do.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)