Posted to dev@tika.apache.org by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2021/06/27 02:42:00 UTC
[jira] [Commented] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents
[ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17370146#comment-17370146 ]
Nicholas DiPiazza commented on TIKA-3446:
-----------------------------------------
I talked to the Microsoft open docs people, and they informed me that the additional information I need is in MS-ONESTORE 2.8, which references another spec, MS-FSSHTTPB: [https://interoperability.blob.core.windows.net/files/MS-FSSHTTPB/%5bMS-FSSHTTPB%5d.pdf]
While trying to implement this, I noticed that the spec doesn't match what my example document contains. I have emailed them again to follow up.
> OneNote - look into adding support for OneNote 365 documents
> ------------------------------------------------------------
>
> Key: TIKA-3446
> URL: https://issues.apache.org/jira/browse/TIKA-3446
> Project: Tika
> Issue Type: New Feature
> Components: parser
> Affects Versions: 1.27
> Reporter: Nicholas DiPiazza
> Assignee: Nicholas DiPiazza
> Priority: Major
>
> While doing some parsing of OneNote documents, I was investigating a slew of them that did not seem to parse very well.
> When I did some digging, I found out that these documents were generated from SharePoint Online.
> I had hoped that OneNote documents generated from SharePoint Online would be the same as on-premises OneNote documents from 2016, 2019, etc.
> But it turns out this is NOT the case.
> I checked the Microsoft specification MS-ONESTORE and found that the documents do not match the published specification.
> I opened a community post: [Looking for the MS spec for OneNote 365 version - Microsoft Q&A|https://docs.microsoft.com/en-us/answers/questions/436943/looking-for-the-ms-spec-for-onenote-365-version-1.html]
> I also opened an internal ticket with Microsoft.
> They will be responding soon with an analysis of my issue and we'll see if there is anything we can do.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)