Posted to dev@tika.apache.org by "Nicholas DiPiazza (Jira)" <ji...@apache.org> on 2021/06/15 17:12:00 UTC

[jira] [Updated] (TIKA-3446) OneNote - look into adding support for OneNote 365 documents

     [ https://issues.apache.org/jira/browse/TIKA-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas DiPiazza updated TIKA-3446:
------------------------------------
    Description: 
While doing some parsing of OneNote documents, I was investigating a slew of them that did not seem to parse very well. 

When I did some digging, I found out that these documents were generated from SharePoint Online. 

I had hoped that OneNote documents generated from SharePoint Online would be the same as on-premises OneNote documents from 2016, 2019, etc.

But it turns out this is NOT the case.

I checked the Microsoft specification MS-ONESTORE and found that the documents do not match the published specification.

Opened a community post: [Looking for the MS spec for OneNote 365 version - Microsoft Q&A|https://docs.microsoft.com/en-us/answers/questions/436943/looking-for-the-ms-spec-for-onenote-365-version-1.html]

And also opened an internal ticket with Microsoft. 

They will be responding soon with an analysis of my issue and we'll see if there is anything we can do. 

  was:
While doing some parsing of OneNote documents, I was investigating a slew of them that did not seem to parse very well. 

When I did some digging, I found out that these documents were generated from SharePoint Online. 

I had hoped that OneNote documents generated from OneNote would be the same as on-premises OneNote documents from 2016, 2019, etc.

But it turns out this is NOT the case.

I checked the Microsoft specification MS-ONESTORE and found that the documents do not match the published specification.

Opened a community post: [Looking for the MS spec for OneNote 365 version - Microsoft Q&A|https://docs.microsoft.com/en-us/answers/questions/436943/looking-for-the-ms-spec-for-onenote-365-version-1.html]

And also opened an internal ticket with Microsoft. 

They will be responding soon with an analysis of my issue and we'll see if there is anything we can do. 


> OneNote - look into adding support for OneNote 365 documents
> ------------------------------------------------------------
>
>                 Key: TIKA-3446
>                 URL: https://issues.apache.org/jira/browse/TIKA-3446
>             Project: Tika
>          Issue Type: New Feature
>          Components: parser
>    Affects Versions: 1.27
>            Reporter: Nicholas DiPiazza
>            Assignee: Nicholas DiPiazza
>            Priority: Major
>
> While doing some parsing of OneNote documents, I was investigating a slew of them that did not seem to parse very well. 
> When I did some digging, I found out that these documents were generated from SharePoint Online. 
> I had hoped that OneNote documents generated from SharePoint Online would be the same as on-premises OneNote documents from 2016, 2019, etc.
> But it turns out this is NOT the case.
> I checked the Microsoft specification MS-ONESTORE and found that the documents do not match the published specification.
> Opened a community post: [Looking for the MS spec for OneNote 365 version - Microsoft Q&A|https://docs.microsoft.com/en-us/answers/questions/436943/looking-for-the-ms-spec-for-onenote-365-version-1.html]
> And also opened an internal ticket with Microsoft. 
> They will be responding soon with an analysis of my issue and we'll see if there is anything we can do. 


