Posted to dev@annotator.apache.org by Benjamin Young <by...@bigbluehat.com> on 2020/03/27 20:07:39 UTC

Pondering getting docs into HTML for annotation

Hi all,

We've got the (very promising!) beginnings of a core framework for building HTML annotation tools that work well with the W3C Web Annotation format. What I'm beginning to wonder is if we might also need/want to consider bringing in (or building or integrating with) other code that may get more content into HTML for annotating.

There are of course projects like PDF.js (Mozilla) and EPUB.js (FuturePress.org), and it would be great to see integrations and/or demos built on top of them (at some point).

There's another project I'm connected with (dedocx) that converts Microsoft Word .docx (OOXML) files into "ugly" HTML and then has a plugin system for post-processing that into more meaningful content (e.g. adding Linked Data): https://github.com/science-periodicals/dedocx

Would this sort of project be something we might be able to find a home for here under the Apache Annotator banner? Or, barring that, should we maybe consider sending it through the incubator process on its own--if others are interested?

Just musings at this point, but thought I'd reach out to see if y'all had thoughts. :)

Cheers!
Benjamin


--

http://bigbluehat.com/

http://linkedin.com/in/benjaminyoung

Re: Pondering getting docs into HTML for annotation

Posted by Benjamin Young <by...@bigbluehat.com>.
Yeah, don't think we'll have disagreement there. :) Mainly, I think the question is whether it makes sense to provide some of those conversion tools (or wrappers around them) within this project, or whether there should be (or already is) another place within Apache for such projects.

I think some "team ups" are in order, regardless. :)

Cheers,
Benjamin



________________________________
From: Jack Park <ja...@topicquests.org>
Sent: Friday, March 27, 2020 4:15 PM
To: dev@annotator.incubator.apache.org <de...@annotator.incubator.apache.org>
Subject: Re: Pondering getting docs into HTML for annotation

For me, it is valuable to be able to visit a PDF online and annotate and
tag it as if it is an HTML page.



Re: Pondering getting docs into HTML for annotation

Posted by Jack Park <ja...@topicquests.org>.
For me, it is valuable to be able to visit a PDF online and annotate and
tag it as if it is an HTML page.

