Posted to dev@lucene.apache.org by "Cassandra Targett (JIRA)" <ji...@apache.org> on 2018/10/18 21:14:00 UTC

[jira] [Commented] (SOLR-12746) Ref Guide HTML output should adhere to more standard HTML5

    [ https://issues.apache.org/jira/browse/SOLR-12746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16655875#comment-16655875 ] 

Cassandra Targett commented on SOLR-12746:
------------------------------------------

Based on feedback I got from the Asciidoctor community, this error I mentioned as the #2 caveat earlier:

bq. There is an error output by the Slim engine ({{Slim::Engine: Option :asciidoc is invalid}}) during the HTML build for every template (so, 30+ times).

is resolved by downgrading Slim from 4.0.1 to v3.0. I had installed 4.0.1 because it was the latest release and the templates didn't specify a version. I don't think we care which Slim version we use, so I can update the README and Jenkins scripts to force v3.0 and we can call that problem resolved.
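
As a rough sketch of what "forcing" the version could look like, the Ruby snippet below checks a Slim version string against a pin before the build runs. The constraint {{~> 3.0}} (any 3.x release, nothing from 4.x) is an assumption, since only "v3.0" is stated above, and the constant and method names are hypothetical.

```ruby
require 'rubygems'

# Assumed pin: the comment above only says "v3.0", so "~> 3.0"
# (>= 3.0 and < 4.0) is an illustrative choice, not a confirmed one.
SLIM_REQUIREMENT = Gem::Requirement.new('~> 3.0')

# Hypothetical helper: true when the given Slim version satisfies the pin.
def slim_version_ok?(version)
  SLIM_REQUIREMENT.satisfied_by?(Gem::Version.new(version))
end

# When run directly, report on whatever Slim versions are installed locally.
# find_all_by_name returns an empty list if the gem is not installed.
if __FILE__ == $PROGRAM_NAME
  installed = Gem::Specification.find_all_by_name('slim').map { |s| s.version.to_s }
  installed.each do |v|
    puts "slim #{v}: #{slim_version_ok?(v) ? 'ok' : 'outside pin'}"
  end
end
```

The same constraint can be passed at install time with {{gem install slim -v '~> 3.0'}}. A guard like this would only be useful if the build scripts can end up picking up a newer Slim from the environment.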

I found some more CSS changes to make, and there is still a TODO or two that I'd mentioned earlier, so I'll update the branch with these changes as soon as I can (next week).

> Ref Guide HTML output should adhere to more standard HTML5
> ----------------------------------------------------------
>
>                 Key: SOLR-12746
>                 URL: https://issues.apache.org/jira/browse/SOLR-12746
>             Project: Solr
>          Issue Type: Improvement
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: documentation
>            Reporter: Cassandra Targett
>            Assignee: Cassandra Targett
>            Priority: Major
>
> The default HTML produced by Jekyll/Asciidoctor adds a lot of extra {{<div>}} tags to the content which break up our content into very small chunks. This is acceptable to a casual website reader as far as it goes, but any Reader view in a browser or another type of content extraction system that uses a similar "readability" scoring algorithm is going to either miss a lot of content or fail to display the page entirely.
> To see what I mean, take a page like https://lucene.apache.org/solr/guide/7_4/language-analysis.html and enable Reader View in your browser (I used Firefox; Steve Rowe told me offline Safari would not even offer the option on the page for him). You will notice a lot of missing content. It's almost like someone selected sentences at random.
> Asciidoctor has a long-standing issue to provide better, more semantic HTML5 output, but it has not been resolved yet: https://github.com/asciidoctor/asciidoctor/issues/242
> Asciidoctor does provide a way to override the default output templates by providing your own in Slim, HAML, ERB or any other template language supported by Tilt (none of which I know yet). There are some samples available via the Asciidoctor project which we can borrow, but it's not yet known which parts of the output are causing the worst of the problems. This issue is to explore how to fix it to improve this part of the HTML reading experience.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
