Posted to batik-dev@xmlgraphics.apache.org by "Sebastián Passaro (Jira)" <ji...@apache.org> on 2022/01/19 17:19:00 UTC

[jira] [Updated] (BATIK-1320) Small/Big numbers in dimension values get transformed into exponents

     [ https://issues.apache.org/jira/browse/BATIK-1320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sebastián Passaro updated BATIK-1320:
-------------------------------------
    Description: 
h3. Overview

When a value drops below {{0.001}}, the CSS lexical unit value comes back in exponential notation. For example, {{0.001}} is parsed to {{0.001}}, but {{0.0001}} is parsed to {{1.0E-4}}.
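
The switch in notation matches how Java converts a {{float}} to text. The following is a minimal sketch, assuming the reported strings come from ordinary Java string conversion of the {{float}} returned by {{getFloatValue()}}. It does not use Batik at all, and the class name {{FloatNotationDemo}} is made up for illustration.

```java
public class FloatNotationDemo {

    /** Mirrors the concatenation from the report: float value followed by the unit text. */
    static String render(float value, String unit) {
        // Concatenating a float uses the same conversion as Float.toString(value).
        return value + unit;
    }

    public static void main(String[] args) {
        System.out.println(render(0.001f, "pt"));    // 0.001pt
        System.out.println(render(0.0001f, "pt"));   // 1.0E-4pt
        System.out.println(render(9999999f, "pt"));  // 9999999.0pt
        System.out.println(render(10000000f, "pt")); // 1.0E7pt
    }
}
```

{{Float.toString}} uses plain decimal notation for magnitudes from 10^-3^ up to (but not including) 10^7^, and computerized scientific notation outside that range, which lines up with the values above.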
h3. Steps to Reproduce

It can be reproduced by calling {{Parser.parseStyleDeclaration()}} with a string like {{"margin: 0.0001pt;"}}. With a custom {{DocumentHandler}}, the {{LexicalUnit}} value and its unit can be obtained with {{lu.getFloatValue() + lu.getDimensionUnitText()}}.
h3. Actual Results

When using {{0.0001pt}} as input, the result is {{"1.0E-4" + "pt"}}.
h3. Expected Results

When using {{0.0001pt}} as input, the result should be {{"0.0001" + "pt"}}, the same as the input.

It also happens with big numbers: {{10000000pt}} results in {{1.0E7pt}}.
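
Until the parser itself preserves the input, a caller could format the value without exponents on its own side. This is only a sketch of a possible workaround, not a fix in Batik: the class {{PlainDimension}} and its methods are hypothetical, and the result is a normalized plain-decimal string rather than the exact original input text (trailing zeros are dropped, for example).

```java
import java.math.BigDecimal;

public class PlainDimension {

    /**
     * Formats a float in plain decimal notation (no exponent).
     * Assumes a finite value; NaN or infinity would make BigDecimal throw.
     */
    static String plain(float value) {
        // Go through Float.toString so we keep the shortest decimal that
        // identifies the float, then strip the exponent and trailing zeros.
        return new BigDecimal(Float.toString(value))
                .stripTrailingZeros()
                .toPlainString();
    }

    /** Hypothetical helper: plain number followed by the unit text. */
    static String render(float value, String unit) {
        return plain(value) + unit;
    }

    public static void main(String[] args) {
        System.out.println(render(0.0001f, "pt"));   // 0.0001pt
        System.out.println(render(10000000f, "pt")); // 10000000pt
        System.out.println(render(0.001f, "pt"));    // 0.001pt
    }
}
```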

Of course, exponents as input should be supported too. That is not a problem now, but it needs to be considered when fixing this.
h3. Additional Information

There are also problems in the other direction. Input such as {{1.0E+4pt}} is parsed as {{1.0E}}: {{1.0}} becomes the float value and {{E}} becomes the unit, instead of {{pt}}. A {{+}} after the exponent letter is not mentioned in the [W3C spec|#number], even though a sign has to be used for a negative exponent, so it is unclear why the positive sign should be left out.
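
To make the expected split concrete, here is a small sketch that separates a dimension token into its number and unit while accepting an optional sign in the exponent. It is not Batik's scanner: the class {{DimensionSplit}} and the regular expression are assumptions written for illustration, and accepting {{+}} in the exponent is a choice made here, not something taken from the parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DimensionSplit {

    // Group 1: optional sign, integer or decimal digits, optional exponent
    //          with an optional sign. Group 2: the unit (letters or '%').
    private static final Pattern DIMENSION = Pattern.compile(
            "([+-]?(?:\\d*\\.\\d+|\\d+)(?:[eE][+-]?\\d+)?)([a-zA-Z%]*)");

    /** Returns {number, unit}; the unit is empty for a bare number. */
    static String[] split(String token) {
        Matcher m = DIMENSION.matcher(token);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a dimension: " + token);
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        for (String token : new String[] { "1.0E+4pt", "1.0E-4pt", "0.0001pt", "2em" }) {
            String[] parts = split(token);
            System.out.println(token + " -> number=" + parts[0] + ", unit=" + parts[1]);
        }
    }
}
```

Because the exponent group requires at least one digit after the optional sign, a unit that starts with {{e}}, such as {{em}}, is still read as a unit and not as a broken exponent.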

In any case, browsers may accept a wider range of ways to write the same number, with or without exponents. The right behavior would be to preserve the input format instead of changing it, while staying compliant with the formats the W3C allows.

I'd like to include a test case, but I don't really know how: my knowledge of Batik is limited, and I use it because I'm a new maintainer of the OWASP AntiSamy project (there is an [open issue for this|https://github.com/nahsra/antisamy/issues/101]). We use Batik to parse CSS in HTML to validate input, and we have some custom classes that produce the final "clean" CSS.

AntiSamy uses version 1.14, but it seems to me this also happens on older versions. Feel free to change the affected version once the root cause is identified.

  was:
h3. Overview

When you exceed the three significant figures, the value in CSS as a lexical unit is interpreted with an exponential value. This means {{0.001}} is parsed to {{{}0.001{}}}, but {{0.0001}} is parsed to {{{}1.0E-4{}}}.
h3. Steps to Reproduce

It can be reproduced by using the Parser.parseStyleDeclaration() method with a String like "margin: 0.0001pt;". With a custom DocumentHandler, {{LexicalUnit}} value with its unit can be obtained with: {{lu.getFloatValue() + lu.getDimensionUnitText()}}
h3. Actual Results

When using {{0.0001pt}} as input, it results in {{{}"1.0E-5" + "pt"{}}}.
h3. Expected Results

When using {{0.0001pt}} as input, result should be {{{}"{}}}{{{}0.0001{}}}{{{}" + "pt". The same as input.{}}}

It also happens with big numbers, meaning {{10000000pt}} results in {{{}1.0E7pt{}}}.

Of course,{{ }}exponents as input should be supported too, which is not a problem now but needs to be considered when fixing this.
h3. Additional Information

Also there are problems when going backwards. Cases like {{1.0E+4pt}} result in {{{}1.0E{}}}, being {{1.0}} the float value and {{E}} the unit instead of pt. However, using + after the exponent letter is not mentioned in the [W3C spec|#number],], even if sign needs to be used with negative exponent, so then why leave positive sign behind?

Anyway browsers may consider as valid a wider range of options on how to write the same number with or without exponents. The right behavior should be to respect the input format instead of changing it, being compliant with W3C allowed formats.

I'd like to include a test case but I don't really know how to because my knowledge of Batik is limited and I use it because I'm a new maintainer in OWASP AntiSamy project (there is an [open issue for this|https://github.com/nahsra/antisamy/issues/101]). We use Batik to parse CSS in HTML to validate input and have some custom classes to retrieve a final "clean" CSS.

AntiSamy uses version 1.14 but it seems to me this happens on older versions. Feel free to change that based on the root cause when it's detected.


> Small/Big numbers in dimension values get transformed into exponents
> --------------------------------------------------------------------
>
>                 Key: BATIK-1320
>                 URL: https://issues.apache.org/jira/browse/BATIK-1320
>             Project: Batik
>          Issue Type: Bug
>    Affects Versions: 1.14
>            Reporter: Sebastián Passaro
>            Priority: Minor



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
