Posted to fop-users@xmlgraphics.apache.org by "dvineshkumar@gmail.com" <dv...@gmail.com> on 2015/08/11 16:36:51 UTC
Re: FOP2.0 taking more time to format complex script documents
Hi,
After analysis, I found a bug in the MultiByteFont::findGlyphIndex() method.
In FOP 2.0, the for loop in MultiByteFont::findGlyphIndex() keeps iterating
even after the glyph for the character has been found. I updated
findGlyphIndex() to terminate the loop once the glyph is found, and
performance improved considerably. The existing and updated methods are below.
Existing:
    public int findGlyphIndex(int c) {
        int idx = c;
        int retIdx = SingleByteEncoding.NOT_FOUND_CODE_POINT;
        // for most users the most likely glyphs are in the first cmap
        // segments (meaning the one with the lowest unicode start values)
        if (idx < NUM_MOST_LIKELY_GLYPHS && mostLikelyGlyphs[idx] != 0) {
            return mostLikelyGlyphs[idx];
        }
        for (CMapSegment i : cmap) {
            if (retIdx == 0
                    && i.getUnicodeStart() <= idx
                    && i.getUnicodeEnd() >= idx) {
                retIdx = i.getGlyphStartIndex()
                        + idx
                        - i.getUnicodeStart();
                if (idx < NUM_MOST_LIKELY_GLYPHS) {
                    mostLikelyGlyphs[idx] = retIdx;
                }
            }
        }
        return retIdx;
    }
Updated:
    public int findGlyphIndex(int c) {
        int idx = c;
        int retIdx = SingleByteEncoding.NOT_FOUND_CODE_POINT;
        // for most users the most likely glyphs are in the first cmap
        // segments (meaning the one with the lowest unicode start values)
        if (idx < NUM_MOST_LIKELY_GLYPHS && mostLikelyGlyphs[idx] != 0) {
            return mostLikelyGlyphs[idx];
        }
        for (int i = 0; (i < cmap.size()) && retIdx == 0; i++) {
            if (cmap.get(i).getUnicodeStart() <= idx
                    && cmap.get(i).getUnicodeEnd() >= idx) {
                retIdx = cmap.get(i).getGlyphStartIndex()
                        + idx
                        - cmap.get(i).getUnicodeStart();
                if (idx < NUM_MOST_LIKELY_GLYPHS) {
                    mostLikelyGlyphs[idx] = retIdx;
                }
            }
        }
        return retIdx;
    }
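For comparison, the same early exit could also be written with a return
inside the enhanced for loop. The following is only a minimal, self-contained
sketch and not the FOP class: Segment is a hypothetical stand-in for
CMapSegment, and NOT_FOUND = 0 and a cache size of 256 are assumptions made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: not the FOP class. All names below are illustrative.
final class EarlyReturnGlyphLookup {

    /** Hypothetical stand-in for CMapSegment with the three accessors used above. */
    static final class Segment {
        private final int unicodeStart;
        private final int unicodeEnd;
        private final int glyphStartIndex;

        Segment(int unicodeStart, int unicodeEnd, int glyphStartIndex) {
            this.unicodeStart = unicodeStart;
            this.unicodeEnd = unicodeEnd;
            this.glyphStartIndex = glyphStartIndex;
        }

        int getUnicodeStart() { return unicodeStart; }
        int getUnicodeEnd() { return unicodeEnd; }
        int getGlyphStartIndex() { return glyphStartIndex; }
    }

    static final int NOT_FOUND = 0;                // assumed "not found" value
    static final int NUM_MOST_LIKELY_GLYPHS = 256; // assumed cache size

    private final List<Segment> cmap = new ArrayList<>();
    private final int[] mostLikelyGlyphs = new int[NUM_MOST_LIKELY_GLYPHS];

    void addSegment(int unicodeStart, int unicodeEnd, int glyphStartIndex) {
        cmap.add(new Segment(unicodeStart, unicodeEnd, glyphStartIndex));
    }

    int findGlyphIndex(int c) {
        boolean cacheable = c >= 0 && c < NUM_MOST_LIKELY_GLYPHS;
        // Fast path: low code points that were already resolved once.
        if (cacheable && mostLikelyGlyphs[c] != 0) {
            return mostLikelyGlyphs[c];
        }
        for (Segment segment : cmap) {
            if (segment.getUnicodeStart() <= c && segment.getUnicodeEnd() >= c) {
                int glyph = segment.getGlyphStartIndex() + c - segment.getUnicodeStart();
                if (cacheable) {
                    mostLikelyGlyphs[c] = glyph;
                }
                return glyph; // stop scanning as soon as a segment matches
            }
        }
        return NOT_FOUND;
    }

    public static void main(String[] args) {
        EarlyReturnGlyphLookup font = new EarlyReturnGlyphLookup();
        font.addSegment(0x41, 0x5A, 10);
        System.out.println(font.findGlyphIndex('C')); // prints 12
    }
}
```

The behaviour is the same as in the updated method; the only difference is
that the loop condition no longer has to carry the "found" state.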
Regards,
Vinesh Kumar. D
--
View this message in context: http://apache-fop.1065347.n5.nabble.com/FOP2-0-taking-more-time-to-format-complex-script-documents-tp42461p42749.html
Sent from the FOP - Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org
Re: FOP2.0 taking more time to format complex script documents
Posted by Matthias Reischenbacher <ma...@gmx.at>.
Hi,
thanks for your analysis. I've committed a fix as part of
https://issues.apache.org/jira/browse/FOP-2530
Best regards,
Matthias
Re: FOP2.0 taking more time to format complex script documents
Posted by Pascal Sancho <ps...@gmail.com>.
Hi,
AFAIK, there are no rules that prevent such usage.
As a starting point, you can follow this:
http://xmlgraphics.apache.org/fop/dev/conventions.html
2015-08-13 10:15 GMT+02:00 Klaus Malorny <Kl...@knipp.de>:
> Just out of curiosity: are breaks and returns within loops forbidden in
> your coding conventions? ;-)
>
> By the way, if this is really a performance bottleneck and the number of
> segments is typically large (say, >= 10), I would sort the segments by
> their starts, convert the three values into arrays (during object
> construction), perform a binary search on the starts, then test against
> the end, and finally calculate the index.
>
> Regards,
> Klaus
--
pascal
Re: FOP2.0 taking more time to format complex script documents
Posted by Klaus Malorny <Kl...@knipp.de>.
Just out of curiosity: are breaks and returns within loops forbidden in your
coding conventions? ;-)
By the way, if this is really a performance bottleneck and the number of
segments is typically large (say, >= 10), I would sort the segments by
their starts, convert the three values into arrays (during object
construction), perform a binary search on the starts, then test against the
end, and finally calculate the index.
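A rough sketch of that idea, assuming the segments do not overlap. The class
name SortedCmapLookup, the {start, end, glyphStart} row layout and
NOT_FOUND = 0 are made up for illustration and are not taken from the FOP
source.

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch only: segment data is copied into three parallel arrays sorted by start.
final class SortedCmapLookup {

    static final int NOT_FOUND = 0; // assumed "not found" value

    private final int[] starts;
    private final int[] ends;
    private final int[] glyphStarts;

    /** Each row is {unicodeStart, unicodeEnd, glyphStartIndex}; rows must not overlap. */
    SortedCmapLookup(int[][] segments) {
        int[][] sorted = segments.clone(); // leave the caller's ordering untouched
        Arrays.sort(sorted, Comparator.comparingInt(s -> s[0]));
        starts = new int[sorted.length];
        ends = new int[sorted.length];
        glyphStarts = new int[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            starts[i] = sorted[i][0];
            ends[i] = sorted[i][1];
            glyphStarts[i] = sorted[i][2];
        }
    }

    int findGlyphIndex(int c) {
        int pos = Arrays.binarySearch(starts, c);
        if (pos < 0) {
            // binarySearch returns -(insertionPoint) - 1 on a miss; the candidate
            // segment is the last one that starts before c, at insertionPoint - 1.
            pos = -pos - 2;
        }
        if (pos < 0 || c > ends[pos]) {
            return NOT_FOUND; // c lies before the first segment or in a gap
        }
        return glyphStarts[pos] + c - starts[pos];
    }

    public static void main(String[] args) {
        SortedCmapLookup lookup = new SortedCmapLookup(new int[][] {
            {0x41, 0x5A, 10}, {0x61, 0x7A, 40}, {0x30, 0x39, 1}
        });
        System.out.println(lookup.findGlyphIndex('5')); // prints 6
    }
}
```

With n segments this needs O(log n) comparisons per lookup instead of a
linear scan, at the cost of one sort during construction.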
Regards,
Klaus
Re: FOP2.0 taking more time to format complex script documents
Posted by Pascal Sancho <ps...@gmail.com>.
Hi,
Could you please file a Jira entry, attaching all materials (test
case, patch, etc.)?
--
pascal