Posted to dev@pdfbox.apache.org by GitBox <gi...@apache.org> on 2021/06/18 21:05:20 UTC

[GitHub] [pdfbox] gunnar-ifp opened a new pull request #121: optimize applyMask in PdImageXObject

gunnar-ifp opened a new pull request #121:
URL: https://github.com/apache/pdfbox/pull/121


   There was a severe performance issue with very large masks when the image has to be scaled up to the mask size (e.g. 10000 x 10000 pixels). Bicubic scaling can take 6-10 seconds. This patch switches to bilinear resizing for these cases, although the threshold may still need fine-tuning.
   
   There was also a double allocation for the final masked image, even though we can simply reuse the image, since applyMask() is always fed a newly created one. Reference hogging and needless allocations have been removed.
   
   Additionally, the alpha blending routines were very slow because they worked pixel by pixel. There is now a staggered approach:
   - direct byte masking, which is very fast even for big images (right now it does not work with padded buffers),
   - exploiting the data buffer's sample system to merge the alpha component into the ARGB image, letting the sample model do the bit masking,
   - slow pixel expansion to reverse premultiplied matte values (but using fixed-point integer arithmetic).
   
   The interpolation flag of the mask is now also used to decide whether the mask should be interpolated.
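   
   The last stage of the staggered approach above relies on fixed-point integer arithmetic. The following is a minimal sketch of that idea, not the code from the patch. It assumes the preblending formula from the PDF spec, c' = m + a * (c - m), so the original sample is c = m + (c' - m) / a. The names `MatteUnblend`, `reciprocal` and `unblend`, the 16 fractional bits and the handling of alpha 0 are assumptions made for illustration.
   
   ```java
   /** Hypothetical helper, not part of PDFBox: reverses matte preblending for 8-bit samples. */
   final class MatteUnblend
   {
       // Number of fractional bits of the fixed-point reciprocal (an arbitrary choice for this sketch).
       private static final int SHIFT = 16;
   
       private MatteUnblend()
       {
       }
   
       /** Returns 255/alpha as a fixed-point value with SHIFT fractional bits; alpha must be 1..255. */
       static int reciprocal(int alpha)
       {
           return ((255 << SHIFT) + alpha / 2) / alpha;
       }
   
       /** Recovers c from c' = m + (alpha/255) * (c - m), clamped to 0..255. */
       static int unblend(int preblended, int matte, int alpha)
       {
           if (alpha == 0)
           {
               // The colour is undefined for fully transparent pixels;
               // returning the matte is a choice made by this sketch.
               return matte;
           }
           long diff = preblended - matte;
           // Multiply by the reciprocal instead of dividing, rounding to nearest.
           // long arithmetic avoids int overflow for very small alpha values.
           long scaled = (diff * reciprocal(alpha) + (1L << (SHIFT - 1))) >> SHIFT;
           long c = matte + scaled;
           return (int) Math.max(0, Math.min(255, c));
       }
   }
   ```
   
   In a per-pixel loop the 255 possible reciprocals would typically be cached in a lookup table, so that no division remains in the inner loop.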


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] THausherr commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
THausherr commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-864392468


   Do you have a PDF where this large scaling happened? I found only one in my entire collection where the "largeScale switch is arbitrarily chosen" segment is hit, which is the second page of PDFBOX-2103, but this was a small image.




[GitHub] [pdfbox] gunnar-ifp commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
gunnar-ifp commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-868438983


   I did some further testing, because I felt that the large-scale test, which requires the scaling factor to be bigger than 9 x 9, is pointless and that only the target resolution matters: no matter how small the source image is, if the target is 10000 x 10000, it is going to be slow. And I was right. Here are some tests, scaling up from 10, 50, 100, 500, 1000, 2000, 5000 and 10000 square to 10000 square:
   afto = AffineTransformOp.filter(), draw = Graphics2D.drawImage(). Times in milliseconds.
   ```
   GRAY -> GRAY
   ~~~~~~~~~~~~
         | Bilinear                 | Bicubic                  | Nearest Neighbor
   00010 | afto =  686, draw = 2679 | afto = 2526, draw = 8882 | afto = 227, draw = 111
   00050 | afto =  494, draw = 2679 | afto = 1658, draw = 8873 | afto = 180, draw = 110
   00100 | afto =  477, draw = 2678 | afto = 1537, draw = 8870 | afto = 181, draw = 111
   00500 | afto =  461, draw = 2678 | afto = 1451, draw = 8877 | afto = 179, draw = 112
   01000 | afto =  459, draw = 2680 | afto = 1434, draw = 8877 | afto = 181, draw = 110
   02000 | afto =  458, draw = 2677 | afto = 1424, draw = 8876 | afto = 182, draw = 111
   05000 | afto =  461, draw = 2678 | afto = 1430, draw = 8904 | afto = 182, draw = 115
   10000 | afto =  461, draw =   23 | afto = 1467, draw =   23 | afto = 182, draw =  25
   
   
   RGB -> ARGB
   ~~~~~~~~~~~
         | Bilinear                 | Bicubic                  | Nearest Neighbor
   00010 | afto = 2133, draw = 2290 | afto = 7576, draw = 7674 | afto = 587, draw = 147
   00050 | afto = 1994, draw = 2302 | afto = 6573, draw = 7757 | afto = 530, draw = 146
   00100 | afto = 1985, draw = 2305 | afto = 6445, draw = 7680 | afto = 530, draw = 146
   00500 | afto = 1979, draw = 2417 | afto = 6327, draw = 7673 | afto = 533, draw = 145
   01000 | afto = 1979, draw = 2338 | afto = 6330, draw = 7670 | afto = 533, draw = 147
   02000 | afto = 1987, draw = 2297 | afto = 6312, draw = 7688 | afto = 538, draw = 146
   05000 | afto = 2030, draw = 2294 | afto = 6477, draw = 7691 | afto = 577, draw = 150
   10000 | afto = 2144, draw =  136 | afto = 6502, draw =  141 | afto = 720, draw = 141
   
   
   Bilinear
   ~~~~~~~~~~~~
         | ARGB -> ARGB             | GRAY -> ARGB             | BINARY -> BINARY
   00010 | afto = 2130, draw = 2284 | afto = 2812, draw = 2758 | afto = 2741, draw = 4463
   00050 | afto = 1984, draw = 2281 | afto = 2573, draw = 2759 | afto = 2504, draw = 4238
   00100 | afto = 1972, draw = 2282 | afto = 2558, draw = 2758 | afto = 2349, draw = 4235
   00500 | afto = 1974, draw = 2282 | afto = 2436, draw = 2756 | afto = 2365, draw = 4236
   01000 | afto = 1975, draw = 2282 | afto = 2452, draw = 2756 | afto = 2383, draw = 4247
   02000 | afto = 1976, draw = 2282 | afto = 2527, draw = 2756 | afto = 2393, draw = 4274
   05000 | afto = 1986, draw = 2288 | afto = 3096, draw = 2760 | afto = 2582, draw = 4507
   10000 | afto = 1963, draw =  152 | afto = 5072, draw =  149 | afto = 3196, draw =  460
   ```
   
   What we can see is:
   - Bicubic is ~3 times slower than bilinear.
   - Using a Graphics to scale up anything is slow if the target is not ARGB or RGB, e.g. scaling up the mask to the final size.
   - Graphics is always faster for nearest neighbor and when source size = destination size.
   With last night's commits, which use AffineTransformOp if src < dest and interpolation is enabled, scaleImage should be a lot faster for mask scaling and faster overall.
   We can remove the scale factor test and simply test the destination pixel area to establish thresholds for bicubic, bilinear and maybe nearest neighbor.
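   
   For reference, the two scaling paths compared in the tables ("afto" and "draw") can be sketched roughly as follows. This is a minimal illustration using only standard java.awt calls, not the benchmark that produced the numbers above. The class and method names are made up for this example, and bilinear interpolation is hard-coded to keep it short.
   
   ```java
   import java.awt.Graphics2D;
   import java.awt.RenderingHints;
   import java.awt.geom.AffineTransform;
   import java.awt.image.AffineTransformOp;
   import java.awt.image.BufferedImage;
   
   /** Hypothetical helper showing the two scaling paths, not code from the patch. */
   final class ScalePaths
   {
       private ScalePaths()
       {
       }
   
       /** "afto": scales src into a new w x h image with AffineTransformOp.filter(). */
       static BufferedImage scaleWithOp(BufferedImage src, int w, int h, int dstType)
       {
           BufferedImage dst = new BufferedImage(w, h, dstType);
           AffineTransform at = AffineTransform.getScaleInstance(
                   (double) w / src.getWidth(), (double) h / src.getHeight());
           new AffineTransformOp(at, AffineTransformOp.TYPE_BILINEAR).filter(src, dst);
           return dst;
       }
   
       /** "draw": scales src into a new w x h image with Graphics2D.drawImage(). */
       static BufferedImage scaleWithDraw(BufferedImage src, int w, int h, int dstType)
       {
           BufferedImage dst = new BufferedImage(w, h, dstType);
           Graphics2D g = dst.createGraphics();
           try
           {
               g.setRenderingHint(RenderingHints.KEY_INTERPOLATION,
                       RenderingHints.VALUE_INTERPOLATION_BILINEAR);
               g.drawImage(src, 0, 0, w, h, null);
           }
           finally
           {
               // Always release the graphics context, even if drawing fails.
               g.dispose();
           }
           return dst;
       }
   }
   ```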




[GitHub] [pdfbox] gunnar-ifp commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
gunnar-ifp commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-868018899


   Yes, I would like to keep the interpolation block. 
   
   Section 8.9.6.2 of the PDF spec says that stencil masks can have an interpolate flag and that, when it is enabled, the mask is interpolated, not the colors, so that low-res stencil masks don't appear with jagged edges.
   
   Originally this happens when the image / mask is rendered onto the canvas, but we apply the mask beforehand without knowing the target resolution. This is why we have to interpolate ourselves and may also end up working at a far larger image size than necessary.
   
   But I noticed a few things while writing this text:
   
   We do extend the 1-bit stencil mask to 8 bits, which might pose an interesting problem:
   Looking at PDFBOX-4218, if instead the image were 48 pixels wide and the mask 6 pixels, with the interpolate flag set for the mask, this would create alpha blending near the masked-out areas. It would be interesting to create such a file and see how other PDF engines behave. But if this is wrong, one can simply clamp the alpha value to 0 or 255 with a threshold at 127 when writing the alpha value into the stencil mask image (I reckon this should not be done for soft masks). I do believe that the current behaviour is the better choice.
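   
   The clamping mentioned above could look roughly like this. It is only a sketch of the fallback idea, not something the patch does. The class name is hypothetical, and treating values above 127 as opaque (rather than 127 and above) is an assumption.
   
   ```java
   /** Hypothetical helper: forces interpolated stencil-mask alpha back to fully transparent or fully opaque. */
   final class StencilAlphaClamp
   {
       // Alpha values above this threshold become 255, all others become 0 (assumed reading of "threshold at 127").
       static final int THRESHOLD = 127;
   
       private StencilAlphaClamp()
       {
       }
   
       /** Clamps every 8-bit alpha sample in place. */
       static void clamp(byte[] alpha)
       {
           for (int i = 0; i < alpha.length; i++)
           {
               // Bytes are signed in Java, so mask to get the unsigned 0..255 value.
               int a = alpha[i] & 0xFF;
               alpha[i] = a > THRESHOLD ? (byte) 0xFF : (byte) 0;
           }
       }
   }
   ```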
   
   As for interpolating the image: if we knew the target resolution at which it is going to be rendered in the end, we could scale everything bigger than that down to it and then proceed. But we don't, and the PDF spec says that mask and image can have different resolutions, even different aspect ratios, but get squeezed into the same target rectangle. We need to emulate that, and that's why we scale up to the larger one.
   So if we did not interpolate the image when scaling it up to the mask, we could introduce a crappy, jagged image, which might actually be written 1:1 to the canvas because the mask may be in the resolution of the final render. Usually, rendering a non-masked image to the canvas will do the proper interpolation, whatever is set in the canvas graphics, but we are circumventing that in applyMask() with large masks.
   
   PDFBOX-2750 doesn't use interpolation, so scaleImage doesn't interpolate. And it's just a color soup background (you can't really see it in PDFDebugger because the mask is always applied, but if one disables the alpha composition in applyMask() it shows the soup). So differences are probably almost invisible. With alpha composition disabled and interpolation forced, jagged vs. not-so-jagged diagonal lines are clearly visible in the soup; they are simply not visible with the stencil mask:
   ![PDF2750-bgsoup_jagged](https://user-images.githubusercontent.com/23260584/123341607-8a562980-d54e-11eb-8700-f356241c3615.jpg)
   ![PDF2750-bgsoup_soft](https://user-images.githubusercontent.com/23260584/123341616-8e824700-d54e-11eb-8750-6428b9bd258b.jpg)
   This still shows that image interpolation is good when it is requested by the image.
   
   Even worse, we sometimes violate that overlay principle:
   If there is a 100 x 100 image and the mask is 200 x 10, we will scale the mask to 100 x 100 when we actually must scale everything to 200 x 100 so as not to deliberately reduce the resolution of image or mask.
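   
   The target size described above is simply the per-axis maximum of image and mask. A small sketch, with a hypothetical class and method name:
   
   ```java
   import java.awt.Dimension;
   
   /** Hypothetical helper: picks the common size to which image and mask are both scaled. */
   final class OverlaySize
   {
       private OverlaySize()
       {
       }
   
       /** Returns the per-axis maximum, so that neither image nor mask loses resolution. */
       static Dimension target(int imageWidth, int imageHeight, int maskWidth, int maskHeight)
       {
           return new Dimension(Math.max(imageWidth, maskWidth), Math.max(imageHeight, maskHeight));
       }
   }
   ```
   
   For the example above, `OverlaySize.target(100, 100, 200, 10)` yields 200 x 100.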
   
   Seeing that interpolation is not that slow once we switch to bilinear, I could live with it. For very large scaling operations, even worse than 10 x 10, maybe one should switch to nearest neighbor. I was looking for faster image scaling in Java and found AffineTransformOp, which is handled natively by Java; I thought that the Graphics instances already use it. It turns out that it is still faster, except for nearest neighbor (results for Alfa page 1):
   Graphics.drawImage():
   Bicubic / Quality: 5.83 s
   Bicubic / Speed: 5.83 s
   Bilinear / Speed: 1.74 s
   Nearest Neighbor: 0.11 s
   AffineTransformOp.filter() (no difference in speed between Raster and BufferedImage, even if src = RGB and dst = ARGB):
   Bicubic: 4.79 s
   Bilinear: 1.5 s
   Nearest Neighbor: 0.47 s
   
   I added some code for both the Op and the Max(mask, image).






[GitHub] [pdfbox] gunnar-ifp edited a comment on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
gunnar-ifp edited a comment on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-868438983


   I did some further testing, because I felt that the large-scale test, which requires the scaling factor to be bigger than 9 x 9, is pointless and that only the target resolution matters: no matter how small the source image is, if the target is 10000 x 10000, it is going to be slow. And I was right. Here are some tests, scaling up from 10, 50, 100, 500, 1000, 2000, 5000 and 10000 square to 10000 square:
   afto = AffineTransformOp.filter(), draw = Graphics2D.drawImage(). Times in milliseconds.
   ```
   GRAY -> GRAY
   ~~~~~~~~~~~~
         | Bilinear                 | Bicubic                  | Nearest Neighbor
   00010 | afto =  686, draw = 2679 | afto = 2526, draw = 8882 | afto = 227, draw = 111
   00050 | afto =  494, draw = 2679 | afto = 1658, draw = 8873 | afto = 180, draw = 110
   00100 | afto =  477, draw = 2678 | afto = 1537, draw = 8870 | afto = 181, draw = 111
   00500 | afto =  461, draw = 2678 | afto = 1451, draw = 8877 | afto = 179, draw = 112
   01000 | afto =  459, draw = 2680 | afto = 1434, draw = 8877 | afto = 181, draw = 110
   02000 | afto =  458, draw = 2677 | afto = 1424, draw = 8876 | afto = 182, draw = 111
   05000 | afto =  461, draw = 2678 | afto = 1430, draw = 8904 | afto = 182, draw = 115
   10000 | afto =  461, draw =   23 | afto = 1467, draw =   23 | afto = 182, draw =  25
   
   
   RGB -> ARGB
   ~~~~~~~~~~~
         | Bilinear                 | Bicubic                  | Nearest Neighbor
   00010 | afto = 2133, draw = 2290 | afto = 7576, draw = 7674 | afto = 587, draw = 147
   00050 | afto = 1994, draw = 2302 | afto = 6573, draw = 7757 | afto = 530, draw = 146
   00100 | afto = 1985, draw = 2305 | afto = 6445, draw = 7680 | afto = 530, draw = 146
   00500 | afto = 1979, draw = 2417 | afto = 6327, draw = 7673 | afto = 533, draw = 145
   01000 | afto = 1979, draw = 2338 | afto = 6330, draw = 7670 | afto = 533, draw = 147
   02000 | afto = 1987, draw = 2297 | afto = 6312, draw = 7688 | afto = 538, draw = 146
   05000 | afto = 2030, draw = 2294 | afto = 6477, draw = 7691 | afto = 577, draw = 150
   10000 | afto = 2144, draw =  136 | afto = 6502, draw =  141 | afto = 720, draw = 141
   
   
   Bilinear
   ~~~~~~~~~~~~
         | ARGB -> ARGB             | GRAY -> ARGB             | INDEXED -> ARGB
   00010 | afto = 2130, draw = 2284 | afto = 2812, draw = 2758 | afto = 2133, draw = 2625
   00050 | afto = 1984, draw = 2281 | afto = 2573, draw = 2759 | afto = 1988, draw = 2561
   00100 | afto = 1972, draw = 2282 | afto = 2558, draw = 2758 | afto = 1977, draw = 2560
   00500 | afto = 1974, draw = 2282 | afto = 2436, draw = 2756 | afto = 1977, draw = 2560
   01000 | afto = 1975, draw = 2282 | afto = 2452, draw = 2756 | afto = 1986, draw = 2563
   02000 | afto = 1976, draw = 2282 | afto = 2527, draw = 2756 | afto = 1997, draw = 2568
   05000 | afto = 1986, draw = 2288 | afto = 3096, draw = 2760 | afto = 2092, draw = 2573
   10000 | afto = 1963, draw =  152 | afto = 5072, draw =  149 | afto = 2229, draw =  112
         | BINARY -> BINARY         | INDEXED -> INDEXED
   00010 | afto = 2741, draw = 4463 | afto = 2943, draw = 3388
   00050 | afto = 2504, draw = 4238 | afto = 2634, draw = 3388
   00100 | afto = 2349, draw = 4235 | afto = 2585, draw = 3393
   00500 | afto = 2365, draw = 4236 | afto = 2574, draw = 3404
   01000 | afto = 2383, draw = 4247 | afto = 2570, draw = 3404
   02000 | afto = 2393, draw = 4274 | afto = 2795, draw = 3417
   05000 | afto = 2582, draw = 4507 | afto = 2638, draw = 3384
   10000 | afto = 3196, draw =  460 | afto = 2809, draw =   22
   ```
   
   What we can see is:
   - Bicubic is ~3 times slower than bilinear.
   - Graphics is always faster for nearest neighbor and when source size = destination size.
   - Using a Graphics to scale up anything is slow if the target is not ARGB or RGB, e.g. scaling up the mask to the final size.
   - Using indexed is also very slow.
   With last night's commits, which use AffineTransformOp if src < dest and interpolation is enabled, scaleImage should be a lot faster for mask scaling and faster overall.
   We can remove the scale factor test and simply test the destination pixel area to establish thresholds for bicubic, bilinear and maybe nearest neighbor.
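   
   A threshold test on the destination pixel area, as proposed above, might be sketched like this. The two threshold values are placeholders invented for this example, not measured or committed values, and the class and method names are hypothetical. Only the `AffineTransformOp.TYPE_*` constants are real API.
   
   ```java
   import java.awt.image.AffineTransformOp;
   
   /** Hypothetical helper: picks an interpolation type from the destination pixel area. */
   final class InterpolationChooser
   {
       // Placeholder thresholds (assumptions for this sketch); real values would need benchmarking.
       static final long BICUBIC_MAX_PIXELS = 4_000_000L;
       static final long BILINEAR_MAX_PIXELS = 200_000_000L;
   
       private InterpolationChooser()
       {
       }
   
       /** Returns one of the AffineTransformOp.TYPE_* constants for a destination of the given size. */
       static int choose(int dstWidth, int dstHeight, boolean interpolate)
       {
           if (!interpolate)
           {
               return AffineTransformOp.TYPE_NEAREST_NEIGHBOR;
           }
           // Use long so that very large destinations cannot overflow int.
           long area = (long) dstWidth * dstHeight;
           if (area <= BICUBIC_MAX_PIXELS)
           {
               return AffineTransformOp.TYPE_BICUBIC;
           }
           if (area <= BILINEAR_MAX_PIXELS)
           {
               return AffineTransformOp.TYPE_BILINEAR;
           }
           return AffineTransformOp.TYPE_NEAREST_NEIGHBOR;
       }
   }
   ```
   
   With these placeholder values a 10000 x 10000 target would use bilinear, and the size of the source image plays no role in the decision.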




[GitHub] [pdfbox] THausherr commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
THausherr commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-869141779


   Thanks, this has been committed in https://issues.apache.org/jira/browse/PDFBOX-5229 , please close your PR (asfgit should have done that, but for some reason it didn't).






[GitHub] [pdfbox] THausherr commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
THausherr commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-865485909


   Oops, you did mention that one in your mailing list post. OK, I'll need some time to look at all of this.




[GitHub] [pdfbox] asfgit closed pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
asfgit closed pull request #121:
URL: https://github.com/apache/pdfbox/pull/121


   




[GitHub] [pdfbox] THausherr commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
THausherr commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-867872283


   I tested this one with the Alfa file (page 5) and all my files. Rendering of the Alfa file is definitely much faster. I intend to remove the "interpolate" block from scaleImage because the effect is almost invisible (but I would leave a comment). I did not see any difference in the rendering tests (done at 96 dpi). I also tried at 300 dpi, where I did a visual test and saw only meaningless differences, but it is a bit slower (with interpolation).
   
   Is there a reason that you'd like to keep that block (besides that the block was there before)? It comes from PDFBOX-4218, where I had corrected previous code (PDFBOX-2750) that always interpolated. I retested the file from PDFBOX-2750 and it always looks OK. Even when trying the "worst" interpolation I can't get a bad rendering of the file from PDFBOX-2750. I was wondering if Java itself was improved, but no, when running an old version of PDFBox the rendering was still bad. So there were other changes that improved the quality of the scaling.








[GitHub] [pdfbox] gunnar-ifp commented on pull request #121: optimize applyMask in PdImageXObject

Posted by GitBox <gi...@apache.org>.
gunnar-ifp commented on pull request #121:
URL: https://github.com/apache/pdfbox/pull/121#issuecomment-864913411


   > Do you have a PDF where this large scaling happened? I found only one in my entire collection where the "largeScale switch is arbitrarily chosen" segment is hit, which is the second page of PDFBOX-2103, but this was a small image.
   
   Yes, you can download the PDF from here: https://archive.org/details/AlfaWaffenkatalog1911
   It's around 46 MB. It has been processed with ABBYY FineReader, which probably did the optimization. It's a nice and clever one, getting the 400 pages into 46 MB, but it's very hard for most PDF libraries to work with this file.
   



