Posted to users@pdfbox.apache.org by Yurii Luchkiv <yu...@gmail.com> on 2016/08/31 07:11:40 UTC

PDFBox highlight issue

Hello PDFBox team.

How are you?
I am working on highlighting text in PDFs. I found a solution that seems to
be a good one, and I added a few more things that our requirements need.
It works well when the saved PDF is opened in Preview or Safari (I am
using a Mac).
But when I open the resulting PDF in Google Chrome or Mozilla Firefox, it does not work.
Do you have any idea why this could happen?
I would really appreciate your response. Thanks in advance.

Posts I used:
https://gist.github.com/joelkuiper/331a399961941989fec8
https://gist.github.com/joelkuiper/9eb52555e02edb653dcf

BR, Yurii

Re: PDFBox highlight issue

Posted by Tilman Hausherr <TH...@t-online.de>.
On 31.08.2016 at 19:47, Tilman Hausherr wrote:
> I tried to display the saved file in PDFBox. Instead of yellow, the 
> mark is purple.

Fixed in https://issues.apache.org/jira/browse/PDFBOX-3477


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: PDFBox highlight issue

Posted by Tilman Hausherr <TH...@t-online.de>.
I looked at the result file ( http://www.filedropper.com/output3 ). 
There are several possible explanations:

- the CA / ca values of 2 are likely incorrect. As far as I understand, 
these should be between 0 and 1 (try it).
- the annotations don't have an appearance stream. Some tools (e.g. 
Adobe) create it themselves, some don't. PDF.js and PDFBox both have 
trouble with this. What you can do is open the file with Adobe and save it. 
It is then displayed in PDF.js.
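
The first point can be guarded against in the code that produces the highlight. The following is only a minimal sketch in plain Java: `OpacityCheck` and `clampOpacity` are hypothetical names, not part of PDFBox, and it assumes the opacity is available as a float before it is written to the CA / ca entries.

```java
final class OpacityCheck {
    /**
     * Clamps an opacity value to the range 0..1 mentioned above.
     * Hypothetical helper, not a PDFBox API.
     */
    static float clampOpacity(float value) {
        return Math.max(0f, Math.min(1f, value));
    }

    public static void main(String[] args) {
        // A value of 2, as found in the result file, is reduced to 1.
        System.out.println(clampOpacity(2f));
        System.out.println(clampOpacity(0.4f));
    }
}
```

The clamped value could then be passed to whatever call sets the opacity in the highlighting code, for example the annotation's constant opacity.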

Weird thing:
I tried to display the saved file in PDFBox. Instead of yellow, the mark 
is purple. After editing the file so that the CA / ca values are 1, the 
color was OK. 
(Btw the result is really nice :-) Why isn't this in our code?! )

Tilman




Re: PDFBox highlight issue

Posted by Gilad Denneboom <gi...@gmail.com>.
Chrome and Firefox use internal plugins to display PDF files, and these
plugins are known to have various issues, especially related to "meta
objects" such as annotations and form fields. If you find a bug with them
you should report it to the developers of these plugins.


Re: PDFBox highlight issue

Posted by Tilman Hausherr <TH...@t-online.de>.
Here's some code to create an appearance stream for a highlight 
annotation. Note that
- the code is Apache licensed
- it works with 2.0.*
- it will probably no longer be needed in 2.1
- it has only been tested with your file and one other, i.e. it may not 
handle all highlighting cases
- be careful when removing parts that seem to be "not needed", e.g. the 
usage of java.util.Locale.US
- "curvy" highlighting doesn't work on your file, because the curvy part is 
outside of the rectangle
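
To illustrate the note about java.util.Locale.US: content stream operands need a period as the decimal separator, and String.format follows the locale it is given. The sketch below is plain Java with hypothetical names (`LocaleOperatorDemo`, `grayOperator`); it only shows how the same format string behaves under two locales.

```java
import java.util.Locale;

final class LocaleOperatorDemo {
    /** Formats a gray fill operator ("g") using the given locale. Hypothetical helper. */
    static String grayOperator(Locale locale, float gray) {
        return String.format(locale, "%.6f g", gray);
    }

    public static void main(String[] args) {
        // Locale.US always uses a period as decimal separator.
        System.out.println(grayOperator(Locale.US, 0.5f));      // 0.500000 g
        // A German locale uses a comma, which is not a valid PDF number.
        System.out.println(grayOperator(Locale.GERMANY, 0.5f)); // 0,500000 g
    }
}
```

Without an explicit locale, String.format uses the default locale of the machine, so the output would depend on where the code runs.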

If you need help with different files, you'd need to upload the file so 
that I can find the problem.

Good luck

Tilman


        if (annotation.getAppearance() == null)
        {
            if (annotation instanceof PDAnnotationTextMarkup &&
                    PDAnnotationTextMarkup.SUB_TYPE_HIGHLIGHT.equals(annotation.getSubtype()))
            {
                PDAnnotationTextMarkup markupAnnotation = (PDAnnotationTextMarkup) annotation;
                PDAppearanceDictionary appearanceDictionary = new PDAppearanceDictionary();
                PDAppearanceStream appearanceStream = new PDAppearanceStream(renderer.document);
                PDPageContentStream cs = new PDPageContentStream(renderer.document, appearanceStream);
                PDRectangle bbox = new PDRectangle(annotation.getRectangle().getWidth(),
                        annotation.getRectangle().getHeight());
                appearanceStream.setBBox(bbox);
                PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
                PDExtendedGraphicsState r1 = new PDExtendedGraphicsState();
                r0.setAlphaSourceFlag(false);
                r0.setStrokingAlphaConstant(markupAnnotation.getConstantOpacity());
                r0.setNonStrokingAlphaConstant(markupAnnotation.getConstantOpacity());
                r1.setAlphaSourceFlag(false);
                //TODO PDExtendedGraphicsState.setBlendMode() is missing
                r1.getCOSObject().setItem(COSName.BM, COSName.MULTIPLY);
                if (cs.getResources() == null)
                {
                    if (appearanceStream.getResources() == null)
                    {
                        appearanceStream.setResources(new PDResources());
                    }
                    cs.setResources(appearanceStream.getResources()); // why not done by default?
                }
                cs.setGraphicsStateParameters(r0);
                cs.setGraphicsStateParameters(r1);
                COSStream mwfoformStrm = new COSStream();
                OutputStream os = mwfoformStrm.createOutputStream();
                os.write("/Form Do".getBytes(Charsets.ISO_8859_1));
                os.close();
                PDFormXObject mwfofrm = new PDFormXObject(mwfoformStrm);
                cs.drawForm(mwfofrm);
                cs.close();
                COSStream frmStrm2 = new COSStream();
                PDFormXObject frm2 = new PDFormXObject(frmStrm2);
                PDResources res = new PDResources();
                mwfofrm.setBBox(bbox);
                mwfofrm.setResources(res);
                COSDictionary groupDict = new COSDictionary();
                groupDict.setItem(COSName.S, COSName.TRANSPARENCY);
                //TODO PDFormXObject.setGroup() is missing
                mwfofrm.getCOSObject().setItem(COSName.GROUP, groupDict);
                res.put(COSName.getPDFName("Form"), frm2);
                frm2.setBBox(annotation.getRectangle());
                frm2.setMatrix(Matrix.getTranslateInstance(
                        -annotation.getRectangle().getLowerLeftX(),
                        -annotation.getRectangle().getLowerLeftY()).createAffineTransform());
                os = frm2.getCOSObject().createOutputStream();
                //TODO why can't we get a "classic" content stream?
                PDColor color = annotation.getColor();
                switch (color.getComponents().length)
                {
                    case 1:
                        os.write(String.format(java.util.Locale.US, "%.6f g\n",
                                color.getComponents()[0]).getBytes(Charsets.ISO_8859_1));
                        break;
                    case 3:
                        os.write(String.format(java.util.Locale.US, "%.6f %.6f %.6f rg\n",
                                color.getComponents()[0],
                                color.getComponents()[1],
                                color.getComponents()[2]).getBytes(Charsets.ISO_8859_1));
                        break;
                    case 4:
                        os.write(String.format(java.util.Locale.US, "%.6f %.6f %.6f %.6f k\n",
                                color.getComponents()[0],
                                color.getComponents()[1],
                                color.getComponents()[2],
                                color.getComponents()[3]).getBytes(Charsets.ISO_8859_1));
                        break;
                    default:
                        break;
                }
                float[] qp = markupAnnotation.getQuadPoints();
                int of = 0;
                while (of + 7 < qp.length)
                {
                    // quadpoints spec sequence is incorrect, correct one is (4,5 0,1 2,3 6,7)

                    // for "curvy" highlighting, Bézier points are used that seem to have
                    // a distance of about 1/4 of the height.
                    // note that curves may not appear if outside of the rectangle
                    float delta = (qp[of+3] - qp[of+5]) / 4;

                    os.write(String.format(java.util.Locale.US, "%.4f %.4f m\n",
                            qp[of+4], qp[of+5]).getBytes(Charsets.ISO_8859_1));
                    os.write(String.format(java.util.Locale.US, "%.4f %.4f %.4f %.4f %.4f %.4f c\n",
                            qp[of+0] - delta, qp[of+5] + delta,
                            qp[of+0] - delta, qp[of+1] - delta,
                            qp[of+0], qp[of+1]).getBytes(Charsets.ISO_8859_1));
                    os.write(String.format(java.util.Locale.US, "%.4f %.4f l\n",
                            qp[of+2], qp[of+3]).getBytes(Charsets.ISO_8859_1));
                    os.write(String.format(java.util.Locale.US, "%.4f %.4f %.4f %.4f %.4f %.4f c\n",
                            qp[of+6]+delta, qp[of+3]-delta,
                            qp[of+6]+delta, qp[of+7]+delta,
                            qp[of+6], qp[of+7]).getBytes(Charsets.ISO_8859_1));
                    os.write("f\n".getBytes(Charsets.ISO_8859_1));
                    of += 8;

                    //TODO Adobe puts a "w" (line width). Why?

                    //TODO If quadpoints is not present or the conforming reader does not
                    // recognize it, the region specified by the Rect entry should be used.
                    // QuadPoints shall be ignored if any coordinate in the array lies
                    // outside the region specified by Rect
                }
                os.close();

                appearanceDictionary.setNormalAppearance(appearanceStream);
                annotation.setAppearance(appearanceDictionary);
            }
        }
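
For readers who want to try the path construction without a PDF at hand, the quadpoint handling above can be isolated in plain Java. This is only a sketch under assumptions: `HighlightQuads`, `quadToPath` and `insideRect` are hypothetical names, not PDFBox API. `quadToPath` repeats the operator sequence of the loop above for one quad, and `insideRect` is one possible reading of the rule in the last TODO (QuadPoints are ignored if any coordinate lies outside Rect).

```java
import java.util.Locale;

final class HighlightQuads {
    /** Builds the path operators for the 8 quadpoint values starting at index of. */
    static String quadToPath(float[] qp, int of) {
        // Bezier control distance, about 1/4 of the quad height, as in the code above.
        float delta = (qp[of + 3] - qp[of + 5]) / 4;
        StringBuilder sb = new StringBuilder();
        sb.append(String.format(Locale.US, "%.4f %.4f m\n", qp[of + 4], qp[of + 5]));
        sb.append(String.format(Locale.US, "%.4f %.4f %.4f %.4f %.4f %.4f c\n",
                qp[of] - delta, qp[of + 5] + delta,
                qp[of] - delta, qp[of + 1] - delta,
                qp[of], qp[of + 1]));
        sb.append(String.format(Locale.US, "%.4f %.4f l\n", qp[of + 2], qp[of + 3]));
        sb.append(String.format(Locale.US, "%.4f %.4f %.4f %.4f %.4f %.4f c\n",
                qp[of + 6] + delta, qp[of + 3] - delta,
                qp[of + 6] + delta, qp[of + 7] + delta,
                qp[of + 6], qp[of + 7]));
        sb.append("f\n");
        return sb.toString();
    }

    /** Returns true if every (x, y) pair in qp lies within the given rectangle. */
    static boolean insideRect(float[] qp, float llx, float lly, float urx, float ury) {
        for (int i = 0; i + 1 < qp.length; i += 2) {
            if (qp[i] < llx || qp[i] > urx || qp[i + 1] < lly || qp[i + 1] > ury) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Order used above: upper left, upper right, lower left, lower right.
        float[] qp = {100, 720, 200, 720, 100, 700, 200, 700};
        System.out.print(quadToPath(qp, 0));
        System.out.println(insideRect(qp, 100, 700, 200, 720));
    }
}
```

Note that `insideRect` only looks at the quad corners. The Bézier control points reach `delta` beyond them, which matches the remark that the curvy part may fall outside the rectangle.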

