Posted to users@pdfbox.apache.org by Yurii Luchkiv <yu...@gmail.com> on 2016/08/31 07:11:40 UTC
PDFBox highlight issue
Hello PDFBox team.
How are you?
I am working on highlighting text in PDFs. I found a solution that seems
to be a good one, and I added a few more things that our requirements need.
It works well when the saved PDF is opened in Preview or Safari (I am
using a Mac).
But when I open the resulting PDF in Google Chrome or Mozilla, it does not work.
Do you have any idea why this could happen?
I would really appreciate your response. Thanks in advance.
Posts I used:
https://gist.github.com/joelkuiper/331a399961941989fec8
https://gist.github.com/joelkuiper/9eb52555e02edb653dcf
BR, Yurii
Re: PDFBox highlight issue
Posted by Tilman Hausherr <TH...@t-online.de>.
On 31.08.2016 at 19:47, Tilman Hausherr wrote:
> I tried to display the saved file in PDFBox. Instead of yellow, the
> mark is purple.
fixed in https://issues.apache.org/jira/browse/PDFBOX-3477
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: PDFBox highlight issue
Posted by Tilman Hausherr <TH...@t-online.de>.
I looked at the result file ( http://www.filedropper.com/output3 ).
There are several possible explanations:
- the CA / ca values of 2 are likely incorrect. From my understanding
these should be between 0 and 1 (try changing them).
- the annotations don't have an appearance stream. Some tools (e.g.
Adobe) create it themselves, some don't. PDF.js and PDFBox both have
trouble with this. What you can do is open the file with Adobe and save it.
It is then displayed in PDF.js.
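
The first point can be guarded against before the opacity is written to the annotation. The following is only a minimal sketch in plain Java: the class and method names are hypothetical, and treating a NaN value as fully opaque is an assumption, not something stated in this thread. In PDFBox the clamped value would presumably be passed to the setter that matches getConstantOpacity().

```java
final class OpacityCheck {

    /**
     * Clamps a CA / ca value into the range 0..1.
     * Hypothetical helper, not part of PDFBox.
     */
    static float clampOpacity(float value) {
        if (Float.isNaN(value)) {
            // Assumption: fall back to fully opaque when the value is unusable.
            return 1f;
        }
        return Math.max(0f, Math.min(1f, value));
    }

    public static void main(String[] args) {
        // A value of 2, as found in the uploaded file, is reduced to 1.
        System.out.println(clampOpacity(2f));
        System.out.println(clampOpacity(0.4f));
    }
}
```

This only addresses the out-of-range values; the missing appearance stream is a separate problem.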
Weird thing:
I tried to display the saved file in PDFBox. Instead of yellow, the mark
is purple. After editing the file so that the CA / ca values are 1, the
color was OK.
(Btw the result is really nice :-) Why isn't this in our code?! )
Tilman
Re: PDFBox highlight issue
Posted by Gilad Denneboom <gi...@gmail.com>.
Chrome and Firefox use internal plugins to display PDF files, and these
plugins are known to have various issues, especially related to "meta
objects" such as annotations and form fields. If you find a bug with them
you should report it to the developers of these plugins.
Re: PDFBox highlight issue
Posted by Tilman Hausherr <TH...@t-online.de>.
Here's some code to create an appearance stream for a highlight
annotation. Note that
- the code is Apache licensed
- it works with 2.0.*
- it will probably no longer be needed in 2.1
- it has only been tested with your file and one other, i.e. it may not
handle all highlighting cases
- be careful when removing parts that seem to be "not needed", e.g. the
usage of java.util.Locale.US
- "curvy" highlighting doesn't work on your file, because the curvy part is
outside of the rectangle
If you need help with different files, you'd need to upload the file so
that I can find the problem.
Good luck
Tilman
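
The warning about java.util.Locale.US can be illustrated without PDFBox. PDF content streams need a period as the decimal separator, and String.format follows the locale it is given. The sketch below mirrors the 1, 3 and 4 component cases of the code in this message; the class and method names are hypothetical, and passing the locale as a parameter is done here only to make the difference visible.

```java
import java.util.Locale;

final class ColorOperator {

    /**
     * Builds a non-stroking color operator (g, rg or k) from the
     * color components. Hypothetical helper, not part of PDFBox.
     */
    static String nonStrokingColor(Locale locale, float[] c) {
        switch (c.length) {
            case 1:
                return String.format(locale, "%.6f g", c[0]);
            case 3:
                return String.format(locale, "%.6f %.6f %.6f rg", c[0], c[1], c[2]);
            case 4:
                return String.format(locale, "%.6f %.6f %.6f %.6f k", c[0], c[1], c[2], c[3]);
            default:
                return ""; // unsupported component count: write nothing
        }
    }

    public static void main(String[] args) {
        float[] yellow = {1f, 1f, 0f};
        // Locale.US yields periods, which is what a content stream needs.
        System.out.println(nonStrokingColor(Locale.US, yellow));
        // A locale such as Locale.GERMANY yields commas and breaks the operator.
        System.out.println(nonStrokingColor(Locale.GERMANY, yellow));
    }
}
```

Without an explicit locale, String.format uses the JVM default, so the same code could produce a valid file on one machine and a broken one on another.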
if (annotation.getAppearance() == null)
{
    if (annotation instanceof PDAnnotationTextMarkup &&
            PDAnnotationTextMarkup.SUB_TYPE_HIGHLIGHT.equals(annotation.getSubtype()))
    {
        PDAnnotationTextMarkup markupAnnotation = (PDAnnotationTextMarkup) annotation;
        PDAppearanceDictionary appearanceDictionary = new PDAppearanceDictionary();
        PDAppearanceStream appearanceStream = new PDAppearanceStream(renderer.document);
        PDPageContentStream cs = new PDPageContentStream(renderer.document, appearanceStream);
        PDRectangle bbox = new PDRectangle(annotation.getRectangle().getWidth(),
                annotation.getRectangle().getHeight());
        appearanceStream.setBBox(bbox);

        PDExtendedGraphicsState r0 = new PDExtendedGraphicsState();
        PDExtendedGraphicsState r1 = new PDExtendedGraphicsState();
        r0.setAlphaSourceFlag(false);
        r0.setStrokingAlphaConstant(markupAnnotation.getConstantOpacity());
        r0.setNonStrokingAlphaConstant(markupAnnotation.getConstantOpacity());
        r1.setAlphaSourceFlag(false);
        r1.getCOSObject().setItem(COSName.BM, COSName.MULTIPLY); //TODO PDExtendedGraphicsState.setBlendMode() is missing
        if (cs.getResources() == null)
        {
            if (appearanceStream.getResources() == null)
            {
                appearanceStream.setResources(new PDResources());
            }
            cs.setResources(appearanceStream.getResources()); // why not done by default?
        }
        cs.setGraphicsStateParameters(r0);
        cs.setGraphicsStateParameters(r1);

        COSStream mwfoformStrm = new COSStream();
        OutputStream os = mwfoformStrm.createOutputStream();
        os.write("/Form Do".getBytes(Charsets.ISO_8859_1));
        os.close();
        PDFormXObject mwfofrm = new PDFormXObject(mwfoformStrm);
        cs.drawForm(mwfofrm);
        cs.close();

        COSStream frmStrm2 = new COSStream();
        PDFormXObject frm2 = new PDFormXObject(frmStrm2);
        PDResources res = new PDResources();
        mwfofrm.setBBox(bbox);
        mwfofrm.setResources(res);
        COSDictionary groupDict = new COSDictionary();
        groupDict.setItem(COSName.S, COSName.TRANSPARENCY);
        mwfofrm.getCOSObject().setItem(COSName.GROUP, groupDict); //TODO PDFormXObject.setGroup() is missing
        res.put(COSName.getPDFName("Form"), frm2);
        frm2.setBBox(annotation.getRectangle());
        frm2.setMatrix(Matrix.getTranslateInstance(-annotation.getRectangle().getLowerLeftX(),
                -annotation.getRectangle().getLowerLeftY()).createAffineTransform());

        os = frm2.getCOSObject().createOutputStream();
        //TODO why can't we get a "classic" content stream?
        PDColor color = annotation.getColor();
        switch (color.getComponents().length)
        {
            case 1:
                os.write(String.format(java.util.Locale.US, "%.6f g\n",
                        color.getComponents()[0]).getBytes(Charsets.ISO_8859_1));
                break;
            case 3:
                os.write(String.format(java.util.Locale.US, "%.6f %.6f %.6f rg\n",
                        color.getComponents()[0],
                        color.getComponents()[1],
                        color.getComponents()[2]).getBytes(Charsets.ISO_8859_1));
                break;
            case 4:
                os.write(String.format(java.util.Locale.US, "%.6f %.6f %.6f %.6f k\n",
                        color.getComponents()[0],
                        color.getComponents()[1],
                        color.getComponents()[2],
                        color.getComponents()[3]).getBytes(Charsets.ISO_8859_1));
                break;
            default:
                break;
        }

        float[] qp = markupAnnotation.getQuadPoints();
        int of = 0;
        while (of + 7 < qp.length)
        {
            // quadpoints spec sequence is incorrect, correct one is (4,5 0,1 2,3 6,7)
            // for "curvy" highlighting, Bézier points are used that seem to have
            // a distance of about 1/4 of the height.
            // note that curves may not appear if outside of the rectangle
            float delta = (qp[of+3] - qp[of+5]) / 4;
            os.write(String.format(java.util.Locale.US, "%.4f %.4f m\n",
                    qp[of+4], qp[of+5]).getBytes(Charsets.ISO_8859_1));
            os.write(String.format(java.util.Locale.US, "%.4f %.4f %.4f %.4f %.4f %.4f c\n",
                    qp[of+0] - delta, qp[of+5] + delta,
                    qp[of+0] - delta, qp[of+1] - delta,
                    qp[of+0], qp[of+1]).getBytes(Charsets.ISO_8859_1));
            os.write(String.format(java.util.Locale.US, "%.4f %.4f l\n",
                    qp[of+2], qp[of+3]).getBytes(Charsets.ISO_8859_1));
            os.write(String.format(java.util.Locale.US, "%.4f %.4f %.4f %.4f %.4f %.4f c\n",
                    qp[of+6] + delta, qp[of+3] - delta,
                    qp[of+6] + delta, qp[of+7] + delta,
                    qp[of+6], qp[of+7]).getBytes(Charsets.ISO_8859_1));
            os.write("f\n".getBytes(Charsets.ISO_8859_1));
            of += 8;
            //TODO Adobe puts a "w" (line width). Why?
            //TODO If quadpoints is not present or the conforming reader does not
            // recognize it, the region specified by the Rect entry should be used.
            // QuadPoints shall be ignored if any coordinate in the array lies
            // outside the region specified by Rect
        }
        os.close();
        appearanceDictionary.setNormalAppearance(appearanceStream);
        annotation.setAppearance(appearanceDictionary);
    }
}
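
The corner ordering (4,5 0,1 2,3 6,7) used in the loop above may be easier to follow in isolation. The sketch below is a simplified variant, not the code from this message: it replaces the two Bézier segments with straight lines, so it produces plain rectangles instead of "curvy" ends, and it returns the operators as a string instead of writing them to a stream. The class and method names are hypothetical.

```java
import java.util.Locale;

final class QuadPointPath {

    /**
     * Turns a QuadPoints array (8 numbers per quadrilateral) into
     * straight-edged filled paths, visiting the corners in the order
     * (4,5) (0,1) (2,3) (6,7). Hypothetical helper, not part of PDFBox.
     */
    static String toPath(float[] qp) {
        StringBuilder sb = new StringBuilder();
        for (int of = 0; of + 7 < qp.length; of += 8) {
            sb.append(String.format(Locale.US, "%.4f %.4f m\n", qp[of + 4], qp[of + 5]));
            sb.append(String.format(Locale.US, "%.4f %.4f l\n", qp[of], qp[of + 1]));
            sb.append(String.format(Locale.US, "%.4f %.4f l\n", qp[of + 2], qp[of + 3]));
            sb.append(String.format(Locale.US, "%.4f %.4f l\n", qp[of + 6], qp[of + 7]));
            sb.append("f\n"); // fill the quadrilateral
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // One quadrilateral: (10,20) (50,20) (10,10) (50,10)
        float[] qp = {10f, 20f, 50f, 20f, 10f, 10f, 50f, 10f};
        System.out.print(toPath(qp));
    }
}
```

Trailing values that do not form a complete group of eight are skipped, which matches the loop condition in the code above.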