Posted to dev@pdfbox.apache.org by "Tilman Hausherr (JIRA)" <ji...@apache.org> on 2015/02/21 14:21:13 UTC

[jira] [Commented] (PDFBOX-2530) Improve PDFDebugger

    [ https://issues.apache.org/jira/browse/PDFBOX-2530?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14330229#comment-14330229 ] 

Tilman Hausherr commented on PDFBOX-2530:
-----------------------------------------

Quick intro on PDIndexed, PDSeparation and PDDeviceN colorspaces:

- Indexed: the color value (0...255) is an index into a table of color values. This lookup table can be displayed as a colored bar; see this line in the code:
{code}BufferedImage rgbImage = baseColorSpace.toRGBImage(baseRaster);{code}
That image is what I want to see in the debugger, but taller than just one pixel. (Try saving that image to a file, then display it to see what I mean.)
- Separation: the color value (0...1) tells how much to use of a specific colorant (the spec mentions metallic and fluorescent colors and special textures), i.e. the result varies between the lightest and the darkest shade of that colorant, as far as it can (approximately) be represented by a color combination in the RGB space. So all I need is a bar showing these colors. The actual look on the RGB screen is calculated with the tintTransform function.
- DeviceN: this is like CMYK but with N arbitrary colors instead of C, M, Y and K. The actual look is calculated with the tintTransform function. All that needs to be displayed is these colors, each at its maximum and minimum value individually; see the first file in PDFBOX-1870 and trace through PDDeviceN.
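
A minimal sketch of the "taller than one pixel" idea for the Indexed case, assuming the lookup table is already available as a one-pixel-high BufferedImage (as the toRGBImage line above suggests). IndexedColorBar, stretch, cellWidth and barHeight are hypothetical names for illustration, not existing PDFBox or PDFDebugger code:

```java
import java.awt.image.BufferedImage;

// Hypothetical helper, not part of PDFBox.
class IndexedColorBar {
    /**
     * Turns a one-pixel-high table image into a bar in which every table
     * entry is a flat block of cellWidth x barHeight pixels.
     */
    static BufferedImage stretch(BufferedImage tableImage, int cellWidth, int barHeight) {
        int width = tableImage.getWidth() * cellWidth;
        BufferedImage bar = new BufferedImage(width, barHeight, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            // Integer division maps each output column back to its table entry.
            int rgb = tableImage.getRGB(x / cellWidth, 0);
            for (int y = 0; y < barHeight; y++) {
                bar.setRGB(x, y, rgb);
            }
        }
        return bar;
    }
}
```

Copying pixels by hand instead of using a smoothing scale keeps the boundaries between table entries sharp, which matters when the index under the mouse has to be read off the bar. For a quick check the result can be written out with ImageIO.write(bar, "png", file).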

The PD* colorspace classes are not to be changed. You need to create your output with the methods that already exist, e.g. by calling toRGB().
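
A rough sketch of how such a bar could be built for Separation and DeviceN without touching the PD* classes. The color conversion is passed in as a plain function that stands in for the existing toRGB() call; it is assumed here to take component values in 0...1 and to return RGB floats in 0...1, and since the real method can throw an IOException, a caller would have to wrap it. TintRamp, render and the parameter names are hypothetical, and holding the other DeviceN colorants at 0 is only one reading of "individually":

```java
import java.awt.image.BufferedImage;
import java.util.function.Function;

// Hypothetical helper, not part of PDFBox.
class TintRamp {
    /**
     * Draws a bar that runs one colorant from 0 (left) to 1 (right) while
     * all other colorants stay at 0. Use componentCount = 1 for Separation.
     */
    static BufferedImage render(Function<float[], float[]> toRGB, int componentCount,
                                int component, int width, int height) {
        BufferedImage bar = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        for (int x = 0; x < width; x++) {
            float[] value = new float[componentCount];
            value[component] = width == 1 ? 1f : (float) x / (width - 1);
            float[] rgb = toRGB.apply(value);
            int packed = (to8Bit(rgb[0]) << 16) | (to8Bit(rgb[1]) << 8) | to8Bit(rgb[2]);
            for (int y = 0; y < height; y++) {
                bar.setRGB(x, y, packed);
            }
        }
        return bar;
    }

    // Clamps to 0...1 before scaling so out-of-range results cannot overflow a channel.
    private static int to8Bit(float v) {
        return Math.round(Math.max(0f, Math.min(1f, v)) * 255f);
    }
}
```

For DeviceN, one such bar per colorant (component = 0 ... N-1) would show each color from its minimum to its maximum.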

Btw, to get a better understanding of color spaces, read about the difference between RGB and CMYK on Wikipedia and understand what "additive" and "subtractive" mean.
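
To make "subtractive" concrete, here is the textbook naive CMYK to RGB formula as a small sketch. It is only an illustration of the principle; it is not how PDFBox converts CMYK, and color-managed conversions give different numbers. NaiveCmyk and toRgb are made-up names:

```java
// Illustration only, not PDFBox code.
class NaiveCmyk {
    /**
     * Each ink removes light from white: cyan absorbs red, magenta absorbs
     * green, yellow absorbs blue, and black darkens all three channels.
     * Inputs are 0...1, the result holds r, g, b in 0...255.
     */
    static int[] toRgb(float c, float m, float y, float k) {
        int r = Math.round(255f * (1f - c) * (1f - k));
        int g = Math.round(255f * (1f - m) * (1f - k));
        int b = Math.round(255f * (1f - y) * (1f - k));
        return new int[] { r, g, b };
    }
}
```

With no ink at all the result is white (paper), whereas in additive RGB all-zero components give black (a dark screen).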

> Improve PDFDebugger
> -------------------
>
>                 Key: PDFBOX-2530
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2530
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: Utilities
>    Affects Versions: 1.8.8, 2.0.0
>            Reporter: Tilman Hausherr
>              Labels: gsoc2015
>
> (This is an idea for [GSOC2015|https://www.google-melange.com/]. But if you want to submit some improvements to the code outside of GSOC2015 now, this is fine. We can always come up with other project ideas for GSOC2015)
> Our command line utility PDFDebugger (part of the command line pdfbox-app; get it [here|https://pdfbox.apache.org/downloads.html], read the description [here|https://pdfbox.apache.org/commandline/], see the source code [here|https://svn.apache.org/viewvc/pdfbox/trunk/tools/src/main/java/org/apache/pdfbox/tools/PDFDebugger.java?view=markup&sortby=date]) needs some improvements:
>    - hex view
>    - view of non printable characters
>    - saving streams
>    - binary copy & paste
>    - ability to search in streams (very useful for content streams and meta data)
>    - show images that are streams
>    - show PDIndexed color lookup table, show the index value, the base and RGB color value sets when the mouse moves
>    - show PDSeparation color
>    - show PDDeviceN colors
>    - show font encodings and characters
>    - edit attributes
>    - edit streams, while keeping or changing the compression filter
>    - save altered PDF 
>    - color marking of certain PDF operators, especially q...Q and text operators (BT...ET). Ideally, it should help the user understand the "bracketing" of these operators, i.e. understand where a sequence starts and where it ends. (See "operator summary" in the PDF Spec.) Other "important" operators I can think of are the matrix, font and color operators. A cool advanced thing would be to show the current color or the font in a popup when hovering above such an operator.
> To see a product with a similar purpose that is better than PDFDebugger, watch [this video|https://www.youtube.com/watch?v=g-QcU9B4qMc].
> I'm not asking to implement a clone of that product (I don't use it, all I know is that video), but we at PDFBox really need something that makes PDF debugging easier. As an example of how the current PDFDebugger prevented me from finding a bug quickly, see PDFBOX-2401 and search for "PDFDebugger".
> Prerequisites:
> - java programming, especially the GUI components
> - the ability to understand existing source code
> Using external software components is possible (they must have the Apache License or a compatible one), but this should be decided on a case-by-case basis; we don't want to get too big.
> Development strategy: go from the easy to the difficult. The wished features are already sorted this way (mostly).
> Get introduced: [download the source code with svn|https://pdfbox.apache.org/downloads.html#scm] and build it with maven. Run PDFDebugger and view some PDFs to see the components of a PDF. Start with the file of PDFBOX-2401. Read up on the structure of PDF on the web or in the [PDF Specification|https://www.adobe.com/devnet/pdf/pdf_reference.html].
> Mentor: Tilman Hausherr (European timezone, languages: German, English, French). To see the GSoC2014 project I mentored, go to PDFBOX-1915.


