Posted to commits@pdfbox.apache.org by ms...@apache.org on 2015/10/23 14:32:13 UTC

svn commit: r1710198 - in /pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook: pdfavalidation.mdtext textextraction.mdtext workingwithattachments.mdtext workingwithfonts.mdtext workingwithmetadata.mdtext

Author: msahyoun
Date: Fri Oct 23 12:32:13 2015
New Revision: 1710198

URL: http://svn.apache.org/viewvc?rev=1710198&view=rev
Log:
PDFBOX-3040: fix source code highlight

Modified:
    pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/pdfavalidation.mdtext
    pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/textextraction.mdtext
    pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithattachments.mdtext
    pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithfonts.mdtext
    pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithmetadata.mdtext

Modified: pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/pdfavalidation.mdtext
URL: http://svn.apache.org/viewvc/pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/pdfavalidation.mdtext?rev=1710198&r1=1710197&r2=1710198&view=diff
==============================================================================
--- pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/pdfavalidation.mdtext (original)
+++ pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/pdfavalidation.mdtext Fri Oct 23 12:32:13 2015
@@ -10,55 +10,56 @@ Check Compliance with PDF/A-1b
 
 This small sample shows how to check the compliance of a file with the PDF/A-1b specification.
 
-	:::java
-    ValidationResult result = null;
+~~~java
+ValidationResult result = null;
 
-    FileDataSource fd = new FileDataSource(args[0]);
-    PreflightParser parser = new PreflightParser(fd);
-    try
+FileDataSource fd = new FileDataSource(args[0]);
+PreflightParser parser = new PreflightParser(fd);
+try
+{
+
+    /* Parse the PDF file with PreflightParser that inherits from the NonSequentialParser.
+     * Some additional controls are present to check a set of PDF/A requirements. 
+     * (Stream length consistency, EOL after some Keyword...)
+     */
+    parser.parse();
+
+    /* Once the syntax validation is done, 
+     * the parser can provide a PreflightDocument 
+     * (that inherits from PDDocument) 
+     * This document process the end of PDF/A validation.
+     */
+    PreflightDocument document = parser.getPreflightDocument();
+    document.validate();
+
+    // Get validation result
+    result = document.getResult();
+    document.close();
+
+}
+catch (SyntaxValidationException e)
+{
+    /* the parse method can throw a SyntaxValidationException 
+     * if the PDF file can't be parsed.
+     * In this case, the exception contains an instance of ValidationResult  
+     */
+    result = e.getResult();
+}
+
+// display validation result
+if (result.isValid())
+{
+    System.out.println("The file " + args[0] + " is a valid PDF/A-1b file");
+}
+else
+{
+    System.out.println("The file" + args[0] + " is not valid, error(s) :");
+    for (ValidationError error : result.getErrorsList())
     {
-
-        /* Parse the PDF file with PreflightParser that inherits from the NonSequentialParser.
-         * Some additional controls are present to check a set of PDF/A requirements. 
-         * (Stream length consistency, EOL after some Keyword...)
-         */
-        parser.parse();
-
-        /* Once the syntax validation is done, 
-         * the parser can provide a PreflightDocument 
-         * (that inherits from PDDocument) 
-         * This document process the end of PDF/A validation.
-         */
-        PreflightDocument document = parser.getPreflightDocument();
-        document.validate();
-
-        // Get validation result
-        result = document.getResult();
-        document.close();
-
+        System.out.println(error.getErrorCode() + " : " + error.getDetails());
     }
-    catch (SyntaxValidationException e)
-    {
-        /* the parse method can throw a SyntaxValidationException 
-         * if the PDF file can't be parsed.
-         * In this case, the exception contains an instance of ValidationResult  
-         */
-        result = e.getResult();
-	}
-
-	// display validation result
-    if (result.isValid())
-    {
-        System.out.println("The file " + args[0] + " is a valid PDF/A-1b file");
-	}
-    else
-    {
-        System.out.println("The file" + args[0] + " is not valid, error(s) :");
-        for (ValidationError error : result.getErrorsList())
-        {
-            System.out.println(error.getErrorCode() + " : " + error.getDetails());
-        }
-	}
+}
+~~~
       	
 ## Categories of Validation Error
 
@@ -71,7 +72,7 @@ In order to help in the failure understa
 
 Category ('Y') and cause ('Z') may be missing according to the difficulty to identify the error detail.
 
-Here after, you can find all Categories (for detailed cause, see constants in the PreglihtConstant interface) :
+Here after, you can find all Categories (for detailed cause, see constants in the ``PreflightConstants`` interface) :
 
 | Category | Description |
 | -------- | ----------- | 

Modified: pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/textextraction.mdtext
URL: http://svn.apache.org/viewvc/pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/textextraction.mdtext?rev=1710198&r1=1710197&r2=1710198&view=diff
==============================================================================
--- pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/textextraction.mdtext (original)
+++ pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/textextraction.mdtext Fri Oct 23 12:32:13 2015
@@ -22,8 +22,10 @@ Lucene is an open source text search lib
 Lucene to be able to index a PDF document it must first be converted to text. PDFBox provides 
 a simple approach for adding PDF documents into a Lucene index.
 
-	Document luceneDocument = LucenePDFDocument.getDocument( ... );
-          
+~~~java
+Document luceneDocument = LucenePDFDocument.getDocument( ... );
+~~~
+
 Now that you hava a Lucene Document object, you can add it to the Lucene index just like 
 you would if it had been created from a text or HTML file. The LucenePDFDocument automatically 
 extracts a variety of metadata fields from the PDF to be added to the index, the javadoc 
@@ -45,10 +47,12 @@ process. The simplest is to specify the
 For example, to only extract text from the second and third pages of the PDF document 
 you could do this:
 
-    PDFTextStripper stripper = new PDFTextStripper();
-    stripper.setStartPage( 2 );
-    stripper.setEndPage( 3 );
-    stripper.writeText( ... );
+~~~java
+PDFTextStripper stripper = new PDFTextStripper();
+stripper.setStartPage( 2 );
+stripper.setEndPage( 3 );
+stripper.writeText( ... );
+~~~~
         
 NOTE: The startPage and endPage properties of PDFTextStripper are 1 based and inclusive.
 

Modified: pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithattachments.mdtext
URL: http://svn.apache.org/viewvc/pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithattachments.mdtext?rev=1710198&r1=1710197&r2=1710198&view=diff
==============================================================================
--- pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithattachments.mdtext (original)
+++ pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithattachments.mdtext Fri Oct 23 12:32:13 2015
@@ -13,15 +13,15 @@ See example:EmbeddedFiles
 A PDF can contain references to external files via the file system or a URL to a remote 
 location. It is also possible to embed a binary file into a PDF document.
 
-There are two classes that can be used when referencing a file. PDSimpleFileSpecification 
+There are two classes that can be used when referencing a file. ``PDSimpleFileSpecification``
 is a simple string reference to a file(e.g. "./movies/BigMovie.avi"). The simple file 
 specification does not allow for any parameters to be set. 
 
-The PDComplexFileSpecification is more feature rich and allows for advanced settings on 
+The ``PDComplexFileSpecification`` is more feature rich and allows for advanced settings on 
 the file reference.
 
 It is also possible to embed a file directly into a PDF. Instead of setting the file 
-attribute of the PDComplexFileSpecification, the EmbeddedFile attribute can be used instead.
+attribute of the ``PDComplexFileSpecification``, the ``EmbeddedFile`` attribute can be used instead.
 
 ## Adding a File Attachment
 
@@ -29,24 +29,26 @@ PDF documents can contain file attachmen
 menu. PDFBox allows attachments to be added to and extracted from PDF documents. 
 Attachments are part of the named tree that is attached to the document catalog.
 
-	PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
+~~~java
+PDEmbeddedFilesNameTreeNode efTree = new PDEmbeddedFilesNameTreeNode();
 
-	//first create the file specification, which holds the embedded file
-	PDComplexFileSpecification fs = new PDComplexFileSpecification();
-	fs.setFile( "Test.txt" );
-	InputStream is = ...;
-	PDEmbeddedFile ef = new PDEmbeddedFile(doc, is );
-	//set some of the attributes of the embedded file
-	ef.setSubtype( "test/plain" );
-	ef.setSize( data.length );
-	ef.setCreationDate( new GregorianCalendar() );
-	fs.setEmbeddedFile( ef );
-
-	//now add the entry to the embedded file tree and set in the document.
-	Map efMap = new HashMap();
-	efMap.put( "My first attachment", fs );
-	efTree.setNames( efMap );
-	//attachments are stored as part of the "names" dictionary in the document catalog
-	PDDocumentNameDictionary names = new PDDocumentNameDictionary( doc.getDocumentCatalog() );
-	names.setEmbeddedFiles( efTree );
-	doc.getDocumentCatalog().setNames( names );
\ No newline at end of file
+//first create the file specification, which holds the embedded file
+PDComplexFileSpecification fs = new PDComplexFileSpecification();
+fs.setFile( "Test.txt" );
+InputStream is = ...;
+PDEmbeddedFile ef = new PDEmbeddedFile(doc, is );
+//set some of the attributes of the embedded file
+ef.setSubtype( "test/plain" );
+ef.setSize( data.length );
+ef.setCreationDate( new GregorianCalendar() );
+fs.setEmbeddedFile( ef );
+
+//now add the entry to the embedded file tree and set in the document.
+Map efMap = new HashMap();
+efMap.put( "My first attachment", fs );
+efTree.setNames( efMap );
+//attachments are stored as part of the "names" dictionary in the document catalog
+PDDocumentNameDictionary names = new PDDocumentNameDictionary( doc.getDocumentCatalog() );
+names.setEmbeddedFiles( efTree );
+doc.getDocumentCatalog().setNames( names );
+~~~
\ No newline at end of file

Modified: pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithfonts.mdtext
URL: http://svn.apache.org/viewvc/pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithfonts.mdtext?rev=1710198&r1=1710197&r2=1710198&view=diff
==============================================================================
--- pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithfonts.mdtext (original)
+++ pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithfonts.mdtext Fri Oct 23 12:32:13 2015
@@ -30,59 +30,63 @@ The PDF specification states that a stan
 
 This small sample shows how to create a new document and print the text "Hello World" using one of the PDF base fonts.
 
-	// Create a document and add a page to it
-	PDDocument document = new PDDocument();
-	PDPage page = new PDPage();
-	document.addPage( page );
-	
-	// Create a new font object selecting one of the PDF base fonts
-	PDFont font = PDType1Font.HELVETICA_BOLD;
-	
-	// Start a new content stream which will "hold" the to be created content
-	PDPageContentStream contentStream = new PDPageContentStream(document, page);
-	
-	// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
-	contentStream.beginText();
-	contentStream.setFont( font, 12 );
-	contentStream.moveTextPositionByAmount( 100, 700 );
-	contentStream.drawString( "Hello World" );
-	contentStream.endText();
-	
-	// Make sure that the content stream is closed:
-	contentStream.close();
-	
-	// Save the results and ensure that the document is properly closed:
-	document.save( "Hello World.pdf");
-	document.close();
+~~~java
+// Create a document and add a page to it
+PDDocument document = new PDDocument();
+PDPage page = new PDPage();
+document.addPage( page );
+
+// Create a new font object selecting one of the PDF base fonts
+PDFont font = PDType1Font.HELVETICA_BOLD;
+
+// Start a new content stream which will "hold" the to be created content
+PDPageContentStream contentStream = new PDPageContentStream(document, page);
+
+// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
+contentStream.beginText();
+contentStream.setFont( font, 12 );
+contentStream.moveTextPositionByAmount( 100, 700 );
+contentStream.drawString( "Hello World" );
+contentStream.endText();
+
+// Make sure that the content stream is closed:
+contentStream.close();
+
+// Save the results and ensure that the document is properly closed:
+document.save( "Hello World.pdf");
+document.close();
+~~~
 
 ## Hello World using a TrueType font
 
 This small sample shows how to create a new document and print the text "Hello World" using a TrueType font.
 
-	// Create a document and add a page to it
-	PDDocument document = new PDDocument();
-	PDPage page = new PDPage();
-	document.addPage( page );
-	
-	// Create a new font object by loading a TrueType font into the document
-	PDFont font = PDTrueTypeFont.loadTTF(document, "Arial.ttf");
-	
-	// Start a new content stream which will "hold" the to be created content
-	PDPageContentStream contentStream = new PDPageContentStream(document, page);
-	
-	// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
-	contentStream.beginText();
-	contentStream.setFont( font, 12 );
-	contentStream.moveTextPositionByAmount( 100, 700 );
-	contentStream.drawString( "Hello World" );
-	contentStream.endText();
-	
-	// Make sure that the content stream is closed:
-	contentStream.close();
-	
-	// Save the results and ensure that the document is properly closed:
-	document.save( "Hello World.pdf");
-	document.close();
+~~~java
+// Create a document and add a page to it
+PDDocument document = new PDDocument();
+PDPage page = new PDPage();
+document.addPage( page );
+
+// Create a new font object by loading a TrueType font into the document
+PDFont font = PDTrueTypeFont.loadTTF(document, "Arial.ttf");
+
+// Start a new content stream which will "hold" the to be created content
+PDPageContentStream contentStream = new PDPageContentStream(document, page);
+
+// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
+contentStream.beginText();
+contentStream.setFont( font, 12 );
+contentStream.moveTextPositionByAmount( 100, 700 );
+contentStream.drawString( "Hello World" );
+contentStream.endText();
+
+// Make sure that the content stream is closed:
+contentStream.close();
+
+// Save the results and ensure that the document is properly closed:
+document.save( "Hello World.pdf");
+document.close();
+~~~
 
 While it is recommended to embed all fonts for greatest portability not all PDF producer 
 applications will do this. When displaying a PDF it is necessary to find an external font to use. 
@@ -97,27 +101,29 @@ use when no mapping exists.
 
 This small sample shows how to create a new document and print the text "Hello World" using a Postscript Type1 font.
 
-	// Create a document and add a page to it
-	PDDocument document = new PDDocument();
-	PDPage page = new PDPage();
-	document.addPage( page );
-	
-	// Create a new font object by loading a Postscript Type 1 font into the document
-	PDFont font = new PDType1AfmPfbFont(doc,"cfm.afm");
-	
-	// Start a new content stream which will "hold" the to be created content
-	PDPageContentStream contentStream = new PDPageContentStream(document, page);
-	
-	// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
-	contentStream.beginText();
-	contentStream.setFont( font, 12 );
-	contentStream.moveTextPositionByAmount( 100, 700 );
-	contentStream.drawString( "Hello World" );
-	contentStream.endText();
-	
-	// Make sure that the content stream is closed:
-	contentStream.close();
-	
-	// Save the results and ensure that the document is properly closed:
-	document.save( "Hello World.pdf");
-	document.close();
\ No newline at end of file
+~~~java
+// Create a document and add a page to it
+PDDocument document = new PDDocument();
+PDPage page = new PDPage();
+document.addPage( page );
+
+// Create a new font object by loading a Postscript Type 1 font into the document
+PDFont font = new PDType1AfmPfbFont(doc,"cfm.afm");
+
+// Start a new content stream which will "hold" the to be created content
+PDPageContentStream contentStream = new PDPageContentStream(document, page);
+
+// Define a text content stream using the selected font, moving the cursor and drawing the text "Hello World"
+contentStream.beginText();
+contentStream.setFont( font, 12 );
+contentStream.moveTextPositionByAmount( 100, 700 );
+contentStream.drawString( "Hello World" );
+contentStream.endText();
+
+// Make sure that the content stream is closed:
+contentStream.close();
+
+// Save the results and ensure that the document is properly closed:
+document.save( "Hello World.pdf");
+document.close();
+~~~
\ No newline at end of file

Modified: pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithmetadata.mdtext
URL: http://svn.apache.org/viewvc/pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithmetadata.mdtext?rev=1710198&r1=1710197&r2=1710198&view=diff
==============================================================================
--- pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithmetadata.mdtext (original)
+++ pdfbox/cmssite/branches/jekyll-migration/content/1.8/cookbook/workingwithmetadata.mdtext Fri Oct 23 12:32:13 2015
@@ -17,18 +17,19 @@ Getting basic Metadata
 To set or retrieve basic information about the document the PDDocumentInformation object 
 provides a high level API to that information:
 
-    PDDocumentInformation info = document.getDocumentInformation();
-    System.out.println( "Page Count=" + document.getNumberOfPages() );
-    System.out.println( "Title=" + info.getTitle() );
-    System.out.println( "Author=" + info.getAuthor() );
-    System.out.println( "Subject=" + info.getSubject() );
-    System.out.println( "Keywords=" + info.getKeywords() );
-    System.out.println( "Creator=" + info.getCreator() );
-    System.out.println( "Producer=" + info.getProducer() );
-    System.out.println( "Creation Date=" + info.getCreationDate() );
-    System.out.println( "Modification Date=" + info.getModificationDate());
-    System.out.println( "Trapped=" + info.getTrapped() );      
-      
+~~~java
+PDDocumentInformation info = document.getDocumentInformation();
+System.out.println( "Page Count=" + document.getNumberOfPages() );
+System.out.println( "Title=" + info.getTitle() );
+System.out.println( "Author=" + info.getAuthor() );
+System.out.println( "Subject=" + info.getSubject() );
+System.out.println( "Keywords=" + info.getKeywords() );
+System.out.println( "Creator=" + info.getCreator() );
+System.out.println( "Producer=" + info.getProducer() );
+System.out.println( "Creation Date=" + info.getCreationDate() );
+System.out.println( "Modification Date=" + info.getModificationDate());
+System.out.println( "Trapped=" + info.getTrapped() );      
+~~~
 
 ## Accessing PDF Metadata
 
@@ -50,14 +51,16 @@ recommended that you review that specifi
 managing the XML metadata, PDFBox uses standard java InputStream/OutputStream to retrieve 
 or set the XML metadata.
 
-	PDDocument doc = PDDocument.load( ... );
-    PDDocumentCatalog catalog = doc.getDocumentCatalog();
-    PDMetadata metadata = catalog.getMetadata();
-
-    //to read the XML metadata
-    InputStream xmlInputStream = metadata.createInputStream();
-
-    //or to write new XML metadata
-    InputStream newXMPData = ...;
-    PDMetadata newMetadata = new PDMetadata(doc, newXMLData, false );
-    catalog.setMetadata( newMetadata );
\ No newline at end of file
+~~~java
+PDDocument doc = PDDocument.load( ... );
+PDDocumentCatalog catalog = doc.getDocumentCatalog();
+PDMetadata metadata = catalog.getMetadata();
+
+//to read the XML metadata
+InputStream xmlInputStream = metadata.createInputStream();
+
+//or to write new XML metadata
+InputStream newXMPData = ...;
+PDMetadata newMetadata = new PDMetadata(doc, newXMLData, false );
+catalog.setMetadata( newMetadata );
+~~~
\ No newline at end of file