
Posted to users@pdfbox.apache.org by Claudius Teodorescu <cl...@gmail.com> on 2017/01/14 07:44:17 UTC

Rendering of a Devanagari text

Hi,

I am using PDFBox 2.0.4, and I am trying to output a PDF document containing the
following Devanagari text: कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्.

The code is very simple:
    @Test
    public void testPdfBox() throws IOException {
        PDDocument document = new PDDocument();
        PDPage page = new PDPage();
        document.addPage(page);

        PDFont font = PDType0Font.load(document,
                new File("/home/claudius/workspaces/repositories/backup/fonts/Sanskrit2003.ttf"));

        PDPageContentStream contentStream = new PDPageContentStream(document, page);

        contentStream.beginText();
        contentStream.setFont(font, 12);
        contentStream.moveTextPositionByAmount(100, 700);
        contentStream.showText("कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्");
        contentStream.endText();

        // Make sure that the content stream is closed:
        contentStream.close();

        // Save the results and ensure that the document is properly closed:
        document.save("target/" + name.getMethodName() + ".pdf");
        document.close();
    }

The output PDF file (attached) does not render the string correctly, as it
appears above. Namely, the ligatures are not displayed, as if they did not
exist. On the other hand, if I copy the text from the PDF file and paste it
into Eclipse, it shows perfectly.

I checked the PDF output with Evince, Firefox, and Adobe Reader 9 on
Ubuntu.

Any idea on how to fix this display issue?

Thanks,
Claudius

-- 
http://kuberam.ro
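
For readers following along, here is a minimal sketch (plain Java, no PDFBox) that lists the code points behind one conjunct from the string above. The class name ConjunctCodePoints is made up for illustration. It shows that what looks like a single ligature on screen is stored as five separate code points, which is presumably why the text copied out of the PDF still pastes correctly: the characters are all there, only the shaping step is missing.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
final class ConjunctCodePoints {

    /** Returns one "U+XXXX NAME" entry per Unicode code point in the text. */
    static List<String> describe(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp ->
                out.add(String.format("U+%04X %s", cp, Character.getName(cp))));
        return out;
    }

    public static void main(String[] args) {
        // The conjunct from the middle of the sample string, written with
        // ASCII-only escapes so the source file encoding does not matter.
        describe("\u0924\u094d\u0924\u094d\u0935").forEach(System.out::println);
    }
}
```

A renderer that draws one glyph per code point will show the letters and virama signs side by side instead of the combined form.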

Re: Rendering of a Devanagari text

Posted by Tilman Hausherr <TH...@t-online.de>.

On 19.01.2017 at 13:17, Claudius Teodorescu wrote:
> So, I found the private use Unicode code point for a ligature, and
> displayed it in a PDF document by using the code:
>
> pageContentStream.showText("\u0924\u094d\u0924\u094d\u0935 is correctly displayed with glyph substitution as " + "\ue10d");
>
> The result is in the attached file.
>
> So, it looks like all that is needed is for the string to be rendered
> with all the glyph substitutions already done. With this approach,
> PDFBox is left untouched.

That's similar to the line

stream.showText("Ligatures: \uFB01lm \uFB02ood");

in the EmbeddedFonts.java example. But the problem is that one can't
know in advance (without parsing some advanced font tables) whether such
ligatures exist, and what code is to be used.

Tilman
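
A small sketch of the difference between the two cases, using only the JDK (the class name LigatureCodePoints is made up for illustration). The Latin ligatures in the EmbeddedFonts.java line are presentation forms with standardized code points that decompose back to plain letters, whereas U+E10D lies in the private use area, so which glyph it stands for is decided by the individual font.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of PDFBox.
final class LigatureCodePoints {

    /** True if the code point belongs to a Unicode private use area. */
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    /** Compatibility-decomposes presentation forms such as U+FB01. */
    static String decompose(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        System.out.println(decompose("\uFB01lm \uFB02ood")); // film flood
        System.out.println(isPrivateUse(0xE10D));            // true
        System.out.println(isPrivateUse(0xFB01));            // false
    }
}
```

So a private use code point found for one font cannot be assumed to work with another font.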


>
> Cheers from Heidelberg,
> Claudius
>
> On Tue, Jan 17, 2017 at 8:55 AM, Tilman Hausherr
> <THausherr@t-online.de> wrote:
>
>     On 17.01.2017 at 07:32, Claudius Teodorescu wrote:
>
>         Well, I was just about to congratulate myself for fixing this
>         with PDFBox, as FOP is returning good output, but with a
>         character that is only half rendered.
>
>         So, I guess I will need a text layout engine. What output of
>         such an engine would be fit for PDFBox?
>
>     In PDPageContentStream.showText there is this line:
>
>     COSWriter.writeString(font.encode(text), getOutput());
>
>     So you need to get that sequence... might be tricky as above that
>     line there's the subsetting that also needs the correct codes.
>     This is not a change that will be done within a few hours.
>
>     Tilman
>
>         Thanks,
>         Claudius
>
>         On Tue, Jan 17, 2017 at 7:18 AM, Tilman Hausherr
>         <THausherr@t-online.de> wrote:
>
>             On 15.01.2017 at 20:04, Claudius Teodorescu wrote:
>
>                 It is not a big deal; it works for an AWT component,
>                 but it is not tied to one:
>
>                     String s = "\u0915\u093e\u0930\u0923\u0924\u094d\u0924\u094d\u0935\u0919\u094d\u0917\u0935\u093e\u0936\u094d\u0935\u093e\u0926\u0940\u0928\u092e\u092a\u0940\u0924\u093f \u091a\u0947\u0924\u094d \u092f\u0941\u0915\u094d\u0924\u092e\u094d";
>                     Font font2 = new Font("Sanskrit2003", Font.PLAIN, 24);
>                     FontRenderContext frc = new FontRenderContext(new AffineTransform(), true, true);
>
>                     char[] chars = s.toCharArray();
>                     // layoutGlyphVector(...) instead of createGlyphVector(frc, s)
>                     GlyphVector glyphVector = font2.layoutGlyphVector(frc, chars, 0, chars.length, 0);
>
>                     int length = glyphVector.getNumGlyphs();
>
>                     for (int i = 0; i < length; i++) {
>                         Shape glyph = glyphVector.getGlyphOutline(i);
>                         System.out.println(glyphVector.getGlyphCode(i));
>                     }
>
>                 Any pointers about where I can hook this in PDFBox?
>
>             Problem is we don't use the awt fonts anymore.
>
>             Tilman
>
>                 Thanks,
>                 Claudius
>
>                 On Sun, Jan 15, 2017 at 4:56 PM, Andreas Lehmkuehler
>                 <andreas@lehmi.de> wrote:
>
>                     Hi,
>
>                     On 15.01.2017 at 15:51, Claudius Teodorescu wrote:
>
>                         Hi,
>
>                         Thanks for the answer, Tilman.
>
>                         I managed to get the Devanagari text exactly
>                         as it should be, by using
>                         java.awt.Font.layoutGlyphVector().
>
>                         Is there any chance to write a GlyphVector to
>                         a PDFBox page?
>
>                     There was a discussion at [1] about using
>                     GlyphVector, but we didn't make any decision nor
>                     did we implement anything.
>
>                     Would you mind sharing some of your code as a
>                     possible starting point?
>
>                     BR
>                     Andreas
>
>                     [1]
>                     https://issues.apache.org/jira/browse/PDFBOX-3550
>
>                         Thanks,
>                         Claudius
>
>                         On Sat, Jan 14, 2017 at 9:45 AM, Tilman
>                         Hausherr <THausherr@t-online.de> wrote:
>
>                             Hi,
>
>                             This is not supported, sorry. PDFBox just
>                             outputs the glyphs for the single
>                             characters and does not substitute
>                             ligatures.
>
>                             Tilman
>
>                                 ---------------------------------------------------------------------
>                                 To unsubscribe, e-mail:
>                                 users-unsubscribe@pdfbox.apache.org
>                                 <ma...@pdfbox.apache.org>
>                                 For additional commands, e-mail:
>                                 users-help@pdfbox.apache.org
>                                 <ma...@pdfbox.apache.org>

Re: Rendering of a Devanagari text

Posted by Claudius Teodorescu <cl...@gmail.com>.

Well, I hope I am not doing what you describe, as I am editing in Eclipse on
Ubuntu, and I am compiling with Maven as UTF-8.

On Fri, Jan 20, 2017 at 1:56 PM, Lachezar Dobrev <l....@gmail.com> wrote:

>   Apologies for being blunt, but seeing that you're mixing string
> literals and UNICODE escape sequences, I have to ask: are you *sure*
> you're using the same character set when editing the .java file and
> when compiling it? I've had discrepancies when editing the java file
> in one encoding (say UTF-8), but the automated build system uses ISO
> 8859-1, and literal non-Latin characters get mangled, sans those
> written as UNICODE escape sequences, since those are in the ASCII
> range.


-- 
http://kuberam.ro

Re: Rendering of a Devanagari text

Posted by Lachezar Dobrev <l....@gmail.com>.

  Apologies for being blunt, but seeing that you're mixing string
literals and UNICODE escape sequences, I have to ask: are you *sure*
you're using the same character set when editing the .java file and
when compiling it? I've had discrepancies when editing the java file
in one encoding (say UTF-8), but the automated build system uses ISO
8859-1, and literal non-Latin characters get mangled, sans those
written as UNICODE escape sequences, since those are in the ASCII
range.
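
To make the failure mode concrete, here is a minimal sketch in plain Java (the class name SourceEncodingMismatch is made up for illustration). It simulates a source file saved as UTF-8 being read by a compiler configured for ISO 8859-1: a single Devanagari letter turns into three unrelated Latin-1 characters, while the same letter written as an escape sequence would be unaffected because the escape itself is pure ASCII.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of PDFBox.
final class SourceEncodingMismatch {

    /** Simulates UTF-8 source bytes being decoded as ISO 8859-1. */
    static String misread(String literal) {
        byte[] utf8 = literal.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // DEVANAGARI LETTER TA, written here as an ASCII-only escape
        String ta = "\u0924";
        String mangled = misread(ta);
        System.out.println(ta.length());      // 1
        System.out.println(mangled.length()); // 3
        mangled.chars().forEach(c -> System.out.printf("U+%04X%n", c));
    }
}
```

With javac the source encoding is set through the -encoding option, and in Maven builds usually through the project.build.sourceEncoding property, so those are the settings to compare against the editor's.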


Re: Rendering of a Devanagari text

Posted by Claudius Teodorescu <cl...@gmail.com>.

So, I found the private use Unicode code point for a ligature, and displayed it
in a PDF document by using the code:

pageContentStream.showText("त्त्व is correctly displayed with glyph substitution as " + "\ue10d");

The result is in the attached file.

So, it looks like all that is needed is for the string to be rendered with all
the glyph substitutions already done. With this approach, PDFBox is left
untouched.


Cheers from Heidelberg,
Claudius
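
A rough sketch of what such a pre-substitution step might look like, in plain Java. The class ConjunctSubstitution and its table are hypothetical; the only mapping taken from this thread is the one cluster to U+E10D for Sanskrit2003, and any real table would have to be built per font.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class ConjunctSubstitution {

    // Cluster -> private use code point. Only this one entry comes from the
    // thread; a real table is font-specific and should list longer clusters
    // first so they are matched before their prefixes.
    private static final Map<String, String> TABLE = new LinkedHashMap<>();
    static {
        TABLE.put("\u0924\u094d\u0924\u094d\u0935", "\ue10d");
    }

    /** Replaces every known cluster with its private use code point. */
    static String substitute(String text) {
        String result = text;
        for (Map.Entry<String, String> e : TABLE.entrySet()) {
            result = result.replace(e.getKey(), e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String in = "\u0915\u093e\u0930\u0923\u0924\u094d\u0924\u094d\u0935";
        // Prints the code points after substitution
        substitute(in).codePoints()
                .forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

The result would then be passed to showText in place of the original string. One likely drawback: text copied from the resulting PDF would contain the private use code point rather than the original letters.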

On Tue, Jan 17, 2017 at 8:55 AM, Tilman Hausherr <TH...@t-online.de>
wrote:

> Am 17.01.2017 um 07:32 schrieb Claudius Teodorescu:
>
>> Well, I was just about to congratulate myself for fixing this with PDFBox,
>> as FOP is returning good output, but with a character that is represented
>> in half.
>>
>> So, I guess I will need a text layout engine. What output of such engine
>> would be fit for PDFBox?
>>
>
> In PDPageContentStream.showText there is this line:
>
> COSWriter.writeString(font.encode(text), getOutput());
>
> So you need to get that sequence... might be tricky as above that line
> there's the subsetting that also needs the correct codes. This is not a
> change that will be done within a few hours.
>
> Tilman
>
>


-- 
http://kuberam.ro

Re: Rendering of a Devanagari text

Posted by Tilman Hausherr <TH...@t-online.de>.

On 17.01.2017 at 07:32, Claudius Teodorescu wrote:
> Well, I was just about to congratulate myself for fixing this with PDFBox,
> as FOP is returning good output, but with a character that is represented
> in half.
>
> So, I guess I will need a text layout engine. What output of such engine
> would be fit for PDFBox?

In PDPageContentStream.showText there is this line:

COSWriter.writeString(font.encode(text), getOutput());

So you need to get that sequence... might be tricky as above that line 
there's the subsetting that also needs the correct codes. This is not a 
change that will be done within a few hours.
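
As a rough illustration of "that sequence": the sketch below packs glyph IDs
into the byte string that would end up inside the text-showing operator. It
assumes the font was loaded with PDType0Font.load and is therefore written with
a two-byte Identity-H encoding, where each code is a big-endian glyph ID; that
is an assumption, not something confirmed in this thread. The class name, the
method names and the two glyph IDs are hypothetical, and the subsetting issue
mentioned above is not handled at all.

```java
public class GlyphIdEncoder {

    /** Packs each glyph ID into two big-endian bytes (assumed Identity-H layout). */
    public static byte[] encodeGlyphIds(int[] glyphIds) {
        byte[] out = new byte[glyphIds.length * 2];
        for (int i = 0; i < glyphIds.length; i++) {
            int gid = glyphIds[i];
            if (gid < 0 || gid > 0xFFFF) {
                throw new IllegalArgumentException("glyph ID out of range: " + gid);
            }
            out[2 * i] = (byte) (gid >> 8);   // high byte
            out[2 * i + 1] = (byte) gid;      // low byte
        }
        return out;
    }

    /** Renders bytes as upper-case hex, the form used by PDF hex strings. */
    public static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical glyph IDs, as a text layout engine might return them.
        int[] shaped = {0x012A, 0x0045};
        // Prints the operand and operator as they would appear in a content stream.
        System.out.println("<" + toHex(encodeGlyphIds(shaped)) + "> Tj");
    }
}
```

A layout engine that returns glyph IDs for the embedded font would feed
something like encodeGlyphIds(); the hard part remains keeping those IDs valid
after the font has been subset.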

Tilman




---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org

Re: Rendering of a Devanagari text

Posted by Claudius Teodorescu <cl...@gmail.com>.

Well, I was just about to congratulate myself for fixing this with PDFBox,
as FOP is returning good output, but with a character that is represented
in half.

So, I guess I will need a text layout engine. What output from such an engine
would be suitable for PDFBox?


Thanks,
Claudius

On Tue, Jan 17, 2017 at 7:18 AM, Tilman Hausherr <TH...@t-online.de>
wrote:

> On 15.01.2017 at 20:04, Claudius Teodorescu wrote:
>
>> It is not a big deal. It works for an AWT component, but it is not
>> tied to one:
>>
>>          String s = "कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्";
>>          Font font2 = new Font("Sanskrit2003", Font.PLAIN, 24);
>>          FontRenderContext frc = new FontRenderContext(new
>> AffineTransform(), true, true);
>>
>>          char[] chars = s.toCharArray();
>>          GlyphVector glyphVector = font2.layoutGlyphVector(frc, chars, 0,
>> chars.length, 0);// createGlyphVector(frc, s);
>>
>>          int length = glyphVector.getNumGlyphs();
>>
>>          for (int i = 0; i < length; i++) {
>>            Shape glyph = glyphVector.getGlyphOutline(i);
>>            System.out.println(glyphVector.getGlyphCode(i));
>>          }
>>
>> Any pointers about where I can hook this in PDFBox?
>>
>
> Problem is we don't use the awt fonts anymore.
>
> Tilman


-- 
http://kuberam.ro

Re: Rendering of a Devanagari text

Posted by Tilman Hausherr <TH...@t-online.de>.

On 15.01.2017 at 20:04, Claudius Teodorescu wrote:
> It is not a big deal. It works for an AWT component, but it is not
> tied to one:
>
>          String s = "कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्";
>          Font font2 = new Font("Sanskrit2003", Font.PLAIN, 24);
>          FontRenderContext frc = new FontRenderContext(
>                  new AffineTransform(), true, true);
>
>          char[] chars = s.toCharArray();
>          GlyphVector glyphVector = font2.layoutGlyphVector(frc, chars, 0,
>                  chars.length, 0); // createGlyphVector(frc, s);
>
>          int length = glyphVector.getNumGlyphs();
>
>          for (int i = 0; i < length; i++) {
>            Shape glyph = glyphVector.getGlyphOutline(i);
>            System.out.println(glyphVector.getGlyphCode(i));
>          }
>
> Any pointers about where I can hook this in PDFBox?

The problem is that we don't use AWT fonts anymore.

Tilman





Re: Rendering of a Devanagari text

Posted by Claudius Teodorescu <cl...@gmail.com>.

It is not a big deal. It works for an AWT component, but it is not
tied to one:

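        // Needs java.awt.Font, java.awt.Shape, java.awt.font.FontRenderContext,
        // java.awt.font.GlyphVector and java.awt.geom.AffineTransform.
        String s = "कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्";
        Font font2 = new Font("Sanskrit2003", Font.PLAIN, 24);
        FontRenderContext frc = new FontRenderContext(
                new AffineTransform(), true, true);

        // layoutGlyphVector() performs full text layout (needed for ligatures);
        // createGlyphVector(frc, s) would only map characters to glyphs one-to-one.
        char[] chars = s.toCharArray();
        GlyphVector glyphVector = font2.layoutGlyphVector(frc, chars, 0,
                chars.length, Font.LAYOUT_LEFT_TO_RIGHT);

        int length = glyphVector.getNumGlyphs();

        for (int i = 0; i < length; i++) {
          // The outline is what would have to be drawn; the code is font-specific.
          Shape glyph = glyphVector.getGlyphOutline(i);
          System.out.println(glyphVector.getGlyphCode(i));
        }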

Any pointers about where I can hook this in PDFBox?


Thanks,
Claudius

On Sun, Jan 15, 2017 at 4:56 PM, Andreas Lehmkuehler <an...@lehmi.de>
wrote:

> Hi,
>
> On 15.01.2017 at 15:51, Claudius Teodorescu wrote:
>
>> Hi,
>>
>>
>> Thanks for the answer, Tilman.
>>
>> I managed to get the Devanagari text laid out exactly as it should be by
>> using java.awt.Font.layoutGlyphVector().
>>
>> Is there any chance of writing a GlyphVector to a PDFBox page?
>>
> There was a discussion at [1] about using GlyphVector, but we didn't make
> any decision, nor did we implement anything.
>
> Would you mind sharing some of your code as a possible starting point?
>
> BR
> Andreas
>
> [1] https://issues.apache.org/jira/browse/PDFBOX-3550


-- 
http://kuberam.ro

Re: Rendering of a Devanagari text

Posted by Andreas Lehmkuehler <an...@lehmi.de>.

Hi,

On 15.01.2017 at 15:51, Claudius Teodorescu wrote:
> Hi,
>
>
> Thanks for the answer, Tilman.
>
> I managed to get the Devanagari text laid out exactly as it should be by
> using java.awt.Font.layoutGlyphVector().
>
> Is there any chance of writing a GlyphVector to a PDFBox page?
There was a discussion at [1] about using GlyphVector, but we didn't make any
decision, nor did we implement anything.

Would you mind sharing some of your code as a possible starting point?

BR
Andreas

[1] https://issues.apache.org/jira/browse/PDFBOX-3550



Re: Rendering of a Devanagari text

Posted by Claudius Teodorescu <cl...@gmail.com>.

Hi,


Thanks for the answer, Tilman.

I managed to get the Devanagari text laid out exactly as it should be by
using java.awt.Font.layoutGlyphVector().

Is there any chance of writing a GlyphVector to a PDFBox page?


Thanks,
Claudius

On Sat, Jan 14, 2017 at 9:45 AM, Tilman Hausherr <TH...@t-online.de>
wrote:

> Hi,
>
> This is not supported, sorry. PDFBox just outputs the glyph for each
> single character and does not substitute ligature glyphs.
>
> Tilman


-- 
http://kuberam.ro

Re: Rendering of a Devanagari text

Posted by Tilman Hausherr <TH...@t-online.de>.

Hi,

This is not supported, sorry. PDFBox just outputs the glyph for each
single character and does not substitute ligature glyphs.
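
To make the limitation concrete, here is a small JDK-only sketch (it does not
use PDFBox). It lists the code points of the string in question and counts the
places where a virama (U+094D) is directly followed by a consonant, which is
where a shaping engine would normally form a conjunct. The class and method
names are made up for illustration, and the consonant check is a
simplification: real shaping is driven by the font's substitution tables, not
by a rule like this.

```java
public class ConjunctInspector {

    private static final char VIRAMA = '\u094D';

    // The string from the original question, written with escapes so the
    // example does not depend on the source file encoding.
    public static final String TEXT =
            "\u0915\u093e\u0930\u0923\u0924\u094d\u0924\u094d\u0935\u0919\u094d\u0917"
          + "\u0935\u093e\u0936\u094d\u0935\u093e\u0926\u0940\u0928\u092e\u092a\u0940"
          + "\u0924\u093f \u091a\u0947\u0924\u094d \u092f\u0941\u0915\u094d\u0924\u092e\u094d";

    /** Simplified check: the basic Devanagari consonants KA..HA. */
    public static boolean isConsonant(char c) {
        return c >= '\u0915' && c <= '\u0939';
    }

    /** Counts viramas that are immediately followed by a consonant. */
    public static int countConjunctJoins(String text) {
        int joins = 0;
        for (int i = 0; i + 1 < text.length(); i++) {
            if (text.charAt(i) == VIRAMA && isConsonant(text.charAt(i + 1))) {
                joins++;
            }
        }
        return joins;
    }

    public static void main(String[] args) {
        // One line per code point: this is the granularity at which glyphs are
        // looked up when no shaping is applied.
        TEXT.codePoints().forEach(cp ->
                System.out.printf("U+%04X %s%n", cp, Character.getName(cp)));
        System.out.println("conjunct joins: " + countConjunctJoins(TEXT));
    }
}
```

Each counted position is a spot where one-glyph-per-character output shows a
consonant with a visible virama instead of the combined form.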

Tilman

On 14.01.2017 at 08:44, Claudius Teodorescu wrote:
> Hi,
>
> I am using PDFBox 2.0.4, and I am trying to output a PDF document containing
> the following Devanagari text: कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्.
>
> The code is very simple:
>     @Test
>     public void testPdfBox() throws IOException {
>         PDDocument document = new PDDocument();
>         PDPage page = new PDPage();
>         document.addPage(page);
>
>         PDFont font = PDType0Font.load(document,
>                 new File("/home/claudius/workspaces/repositories/backup/fonts/Sanskrit2003.ttf"));
>
>         PDPageContentStream contentStream =
>                 new PDPageContentStream(document, page);
>
>         contentStream.beginText();
>         contentStream.setFont(font, 12);
>         contentStream.moveTextPositionByAmount(100, 700);
>         contentStream.showText("कारणत्त्वङ्गवाश्वादीनमपीति चेत् युक्तम्");
>         contentStream.endText();
>
>         // Make sure that the content stream is closed:
>         contentStream.close();
>
>         // Save the results and ensure that the document is properly closed:
>         document.save("target/" + name.getMethodName() + ".pdf");
>         document.close();
>     }
>
> The output PDF file (attached) does not render the string correctly, that
> is, as it appears above. Namely, the ligatures are not displayed, as if they
> did not exist. On the other hand, if I copy the text from the PDF file and
> paste it into Eclipse, it shows perfectly.
>
> I checked the PDF output with Evince, Firefox, and Adobe Reader 9 on
> Ubuntu.
>
> Any idea on how to fix this display issue?
>
> Thanks,
> Claudius
>
> -- 
> http://kuberam.ro