Posted to fop-users@xmlgraphics.apache.org by echoo <ve...@eurodyn.com> on 2011/01/25 09:32:15 UTC

FOP - HTML2PDF

Dear all,


I have a problem which I don't know how to solve:

    I have an XML file which I want to transform to:
    - a PDF file.
    - an XSL file.

    For this, I use Apache FOP (as I am working in a Java environment).
    The result is nice except for one thing:

    My XML has one field called 'introduction', which accepts HTML content.
    After transformation, the raw HTML markup is shown in the PDF file.

    I want the HTML to be interpreted.


Yours Sincerely



Christof
-- 
View this message in context: http://old.nabble.com/FOP---HTML2PDF-tp30748316p30748316.html
Sent from the FOP - Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


Re: FOP - HTML2PDF

Posted by mehdi houshmand <me...@gmail.com>.
Hi Christof,

Correct me if I'm wrong, but you're trying to extract the relevant
text from the HTML and convert it to FO objects. If so, that looks
like a job for regex, i.e. finding strings: in your case you'd be
looking for <table>ANY STRING</table> (I presume) and inserting that
text into FO elements. However, there's almost definitely a more
intuitive way to do that using XSLT, though that's not really within
the scope of this forum. You want all that intelligence in the XSLT:
the XSLT should parse the HTML and create the necessary FO elements.
XSLT is a very powerful tool, and most likely someone else has already
done what you're trying to do, or at least something similar that you
could... Uhm... *cough* plagiarize *cough*. My point is there's no
point reinventing the wheel. Google is your friend; this might be a
good starting point:
http://stackoverflow.com/questions/1639625/can-i-parse-an-html-using-xslt
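
One possible reason the tags show up literally in the PDF is that the
HTML is stored in the field as escaped text (&lt;table&gt;...) or in a
CDATA section, so the XSLT only ever sees a string. That is an
assumption; the thread does not show the source XML. XSLT 1.0 has no
standard function for turning a string into nodes, so a rough sketch of
a workaround is to expand the text into real elements in Java before
the transformation. The class name, the sample document and the
<wrapper> element are made up for illustration, and the sketch only
works if the stored HTML is well-formed XML (XHTML).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class UnescapeIntroduction {

    // Hypothetical input: the HTML inside <introduction> is escaped text.
    public static final String SAMPLE_XML =
        "<report><introduction>"
      + "&lt;table&gt;&lt;tr&gt;&lt;td&gt;Apples&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;"
      + "</introduction></report>";

    private static Document parse(DocumentBuilder builder, String xml) throws Exception {
        return builder.parse(new InputSource(new StringReader(xml)));
    }

    // Replaces the escaped text of the first <introduction> element with
    // the elements that the text describes.
    public static Document expand(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = parse(builder, xml);
        Element intro = (Element) doc.getElementsByTagName("introduction").item(0);

        // The text may hold several top-level elements, so give it one root.
        // Assumption: the text is well-formed XML; otherwise parse() throws.
        String wrapped = "<wrapper>" + intro.getTextContent() + "</wrapper>";
        Document fragment = parse(builder, wrapped);

        // Drop the escaped text...
        while (intro.hasChildNodes()) {
            intro.removeChild(intro.getFirstChild());
        }
        // ...and copy the parsed nodes into the original document.
        Node child = fragment.getDocumentElement().getFirstChild();
        while (child != null) {
            intro.appendChild(doc.importNode(child, true));
            child = child.getNextSibling();
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        Document doc = expand(SAMPLE_XML);
        System.out.println(doc.getElementsByTagName("td").getLength());
    }
}
```

The resulting Document can be passed to the XSLT as a DOMSource, so
templates can then match <table>, <tr> and <td> as ordinary elements.
HTML that is not well-formed (an unclosed <br>, or entities such as
&nbsp;) would need to be tidied first.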

I hope that helps

Mehdi

On 25 January 2011 13:33, echoo <ve...@eurodyn.com> wrote:
>
> Dear Mehdi
>
>
> Thank you for your reply.
>
> What do you mean by 'interpreted as text'?
> What I want is that if I have a <table> tag in the HTML content, a table
> should be drawn in the resulting PDF file. I just don't know how :-) (yet)
> Your link might indeed be useful.
>
>
> Yours Sincerely
>
>
>
>
> Christof

Re: FOP - HTML2PDF

Posted by echoo <ve...@eurodyn.com>.
Dear Mehdi


Thank you for your reply.

What do you mean by 'interpreted as text'?
What I want is that if I have a <table> tag in the HTML content, a table
should be drawn in the resulting PDF file. I just don't know how :-) (yet)
Your link might indeed be useful.


Yours Sincerely




Christof






Re: FOP - HTML2PDF

Posted by mehdi houshmand <me...@gmail.com>.
Hi Christof,

Just to be clear: you've got an XML element that contains HTML, and you
want that HTML to be interpreted as text (i.e. you want the HTML tags
removed)? If so, this isn't strictly a FOP question. FOP isn't
responsible for analysing/parsing/interpreting XML directly (though
admittedly it does accept XML as input along with an XSLT to transform
the XML to FO). Anyway, the point is, you want that knowledge to be in
the XSL. The XSLT is responsible for parsing the XML and converting it
into FO. How you do that is a question for an XSLT forum, but one way
would be to use regexes to return the string you want. One Google search
yielded http://www.xml.com/pub/a/2003/06/04/tr.html, which seems like a
fairly nice little introduction to regexes in XSLT.

I hope that helps

Mehdi



Re: FOP - HTML2PDF

Posted by echoo <ve...@eurodyn.com>.
Hello Wim VN


Thank you for your reply.

Yes, what you mention is exactly what I want to do.

I am not sure, but I believe I can use the XSL file (xhtml2fo.xsl) provided
by Antenna House (http://www.antennahouse.com/XSLsample/XSLsample.htm) to
look up the HTML translations. Is this what you mean by 'have the XSL
transformation lookup that <introduction> tag'?


Yours Sincerely




Christof




Re: FOP - HTML2PDF

Posted by Wim VN <wi...@gmail.com>.
Hello Christof,

I'm not sure but I think the solution can be found in XSLT and not in FOP.

If I understand correctly, you use an XSL transformation to go from a source
XML file to an intermediate XSL-FO file, and afterwards you process this with
Apache FOP to produce the final PDF document.
Within the XML there is a tag that holds HTML content, and you want this
content to be interpreted.

Is it not possible to have the XSL transformation look up that <introduction>
tag and make sure it converts the HTML content to FO as well?
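
As a minimal sketch of that idea: the stylesheet below has one template
per HTML element and emits the matching FO element, and is run from Java
with the standard javax.xml.transform API. The class name, the sample
document and the small set of supported tags are made up for
illustration, and the sketch assumes that the HTML inside <introduction>
consists of real, well-formed child elements rather than escaped text.
It prints only an FO fragment; a complete stylesheet would also wrap the
result in fo:root with a page layout before handing it to FOP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class IntroductionToFo {

    // Hypothetical source document with HTML elements inside <introduction>.
    public static final String SAMPLE_XML =
        "<report><introduction>"
      + "<p>Prices</p>"
      + "<table><tr><td>Apples</td><td>3</td></tr></table>"
      + "</introduction></report>";

    // Illustrative stylesheet: each HTML element gets its own template.
    public static final String STYLESHEET =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
      + " xmlns:fo='http://www.w3.org/1999/XSL/Format'>"
      + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
      + "<xsl:template match='introduction'>"
      + "<fo:block><xsl:apply-templates/></fo:block>"
      + "</xsl:template>"
      + "<xsl:template match='p'>"
      + "<fo:block><xsl:apply-templates/></fo:block>"
      + "</xsl:template>"
      + "<xsl:template match='table'>"
      + "<fo:table table-layout='fixed' width='100%'><fo:table-body>"
      + "<xsl:apply-templates select='tr|tbody/tr'/>"
      + "</fo:table-body></fo:table>"
      + "</xsl:template>"
      + "<xsl:template match='tr'>"
      + "<fo:table-row><xsl:apply-templates select='td'/></fo:table-row>"
      + "</xsl:template>"
      + "<xsl:template match='td'>"
      + "<fo:table-cell><fo:block><xsl:apply-templates/></fo:block></fo:table-cell>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Applies the stylesheet to the given XML and returns the FO markup.
    public static String toFo(String xml) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(toFo(SAMPLE_XML));
    }
}
```

Any HTML element without a template falls through to XSLT's built-in
rules, which keep its text and drop the markup, so the tag coverage can
be extended one template at a time.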

I am not an XSLT expert and if I'm not mistaken other forums might be a
better choice to get help on this specific problem.

Good luck with your project
Wim

