Posted to rivet-dev@tcl.apache.org by Harald Oehlmann <ha...@elmicron.de> on 2012/03/14 16:01:44 UTC

Encode entities

Is there a UTF-8 to HTML entity encoder in the Rivet standard library?
If not, I would favor including:
	http://wiki.tcl.tk/26403
which is identical to tcllib's html::html_entities.
-Harald

---------------------------------------------------------------------
To unsubscribe, e-mail: rivet-dev-unsubscribe@tcl.apache.org
For additional commands, e-mail: rivet-dev-help@tcl.apache.org


Re: Encode entities

Posted by Massimo Manghi <ma...@unipr.it>.
On 22.03.2012 11:13, Harald Oehlmann wrote:

> My compassion for the workload experience, sorry for that.
>
> I just wanted to state, that technically utf8 is sufficient.
> For other common encodings (html entities is not really an encoding),
> one may use:
> % rivet::entities [encoding convertfrom cp850 "\x8eh"]
> &Auml;h
>
> to recode the Text "Äh" from DOS CP850 to html entities.
>
> IMHO this is not much less elegant than:
> rivet::encode "\x8eh" -encoding cp850
>
> Or do I miss something obvious ?
>
> -Harald
>

No, I think you have the whole picture. Even though this is
no more than a wrapper around other stuff, it could turn out
to be a central utility on which to build other functionality.

  -- Massimo




Re: Encode entities

Posted by Harald Oehlmann <ha...@elmicron.de>.
On 22.03.2012 10:56, Massimo Manghi wrote:
> Perhaps no one will use it, but it won't be only unused component of
> Rivet. I've been
> working on an application that gets data and templates from different
> sources
> which may well use different encodings, so it won't be completely
> useless for me

My sympathies for the workload, sorry about that.

I just wanted to state that technically utf8 is sufficient.
For other common encodings (HTML entities are not really an encoding),
one may use:
% rivet::entities [encoding convertfrom cp850 "\x8eh"]
&Auml;h

to recode the text "Äh" from DOS CP850 to HTML entities.

IMHO this is not much less elegant than:
rivet::encode "\x8eh" -encoding cp850

Or am I missing something obvious?

-Harald
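
The two-step approach described above (decode the bytes with "encoding convertfrom", then encode entities) can be sketched as follows. This is only an illustration: entities_sketch is a hypothetical stand-in for the proposed rivet::entities, and it emits numeric character references (&#196;) rather than named entities such as &Auml;.

```tcl
# Hypothetical stand-in for the proposed rivet::entities command.
# Assumption: numeric character references are acceptable output.
proc entities_sketch {text} {
    # Escape the markup characters first.
    set text [string map {& &amp; < &lt; > &gt; \" &quot;} $text]
    # Replace every remaining non-ASCII character with a numeric reference.
    set out ""
    foreach ch [split $text ""] {
        scan $ch %c code
        if {$code > 127} {
            append out "&#$code;"
        } else {
            append out $ch
        }
    }
    return $out
}

# Byte 0x8e is "Ä" in CP850: decode the bytes first, then encode entities.
set decoded [encoding convertfrom cp850 "\x8eh"]
puts [entities_sketch $decoded]
```

Because the decoding step is ordinary Tcl, the entity encoder itself never needs to know about the source encoding.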



Re: Encode entities

Posted by Massimo Manghi <mx...@apache.org>.
On 22.03.2012 10:31, Harald Oehlmann wrote:
>
> I am sorry, IMHO this overstates the case.
>
> IMHO, there is no need to support other encodings. One would use to
> prepare the string with "encoding convertfrom" but this is rarely
> needed. We need a tool to output html dynamic text. As utf8 is the
> default Tcl encoding (e.g.) we have everything in utf8.
>
> By the way, I have asked on clt, if it would be possible to use
> "encoding convertto" to do html entities:
> 
> http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/7d6b2aace01d7043#
>
> -Harald
>

I hate to leave half-baked ideas around... :-( I'm rather obsessive about
this and I have too many things left around on my desk; if I can finish
one, I have a chance to feel better.

Therefore I'm exploiting 'encoding convertto' to support more than one
encoding, using UTF-8 as the intermediate encoding and then outputting
HTML entities.

Perhaps no one will use it, but it won't be the only unused component of
Rivet. I've been working on an application that gets data and templates
from different sources which may well use different encodings, so it
won't be completely useless for me.

  -- Massimo




Re: Encode entities

Posted by Harald Oehlmann <ha...@elmicron.de>.
On 22.03.2012 10:19, Massimo Manghi wrote:
> Thank you Jeff for the suggestion
> 
> the PHP implementation looks more elaborate and complete, supporting more
> than a single encoding and several ways to perform the translation.
> 
> I think this justifies the adoption of rather general terms for
> their commands. We may choose a lower profile adding the basic initial
> implementation and keeping the door open for other encodings
> (adding ISO-8859-1 should be trivial). The command could be
> 
> ::rivet::encode <string> ?-encoding <encoding-name>?
> ::rivet::decode <string> ?-encoding <encoding-name>?
> 
> we have to make clear in the docs that current implementation supports
> only UTF-8. There is still something that keeps me from adopting this
> name wholeheartedly.

I am sorry, IMHO this overstates the case.

IMHO, there is no need to support other encodings. One would
prepare the string with "encoding convertfrom", but this is rarely
needed. We need a tool to output dynamic HTML text. As utf8 is the
default Tcl encoding, we have everything in utf8.

By the way, I have asked on clt, if it would be possible to use
"encoding convertto" to do html entities:
http://groups.google.com/group/comp.lang.tcl/browse_thread/thread/7d6b2aace01d7043#

-Harald




Re: Encode entities

Posted by Massimo Manghi <mx...@apache.org>.
Thank you, Jeff, for the suggestion.

The PHP implementation looks more elaborate and complete, supporting more
than a single encoding and several ways to perform the translation.

I think this justifies the adoption of rather general terms for
their commands. We may choose a lower profile, adding the basic initial
implementation and keeping the door open for other encodings
(adding ISO-8859-1 should be trivial). The commands could be

::rivet::encode <string> ?-encoding <encoding-name>?
::rivet::decode <string> ?-encoding <encoding-name>?

We have to make clear in the docs that the current implementation supports
only UTF-8. There is still something that keeps me from adopting this
name wholeheartedly.

  -- Massimo
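
A minimal sketch of how a command with the proposed "<string> ?-encoding <encoding-name>?" signature might behave. The names encode_sketch and entities_sketch are hypothetical, not Rivet commands, and the sketch assumes that omitting -encoding means the argument is already a decoded Tcl string.

```tcl
# Hypothetical entity encoder: escapes markup characters and emits
# numeric references for everything outside ASCII.
proc entities_sketch {text} {
    set text [string map {& &amp; < &lt; > &gt; \" &quot;} $text]
    set out ""
    foreach ch [split $text ""] {
        scan $ch %c code
        if {$code > 127} { append out "&#$code;" } else { append out $ch }
    }
    return $out
}

# Hypothetical command following the proposed signature:
#   encode_sketch <string> ?-encoding <encoding-name>?
proc encode_sketch {data args} {
    if {[llength $args] % 2} {
        error "usage: encode_sketch string ?-encoding name?"
    }
    foreach {opt val} $args {
        if {$opt ne "-encoding"} {
            error "unknown option \"$opt\""
        }
        # Treat the input as bytes in the named encoding and decode them.
        set data [encoding convertfrom $val $data]
    }
    return [entities_sketch $data]
}

puts [encode_sketch "\x8eh" -encoding cp850]
puts [encode_sketch "a < b"]
```

Supporting another encoding then costs nothing beyond what Tcl's own "encoding names" already provides.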

On 21.03.2012 19:38, Jeff Lawson wrote:
>
> Although PHP is not necessarily the best thing to emulate, it does
> have the benefit of having a large established userbase and new Rivet
> users might have familiarity with its name choices.
>
> htmlentities -- http://php.net/htmlentities
> html_entity_decode -- http://php.net/html_entity_decode
>
>
> These two are basically the equivalents of Rivet's escape_string and
> unescape_string:
>
> urlencode -- http://php.net/urlencode
> urldecode -- http://php.net/urldecode


Re: Encode entities

Posted by Jeff Lawson <je...@bovine.net>.
On Wed, Mar 21, 2012 at 12:10 PM, Massimo Manghi <mx...@apache.org> wrote:
> On Wed, 2012-03-21 at 08:15 +0100, Harald Oehlmann wrote:
>> Am 21.03.2012 00:10, schrieb Massimo Manghi:
>> > On Wed, 2012-03-14 at 17:59 +0100, Harald Oehlmann wrote:
>> >> I would try to stay as compatible to tcllib as possible:
>> >>
>> >> html::html_entities
>> >>
>> >> but this is not a really nice name.
>> >
>> > I would mark the difference between tcllib's implementation and Rivet's
>> > adopted code. What if we get the code into a 'entitities' package and
>> > let the actual command be placed within the ::rivet namespace?
>> >
>> > e.g.
>> >
>> > package require entities
>> > set xformed_text [::rivet::entities $text]
>> >
>> > or
>> >
>> > ::rivet::html_entities
>>
>> I like both, but would prefer the first.
>>
>
> Me too, but if 'entities' as subcommand doesn't make sense, we need a
> also meaningful name for the inverse subcommand.
> '::rivet::encode'/'rivet::decode' are excellent but too generic after
> all. Suggestions?
>

Although PHP is not necessarily the best thing to emulate, it does
have the benefit of having a large established userbase and new Rivet
users might have familiarity with its name choices.

htmlentities -- http://php.net/htmlentities
html_entity_decode -- http://php.net/html_entity_decode


These two are basically the equivalents of Rivet's escape_string and
unescape_string:

urlencode -- http://php.net/urlencode
urldecode -- http://php.net/urldecode



Re: Encode entities

Posted by Massimo Manghi <mx...@apache.org>.
On Wed, 2012-03-21 at 08:15 +0100, Harald Oehlmann wrote:
> Am 21.03.2012 00:10, schrieb Massimo Manghi:
> > On Wed, 2012-03-14 at 17:59 +0100, Harald Oehlmann wrote:
> >> I would try to stay as compatible to tcllib as possible:
> >>
> >> html::html_entities
> >>
> >> but this is not a really nice name.
> > 
> > I would mark the difference between tcllib's implementation and Rivet's
> > adopted code. What if we get the code into a 'entitities' package and
> > let the actual command be placed within the ::rivet namespace?
> > 
> > e.g.
> > 
> > package require entities
> > set xformed_text [::rivet::entities $text]
> > 
> > or 
> > 
> > ::rivet::html_entities
> 
> I like both, but would prefer the first.
> 

Me too, but if 'entities' as a subcommand doesn't make sense, we also
need a meaningful name for the inverse subcommand.
'::rivet::encode'/'::rivet::decode' are excellent but too generic after
all. Suggestions?
 
 -- Massimo





Re: Encode entities

Posted by Harald Oehlmann <ha...@elmicron.de>.
On 21.03.2012 00:10, Massimo Manghi wrote:
> On Wed, 2012-03-14 at 17:59 +0100, Harald Oehlmann wrote:
>> I would try to stay as compatible to tcllib as possible:
>>
>> html::html_entities
>>
>> but this is not a really nice name.
> 
> I would mark the difference between tcllib's implementation and Rivet's
> adopted code. What if we get the code into a 'entitities' package and
> let the actual command be placed within the ::rivet namespace?
> 
> e.g.
> 
> package require entities
> set xformed_text [::rivet::entities $text]
> 
> or 
> 
> ::rivet::html_entities

I like both, but would prefer the first.

> the name ::html::html_entities is a redundant, not very well picked.
> 
>>
>> I personally have used:
>>
>> ee -> encode entity
>>
>> mce -> use msgcat::mc and html_entities
>>
> 
> I'm not an msgcat expert. I presume this call would be a wrapper for
> msgcat and ::rivet::entitited being called in sequence.

Absolutely. My demo code was wrong. Here is the corrected code, using the
definition above:

proc ::wwwbase::mce args {
     return [::rivet::entities [uplevel 1 [concat ::msgcat::mc $args]]]
}

Thank you,
Harald
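
For readers who want to try the wrapper idea without Rivet, here is a self-contained sketch. It assumes a hypothetical entities_sketch in place of ::rivet::entities, and it builds the msgcat call with "list" and {*} (Tcl 8.5 or later) instead of "concat"; the catalog entry is made up for the demonstration.

```tcl
package require msgcat

# Hypothetical entity encoder standing in for ::rivet::entities.
proc entities_sketch {text} {
    set text [string map {& &amp; < &lt; > &gt; \" &quot;} $text]
    set out ""
    foreach ch [split $text ""] {
        scan $ch %c code
        if {$code > 127} { append out "&#$code;" } else { append out $ch }
    }
    return $out
}

# Translate with msgcat, then encode the result as HTML entities.
# uplevel 1 makes msgcat::mc run in the caller's namespace, which is
# where msgcat looks up the message catalog.
proc mce_sketch {args} {
    return [entities_sketch [uplevel 1 [list ::msgcat::mc {*}$args]]]
}

# Demo catalog entry (made up for this example).
::msgcat::mclocale de
::msgcat::mcset de "Close" "Schlie\u00dfen"
puts [mce_sketch "Close"]
```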



Re: Encode entities

Posted by Massimo Manghi <ma...@alice.it>.
On Wed, 2012-03-14 at 17:59 +0100, Harald Oehlmann wrote:
> I would try to stay as compatible to tcllib as possible:
> 
> html::html_entities
> 
> but this is not a really nice name.

I would mark the difference between tcllib's implementation and Rivet's
adopted code. What if we put the code into an 'entities' package and
let the actual command be placed within the ::rivet namespace?

e.g.

package require entities
set xformed_text [::rivet::entities $text]

or 

::rivet::html_entities

The name ::html::html_entities is redundant and not very well chosen.

> 
> I personally have used:
> 
> ee -> encode entity
> 
> mce -> use msgcat::mc and html_entities
> 

I'm not an msgcat expert. I presume this call would be a wrapper for
msgcat and ::rivet::entities being called in sequence.


> proc ::wwwbase::mce args {
>     return [wwwbaseEncodeEntity [uplevel 1 [concat ::msgcat::mc $args]]]
> }
> 
> Please sleep a night over such a decision. You might have good ideas
> tomorrow morning.
> 
> Thank you,
> -Harald
> 

 -- Massimo






Re: Encode entities

Posted by Harald Oehlmann <ha...@elmicron.de>.
On 14.03.2012 17:46, Massimo Manghi wrote:
> On 14.03.2012 17:33, Harald Oehlmann wrote:
>>
>> Am 14.03.2012 17:23, Massimo Manghi wrote:
>>> Do you propose to encapsulate it into a package and put it in
>>> rivet/packages? (the second ensemble based implementation looks better)
>>
>> yes
> 
> ok, any name you like for such package?

I would try to stay as compatible with tcllib as possible:

html::html_entities

but this is not really a nice name.

I personally have used:

ee -> encode entity

mce -> use msgcat::mc and html_entities

proc ::wwwbase::mce args {
    return [wwwbaseEncodeEntity [uplevel 1 [concat ::msgcat::mc $args]]]
}

Please sleep on such a decision. You might have good ideas
tomorrow morning.

Thank you,
-Harald



Re: Encode entities

Posted by Massimo Manghi <ma...@unipr.it>.
On 14.03.2012 17:33, Harald Oehlmann wrote:
>
> Am 14.03.2012 17:23, Massimo Manghi wrote:
>> Do you propose to encapsulate it into a package and put it in
>> rivet/packages? (the second ensemble based implementation looks 
>> better)
>
> yes

ok, any name you like for such package?

>
>>
>> I usually consider tcllib as a needed component of every Tcl
>> installation, so I don't see it as an absolute requirements. However 
>> I
>> won't break anything.
>
> I considered to use the tcllib html package.
> I found the following inconveniences after a quick view:
> - it overlapps with the form package
> - it requires itself package ncgi which seams to be not so well 
> suited
> for Rivet
> - ncgi requires package fileutils, another package to check the
> consequences...
>

I wasn't aware the html package had such a dependency list. Good point
in favor of a simple encoder/decoder.

> As the use of entities is an important internal function - any data
> comming from a data base etc should pass by such a function - I was
> wondering that this is not included.
>
> I would also doubt if html entities could not be implemented as 
> encoding...
>
> -Harald


thanks Harald

  -- Massimo



Re: Encode entities

Posted by Harald Oehlmann <ha...@elmicron.de>.
> On 14.03.2012 16:01, Harald Oehlmann wrote:
>> Is there an utf-8 to html entity encoder in the rivet standard library ?
>> If not, I would favor to include:
>>     http://wiki.tcl.tk/26403
>> which is identical to tcllib html::html_entities

On 14.03.2012 17:23, Massimo Manghi wrote:
> Do you propose to encapsulate it into a package and put it in
> rivet/packages? (the second ensemble based implementation looks better)

yes

>
> I usually consider tcllib as a needed component of every Tcl
> installation, so I don't see it as an absolute requirements. However I
> won't break anything.

I considered using the tcllib html package.
After a quick look I found the following inconveniences:
- it overlaps with the form package
- it requires the package ncgi, which seems not so well suited
for Rivet
- ncgi requires the package fileutil, another package whose
consequences would need checking...

As the use of entities is an important internal function (any data
coming from a database etc. should pass through such a function), I was
surprised that this is not included.

I also wonder whether HTML entities could be implemented as an encoding...

-Harald



Re: Encode entities

Posted by Massimo Manghi <ma...@unipr.it>.
Do you propose to encapsulate it into a package and put it in
rivet/packages? (The second, ensemble-based implementation looks better.)

I usually consider tcllib a needed component of every Tcl
installation, so I don't see it as an absolute requirement. However, I
won't break anything.

  -- Massimo

On 14.03.2012 16:01, Harald Oehlmann wrote:
> Is there an utf-8 to html entity encoder in the rivet standard 
> library ?
> If not, I would favor to include:
> 	http://wiki.tcl.tk/26403
> which is identical to tcllib html::html_entities
> -Harald