Posted to jmeter-dev@jakarta.apache.org by "BAZLEY, Sebastian" <Se...@london.sema.slb.com> on 2004/02/05 12:05:37 UTC

How to handle htmlparser library (was Are we ready for a RC?)

We could check for the presence of HTMLParser at run-time, and fall back to
JTidy or Regex (or etc.) if not present. 

Some users might not like the fallback behaviour, so if the parser property
were changed to be a list of the acceptable parsers, in order of preference,
we could support as much (or as little) fallback as required.

We should log a warning message if the desired parser is not present (JMeter
already logs an info message when a parser is initialised).
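
A minimal sketch of how such an ordered fallback might look, assuming the
parser property holds a comma-separated list of class names. ParserResolver,
parsePreferenceList and firstAvailable are hypothetical names, and
java.util.logging stands in here for whatever logger JMeter actually uses:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Logger;

// Hypothetical helper, not existing JMeter code.
final class ParserResolver {
    private static final Logger LOG = Logger.getLogger(ParserResolver.class.getName());

    /** Splits a comma-separated preference list into trimmed, non-empty class names. */
    static List<String> parsePreferenceList(String propertyValue) {
        List<String> names = new ArrayList<>();
        if (propertyValue == null) {
            return names;
        }
        for (String part : propertyValue.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    /** Returns the first class in the list that can be loaded, or null if none can. */
    static Class<?> firstAvailable(List<String> classNames) {
        for (String name : classNames) {
            try {
                Class<?> found = Class.forName(name);
                LOG.info("Using parser " + name);
                return found;
            } catch (ClassNotFoundException | LinkageError e) {
                // The desired parser (or one of its dependencies) is missing: warn and fall back.
                LOG.warning("Parser " + name + " is not available, trying the next one");
            }
        }
        return null;
    }
}
```

A list with a single entry gives no fallback at all, so users who dislike
the fallback behaviour can simply name one parser.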

S.
>-----Original Message-----
>From: peter lin [mailto:jmw00lfel@yahoo.com]
>Sent: 05 February 2004 00:23
>To: JMeter Developers List
>Subject: RE: Are we ready for a RC?
>
>
>
>right except that would mean users would have to go
>download HTMLParser, and change the setting for JMeter
>to make HTTPSampler use HTMLParser.
>
>otherwise it would still use JTidy. If you think we
>really shouldn't have it, then I'm fine with removing
>it. Though it's already there and I feel it is a
>benefit to users and it does improve the throughput of
>JMeter as my benchmarks showed.
>
>peter lin
>
>
>--- mstover1@apache.org wrote:
>> That's why we'd use Ant to fetch them rather than
>> put them in CVS.  We can't distribute 
>> the jars, but to do a dist, JMeter should come fully
>> compiled against these optional jars 
>> (and so I need them).  Also, an Ant fetch target
>> would be really nice for users - want all 
>> the optional abilities?  Run the target to get them.
>>  Wouldn't work for the mail api's, but 
>> would be great for anything under non-compatible
>> free licenses.
>> 
>> -Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: jmeter-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: jmeter-dev-help@jakarta.apache.org


Re: How to handle htmlparser library (was Are we ready for a RC?)

Posted by ms...@apache.org.
We have a license to keep the source in CVS but not a binary jar?

On 5 Feb 2004 at 5:52, peter lin wrote:

>
> I guess I'm the only one with the bias towards better performance
> at the cost of increased maintenance. If everyone prefers to default
> to JTidy and require users to download HTMLParser, I have no
> objections.
>
> I just would rather make it easier on the user and not add another
> jar file for users to download. Plus the developers of HTMLParser
> were kind enough to donate a license to us. Overall, my bias is
> towards keeping the source in CVS or trying to move it to commons.
> HTMLParser is capable of parsing XML and other markup
> languages, so it does provide a flexible set of APIs for developers to
> extend.
>
> On an unrelated note, I am working on the monitor idea again after
> several months of putting it off. In order to get the monitor to work, I
> need to use digest authentication. My plan is to use
> commons-HTTPClient, since it supports digest auth. I was also planning
> on doing a simple benchmark comparing the default URLConnection
> to HTTPClient. HTTPClient also supports NTLM, so it could mean
> an easy way to support NTLM in HTTPSampler.
>
> If there are no performance degradations using HTTPClient, I will
> probably suggest we convert to HTTPClient. Does anyone have an
> allergy to that idea? If so, speak up now and I'll just keep the results
> to myself.
>
> I went through the bugs last night. I don't know enough of those
> samplers to be able to provide a quick patch. Are there any other
> bugs we want to address before a release candidate?
>  
>  
> peter lin
>  
> 
> 
> "BAZLEY, Sebastian" <Se...@london.sema.slb.com> wrote:
> We could check for the presence of HTMLParser at run-time, and fall back to
> JTidy or Regex (or etc.) if not present.
>
> Some users might not like the fallback behaviour, so if the parser property
> were changed to be a list of the acceptable parsers, in order of preference,
> we could support as much (or as little) fallback as required.
>
> We should log a warning message if the desired parser is not present (JMeter
> already logs an info message when a parser is initialised).
> 
> S.
> 
> 




--
Michael Stover
mstover1@apache.org
Yahoo IM: mstover_ya
ICQ: 152975688
AIM: mstover777



Re: How to handle htmlparser library (was Are we ready for a RC?)

Posted by peter lin <jm...@yahoo.com>.

Jordi Salvat i Alabart <js...@atg.com> wrote:

> +0 on keeping htmlparser code in our codebase -- at least for the time
> being.
>
> +0 on switching default parser to htmlparser. Performance-wise, I would
> vote for the regex one, but I don't really trust its accuracy (although
> it is working very well for me so far, I usually work with sites that
> have fairly clean HTML). Anyway, I'm thinking about making the parser a
> configuration component.


I just double-checked, and it looks like HTMLParser is the default.

private final static String DEFAULT_PARSER = "org.apache.jmeter.protocol.http.parser.HtmlParserHTMLParser";

We should change it back to JTidy. I prefer the idea of making it configurable, so that users can select the parser they want in the GUI. That way it doesn't require changing the properties file.
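
A rough sketch of the "default to JTidy, but let a setting override it" idea. Both the property key and the JTidy class name below are assumptions made for illustration (only the HtmlParserHTMLParser class name appears in this thread), and ParserSetting is a hypothetical helper:

```java
import java.util.Properties;

// Hypothetical helper, not existing JMeter code.
final class ParserSetting {
    // Assumed class name for the JTidy-based parser.
    static final String DEFAULT_PARSER =
            "org.apache.jmeter.protocol.http.parser.JTidyHTMLParser";

    // Assumed property key; a GUI element could write the same setting.
    static final String PARSER_PROPERTY = "htmlParser.className";

    /** Returns the configured parser class name, or the JTidy default if none is set. */
    static String parserClassName(Properties props) {
        String name = props.getProperty(PARSER_PROPERTY);
        if (name == null || name.trim().isEmpty()) {
            return DEFAULT_PARSER;
        }
        return name.trim();
    }
}
```

With this shape, users who never touch the setting get JTidy, and anyone who has downloaded HTMLParser can opt in by naming its class.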


> +1 on switching to HTTPClient -- as long as performance is good, which
> I'm sure it will be.
>
> --
> Salut,
>
> Jordi.



I'll report my findings once I am done.

 

peter lin



Re: How to handle htmlparser library (was Are we ready for a RC?)

Posted by Jordi Salvat i Alabart <js...@atg.com>.
+0 on keeping htmlparser code in our codebase -- at least for the time 
being.

+0 on switching default parser to htmlparser. Performance-wise, I would 
vote for the regex one, but I don't really trust its accuracy (although 
it is working very well for me so far, I usually work with sites that 
have fairly clean HTML). Anyway, I'm thinking about making the parser a 
configuration component.

+1 on switching to HTTPClient -- as long as performance is good, which 
I'm sure it will be.

-- 
Salut,

Jordi.

En/na peter lin ha escrit:
>  
>  
> I guess I'm the only one with the bias towards better performance at the cost of increased maintenance. If everyone prefers to default to JTidy and require users to download HTMLParser, I have no objections.
>  
> I just would rather make it easier on the user and not add another jar file for users to download. Plus the developers of HTMLParser were kind enough to donate a license to us. Overall, my bias is towards keeping the source in CVS or trying to move it to commons. HTMLParser is capable of parsing XML and other markup languages, so it does provide a flexible set of APIs for developers to extend. 
>  
> On an unrelated note, I am working on the monitor idea again after several months of putting it off. In order to get the monitor to work, I need to use digest authentication. My plan is to use commons-HTTPClient, since it supports digest auth. I was also planning on doing a simple benchmark comparing the default URLConnection to HTTPClient. HTTPClient also supports NTLM, so it could mean an easy way to support NTLM in HTTPSampler.
>  
> If there are no performance degradations using HTTPClient, I will probably suggest we convert to HTTPClient. Does anyone have an allergy to that idea? If so, speak up now and I'll just keep the results to myself.
>  
> I went through the bugs last night. I don't know enough of those samplers to be able to provide a quick patch. Are there any other bugs we want to address before a release candidate?
>  
>  
> peter lin
>  
> 
> 
> "BAZLEY, Sebastian" <Se...@london.sema.slb.com> wrote:
> We could check for the presence of HTMLParser at run-time, and fall back to
> JTidy or Regex (or etc.) if not present. 
> 
> Some users might not like the fallback behaviour, so if the parser property
> were changed to be a list of the acceptable parsers, in order of preference,
> we could support as much (or as little) fallback as required.
> 
> We should log a warning message if the desired parser is not present (JMeter
> already logs an info message when a parser is initialised).
> 
> S.
> 
> 




Re: How to handle htmlparser library (was Are we ready for a RC?)

Posted by peter lin <jm...@yahoo.com>.
 
I guess I'm the only one with the bias towards better performance at the cost of increased maintenance. If everyone prefers to default to JTidy and require users to download HTMLParser, I have no objections.
 
I just would rather make it easier on the user and not add another jar file for users to download. Plus the developers of HTMLParser were kind enough to donate a license to us. Overall, my bias is towards keeping the source in CVS or trying to move it to commons. HTMLParser is capable of parsing XML and other markup languages, so it does provide a flexible set of APIs for developers to extend. 
 
On an unrelated note, I am working on the monitor idea again after several months of putting it off. In order to get the monitor to work, I need to use digest authentication. My plan is to use commons-HTTPClient, since it supports digest auth. I was also planning on doing a simple benchmark comparing the default URLConnection to HTTPClient. HTTPClient also supports NTLM, so it could mean an easy way to support NTLM in HTTPSampler.
 
If there are no performance degradations using HTTPClient, I will probably suggest we convert to HTTPClient. Does anyone have an allergy to that idea? If so, speak up now and I'll just keep the results to myself.
 
I went through the bugs last night. I don't know enough of those samplers to be able to provide a quick patch. Are there any other bugs we want to address before a release candidate?
 
 
peter lin
 


"BAZLEY, Sebastian" <Se...@london.sema.slb.com> wrote:
We could check for the presence of HTMLParser at run-time, and fall back to
JTidy or Regex (or etc.) if not present. 

Some users might not like the fallback behaviour, so if the parser property
were changed to be a list of the acceptable parsers, in order of preference,
we could support as much (or as little) fallback as required.

We should log a warning message if the desired parser is not present (JMeter
already logs an info message when a parser is initialised).

S.

