Posted to users@tapestry.apache.org by Nicolas Bouillon <ni...@bouillon.net> on 2013/04/05 22:08:58 UTC

[T5] Avoid logs in error for wrong urls and missing components ?

Dear all,

I have been working on an e-commerce website in Tapestry 5 for a couple of
years, and the site is constantly evolving: we keep adding features and
changing things here and there.

I monitor the application logs closely, but I am very annoyed by robots that
remember URLs that are no longer valid.

For example, they remember (or follow old links to) pages that used to be
coded in PHP and whose URLs contained special characters not allowed in
Tapestry URLs. Or the spiders try to access the URL of a Grid pager event,
but the Grid component is no longer there (or has a different name). Each
of those hits generates a log.error message, which buries the important
error messages in noise. (I know, "Cool URIs don't change", but
for page events it can be quite hard to keep old URLs...)

The error log looks something like this:
org.apache.tapestry5.ioc.util.UnknownValueException: Component
product/domain/PriceList does not contain embedded component 'v3grid'.

Ignoring "UnknownValueException" entirely in the log appender seems too
broad, because it might hide real mistakes in the web application.

Is there a way to avoid this kind of behavior? How do you handle error logs
from your applications?

Thanks.

Nicolas.

Re: [T5] Avoid logs in error for wrong urls and missing components ?

Posted by Ivan Khalopik <ik...@gmail.com>.
What about configuring robots.txt?





-- 
BR
Ivan

Re: [T5] Avoid logs in error for wrong urls and missing components ?

Posted by Nicolas Bouillon <ni...@bouillon.net>.
Thanks for your thoughts. They lead me to the following idea for
solving my problem:

I have already contributed a custom RequestExceptionHandler using

binder.bind(RequestExceptionHandler.class, CustomRequestExceptionHandler.class)
      .withId("CustomRequestExceptionHandler");

This is the service responsible for my error log message
(org.apache.tapestry5.internal.services.DefaultRequestExceptionHandler#handleRequestException).

In my custom RequestExceptionHandler, I could check the user agent against
known bots and, instead of rendering the exception page, give them a
404 response code so that they stop crawling the page (and skip the
logger.error call at the top of the handleRequestException method).
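
An editorial sketch of the user-agent check described above, not code from
the thread. It uses only the JDK so that it can be run on its own. The class
name BotErrorPolicy, the list of substrings, and the 404/500 split are
assumptions made for illustration. Inside a custom handleRequestException,
the header could presumably be read with request.getHeader("User-Agent") and
the status sent with response.sendError(...), but that wiring is left out
here.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tapestry: decides how a failed request should be answered.
final class BotErrorPolicy {

    // Assumed substrings that often appear in crawler User-Agent headers; tune them for your traffic.
    private static final Pattern BOT_PATTERN =
            Pattern.compile("bot|crawler|spider|slurp", Pattern.CASE_INSENSITIVE);

    private BotErrorPolicy() {
    }

    // Returns true when the User-Agent header looks like a crawler; a missing header is treated as not a bot.
    static boolean isBot(String userAgent) {
        return userAgent != null && BOT_PATTERN.matcher(userAgent).find();
    }

    // 404 for crawlers (no error log needed), 500 for everyone else (log and render the exception page).
    static int statusFor(String userAgent) {
        return isBot(userAgent) ? 404 : 500;
    }
}
```

Substring matching on the User-Agent is only a heuristic: a crawler can send
a browser-like header, and an unusual browser could match by accident, so it
may be safer to log such hits at a lower level than to drop them silently.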

Best regards.




Re: [T5] Avoid logs in error for wrong urls and missing components ?

Posted by Muhammad Gelbana <m....@gmail.com>.
When you create a Tapestry project using Maven's archetype, it generates the
request filter below (slightly modified) to log how long each request
took.

// Service building
public RequestFilter buildTimingFilter(final Logger log) {
    return new RequestFilter() {
        @Override
        public boolean service(Request request, Response response, RequestHandler handler)
                throws IOException {
            long startTime = System.currentTimeMillis();
            try {
                // The responsibility of a filter is to invoke the corresponding method
                // in the handler. When you chain multiple filters together, each filter
                // receives a handler that is a bridge to the next filter.
                return handler.service(request, response);
            } finally {
                long elapsed = System.currentTimeMillis() - startTime;
                // Only warn about requests that took 10 seconds or longer.
                if (TimeUnit.MILLISECONDS.toSeconds(elapsed) >= 10) {
                    log.warn(String.format("Request time: %d ms", elapsed));
                }
            }
        }
    };
}

// Service contribution
public void contributeRequestHandler(OrderedConfiguration<RequestFilter> configuration,
        @Local RequestFilter filter) {
    // Each contribution to an ordered configuration has a name. When necessary, you may
    // set constraints to precisely control the invocation order of the contributed filter
    // within the pipeline.
    configuration.add("Timing", filter);
}

From there, if you find a pattern in the requests coming from bots, you can
drop those requests if that suits you. That way the bots will also learn
that the links they are requesting no longer exist and will
eventually stop bothering you, if they are smart enough!
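
An editorial sketch of what "finding a pattern" could look like, not code
from the thread. It matches request paths rather than user agents and uses
only the JDK so that it runs on its own. The class name RetiredPaths and both
patterns are assumptions: the first covers the old PHP pages mentioned in the
original question, the second assumes that events of the removed 'v3grid'
component show up in the path as ".v3grid" followed by ".", ":", "/" or the
end of the path. A request filter could presumably test request.getPath()
with it and answer with response.sendError(404, ...) instead of delegating to
the handler.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tapestry: recognises request paths that are known to be gone.
final class RetiredPaths {

    // Assumed patterns: legacy PHP pages and events of the removed 'v3grid' component.
    private static final Pattern[] RETIRED = {
        Pattern.compile("\\.php$", Pattern.CASE_INSENSITIVE),
        Pattern.compile("\\.v3grid([.:/]|$)", Pattern.CASE_INSENSITIVE)
    };

    private RetiredPaths() {
    }

    // Returns true when the path (without query string) matches any retired pattern.
    static boolean isRetired(String path) {
        if (path == null) {
            return false;
        }
        for (Pattern pattern : RETIRED) {
            if (pattern.matcher(path).find()) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on the path has the advantage that it does not depend on who sends
the request, so a person following a stale bookmark gets the same 404 as a
crawler, while unexpected failures on current URLs still reach the error log.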

Regards

