Posted to users@jena.apache.org by "Chaudhuri, Rajiv" <ra...@pearson.com> on 2013/12/11 20:47:20 UTC
How to set query timeout and threds kill implicity
Hi,
Our application, which runs on a Tomcat server, queries a Fuseki
TDB store over HTTP.
We have observed that the thread count of the Fuseki process increases
over time.
I know there might be a code issue and that resources should be closed properly.
*But we would like to achieve the same through configuration on the Fuseki
server, i.e. if a query takes too long (which means that the resource was
not closed by the application server), there should be a timeout and the Fuseki
server should implicitly kill the thread.*
The impact of the growing thread count is that the Fuseki server stops
responding, the process eventually gets killed automatically, and we have
to restart the Fuseki server.
--
Regards,
*Rajiv Chaudhuri*
*Cell(O)# 2013642598*
*Cell(P)# 6093563706*
Re: How to set query timeout and threds kill implicity
Posted by Andy Seaborne <an...@apache.org>.
Rajiv,
I have failed to reproduce this. I ran a Fuseki server and sent thousands
of ASK queries to it. No threads were created.
I used quite simple ASK queries - maybe there is complexity in queryStr?
The results of an ASK query are about 400 bytes on the wire, including
the HTTP header - that fits in an IP packet and so in TCP buffering. The
server will respond to a request, finish the HTTP response and send it
without any way for the client to leave the connection hanging around.
But your server is under a different load so I would speculate that the
change you made perturbed the situation but is not the root cause. If
so, problems may arise later.
Could you please point a profiler at the server and report what the
additional threads are doing? jvisualvm will do this to a running
server (thread dump under the threads section) as will YourKit, which is
easier to read afterwards.
Andy
Re: How to set query timeout and threds kill implicity
Posted by Andy Seaborne <an...@apache.org>.
On 12/12/13 18:41, Chaudhuri, Rajiv wrote:
> QueryExecution qexec =
> QueryExecutionFactory.sparqlService(queryServiceIRI, queryStr);
> try {
> boolean result = qexec.execAsk();
> if (! result) {
> }
> } finally {
> qexec.close();
> }
>
>
Thanks.
Recorded as JENA-610
https://issues.apache.org/jira/browse/JENA-610
Andy
Re: How to set query timeout and threds kill implicity
Posted by "Chaudhuri, Rajiv" <ra...@pearson.com>.
QueryExecution qexec =
    QueryExecutionFactory.sparqlService(queryServiceIRI, queryStr);
try {
    boolean result = qexec.execAsk();
    if (! result) {
    }
} finally {
    qexec.close();
}
Re: How to set query timeout and threds kill implicity
Posted by Andy Seaborne <an...@apache.org>.
On 12/12/13 18:15, Chaudhuri, Rajiv wrote:
> The issue got resolved. execAsk does not release the thread even if we
> close the query execution; as an alternative we have used execSelect and
> check whether data exist or not.
Then that's a bug we need to fix.
Which format are you using in the request to the server?
Andy
Re: How to set query timeout and threds kill implicity
Posted by "Chaudhuri, Rajiv" <ra...@pearson.com>.
The issue got resolved. execAsk does not release the thread even if we
close the query execution; as an alternative we now use execSelect and
check whether any data exists.
Re: How to set query timeout and threds kill implicity
Posted by Andy Seaborne <an...@apache.org>.
Setting the timeout is a good idea but there are a couple of things in
Rajiv's message that suggest that is not all that is going on.
> We have observed that thread count for the Fuseki process is increasing
> with time.
> I know there might be code issue and resources should be
> closed properly
Are we talking about SELECT queries?
How many concurrent users are there on the application side?
And are the results quite large (larger than TCP buffering)?
The only way I can think of, at the moment, is if the client is keeping
the TCP connection open and the TCP connection is filling up, causing
backup at the server. Given that you are calling it from Tomcat, is the
Tomcat application code long-lived?
If you can attach a Java debugger to the server, could you tell me what
the threads are doing (or not doing)?
If the client is consuming the results very slowly, or jumping out of
the ResultSet loop prematurely, then the server has no way of knowing
whether the client application code has really finished or is going to
come back later.
The pattern:
    QueryExecution exec = QueryExecutionFactory.create(q,ds) ;
    try {
        ResultSet rs = exec.execSelect() ;
        while(rs.hasNext()) {
            return ...
        }
    } finally {
        exec.close();
    }
catches that.
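As a rough illustration of why the try/finally pattern covers the "jumping out of the loop" case, here is a minimal sketch that uses only the Java standard library. FakeExecution is a hypothetical stand-in for a query execution that holds a connection until close() is called; it is not a Jena class, and the names below are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class EarlyReturnDemo {
    // Returns the first row and leaves the remaining rows unread.
    static String firstRow(FakeExecution exec) {
        // try-with-resources behaves like try { ... } finally { exec.close(); }
        try (FakeExecution e = exec) {
            Iterator<String> rs = e.execSelect();
            while (rs.hasNext()) {
                return rs.next(); // early exit: close() still runs
            }
            return null; // no rows at all
        }
    }

    public static void main(String[] args) {
        FakeExecution exec = new FakeExecution("row1", "row2", "row3");
        String first = firstRow(exec);
        System.out.println(first + " closed=" + exec.isClosed());
    }
}

// Hypothetical stand-in for a query execution; NOT part of Jena.
class FakeExecution implements AutoCloseable {
    private final List<String> rows;
    private boolean closed = false;

    FakeExecution(String... rows) {
        this.rows = Arrays.asList(rows);
    }

    Iterator<String> execSelect() {
        return rows.iterator();
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true; // a real client would release its connection here
    }
}
```

The point of the sketch is only that the close happens on every exit path, including the early return, so the other end is not left guessing whether the caller will come back for more results.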
Another approach is
    ResultSet rs = exec.execSelect() ;
    rs = ResultSetFactory.copyResults(rs) ;
which forces the ARQ code to pull in all the results, which then
releases the connection back to the local connection pool.
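The idea behind copying the results can be sketched with plain Java collections. This is not how ResultSetFactory.copyResults is implemented; copyAll is a hypothetical helper that only shows the effect of draining a streaming iterator into memory before the application starts working on the rows.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class CopyResultsDemo {
    // Reads every remaining row into memory, so that nothing depends on
    // the underlying stream afterwards.
    static <T> List<T> copyAll(Iterator<T> streaming) {
        List<T> copy = new ArrayList<>();
        while (streaming.hasNext()) {
            copy.add(streaming.next());
        }
        return copy;
    }

    public static void main(String[] args) {
        // Stand-in for a result stream backed by a network connection.
        Iterator<String> streaming = Arrays.asList("a", "b", "c").iterator();
        List<String> inMemory = copyAll(streaming);
        // The stream is exhausted; the copy can be consumed at any pace.
        System.out.println(inMemory + " exhausted=" + !streaming.hasNext());
    }
}
```

The trade-off is memory: the whole result set is held on the client, which is fine for modest results but not for very large ones.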
It does not matter whether we're talking about streaming or about Fuseki
getting all the query results and then sending them (which it doesn't
do - it streams when possible), but it does make this vector more likely.
To defend the server, Brian's suggestion of a reverse proxy gives you a
control point. Rather than put every mechanism needed into Fuseki,
keep it focused on being a SPARQL engine and use well-known,
well-engineered other components.
The default configuration of Fuseki is to use Jetty's
BlockingChannelConnector but it is possible to completely take control of
the Jetty configuration with --jetty-config.
(SelectChannelConnector might have been better but it has been reported
to be unstable on OS X for Fuseki's usage patterns)
It looks like an (unintentional) "Slow HTTP" DOS attack but on the
response consumption side, not the request sending.
https://community.qualys.com/blogs/securitylabs/2011/11/02/how-to-protect-against-slow-http-attacks
The client OS will still be ack'ing the TCP connection and may do so
forever for a server-style app such as one running in Tomcat.
Setting the query timeout may reduce the general effect but it will not
free up threads. A query timeout is a graceful abort of the query - the
query gets a chance to clean up and that needs the query execution in
the server getting called.
Actually killing threads in Java without the thread's involvement is bad.
http://docs.oracle.com/javase/1.5.0/docs/guide/misc/threadPrimitiveDeprecation.html
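To make the difference between a graceful abort and killing a thread concrete, here is a minimal sketch of cooperative cancellation using only the Java standard library. It is not Fuseki or ARQ code; Worker and runWithTimeout are hypothetical names, and the sleep stands in for one step of query evaluation.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class CooperativeTimeoutDemo {
    // Hypothetical worker: checks a cancel flag between units of work.
    static class Worker implements Runnable {
        final AtomicBoolean cancelled = new AtomicBoolean(false);
        volatile boolean cleanedUp = false;

        @Override
        public void run() {
            try {
                while (!cancelled.get()) {
                    Thread.sleep(5); // stands in for one step of query evaluation
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                cleanedUp = true; // the worker releases its own resources
            }
        }
    }

    // Starts the worker, requests cancellation after timeoutMs, and reports
    // whether the worker got to run its cleanup.
    static boolean runWithTimeout(long timeoutMs) {
        Worker worker = new Worker();
        Thread t = new Thread(worker);
        t.start();
        try {
            Thread.sleep(timeoutMs);
            worker.cancelled.set(true); // a request, not a kill
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return worker.cleanedUp;
    }

    public static void main(String[] args) {
        System.out.println("cleanedUp=" + runWithTimeout(50));
    }
}
```

The flag is only noticed when the worker gets back to its check. A thread that is blocked elsewhere, for example writing to a client that has stopped reading, never reaches that check, which is consistent with a query timeout not freeing such threads.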
Andy
Re: How to set query timeout and threds kill implicity
Posted by Brian McBride <br...@epimorphics.com>.
Hi Rajiv,
You can find how to set query timeouts in the Fuseki documentation at [1]
To set a server-wide timeout:
    [] rdf:type fuseki:Server ;
       # Server-wide context parameters can be given here.
       # For example, to set query timeouts on a server-wide basis:
       # Format 1: "1000" -- 1 second timeout
       # Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of the query.
       # See javadoc for ARQ.queryTimeout
       # ja:context [ ja:cxtName "arq:queryTimeout" ; ja:cxtValue "10000" ] ;
Uncomment the last line and set the value you require per the comments
above.
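The two value formats in the comments can be read as follows. This is an illustrative sketch only, not Jena's own parsing code; TimeoutSpec and its field names are hypothetical, and treating a single value as "no separate first-result limit" is an assumption based on the comments above.

```java
class TimeoutSpec {
    // Milliseconds allowed until the first result, or -1 when the value
    // uses format 1 and gives no separate first-result limit (assumption).
    final long firstResultMs;
    // Format 1: the single timeout. Format 2: the timeout for the rest
    // of the query after the first result.
    final long timeoutMs;

    TimeoutSpec(long firstResultMs, long timeoutMs) {
        this.firstResultMs = firstResultMs;
        this.timeoutMs = timeoutMs;
    }

    // Accepts "1000" (format 1) or "10000,60000" (format 2).
    static TimeoutSpec parse(String value) {
        String[] parts = value.trim().split(",");
        if (parts.length == 1) {
            return new TimeoutSpec(-1, Long.parseLong(parts[0].trim()));
        }
        if (parts.length == 2) {
            return new TimeoutSpec(Long.parseLong(parts[0].trim()),
                                   Long.parseLong(parts[1].trim()));
        }
        throw new IllegalArgumentException("Unrecognised timeout value: " + value);
    }

    public static void main(String[] args) {
        TimeoutSpec one = parse("1000");
        TimeoutSpec two = parse("10000,60000");
        System.out.println(one.firstResultMs + " " + one.timeoutMs);
        System.out.println(two.firstResultMs + " " + two.timeoutMs);
    }
}
```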
Do you have a reverse proxy configured to limit the rate at which
requests are going to Fuseki? Can you confirm that all your requests
are going through that proxy?
Brian
[1]
http://jena.apache.org/documentation/serving_data/#running-a-fuseki-server