Posted to commits@commons.apache.org by oh...@apache.org on 2010/09/22 22:08:02 UTC
svn commit: r1000166 - /commons/proper/lang/trunk/src/site/xdoc/userguide.xml
Author: oheger
Date: Wed Sep 22 20:08:02 2010
New Revision: 1000166
URL: http://svn.apache.org/viewvc?rev=1000166&view=rev
Log:
[lang-644] Add documentation for the new concurrent package.
Modified:
commons/proper/lang/trunk/src/site/xdoc/userguide.xml
Modified: commons/proper/lang/trunk/src/site/xdoc/userguide.xml
URL: http://svn.apache.org/viewvc/commons/proper/lang/trunk/src/site/xdoc/userguide.xml?rev=1000166&r1=1000165&r2=1000166&view=diff
==============================================================================
--- commons/proper/lang/trunk/src/site/xdoc/userguide.xml (original)
+++ commons/proper/lang/trunk/src/site/xdoc/userguide.xml Wed Sep 22 20:08:02 2010
@@ -40,6 +40,7 @@ limitations under the License.
<a href="#lang.mutable.">[lang.mutable.*]</a>
<a href="#lang.text.">[lang.text.*]</a>
<a href="#lang.time.">[lang.time.*]</a>
+ <a href="#lang.concurrent.">[lang.concurrent.*]</a>
<br /><br />
</div>
</section>
@@ -228,5 +229,496 @@ public final class ColorEnum extends Enu
<p>New in Lang 2.1 is the DurationFormatUtils class, which provides various methods for formatting durations. </p>
</section>
+ <section name="lang.concurrent.*">
+ <p>
+ In Lang 3.0 a new <em>concurrent</em> package was introduced containing
+ interfaces and classes to support programming with multiple threads. Its
+ aim is to serve as an extension of the <em>java.util.concurrent</em>
+ package of the JDK.
+ </p>
+
+ <subsection name="Concurrent initializers">
+ <p>
+ A group of classes deals with the correct creation and initialization of
+ objects that are accessed by multiple threads. All these classes implement
+ the <code>ConcurrentInitializer</code> interface which provides just a
+ single method:
+ </p>
+ <source><![CDATA[
+public interface ConcurrentInitializer<T> {
+ T get() throws ConcurrentException;
+}
+ ]]></source>
+ <p>
+ A <code>ConcurrentInitializer</code> produces an object. By calling the
+ <code>get()</code> method the object managed by the initializer can be
+ obtained. There are different implementations of the interface available
+ addressing various use cases:
+ </p>
+ <p>
+ The <code>LazyInitializer</code> class can be used to defer the creation of
+ an object until it is actually used. This makes sense, for instance, if the
+ creation of the object is expensive and would slow down application startup
+ or if the object is needed only for special executions. <code>LazyInitializer</code>
+ implements the <em>double-check idiom for an instance field</em> as
+ discussed in Joshua Bloch's "Effective Java", 2nd edition, item 71. It
+ uses <strong>volatile</strong> fields to reduce the amount of
+ synchronization. Note that this idiom is appropriate for instance fields
+ only. For <strong>static</strong> fields there are superior alternatives.
+ </p>
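+ <p>
+ The double-check idiom mentioned above can be sketched with JDK classes
+ alone. This is a minimal illustration, not the Commons Lang source; the
+ class name <code>LazyHolder</code> and the use of a <code>Supplier</code>
+ in place of an abstract <code>initialize()</code> method are assumptions
+ made for this example.
+ </p>
+ <source><![CDATA[
```java
import java.util.function.Supplier;

/** JDK-only sketch of the double-check idiom; LazyHolder is a hypothetical name. */
class LazyHolder<T> {
    /** volatile, so a fully constructed object becomes visible to all threads. */
    private volatile T object;

    /** Hypothetical stand-in for an abstract initialize() method. */
    private final Supplier<T> factory;

    LazyHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T result = object;               // first check, without locking
        if (result == null) {
            synchronized (this) {
                result = object;         // second check, with the lock held
                if (result == null) {
                    object = result = factory.get();
                }
            }
        }
        return result;
    }
}
```
+ ]]></source>
+ <p>
+ Once the object exists, <code>get()</code> only reads the volatile field;
+ the lock is taken solely while the object has not yet been created.
+ </p>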
+ <p>
+ We provide an example use case to demonstrate the usage of this class: A
+ server application uses multiple worker threads to process client requests.
+ If such a request causes a fatal error, an administrator is to be notified
+ using a special messaging service. We assume that the creation of the
+ messaging service is an expensive operation. So it should only be performed
+ if an error actually occurs. Here is where <code>LazyInitializer</code>
+ comes into play. We create a specialized subclass for creating and
+ initializing an instance of our messaging service. <code>LazyInitializer</code>
+ declares an abstract <code>initialize()</code> method which we have to
+ implement to create the messaging service object:
+ </p>
+ <source><![CDATA[
+public class MessagingServiceInitializer
+ extends LazyInitializer<MessagingService> {
+ protected MessagingService initialize() throws ConcurrentException {
+ // Do all necessary steps to create and initialize the service object
+ MessagingService service = ...
+
+ return service;
+ }
+}
+ ]]></source>
+ <p>
+ Now each server thread is passed a reference to a shared instance of our
+ new <code>MessagingServiceInitializer</code> class. The threads run in a
+ loop processing client requests. If an error is detected, the messaging
+ service is obtained from the initializer, and the administrator is
+ notified:
+ </p>
+ <source><![CDATA[
+public class ServerThread implements Runnable {
+ /** The initializer for obtaining the messaging service. */
+ private final ConcurrentInitializer<MessagingService> initializer;
+
+ public ServerThread(ConcurrentInitializer<MessagingService> init) {
+ initializer = init;
+ }
+
+ public void run() {
+ while (true) {
+ try {
+ // wait for request
+ // process request
+ } catch (FatalServerException ex) {
+ // get messaging service
+ try {
+ MessagingService svc = initializer.get();
+ svc.notifyAdministrator(ex);
+ } catch (ConcurrentException cex) {
+ cex.printStackTrace();
+ }
+ }
+ }
+ }
+}
+ ]]></source>
+ <p>
+ The <code>AtomicInitializer</code> class is very similar to
+ <code>LazyInitializer</code>. It serves the same purpose: to defer the
+ creation of an object until it is needed. The internal structure is also
+ very similar. Again there is an abstract <code>initialize()</code> method
+ which has to be implemented by concrete subclasses in order to create and
+ initialize the managed object. Actually, in our example above we can turn
+ the <code>MessagingServiceInitializer</code> into an atomic initializer by
+ simply changing the <strong>extends</strong> declaration to refer to
+ <code>AtomicInitializer&lt;MessagingService&gt;</code> as the superclass.
+ </p>
+ <p>
+ The difference between <code>AtomicInitializer</code> and
+ <code>LazyInitializer</code> is that the former uses classes from the
+ <code>java.util.concurrent.atomic</code> package for its implementation
+ (hence the name). This has the advantage that no synchronization is needed,
+ so the implementation is usually more efficient than that of the
+ <code>LazyInitializer</code> class. However, there is one drawback: Under
+ high load, if multiple threads access the initializer concurrently, it is
+ possible that the <code>initialize()</code> method is invoked multiple
+ times. The class guarantees that <code>get()</code> always returns the
+ same object though; so objects created accidentally are immediately discarded.
+ As a rule of thumb, <code>AtomicInitializer</code> is preferable if the
+ probability of simultaneous access to the initializer is low, e.g. if
+ there are not too many concurrent threads. <code>LazyInitializer</code> is
+ the safer variant, but it has some overhead due to synchronization.
+ </p>
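+ <p>
+ The trade-off described above can be illustrated with a small JDK-only
+ sketch based on <code>AtomicReference</code>. It is not the Commons Lang
+ source; the class name <code>AtomicHolder</code> and the
+ <code>Supplier</code> standing in for <code>initialize()</code> are
+ assumptions made for this example.
+ </p>
+ <source><![CDATA[
```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** JDK-only sketch of lock-free lazy creation; AtomicHolder is a hypothetical name. */
class AtomicHolder<T> {
    private final AtomicReference<T> reference = new AtomicReference<>();

    /** Hypothetical stand-in for an abstract initialize() method. */
    private final Supplier<T> factory;

    AtomicHolder(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T result = reference.get();
        if (result == null) {
            // Several threads may reach this point and each create an object.
            result = factory.get();
            if (!reference.compareAndSet(null, result)) {
                // Another thread stored its object first; ours is discarded.
                result = reference.get();
            }
        }
        return result;
    }
}
```
+ ]]></source>
+ <p>
+ No lock is ever taken, but the factory must tolerate being called more
+ than once; only the first object stored is ever handed out.
+ </p>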
+ <p>
+ Another implementation of the <code>ConcurrentInitializer</code> interface
+ is <code>BackgroundInitializer</code>. It is again an abstract base class
+ with an <code>initialize()</code> method that has to be defined by concrete
+ subclasses. The idea of <code>BackgroundInitializer</code> is that it calls
+ the <code>initialize()</code> method in a separate worker thread. An
+ application creates a background initializer and starts it. Then it can
+ continue with its work while the initializer runs in parallel. When the
+ application needs the results of the initializer it calls its
+ <code>get()</code> method. <code>get()</code> blocks until the initialization
+ is complete. This is useful for instance at application startup. Here
+ initialization steps (e.g. reading configuration files, opening a database
+ connection, etc.) can be run in background threads while the application
+ shows a splash screen and constructs its UI.
+ </p>
+ <p>
+ As a concrete example consider an application that has to read the content
+ of a URL - maybe a page with news - which is to be displayed to the user after
+ login. Because loading the data over the network can take some time, a
+ specialized implementation of <code>BackgroundInitializer</code> can be
+ created for this purpose:
+ </p>
+ <source><![CDATA[
+public class URLLoader extends BackgroundInitializer<String> {
+ /** The URL to be loaded. */
+ private final URL url;
+
+ public URLLoader(URL u) {
+ url = u;
+ }
+
+ protected String initialize() throws ConcurrentException {
+ try {
+ InputStream in = url.openStream();
+ // read content into string
+ ...
+ return content;
+ } catch (IOException ioex) {
+ throw new ConcurrentException(ioex);
+ }
+ }
+}
+ ]]></source>
+ <p>
+ An application creates an instance of <code>URLLoader</code> and starts it.
+ Then it can do other things. When it needs the content of the URL it calls
+ the initializer's <code>get()</code> method:
+ </p>
+ <source><![CDATA[
+URL url = new URL("http://www.application-home-page.com/");
+URLLoader loader = new URLLoader(url);
+loader.start(); // this starts the background initialization
+
+// do other stuff
+...
+// now obtain the content of the URL
+String content;
+try {
+ content = loader.get(); // this may block
+} catch (ConcurrentException cex) {
+ content = "Error when loading URL " + url;
+}
+// display content
+ ]]></source>
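+ <p>
+ The start-then-block behavior shown above can be approximated with the
+ JDK executor framework alone. The following is a minimal sketch, not the
+ Commons Lang source; the class name <code>BackgroundTask</code>, the
+ <code>Callable</code> standing in for <code>initialize()</code>, and the
+ choice to report failures as <code>IllegalStateException</code> are
+ assumptions made for this example.
+ </p>
+ <source><![CDATA[
```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** JDK-only sketch of background initialization; BackgroundTask is a hypothetical name. */
class BackgroundTask<T> {
    /** Hypothetical stand-in for an abstract initialize() method. */
    private final Callable<T> work;
    private Future<T> future;

    BackgroundTask(Callable<T> work) {
        this.work = work;
    }

    /** Starts the work in a worker thread; returns false if it was already started. */
    synchronized boolean start() {
        if (future != null) {
            return false;
        }
        ExecutorService executor = Executors.newSingleThreadExecutor();
        future = executor.submit(work);
        executor.shutdown(); // no further tasks; the submitted one still runs
        return true;
    }

    /** Blocks until the background work has finished and returns its result. */
    T get() {
        Future<T> f;
        synchronized (this) {
            if (future == null) {
                throw new IllegalStateException("start() has not been called");
            }
            f = future;
        }
        try {
            return f.get();
        } catch (ExecutionException ex) {
            throw new IllegalStateException("background work failed", ex.getCause());
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            throw new IllegalStateException("interrupted while waiting", ex);
        }
    }
}
```
+ ]]></source>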
+ <p>
+ Related to <code>BackgroundInitializer</code> is the
+ <code>MultiBackgroundInitializer</code> class. As the name implies, this
+ class can handle multiple initializations in parallel. The basic usage
+ scenario is that a <code>MultiBackgroundInitializer</code> instance is
+ created. Then an arbitrary number of <code>BackgroundInitializer</code>
+ objects is added using the <code>addInitializer()</code> method. When adding
+ an initializer a string has to be provided which is later used to obtain
+ the result for this initializer. When all initializers have been added the
+ <code>start()</code> method is called. This starts processing of all
+ initializers. Later the <code>get()</code> method can be called. It waits
+ until all initializers have finished their initialization. <code>get()</code>
+ returns an object of type <code>MultiBackgroundInitializer.MultiBackgroundInitializerResults</code>.
+ This object provides information about all initializations that have been
+ performed. It can be checked whether a specific initializer was successful
+ or threw an exception. Of course, all initialization results can be queried.
+ </p>
+ <p>
+ With <code>MultiBackgroundInitializer</code> we can extend our example to
+ perform multiple initialization steps. Suppose that in addition to loading
+ a web site we also want to create a JPA entity manager factory and read a
+ configuration file. We assume that corresponding <code>BackgroundInitializer</code>
+ implementations exist. The following example fragment shows the usage of
+ <code>MultiBackgroundInitializer</code> for this purpose:
+ </p>
+ <source><![CDATA[
+MultiBackgroundInitializer initializer = new MultiBackgroundInitializer();
+initializer.addInitializer("url", new URLLoader(url));
+initializer.addInitializer("jpa", new JPAEMFInitializer());
+initializer.addInitializer("config", new ConfigurationInitializer());
+initializer.start(); // start background processing
+
+// do other interesting things in parallel
+...
+// evaluate the results of background initialization
+MultiBackgroundInitializer.MultiBackgroundInitializerResults results =
+ initializer.get();
+String urlContent = (String) results.getResultObject("url");
+EntityManagerFactory emf =
+ (EntityManagerFactory) results.getResultObject("jpa");
+...
+ ]]></source>
+ <p>
+ The child initializers are added to the multi initializer and are assigned
+ a unique name. The object returned by the <code>get()</code> method is then
+ queried for the single results using these unique names.
+ </p>
+ <p>
+ If background initializers - including <code>MultiBackgroundInitializer</code>
+ - are created using the standard constructor, they create their own
+ <code>ExecutorService</code> which is used behind the scenes to execute the
+ worker tasks. It is also possible to pass in an <code>ExecutorService</code>
+ when the initializer is constructed. That way client code can configure
+ the <code>ExecutorService</code> according to its specific needs; for
+ instance, the number of threads available could be limited.
+ </p>
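+ <p>
+ To illustrate the idea of running several named initializations on an
+ executor supplied by the caller, here is a JDK-only sketch. It is not the
+ Commons Lang implementation; the class <code>NamedBackgroundTasks</code>
+ and its methods <code>add()</code>, <code>runAll()</code>,
+ <code>result()</code> and <code>failed()</code> are hypothetical names
+ invented for this example.
+ </p>
+ <source><![CDATA[
```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/** JDK-only sketch; all names in this class are hypothetical. */
class NamedBackgroundTasks {
    private final ExecutorService executor; // configured and owned by the caller
    private final Map<String, Callable<?>> tasks = new LinkedHashMap<>();
    private final Map<String, Object> results = new LinkedHashMap<>();
    private final Map<String, Throwable> failures = new LinkedHashMap<>();

    NamedBackgroundTasks(ExecutorService executor) {
        this.executor = executor;
    }

    void add(String name, Callable<?> task) {
        tasks.put(name, task);
    }

    /** Submits all tasks, then waits until every one of them has finished. */
    void runAll() {
        Map<String, Future<?>> futures = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<?>> entry : tasks.entrySet()) {
            futures.put(entry.getKey(), executor.submit(entry.getValue()));
        }
        for (Map.Entry<String, Future<?>> entry : futures.entrySet()) {
            try {
                results.put(entry.getKey(), entry.getValue().get());
            } catch (ExecutionException ex) {
                failures.put(entry.getKey(), ex.getCause()); // remember the failure
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", ex);
            }
        }
    }

    Object result(String name) {
        return results.get(name);
    }

    boolean failed(String name) {
        return failures.containsKey(name);
    }
}
```
+ ]]></source>
+ <p>
+ Because the executor is passed in, the caller decides how many threads
+ are used, for instance with <code>Executors.newFixedThreadPool(2)</code>,
+ and is also responsible for shutting it down.
+ </p>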
+ </subsection>
+
+ <subsection name="Utility classes">
+ <p>
+ Another group of classes in the new <code>concurrent</code> package offers
+ some generic functionality related to concurrency. There is the
+ <code>ConcurrentUtils</code> class with a bunch of static utility methods.
+ One focus of this class is dealing with exceptions thrown by JDK classes.
+ Many JDK classes of the executor framework throw exceptions of type
+ <code>ExecutionException</code> if something goes wrong. The root cause of
+ these exceptions can also be a runtime exception or even an error. In
+ typical Java programming you often do not want to deal with runtime
+ exceptions directly; rather you let them fall through the hierarchy of
+ method invocations until they reach a central exception handler. Checked
+ exceptions in contrast are usually handled close to their occurrence. With
+ <code>ExecutionException</code> this principle is violated. Because it is a
+ checked exception, an application is forced to handle it even if the cause
+ is a runtime exception. So you typically have to inspect the cause of the
+ <code>ExecutionException</code> and test whether it is a checked exception
+ which has to be handled. If this is not the case, the causing exception can
+ be rethrown.
+ </p>
+ <p>
+ The <code>extractCause()</code> method of <code>ConcurrentUtils</code> does
+ this work for you. It is passed an <code>ExecutionException</code> and tests
+ its root cause. If this is an error or a runtime exception, it is directly
+ rethrown. Otherwise, an instance of <code>ConcurrentException</code> is
+ created and initialized with the root cause. (<code>ConcurrentException</code>
+ is a new exception class in the <code>o.a.c.l.concurrent</code> package.)
+ So if you get such a <code>ConcurrentException</code>, you can be sure that
+ the original cause for the <code>ExecutionException</code> was a checked
+ exception. For users who prefer runtime exceptions in general there is also
+ an <code>extractCauseUnchecked()</code> method which behaves like
+ <code>extractCause()</code>, but returns the unchecked exception
+ <code>ConcurrentRuntimeException</code> instead.
+ </p>
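+ <p>
+ The decision logic described above can be sketched with JDK classes only.
+ This is an illustration of the rule, not the Commons Lang source; the
+ class <code>CauseHandling</code> and the exception
+ <code>WrappedCheckedException</code> (playing the role of
+ <code>ConcurrentException</code>) are hypothetical names, and returning
+ <code>null</code> when there is no cause is an assumption of this sketch.
+ </p>
+ <source><![CDATA[
```java
import java.util.concurrent.ExecutionException;

/** JDK-only sketch; all names in this class are hypothetical. */
class CauseHandling {
    /** Checked wrapper playing the role of ConcurrentException in this sketch. */
    static class WrappedCheckedException extends Exception {
        private static final long serialVersionUID = 1L;

        WrappedCheckedException(Throwable cause) {
            super(cause);
        }
    }

    /**
     * Rethrows runtime exceptions and errors directly; wraps a checked cause.
     * Returns null if there is nothing to extract (an assumption of this sketch).
     */
    static WrappedCheckedException extractCause(ExecutionException ex) {
        if (ex == null || ex.getCause() == null) {
            return null;
        }
        Throwable cause = ex.getCause();
        if (cause instanceof RuntimeException) {
            throw (RuntimeException) cause;
        }
        if (cause instanceof Error) {
            throw (Error) cause;
        }
        return new WrappedCheckedException(cause);
    }
}
```
+ ]]></source>
+ <p>
+ A caller that receives a non-null return value therefore knows that the
+ original cause was a checked exception.
+ </p>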
+ <p>
+ In addition to the <code>extractCause()</code> methods there are
+ corresponding <code>handleCause()</code> methods. These methods extract the
+ cause of the passed in <code>ExecutionException</code> and throw the
+ resulting <code>ConcurrentException</code> or <code>ConcurrentRuntimeException</code>.
+ This makes it easy to transform an <code>ExecutionException</code> into a
+ <code>ConcurrentException</code>, while unchecked exceptions and errors are rethrown unchanged:
+ </p>
+ <source><![CDATA[
+Future<Object> future = ...;
+try {
+ Object result = future.get();
+ ...
+} catch (ExecutionException eex) {
+ ConcurrentUtils.handleCause(eex);
+}
+ ]]></source>
+ <p>
+ There is also some support for the concurrent initializers introduced in
+ the previous subsection. The <code>initialize()</code> method is passed a
+ <code>ConcurrentInitializer</code> object and returns the object created by
+ this initializer. It is null-safe: if the initializer is <code>null</code>,
+ <code>null</code> is returned. The <code>initializeUnchecked()</code>
+ method works analogously, but a <code>ConcurrentException</code> thrown by the
+ initializer is rethrown as a <code>ConcurrentRuntimeException</code>. This
+ is especially useful if the specific <code>ConcurrentInitializer</code>
+ does not throw checked exceptions. Using this method the code for requesting
+ the object of an initializer becomes less verbose. The direct invocation
+ looks as follows:
+ </p>
+ <source><![CDATA[
+ConcurrentInitializer<MyClass> initializer = ...;
+try {
+ MyClass obj = initializer.get();
+ // do something with obj
+} catch (ConcurrentException cex) {
+ // exception handling
+}
+ ]]></source>
+ <p>
+ Using the <code>initializeUnchecked()</code> method, this becomes:
+ </p>
+ <source><![CDATA[
+ConcurrentInitializer<MyClass> initializer = ...;
+MyClass obj = ConcurrentUtils.initializeUnchecked(initializer);
+// do something with obj
+ ]]></source>
+ <p>
+ Another utility class deals with the creation of threads. When using the
+ <em>Executor</em> framework new in JDK 1.5 the developer usually does not
+ have to care about creating threads; the executors create the threads they
+ need on demand. However, sometimes it is desired to set some properties of
+ the newly created worker threads. This is possible through the
+ <code>ThreadFactory</code> interface; an implementation of this interface
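+ has to be created and passed to an executor at creation time. Apart from the
+ simple default factory returned by <code>Executors.defaultThreadFactory()</code>,
+ the JDK does not provide a configurable implementation of
+ <code>ThreadFactory</code>, so one has to start from scratch.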
+ </p>
+ <p>
+ With <code>BasicThreadFactory</code> Commons Lang has an implementation of
+ <code>ThreadFactory</code> that works out of the box for many common use
+ cases. For instance, it is possible to set a naming pattern for the new
+ threads, set the daemon flag and a priority, or install a handler for
+ uncaught exceptions. Instances of <code>BasicThreadFactory</code> are
+ created and configured using the nested <code>Builder</code> class. The
+ following example shows a typical usage scenario:
+ </p>
+ <source><![CDATA[
+BasicThreadFactory factory = new BasicThreadFactory.Builder()
+ .namingPattern("worker-thread-%d")
+ .daemon(true)
+ .uncaughtExceptionHandler(myHandler)
+ .build();
+ExecutorService exec = Executors.newSingleThreadExecutor(factory);
+ ]]></source>
+ <p>
+ The nested <code>Builder</code> class defines some methods for configuring
+ the new <code>BasicThreadFactory</code> instance. <code>BasicThreadFactory</code>
+ objects are immutable, so these attributes cannot be changed later. The naming pattern
+ is a string which can be passed to <code>String.format()</code>. The
+ placeholder <em>%d</em> is replaced by an increasing counter value. An
+ instance can wrap another <code>ThreadFactory</code> implementation; this
+ is achieved by calling the builder's <code>wrappedFactory()</code> method.
+ This factory is then used for creating new threads; after that the specific
+ attributes are applied to the new thread. If no wrapped factory is set, the
+ default factory provided by the JDK is used.
+ </p>
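+ <p>
+ The wrapping behavior described above can be illustrated with a small
+ JDK-only sketch. It is not the <code>BasicThreadFactory</code> source; the
+ class name <code>NamingThreadFactory</code> and its constructor parameters
+ are assumptions made for this example, and only the naming pattern and
+ the daemon flag are covered.
+ </p>
+ <source><![CDATA[
```java
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicLong;

/** JDK-only sketch of a configurable thread factory; the class name is hypothetical. */
class NamingThreadFactory implements ThreadFactory {
    /** The wrapped factory that actually creates the threads. */
    private final ThreadFactory wrapped = Executors.defaultThreadFactory();
    private final AtomicLong counter = new AtomicLong();
    private final String namingPattern;
    private final boolean daemon;

    NamingThreadFactory(String namingPattern, boolean daemon) {
        this.namingPattern = namingPattern;
        this.daemon = daemon;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Let the wrapped factory create the thread, then apply our attributes.
        Thread thread = wrapped.newThread(task);
        thread.setName(String.format(namingPattern, counter.incrementAndGet()));
        thread.setDaemon(daemon);
        return thread;
    }
}
```
+ ]]></source>
+ <p>
+ All fields are final, so a factory configured this way cannot be changed
+ after construction; the counter is the only state that advances.
+ </p>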
+ </subsection>
+
+ <subsection name="Synchronization objects">
+ <p>
+ The <code>concurrent</code> package also provides some support for specific
+ synchronization problems with threads.
+ </p>
+ <p>
+ <code>TimedSemaphore</code> allows restricted access to a resource in a
+ given time frame. Similar to a semaphore, a number of permits can be
+ acquired. What is new is the fact that the permits available are related to
+ a given time unit. For instance, the timed semaphore can be configured to
+ allow 10 permits in a second. Now multiple threads access the semaphore
+ and call its <code>acquire()</code> method. The semaphore keeps track of
+ the number of granted permits in the current time frame. Only 10 calls are
+ allowed; if there are further callers, they are blocked until the time
+ frame (one second in this example) is over. Then all blocked threads are
+ released, and the counter of granted permits is reset to 0. So the game
+ can start anew.
+ </p>
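+ <p>
+ The mechanism described above can be sketched with JDK classes only. This
+ is a simplified illustration, not the <code>TimedSemaphore</code> source;
+ the class name <code>SimpleTimedSemaphore</code> and the methods
+ <code>startTimer()</code>, <code>tryAcquire()</code> and
+ <code>endOfPeriod()</code> are hypothetical names chosen for this example.
+ </p>
+ <source><![CDATA[
```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** JDK-only sketch of a time-frame based semaphore; all names are hypothetical. */
class SimpleTimedSemaphore {
    private final int limit;
    private int acquired; // permits granted in the current time frame
    private ScheduledExecutorService timer;

    SimpleTimedSemaphore(int limit) {
        this.limit = limit;
    }

    /** Starts a timer thread that ends the current time frame periodically. */
    synchronized void startTimer(long period, TimeUnit unit) {
        timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleAtFixedRate(this::endOfPeriod, period, period, unit);
    }

    /** Blocks while the limit for the current time frame is exhausted. */
    synchronized void acquire() throws InterruptedException {
        while (acquired >= limit) {
            wait();
        }
        acquired++;
    }

    /** Non-blocking variant: returns false instead of waiting. */
    synchronized boolean tryAcquire() {
        if (acquired >= limit) {
            return false;
        }
        acquired++;
        return true;
    }

    /** Resets the counter of granted permits and wakes up all waiting threads. */
    synchronized void endOfPeriod() {
        acquired = 0;
        notifyAll();
    }

    /** Stops the timer thread, if one was started. */
    synchronized void shutdown() {
        if (timer != null) {
            timer.shutdownNow();
        }
    }
}
```
+ ]]></source>
+ <p>
+ The timer thread is the reason why such an object needs an explicit
+ <code>shutdown()</code>: without it the scheduler would keep running.
+ </p>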
+ <p>
+ What are use cases for <code>TimedSemaphore</code>? One example is to
+ artificially limit the load produced by multiple threads. Consider a batch
+ application accessing a database to extract statistical data. The
+ application runs multiple threads which issue database queries in parallel
+ and perform some calculation on the results. If the database to be processed
+ is huge and is also used by a production system, multiple factors have to be
+ balanced: On the one hand, the statistical evaluation
+ should not take too long. Therefore you will probably use a larger number
+ of threads because for most of its lifetime a thread will just wait for the
+ database to return query results. On the other hand, the load on the
+ database generated by all these threads should be limited so that the
+ responsiveness of the production system is not affected. With a
+ <code>TimedSemaphore</code> object this can be achieved. The semaphore can
+ be configured to allow e.g. 100 queries per second. After these queries
+ have been sent to the database the threads have to wait until the second is
+ over - then they can query again. By fine-tuning the limit enforced by the
+ semaphore a good balance between performance and database load can be
+ established. It is even possible to change the number of available permits
+ at runtime. So this number can be reduced during the typical working hours
+ and increased at night.
+ </p>
+ <p>
+ The following code examples demonstrate parts of the implementation of such
+ a scenario. First the batch application has to create an instance of
+ <code>TimedSemaphore</code> and to initialize its properties with default
+ values:
+ </p>
+ <source><![CDATA[
+TimedSemaphore semaphore = new TimedSemaphore(1, TimeUnit.SECONDS, 100);
+ ]]></source>
+ <p>
+ Here we specify that the semaphore should allow 100 permits in one second.
+ This is effectively the limit of database queries per second in our
+ example use case. Next the server threads issuing database queries and
+ performing statistical operations can be initialized. They are passed a
+ reference to the semaphore at creation time. Before they execute a query
+ they have to acquire a permit.
+ </p>
+ <source><![CDATA[
+public class StatisticsTask implements Runnable {
+ /** The semaphore for limiting database load. */
+ private final TimedSemaphore semaphore;
+
+ public StatisticsTask(TimedSemaphore sem, Connection con) {
+ semaphore = sem;
+ ...
+ }
+
+ /**
+ * The main processing method. Executes queries and evaluates their results.
+ */
+ public void run() {
+ try {
+ while (!isDone()) {
+ semaphore.acquire(); // enforce the load limit
+
+ executeAndEvaluateQuery();
+ }
+ } catch (InterruptedException iex) {
+ // restore the interrupt flag and let the task end
+ Thread.currentThread().interrupt();
+ }
+ }
+}
+ ]]></source>
+ <p>
+ The important line here is the call to <code>semaphore.acquire()</code>.
+ If the limit of permits for the current time frame has not yet been reached,
+ the call returns immediately. Otherwise, it blocks until the end of the
+ time frame. The last piece missing is a scheduler service which adapts the
+ number of permits allowed by the semaphore according to the time of day. We
+ assume that this service is pretty simple and knows only two different time
+ slots: working shift and night shift. The service is triggered periodically.
+ It then determines the current time slot and configures the timed semaphore
+ accordingly.
+ </p>
+ <source><![CDATA[
+public class SchedulerService {
+ /** The semaphore for limiting database load. */
+ private final TimedSemaphore semaphore;
+ ...
+
+ /**
+ * Configures the timed semaphore based on the current time of day. This
+ * method is called periodically.
+ */
+ public void configureTimedSemaphore() {
+ int limit;
+ if (isWorkshift()) {
+ limit = 50; // low database load
+ } else {
+ limit = 250; // high database load
+ }
+
+ semaphore.setLimit(limit);
+ }
+}
+ ]]></source>
+ <p>
+ With the <code>setLimit()</code> method the number of permits allowed for
+ a time frame can be changed. There are some other methods for querying the
+ internal state of a timed semaphore. Also some statistical data is available,
+ e.g. the average number of <code>acquire()</code> calls per time frame. When
+ a timed semaphore is no longer needed, its <code>shutdown()</code> method has
+ to be called.
+ </p>
+ </subsection>
+ </section>
</body>
</document>