Posted to dev@freemarker.apache.org by Daniel Dekany <dd...@freemail.hu> on 2017/02/05 23:40:15 UTC

[FM3] Redesigned TemplateLoader interface

In FM2 we had some problems with the TemplateLoader interface:

- The major problem is that to load a template you need to do
  multiple round trips to the storage mechanism: check if the template
  "file" exists, then get its last modification date, then read its
  content. This is particularly problematic for databases, but often
  even HTTP could pack these into a single round trip.

- When the <#ftl encoding=...> header disagrees with the actual
  charset used for reading the template, the whole template has to
  be re-read (I/O!). That's because we get a Reader, so we can't
  rewind the InputStream behind it and start reading it again with
  the new charset.

- To detect changes one can only use the last modification date (not a
  revision number or hash). This is especially problematic if
  templates can change so quickly that the clock may not tick
  between two changes.

- TemplateLoader can't return meta-info like the output format (MIME
  type basically) of the template. Some storages could do that.

- Some storages (like databases) support some kind of atomicity and
  transaction isolation, but the TemplateLoader mechanism doesn't allow
  you to utilize these.

- The "template source" is tricky for implementers to understand, as
  it serves both as a handle that can be closed and as a long-lived
  cache key component.

I propose a totally new TemplateLoader design to counter all these
problems in FM3. I describe the idea mostly in code below. Please share
your thoughts! (I also considered backporting this to FM2, but
it was not feasible.)


/**
 * This is the one that replaces the TemplateLoader of FM2.
 */
public interface TemplateLoader {

    /**
     * Creates a new session, or returns {@code null} if the template loader implementation doesn't support sessions.
     * See {@link TemplateLoaderSession} for more information about sessions.
     */
    TemplateLoaderSession createSession();
    
    /**
     * Loads the template content together with meta-data such as the version (usually the last modification time),
     * optionally conditionally. Note how all these operations (existence check, up-to-date check, opening for reading)
     * were put into one atomic unit. This allows you to utilize the capabilities of the storage mechanism to spare round
     * trips, or even to add atomicity guarantees.
     *
     * @param name
     *            The name (template root directory relative path) of the template; same as in FM2.
     * @param ifSourceDiffersFrom
     *            If we only want to load the template if its source differs from this. {@code null} if you want the
     *            template to be loaded unconditionally. If this is {@code null} then the
     *            {@code ifVersionDiffersFrom} parameter must be {@code null} too. See
     *            {@link TemplateLoadingResult#getSource()} for more about sources.
     * @param ifVersionDiffersFrom
     *            If we only want to load the template if its version (which is usually the last modification time)
     *            differs from this. {@code null} if {@code ifSourceDiffersFrom} is {@code null}, or if the backing
     *            storage from which the {@code ifSourceDiffersFrom} template source comes doesn't store a version.
     *            See {@link TemplateLoadingResult#getVersion()} for more about versions.
     * 
     * @return Not {@code null}.
     */
    TemplateLoadingResult load(String name, TemplateLoadingSource ifSourceDiffersFrom, Serializable ifVersionDiffersFrom,
            TemplateLoaderSession session) throws IOException;
    
    /**
     * Invoked by {@link Configuration#clearTemplateCache()} to instruct this template loader to throw away its current
     * state (some kind of cache usually) and start afresh. For most {@link TemplateLoader} implementations this does
     * nothing.
     */
    void resetState();

}

/**
 * Return value of {@link TemplateLoader#load(String, TemplateLoadingSource, Serializable, TemplateLoaderSession)}
 */
public final class TemplateLoadingResult {
 
    public static final TemplateLoadingResult NOT_FOUND = new TemplateLoadingResult(
            TemplateLoadingResultStatus.NOT_FOUND);
    public static final TemplateLoadingResult NOT_MODIFIED = new TemplateLoadingResult(
            TemplateLoadingResultStatus.NOT_MODIFIED);

    /**
     * Creates an instance with status {@link TemplateLoadingResultStatus#OPENED}, for a storage mechanism that
     * naturally returns the template content as a sequence of {@code char}-s as opposed to a sequence of {@code byte}-s.
     * This is the case for example when you store the template in a database in a varchar or CLOB. Do <em>not</em> use
     * this constructor for stores that naturally return binary data instead (like files, class loader resources,
     * BLOB-s, etc.), because using this constructor will disable FreeMarker's charset selection mechanism.
     */
    public TemplateLoadingResult(TemplateLoadingSource source, Serializable version, Reader reader,
            TemplateConfiguration templateConfiguration) { ... }

    /**
     * Creates an instance with status {@link TemplateLoadingResultStatus#OPENED}, for a storage mechanism that
     * naturally returns the template content as a sequence of {@code byte}-s as opposed to a sequence of {@code char}-s.
     * This is the case for example when you store the template in a file, classpath resource, or BLOB. Do <em>not</em>
     * use this constructor for stores that naturally return text instead (like database varchar and CLOB columns).
     */
    public TemplateLoadingResult(TemplateLoadingSource source, Serializable version, InputStream inputStream,
            TemplateConfiguration templateConfiguration) { ... }

    /**
     * Returns non-{@code null} exactly if {@link #getStatus()} is {@link TemplateLoadingResultStatus#OPENED} and the
     * backing store mechanism returns content as {@code byte}-s, as opposed to as {@code char}-s. The return value is
     * always the same instance, no matter when and how many times this method is called.
     */
    public InputStream getInputStream() {
        return inputStream;
    }

    /**
     * Tells what kind of result this is; see the documentation of {@link TemplateLoadingResultStatus}.
     */
    public TemplateLoadingResultStatus getStatus() {
        return status;
    }

    /**
     * Same as "template source" in FM2, but it's simpler, as it only focuses on the usage as part of the cache key,
     * and can't be closed. If you aren't familiar with FM2 template sources, see {@link TemplateLoadingSource}
     * below.
     */
    public TemplateLoadingSource getSource() {
        return source;
    }

    /**
     * This replaces the lastModified of FM2, and is more flexible as it can be a revision number, a cryptographic hash,
     * etc. Only set if the result status is {@link TemplateLoadingResultStatus#OPENED} and the backing storage stores
     * such information. Version objects are compared with each other with their {@link Object#equals(Object)} method.
     */
    public Serializable getVersion() {
        return version;
    }

    /**
     * Similar to {@link #getInputStream()}, but used when the backing storage mechanism returns content as
     * {@code char}-s, as opposed to as {@code byte}-s.
     */
    public Reader getReader() {
        return reader;
    }

    /**
     * If {@link #getStatus()} is {@link TemplateLoadingResultStatus#OPENED}, and the template loader stores such
     * information (which is rare) then it returns the {@link TemplateConfiguration} applicable to the template,
     * otherwise it returns {@code null}. If there are {@link TemplateConfiguration}-s coming from other
     * sources, such as from {@link Configuration#getTemplateConfigurations()}, this won't replace them, but will be
     * merged with them, with properties coming from the returned {@link TemplateConfiguration} having the highest
     * priority.
     */
    public TemplateConfiguration getTemplateConfiguration() {
        return templateConfiguration;
    }
    
}

/**
 * Used for the value of {@link TemplateLoadingResult#getStatus()}.
 */
public enum TemplateLoadingResultStatus {

    /**
     * The template with the requested name doesn't exist (not to be confused with "wasn't accessible due to error").
     */
    NOT_FOUND,

    /**
     * If the template was found, but its source and version are the same as those provided to
     * {@link TemplateLoader#load(String, TemplateLoadingSource, Serializable, TemplateLoaderSession)} (from a cache
     * presumably), so its content wasn't opened for reading.
     */
    NOT_MODIFIED,

    /**
     * If the template was found and its content is ready for reading.
     */
    OPENED
    
}

/**
 * The point of {@link TemplateLoadingSource}-s is that with their {@link #equals(Object)} method we can tell if two cache
 * entries were generated from the same physical resource or not. Comparing the template names isn't enough, because a
 * {@link TemplateLoader} may use some kind of fallback mechanism, such as delegating to other {@link TemplateLoader}-s
 * until the template is found. For example, if we have two {@link FileTemplateLoader}-s with different physical root
 * directories, both can contain {@code "foo/bar.ftl"}, but obviously the two files aren't the same.
 */
public interface TemplateLoadingSource extends Serializable {
    // Empty
}
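
A minimal sketch of what such a source could look like for a file-based loader. The names LoadingSourceSketch and FileLoadingSource, and the choice of comparing the root directory plus the template name, are assumptions made for illustration only; the nested TemplateLoadingSource is a stand-in for the proposed interface so that the snippet compiles on its own.

```java
import java.io.Serializable;
import java.util.Objects;

class LoadingSourceSketch {

    /** Stand-in for the proposed marker interface, so the sketch is self-contained. */
    interface TemplateLoadingSource extends Serializable { }

    /** Hypothetical source of a file-based loader: root directory plus template name. */
    static final class FileLoadingSource implements TemplateLoadingSource {
        private static final long serialVersionUID = 1L;

        private final String rootDirectory;
        private final String templateName;

        FileLoadingSource(String rootDirectory, String templateName) {
            this.rootDirectory = Objects.requireNonNull(rootDirectory);
            this.templateName = Objects.requireNonNull(templateName);
        }

        /** Two sources are equal only if they point to the same physical file. */
        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof FileLoadingSource)) return false;
            FileLoadingSource that = (FileLoadingSource) o;
            return rootDirectory.equals(that.rootDirectory)
                    && templateName.equals(that.templateName);
        }

        @Override
        public int hashCode() {
            return Objects.hash(rootDirectory, templateName);
        }
    }

    public static void main(String[] args) {
        FileLoadingSource a = new FileLoadingSource("/templates/a", "foo/bar.ftl");
        FileLoadingSource b = new FileLoadingSource("/templates/b", "foo/bar.ftl");
        // Same template name, different root directory:
        System.out.println(a.equals(b));
    }
}
```

Because the template name alone is not part of the identity, the two "foo/bar.ftl" files from different roots end up as different cache key components.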

/**
 * Stores shared state between {@link TemplateLoader} operations that are executed close to each other in the same
 * thread. For example, a {@link TemplateLoader} that reads from a database might want to store the database
 * connection in it for reuse. The goal of sessions is mostly to increase performance. However, because a
 * {@link TemplateCache#getTemplate(String, java.util.Locale, Object, String, boolean)} call is executed inside a single
 * session, sessions can also be utilized to ensure that the template lookup (see {@link TemplateLookupStrategy})
 * happens on a consistent view (a snapshot) of the backing storage, if the backing storage mechanism supports such
 * a thing.
 * 
 * <p>
 * The {@link TemplateLoaderSession} implementation is (usually) specific to the {@link TemplateLoader}
 * implementation. If your {@link TemplateLoader} implementation can't take advantage of sessions, you don't have to
 * implement this interface, just return {@code null} for {@link TemplateLoader#createSession()}.
 * 
 * <p>
 * {@link TemplateLoaderSession}-s should be lazy, that is, creating an instance should be very fast and should not
 * cause I/O. Only when (and if ever) the shared resource stored in the session is needed for the first time should the
 * shared resource be initialized.
 *
 * <p>
 * {@link TemplateLoaderSession}-s need not be thread safe.
 */
public interface TemplateLoaderSession {

    /**
     * Closes this session, freeing any resources it holds. Further operations involving this session should fail, with
     * the exception of {@link #close()} itself, which should be silently ignored.
     */
    public void close() throws IOException;
    
    public boolean isClosed();

}
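
To illustrate the "lazy" requirement, here is a rough sketch of a session that only opens its shared resource on first use. LazySessionSketch, LazySession, and getResource() are hypothetical names, not part of the proposal; the nested TemplateLoaderSession merely mirrors the interface above so that the snippet compiles on its own. A real database-backed loader would presumably hand in something that opens a connection.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

class LazySessionSketch {

    /** Stand-in mirroring the proposed TemplateLoaderSession interface. */
    interface TemplateLoaderSession {
        void close() throws IOException;
        boolean isClosed();
    }

    /** Opens the shared resource on first use only; deliberately not thread safe. */
    static final class LazySession<R extends Closeable> implements TemplateLoaderSession {
        private final Supplier<R> opener;
        private R resource;
        private boolean closed;

        /** Creating the session does no I/O; it only remembers how to open the resource. */
        LazySession(Supplier<R> opener) {
            this.opener = opener;
        }

        /** Returns the shared resource, opening it if this is the first request. */
        R getResource() {
            if (closed) {
                throw new IllegalStateException("Session is already closed");
            }
            if (resource == null) {
                resource = opener.get();
            }
            return resource;
        }

        @Override
        public void close() throws IOException {
            if (closed) {
                return; // Repeated close() calls are silently ignored
            }
            closed = true;
            if (resource != null) {
                resource.close();
            }
        }

        @Override
        public boolean isClosed() {
            return closed;
        }
    }

    public static void main(String[] args) throws IOException {
        Closeable fakeConnection = () -> System.out.println("resource closed");
        LazySession<Closeable> session = new LazySession<Closeable>(() -> {
            System.out.println("resource opened");
            return fakeConnection;
        });
        session.getResource(); // Opens the resource
        session.getResource(); // Reuses it
        session.close();
    }
}
```

If no load() call in the session ever needs the resource, it is never opened, so creating and closing such a session stays cheap.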


So, the template loading sequence for a template not yet in the cache is something like this:

  session = templateLoader.createSession();

  // TemplateLookupStrategy kicks in... of course it's a loop in reality:
  res = templateLoader.load("foo_en_US.ftl", null, null, session);
  if (res.getStatus() == NOT_FOUND) res = templateLoader.load("foo_en.ftl", null, null, session);
  if (res.getStatus() == NOT_FOUND) res = templateLoader.load("foo.ftl", null, null, session);
  if (res.getStatus() == NOT_FOUND) throw new TemplateNotFoundException();

  Reader reader;
  if (res.getReader() != null) {
      reader = res.getReader(); // Charset is not relevant
  } else {
      reader = new InputStreamReader(res.getInputStream(), figureOutCharsetFromCfgAndSuch());
  }

  template = new Template(reader, ...);
  reader.close();

  if (session != null) session.close(); // createSession() may return null

For cached templates, after the update delay you fill those two
null parameters with the source and version from the cache entry, and
then you skip most of the rest if res.getStatus() == NOT_MODIFIED.
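
The conditional reload can be sketched roughly as below. This is not the proposed API itself: ConditionalLoadSketch, InMemoryLoader, Result, and Status are simplified stand-ins invented for the example, the session parameter is left out, the source is just the template name, and the version is a revision counter rather than a timestamp.

```java
import java.io.Reader;
import java.io.Serializable;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class ConditionalLoadSketch {

    enum Status { NOT_FOUND, NOT_MODIFIED, OPENED }

    /** Simplified stand-in for TemplateLoadingResult (Reader-based content only). */
    static final class Result {
        final Status status;
        final Serializable source;
        final Serializable version;
        final Reader reader;

        Result(Status status, Serializable source, Serializable version, Reader reader) {
            this.status = status;
            this.source = source;
            this.version = version;
            this.reader = reader;
        }
    }

    /** Hypothetical in-memory loader whose version is a per-template revision number. */
    static final class InMemoryLoader {
        private final Map<String, String> contents = new HashMap<>();
        private final Map<String, Long> revisions = new HashMap<>();

        /** Stores the content and bumps the revision of the template. */
        void put(String name, String content) {
            contents.put(name, content);
            revisions.merge(name, 1L, Long::sum);
        }

        /** Existence check, up-to-date check and opening, all in a single call. */
        Result load(String name, Serializable ifSourceDiffersFrom, Serializable ifVersionDiffersFrom) {
            String content = contents.get(name);
            if (content == null) {
                return new Result(Status.NOT_FOUND, null, null, null);
            }
            Serializable source = name; // The name is enough to identify the resource here
            Serializable version = revisions.get(name);
            if (ifSourceDiffersFrom != null
                    && ifSourceDiffersFrom.equals(source)
                    && Objects.equals(ifVersionDiffersFrom, version)) {
                return new Result(Status.NOT_MODIFIED, null, null, null);
            }
            return new Result(Status.OPENED, source, version, new StringReader(content));
        }
    }

    public static void main(String[] args) {
        InMemoryLoader loader = new InMemoryLoader();
        loader.put("foo.ftl", "Hello");
        Result cached = loader.load("foo.ftl", null, null); // First, unconditional load
        System.out.println(loader.load("foo.ftl", cached.source, cached.version).status);
        loader.put("foo.ftl", "Hello again"); // The template changes
        System.out.println(loader.load("foo.ftl", cached.source, cached.version).status);
    }
}
```

The cache only has to keep the source and version of the last OPENED result, and the loader decides in one call whether the content needs to be opened again.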

I would also note that I have simplified the case of the
InputStreamReader above. In reality the InputStream has to support
mark/reset (markSupported() returns true): you add a mark at position 0,
then drop the mark during parsing once you know that no
<#ftl encoding=...> can come anymore, so you are safe. If
<#ftl encoding=...> interferes, we have to reset() the stream, recreate
the InputStreamReader, and parse again. (We don't re-read from the
TemplateLoader, we just re-parse.)
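
The mark/reset idea can be sketched with plain JDK classes as below. It is heavily simplified and not how the parser would actually do it: "parsing" is just reading everything into a String, the header is detected with a naive regular expression, and the mark is never dropped early. CharsetRetrySketch, decode(), declaredCharset(), and the readAheadLimit parameter are names made up for the example.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharsetRetrySketch {

    // Naive header detection; the real parser would find the header while parsing
    private static final Pattern FTL_ENCODING = Pattern.compile("<#ftl\\s+encoding=\"([^\"]+)\"");

    /** Returns the charset named in the header, or null if there is no such header. */
    static Charset declaredCharset(String text) {
        Matcher m = FTL_ENCODING.matcher(text);
        return m.find() ? Charset.forName(m.group(1)) : null;
    }

    /** Reads all remaining characters (stand-in for parsing the template). */
    static String readAll(Reader reader) throws IOException {
        StringBuilder sb = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = reader.read(buf)) != -1) {
            sb.append(buf, 0, n);
        }
        return sb.toString();
    }

    /**
     * Decodes with initialCharset; if the header names another charset, rewinds the same
     * stream and decodes again. readAheadLimit must be larger than the template size,
     * because this sketch reads to the end before it decides.
     */
    static String decode(InputStream rawIn, Charset initialCharset, int readAheadLimit)
            throws IOException {
        InputStream in = rawIn.markSupported() ? rawIn : new BufferedInputStream(rawIn);
        in.mark(readAheadLimit); // Mark at position 0
        // The first reader isn't closed, as that would close the underlying stream too
        String text = readAll(new InputStreamReader(in, initialCharset));
        Charset declared = declaredCharset(text);
        if (declared != null && !declared.equals(initialCharset)) {
            in.reset(); // Back to position 0, without going to the storage again
            text = readAll(new InputStreamReader(in, declared));
        }
        return text;
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = "<#ftl encoding=\"UTF-8\">caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(decode(new ByteArrayInputStream(bytes), StandardCharsets.ISO_8859_1, 1 << 16));
    }
}
```

The point is that the second decoding pass works on bytes already fetched, so the storage behind the TemplateLoader is contacted only once.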

-- 
Thanks,
 Daniel Dekany


Re: [FM3] Redesigned TemplateLoader interface

Posted by Daniel Dekany <dd...@freemail.hu>.
I have replaced the old TemplateLoader interface with the new one in
the freemarker-3 branch, and "migrated" all the included
TemplateLoader-s and tests.

More eyes see more, so check it out if you can, and tell me if you see
anything to improve.

As we don't have a DatabaseTemplateLoader yet, one of the important
points of this new TemplateLoader hasn't been demonstrated yet. A good
way of spotting the rough edges would be if someone implemented one,
for example.

Thanks!


-- 
Thanks,
 Daniel Dekany