You are viewing a plain text version of this content. The canonical link for it is here.
Posted to log4j-dev@logging.apache.org by "Remko Popma (JIRA)" <ji...@apache.org> on 2014/09/18 19:35:34 UTC
[jira] [Updated] (LOG4J2-506) Binary logging interface
[ https://issues.apache.org/jira/browse/LOG4J2-506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Remko Popma updated LOG4J2-506:
-------------------------------
Labels: binary (was: )
> Binary logging interface
> ------------------------
>
> Key: LOG4J2-506
> URL: https://issues.apache.org/jira/browse/LOG4J2-506
> Project: Log4j 2
> Issue Type: New Feature
> Components: Core, Layouts
> Reporter: Gregg Donovan
> Assignee: Remko Popma
> Priority: Minor
> Labels: binary
>
> From [discussion|https://groups.google.com/forum/#!topic/mechanical-sympathy/7HdRCacJCDk] with Remko Pompa on the mechanical-sympathy list:
> {quote}We -- Etsy.com -- have a request logging system that we've begun to outgrow due to increased log volume and increased size of the messages we need to log. The messages are Thrift-encoded search requests from 1-100K in size. We use the logs to feed our homegrown traffic replay tools for offline performance tests and to reproduce issues found in production (GC, perf, etc.).
> For our next gen request logging system we would like the following:
> Low contention -- Minimal lock contention despite log writes from many (10s to 100s) application threads.
> Async writes -- File I/O should be outside of the actual request path.
> Binary logging -- Our current system base64 encodes the payload as a String and logs it via log4j 1.2. ;-) This creates more garbage than we would like and makes the data written twice as large. It was good enough for a while, but now we need something more efficient...
> Compression -- Either per-message or per block of messages would be great. Incurring large periodic CPU penalties from logrotate + gzip, as we do now, is less than ideal.
> Rotation -- Ideally based on file size, but time-based would also do.
> {quote}
> Remko:
> {quote}
> You can do this with log4j-2.0. The current version (log4j-2.0-beta-9) out of the box gives you 3 out of 5: Async Loggers use the Disruptor under the hood -> low contention & async writes, and RollingRandomAccessFileAppender (the fastest appender) can do rotation. This appender can also optionally do compression in a separate thread, but this happens on roll-over only, not continuously. (Still, I would guess that this is more CPU-efficient than using logrotate + gzip in a separate process...)
> The binary logging can be achieved with a bit of custom code.
> First, your objects would need to provide some way to turn themselves into bytes. I doubt that you want to use java standard serialization :-), so how about this as an interface to log4j that your objects would implement:
> {quote}
> {code:java}
> public interface BinaryMessage extends org.apache.logging.log4j.message.Message {
> int getLength(); // return byte count
> void writeInto(ByteBuffer buffer); // write length bytes from the current buffer position
> }
> {code}
> {quote}
> Log4j2 Loggers already have methods that take a Message argument (instead of a String), so in your application code you could just write
> {quote}
> {code:java}
> Logger logger = LogManager.getLogger(MyClass.class); // usually cached in static field
> logger.info(myObject); // where myObject implements BinaryMessage
> {code}
> {quote}
> Second, in the log4j2 framework, the Layout is responsible for converting a LogEvent into bytes, so you need a BinaryLayout that recognizes BinaryMessages. The layout could optionally output the timestamp (long) and log level (byte), followed by the message size (int) and the message bytes (byte[]).
> Using the log4j2 plugin model, an implementation could look something like below.
> {quote}
> {code:java}
> @Plugin(name = "BinaryLayout", category = "Core", elementType = "layout", printObject = true)
> public final class BinaryLayout extends AbstractLayout<LogEvent> {
> private static enum TimestampStrategy {
> WRITE {
> public void write(LogEvent event, ByteBuffer buffer) {
> buffer.putLong(event.getMillis());
> }
> },
> DONT {
> public void write(LogEvent event, ByteBuffer buffer) {
> }
> };
> abstract void write(LogEvent event, ByteBuffer buffer);
> }
> private static enum LevelStrategy {
> WRITE {
> public void write(LogEvent event, ByteBuffer buffer) {
> buffer.put((byte) event.getLevel().intLevel());
> }
> },
> DONT {
> public void write(LogEvent event, ByteBuffer buffer) {
> }
> };
> abstract void write(LogEvent event, ByteBuffer buffer);
> }
> private final TimestampStrategy timestamp;
> private final LevelStrategy level;
> private final int prefixLength;
> public BinaryLayout(boolean timestamp, boolean level) {
> this.timestamp = timestamp ? TimestampStrategy.WRITE : TimestampStrategy.DONT;
> this.level = level ? LevelStrategy.WRITE : LevelStrategy.DONT;
> final int stampBytes = timestamp ? 8 : 0;
> final int levelBytes = level ? 1 : 0;
> prefixLength = 4 + stampBytes + levelBytes;
> }
> @Override
> public byte[] toByteArray(LogEvent event) {
> final BinaryMessage msg = (BinaryMessage) event.getMessage();
> final byte[] result = new byte[msg.getLength() + prefixLength];
> final ByteBuffer buffer = ByteBuffer.wrap(result);
> timestamp.write(event, buffer);
> level.write(event, buffer);
> buffer.putInt(msg.getLength());
> msg.writeInto(buffer);
> return result;
> }
> @Override
> public LogEvent toSerializable(LogEvent event) {
> return event;
> }
> @Override
> public String getContentType() {
> return "application/octet-stream";
> }
> @Override
> public Map<String, String> getContentFormat() {
> return new HashMap<String, String>();
> }
> @PluginFactory
> public static BinaryLayout createLayout(
> @PluginAttribute("writeTimestamp") final String writeTimestamp,
> @PluginAttribute("writeLevel") final String writeLevel) {
> final boolean timestamp = Booleans.parseBoolean(writeTimestamp, false);
> final boolean level = Booleans.parseBoolean(writeLevel, false);
> return new BinaryLayout(timestamp, level);
> }
> }
> {code}
> {quote}
> Unfortunately the log4j API does not allow us to be completely garbage-free, you still need to allocate the byte[] array and ByteBuffer for each event...
> (The LogEvents are reused Disruptor ring buffer slots if you make all loggers AsyncLoggers.)
> Still, this should be a lot better (garbage-wise) than base64-encoding a String, and then turning that String into bytes again...
> Here is the log4j2.xml configuration that would put it all together (modify the "packages" attribute at the top to match your package):
> (Don't forget to set system property -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector to make all loggers AsyncLoggers.)
> {quote}
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <Configuration status="WARN" packages="package.of.binary.layout.class">
> <Appenders>
> <RollingRandomAccessFile name="rraf" fileName="binary.log"
> filePattern="logs/$${date:yyyy-MM}/app-%d{MM-dd-yyyy}-%i.log.gz"
> append="false" immediateFlush="false">
> <BinaryLayout writeTimestamp="true" writeLevel="true" />
> <Policies>
> <TimeBasedTriggeringPolicy />
> <SizeBasedTriggeringPolicy size="250 MB"/>
> </Policies>
> </RollingRandomAccessFile>
> </Appenders>
>
> <Loggers>
> <Root level="trace" includeLocation="false">
> <AppenderRef ref="rraf"/>
> </Root>
> </Loggers>
> </Configuration>
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscribe@logging.apache.org
For additional commands, e-mail: log4j-dev-help@logging.apache.org