
Log4j Best Practices

Julius Davies, June 9th, 2008

Before You Do Anything Else

Take a look at this write-up by Anton Chuvakin.

Introduction

Logs must be thought out. An application's log design should probably be signed-off by a software architect. Enhancements to the log design, as the system evolves, should be encouraged.

Checklist:


Log Design

There are only two things about a log that can be controlled by the application: the individual lines, and the separation into different log files.


Greppable Logs

What is the end result that we want?

We want a greppable log. This means log design is focused entirely on a single line of log. A SINGLE LINE. Remember that. A log entry SHOULD NEVER span multiple lines. The only exception is Java stacktraces.

We want a greppable log. In other words, what will we be asking of the log? Probably questions like:

  • When did user 555444, branch 10, last log in?
  • Is our incoming connection from <soap.partner.com> receiving any messages?
  • How many times did our web-services partner go dead yesterday?
  • What's the longest time required to make a transaction in September?
  • Did any transaction fail to complete?

Make sure your log can answer questions like these.
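For example (the variables here, like msgId and elapsedMillis, are hypothetical), one key=value style line per event makes each of those questions a single grep:

log.info( "LOGIN user=555444 branch=10" );
log.info( "SOAP-IN src=soap.partner.com msgId=" + msgId );
log.warn( "PARTNER-DOWN dst=soap.partner.com" );
log.info( "billPayment took " + elapsedMillis + "ms" );
log.error( "billPayment timed out after " + elapsedMillis + "ms" );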

Also, if your application EVER reports a problem, make sure your application logs it. It's not enough to just show a webpage or send an email. You need the log (and hopefully the stacktrace/register-dump/diagnostics).

Four problems to avoid

  1. Log is missing necessary information.

We cannot answer, "When did user 555444, branch 10, last log in?" if the log file doesn't ever contain the information.

  2. Log unsuitable for grep because of redundant information.

We cannot answer, "Is <soap.partner.com> recieving any messages?" if the log file mentions <soap.partner.com> for every line of log, even lines unrelated to <soap.partner.com>.

  3. Information split across more than one line (bad for grep).

Answering, "What's the longest time required to make a billpayment in September?" or "Did any billpayments fail to complete" is impossible with grep if "billPayment start" and "billPayment done" are separate log statements. Answering might even be impossible with any tool if other log statements are present inbetween the "start" and "finish."

Instead, just log a single statement of "billPayment took 285ms" or "billPayment timed out".

  4. Error reported to user, but not logged.

Don't write code like this! ;-)

try {
   // set some variable
} catch ( Exception e ) {
   // do nothing
}

// if variable is null, tell user there was a problem
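A corrected sketch (loadValue() and the messages are placeholder names): log the problem where it is caught, then report it to the user.

String value = null;
try {
   value = loadValue();                  // hypothetical call that might throw
} catch ( Exception e ) {
   log.error( "loadValue() failed", e ); // the stacktrace goes to the log
}
if ( value == null ) {
   // tell the user there was a problem -- the diagnostics are already in the log
}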


Separate files for different areas of interest

Log design is primarily focused on the individual line of log. However, partitioning the lines into various files can make useful information even more accessible.

So far my approach is to try and segregate quieter events (those that occur seldom) from noisier events. For a 100 txn/second application I wrote, I created three main logs: a message log, a connection log (disconnects/reconnects), and a stacktrace log.

I don't worry about redundancy. I might log the same line in several files.


Timestamps

Logs must contain timestamps, with timezone, to the millisecond or nanosecond.

2004-10-04/09:03:00.141141000/PDT*bad event
2004-10-04/09:03:20.217588221/PDT good event
2004-10-04/09:03:26.006912999/PDT another good event

Why timezone?

Answer: maybe half our servers are in Toronto. If our sysadmins decide every computer should use GMT (or PST), great, but let's give the sysadmins the option! Timezone also helps around daylight-savings changes.

Why nanosecond? Why not to the second?

Answer: we want to minimize the number of lines occurring with identical timestamps. This is very important if you want to split logs into separate files covering different areas of interest (e.g. connection-history.log and messages.log). If your timestamps are to the nanosecond, you can re-merge these files together and usually not lose the order of events ("sort -m").

Personally, for Java I prefer to use "yyyy-MM-dd/HH:mm:ss.SSS/zzz". (I would use nanoseconds if Java made them available.)
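A minimal sketch of that pattern in plain Java (SimpleDateFormat only goes down to milliseconds, since java.util.Date carries no nanoseconds); the log4j configuration later in this article plugs the identical pattern into %d{...}:

import java.text.SimpleDateFormat;
import java.util.Date;

public class TimestampDemo {
  public static void main( String[] args ) {
    SimpleDateFormat fmt = new SimpleDateFormat( "yyyy-MM-dd/HH:mm:ss.SSS/zzz" );
    // prints something like 2008-06-09/09:03:00.141/PDT
    System.out.println( fmt.format( new Date() ) );
  }
}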

High Volume Logging: It is probably impossible for any logging system to guarantee logs get written chronologically (in the same order as the original events). In an extremely heavily loaded system you will sometimes see out-of-order events in your log, usually by a few milliseconds. If you only log to the second, you cannot reestablish the chronology. Log analysis tools also need to be aware of this and not blow up when fed data that is not strictly chronological.


Always Log To Local Disk

Applications should not try to log directly to the network. Don't be tempted by the syslog dream, and be careful not to accidentally (or overtly) log to a shared directory. So many times people say, "The network is not reliable." This means it's unsuitable for direct logging. Local disk is your loyal and trustworthy friend.

We do want our logs to be ultimately centralized and archived, but that task is best accomplished with cron or some other batch-oriented background process using a reliable transport (e.g. ftp, scp, nfs, smbfs). Also, be sure to compress your logs before they hit the network.

Keep a week or two of the archived files in both places (local-disk and archive server), both to make trouble shooting easier for sysadmins, and to give you some breathing room should your archival process break down temporarily. Having a couple weeks of logs on the local-disk can be quite convenient, especially when faced with a question like, "Did this problem start in the last few days?" The sysadmin will be happier if she doesn't have to always ssh into the central archive just to look at yesterday's logs.


Use Log4j 1.2.x (or a worthy competitor)

There are several logging frameworks available for java.

  • Log4j 1.2.x (Note: the 1.3.x and 2.x versions are not production quality yet - avoid for now)
  • logKit
  • For those that can't decide: commons-logging, or slf4j

I have worked with several applications that use home-brew logging, java.util.logging and log4j. In my opinion log4j is the easiest to program with, the easiest to configure, and the most flexible. I recommend log4j, and strongly discourage all others. (Note: this was written before  was available).

Commons-logging and slf4j are interesting. They are both generalizations of logging, capable of supporting any underlying implementation. If you cannot decide on an implementation, or if you think you might want to change your implementation in the future, then commons-logging or slf4j are appropriate. Commons-logging is also good to know about, because many 3rd party tools (JBoss, HttpClient, Tomcat, etc...) use it.

I recommend skipping the generalization if possible. You will have more control over log4j if you use it directly. You cannot execute log4j's PropertyConfigurator.configure( "/etc/log4j.properties" ) at your leisure if using "commons-logging".


Programming with log4j

Every class must contain this line:

private static Logger log = Logger.getLogger( MyClass.class );

EVERY CLASS! I've configured IntelliJ to automatically generate that code whenever I create a new class. Some people recommend against making that line static, but Logger.getLogger() is synchronized, so non-static can degrade performance. It's easier just to use "static" and not worry about it. Normally I don't try to avoid synchronized calls, but this really is one of those rare places in Java these days, even with Java6, where synchronized can hurt, since every object's constructor will now contend for the global log4j lock.

DO NOT: create a single class "com.foo.app.Log" and run all logging calls through that! You are severely impairing log4j's flexibility if you do that.

In log4j, all loggers are named. Dots in the name "." act as hierarchy dividers. Through configuration, you can turn off "com.foo", turn on INFO for "com.foo.abc" and turn on DEBUG for "com.foo.abc.Message". But you can only do this if every logger is named after the class calling it.
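As a sketch of that hierarchy in action, here is the programmatic equivalent (the log4j.properties form appears in the Configuration section below; the package names are this article's examples):

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class LevelDemo {
  public static void main( String[] args ) {
    Logger.getLogger( "com.foo" ).setLevel( Level.OFF );               // turn off "com.foo"
    Logger.getLogger( "com.foo.abc" ).setLevel( Level.INFO );          // INFO for "com.foo.abc"
    Logger.getLogger( "com.foo.abc.Message" ).setLevel( Level.DEBUG ); // DEBUG for "com.foo.abc.Message"
  }
}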

4 Types of Logging

  • log.error()
  • log.warn()
  • log.info()
  • log.debug()

There is also a "log.fatal()" but this is more intended for command-line tools that should stay quiet except for dire problems.
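A rough sketch of how the four levels might line up in practice (the messages are borrowed from this article's examples, except the debug one, which is made up):

log.error( "billPayment timed out" );                                 // something failed and needs attention
log.warn( "404-GET 86ms KEY=GOOGLE DST=http://www.google.ca/blah" );  // suspicious, but the application carries on
log.info( "billPayment took 285ms" );                                 // normal operational events
log.debug( "parsed 12 fields from the incoming request" );            // developer detail, usually off in production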

Logging Exceptions

If the Exception is significant, at the very minimum, always do this:

try {
  // Code that might throw an exception...
} catch (Exception e) {
  log.error(e, e);
}

If you're sending the stacktraces to a separate log file (more about this below), you can do this:

try {
  // Code that might throw an exception...
} catch (Exception e) {
  log.error(e);
  stacktraceLog.error(e, e);
}

 

This way the error is in both files, but the actual stacktrace itself is only in the "stacktraces" logger.

Note to log4j developers: Please add the timestamp and level to each line of the stacktrace to keep it grep-friendly. I don't like doing "grep -C 15" (e.g. show 15 lines before and after)!

Performance Tweak

Before assembling your fancy log statement you may want to check log.isInfoEnabled() or log.isDebugEnabled(). This way you save some cpu cycles:

// Uses many cpu cycles:
String fancyLogString = buildFancyLogString();

// Does nothing because "debug" is currently disabled in log4j.properties:
log.debug( fancyLogString );

Better:

if ( log.isDebugEnabled() )
{
  // The expensive call only runs when "debug" is enabled in log4j.properties:
  String fancyLogString = buildFancyLogString();
  log.debug( fancyLogString );
}

Separate files for different areas of interest

For each file, you'll need a separate Logger.

private static Logger log = Logger.getLogger( MyClass.class );
private static Logger connectionsLog = Logger.getLogger( "connections." + MyClass.class.getName() );
private static Logger stacktracesLog = Logger.getLogger( "stacktraces." + MyClass.class.getName() );
private static Logger httpLog = Logger.getLogger( "http." + MyClass.class.getName() );


Configuration

Log4j's main disadvantage is that the documentation is confusing and hard to navigate. If you are brave, the javadoc is handy. So is the manual.

I find log4j configuration much easier to deal with once you have an example:

log.dir=/var/log/foo
rrd.dir=${log.dir}/rrd
datestamp=yyyy-MM-dd/HH:mm:ss.SSS/zzz
roll.pattern.hourly=.yyyy-MM-dd.HH
roll.pattern.daily=.yyyy-MM-dd

# catchAll.log -- Default catch-all.
log4j.rootLogger=DEBUG, defaultLog
log4j.appender.defaultLog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.defaultLog.DatePattern=${roll.pattern.daily}
log4j.appender.defaultLog.File=${log.dir}/catchAll.log
log4j.appender.defaultLog.layout=org.apache.log4j.PatternLayout
log4j.appender.defaultLog.layout.ConversionPattern=%d{${datestamp}} [%t] %-5p %m%n

# foo.log
log4j.logger.com.foo.shared=INFO,fooLog
log4j.logger.com.foo.abc=INFO,fooLog
log4j.additivity.com.foo.shared=false
log4j.additivity.com.foo.abc=false
log4j.appender.fooLog=org.apache.log4j.DailyRollingFileAppender
log4j.appender.fooLog.File=${log.dir}/foo.log
log4j.appender.fooLog.DatePattern=${roll.pattern.hourly}
log4j.appender.fooLog.layout=org.apache.log4j.PatternLayout
log4j.appender.fooLog.layout.ConversionPattern=%d{${datestamp}}%p%m%n

# fooConnections.log
log4j.logger.connections.com.foo=INFO,fooConnections
log4j.additivity.connections=false
log4j.appender.fooConnections=org.apache.log4j.DailyRollingFileAppender
log4j.appender.fooConnections.File=${log.dir}/fooConnections.log
log4j.appender.fooConnections.DatePattern=${roll.pattern.daily}
log4j.appender.fooConnections.layout=org.apache.log4j.PatternLayout
log4j.appender.fooConnections.layout.ConversionPattern=%d{${datestamp}}%p%m%n

# fooStacktraces.log
log4j.logger.stacktraces.com.foo=INFO,stacktraces
log4j.additivity.stacktraces=false
log4j.appender.stacktraces=org.apache.log4j.DailyRollingFileAppender
log4j.appender.stacktraces.File=${log.dir}/fooStacktraces.log
log4j.appender.stacktraces.DatePattern=${roll.pattern.daily}
log4j.appender.stacktraces.layout=org.apache.log4j.PatternLayout
log4j.appender.stacktraces.layout.ConversionPattern=%d{${datestamp}}%p%m%n

Notice the use of property substitution! Very handy! Also, since I use single-character level strings (* instead of ERROR), I don't have to put spaces before and after %p.

Loading Configuration

Log4j will automatically load the configuration if it is stored in a file called "log4j.properties" and is present on the classpath (e.g. WEB-INF/classes/log4j.properties).

I don't like that approach and prefer to load the configuration explicitly by calling:

PropertyConfigurator.configure( Config.ETC + "/log4j.properties" );

This way I can reload the configuration at any time as long as my application is still running. I like to add a button to an administrative jsp, "Reload Log4J".
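A minimal sketch of such a reload hook (the servlet wiring and the hard-coded path are assumptions for illustration; adapt the path to wherever your log4j.properties lives):

import java.io.IOException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.apache.log4j.PropertyConfigurator;

// Hypothetical admin hook: POST to this servlet (e.g. from a "Reload Log4J" button
// on an administrative jsp) to re-read log4j.properties while the app keeps running.
public class ReloadLog4jServlet extends HttpServlet {
  protected void doPost( HttpServletRequest request, HttpServletResponse response ) throws IOException {
    PropertyConfigurator.configure( "/etc/log4j.properties" );
    response.getWriter().println( "Log4J configuration reloaded." );
  }
}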

Dynamic Log File Location

Many people complain that Log4j forces you to hard-code the location where your logs will be kept. Actually, it is possible to dynamically choose the log-file location, especially if you use the ${log.dir} property substitution technique above. Here's how:

String dynamicLog = // log directory somehow chosen...

// java.util.Properties cannot load a file through its constructor,
// so read the base configuration first, then overwrite "log.dir":
Properties p = new Properties();
InputStream in = new FileInputStream( Config.ETC + "/log4j.properties" );
p.load( in );
in.close();
p.put( "log.dir", dynamicLog );
PropertyConfigurator.configure( p );


Log rotation and archival

WARNING: Log4J has some catastrophic bugs (logs accidentally deleted) if more than one running JVM tries to rotate the same target log file. This is especially likely to happen if two instances of the same application are running in parallel in separate processes (e.g. two Tomcat instances), but both use identical log4j.properties.

Logs should be rotated every hour, or every day. They should not be rotated by size (e.g. every 100MB). A log file containing only events between "13:00:00.000-13:59:59.999" is inherently organized in a way useful for humans, whereas a log file containing exactly 100MB of data is not.

Log4j handles hourly/daily log rotation very nicely:

log4j.appender.defaultLog=org.apache.log4j.DailyRollingFileAppender

log4j.appender.defaultLog.DatePattern=.yyyy-MM-dd.HH

This particular config rotates the log every hour (yyyy-MM-dd.HH).

Logs should be periodically gzipped and moved to a server devoted to log-archival. This should happen every night, or every hour. Archival is also a good stage to run all logs through the unix "sort" command to reestablish chronology in case some events were logged out of order.

You should be archiving logs on QA as well as Production, so that you can make sure you got it right.

Personally I prefer gzip to bzip2. bzip2 makes smaller files, but it is a lot slower! When viewing compressed logs, use zgrep (or bzgrep). In other words, don't decompress the logs to disk. Just decompress them to standard-out and run grep against that. You may also want to look into these related commands: zcat, zdiff, zcmp, zless.

A Log Rotation Best Practice That Is Currently Not Possible

When a log file rotates (either because of size, or daily/hourly), the first line of the new log file should contain important version information about the running application:

2008-06-06/09:00:00.000 APP=[FooApp-1.2.3] UPTIME=[382 Days, 10:51:11.231] VM=[Java HotSpot(TM) Client VM 1.4.2_16-b05] OS=[Windows XP 5.1 x86] CONTAINER=[Apache Tomcat/5.0.30] HEAP=[69MB/281MB 24.6%]

Additional log-related investigations become possible with this at the top of every rotated log file. For example, "Did the application encounter more of bug X when we upgraded the Linux kernel?" or "How much did response time improve after we upgraded Java?" Analyses like these are impossible without a "version-line" at the top of each log file.

That version line should also print in the middle of the current log whenever the application restarts. The "version line" should be sent to all active log files. So if log4j.properties is managing 100 files, then all 100 should get this line. Personally, I would even recommend that during really quiet times logging frameworks should still create the file, and log the single version line, even if no other log events occurred during the day or hour in question. This way the absence of the file in the log archive would almost always mean "the application was shut down," rather than "the application was either shut down, or was not doing anything," as it currently means.
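Nothing stops an application from logging a version line itself at startup, though. A rough sketch (the APP version string is a placeholder, and the exact fields would vary):

import org.apache.log4j.Logger;

public class VersionLine {
  private static final Logger log = Logger.getLogger( VersionLine.class );

  public static void logVersionLine() {
    Runtime rt = Runtime.getRuntime();
    long usedMB = ( rt.totalMemory() - rt.freeMemory() ) / ( 1024 * 1024 );
    long maxMB  = rt.maxMemory() / ( 1024 * 1024 );
    log.info( "APP=[FooApp-1.2.3]"                                    // placeholder version string
            + " VM=[" + System.getProperty( "java.vm.name" ) + " "
                      + System.getProperty( "java.version" ) + "]"
            + " OS=[" + System.getProperty( "os.name" ) + " "
                      + System.getProperty( "os.version" ) + " "
                      + System.getProperty( "os.arch" ) + "]"
            + " HEAP=[" + usedMB + "MB/" + maxMB + "MB]" );
  }
}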

Unfortunately, this "version line" feature is not possible in any existing Java logging framework. Hopefully one day someone will implement it. (Why is everyone looking at me?)


Cosmetic considerations

Consider this log snippet.

2004-09-28/18:37:35.138/PDT Loading FooApp configuration from /home/julius/dev/foo/webapps/foo/WEB-INF/etc/foo.xml
2004-09-28/18:37:35.938/PDT 200-HEAD 325ms 0.0kB KEY=GOOGLE DST=http://www.google.ca/ [health]
2004-09-28/18:37:35.939/PDT!400-HEAD 238ms 0.0kB KEY=DLL DST=http://192.1.2.3/cgi-bin/cgi.dll/ext/ [health]
2004-09-28/18:37:37.745/PDT*ECHOSERVER DST=socket(localhost:5555) [java.net.ConnectException: Connection refused]
2004-09-28/18:37:41.886/PDT 200-HEAD 6306ms 0.0kB KEY=SOAP DST=https-easy://198.5.6.7/ [health]
2004-09-28/18:37:43.083/PDT*Timeout  7501ms KEY=TXN DST=http://10.1.2.3:8080/ [health]
2004-09-28/18:37:47.750/PDT ECHOSERVER DST=socket(localhost:5555) [Connected!]
2004-09-28/18:39:47.429/PDT*Timeout  7502ms KEY=TXN SRC=localhost.localdomain DST=http://10.1.2.3:8080/blah [bar:xyz foo:5050]
2004-09-28/18:40:44.581/PDT!com.foo.net.TimeoutException: Timeout  7501ms KEY=TXN DST=http://10.1.2.3:8080/ [client.executeMethod() failed]
2004-09-28/18:40:51.778/PDT!404-GET  86ms 1.22kB KEY=GOOGLE SRC=localhost.localdomain DST=http://www.google.ca/blah [bar:hamburger foo:5050]
2004-09-28/18:41:03.480/PDT SOCKET   558ms 0.26kB KEY=ECHOSERVER SRC=localhost.localdomain DST=socket(localhost:5555) [bar:hamburger foo:5050]
2004-09-28/18:41:06.016/PDT SOCKET   13ms 0.26kB KEY=ECHOSERVER SRC=localhost.localdomain DST=socket(localhost:5555) [bar:hamburger foo:5050]
2004-09-28/18:41:08.562/PDT SOCKET   7ms 0.26kB KEY=ECHOSERVER SRC=localhost.localdomain DST=socket(localhost:5555) [bar:hamburger foo:5050]
2004-09-28/18:43:53.553/PDT!com.foo.net.TimeoutException: Timeout  7502ms KEY=TXN DST=http://10.1.2.3:8080/blah [client.executeMethod() failed]
2004-09-28/22:05:12.950/PDT*ECHOSERVER DST=socket(localhost:5555) [Closed]

I'm trying to keep the earlier parts of each log statement aligned. Notice that the millisecond response-time readings are aligned.

I find ! * + and [space] more convenient to work with than the verbose "INFO, DEBUG, ERROR, WARN". They are a little trickier to grep for, though, since the shell will try to expand them. I use single-quotes to stop the shell from interfering:

grep '*'    foo.log    Find all errors in log.
grep '!'    foo.log    Find all warnings in log.
grep '*\|!' foo.log    Find all warnings AND errors in log!  The pipe is
                       the OR operator in grep, but it needs backslash protection.

Here's the code to change Log4J's level strings:

package org.apache.log4j;

public class LevelChanger {
  public static void changeLevels() {
    Level.DEBUG.levelStr = "_";
    Level.INFO.levelStr = " ";
    Level.WARN.levelStr = "!";
    Level.ERROR.levelStr = "*";
    Level.FATAL.levelStr = "***";
  }
}

Even though LevelChanger is code I write and package in a separate jar file, as long as I declare it part of the "org.apache.log4j" package, it can reach in and change the level strings.
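To put it to work, one option (a sketch, not the only way) is to call LevelChanger.changeLevels() once at application startup, for example right beside the PropertyConfigurator.configure() call shown earlier, so that every event formatted afterwards carries the short level strings.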

 

===============

Reposted from: http://alrum.baihongyu.com/
