Thursday, January 10, 2019

Log level per request

Some time ago I was reading the ThoughtWorks Technology Radar, and I came across a technique called "Log level per request". Here at Confirmit we use logging extensively, so I wondered how to implement something similar. Now I'm ready to show you one possible implementation.

Problem description


So what are we talking about? Let's say you have a Web service. At some moment it starts to fail in the production environment, but only on some requests. For example, it fails only for one user, or only for a specific endpoint... Certainly, we have to find the reason, and this is where logging should help.

We can insert enough log instructions into the code to pinpoint different causes of failure. A log instruction usually associates a message with some log level (Debug, Info, Warning, ...). The logger also has its own log level: all messages with levels at or above the logger level are written to a log sink (file, database, ...), while messages with levels below the logger level are discarded.

When the application works well, we want as few log messages as possible, to keep the log sinks small. At the same time, if the application fails, we want as many log messages as possible, to be able to find the cause of the problem. The difficulty is that usually we set one level for all loggers in an application. If everything is OK, we keep this level high (e.g. Warning). If we need to investigate failures, we set it low (e.g. Debug).
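To make the filtering rule concrete, here is a minimal sketch of threshold filtering. This is just an illustration of the concept, not log4net code:

```csharp
using System;

// Minimal illustration of threshold filtering (not real log4net code):
// a message is written only when its level is at or above the logger's level.
enum Level { Debug = 0, Info = 1, Warning = 2, Error = 3 }

class MiniLogger
{
    public Level Threshold { get; set; } = Level.Warning;

    // Returns true when the message was written, false when discarded.
    public bool Log(Level level, string message)
    {
        if (level < Threshold) return false;   // below the logger level: discarded
        Console.WriteLine($"{level,-7} {message}");
        return true;
    }
}
```

With Threshold set to Warning, Debug and Info calls are discarded; lowering it to Debug lets everything through — which is exactly the whole-application switch described above.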

Log level per application


When we set the application log level low, there will suddenly be a lot of messages in the log sinks. These messages will come from many requests and will be interleaved, as many requests can be processed simultaneously. This leads to several potential difficulties:

  • How do we separate messages from failing requests from messages from healthy ones?
  • Healthy requests still spend time writing messages to the log sinks, although these messages will never be used.
  • Messages from healthy requests still occupy space in the log sinks, although they will never be used.

Well, none of these difficulties is very serious. To separate messages of "good" requests from messages of "bad" requests, we can use a correlation ID. All modern log-keeping systems support filtering messages by their fields.

Performance is usually not a big problem either. Logging systems support asynchronous writes, so the impact of heavy logging should not be severe.

And storage space is relatively cheap now, so this is not a big problem either. Especially if we can delete old records.

Nevertheless, can we do better? Can we set a separate log level for each request depending on some condition? I'd like to investigate this problem here.

Log level per request


Allow me to formulate the requirements for the solution I'll implement. There should be a way to set the log level independently for each request. There must be a flexible way to define the conditions for choosing a log level for a request. And it should be possible to change these conditions at runtime, without restarting the application.

The stage is set. Let's begin.

I'll write a simple Web API application using .NET Core. It will have a single controller:

[Route("api/[controller]")]
[ApiController]
public class ValuesController : ControllerBase
{
    ...

    // GET api/values
    [HttpGet]
    public ActionResult<IEnumerable<string>> Get()
    {
        Logger.Info("Executing Get all");

        return new[] { "value1", "value2" };
    }

    // GET api/values/5
    [HttpGet("{id}")]
    public ActionResult<string> Get(int id)
    {
        Logger.Info($"Executing Get {id}");

        return "value";
    }
}

We'll discuss the implementation of the Logger property later. I'll use the log4net library for logging. This library has an interesting feature: level inheritance. Very briefly, if the logging configuration says that a logger named X should have log level Info, then all loggers whose names start with "X." (like X.Y, X.Z, X.A.B) inherit the same log level.
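The inheritance rule can be simulated in a few lines. This is my own sketch of the lookup, not log4net's actual resolver: the effective level of a logger comes from the longest configured prefix of its dotted name, falling back to the root level.

```csharp
using System;
using System.Collections.Generic;

// Simulation of log4net's level inheritance (an illustration of the rule,
// not the real log4net code): a logger's effective level comes from the
// longest configured prefix of its dotted name.
class LevelHierarchy
{
    private readonly Dictionary<string, string> _configured = new Dictionary<string, string>();
    private readonly string _rootLevel;

    public LevelHierarchy(string rootLevel) => _rootLevel = rootLevel;

    public void Configure(string loggerName, string level) => _configured[loggerName] = level;

    public string EffectiveLevel(string loggerName)
    {
        // Try "X.Y.Z", then "X.Y", then "X"; fall back to the root level.
        var name = loggerName;
        while (name.Length > 0)
        {
            if (_configured.TryGetValue(name, out var level)) return level;
            var dot = name.LastIndexOf('.');
            if (dot < 0) break;
            name = name.Substring(0, dot);
        }
        return _rootLevel;
    }
}
```

For example, after `Configure("EdlinSoftware.Log.Debug", "DEBUG")`, a logger named `EdlinSoftware.Log.Debug.MyClass` resolves to DEBUG, while an unrelated name falls back to the root level.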

So here comes the initial idea. For every request, I'll somehow calculate the required log level. Then I'll create a new named logger in the log4net configuration, with the desired log level. After that, all logger objects created during this request must have names prefixed with the name of that logger. The catch is that log4net never removes loggers: once created, they live as long as the application runs. This is why I'll pre-create a logger with a specific name for each log level:

<?xml version="1.0" encoding="utf-8" ?>
<log4net>
  <appender name="Console" type="log4net.Appender.ConsoleAppender">
    <layout type="log4net.Layout.PatternLayout">
      <!-- Pattern to output the caller's file name and line number -->
      <conversionPattern value="%5level [%thread] (%file:%line) - %message%newline" />
    </layout>
  </appender>

  <appender name="RollingFile" type="log4net.Appender.RollingFileAppender">
    <file value="RequestLoggingLog.log" />
    <appendToFile value="true" />
    <maximumFileSize value="100KB" />
    <maxSizeRollBackups value="2" />

    <layout type="log4net.Layout.PatternLayout">
      <conversionPattern value="%level %thread %logger - %message%newline" />
    </layout>
  </appender>

  <root>
    <level value="WARN" />
    <appender-ref ref="Console" />
    <appender-ref ref="RollingFile" />
  </root>

  <logger name="EdlinSoftware.Log.Error">
    <level value="ERROR" />
  </logger>

  <logger name="EdlinSoftware.Log.Warning">
    <level value="WARN" />
  </logger>

  <logger name="EdlinSoftware.Log.Info">
    <level value="INFO" />
  </logger>

  <logger name="EdlinSoftware.Log.Debug">
    <level value="DEBUG" />
  </logger>
</log4net>

Now I have several predefined loggers named EdlinSoftware.Log.XXXX. These names will serve as prefixes for the names of the actual loggers. To avoid collisions between requests, I'll keep the prefix for the current request in an AsyncLocal object, and set its value inside a new ASP.NET Core middleware:

app.Use(async (context, next) =>
{
    try
    {
        LogSupport.LogNamePrefix.Value = await LogSupport.GetLogNamePrefix(context);

        await next();
    }
    finally
    {
        LogSupport.LogNamePrefix.Value = null;
    }
});
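The reason AsyncLocal works here is that each async flow sees its own copy of the value, so concurrent requests cannot overwrite each other's prefix. A small self-contained demonstration (plain .NET, no ASP.NET Core; the names are mine, not the article's):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class Demo
{
    // Same idea as LogSupport.LogNamePrefix in the article.
    private static readonly AsyncLocal<string> Prefix = new AsyncLocal<string>();

    public static async Task<string> HandleRequest(string prefix)
    {
        Prefix.Value = prefix;   // set "per request"
        await Task.Delay(10);    // simulate async work inside the request
        return Prefix.Value;     // still this request's value, not another's
    }

    public static async Task Main()
    {
        // Two "requests" running concurrently keep independent values.
        var a = HandleRequest("EdlinSoftware.Log.Debug");
        var b = HandleRequest("EdlinSoftware.Log.Info");
        Console.WriteLine(await a);
        Console.WriteLine(await b);
    }
}
```

A change to an AsyncLocal value inside an async method flows down into its awaits but does not flow back to the caller, which is exactly the isolation the middleware relies on.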

When this value is set, it is easy to create loggers with the desired name prefix:

public static class LogSupport
{
    public static readonly AsyncLocal<string> LogNamePrefix = new AsyncLocal<string>();

    public static ILog GetLogger(string name)
    {
        return GetLoggerWithPrefixedName(name);
    }

    public static ILog GetLogger(Type type)
    {
        return GetLoggerWithPrefixedName(type.FullName);
    }

    private static ILog GetLoggerWithPrefixedName(string name)
    {
        if (!string.IsNullOrWhiteSpace(LogNamePrefix.Value))
        { name = $"{LogNamePrefix.Value}.{name}"; }

        return LogManager.GetLogger(typeof(LogSupport).Assembly, name);
    }

    ....

}

It is now clear how to get a logger instance inside our controller:

[Route("api/[controller]")]
[ApiController]
public class ValuesController : ControllerBase
{
    private ILog _logger;

    private ILog Logger
    {
        get => _logger ?? (_logger = LogSupport.GetLogger(typeof(ValuesController)));
    }

    ....
}

The only thing left is to define the rules that choose a log level for a request. This mechanism should be flexible enough. The main idea is to use C# scripting. I'll create a file LogLevelRules.json that defines a set of rule/log-level pairs:

[
  {
    "logLevel": "Debug",
    "ruleCode": "context.Request.Path.Value == \"/api/values/1\""
  },
  {
    "logLevel": "Debug",
    "ruleCode": "context.Request.Path.Value == \"/api/values/3\""
  }
]

Here logLevel is the desired log level, and ruleCode is C# code that returns a boolean value for a given request. The application runs these rules one by one. The first rule that returns true sets the corresponding log level. If all rules return false, the default log level is used.

To create delegates from the string representations of the rules, the CSharpScript class can be used:

public class Globals
{
    public HttpContext context;
}

internal class LogLevelRulesCompiler
{
    public IReadOnlyList<LogLevelRule> Compile(IReadOnlyList<LogLevelRuleDescription> levelRuleDescriptions)
    {
        var result = new List<LogLevelRule>();

        foreach (var levelRuleDescription in levelRuleDescriptions ?? new LogLevelRuleDescription[0])
        {
            var script = CSharpScript.Create<bool>(levelRuleDescription.RuleCode, globalsType: typeof(Globals));
            ScriptRunner<bool> runner = script.CreateDelegate();

            result.Add(new LogLevelRule(levelRuleDescription.LogLevel, runner));
        }

        return result;
    }
}

internal sealed class LogLevelRule
{
    public string LogLevel { get; }

    public ScriptRunner<bool> Rule { get; }

    public LogLevelRule(string logLevel, ScriptRunner<bool> rule)
    {
        LogLevel = logLevel ?? throw new ArgumentNullException(nameof(logLevel));
        Rule = rule ?? throw new ArgumentNullException(nameof(rule));
    }
}

Here the Compile method gets a list of objects read from the LogLevelRules.json file. It creates a runner delegate for each rule and stores it for later use.

This list of delegates can be stored:

LogSupport.LogLevelSetters = new LogLevelRulesCompiler().Compile(
    new LogLevelRulesFileReader().ReadFile("LogLevelRules.json")
);

and used later:

public static class LogSupport
{
    internal static IReadOnlyList<LogLevelRule> LogLevelSetters = new LogLevelRule[0];

    ...

    public static async Task<string> GetLogNamePrefix(HttpContext context)
    {
        var globals = new Globals
        {
            context = context
        };

        string result = null;

        foreach (var logLevelSetter in LogLevelSetters)
        {
            if (await logLevelSetter.Rule(globals))
            {
                result = $"EdlinSoftware.Log.{logLevelSetter.LogLevel}";
                break;
            }
        }

        return result;
    }
}

So at application start, we read the LogLevelRules.json file, convert its content into a list of delegates using the CSharpScript class, and store it in the LogSupport.LogLevelSetters field. On each request, we run the delegates from this list to get the log level for that request.

The only thing left is to watch for modifications of the LogLevelRules.json file. When we want to set the log level for a certain kind of request, we add another rule to this file. To make the application apply these changes without a restart, we have to watch the file:

var watcher = new FileSystemWatcher
{
    Path = Directory.GetCurrentDirectory(),
    Filter = "*.json",
    NotifyFilter = NotifyFilters.LastWrite
};
watcher.Changed += (sender, eventArgs) =>
{
    // Wait for the application that modifies the file to release it.
    Thread.Sleep(1000);

    LogSupport.LogLevelSetters = new LogLevelRulesCompiler().Compile(
        new LogLevelRulesFileReader().ReadFile("LogLevelRules.json")
    );
};
watcher.EnableRaisingEvents = true;

For the sake of brevity, I have not used any thread synchronization while working with the LogSupport.LogLevelSetters field. But in a real application you really should.
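One simple way to add that synchronization, sketched under the assumption that a rule list is immutable once built: the watcher publishes a fully constructed list with a volatile write, and readers take one snapshot before iterating. The LogLevelRule here is a simplified placeholder, not the article's full class.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Simplified placeholder for the article's LogLevelRule.
public sealed class LogLevelRule
{
    public string LogLevel { get; }
    public LogLevelRule(string logLevel) => LogLevel = logLevel;
}

public static class LogSupport
{
    private static IReadOnlyList<LogLevelRule> _logLevelSetters = Array.Empty<LogLevelRule>();

    // Writer (the file watcher callback): publish a fully built,
    // immutable list atomically.
    public static void ReplaceRules(IReadOnlyList<LogLevelRule> newRules) =>
        Volatile.Write(ref _logLevelSetters, newRules);

    // Reader (request handling): take one snapshot and iterate only it,
    // so a concurrent replacement cannot change the list mid-iteration.
    public static IReadOnlyList<LogLevelRule> Snapshot() =>
        Volatile.Read(ref _logLevelSetters);
}
```

No lock is needed because a reference assignment is atomic; Volatile.Read/Write only ensure that readers promptly see the newly published list.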

You can find the whole code of the application at GitHub.

Disadvantages


This code solves the problem of defining log level per request. But it has some disadvantages too. Let's discuss them.

1. This approach changes the names of loggers. So in the log file, instead of "MyClassLogger" one will see something like "EdlinSoftware.Log.Debug.MyClassLogger". One can live with it, but it is not very convenient. This problem can probably be overcome by playing with the log layout.

2. It is no longer possible to make logger instances static, as they must be created separately for each request. The most serious problem here, for me, is that all team members must always remember this. One can accidentally define a static field with a logger and get strange results. To overcome this, we could create a wrapper class for the logger and use it instead of the log4net classes directly. Such a wrapper could create new instances of log4net loggers for each operation; then there would be no problem using a static instance of the wrapper class.
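A sketch of that wrapper idea, with a stand-in for the article's LogSupport (the real version would delegate to LogManager.GetLogger on each call; here name resolution is simplified to console output): because the logger name is resolved on every call, a static field holding the wrapper is harmless.

```csharp
using System;
using System.Threading;

// Sketch of the wrapper idea; a real version would call
// LogManager.GetLogger on each operation instead of writing to the console.
public static class SafeLog
{
    public static readonly AsyncLocal<string> Prefix = new AsyncLocal<string>();

    // The logger name is computed per call, so no stale per-request
    // state can be captured in a static field.
    public static string Info(string source, string message)
    {
        var name = string.IsNullOrEmpty(Prefix.Value) ? source : $"{Prefix.Value}.{source}";
        var line = $"INFO {name} - {message}";
        Console.WriteLine(line);
        return line;
    }
}
```

A class can then keep `SafeLog` usage in a static context without risking that one request's prefix leaks into another.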

3. The described approach creates many logger instances. This pollutes memory and takes up CPU cycles. Depending on the application, this may or may not be a problem.

4. When we modify the JSON file with rules, the code of the rules can contain errors. It is easy to add try-catch blocks to make sure these errors do not ruin the main program, but we still need to know that something went wrong. There can be two types of errors:
  • Compile-time errors while compiling rules code into delegates.
  • Run-time errors during execution of these delegates.

Somehow we have to be made aware of these errors; otherwise we just won't get log messages and won't even know about it.
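For the run-time errors, a defensive first-match loop might look like this. This is my own sketch, not the article's code: a throwing rule is reported through a callback and skipped, so one bad rule cannot silently disable the rest.

```csharp
using System;
using System.Collections.Generic;

// Defensive variant of the first-match rule loop (a sketch, not the
// article's code): a rule that throws is reported and skipped.
class SafeRuleEvaluator
{
    private readonly Action<Exception> _onError;

    public SafeRuleEvaluator(Action<Exception> onError) => _onError = onError;

    public string FirstMatch(
        IEnumerable<(string LogLevel, Func<bool> Rule)> rules,
        string defaultLevel)
    {
        foreach (var (logLevel, rule) in rules)
        {
            try
            {
                if (rule()) return logLevel;
            }
            catch (Exception ex)
            {
                _onError(ex);   // report, then keep trying the other rules
            }
        }
        return defaultLevel;    // nothing matched (or everything failed)
    }
}
```

The callback could write to a fallback logger or a metrics counter, so failed rules are visible instead of just producing missing log output.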

5. The code of the rules in the JSON file can contain arbitrary instructions. Potentially this is a security issue, and we should somehow limit the capabilities of this code. On the other hand, if an adversary is able to modify your files, you already have a serious security problem.

Conclusion


In general, I should say that I don't feel this is a great solution that should replace the modern approach to logging. A good tool that can filter log records can help here even with a log level per application. In any case, I hope this analysis of the problem gives you something to think about.
