Modern requests to web services are very complex. The service you are calling can call other services, they are other services, etc. All these requests can be executed in parallel. Of course, the logging system stores information from all participants in the request. But the clocks on different services can be slightly out of sync, so it is not easy to recreate the correct picture. And if we add message queues here (Azure EventHub, RabbitMQ, ...), the task becomes even more difficult.
Here I'll try to create a system that will allow us to quickly plot the sequence diagram of events during my request.
Ok, let's start.
System to analyze
Let's build a system whose requests we want to analyze. You can take it's full code from GitHub.
My system will contain several services (Service1, Service2, Service3, ExternalService):
The ServiceN services are components of my system. They are under my control. They do some work, write something in logs, make requests to other services. It doesn't matter to us here what they actually do. Here is a typical example of such a service:
[HttpGet]
public async Task<IEnumerable<WeatherForecast>> Get()
{
_logger.LogInformation("Get weather forecast");
Enumerable.Range(1, 4).ToList()
.ForEach(_ => _logger.LogInformation("Some random message"));
await Task.WhenAll(Enumerable.Range(1, 3).Select(_ => _service2Client.Get()));
await _service3Client.Get();
return Enumerable.Range(1, 5).Select(index => new WeatherForecast
{
Date = DateTime.Now.AddDays(index),
TemperatureC = Random.Shared.Next(-20, 55),
Summary = Summaries[Random.Shared.Next(Summaries.Length)]
})
.ToArray();
}
But in addition to my services, there are also external services. Simply put, these are services that do not write anything to our log system. It can be anything: mail, database, authorization, client webhooks, ... This type of service is represented here by ExternalService.
Now our system is ready. Let's configure it.
System configuration
First of all, I want to collect all my logs in one place. I will use Seq just because it is so easy to use with Docker. Here is the corresponding Docker Compose file:
version: "3"
services:
seq:
image: datalust/seq
container_name: seq
environment:
- ACCEPT_EULA=Y
ports:
- "5341:5341"
- "9090:80"
Now at the address http://localhost:9090 I have access to the Seq UI. And I can write logs in Seq using Serilog:
Log.Logger = new LoggerConfiguration()
.MinimumLevel.Override("Microsoft", LogEventLevel.Error)
.MinimumLevel.Override("System", LogEventLevel.Error)
.Enrich.FromLogContext()
.WriteTo.Console(new CompactJsonFormatter())
.WriteTo.Seq("http://localhost:5341")
.CreateLogger();
But there is one more requirement for my logging system. It must give me access to the log entries via some API. In the case of Seq, there is a NuGet package Seq.Api. That's quite enough for me.
Now I want to add some information to logs, which I'll use later to build request sequence diagram. To do this, I'll create ASP.NET Core middleware and add it into the request processing pipeline. Here is the main code:
public async Task Invoke(HttpContext context)
{
GetCorrelationId(context);
GetInitialsService(context);
GetPreviousService(context);
GetPreviousClock(context);
using (LogContext.PushProperty(Names.CurrentServiceName, ServiceNameProvider.ServiceName))
using (LogContext.PushProperty(Names.CorrelationIdHeaderName, _correlationIdProvider.GetCorrelationId()))
using (LogContext.PushProperty(Names.InitialServiceHeaderName, _initialServiceProvider.GetInitialService()))
using (LogContext.PushProperty(Names.PreviousServiceHeaderName, _previousServiceProvider.GetPreviousService()))
using (LogContext.PushProperty(Names.RequestClockHeaderName, _requestClockProvider.GetPreviousServiceClock()))
{
await _next(context);
}
}
What can we see here? We add the following information to all log entries:
- The name of the current service. There is no magic here. It can be just the assembly name or whatever you want.
- Correlation id. I hope it does not require a long introduction. It connects all the log entries associated with a single external request.
- The name of service to which the external request is sent. This is the point where the request enters our system. This information is for convenience only and will not be used in this article.
- The name of the previous service in the chain of requests. It is useful to know where the request came from.
- Some timestamp that does not depend on the physical clocks of different services. We'll talk about this in more detail later.
At the beginning of request processing, we need to get all these values from the request object. This is what all these GetNNN methods do at the start of Invoke. Let's take a look at the GetCorrelationId method. Other methods are generally identical.
private void GetCorrelationId(HttpContext context)
{
if (context.Request.Headers.ContainsKey(Names.CorrelationIdHeaderName)
&& context.Request.Headers[Names.CorrelationIdHeaderName].Any())
{
_correlationIdProvider.SetCorrelationId(context.Request.Headers[Names.CorrelationIdHeaderName][0]);
}
else
{
_correlationIdProvider.SetCorrelationId(Guid.NewGuid().ToString("N"));
}
}
The providers of these values also generally identical. They store values during a request in a field of type AsyncLocal<T>:
public class CorrelationIdProvider
{
private static readonly AsyncLocal<string> Value = new();
public string GetCorrelationId()
{
var value = Value.Value;
if (string.IsNullOrWhiteSpace(value))
{
value = Guid.NewGuid().ToString("N");
SetCorrelationId(value);
}
return value;
}
public void SetCorrelationId(string value)
{
if (string.IsNullOrWhiteSpace(value))
throw new ArgumentException("Value cannot be null or whitespace.", nameof(value));
Value.Value = value;
}
}
But there is an exception to this simplicity. I'm talking about the monotonous clock. Now it is time to discuss it.
Monotonous sequence of requests
Technically, each log entry has it's own timestamp. What prevents me from sorting by this timestamp and considering such a sequence of records? There are several obstacles.
First of all, as I have already said, the clocks of different services can be out of sync. Even a small shift of a couple of tens of milliseconds can lead to shuffling of log entries.
But even if the clocks are perfectly synchronized, this does not solve our problem completely. Imagine that my service calls the same endpoint several times, each time with slightly different parameters. And I'm doing it in parallel to improve performance. Log entries from these calls will inevitably be shuffled.
What can we do about it? We need some kind of magic monotonous clock that is identical for all services, or a monotonically increasing sequence of numbers. We can use it as follows. When my system receives a request, I set this clock to 0. When I need to call another service, I increase the clock value by one and send this new value to another service. It gets the value and makes it's own calls, increasing this value each time.Eventually it returns the last value to me. With this value, I continue my work.
This approach has a number of disadvantages. First of all, external (not my) systems will not update the value of my clock. But this is not that important. The worse part is that I can not make parallel calls. I have to wait for the end of the next call to get the updated clock value.
To avoid this problem, I will use a different method. I use the clock value from the previous service as a prefix for the clock value in my service. The suffix will be a number that I monotonically increase with each request to another service. Here is how it is implemented:
public class RequestClockProvider
{
private class ClockHolder
{
public string PreviousServiceClock { get; init; }
public int CurrentClock { get; set; }
}
private static readonly AsyncLocal<ClockHolder> Clock = new();
public void SetPreviousServiceClock(string? value)
{
Clock.Value = new ClockHolder
{
PreviousServiceClock = value ?? string.Empty
};
}
public string GetPreviousServiceClock() => Clock.Value?.PreviousServiceClock ?? string.Empty;
public string GetNextCurrentServiceClock()
{
lock (this)
{
var clock = Clock.Value!;
return $"{clock.PreviousServiceClock}.{clock.CurrentClock++}";
}
}
}
The SetPreviousServiceClock method is used by my middleware to initialize the clock for this request. The GetNextCurrentServiceClock method is used every time I send a request to another service.
So, if my service receives a request in which the clock is set to 2, it will generate requests to other services with clock values 2.0, 2.1, 2.2, ... And if the service receives a request with the clock value 2.1, it will generate requests with the values 2.1.0, 2.1.1, 2.1.2, ...
If I have such a value for each log entry, I can easily group and order them. Entries within the same group can be safely ordered by timestamp, since they are created when processing a single request by a single service. It means that their timestamps were created by one physical clock.
One more remark. One could say that I'm implementing the Lamport timestamp here. It may be so, but I won't venture to assert it. I'm sure that there are more efficient algorithms that solve this problem. In practice, you should use them. But here my implementation is enough.
Request sending
Now we have the information from the request to our service. We need to send it further with each of our own request. How can we do it? For simplicity, I'll use instances of HttpClient. Here is my client for one of the services:
public interface IService2Client
{
Task Get();
}
public class Service2Client : IService2Client
{
private readonly HttpClient _client;
public Service2Client(HttpClient client)
{
_client = client ?? throw new ArgumentNullException(nameof(client));
}
public async Task Get()
{
_ = await _client.GetAsync("http://localhost:5106/weatherforecast");
}
}
I'll register it in the dependency container as follows:
builder.Services.AddHttpClientWithHeaders<IService2Client, Service2Client>();
Here the AddHttpClientWithHeaders method looks like this:
public static IHttpClientBuilder AddHttpClientWithHeaders<TInterface, TClass>(this IServiceCollection services)
where TInterface : class
where TClass : class, TInterface
{
return services.AddHttpClient<TInterface, TClass>()
.AddHttpMessageHandler<RequestHandler>();
}
As you can see, I'm just adding my own request handler. Here is it's code:
protected override Task<HttpResponseMessage> SendAsync(
HttpRequestMessage request,
CancellationToken cancellationToken)
{
var requestClockValue = _requestClockProvider.GetNextCurrentServiceClock();
request.Headers.Add(Names.CorrelationIdHeaderName, _correlationIdProvider.GetCorrelationId());
request.Headers.Add(Names.InitialServiceHeaderName, _initialServiceProvider.GetInitialService());
request.Headers.Add(Names.PreviousServiceHeaderName, ServiceNameProvider.ServiceName);
request.Headers.Add(Names.RequestClockHeaderName, requestClockValue);
using (LogContext.PushProperty(Names.RequestBoundaryForName, requestClockValue))
using (LogContext.PushProperty(Names.RequestURLName, $"{request.Method} {request.RequestUri}"))
{
_logger.LogInformation("Sending request...");
return base.SendAsync(request, cancellationToken);
}
}
First of all, I add several headers to the request, the values of which you already know. Here I pass these values to the next service.
Then I create an additional log entry with two special fields. One of them is the URL of the request. I keep it for information purposes only. In the sequence diagram, I'll show this address. The second field (RequestBoundaryFor) will be used by me to understand where I must place the log entries from the target service. We'll discuss this topic later when we talk about creating the request sequence diagram.
System launch
It is time to make some request. Firstly, I'll start Seq using Docker Compose:
> docker compose -f "docker-compose.yml" up -d
Then I'll start all my services. Here is my startup configuration in Visual Studio:
And now we can make a request to one of our services (e. g. to http://localhost:5222/weatherforecast).
After that, we'll have some entries in Seq :
I only need the correlation id from them.
Let's see how we can build a request sequence diagram based on these log entries.
Building of request sequence diagram
There is a free www.websequencediagrams.com service on the Internet. It has its own language for describing sequence diagrams. We'll use this language to describe our request. For this reason I created EventsReader application.
But first we have to get the log entries from Seq. We'll use Seq.Api NuGet package:
using EventsReader;
using Seq.Api;
var connection = new SeqConnection("http://localhost:9090");
var result = connection.Events.EnumerateAsync(
filter: "CorrelationId = '4395cd986c9e4b548404a2aa2aca6016'",
render: true,
count: int.MaxValue);
var logs = new ServicesRequestLogs();
await foreach (var evt in result)
{
logs.Add(evt);
}
logs.PrintSequenceDiagram();
In ServicesRequestLogs class, we group all log entries by the value of our monotonous clock:
public void Add(EventEntity evt)
{
var clock = evt.GetPropertyValue(Names.RequestClockHeaderName);
if(clock == null) return;
var singleServiceLogs = GetSingleServiceLogs(clock, evt);
singleServiceLogs.Add(evt);
}
private SingleServiceRequestLogs GetSingleServiceLogs(string clock, EventEntity evt)
{
if (_logRecords.ContainsKey(clock))
{
return _logRecords[clock];
}
var serviceName = evt.GetPropertyValue(Names.CurrentServiceName)!;
var serviceAlias = GetServiceAlias(serviceName);
var logs = new SingleServiceRequestLogs
{
ServiceName = serviceName,
ServiceAlias = serviceAlias,
Clock = clock
};
_logRecords.Add(clock, logs);
return logs;
}
private string GetServiceAlias(string serviceName)
{
if(_serviceAliases.ContainsKey(serviceName))
return _serviceAliases[serviceName];
var serviceAlias = $"s{_serviceAliases.Count}";
_serviceAliases[serviceName] = serviceAlias;
return serviceAlias;
}
All entries with the same value correspond to the processing of one request by one service. They are stored in a simple class:
public class SingleServiceRequestLogs
{
public string ServiceName { get; set; }
public string ServiceAlias { get; set; }
public string Clock { get; set; }
public List<EventEntity> LogEntities { get; } = new List<EventEntity>();
public void Add(EventEntity evt)
{
LogEntities.Add(evt);
}
}
Now we'll construct our description of the sequence diagram:
public void PrintSequenceDiagram()
{
Console.WriteLine();
PrintParticipants();
PrintServiceLogs("");
}
The PrintParticipants method describes all participants in the communication. The names of services may contain characters that are unacceptable for websequencediagrams, so we use aliases:
private void PrintParticipants()
{
Console.WriteLine("participant \"User\" as User");
foreach (var record in _serviceAliases)
{
Console.WriteLine($"participant \"{record.Key}\" as {record.Value}");
}
}
The PrintServiceLogs method prints the sequence of request processing in one service. This method gets the value of monotonous clock as a parameter:
private void PrintServiceLogs(string clock)
{
var logs = _logRecords[clock];
if (clock == string.Empty)
{
Console.WriteLine($"User->{logs.ServiceAlias}: ");
Console.WriteLine($"activate {logs.ServiceAlias}");
}
foreach (var entity in logs.LogEntities.OrderBy(e => DateTime.Parse(e.Timestamp, null, System.Globalization.DateTimeStyles.RoundtripKind)))
{
var boundaryClock = entity.GetPropertyValue(Names.RequestBoundaryForName);
if (boundaryClock == null)
{
Console.WriteLine($"note right of {logs.ServiceAlias}: {entity.RenderedMessage}");
}
else
{
if (_logRecords.TryGetValue(boundaryClock, out var anotherLogs))
{
Console.WriteLine($"{logs.ServiceAlias}->{anotherLogs.ServiceAlias}: {entity.GetPropertyValue(Names.RequestURLName)}");
Console.WriteLine($"activate {anotherLogs.ServiceAlias}");
PrintServiceLogs(boundaryClock);
Console.WriteLine($"{anotherLogs.ServiceAlias}->{logs.ServiceAlias}: ");
Console.WriteLine($"deactivate {anotherLogs.ServiceAlias}");
}
else
{
// Call to external system
Console.WriteLine($"{logs.ServiceAlias}->External: {entity.GetPropertyValue(Names.RequestURLName)}");
Console.WriteLine($"activate External");
Console.WriteLine($"External->{logs.ServiceAlias}: ");
Console.WriteLine($"deactivate External");
}
}
}
if (clock == string.Empty)
{
Console.WriteLine($"{logs.ServiceAlias}->User: ");
Console.WriteLine($"deactivate {logs.ServiceAlias}");
}
}
Here we get all the log entries for this particular clock value (logs variable). Then an the beginning and at the end of the method there is some code that should make our diagram more beautiful. There is nothing important here:
if (clock == string.Empty) ...
All the main work is done inside the foreach loop. As you can see, we sort the log entries by timestamp. We can safely do it here, because all these entries are obtained as a result of processing one request by one service. It means that all these timestamps came from one physical clock:
foreach (var entity in logs.LogEntities.OrderBy(e => DateTime.Parse(e.Timestamp, null, System.Globalization.DateTimeStyles.RoundtripKind))) ...
Then we check if the current entry is a service entry representing the beginning of some request to another service. As I have already said, this entry must contain the RequestBoundaryFor field. If there is no such field, then this is a normal entry. In this case, we just print it's message as a note:
Console.WriteLine($"note right of {logs.ServiceAlias}: {entity.RenderedMessage}");
If the entry is a service entry, two variants are possible. First of all, it can be start of a request to another of our services. In this case, logs will contain information from the target service. We extract this information and add it to the diagram:
Console.WriteLine($"{logs.ServiceAlias}->{anotherLogs.ServiceAlias}: {entity.GetPropertyValue(Names.RequestURLName)}");
Console.WriteLine($"activate {anotherLogs.ServiceAlias}");
PrintServiceLogs(boundaryClock);
Console.WriteLine($"{anotherLogs.ServiceAlias}->{logs.ServiceAlias}: ");
Console.WriteLine($"deactivate {anotherLogs.ServiceAlias}");
Secondly, it can be a request to an external service. In this case, we have no log entries about it's work:
Console.WriteLine($"{logs.ServiceAlias}->External: {entity.GetPropertyValue(Names.RequestURLName)}");
Console.WriteLine($"activate External");
Console.WriteLine($"External->{logs.ServiceAlias}: ");
Console.WriteLine($"deactivate External");
That's all. We can insert our correlation id into the appropriate place of Program.cs and run the program. It will give us the description of the sequence diagram that can be inserted into www.websequencediagrams.com:
Improvements
Our system is ready. At the end, I'd like to say a few words about possible improvements.
Firstly, I have not talked about messages to message queues (RabbitMQ, Azure EventHub, ...) here. They usually allow you to send some metadata along with messages, so you can transfer our special data (correlation id, monotonous clock value, ...). Support of message queues is a natural extension of our mechanism.
Secondly, the capabilities of www.websequencediagrams.com (at least in free version) are not very large. For example, I'd like to visually separate log entries of different types (Info, Warning, Error, ...). Perhaps we can use another, more powerful tool to create sequence diagrams.
Thirdly, some requests are send as "fire and forget". It means that no one is waiting for their completion. We need to somehow represent them in a different way on the diagram.
Conclusion
That's all I wanted to say. I hope that the article will be useful to you and help you understand what's happening in a complex system. Good luck!
No comments:
Post a Comment