Logging

Published 2023-09-26 22:27:03 · Author: ZhangZhihuiAAA

In the past, applications had to store and manage their own logs. Logs were written to files, and the application needed code to perform "log rotation," the process of removing or archiving older log files so that they do not consume excessive amounts of storage space.

Fortunately, things have changed quite a bit nowadays. The approaches are somewhat standardized, depending on where the application is deployed. In the case where the application is deployed to a virtual machine, we can rely on Journald, which is part of Systemd. (Systemd was introduced in earlier chapters, where it was used to demonstrate the approach for managing applications on a virtual machine.) Once logs are in Journald, we can have the logging system of our choice interact with Journald to retrieve the logs, which can then be pushed or pulled into the various log storage mechanisms. For our applications, there is no particular need to write the logs to a specific file or folder. The logs can simply be printed to stdout, and that output is captured, stored, and managed by Journald.
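To make this concrete, here is a minimal sketch in Go: the program only writes its log lines to stdout and does no file handling or rotation of its own. When the binary is run as a systemd service, Journald captures that output automatically. The log messages are placeholders for illustration.

```go
package main

import (
	"log"
	"os"
)

func main() {
	// Write to stdout explicitly; under systemd, both stdout and stderr
	// are captured by Journald, so no file handling or rotation is
	// needed inside the application.
	log.SetOutput(os.Stdout)

	log.Println("application starting")
	log.Println("listening on :8080") // illustrative message only
}
```

With a unit file whose ExecStart points at this binary, the same output can be read back or followed with `journalctl -u myapp.service -f` (the unit name here is hypothetical), and retention is handled by Journald's own settings rather than by the application.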

In the case where the application is deployed as a container, whether through docker-compose or a container orchestration engine such as Kubernetes, the same principles still apply. As an application developer, one only needs to log the required information to stdout. The information logged to stdout is collected by Docker (or the relevant containerization software) and stored in a specific folder on a per-container basis. The logging system chosen for our setup simply needs to read the logs from that folder and then push or pull them into the various log storage mechanisms. This makes things considerably simpler for the developers and maintainers of applications. Say the application is deployed on Kubernetes and its pods are spread across multiple nodes in the cluster. With the right metadata (provided via labels and so on), we can aggregate all the logs for the same application and make them accessible via a single interface, making it easy to understand how the application behaves when it is deployed in the cluster. There is no need for the developer to hop onto every node that hosts the application and read specific files to get the logs. The logs can be aggregated into a single datasource for easier analysis.
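To illustrate the collection step, the rough sketch below shows what a log collector does when Docker's default json-file logging driver is in use: each container gets its own file of newline-delimited JSON under /var/lib/docker/containers/, and the collector reads those files and extracts the log lines. The path and field names follow that driver's conventional layout; production agents such as Fluent Bit or Promtail add tailing, checkpointing, and label enrichment on top of this idea.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// dockerLogLine mirrors the shape of entries written by Docker's default
// json-file logging driver: one JSON object per line.
type dockerLogLine struct {
	Log    string `json:"log"`    // the raw log line, including its trailing newline
	Stream string `json:"stream"` // "stdout" or "stderr"
	Time   string `json:"time"`   // RFC 3339 timestamp added by Docker
}

func main() {
	// With the json-file driver, Docker keeps one log file per container
	// under /var/lib/docker/containers/<id>/<id>-json.log. Adjust the
	// pattern if your daemon is configured differently.
	matches, err := filepath.Glob("/var/lib/docker/containers/*/*-json.log")
	if err != nil {
		panic(err)
	}
	for _, path := range matches {
		f, err := os.Open(path)
		if err != nil {
			continue // likely a permissions issue; real collectors run with elevated access
		}
		scanner := bufio.NewScanner(f)
		for scanner.Scan() {
			var line dockerLogLine
			if err := json.Unmarshal(scanner.Bytes(), &line); err != nil {
				continue // skip lines that are not valid JSON
			}
			// line.Log already ends with a newline, so no "\n" is added here.
			fmt.Printf("%s [%s] %s", line.Time, line.Stream, line.Log)
		}
		f.Close()
	}
}
```

On Kubernetes nodes the same per-container files are typically also exposed under /var/log/containers/, with filenames that encode the pod name, namespace, and container name, which is where node-level agents usually read from.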

Naturally, logs can be left as plain text and then indexed, which makes them searchable. This is the most likely way a developer would interact with such a system. However, there is a trend toward ensuring that the logs from a container are in some sort of structured format, for example a JSON representation. This makes it easier to retrieve particular fields from the logs and aggregate that information into dashboards that help developers better understand the performance of their applications.
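As a small illustration of structured logging, the sketch below uses Go's standard log/slog package (available from Go 1.21) to emit JSON to stdout. The field names such as "path", "status", and "duration_ms" are arbitrary examples, not a required schema.

```go
package main

import (
	"log/slog"
	"os"
)

func main() {
	// A JSON handler writing to stdout: each record becomes one JSON object
	// per line, which log pipelines can parse and index field by field.
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	logger.Info("request handled",
		slog.String("path", "/healthz"),   // example field
		slog.Int("status", 200),           // example field
		slog.Float64("duration_ms", 1.42), // example field
	)
}
```

Running this prints something like {"time":"...","level":"INFO","msg":"request handled","path":"/healthz","status":200,"duration_ms":1.42}. Because every record shares the same shape, fields such as duration_ms can be extracted and aggregated directly into dashboards rather than parsed out of free-form text.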