Best practices to monitor Embedded Linux IoT devices in 2020
It's 2020, and almost every manufacturer of Embedded Linux devices is connecting its products to the internet, in other words, turning them into Internet of Things (IoT) devices.
Connecting Embedded Linux devices to the internet opens up many opportunities, one of which is the ability to monitor them continuously. Although it may sound like a nice-to-have feature, monitoring Embedded Linux IoT products at the edge is critical for long-term management, maintenance, and support, for two main reasons.
1. Receive instant alerts on issues - when your product is running in the field, the most important thing is to make sure it is running smoothly and as planned. If something goes wrong with the product (and it can happen at any time), receiving an alert as soon as possible lets you fix the issue quickly, in some cases even before your customer notices there was an issue at all. In the long run, monitoring your Embedded Linux product makes your customers more satisfied with it.
2. Save time and costs - monitoring your IoT product saves you a great deal of time and money on support. The most common case is a recall. If your Linux system crashes for some reason (the fault may lie in your application, but also in the third-party packages you use) and you are not monitoring the device, you will have to recall the product to find the issue and fix it. A recall is expensive and time-consuming.
So, what should I monitor?
While there is no end to the things you could monitor, a few of them are a must.
1. RAM usage - A large number of processes run inside a Linux-based operating system (OS). Some of them are connected to your application in some way, but many of them aren't. That being said, there is no way to predict which of them have memory leaks. The most significant thing to monitor is therefore your RAM usage over time. If you notice that your RAM usage keeps increasing, that is a strong sign of a memory leak that must be addressed as soon as possible.
2. CPU usage - As with RAM, some processes in your Embedded Linux distribution may eat up your CPU. Usually it will be some kind of database application or a process doing heavy calculations, but it is not limited to those. Monitoring your CPU usage and reviewing its history may save you a lot of time when debugging and fixing the issue.
3. Disk usage - In most cases, disk usage depends on your application design. If you are creating files, such as logs or anything else, your disk usage will probably keep growing. When your Embedded device is on the other side of the world, a full disk is a major issue, because you cannot easily reach the device to clean it up.
4. Application errors - Sending errors from your application is essential for understanding the issues your product has, and also for planning future software updates. When monitoring your application and sending errors, it is important to decide which errors you want to receive because they will help you, and which are just noise.
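As a rough illustration of the first three items, the sketch below reads RAM, CPU, and disk usage using only the Python standard library. It assumes a Linux kernel that exposes /proc/meminfo and /proc/stat in their usual format; the function names (parse_meminfo, parse_cpu_times, cpu_percent, disk_percent, read_sample) are hypothetical helpers made up for this example, not part of any existing tool.

```python
import shutil


def parse_meminfo(text):
    """Return RAM usage in percent, given the text of /proc/meminfo."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])  # values are reported in kB
    total = fields["MemTotal"]
    # MemAvailable accounts for reclaimable cache; fall back to MemFree
    # on kernels that do not report it.
    available = fields.get("MemAvailable", fields["MemFree"])
    return 100.0 * (total - available) / total


def parse_cpu_times(text):
    """Return (busy, total) tick counters from the first line of /proc/stat."""
    # Keep user..steal (first 8 columns); guest time is already part of user.
    values = [int(v) for v in text.splitlines()[0].split()[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)  # idle + iowait
    total = sum(values)
    return total - idle, total


def cpu_percent(prev, curr):
    """CPU usage in percent between two (busy, total) readings."""
    busy = curr[0] - prev[0]
    total = curr[1] - prev[1]
    return 100.0 * busy / total if total > 0 else 0.0


def disk_percent(path="/"):
    """Used space on the filesystem containing `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def read_sample(prev_cpu):
    """Read one sample from the running system (Linux only)."""
    with open("/proc/meminfo") as f:
        ram = parse_meminfo(f.read())
    with open("/proc/stat") as f:
        cpu_now = parse_cpu_times(f.read())
    sample = {
        "ram_percent": ram,
        "cpu_percent": cpu_percent(prev_cpu, cpu_now),
        "disk_percent": disk_percent("/"),
    }
    return sample, cpu_now
```

CPU usage is computed from the difference between two readings, because the counters in /proc/stat are cumulative since boot. Calling read_sample once per interval and passing the previous CPU counters back in gives a usage figure for that interval.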
Implementing such a monitoring mechanism requires you to develop a dedicated service that constantly monitors those parameters on your Linux device and, when it detects unwanted behavior, reports it to your cloud. On the cloud side, you need a service that receives those reports and sends out alerts.
You can also use a ready-to-use platform, such as Upswift.io, which continuously monitors Embedded Linux devices, lets you view issues in real time in a web dashboard, and triggers alerts (e.g. sending emails) on configurable cases.
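The device side of such a service could look roughly like the sketch below. It is only one possible shape, under several assumptions: the threshold values are arbitrary, the JSON payload layout is invented for this example, `collect` stands for any function of yours that returns a dictionary of readings, and the cloud endpoint URL is whatever your own backend exposes. All names here are hypothetical.

```python
import json
import time
import urllib.request

# Assumed limits, in percent; tune them for your own product.
THRESHOLDS = {"ram_percent": 90.0, "cpu_percent": 95.0, "disk_percent": 85.0}


def find_violations(sample, thresholds=THRESHOLDS):
    """Return the readings in `sample` that exceed their threshold."""
    return {name: value for name, value in sample.items()
            if name in thresholds and value > thresholds[name]}


def is_steadily_rising(history, min_samples=6):
    """True if the last `min_samples` readings never drop and end higher."""
    recent = history[-min_samples:]
    if len(recent) < min_samples:
        return False
    never_drops = all(b >= a for a, b in zip(recent, recent[1:]))
    return never_drops and recent[-1] > recent[0]


def build_alert(device_id, violations, timestamp=None):
    """Serialize an alert as JSON bytes (payload layout is an assumption)."""
    body = {
        "device_id": device_id,
        "timestamp": time.time() if timestamp is None else timestamp,
        "violations": violations,
    }
    return json.dumps(body, sort_keys=True).encode("utf-8")


def send_alert(url, payload, timeout=10):
    """POST the alert to the cloud endpoint and return the HTTP status."""
    request = urllib.request.Request(
        url, data=payload,
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def run(collect, device_id, url, interval=60):
    """Sample forever; report threshold violations and a rising RAM trend."""
    ram_history = []
    while True:
        sample = collect()  # expected to return e.g. {"ram_percent": 42.0, ...}
        violations = find_violations(sample)
        if "ram_percent" in sample:
            ram_history = (ram_history + [sample["ram_percent"]])[-60:]
            if is_steadily_rising(ram_history):
                violations["ram_trend"] = "steadily rising"
        if violations:
            try:
                send_alert(url, build_alert(device_id, violations))
            except OSError:
                pass  # network is down; a real service would queue and retry
        time.sleep(interval)
```

Keeping the decision logic (find_violations, is_steadily_rising) separate from the network call makes it possible to test the rules on a development machine without a device or a cloud endpoint. The trend check is deliberately naive; a production service would likely smooth the readings before concluding that memory is leaking.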