Follow up on “Searching for Root Cause”

November 8th, 2005 | No Comments | Posted in LMI and SIEM

Anton Chuvakin has posted some comments regarding my “Searching for Root Cause” article.

Anton Chuvakin is a great guy. Very smart and definitely knows a lot about log analysis. I have the highest respect for him.

However, I think he misunderstood the article. In his comments, he said that “the article claims that you have to search logs in order to discover the real issue.”

That is somewhat of an overstatement. My article does not claim that searching is the only way to troubleshoot issues and determine root cause. Searching, however, is and will always be one of the ways admins troubleshoot issues. No amount of intelligence or reporting will replace drilling down into the details of the logs to determine root causes.

Many tools today help float issues and problems to the top so admins notice them faster. The admins then need tools to drill down and find out exactly what is causing the problems. Search is one of those tools. Others may include further drill-down on the reports.

Full-text indexed search is a much faster way to search. You can obviously insert all the logs into MySQL or some other database and let the database do the indexing. However, that will only carry you so far: database insertion slows down dramatically, and the database can handle only a small number of messages per second.

The only real way to do it is to use existing full-text indexing technologies to index the log data. A great book on this topic is Managing Gigabytes.
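To make the idea of full-text indexing concrete, here is a minimal sketch of an inverted index over log lines, written in Python. It is only an illustration under my own assumptions, not how any particular product works: the names LogIndex, tokenize, add, and search are hypothetical, the sample log lines are made up, and everything is kept in memory, which a real tool would not do.

```python
import re
from collections import defaultdict

# Assumption: a term is any run of lowercase letters or digits.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric terms."""
    return TOKEN_RE.findall(text.lower())


class LogIndex:
    """Hypothetical in-memory inverted index: term -> set of line ids."""

    def __init__(self):
        self.lines = []                   # line id -> original log line
        self.postings = defaultdict(set)  # term -> ids of lines containing it

    def add(self, line):
        """Store one log line and index every term in it."""
        line_id = len(self.lines)
        self.lines.append(line)
        for term in tokenize(line):
            self.postings[term].add(line_id)
        return line_id

    def search(self, query):
        """Return the lines that contain every term of the query."""
        terms = tokenize(query)
        if not terms:
            return []
        # .get avoids creating empty entries for terms never seen.
        sets = [self.postings.get(term, set()) for term in terms]
        hits = set.intersection(*sets)
        return [self.lines[i] for i in sorted(hits)]


index = LogIndex()
index.add("Nov 8 10:00:01 host1 sshd[123]: Failed password for root from 10.0.0.5")
index.add("Nov 8 10:00:02 host1 sshd[123]: Accepted password for alice from 10.0.0.6")
index.add("Nov 8 10:00:03 host2 kernel: disk error on sda")

for line in index.search("failed password"):
    print(line)
```

The point of the structure is that a query only touches the posting sets for its own terms instead of scanning every line, which is why indexed search stays fast as the log volume grows.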

Anton is correct that search technology can also be extended to determine and highlight the root cause. This is definitely true and possible to implement. I believe we will see tools, open source or commercial, with these kinds of features in the near future.

I would love to hear more thoughts from everyone on this topic.
