Log-File-Based Auto-Remediation

Started by kalpana, Nov 07, 2023, 03:08 AM


kalpana

Although open source log management projects exist, and it's always been possible to script incident-response auto-remediation, which open source projects focus on this at scale? It seems like many sysadmins have to reinvent the wheel every time something goes wrong. We all rely on forums like this one, searching for errors found in logs. No doubt devs are working to fix software so that errors do not occur, but there are so many OS logs beyond syslog that tracking them down is pretty time-consuming and sometimes frustrating, especially when app logs are included. In fixing day-to-day issues, I've come across many logs that I didn't even know existed.

So what prevents the development of a simple AI system in which errors in logs kick off an automated forum search or general web search, and which then uses a tactic similar to hardware probes, attempting to fix the issue by working through a series of troubleshooting test scripts? As an abstraction, I am reminded of Prolog-style forward/backward chaining based on logic or condition-action rules. It seems like it would be no more complex than following a logic-tree flow chart, although the tree would end up being massive. No doubt seemingly endless iterations would be required to perfect such an effort. Has anyone heard of such a thing? Automation seems like it could be helpful here.
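
To make the condition-action idea a bit more concrete, here is a rough sketch of forward chaining over facts extracted from log lines. Everything in it is made up for illustration: the regex patterns, the fact names, the rules, and the `action:` labels are all assumptions, and it only proposes actions rather than running anything.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    conditions: frozenset  # facts that must all hold for the rule to fire
    conclusion: str        # fact (or proposed action) derived when they do


# Hypothetical patterns that map raw log lines to symbolic facts.
LOG_PATTERNS = {
    "disk_full": re.compile(r"No space left on device"),
    "service_down": re.compile(r"Failed to start \S+"),
}

# Hypothetical rule base; a real one would be far larger.
RULES = [
    Rule(frozenset({"disk_full"}), "needs_cleanup"),
    Rule(frozenset({"needs_cleanup"}), "action:rotate_logs"),
    Rule(frozenset({"needs_cleanup", "service_down"}), "action:restart_service"),
]


def facts_from_logs(lines):
    """Turn log lines into a set of facts using the regex table."""
    facts = set()
    for line in lines:
        for fact, pattern in LOG_PATTERNS.items():
            if pattern.search(line):
                facts.add(fact)
    return facts


def forward_chain(facts, rules):
    """Fire rules repeatedly until no new facts can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= derived and rule.conclusion not in derived:
                derived.add(rule.conclusion)
                changed = True
    return derived


def proposed_actions(lines):
    """Return the sorted list of actions the rule base suggests."""
    derived = forward_chain(facts_from_logs(lines), RULES)
    return sorted(f for f in derived if f.startswith("action:"))
```

Calling `proposed_actions` on a handful of log lines returns the suggested actions as strings. The hard part in practice would be building and maintaining the rule base, and deciding which actions are safe to run unattended; the chaining loop itself is the easy bit.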