Hi,
Sorry if I got a little long-winded. Your input is appreciated.
After some comments the_grinch made in another discussion (
http://www.techexams.net/forums/off-topic/106145-what-some-logging-solutions-cisco-devices.html) and in this thread (
http://www.techexams.net/forums/off-topic/106591-logs-oh-so-important.html), I decided to take some time and spin up a VM to get the first two pieces of the ELK stack working. At the moment, Elasticsearch and Logstash are operational and collecting logs from three devices in my small home lab. I ran into a few gotchas during the setup but was able to work through them with some Googling. Overall, the process wasn't that bad, but it does bring up some questions about the design of a logging stack.
During my searches I came across several variants of the logging stack and want to explore some of the pros and cons. Ultimately, simplicity is the main goal: at some point this system will be handed off to someone else, and besides that, I don't want to spend 25 hours a week maintaining it either. Some variants I have seen use Redis as a broker in the middle, another uses rsyslog writing to files that Logstash then picks up, another ships directly to Logstash before the logs are dumped into Elasticsearch, and yet another uses Graylog2 to sit between the endpoints and Elasticsearch.
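To illustrate the Redis variant as I understand it (a sketch only; the hostnames, port, and key name are placeholders, not anything I am running): a lightweight Logstash "shipper" pushes raw events into a Redis list, and a separate "indexer" Logstash drains that list, does the parsing, and writes to Elasticsearch.

```
# shipper.conf -- sits close to the endpoints (sketch)
input  { syslog { port => 5514 } }
output {
  redis {
    host      => "broker.example.com"
    data_type => "list"
    key       => "logstash"
  }
}
```

```
# indexer.conf -- drains the Redis list and feeds the backend (sketch)
input {
  redis {
    host      => "broker.example.com"
    data_type => "list"
    key       => "logstash"
  }
}
output { elasticsearch { } }   # output options vary by Logstash version
```

The appeal, as I understand it, is that Redis buffers bursts so the indexer can fall behind without dropping logs; the cost is one more moving part to maintain.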
When I first started digging into this, I hadn't thought about an overall design; I was merely interested in getting something to work. Now that I have a very simple install working, I've started thinking about how this would be implemented in my enterprise network. Add in the suggestions to run OSSEC on the servers and the desire to use TLS for shipping logs where possible, and a seemingly simple endeavour gets a little complicated.
Just as a reminder, I would be looking at logging for around 350+ servers, which include Windows, Linux, AIX, UCS, ESXi hosts, NetScalers, and all iLO and DRAC messages. On top of that, there are Apache logs, IIS logs, email logs from Zimbra, all WAN routers, ASAs, a firewall, and another 100+ internal networking devices. Down the road, there may be a need/interest for printers, power infrastructure, building automation systems, and some medical devices too.
The question at this point is: what is the best way to get logs from an endpoint (EP) to the Central Log Repository (CLR) and then to the Elasticsearch Database Server (ESS) so they can be viewed through a Frontend UI Server (UIS)? I see the overall setup as EP => CLR => ESS <=> UIS. I don't know whether the ELK stack or what I am going to term the EGL stack (with Graylog2 in place of some of the pieces) will be best. I think detailing the options as I see them may help determine the best fit, or at least put it all out there for consideration by those of you much more versed in the subject than I am.
# Endpoint
Using rsyslog on *NIX seems like a no-brainer. It is there by default on RHEL/CentOS and really doesn't need much to get it shipping logs, plus it supports encrypting the connection. For Windows the options are less obvious, but it seems that using NXLog to ship event logs is the best choice, and it also supports TLS encryption. Most everything else will send to rsyslog out of the box, although whether everything else can encrypt its logs will have to be determined later.
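For reference, here is roughly what I picture on the *NIX side (a sketch only; the hostname, port, and certificate path are placeholders, and the TLS pieces need the rsyslog-gnutls package installed):

```
# /etc/rsyslog.d/50-forward.conf (sketch)
$DefaultNetstreamDriver gtls
$DefaultNetstreamDriverCAFile /etc/pki/tls/certs/clr-ca.pem
$ActionSendStreamDriverMode 1              # require TLS for this action
$ActionSendStreamDriverAuthMode x509/name  # verify the server certificate

# @@ = TCP; a single @ would be UDP
*.* @@clr.example.com:6514
```

And on the Windows side, something along these lines for NXLog (again a sketch; the host and port are placeholders, and om_tcp would get swapped for om_ssl once certificates are sorted out):

```
# nxlog.conf (sketch)
<Extension syslog>
    Module  xm_syslog
</Extension>

<Input eventlog>
    Module  im_msvistalog          # Windows event log on Vista/2008 and later
</Input>

<Output clr>
    Module  om_tcp
    Host    clr.example.com
    Port    5514
    Exec    to_syslog_ietf();      # convert events to RFC 5424 syslog
</Output>

<Route eventlog_to_clr>
    Path    eventlog => clr
</Route>
```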
I have to admit, I am not really familiar with OSSEC's logging options. How would OSSEC fit into the mix? Would OSSEC write logs to the local filesystem, which are then shipped via normal methods, or would OSSEC take over the shipping duties, negating the need for NXLog on Windows? Also, how easy would it be to integrate OSSEC after setting up something else to ship logs?
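I have only skimmed the OSSEC docs, so treat this as an assumption rather than experience: agents report to an OSSEC manager over its own protocol, and the manager can forward its alerts on to a syslog destination, which would let it feed the same CLR without replacing rsyslog/NXLog for the raw logs. On the manager, it looks to be roughly this (the server IP is a placeholder):

```
<!-- snippet inside /var/ossec/etc/ossec.conf on the manager (sketch) -->
<ossec_config>
  <syslog_output>
    <server>10.0.0.50</server>
    <port>514</port>
  </syslog_output>
</ossec_config>
```

plus enabling and restarting the syslog forwarder:

```
/var/ossec/bin/ossec-control enable client-syslog
/var/ossec/bin/ossec-control restart
```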
Another question related to the endpoint: what should be logged? Are you logging everything, or just certain things to cut down on network chatter?
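If the answer turns out to be "not everything", I assume a lot of the trimming could happen right in the endpoint's forwarding rule, something like this (a sketch only; the severities and facilities are arbitrary picks, and the host/port are placeholders):

```
# Forward warning-and-above plus all auth events; drop cron and mail noise
*.warning;auth,authpriv.*;cron.none;mail.none @@clr.example.com:6514
```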
# Log Repository
One big question I have is whether it is better to ship to a plain old rsyslog server and have logstash/graylog2 pick up the logs from there, or to point the endpoints at logstash/graylog2 directly for parsing and dumping to the backend. Also, how are people dealing with HA?
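To make the two options concrete, here is how I picture them in Logstash terms (a sketch; the path and port are placeholders, you would use one input or the other, and I've left the output as stdout for testing rather than guessing at version-specific elasticsearch output options):

```
# Option 1: plain rsyslog receives on the CLR and writes files; Logstash tails them
input {
  file {
    path => "/var/log/remote/*/*.log"
    type => "syslog"
  }
}

# Option 2: Logstash is the listener itself, no intermediate files
input {
  syslog {
    port => 5514
    type => "syslog"
  }
}

# stdout while testing; swap in the elasticsearch output once parsing looks sane
output { stdout { codec => rubydebug } }
```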
# Elasticsearch backend
I think this is the one piece I don't have any questions about, other than resource allocation, which I can figure out through trial and error. However, your input is still welcome.
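The one rule of thumb I keep running into (an assumption on my part, not something I have load-tested) is to give Elasticsearch roughly half the box's RAM as heap, capped somewhere around 30 GB, and leave the rest for the filesystem cache. For a package install, that looks something like this (the values are just examples):

```
# /etc/sysconfig/elasticsearch
ES_HEAP_SIZE=8g

# /etc/elasticsearch/elasticsearch.yml -- minimal single-node starting point
cluster.name: central-logs
node.name: ess01
path.data: /var/lib/elasticsearch
```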
# User Interface
Since I have not gotten around to installing Kibana yet, I can only go off the articles I have read. It looks pretty amazing and seems very versatile, except for one thing: I cannot find anything about setting up multiple logins and locking access down. Graylog2 seems to have that covered. I am going to set up both for review, but are there any other obvious pros and cons I should know about?
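For what it's worth, the workaround I keep seeing for Kibana's lack of authentication is to hide it behind a reverse proxy with HTTP basic auth, roughly like this (a sketch; the hostname, port, and htpasswd path are placeholders, and the upstream port depends on how and where Kibana is actually served):

```
# /etc/nginx/conf.d/kibana.conf (sketch)
server {
    listen 80;
    server_name logs.example.com;

    auth_basic           "Log viewer";
    auth_basic_user_file /etc/nginx/kibana.htpasswd;   # created with the htpasswd tool

    location / {
        proxy_pass http://127.0.0.1:5601;   # wherever Kibana is listening
    }
}
```

That gates access, but it doesn't give per-user roles or dashboards, which is where Graylog2 seems to have the edge.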
Again, thanks for any input. I appreciate you taking the time to read this.
Regards,