Log aggregation and analysis
# general
s
Log aggregation and analysis: We are looking for a cost-effective, scalable, and reliable log analysis solution. We currently use a mix of CloudWatch, Datadog, and Sumo Logic. Is there anything you can share about how you resolved a similar issue? What was your architecture? Any open source tooling? What paid tools did you use? Thanks
n
This is guided by:
• log volume and rate
• use cases for queries
• end users (security, devs, ops, etc.)
• structure of events
• retention time
You probably have some mix of folks needing fast, reliable eventing that creates metrics and metadata on the stream volume, highly detailed and relatable events (like traces), and bulk queries to answer security and product type questions, which probably means you need multiple tools/datastores.
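To make the first and last factors concrete, here is a rough back-of-envelope sketch of how volume, rate, and retention combine into a storage figure. The function name and the workload numbers are made up for illustration, and it ignores compression, indexing overhead, and replication, so treat the result as an order-of-magnitude guide only.

```python
SECONDS_PER_DAY = 86_400


def estimate_storage_gb(events_per_sec: float,
                        avg_event_bytes: float,
                        retention_days: int) -> float:
    """Rough raw storage (in GB) needed to keep `retention_days` of logs.

    Ignores compression, index overhead, and replicas.
    """
    bytes_per_day = events_per_sec * avg_event_bytes * SECONDS_PER_DAY
    return bytes_per_day * retention_days / 1e9


if __name__ == "__main__":
    # Hypothetical workload: 2,000 events/s averaging 500 bytes each.
    for days in (14, 90, 365):
        gb = estimate_storage_gb(2_000, 500, days)
        print(f"{days:>3} days of retention: {gb:,.0f} GB raw")
```

Running this for a short and a long retention window side by side tends to show quickly why one datastore rarely fits every use case.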
s
What tools would you recommend?
Thanks Nathan
e
If you want something that looks promising as a newer alternative to OpenSearch, check out Quickwit.
n
Grafana Loki and Graylog are two tools not in your list that I've used before and work pretty well
oh, hahah, and logz.io and elastic.co of course 🙂
and if you wanted to "just do it yourself in AWS", this is a good high-level overview
s
Thanks @Nathan Hruby
r
I've seen growing use of Athena (as described in that Aha! post) for the log analysis use case, at a cost way, way lower than CW Logs.
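Athena bills by the amount of data a query scans, so most of the saving comes from laying logs out in S3 under partitions and always filtering on them. A minimal sketch of building such a query follows. The table name, the `year`/`month`/`day` partition columns, and the `message` column are assumptions for illustration, not something taken from the post.

```python
from datetime import date


def build_athena_query(table: str, day: date, needle: str,
                       limit: int = 100) -> str:
    """Build SQL that restricts the scan to a single day's partition.

    Assumes (hypothetically) a table partitioned by string columns
    year/month/day with a free-text `message` column.
    """
    # Escape single quotes so the search term cannot break the SQL literal.
    # LIKE wildcards (% and _) inside `needle` are not escaped in this sketch.
    safe = needle.replace("'", "''")
    return (
        f"SELECT * FROM {table} "
        f"WHERE year = '{day:%Y}' AND month = '{day:%m}' AND day = '{day:%d}' "
        f"AND message LIKE '%{safe}%' "
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    print(build_athena_query("logs.app_events", date(2024, 3, 5), "timeout"))
```

The resulting string could then be submitted with boto3's Athena client (`start_query_execution`), which needs a result output location or workgroup configured on your side.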
For some element of real-time analysis, I’d be doing something like: Kinesis -> stream filters (to trigger a Lambda) -> Firehose -> OpenSearch collection, and then OpenSearch Serverless for analysis.
n
OpenSearch just gets expensive at massive data volumes. This is where a Firehose that splits the data into S3 for longer durations (months, years, etc.) and OpenSearch for short term (days, weeks) gets very cool.
d
Coming late to this thread since I just found it via the newsletter, but ChaosSearch (SaaS) seems really interesting for this. I've watched a demo but haven't implemented it yet.
e
@Daniel Serodio Isn't that what Quickwit does, then? They also scale out on S3, Elastic-ish.
d
Never heard of Quickwit before, but looking at its homepage, it looks like ChaosSearch is 100% SaaS/"Serverless" (no need to worry about servers) while Quickwit is self-hosted/self-managed
e
depends on what you need / want :)
a
We, kloudfuse.com, have a unified observability platform that addresses logging, metrics, traces, Kubernetes events, etc., all in a single database that is custom designed from the ground up for storing time series and event records. We have a free-to-download version that you can use forever. I will be happy to demo the product (you can also try a live demo from our website). Because of the storage efficiency and the ability to run in your VPC, the solution is extremely cost-effective, providing 70-90% cost savings compared to SaaS vendors. Hope you will try it out.
j
Really interesting thread! We were running a self-hosted instance of Graylog in K8s. It seemed pretty solid, and having the self-hosted option makes cost management easier. We ended up migrating away from it, though. Personally, I found the query language rather unintuitive, and it came with quite a maintenance overhead.