# platform-toolbox
t
I’m looking at log analysis tools. We currently have APM and error tracking via Sentry. It’s working great for us, but we’re not heavily invested yet. If I were doing this from scratch, I’d find a single solution to do all of it (e.g. New Relic, Datadog). I see 2 options:
1. implement a single solution and slowly migrate off Sentry (simpler, everything in one place)
2. implement a log-analysis-specific tool (e.g. Splunk) to use in addition to Sentry (no migrating, more specialized tools)
Opinions and experiences?
m
Do you have any regulatory/compliance requirements?
r
we use both Sentry and New Relic. personally when looking at a very specific error, i like Sentry because you can add a bunch of execution context to the trace... I use New Relic for Dashboards, Alerts, and generally when investigating things across multiple services...
t
@Mark Cheshier nothing major
m
how about your budget? In my experience NR/DD have been cheap to start but they get really expensive as you build up
n
DD is fantastic if you can afford it. But as others have said it gets expensive very quickly, depending on your type of infrastructure
m
without a win for multiple departments (security, compliance, SRE, etc.) I'd be hesitant to move off Sentry if it's getting the job done. do you have any big customers with lots of SLA requirements? do you want this more for monitoring internal services or just externally-facing ones? your devs will almost certainly start pasting that NR key into everything (metaphorically, hopefully not the actual key!) if you let them and it can get crazy
t
We might be able to be selective about what logs get loaded, and able to be aggressive about archiving old ones. I've heard warnings about price, we'll try to be proactive about strategies to stay efficient, but that's hard to do in practice. The use case right now is analytics on our robot logs, but we'd use it for everything, I imagine. No SLAs right now though. Great questions. What are you getting at with the external/internal question? Really appreciate the perspectives. Sounds like we should definitely stay on Sentry for now, even if we have another APM solution. We can cut it if we like the other one better, or it's too expensive.
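For what it's worth, here is a rough sketch of what "being selective about what logs get loaded" could look like, assuming the services log through Python's standard `logging` module. The class name, the `sample_rate` knob, and the `StreamHandler` stand-in for a real vendor handler are all made up for illustration, not taken from any vendor's SDK:

```python
import logging
import random
from typing import Optional


class SelectiveShipFilter(logging.Filter):
    """Keep every WARNING-or-higher record, and only a sample of lower-level ones."""

    def __init__(self, sample_rate: float, rng: Optional[random.Random] = None):
        super().__init__()
        # Hypothetical knob: 0.1 means roughly 10% of DEBUG/INFO records are kept.
        self.sample_rate = sample_rate
        self.rng = rng or random.Random()

    def filter(self, record: logging.LogRecord) -> bool:
        if record.levelno >= logging.WARNING:
            return True  # always ship warnings and errors
        # random() is in [0.0, 1.0), so 0.0 drops everything and 1.0 keeps everything
        return self.rng.random() < self.sample_rate


def build_shipping_handler(sample_rate: float = 0.1) -> logging.Handler:
    # StreamHandler is a placeholder for whatever handler forwards to the log vendor.
    handler = logging.StreamHandler()
    handler.addFilter(SelectiveShipFilter(sample_rate))
    return handler
```

Attaching the filter to the shipping handler only (not to the logger) would leave local log files complete while cutting what gets sent to the paid tool.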
h
We use DD and Sentry. Sentry is more verbose for errors and its error UI is nicer; DD gets expensive.
m
mostly that these tools have a ton of value where you have agreements with customers to hit certain SLAs, etc. They make it much much easier for SREs to figure out what's going on in the entire stack. There is certainly a great use case for internal as well, but it's harder to make the 'hey if we don't have a quick MTTR on this we lose x dollars so we should pay y for this great tool' argument to spend the $. Or to put it another way - can you quantify the benefit of moving from Sentry to one of these tools? Like if you expect big growth in traffic or complexity you could argue that you won't need another FTE, or something like that.
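A back-of-the-envelope version of that "lose x dollars, pay y" argument, as a sketch only. The function names and every number in the example are hypothetical placeholders; plug in your own incident counts and downtime cost:

```python
def annual_downtime_savings(incidents_per_year, mttr_hours_before,
                            mttr_hours_after, cost_per_hour):
    """Estimated yearly dollars saved by a shorter mean time to recovery."""
    hours_saved = incidents_per_year * (mttr_hours_before - mttr_hours_after)
    return hours_saved * cost_per_hour


def tool_pays_for_itself(savings, annual_tool_cost):
    """True when the estimated savings cover the yearly cost of the tool."""
    return savings >= annual_tool_cost


# Made-up example: 12 incidents a year, MTTR drops from 4h to 1h,
# and an hour of downtime costs $5,000.
savings = annual_downtime_savings(12, 4, 1, 5000)
print(savings)                                  # 180000
print(tool_pays_for_itself(savings, 100000))    # True
```

With no SLAs and mostly internal services, the `cost_per_hour` input is the hard one to pin down, which is the point about the argument being harder to make.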
t
Great way to break it down. Thanks Mark!
n
Depends what you're using log analysis for. If it's for reducing MTTR, then we find that Honeycomb tracing gives an orders-of-magnitude reduction there. But that requires your engineers to change their mindset from logging to tracing, and that's not trivial
a
Hey @Tim Becker, it depends on what you really need it for, but we built our whole observability stack on Datadog and finally managed to move away from Sentry. Sentry was too noisy and too verbose, and the engineers ended up not using it at all. Datadog is a pretty great product, and whilst I agree that it can get expensive quickly, you can definitely keep the cost down by managing your logging, which in my experience is the most expensive part of it. Datadog was also pretty great on all things security and compliance (I work in a heavily regulated industry).
c
Anybody here using self-hosted Sentry? The number of databases on the backend is intimidating, but wondering if something like https://github.com/Rungutan/sentry-fargate-cf-stack could help make those easier to manage by paying AWS to do it. On-prem would help with compliance requirements, and I’m wondering how the cost would compare to SaaS.