I've written a few posts on Hunting and Machine Learning, most of which position themselves as the 'bleeding edge' of threat intelligence foo-magic. Oftentimes, however, these somewhat advanced approaches can be either overwhelming or too narrow in scope for beginners. Sometimes (most times?) it's as simple as "6th grade math" to find the bad guys.
Technical Debt
At a glance this utility almost seems a bit too simple, and most engineers might even say "yeah, but what about things like networkx??". That was my first impression, but then I stopped myself. If I've learned anything over the years, it's this: write simple tools that solve very simple problems. Do it in a way that lets the simple solution be reused over and over again. The fewer dependencies that tool has, the less friction there will be in adoption.
What I find most intriguing about Matt's solution is that he took a somewhat daunting problem and found a very simple solution. Most engineers (myself included!) would probably have tried to tangle this up with something like networkx or some machine learning framework, if only to show how "incredibly sophisticated we THINK we are". That approach would have found similar answers to each question asked of it, but implementers would have had a harder time incorporating it because the solution was not easily understood. Things that are not easily understood carry technical debt, and technical debt is bad.
In the security arena there's so much "here's some magic! it'll solve EVERYTHING FOR YOU!" these days that it's hard to really rely on code you can't read and/or don't understand. Over the years, most of us have been overpromised, outright lied to, or have tried leveraging a large framework that solved more problems than we immediately had.
Payment Due
When the S$!T finally hit the fan, the creators of those solutions weren't held accountable for their complexity. YOU WERE. In some cases we could blame the company and open up a support ticket (nothing like paying someone to fix code that we already paid for), but in a lot of cases, we had simply implemented something we did not understand. While it might have bailed us out of a problem for the moment, over time that technical debt adds up. Eventually that debt has to be reconciled.
Every new version of a project I write starts out with a clean slate: a chance to clear out all the deadwood, and a chance to simplify and justify WHY code should be kept, not why it should be purged. What I love about Matt's approach is its simplicity. It takes a very hard statistics problem and makes it both actionable AND approachable, all while Making the Internet a Better Place.
Kudos, Matt. Great tool, keep up the excellent work!