Prototyping CIFv4 - Streaming YOUR Threat Intel out of the Box

Earlier in this series, I talked about the real-time streaming aspects of CIFv4. While ZeroMQ is one of the most powerful frameworks when it comes to messaging, its adoption can be a bit overwhelming for some. It's not a tool like Kafka or RabbitMQ; it's more of a language, like C. I've spent the last 5 or 6 years working with it, and while I know my way around, there are still plenty of days I spend learning something new.

In CIFv2 and v3, the ZeroMQ interfaces are partially exposed, giving you the ability to stream threat intel from the router in a quasi PUB/SUB manner. What I've learned over the years is that elegant and advanced technology isn't always the right answer. I absolutely love lower-level messaging frameworks such as ZeroMQ. I love what I'm able to construct with them, how decentralized they are and how their simple nature keeps a lot of unnecessary complexity out of the development ecosystem.

Reducing Complexity

That said, I've also learned that the world revolves around HTTP, at least for now. Most people grok the idea of pulling a feed every day, every hour or every 15 minutes, but they're still not quite sure what to do with streaming data. Not because they're dumb, but because there's still a lot of complexity around getting it to work, so there's no easy, inexpensive way to gain exposure to it. You can easily stand up a set of Kafka, gRPC or ZeroMQ queues, but then you're stuck with a bunch of complexity you have to re-tool into the rest of your frameworks. Worse, you then have to TEACH OTHER PEOPLE WTF YOU JUST DID.

Then you start researching how to stream over HTTP using something like WebSockets. An hour into that rabbit hole you realize they have their own complexity issues too. At least with WebSockets, you're teaching on top of a technology that's already pretty well understood. Unlike the other frameworks, there's also TONS of pre-written documentation and examples in just about every language on the planet. As I've said before, if you're gonna prototype something, start with stuff that's more commonly known and easier to teach with, then go down the rabbit hole when you need scale.

Balance

In CIFv4, you get all the power of ZeroMQ, between the components as well as via the interfaces themselves. Additionally, I've added the capability to stream indicators out via WebSockets. This requires a bit of re-tweaking of the original cif-httpd service by introducing things like Gunicorn, which brings its own complexity (timeouts, more memory, etc). Placing things in front of it, such as Elastic Load Balancers or Nginx, can complicate things a bit more, since a WebSocket is a real-time, long-lived connection. It's not impossible, but more complexity creates more areas where things can go wrong.

With some very simple library code, in just about any language, you can easily start streaming your threat intel every which way. I've included a very simple Python example in the CIFv4 SDK, but when you start looking at other languages like JavaScript, the possibilities are endless. Maybe you write a tool that looks for correlations across the data, or something that spots discrepancies between highly confident data and data that has a low machine learning prediction score. Maybe you use it to stream highly confident data directly into your IDS instead of waiting 15 minutes for a classic feed pull.
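To give a feel for how little code that takes, here's a rough sketch of a WebSocket consumer that only passes along confident indicators. This is NOT the example from the CIFv4 SDK. It assumes each message is a JSON object with `indicator` and `confidence` fields, that a cutoff of 3.0 is meaningful for your data, and that you have the third-party `websockets` package installed. The function names and the endpoint URL are made up for illustration.

```python
import json


def parse_indicator(message, min_confidence=3.0):
    """Decode one JSON message; return the dict if it is confident enough, else None.

    The "confidence" field name and the 3.0 default are assumptions.
    """
    try:
        indicator = json.loads(message)
    except (TypeError, ValueError):
        return None  # skip anything that is not valid JSON
    if not isinstance(indicator, dict):
        return None  # skip JSON that is not an object
    try:
        confidence = float(indicator.get("confidence", 0))
    except (TypeError, ValueError):
        return None  # skip missing or non-numeric confidence values
    return indicator if confidence >= min_confidence else None


async def stream_indicators(url, min_confidence=3.0):
    """Connect to a WebSocket endpoint and print indicators that pass the filter."""
    # Imported here so parse_indicator() works without the package installed.
    import websockets  # third-party: pip install websockets

    async with websockets.connect(url) as ws:
        async for message in ws:
            indicator = parse_indicator(message, min_confidence)
            if indicator is not None:
                print(indicator.get("indicator"), indicator.get("confidence"))
```

To try it, you'd run something like `asyncio.run(stream_indicators("ws://localhost:5000/stream"))`, swapping the hypothetical URL for whatever your cif-httpd instance actually exposes. Keeping the parsing separate from the connection handling means you can test the filter without a live stream.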

You could also write a set of tools that generate alerts when something matches your IP space, brand or sector. Maybe even one of those super sexy "pew pew" maps. The cool thing about really simple, real-time data? The sky's the limit.
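The IP space idea, for instance, can be sketched with nothing but the standard library. This assumes the indicator arrives as a plain string (an IP address, a CIDR block, or something else entirely like a domain). The networks below are documentation ranges standing in for your own, and `matches_my_space` is a name I made up for the example.

```python
import ipaddress

# Placeholder networks (RFC 5737 documentation ranges); replace with your own.
MY_NETWORKS = [
    ipaddress.ip_network(n) for n in ("192.0.2.0/24", "198.51.100.0/24")
]


def matches_my_space(indicator, networks=MY_NETWORKS):
    """Return True if the indicator is an IP or CIDR that overlaps our networks."""
    try:
        # strict=False lets a bare address parse as a single-host network
        candidate = ipaddress.ip_network(indicator, strict=False)
    except ValueError:
        return False  # not an IP or CIDR (a domain, URL, hash, etc.)
    return any(
        candidate.version == net.version and candidate.overlaps(net)
        for net in networks
    )
```

Hook that into whatever is reading the stream and you can fire an alert the moment `matches_my_space(...)` returns `True`, rather than finding out at the next feed pull.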

Did you learn something new?