Building a Threat Exchange as a Platform

There's definitely an art to designing a platform. You might think of it as a piece of code with a well-defined API, a library with a man page, or a 'software as a service' function where I can use a credit card to access an API. I tend to think of it as "what abstraction are you offering me that I can build on?" Meaning, are you abstracting away enough of a specific problem to provide an abundance of value to me? Do I need to think about things like credit card validation, or will you just do the magic for me?

You can do a lot of research on platforms in the digital world, but there is so much overlap that sometimes it just makes sense to take a look at the physical world first. For me, the "ah-ha!" moment was really cars. I had no idea cars leveraged common platforms for their different models. I understood that cars, trucks, etc. came in different versions, but the concept of a 'car platform' just hadn't occurred to me. The analogy is especially helpful when trying to convey the concept of "platform" to people who don't think about platforms every day. Sometimes the best platforms are the ones you forget are even there.

To me, CSIRTG started out as an easy way to solve the threat hosting and exchange problem. I honestly couldn't care less about a fancy user experience, nor am I trying to compete with those who do. I'm pretty terrible at UI, but I have a fascination with lowering the barrier when it comes to making data actionable. After years of building CIF and trying to tackle the problem of parsing various types of feeds and formats, a few simple things stood out to me:

  • No agreement on a common taxonomy (no one really wants IDMEF, IODEF, STIX or XML)

  • Generally most people don't WANT JSON, but it's the lesser of two evils

  • Generally everyone really likes CSV

  • Most operators understand what an API and Platform are, but have very few cycles to use them effectively

  • Very few groups understand how to scale indicator delivery (individually or as a feed)

  • Even fewer use technology such as AWS or Auto Scaling groups

  • Compression is an afterthought

  • Machine Learning is Black Magic (hint: it's not, it's just math)

  • We still use email for a lot of things; because it's a well understood platform

  • It's still really hard to sell you my feed (billing, APIs, accounts, query limits, etc)

The funny thing is, I didn't really understand most of this until I started actively solving some of these problems. I knew it had to be an API for something, whether it was indicators, feeds or something else (threat models?). It wasn't really clear to me until I started asking myself the very simple question: What are you a platform for? There are already many different threat-butt platforms out there that do a lot of this stuff, but to me it's the sum of the parts that makes something a platform.

The Power of Abstraction

Are you an abstraction that makes me forget about all the complexities of a problem? Amazon provides me with such an API that sometimes I actually forget there are servers in a rack somewhere. I forget that the load balancers automatically spin things up and down with the load. I forget about disk drives and CPUs, and only worry about GPUs when I have to build out a machine learning model. I forget about my credit card being charged every month, and only think about provisioning once a year when I have to make a bulk purchase for a new service.

To me, great platforms separate themselves from the rest of the pack both by articulating their cumulative value and by keeping the barriers to entry frictionless. This comes in the form of not just documentation, but videos, examples, freemium usage and interactive APIs: things that help guide me to the sum of the value, not just the value of "a single EC2 instance". Want to build a machine learning model? Use this serverless tool, and we'll abstract the rest so you can focus on the task.

Sometimes I even forget that if I were to solve that abstraction myself, it'd easily cost me 3-5 full-time employees to do even a fraction of what the professionals at AWS do for me. There's almost always a case to be made for "bare-metal is faster", and while I can respect that, with the AWS platform I can make that back with scale. The power of platform abstraction [to me] is really only rivaled by the phenomenon of compound interest: the effects aren't felt immediately, but they grow exponentially over time.

I started building CSIRTG as a simple way to host and consume threat intelligence at scale. It was, for me, a way to abstract away all the servers, load balancers, auto-scaling, compression, formats, bandwidth and consumption tools. Anyone who's hosted a feed will tell you the hardest parts of hosting threat intel are these things, plus monetization. All these things cost money, and if you're successful your bandwidth bill alone may sink you. Wrap these things up into an API, and now you're abstracting out developer costs, documentation costs, usage costs and billing costs.
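
To make the "formats" and "compression" items in that list a little more concrete, here is a minimal Python sketch that renders a handful of indicators as CSV and gzips the result. The field names (indicator, itype, tags), the sample records and the helper names are all made up for illustration; this is not the CSIRTG schema or API, just the kind of plumbing a feed host ends up writing.

```python
import csv
import gzip
import io

# Hypothetical indicator records; the field names are assumptions for this sketch.
INDICATORS = [
    {"indicator": "192.0.2.10", "itype": "ipv4", "tags": "scanner"},
    {"indicator": "example.com", "itype": "fqdn", "tags": "phishing"},
    {"indicator": "http://example.org/bad.exe", "itype": "url", "tags": "malware"},
]

FIELDS = ["indicator", "itype", "tags"]


def to_csv(rows, fields=FIELDS):
    """Render a list of dicts as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def compress(text):
    """Gzip a text feed so it costs less bandwidth to deliver."""
    return gzip.compress(text.encode("utf-8"))


def decompress(blob):
    """Inverse of compress(), as a consumer of the feed would run it."""
    return gzip.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    # Repeat the sample rows to mimic a larger feed.
    feed = to_csv(INDICATORS * 1000)
    blob = compress(feed)
    print(f"raw: {len(feed.encode('utf-8'))} bytes, gzipped: {len(blob)} bytes")
```

Feeds tend to be highly repetitive, so the compressed copy is usually a small fraction of the raw size, and that difference is exactly what shows up on the bandwidth bill.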

True Value is in the Sum of its Parts

No single feature in CSIRTG is really a game changer. In fact, a lot of the people I talk to still think about threat intel in terms of "hosting a text file via HTTP". They offload a lot of the parsing, thinking and development work to the consumer. That's not to suggest the model is bad, but as technology evolves, newer customers are expecting that kind of work to be abstracted for them. Conversely, newer competitors are adding value by providing an API, which makes it that much easier to integrate with other services.

Does your threat intel feed support webhooks into Slack? Does it make sense for you to pay a developer to build that, or to use a service that has it built in? Do you want to pay to develop your feed as a realtime WebSocket stream? Do you want to create a billing system from scratch and deal with usage tracking, user management, API keys, etc.? Or would you rather have a percentage of the subscription revenue dumped into your PayPal account every month? Do you want to spend time trying to understand machine learning, or use a platform that already incorporates it as part of its API?
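
For a sense of what even the smallest of those integrations involves, here is a rough Python sketch of the Slack webhook case. Slack's incoming webhooks accept an HTTP POST with a JSON body containing a "text" field; everything else here (the placeholder URL, the function name, the message wording) is an assumption for illustration. The sketch only builds the request and never sends it.

```python
import json
import urllib.request

# Placeholder only: a real incoming webhook URL is issued per Slack workspace.
WEBHOOK_URL = "https://hooks.slack.com/services/EXAMPLE/PLACEHOLDER"


def build_slack_request(indicator, tags, url=WEBHOOK_URL):
    """Build (but do not send) a POST request carrying a simple text message."""
    message = f"New indicator: {indicator} (tags: {', '.join(tags)})"
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_slack_request("example.com", ["phishing", "botnet"])
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Delivering it would be urllib.request.urlopen(req) with a real webhook URL.
```

The request itself is the easy part. Retries, rate limits, per-customer webhook settings and secret storage are where the developer time goes, which is the argument for letting a platform own them.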

The value of CSIRTG isn't in any specific feature; it's in abstracting away all the 'boring' stuff. It probably takes 3-5 FTEs to build, host and operate this kind of platform. It's probably another 3-5 FTEs for billing, 'customer success' and platform development. Of course you could do some of this yourself, or you can hire CSIRTG to abstract it all away. Are you in the content delivery and integration business, or the security business?