The Most Important Feature of CIFv4

Should I be using CIFv3 or CIFv4? When will CIFv4 be ready? Ready. In the world of open-source, that word always makes me chuckle a bit. Is anything ever really done? Every time you solve a problem, don't you always find five more waiting for you at the finish line? Where do you draw lines between versions of a platform?

There are never easy answers, but here are some rules I've tried to adhere to in recent years:

  1. Define the immediate goals for a project. Make sure they are clearly defined problems and that it's well understood how to solve them. Make a list of these and stick to it. If the solutions are not well understood, push them to the bottom of the list. Solve the known ones first.

  2. Create a well-defined prototype based on your goals, no more.. no less. Give yourself a time limit for doing so, weeks.. not months.

  3. Don't solve problems that aren't immediately obvious. Stay away from things like "oh it'd be neat if… " and "wouldn't that feature be cool!". 84% of the time, they're problems not worth solving.. not yet anyway.

  4. Ensure your prototype is deployable from day one. Enable your audience to test your assumptions from day one. This will help you tease out which problems are important and which ones might not be.

  5. Create tests from day one that address each of your problems, and create a CI pipeline so that every pull request and every release is tested.

  6. Make sure your APIs are "complete" and documented from day one. Your underlying functions will almost always need improvement, but the APIs should address your design goals and not change [much] from day one. Keep them simple and give yourself the ability to aggregate many of them into more complex interfaces as new patterns emerge. If you create too complex an API, it'll almost always need to be changed. The more it changes, the longer it takes to stabilize and consider the project "done".

  7. After a few months, once your codebase feels stable, create a Google Doc with "features I think I missed, but I'm not quite sure if I should spend time on". Likely this will be the design list for your next version. When that list feels complete (or overwhelming), freeze it and create your next prototype. At this point, you can probably slap a "beta" tag on the original version and freeze any new API changes. The more stable something is, the more people will test it; the more they test it, the more stable it becomes. If new features are asked for, consider the problems, but default towards your new code base where mistakes can more easily be made.

This list is not complete, but it gives you a general idea as to when it might be safe[er] to test the water. Risk tolerance is not universal; some folks don't mind testing an "alpha" product if it solves an immediate problem for them (and it's 'cheap'). Others may not feel comfortable unless there's a beta, RC or stable tag on the repo. I don't upgrade my iOS devices until version N.2, but I have no problem running CIF in just about any state in production locally. I can fix issues with CIF, I can't with iOS.

Historically, CIFv0, v1 and v2 did not follow this pattern. A lot of cycles were spent on CIFv2 solving problems that I thought existed, but really didn't (yet). With v3, I tried to squash some of this by simply migrating from Perl to Python, cutting newer features that users weren't really asking for and pushing them to the v4 code base. If you look at the history of v3, you'll notice the prototype was built in 2014, but the bulk of the refinements came in mid 2015, with things stabilizing in mid 2017.

CIFv4 is a completely different animal. It's the culmination of lessons learned from all previous versions of CIF. It comes with a caveat though: it only tries to solve the problems I've heard from other CIF users, and it introduces new concepts sparingly, in ways that extend existing ones. It introduces things like machine learning (via TensorFlow), but in a way that complements the existing confidence models. Instead of having to learn, understand and trust the new "probability" models, it places those statistics side by side with confidence to help you understand [and ultimately trust] what those numbers mean.

It also mixes in things like NetworkX, which gives CIFv4 an extremely simple graph-based backend, but alongside SQLite. It also embeds streaming logic into the router, but in a way that's unobtrusive and as simple as websockets rather than going straight to something like P2P. While I believe these features will be important in the future, I know they're not going to be that important to the majority of users TODAY. I used this opportunity to both bleed them into the code base [and write about them] as a means to seed some newer ideas, but not hinder the ultimate goal, which is RELEASING v4.

Done is the Engine of More

Does that mean v4 is done? I don't know. Maybe? I've definitely accomplished the majority of the goals I set out to. As far as I'm concerned the prototype is done. I've cut ~13 releases as of this post and I'm spending time running the latest release in production as a standalone box. This usually means the APIs are stable and I will be reluctant to change them.

I have learned that while the "all in one, SQLite" version might be considered stable, as we test new, more distributed configurations there are always some performance and configuration unknowns that are bound to crop up. These usually don't change the core APIs, but in some cases they do. With that, until we've been running the core code "in production" for a few months, I've been reluctant to slap a beta tag on it.

To me, beta means more or less production stable, but now we're testing those assumptions over longer periods of time. Usually this is months, not weeks. To get an RC tag I'd like to have had a few months of real data put to the test. Are the feeds clogging up the routers? Are we leaking too much memory somewhere? Does the system fall over after six months because we're not aggregating properly? Threat data is oftentimes seasonal, as is its consumption.

Are users still asking questions because the documentation isn't complete yet? Remember, a lot of less technical users are waiting for that beta tag; if you tag too early without enough tested doc, you'll get a lot of questions. These questions are a good thing: they mean users are eager to test your platform. The problem with open-source projects is that there are limited cycles to answer all of them. You need to balance the buildout of doc with the release tags; waiting until you have some tested doc in place before placing a beta tag on things can be valuable. Give yourself some time to test and write.

In the end, here are some hard numbers:

  1. It usually takes 150 hours to prototype a build (if you follow some of these rules).

  2. It takes you 3-6 months to deploy and regression test your platform, find the initial deployment bugs, write doc.

  3. It takes 6-12 months for users to trickle in and test your platform.

  4. It takes 12-18 months (assuming you immediately merge pull requests, make weekly or bi-weekly releases) for "new issue activity" to settle. This means new users testing new releases, finding issues, logging them, helping you fix them and getting your FAQ built out.

Once your issues settle down (e.g., you haven't had a non-FAQ problem logged for a few weeks or months), you can start the RC process. At this stage, RCs are basically stable releases with the ability to add in odd deployment fixes. I've found that usually by RC5 or 6, not only is the "all in one" deployment considered stable, you're almost ready to slap a beta tag on your next version.
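If you want to make that "issues have settled" check a little less hand-wavy, here's a rough sketch in Python. It is not part of CIF: the function name, the six-week threshold and the idea of feeding it a list of issue dates are all assumptions made for illustration, so pick whatever "a few weeks or months" means for your project.

```python
from datetime import date, timedelta

# Hypothetical threshold: the rule of thumb is only "a few weeks or months".
QUIET_PERIOD = timedelta(weeks=6)


def ready_for_rc(non_faq_issue_dates, today, quiet_period=QUIET_PERIOD):
    """Return True if no non-FAQ issue was logged within the quiet period.

    non_faq_issue_dates: dates on which issues NOT already covered by the
    FAQ were logged (how you classify them is up to you).
    """
    if not non_faq_issue_dates:
        # Assumption: an empty tracker counts as "settled". If nobody is
        # testing yet, you probably want to treat this as False instead.
        return True
    most_recent = max(non_faq_issue_dates)
    return today - most_recent >= quiet_period


if __name__ == "__main__":
    issues = [date(2019, 1, 1), date(2019, 2, 20)]
    print(ready_for_rc(issues, today=date(2019, 3, 1)))
```

The interesting part isn't the code, it's being honest about what counts as a non-FAQ issue and how long "quiet" needs to be before you trust it.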

As you can see, "done" is a really relative term and is more like a curve. You really have to understand where your tolerance is and what you're building for. If you're a researcher trying to push the needle, the alpha state of v4 is where you want to be. It's where you'll have the most influence over new features and design. And by the time you're ready to publish your work, v4 will be in its final release candidate stages, ready to go with you.

If, however, you're an operator just dipping your toes in the water, v3 is where you want to be. It's been banged on quite a bit, has been running in various production setups for years now and won't change all that much in future releases. Once we get to the RC stage, we only merge obvious, non-architecture related bug fixes. This means there will be very few surprises. It also means, if you need something augmented, you either have to do it locally, or start investing in v4. Only you can figure out that balance.

The most important feature of v4? It's a highly mature, open-source prototype that follows the C4 process. This means you can read it, learn from it, engage with it, teach using it and, more importantly, INFLUENCE it with something as simple as a pull request.

You can make it, yours.
