F your formats, just show me the data.

...without ANY machine learning or NLTK magic, you have a very basic and generalized pattern (or "algo" in hipster speak) that can parse and normalize most types of feeds.
If we are to succeed at making YOUR Internet a better place, we need that information to federate out among our peers. We need each of our models to be predictably influenced by our friends to help protect ourselves against threats we do not yet know about. Those models need to be transparent in order for us to gain confidence in them...
Randomly start talking to people at a conference. Head out to a bar, have a few beers, decide to build a mailing list and take down a botnet together. Create professional life partnerships. This is one of the more successful patterns, because you believe you can do anything when you get a few beers in you. All other (successful) patterns usually have their origins in this pattern, or something like it: it could be a bar, it could be a game night at a coffee house. Beer helps, but isn't always required.
It's one thing to think of "statistics" in the general sense. For instance,
"100 unique IPs scanned my darknet today".
This doesn't really tell me anything useful, other than (assuming DHCP churn is nil in a given 24-hour period) there's a bit of noise on the line. By itself, 100 isn't a useful number, and it's probably not even statistically relevant. Is it a holiday? Was part of the Internet down today? Was it the same device behind a series of NATs?....
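One way to give that "100" some context is to compare it against what the darknet normally sees. The sketch below is only an illustration, not a method taken from the text: it uses a plain z-score (my choice of technique), a hypothetical helper named `zscore`, and a week of made-up daily counts.

```python
import math
from statistics import mean, stdev


def zscore(today, history):
    """Return how many sample standard deviations `today` is from the mean of `history`."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation; needs at least two points
    if sigma == 0:
        # A perfectly flat history: any deviation at all is infinitely unusual.
        return 0.0 if today == mu else math.copysign(math.inf, today - mu)
    return (today - mu) / sigma


# Hypothetical unique-IP counts for the previous seven days (made-up numbers).
history = [96, 104, 99, 101, 98, 102, 100]

print(zscore(100, history))  # an ordinary day sits at the mean
print(zscore(400, history))  # a day like this is far outside the usual range
```

A score near zero says today looks like every other day. A large score says something changed, though it still can't tell you whether that something was a holiday, an outage, or one device hopping between NATs.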