F your formats, just show me the data- part2

F your formats, just show me the data- part2

The real problem we're trying to solve here is context. We're lifting a bunch of "tokens", that usually have more than 3 characters, surrounding them with context and applying a probability value to them. All this with the express purpose of taking the high value indicators and applying them to our defenses in real-time. Not trivial, but not hard either. I'm not an SKLearn or NLTK expert- but I do know what it feels like to block accidentally netflix.com at the border….