Many considerations go into building out your own personal honeynet:
What data do you collect?
How do you publish that data?
What collectors do you use?
How do you manage it all?
While it's pretty trivial to get started with a single honeypot, you’ll run into problems the instant your second sensor goes live. The data is usually locked up in the honeypot itself, the sensor is running on a host that's not well maintained, and for some reason you can't remember how you got everything working in the first place. Months or years later, you forget the sensors even exist and all that valuable insight sadly goes unobserved.
I’m not suggesting there's a silver bullet to the "how to successfully manage a honeynet" problem. There are TONS of posts and tools out there that make managing your own honeynet at scale extremely feasible. I'm writing to posit a few simple ideas for self-starters who learn like I do: take the problem apart a few simple steps at a time, keep things manageable and, most importantly, learn something along the way.
First and foremost: start by writing your own. There are great tools like Cowrie out there, and everyone should run one of these, if only to see where you SHOULD be thinking. However, by ONLY running pre-made honeypots, YOU WON'T LEARN ANYTHING ABOUT HONEYPOTS. Also, your attackers all know what the most successful honeypots are. They know how to fingerprint them, avoid them or, worse, exploit them. In some cases this is a good thing; in your case, it can lead to a very bad thing.
Secondly, use Python or Go. By using the multitude of examples in the wild, you’re more likely to produce a minimum viable product ('MVP') and avoid making a massive [exploitable] mistake along the way. The neat thing about rolling your own: the first revision simply needs to respond to a network request; it doesn't even need to do this correctly! In fact, if it responds incorrectly, maybe you'll even get back more interesting data. It's one of those few projects where you get to learn by building and may even get rewarded for implementing the protocol somewhat incorrectly.
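To make the "first revision" idea concrete, here's a minimal sketch in Python using only the standard library. It's an illustration, not a recommended design: the banner string, the port, the event field names and the names `Handler`, `Honeypot`, `make_event` and `serve` are all assumptions made up for this example. It sends a fake banner, records whatever the client sends back, and prints the result as JSON.

```python
import json
import socketserver
from datetime import datetime, timezone

# Hypothetical banner; it only needs to look plausible, not be correct.
BANNER = b"SSH-2.0-OpenSSH_7.4\r\n"

# In-memory event log, kept only so the sketch is easy to inspect.
EVENTS = []


def make_event(peer, data):
    """Turn one connection into a JSON-friendly dict (field names are made up)."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "src_ip": peer[0],
        "src_port": peer[1],
        # latin-1 maps every byte to a character, so decoding never fails.
        "payload": data.decode("latin-1"),
    }


class Handler(socketserver.BaseRequestHandler):
    def handle(self):
        self.request.settimeout(5)  # don't let a silent client hold the thread
        try:
            self.request.sendall(BANNER)
            data = self.request.recv(1024)
        except OSError:
            # Timeouts and resets are expected; log the visit anyway.
            data = b""
        event = make_event(self.client_address, data)
        EVENTS.append(event)
        print(json.dumps(event))


class Honeypot(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True


def serve(host="0.0.0.0", port=2222):
    """Listen until interrupted. Port 2222 is an arbitrary choice."""
    with Honeypot((host, port), Handler) as server:
        server.serve_forever()
```

Calling `serve()` and then connecting with something like `nc localhost 2222` should print one JSON line per connection. Nothing here speaks real SSH, which is the point: the client gets a response, and you get to see what it does next.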
Third, start simple and leverage things like Docker to get started. Using Docker will enable you to roll, test, deploy and maintain your honeynet configurations at scale. Docker images can run just about anywhere, and if you do get to a place where you need to change up the TCP fingerprints, you can swap distros and redeploy without too much overhead. That said, there can be security implications when running a honeynet inside of a virtualized environment. You should take these considerations seriously by isolating your various VM clusters. However, at this stage you're prototyping; learning what you need long term is more important than doing things perfectly from day 1.
Lastly, find somewhere simple to POST the data from day 1. I've set up CSIRTG for this very reason, giving you a brain-dead simple way to collect and access your threat intel, but really any kind of centralized repository will do. As you're building out your network, if you're not IMMEDIATELY able to access that value (e.g., the data), you'll have a hard time understanding what you've built, why it's important and where you should be going with it in the future. This will also give you hints as to HOW to collect and store your data along the way.
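As a rough sketch of what "POST the data from day 1" can look like, here's a standard-library Python helper that ships one event as JSON to a central collector. Everything specific is an assumption for illustration: the URL is a placeholder, the bearer-token header is just one common way to authenticate, and `build_request` and `post_event` are hypothetical names. This is not the CSIRTG API; check your repository's documentation for its real endpoint and auth scheme.

```python
import json
import urllib.request

# Placeholder endpoint, NOT a real service; swap in your own collector.
COLLECTOR_URL = "https://collector.example.com/events"


def build_request(url, event, token=None):
    """Build (but don't send) a JSON POST request for a single event."""
    body = json.dumps(event).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumed auth scheme; your collector may expect something else.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def post_event(url, event, token=None, timeout=5):
    """Send the event; return the HTTP status, or None if delivery failed."""
    req = build_request(url, event, token)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except OSError as exc:
        # URLError and HTTPError are OSError subclasses. A sensor should
        # keep running even when the collector is down.
        print(f"delivery failed: {exc}")
        return None
```

A sensor would call something like `post_event(COLLECTOR_URL, {"src_ip": "203.0.113.7", "payload": "root"})` right after each connection. Returning `None` on failure instead of raising is a deliberate choice here: losing one event is usually better than losing the sensor.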
The wonderful thing about honeypots: the more unique they are, the more of an edge they're likely to give you.