See here for Horse64 Root's features.
Ideally, use both. Horse64 is easier to learn and faster to write, while Horse64 Root allows more control and runs faster.
See here.
See here.
Horse64 Root needs at least a 32-bit CPU with hardware support for 64-bit integer math, and some modules depend on pre-emptive hardware threading provided by an operating system layer. It may therefore not be well suited for embedded use right now.