Perhaps, as you say, not worth it. But whenever there is a precision timing requirement like this, I will calculate it to quite a few more bits than are usable, increment it based on that calculation, and then use the rounded version of that to set the timer. Basically fixed-point calculations.
I can't generate 400 RPM on my stepper as that is 187.5us between micro-steps, and the timer only does 1us steps (unless I change the clock base). So it is running at 188us. I suppose I could ping pong between 188 and 187 every timeout? Nah, not worth it.
For example, in your case, calculate your timeout as a multiple of 125 ns (1/8 of a us) and call that timeout_125ns. Your 187.5us period is then exactly 1500 units. On every step, increment timeout_125ns by whatever amount your RPM calls for in 125ns units, which preserves 125ns accuracy, and set your actual microsecond timer to fire at (timeout_125ns>>3). Adjust the shift accordingly if you want to use more bits of accuracy. This gives you a distribution of intervals that will average much closer to your desired period (within 125ns in this case?).
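Here is a rough sketch of that idea in C, just to make the bookkeeping concrete. The names step_timer_t, step_timer_init and step_timer_next_us are made up for illustration, and it assumes a one-shot timer that you re-arm with a whole number of microseconds on each timeout. It truncates with >>3 as described above rather than rounding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; this is a sketch, not a drop-in driver. */
#define FRAC_BITS 3u /* 1 us = 8 units of 125 ns */

typedef struct {
    uint64_t timeout_125ns; /* ideal time of the next step, in 125 ns units */
    uint64_t last_fire_us;  /* previous deadline, truncated to whole us */
    uint32_t period_125ns;  /* desired step period, in 125 ns units */
} step_timer_t;

static void step_timer_init(step_timer_t *t, uint32_t period_125ns)
{
    assert(period_125ns > 0u);
    t->timeout_125ns = 0u;
    t->last_fire_us = 0u;
    t->period_125ns = period_125ns;
}

/* Advance the ideal deadline by one period and return the number of whole
 * microseconds to load into the (assumed) one-shot timer for this step. */
static uint32_t step_timer_next_us(step_timer_t *t)
{
    t->timeout_125ns += t->period_125ns;                  /* full precision */
    uint64_t fire_us = t->timeout_125ns >> FRAC_BITS;     /* drop the fraction */
    uint32_t interval_us = (uint32_t)(fire_us - t->last_fire_us);
    t->last_fire_us = fire_us;
    return interval_us;
}
```

With step_timer_init(&t, 1500), successive calls to step_timer_next_us(&t) hand back 187, 188, 187, 188 and so on, which is the ping pong between 187 and 188 without having to code it by hand. The fractional part lives in the accumulator, so the error never builds up from step to step.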
Biggest pain is keeping track of the units of each variable, which is where having the _125ns appended to the name is almost mandatory. And of course this doesn't work if you are trying to use a hardware-based repeating timer, since the timeout has to be recalculated and set again on every step. And you often need to go to the next bigger variable size (such as int64 instead of int32), which can slow things down.