* reduce complexity of throttling logic to use 1 queue and an atomic int
* use atomic add instead of CAS, add throttling test