What is the decay rate of the weight given to past rewards in the computation of the Q function under the stationary and non-stationary updates, respectively, in the multi-armed bandit problem?
o hyperbolic, linear
o linear, hyperbolic
o hyperbolic, exponential
o exponential, linear
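
The options can be checked numerically with a minimal sketch. It assumes the usual textbook framing, which the question itself does not spell out: the stationary update is the sample average with step size 1/k at step k, and the non-stationary update uses a constant step size alpha. The helper name `effective_weights` and the values n = 5 and alpha = 0.5 are illustrative choices, not part of the question. For the update Q_{k+1} = Q_k + a_k (R_k - Q_k), the weight that reward R_i carries in the final estimate is a_i times the product of (1 - a_j) over all later steps j.

```python
def effective_weights(step_sizes):
    """Weight of each reward R_i in the final estimate for the update
    Q_{k+1} = Q_k + a_k * (R_k - Q_k), given step sizes a_1..a_n."""
    weights = []
    for i, a_i in enumerate(step_sizes):
        w = a_i
        # Every later update shrinks the contribution of R_i by (1 - a_j).
        for a_j in step_sizes[i + 1:]:
            w *= 1.0 - a_j
        weights.append(w)
    return weights


n = 5          # illustrative number of rewards
alpha = 0.5    # illustrative constant step size

# Assumed stationary case: sample average, a_k = 1/k.
sample_avg = effective_weights([1.0 / k for k in range(1, n + 1)])

# Assumed non-stationary case: constant step size alpha.
const_step = effective_weights([alpha] * n)

print("sample average:", sample_avg)   # every reward weighted about 1/n
print("constant step: ", const_step)   # weight halves with each step of age
```

Under these assumptions, every past reward in the sample-average case ends up with weight 1/n, which shrinks hyperbolically as n grows, while in the constant-step case the weight alpha(1 - alpha)^(n - i) shrinks geometrically (exponentially) with the age of the reward.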