Controlling a two-wheeled self-balancing vehicle is a complex nonlinear problem in classical control theory. Reinforcement learning has been shown to be feasible for practical control problems. In this paper, we present a method derived from the common Actor-Critic algorithm, consisting of an adaptive search network (ASN) and an adaptive critic network (ACN). Each network is implemented as a backpropagation (BP) artificial neural network: the ASN selects the control action, while the ACN estimates the value function. The temporal-difference (TD) error is used in the learning process. In this way, the whole control task can be handled in a continuous domain. The algorithm is tested on a suitable simulation model and achieves satisfactory results.
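As a rough illustration of how a TD error can drive an actor and a critic together, the following minimal sketch performs one-step updates on a continuous action. It is not the paper's implementation: linear function approximators stand in for the BP networks, the actor is assumed to be a Gaussian policy with fixed standard deviation, and all names (`td_error`, `actor_critic_step`, `sample_action`) and hyperparameter values are hypothetical.

```python
import random


def dot(weights, features):
    """Inner product of two equal-length lists."""
    return sum(w * x for w, x in zip(weights, features))


def td_error(reward, value, next_value, gamma=0.95, done=False):
    """One-step TD error: delta = r + gamma * V(s') - V(s)."""
    target = reward if done else reward + gamma * next_value
    return target - value


def sample_action(actor_w, state, sigma=0.5):
    """Draw a continuous action from a Gaussian centred on the actor output."""
    return random.gauss(dot(actor_w, state), sigma)


def actor_critic_step(actor_w, critic_w, state, action, reward, next_state,
                      done=False, sigma=0.5, gamma=0.95,
                      lr_actor=0.01, lr_critic=0.05):
    """Update both weight lists in place from one transition; return delta.

    Assumption: linear actor (action mean) and linear critic (state value),
    used here in place of the BP networks described in the abstract.
    """
    value = dot(critic_w, state)
    next_value = dot(critic_w, next_state)
    delta = td_error(reward, value, next_value, gamma, done)
    mean = dot(actor_w, state)
    for i, x in enumerate(state):
        # Critic: move V(s) toward the TD target.
        critic_w[i] += lr_critic * delta * x
        # Actor: policy-gradient step weighted by the TD error, using the
        # gradient of log N(action; mean, sigma^2) with respect to the weights.
        actor_w[i] += lr_actor * delta * (action - mean) / sigma ** 2 * x
    return delta
```

In a control loop, `sample_action` would be called on the current state features, the simulator stepped with the resulting action, and `actor_critic_step` applied to the observed transition. A positive TD error shifts the actor's mean toward the action just taken, and a negative one shifts it away.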