Probabilistic fuzzy logic framework in reinforcement learning for decision making
Hinojosa, W 2010, Probabilistic fuzzy logic framework in reinforcement learning for decision making, PhD thesis, Salford: University of Salford.
Restricted to Repository staff only until 01 March 2015.
This dissertation addresses the problem of handling uncertainty during learning by agents that operate in stochastic environments by means of reinforcement learning. Most previous work in reinforcement learning has proposed algorithms that address learning performance but neglect the uncertainty present in stochastic environments. Reinforcement learning is a valuable learning method when a system requires a selection of actions whose consequences emerge over long periods for which input-output data are not available. In most combinations of fuzzy systems with reinforcement learning, the environment is considered deterministic. In many cases, however, the consequence of an action may be uncertain or stochastic in nature. This work proposes a novel reinforcement learning approach combined with the universal function approximation capability of fuzzy systems within a probabilistic fuzzy logic framework, in which information from the environment is interpreted not deterministically, as in classic approaches, but statistically, by considering a probability distribution over long-term consequences. The generalized probabilistic fuzzy reinforcement learning (GPFRL) method presented in this dissertation is a modified version of the actor-critic learning architecture in which learning is enhanced by introducing a probability measure into the learning structure, and in which an incremental gradient descent weight-updating algorithm provides convergence. Experiments were performed on simulated and real environments based on a travel planning spoken dialogue system. The experimental results support the following claims. First, GPFRL shows robust performance in control optimization tasks. Second, its learning speed exceeds that of most other similar methods. Third, GPFRL agents are feasible and promising for the design of adaptive behaviour robotics systems.
|Item Type:||Thesis (PhD)|
|Schools:||Colleges and Schools > College of Science & Technology|
|||Colleges and Schools > College of Science & Technology > School of Computing, Science and Engineering|
|Depositing User:||Institutional Repository|
|Date Deposited:||03 Oct 2012 14:34|
|Last Modified:||03 Jan 2015 23:25|