THESIS
2019
xv, 148 pages : illustrations ; 30 cm
Abstract
We consider remote state estimation problems. A group of sensors is deployed
to measure physical processes and transmit data to a remote state estimator
through wireless communication channels. We develop a systematic way to
schedule the sensor transmissions, which trades off the remote state estimation
quality against the communication constraints. Many works have been devoted
to this problem; however, they assume full knowledge of the underlying models
and relevant parameters. In this thesis, we take a learning-based approach
that asymptotically obtains an optimal solution to the remote state estimation
problem without precise knowledge of the model parameters.
We first consider the case of one sensor. We formulate two types of optimization
problems: a constrained communication problem and a costly communication
problem. We prove that the optimal policy for this case has a threshold
structure. By leveraging this property, we develop revised Q-learning algorithms
to learn an optimal policy based on the transmission successes and failures of
the sensor. We prove convergence of these revised algorithms. A numerical
study shows that the revised algorithms accelerate the convergence of
the learning process. We then extend the learning framework to multiple sensors.
We show that scheduling multiple sensors can be captured by a restless
multi-armed bandit problem, and asymptotic optimality (as the number of sensors
goes to infinity) can be achieved with index-based heuristics. Building on
Q-learning, we develop an algorithm that learns the
indices from the transmission successes and failures of the sensors. Lastly, we
consider a bandwidth allocation problem with a max-min fairness criterion. In
this problem, we aim to optimize the worst remote state estimation quality of
the sensors. We use a cost-based learning algorithm to achieve a max-min fair
allocation.
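The threshold structure mentioned in the abstract can be illustrated with a toy example. The sketch below is not the thesis's algorithm: it uses plain value iteration on a model that is assumed to be fully known, whereas the thesis learns without that knowledge. Every modelling choice is a hypothetical stand-in: the state is the number of steps since the last successful reception, the per-step estimation cost equals that number, each transmission costs `comm_cost` and succeeds with probability `p_success`, and costs are discounted by `gamma`.

```python
def solve_costly_communication(n_states=30, p_success=0.7, comm_cost=5.0,
                               gamma=0.95, sweeps=2000):
    """Value iteration for a toy single-sensor costly communication problem.

    State t: steps since the estimator last received data (capped).
    Assumed stage cost: t (a stand-in for estimation error) plus
    comm_cost whenever the sensor transmits.
    """
    # Without a successful reception the holding time grows by one (capped).
    nxt = [min(t + 1, n_states - 1) for t in range(n_states)]
    V = [0.0] * n_states
    for _ in range(sweeps):
        V = [
            min(
                # Stay silent: the holding time always grows.
                t + gamma * V[nxt[t]],
                # Transmit: pay comm_cost, reset to 0 with prob. p_success.
                t + comm_cost
                + gamma * (p_success * V[0] + (1 - p_success) * V[nxt[t]]),
            )
            for t in range(n_states)
        ]
    # Transmitting is strictly better exactly when the expected discounted
    # saving from a reset exceeds the communication cost.
    policy = [gamma * p_success * (V[nxt[t]] - V[0]) > comm_cost
              for t in range(n_states)]
    return V, policy


V, policy = solve_costly_communication()
threshold = policy.index(True)  # smallest holding time at which to transmit
```

Under these assumptions the computed policy stays silent below `threshold` and transmits at or above it, which is the kind of structure the revised Q-learning algorithms are described as exploiting.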