Abstract

We consider the problem of optimally utilizing N resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in either a reward or a penalty per unit time. The objective is a sequential strategy that governs the sensing and exploitation decisions at each time so as to maximize the expected utility (i.e., total reward minus total penalty and sensing cost) over a finite horizon L. We formulate the problem as a partially observable Markov decision process and show that the optimal strategy is based on two time-varying thresholds for each resource and an optimal selection rule for choosing which resource to sense. Since a full characterization of the optimal strategy is generally intractable, we develop a low-complexity policy that is shown by simulations to offer near-optimal performance. This problem has applications in opportunistic spectrum access, marketing strategies, and other sequential resource allocation problems.
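To make the threshold idea concrete, here is a minimal Python sketch for a single resource. It is an illustration under stated assumptions, not the paper's policy: the Gaussian measurement model, the parameter names (mean0, mean1, sigma), the fixed threshold values passed in, and the meaning attached to each belief region (exploit above the upper threshold, discard below the lower one, sense in between) are all hypothetical. The abstract states only that the optimal strategy uses two time-varying thresholds per resource.

```python
import math


def gaussian_pdf(y, mean, sigma):
    """Density of a normal distribution with the given mean and std. dev."""
    z = (y - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def update_belief(belief, y, mean0=0.0, mean1=1.0, sigma=1.0):
    """Bayes update of P(state = 1) after one noisy measurement y.

    Assumption (not from the source): measurements are Gaussian with a
    state-dependent mean (mean0 or mean1) and a common std. dev. sigma.
    """
    like1 = gaussian_pdf(y, mean1, sigma)
    like0 = gaussian_pdf(y, mean0, sigma)
    numerator = belief * like1
    return numerator / (numerator + (1.0 - belief) * like0)


def decide(belief, lower, upper):
    """Hypothetical two-threshold rule on the belief of a single resource.

    In the paper the thresholds vary with time; here they are plain inputs,
    and the action attached to each region is an assumed interpretation.
    """
    if belief >= upper:
        return "exploit"
    if belief <= lower:
        return "discard"
    return "sense"


if __name__ == "__main__":
    belief = 0.5  # uninformative prior
    for y in (0.9, 1.2, 0.7):  # made-up measurements
        belief = update_belief(belief, y)
        print(round(belief, 3), decide(belief, lower=0.2, upper=0.8))
```

In a finite-horizon setting, the two thresholds would be recomputed at every time step, and a separate rule would pick which of the N resources to sense; neither is modeled in this sketch.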

Original language: English (US)
Article number: 7895211
Pages (from-to): 3430-3445
Number of pages: 16
Journal: IEEE Transactions on Signal Processing
Volume: 65
Issue number: 13
State: Published - Jul 1 2017

Keywords

  • cognitive radio
  • multi-channel sensing
  • opportunistic spectrum access
  • optimum sequential testing

ASJC Scopus subject areas

  • Signal Processing
  • Electrical and Electronic Engineering

Title: Utility Maximizing Sequential Sensing over a Finite Horizon
