Utility Maximizing Sequential Sensing over a Finite Horizon

Lorenzo Ferrari; Qing Zhao; Anna Scaglione

doi:10.1109/TSP.2017.2692725

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

We consider the problem of optimally utilizing N resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in either a reward or a penalty per unit time. The objective is a sequential strategy governing the decision of sensing and exploitation at each time to maximize the expected utility (i.e., total reward minus total penalty and sensing cost) over a finite horizon L. We formulate the problem as a partially observable Markov decision process and show that the optimal strategy is based on two time-varying thresholds for each resource and an optimal selection rule to sense a particular resource. Since a full characterization of the optimal strategy is generally intractable, we develop a low-complexity policy that is shown by simulations to offer a near optimal performance. This problem finds applications in opportunistic spectrum access, marketing strategies, and other sequential resource allocation problems.
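As a rough illustration of the two-threshold structure described in the abstract, the following Python sketch updates the belief that a single resource is in the rewarding state after one noisy measurement, then compares that belief against a lower and an upper threshold. This is not the authors' algorithm. The Gaussian measurement model, the parameter names (`mu0`, `mu1`, `sigma`), the action labels, and the threshold values are all assumptions made for the example. In the paper the thresholds vary with time and are derived from the POMDP formulation, and the rule for selecting which resource to sense is not modeled here.

```python
import math


def update_belief(prior, y, mu0=0.0, mu1=1.0, sigma=1.0):
    """Posterior probability that the state is 1 after observing y.

    Assumption (not from the source): y is Gaussian with mean mu0 in
    state 0 and mean mu1 in state 1, with common standard deviation sigma.
    """
    def gaussian_pdf(x, mu):
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

    # Bayes' rule for a binary hypothesis.
    numerator = prior * gaussian_pdf(y, mu1)
    denominator = numerator + (1.0 - prior) * gaussian_pdf(y, mu0)
    return numerator / denominator


def decide(belief, lower, upper):
    """Hypothetical two-threshold rule for one resource at one time step.

    The action labels are illustrative: exploit when the belief is high
    enough, give up on the resource when it is low enough, otherwise
    take another measurement.
    """
    if belief >= upper:
        return "exploit"
    if belief <= lower:
        return "discard"
    return "sense"


if __name__ == "__main__":
    belief = 0.5
    # Placeholder thresholds per time step; real values would come from
    # solving the finite-horizon problem.
    thresholds = [(0.10, 0.90), (0.20, 0.80), (0.30, 0.70)]
    measurements = [0.9, 1.2, 0.7]
    for (lower, upper), y in zip(thresholds, measurements):
        belief = update_belief(belief, y)
        print(round(belief, 3), decide(belief, lower, upper))
```

The thresholds in the usage loop narrow as the horizon shrinks purely to show how a time-varying pair could be supplied; the actual shape of the optimal thresholds is characterized in the paper, not here.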

Original language	English (US)
Article number	7895211
Pages (from-to)	3430-3445
Number of pages	16
Journal	IEEE Transactions on Signal Processing
Volume	65
Issue number	13
DOIs	https://doi.org/10.1109/TSP.2017.2692725
State	Published - Jul 1 2017

Keywords

Optimum sequential testing
cognitive radio
multi-channel sensing
opportunistic spectrum access

ASJC Scopus subject areas

Signal Processing
Electrical and Electronic Engineering

Cite this

@article{6db5658b73604d629f55ac85a9905cac,

title = "Utility Maximizing Sequential Sensing over a Finite Horizon",

abstract = "We consider the problem of optimally utilizing N resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in either a reward or a penalty per unit time. The objective is a sequential strategy governing the decision of sensing and exploitation at each time to maximize the expected utility (i.e., total reward minus total penalty and sensing cost) over a finite horizon L. We formulate the problem as a partially observable Markov decision process and show that the optimal strategy is based on two time-varying thresholds for each resource and an optimal selection rule to sense a particular resource. Since a full characterization of the optimal strategy is generally intractable, we develop a low-complexity policy that is shown by simulations to offer a near optimal performance. This problem finds applications in opportunistic spectrum access, marketing strategies, and other sequential resource allocation problems.",

keywords = "Optimum sequential testing, cognitive radio, multi-channel sensing, opportunistic spectrum access",

author = "Lorenzo Ferrari and Qing Zhao and Anna Scaglione",

note = "Publisher Copyright: {\textcopyright} 1991-2012 IEEE.",

year = "2017",

month = jul,

day = "1",

doi = "10.1109/TSP.2017.2692725",

language = "English (US)",

volume = "65",

pages = "3430--3445",

journal = "IEEE Transactions on Signal Processing",

issn = "1053-587X",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

number = "13",

}

TY - JOUR

T1 - Utility Maximizing Sequential Sensing over a Finite Horizon

AU - Ferrari, Lorenzo

AU - Zhao, Qing

AU - Scaglione, Anna

PY - 2017/7/1

Y1 - 2017/7/1

AB - We consider the problem of optimally utilizing N resources, each in an unknown binary state. The state of each resource can be inferred from state-dependent noisy measurements. Depending on its state, utilizing a resource results in either a reward or a penalty per unit time. The objective is a sequential strategy governing the decision of sensing and exploitation at each time to maximize the expected utility (i.e., total reward minus total penalty and sensing cost) over a finite horizon L. We formulate the problem as a partially observable Markov decision process and show that the optimal strategy is based on two time-varying thresholds for each resource and an optimal selection rule to sense a particular resource. Since a full characterization of the optimal strategy is generally intractable, we develop a low-complexity policy that is shown by simulations to offer a near optimal performance. This problem finds applications in opportunistic spectrum access, marketing strategies, and other sequential resource allocation problems.

KW - Optimum sequential testing

KW - cognitive radio

KW - multi-channel sensing

KW - opportunistic spectrum access

UR - http://www.scopus.com/inward/record.url?scp=85018915475&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85018915475&partnerID=8YFLogxK

U2 - 10.1109/TSP.2017.2692725

DO - 10.1109/TSP.2017.2692725

M3 - Article

AN - SCOPUS:85018915475

SN - 1053-587X

VL - 65

SP - 3430

EP - 3445

JO - IEEE Transactions on Signal Processing

JF - IEEE Transactions on Signal Processing

IS - 13

M1 - 7895211

ER -
