This paper studies trajectory optimization in a scenario where a single rotary-wing UAV relays data payloads for downlink transmission requests generated randomly by two ground nodes (GNs) in a wireless network. The goal is to optimize the UAV trajectory so as to minimize the expected average communication delay in serving these random requests. It is shown that the problem can be cast as a semi-Markov decision process (SMDP), and the resulting minimization problem is solved via multi-chain policy iteration. The optimality of a two-scale optimization approach is proved: the optimal trajectory in the communication phase greedily minimizes the communication delay of the current request while moving between the current start position and a target end position (inner optimization); the end positions are selected to minimize the expected average long-term delay in the SMDP (outer optimization). Numerical simulations show that the expected average delay is minimized when the UAV moves towards the geometric center of the GNs during phases in which it is not actively servicing transmission requests, and demonstrate significant improvements over sensible heuristics. Finally, it is revealed that, for large data payload values, the optimal end positions of communication phases become increasingly independent of the data payload.