The Communications and High-Precision Positioning (CHP2) system enables secure communications and positioning services between networks of cooperative RF users. CHP2 features flexible physical and data link layers that can be tuned to prioritize either communications or positioning precision, supporting traffic and resource management under different network conditions. We apply reinforcement learning techniques to optimize positioning performance and network throughput for a simple CHP2 network configuration. In this example, a central arbiter manages a grid of ground users that track aerial users moving through the network. The arbiter dynamically optimizes each ground user to maximize network throughput while maintaining a minimum positioning precision on each aerial user. We demonstrate that the arbiter successfully manages the ground user resources to track the aerial targets with a minimal number of interactions. This reduces the resources required to maintain the network, which in turn reduces spectral congestion and power consumption in sparse networks or supports more users in congested networks.