We consider the problem of tracking a target by integrating observations from multiple disparate sources in a multimodal sensing system. Because the sensing modalities differ, these observations are associated with different measurement models. They are also statistically dependent when acquired synchronously from the same scene. Although dependency among measurements is largely overlooked, modeling it and incorporating it into the tracking formulation can improve tracking performance. We employ a hierarchical Dirichlet process mixture to model this dependency and to infer the time-varying cardinality of each sensor's measurements. The hierarchical Dirichlet process framework provides a joint measurement density model, which we integrate with Bayesian tracking methods to estimate the target state.
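To make the role of the hierarchy concrete, the following is a minimal sketch of a truncated hierarchical Dirichlet process prior over per-sensor mixture weights. It is an illustration under stated assumptions, not the inference procedure used in this paper: the truncation level `K`, the concentration parameters `gamma` and `alpha`, and all function names are hypothetical choices made for the example. The sketch shows how sensors can share one global set of mixture components (the source of dependency) while each sensor keeps its own weights over them, so the number of components a sensor actually uses can differ from sensor to sensor.

```python
import numpy as np


def truncated_gem(gamma, K, rng):
    """Draw K stick-breaking weights (truncated GEM(gamma)); they sum to 1."""
    v = rng.beta(1.0, gamma, size=K)
    v[-1] = 1.0  # close the stick at the truncation level (assumption)
    # Fraction of the stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining


def hdp_sensor_weights(gamma, alpha, K, n_sensors, rng):
    """Global weights beta, and per-sensor weights pi_j centered on beta."""
    beta = truncated_gem(gamma, K, rng)
    # Finite-truncation analogue of pi_j ~ DP(alpha, beta).
    # The floor is only a numerical guard against zero Dirichlet parameters.
    concentration = np.maximum(alpha * beta, 1e-12)
    pi = rng.dirichlet(concentration, size=n_sensors)
    return beta, pi


rng = np.random.default_rng(0)
beta, pi = hdp_sensor_weights(gamma=1.0, alpha=5.0, K=10, n_sensors=3, rng=rng)

# Assign 50 hypothetical measurements per sensor to shared components and
# count how many distinct components each sensor actually uses.
used = [len(np.unique(rng.choice(10, size=50, p=pi[j]))) for j in range(3)]
```

In this sketch every sensor draws its weights around the same global vector `beta`, which is what couples the sensors; the per-sensor counts in `used` play the role of a data-driven cardinality that is not fixed in advance.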