In this paper, we address the problem of data transmission over a block-fading, frequency-selective multiple-input multiple-output (MIMO) channel. The encoded data symbols are passed through an affine precoder and sent over multiple transmit antennas. Affine precoding is a unifying model for the existing training schemes, namely preamble-based training, pilot symbol assisted modulation (PSAM), and superimposed training. Within this general model, we show that the channel Cramér-Rao bound (CRB) is minimized if the precoded data and training symbols satisfy a special form of orthogonality. Under a power constraint on the training and the precoded symbols, we provide optimal and suboptimal training design guidelines.
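
As an illustrative sketch of how an affine precoder can unify the three training schemes, consider a single transmit stream over one block. The symbols below ($\mathbf{x}$, $\mathbf{P}$, $\mathbf{s}$, $\mathbf{t}$, $N$, $K$) are assumed notation introduced here for illustration and need not match the notation of the paper body; the specific orthogonality condition that minimizes the CRB is not reproduced.

```latex
% Assumed notation: s = K data symbols, P = N x K linear precoder,
% t = known training vector, x = N transmitted symbols in one block.
\[
  \mathbf{x} = \mathbf{P}\,\mathbf{s} + \mathbf{t},
  \qquad
  \mathbf{P} \in \mathbb{C}^{N \times K},\quad
  \mathbf{s} \in \mathbb{C}^{K},\quad
  \mathbf{t} \in \mathbb{C}^{N}.
\]
% Time-multiplexed training (preamble or PSAM), with T the set of training slots:
\[
  [\mathbf{P}]_{n,:} = \mathbf{0} \ \text{for } n \in \mathcal{T},
  \qquad
  [\mathbf{t}]_{n} = 0 \ \text{for } n \notin \mathcal{T}.
\]
% Superimposed training: data occupy every slot and training is added on top.
\[
  K = N, \qquad \mathbf{P} = \mathbf{I}_N, \qquad
  \mathbf{t} \ \text{nonzero over the whole block}.
\]
```

In this sketch, a preamble corresponds to $\mathcal{T}$ being a contiguous set of slots at the start of the block, while PSAM spreads $\mathcal{T}$ across the block; superimposed training removes the slot partition altogether.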