Many multi-dimensional data applications need to support both tensor operations and relational operations throughout the data lifecycle. Although tensor decomposition has been shown to be effective for multi-dimensional data analysis, its cost is often very high. We propose a novel decomposition-by-normalization scheme that first normalizes the given relation into smaller tensors based on the functional dependencies of the relation and then performs the decomposition on these smaller tensors. The decomposition and recombination steps of the scheme fit naturally in multi-core settings, leading to a highly efficient, effective, and parallelized decomposition-by-normalization algorithm for both dense and sparse tensors. Experiments confirm the efficiency and effectiveness of the proposed scheme compared to the conventional nonnegative CP decomposition approach.
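To make the normalization step concrete, the following is a minimal Python sketch, not the algorithm evaluated in this work. It assumes a toy relation with hypothetical attributes `user`, `city`, and `item` and a hypothetical functional dependency `user -> city`. It checks that the dependency holds, splits the relation into two smaller sub-relations that share `user`, and verifies that their natural join recovers the original relation. The decomposition and recombination steps are not shown; all function names and data here are illustrative assumptions.

```python
from itertools import product

def holds_fd(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, val) != val:
            return False
    return True

def project(rows, attrs):
    """Project rows onto attrs and drop duplicate tuples."""
    return {tuple(row[a] for a in attrs) for row in rows}

def natural_join(left, left_attrs, right, right_attrs):
    """Naive nested-loop natural join on the shared attributes."""
    shared = [a for a in left_attrs if a in right_attrs]
    out_attrs = list(left_attrs) + [a for a in right_attrs if a not in shared]
    result = set()
    for l, r in product(left, right):
        lrow = dict(zip(left_attrs, l))
        rrow = dict(zip(right_attrs, r))
        if all(lrow[a] == rrow[a] for a in shared):
            merged = {**lrow, **rrow}
            result.add(tuple(merged[a] for a in out_attrs))
    return out_attrs, result

# Hypothetical toy relation R(user, city, item); values are made up.
rows = [
    {"user": "u1", "city": "Tempe", "item": "i1"},
    {"user": "u1", "city": "Tempe", "item": "i2"},
    {"user": "u2", "city": "Mesa", "item": "i1"},
    {"user": "u3", "city": "Tempe", "item": "i3"},
]

fd_ok = holds_fd(rows, ["user"], ["city"])  # assumed FD: user -> city

# Normalize into two sub-relations that share the determinant attribute.
r1 = project(rows, ["user", "city"])
r2 = project(rows, ["user", "item"])

# The join is lossless because user -> city holds.
joined_attrs, joined = natural_join(r1, ["user", "city"], r2, ["user", "item"])
original = project(rows, ["user", "city", "item"])
lossless = joined == original

# Dense cell counts: one 3-mode tensor versus two 2-mode tensors.
n_user = len({r["user"] for r in rows})
n_city = len({r["city"] for r in rows})
n_item = len({r["item"] for r in rows})
full_cells = n_user * n_city * n_item
sub_cells = n_user * n_city + n_user * n_item
print(fd_ok, lossless, full_cells, sub_cells)
```

Each sub-relation can be viewed as a lower-order tensor over fewer attributes, so the two pieces can be processed independently before being recombined, which is the property that makes the scheme amenable to multiple cores.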