Alzheimer's Disease (AD) is the most common neurodegenerative disorder associated with aging. Understanding how the disease progresses and identifying related pathological biomarkers for the progression is of primary importance in Alzheimer's disease research. In this paper, we develop novel multi-task learning techniques to predict the disease progression measured by cognitive scores and select biomarkers predictive of the progression. In multi-task learning, the prediction of cognitive scores at each time point is considered as a task, and multiple prediction tasks at different time points are performed simultaneously to capture the temporal smoothness of the prediction models across different time points. Specifically, we propose a novel convex fused sparse group Lasso (cFSGL) formulation that allows the simultaneous selection of a common set of biomarkers for multiple time points and specific sets of biomarkers for different time points using the sparse group Lasso penalty and in the meantime incorporates the temporal smoothness using the fused Lasso penalty. The proposed formulation is challenging to solve due to the use of several non-smooth penalties. We show that the proximal operator associated with the proposed formulation exhibits a certain decomposition property and can be computed efficiently; thus cFSGL can be solved efficiently using the accelerated gradient method. To further improve the model, we propose two non-convex formulations to reduce the shrinkage bias inherent in the convex formulation. We employ the difference of convex programming technique to solve the non-convex formulations. Our extensive experiments using data from the Alzheimer's Disease Neuroimaging Initiative demonstrate the effectiveness of the proposed progression models in comparison with existing methods for disease progression. We also perform longitudinal stability selection to identify and analyze the temporal patterns of biomarkers in disease progression.