Copious sequential event data has consistently increased in various high-impact domains such as social media and sharing economy. When events start to take place in a sequential fashion, an important question arises: “what type of event will happen at what time in the near future?” To answer the question, a class of mathematical models called the marked temporal point process is often exploited as it can model the timing and properties of events seamlessly in a joint framework. Recently, various recurrent neural network (RNN) models are proposed to enhance the predictive power of mark temporal point process. However, existing marked temporal point models are fundamentally based on the Maximum Likelihood Estimation (MLE) framework for the training, and inevitably suffer from the problem resulted from the intractable likelihood function. Surprisingly, little attention has been paid to address this issue. In this work, we propose INITIATOR - a novel training framework based on noise-contrastive estimation to resolve this problem. Theoretically, we show the exists a strong connection between the proposed INITIATOR and the exact MLE. Experimentally, the efficacy of INITIATOR is demonstrated over the state-of-the-art approaches on several real-world datasets from various areas.