Edge computing is envisioned to be the de facto paradigm for hosting emerging low-latency Internet-of-Things (IoT) data streaming services. For IoT data streaming in edge computing, cost management is of strategic significance because edge servers have low cost-efficiency. While the existing literature adopts a reactive approach that dynamically provisions edge servers to reduce cost, the delay of server activation and instantiation has been mostly ignored. In this paper, we target a proactive approach to dynamic edge server provisioning for real-time IoT data streaming across edge nodes, which adjusts server provisioning ahead of time based on a prediction of the upcoming workload. To predict the upcoming workload effectively, we apply a learning-based method, online gradient descent. We further combine this online learning method with an online optimization algorithm for server provisioning in a joint online optimization framework, by (1) minimizing the regret incurred by inaccurate workload prediction and (2) minimizing the cost incurred by near-optimal online decisions. The resulting predictive online algorithm effectively leverages the power of prediction and achieves a good performance guarantee, as verified by both rigorous theoretical analysis and extensive trace-driven evaluations.
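
To illustrate the workload-prediction step mentioned above, the following is a minimal sketch of online gradient descent used as a one-step-ahead forecaster. It is not the algorithm of this paper: the linear autoregressive model, the squared loss, the decaying step size, the projection onto an L2 ball, and all names (`ogd_forecast`, `window`, `radius`, `eta0`, `scale`) are assumptions made only for illustration.

```python
import math


def ogd_forecast(workload, window=3, radius=1.0, eta0=0.5, scale=1.0):
    """One-step-ahead forecasts from a linear model updated by online gradient descent.

    Hypothetical setup: the forecast for slot t is a weighted sum of the previous
    `window` observations, and the weights are updated after each true value is
    revealed, using the gradient of the squared prediction error.
    """
    w = [0.0] * window          # model weights, start at zero
    preds = []
    for step, t in enumerate(range(window, len(workload)), start=1):
        # Features: the last `window` observations, rescaled for numerical stability.
        x = [v / scale for v in workload[t - window:t]]
        y = workload[t] / scale

        # Predict before the true workload of slot t is observed.
        y_hat = sum(wi * xi for wi, xi in zip(w, x))
        preds.append(y_hat * scale)

        # Gradient of 0.5 * (y_hat - y)^2 with respect to w is (y_hat - y) * x.
        err = y_hat - y
        eta = eta0 / math.sqrt(step)    # decaying step size
        w = [wi - eta * err * xi for wi, xi in zip(w, x)]

        # Project back onto the L2 ball of the given radius (keeps w bounded).
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > radius:
            w = [wi * radius / norm for wi in w]
    return preds, w


if __name__ == "__main__":
    # Synthetic, made-up workload (requests per slot); not data from the paper.
    trace = [10.0] * 50
    preds, weights = ogd_forecast(trace, window=3, scale=10.0)
    print(f"last forecast: {preds[-1]:.3f}, weights: {weights}")
```

In a proactive provisioning loop, each forecast would be handed to the server-provisioning decision for the next slot before the true workload is observed. How the prediction regret and the provisioning cost are jointly bounded is the subject of the paper's analysis and is not captured by this sketch.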