DNN pruning approaches usually trim model parameters without exploiting the intrinsic graph properties and hardware preferences. As a result, an FPGA accelerator may not directly benefit from such random pruning, with additional cost on indexing and control modules. Inspired by the observation that the brain and real-world networks follow a Small-World model, we propose a graph-based progressive structural pruning technique that integrates local clusters and global sparsity in the Small-World graph and the data locality in the FPGA dataflow. The proposed technique hierarchically trims the DNN into a sparse graph before training, which follows both the Small-World property and FPGA dataflow preferences, such as grouped non-zero and zero parameters to skip data load and corresponding computation. The pruned model is then trained for a given dataset and fine-Tuned to achieve the best accuracy. We evaluate the proposed technique for multiple DNNs with different datasets. It achieves state-of-The-Art sparsity ratio of up to 76% for CIFAR-10, 84% for CIFAR-100, and 76% for the SVHN datasets. Moreover, the generated sparse DNN achieves up to 4× improvement in throughput for an output stationary FPGA architecture across different DNNs with a marginal hardware overhead.