Resistance spot welding (RSW) is a widely adopted joining technique in the automotive industry. Recent advances in sensing technology make it possible to collect thermal videos of the weld nugget during RSW using an infrared camera. Effective and timely analysis of such thermal videos could enable in-situ nondestructive evaluation (NDE) of the weld nugget by predicting nugget thickness and diameter. Deep learning (DL) has proven effective in analyzing imaging data in many applications. However, the thermal videos in RSW present unique data-level challenges that compromise the effectiveness of most pre-trained DL models. We propose a novel image segmentation method for RSW thermal videos that improves the prediction performance of DL models in RSW. The proposed method transforms raw thermal videos into spatial-temporal instances in four steps: video-wise normalization, removal of uninformative images, watershed segmentation, and spatial-temporal instance construction. The extracted spatial-temporal instances serve as the input data for training a DL-based NDE model. The proposed method extracts high-quality data with spatial-temporal correlations from the thermal videos while remaining robust to unknown surface emissivity. Our case studies demonstrate that the proposed method achieves better prediction of nugget thickness and diameter than predictions made without the transformation.
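
To make the first two of the four steps concrete, the following is a minimal sketch and not the authors' implementation. It assumes that video-wise normalization means min-max scaling over all frames of one video, and that an image is uninformative when its peak normalized intensity falls below a threshold. The function names, the threshold value, and the toy data are hypothetical. Watershed segmentation and instance construction are omitted because the abstract does not specify them.

```python
import numpy as np

def normalize_video(video):
    """Min-max scale a (T, H, W) video to [0, 1] using the extremes of the
    whole video (assumed meaning of "video-wise normalization")."""
    video = np.asarray(video, dtype=np.float64)
    lo, hi = video.min(), video.max()
    if hi == lo:
        # Constant video: nothing to scale, return all zeros.
        return np.zeros_like(video)
    return (video - lo) / (hi - lo)

def drop_uninformative(video, threshold=0.2):
    """Keep frames whose peak normalized intensity exceeds `threshold`.
    Both the criterion and the default value are assumptions."""
    keep = video.max(axis=(1, 2)) > threshold
    return video[keep], keep

# Toy video: 6 frames of 8x8 background, with a hot spot in frames 2 to 4.
rng = np.random.default_rng(0)
raw = 20.0 + rng.random((6, 8, 8))
raw[2:5, 3:5, 3:5] += 100.0

norm = normalize_video(raw)
kept, mask = drop_uninformative(norm)

# Simplified emissivity model: an unknown multiplicative factor on the signal.
scaled = normalize_video(0.6 * raw)
```

Under this simplified model, where unknown emissivity acts as a constant scale factor on the camera signal, `norm` and `scaled` agree up to floating-point error. This is one plausible reading of why normalizing per video helps with unknown emissivity.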