As the cache-based memory hierarchy becomes a primary factor limiting the scalability and power efficiency of multi-core systems, scratchpad memory (SPM) has been studied as an alternative to caches. When SPM is used as instruction memory, code management techniques are required to load code blocks into SPM using DMA transfers. These techniques generally load code blocks on demand to avoid loading an incorrect block: unlike a cache, which has hardware such as tag arrays, SPM has no mechanism to detect and recover from faults. While on-demand loading guarantees that no fault occurs, it incurs considerable performance overhead because it serializes DMA transfers and CPU execution. This paper presents a technique that inserts prefetching instructions for function-level code management so that DMA transfers can overlap with CPU execution. Our technique inserts DMA instructions statically at compile time and does not rely on profiling or run-time resources. Our evaluation shows that static prefetching can reduce CPU idle time due to DMAs by 58.5% and achieves an average performance improvement of 14.7% on the benchmarks that show high overhead due to DMAs.
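
The difference between serialized and overlapped loading can be illustrated with a small timing model. The sketch below is not the technique presented in this paper; it is a simplified illustration under stated assumptions: every call requires a DMA load, the SPM has room for the running function plus the one being loaded, and the prefetch for the next function is issued as soon as the current function starts executing. The names `Call`, `on_demand_cycles`, and `prefetch_cycles`, and all cycle counts, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Call:
    name: str
    exec_cycles: int  # CPU cycles spent executing the function
    dma_cycles: int   # cycles the DMA needs to copy the function into SPM


def on_demand_cycles(calls):
    """On-demand loading: the DMA is issued at the call site and the CPU
    waits for it to finish, so DMA time and execution time simply add up."""
    return sum(c.dma_cycles + c.exec_cycles for c in calls)


def prefetch_cycles(calls):
    """Idealized prefetching (assumption): the DMA for call i+1 is issued
    when call i starts executing, so only the part of the transfer that
    outlasts call i's execution stalls the CPU."""
    if not calls:
        return 0
    total = calls[0].dma_cycles  # the first load cannot be hidden
    for i, c in enumerate(calls):
        total += c.exec_cycles
        if i + 1 < len(calls):
            # Stall only for the portion of the next DMA not yet complete.
            total += max(0, calls[i + 1].dma_cycles - c.exec_cycles)
    return total


def idle_cycles(total, calls):
    """CPU idle time is whatever part of the total is not spent executing."""
    return total - sum(c.exec_cycles for c in calls)


if __name__ == "__main__":
    # Hypothetical call sequence; the numbers are made up for illustration.
    calls = [Call("f", 100, 40), Call("g", 30, 50), Call("h", 80, 60)]
    for label, fn in (("on-demand", on_demand_cycles),
                      ("prefetch", prefetch_cycles)):
        total = fn(calls)
        print(label, "total:", total, "idle:", idle_cycles(total, calls))
```

In this model, a transfer is fully hidden whenever the preceding function runs at least as long as the transfer takes, and the CPU stalls only for the remainder otherwise. A real compile-time scheme must additionally ensure that a prefetch never overwrites code that is still needed, which this model does not capture.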