TCP offload has attracted a great deal of industrial interest. However, little research has examined the benefits of offload for today's popular application-layer protocols, in particular the Session Initiation Protocol (SIP). In this paper, we profile the processing of SIP stacks and find that, for typical SIP scenarios and across different SIP stack implementations, SIP parsing consumes a significant share (20%-40%) of CPU time. Based on this finding, we propose a SIP offload scheme, termed the SIP offload engine (SOE), that offloads the SIP parser from the SIP stack. We implement a prototype of SOE, and the benchmarking results indicate that the throughput gain varies greatly depending on the server's pipelining architecture. We also observe that SIP retransmission causes "receive livelock" in overloaded high-performance SIP servers.