Advances in virtualization technologies and edge computing have inspired a new paradigm for Internet-of-Things (IoT) application development. By breaking a monolithic application into loosely coupled microservices, great gain can be achieved in performance, flexibility and robustness. In this paper, we study the important problem of load balancing across IoT microservice instances. A key difficulty in this problem is the interdependencies among microservices: the load on a successor microservice instance directly depends on the load distributed from its predecessor microservice instances. We propose a graph-based model for describing the load dependencies among microservices. Based on the model, we first propose a basic formulation for load balancing, which can be solved optimally in polynomial time. The basic model neglects the quality-of-service (QoS) of the IoT application. We then propose a QoS-aware load balancing model, based on a novel abstraction that captures a realization of the application's internal logic. The QoS-aware load balancing problem is NP-hard. We propose a fully polynomial-time approximation scheme for the QoS-aware problem. We show through simulation experiments that our proposed algorithm achieves enhanced QoS compared to heuristic solutions.