Today, multimedia data are produced in massive quantities, thanks to a diverse spectrum of applications including entertainment, surveillance, e-commerce, web, and social media. In particular, social media data have three challenging characteristics: data sizes are enormous, data are often multi-faceted, and data are dynamic. Tensors (multi-dimensional arrays) are widely used for representing such high-order dimensional data. Consequently, a system dealing with social media data needs to scale with the tensor volume and the number and diversity of the data facets. This necessitates highly parallelizable, and in many cases cloud-based, frameworks for scalable processing and efficient analysis of large media and social media collections. Most multimedia applications share a few core operations, including integration/fusion, classification, clustering, graph analysis, near-neighbor search, and similarity search. When performed naively, however, these core operations are often very costly, because the number of objects and object features that need to be considered can be prohibitive. Avoiding this cost requires that redundant work is avoided. Thus, for the next generation cloud-based massive media processing and analysis systems to have transformative impact, the fundamental principles that govern their design must include an awareness of the utilities of data and features to a particular analysis task. Recently, the observation that - while not all - a significant class of data processing applications can be expressed in terms of a small set of primitives that are, in many cases, easy to parallelize, has led to frameworks, such as MapReduce, which have been successfully applied in data processing, mining, and information retrieval domains. Yet, in many other domains (including many aggregation and join tasks that are hard to parallelize) they significantly lag behind traditional solutions. In particular, many multimedia and social media analysis tasks are in the category of applications that pose significant challenges. In this talk, I will present an overview of recent developments in the area of scalable multimedia and social media retrieval and analysis in the cloud and our own efforts [1, 2, 3, 4, 5, 6] to build a scalable data processing middleware, called RanKloud, specifically sensitive to the needs and requirements of multimedia and social media analysis applications. RanKloud avoids waste by intelligently partitioning the data and allocating it on available resources to minimize the data replication and indexing overheads and to prune superfluous low-utility processing. It also includes a tensor-based relational data model to support the complete lifecycle (from collection to analysis) of the data, involving various integration and other manipulation steps. RanKloud also addresses the computational cost of various multi-dimensional data analysis operations, including decomposition or structural change detection, by (a) leveraging a priori background knowledge (or metadata) about one or more domain dimensions and (b) by extending compressed sensing (CS) to tensor data to encode the observed tensor streams in the form of compact descriptors. RanKloud will extend the scope of cloud-based systems to the delivery of efficient and large scale analysis over data with variable utility and, thus, will enable new and efficient applications, tools, and systems for multimedia and social media retrieval and analysis.