TY - CONF
T1 - PrOSe
T2 - 30th British Machine Vision Conference, BMVC 2019
AU - Shukla, Ankita
AU - Bhagat, Sarthak
AU - Uppal, Shagun
AU - Anand, Saket
AU - Turaga, Pavan
N1 - Funding Information:
This work was supported partially by ARO grant number W911NF-17-1-0293, at Arizona State University, USA and Infosys Center for Artificial Intelligence at IIIT-Delhi, India.
Publisher Copyright:
© 2019. The copyright of this document resides with its authors.
PY - 2020
Y1 - 2020
N2 - Learning representations that can disentangle explanatory attributes underlying the data improves interpretability and provides control over data generation. Various learning frameworks such as VAEs, GANs and auto-encoders have been used in the literature to learn such representations. Most often, the latent space is constrained to a partitioned representation or structured by a prior to impose disentangling. In this work, we advance the use of a latent representation based on a product space of Orthogonal Spheres (PrOSe). The PrOSe model is motivated by the reasoning that latent variables related to the physics of image formation can, under certain relaxed assumptions, lead to spherical spaces. Orthogonality between the spheres is motivated via physical independence models. Imposing the orthogonal-sphere constraint is much simpler than other complicated physical models, is fairly general and flexible, and is extensible beyond the factors used to motivate its development. Under the further relaxed assumption of equal-sized latent blocks per factor, the constraint can be written down in closed form as an orthonormality term in the loss function. We show that our approach improves the quality of disentanglement significantly. We find consistent improvement in disentanglement compared to several state-of-the-art approaches, across several benchmarks and metrics.
AB - Learning representations that can disentangle explanatory attributes underlying the data improves interpretability and provides control over data generation. Various learning frameworks such as VAEs, GANs and auto-encoders have been used in the literature to learn such representations. Most often, the latent space is constrained to a partitioned representation or structured by a prior to impose disentangling. In this work, we advance the use of a latent representation based on a product space of Orthogonal Spheres (PrOSe). The PrOSe model is motivated by the reasoning that latent variables related to the physics of image formation can, under certain relaxed assumptions, lead to spherical spaces. Orthogonality between the spheres is motivated via physical independence models. Imposing the orthogonal-sphere constraint is much simpler than other complicated physical models, is fairly general and flexible, and is extensible beyond the factors used to motivate its development. Under the further relaxed assumption of equal-sized latent blocks per factor, the constraint can be written down in closed form as an orthonormality term in the loss function. We show that our approach improves the quality of disentanglement significantly. We find consistent improvement in disentanglement compared to several state-of-the-art approaches, across several benchmarks and metrics.
UR - http://www.scopus.com/inward/record.url?scp=85087333187&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85087333187&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85087333187
Y2 - 9 September 2019 through 12 September 2019
ER -