Determining in Vivo Protein Complex Stoichiometry from Superresolution Microscopy

Project: Research project

Project Details


We have just completed and submitted both of our reviews. The Adv. Chem. Phys. review has been accepted (it was invited and not refereed), while the Chem. Rev. review has been refereed.

We are currently addressing minor revisions. In particular, our lengthy Chem. Rev. article details every step in the analysis of superresolution data: acquiring the data, localizing fluorophores, counting fluorophores, tracking particles, linking superresolved particle trajectories, and interpreting the data.

Our manuscript on counting by photobleaching has been accepted in Molecular Biology of the Cell. Counting by photobleaching is a natural extension of counting by PALM: it trades the spatial resolution afforded by PALM for simpler analysis. We have made our photobleaching analysis code available on request by email. The code itself will be available on the PI's new website once that website has been transferred to the PI's new institution (ASU).
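
To make the counting problem concrete, the following is a minimal, purely illustrative sketch and not the analysis code referred to above. It assumes a hypothetical model in which each fluorophore bleaches irreversibly and independently with a fixed probability per frame, contributes a fixed brightness while active, and is observed with Gaussian noise. All function and parameter names are assumptions made for illustration.

```python
import random


def simulate_photobleaching_trace(n_fluorophores, n_frames, bleach_prob,
                                  brightness, noise_sd, seed=0):
    """Simulate an intensity trace for fluorophores that bleach irreversibly.

    Returns (trace, counts), where counts[t] is the number of fluorophores
    still active in frame t. Hypothetical toy model, for illustration only.
    """
    rng = random.Random(seed)
    active = n_fluorophores
    trace, counts = [], []
    for _ in range(n_frames):
        counts.append(active)
        # Observed intensity: active fluorophores times brightness, plus noise.
        trace.append(active * brightness + rng.gauss(0.0, noise_sd))
        # Each active fluorophore bleaches independently in this frame.
        bleached = sum(1 for _ in range(active) if rng.random() < bleach_prob)
        active -= bleached
    return trace, counts


def naive_count(trace, brightness, n_avg=5):
    """Naive estimate: mean of the first n_avg frames divided by brightness."""
    start = sum(trace[:n_avg]) / n_avg
    return round(start / brightness)


if __name__ == "__main__":
    trace, counts = simulate_photobleaching_trace(6, 200, 0.05, 100.0, 20.0)
    print("true initial count:", counts[0])
    print("naive estimate:", naive_count(trace, 100.0))
```

The naive estimate presumes that the single-fluorophore brightness is known and that little bleaching occurs in the first few frames. Neither is guaranteed in practice, which is one reason a statistical treatment of the full trace is attractive.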

As a future extension of our work, we have come to realize that the analysis of photobleaching traces is very similar to the analysis of fluorescence time traces recorded as fluorophores move in and out of an illuminated confocal volume (i.e., the raw trace before the signal is correlated in time in fluorescence correlation spectroscopy [FCS]). We are thus adapting our photobleaching algorithm to analyze raw (i.e., correlation-less) FCS data, thereby avoiding artifacts introduced by correlating the data.

What is more, our work on PALM that exploits properties of aggregated Markov Models (AMMs) has motivated a new direction on the topic of Hidden Markov Models (HMMs) (of which AMMs are a special case). Briefly, HMMs have been a workhorse of time series analysis in Biophysics. Yet they have a key limitation: the number of states must be assumed a priori. This limitation is especially problematic when: 1) due to noise, the number of states visited in a time trace is unclear; or 2) due to the limited length of a time series (e.g. arising from photobleaching in FRET), not all states are explored in each trace. For this reason, many new methods have been combined with HMMs to address this key limitation as a post-processing step.
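
The limitation described above can be seen directly in how an HMM likelihood is evaluated. The sketch below is a generic scaled forward algorithm with Gaussian emissions, written under the assumption of a shared noise standard deviation; it is not the project's code, and all names are illustrative. The number of states K is fixed by the sizes of the inputs, so comparing different values of K means refitting and rescoring the model for each candidate.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def hmm_log_likelihood(trace, init, trans, means, sd):
    """Log-likelihood of a trace under a K-state HMM (scaled forward algorithm).

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    means[i]    : mean signal level of state i (shared noise sd, an assumption)
    """
    k = len(init)  # the number of states is baked into the model inputs
    alpha = [init[i] * gaussian_pdf(trace[0], means[i], sd) for i in range(k)]
    log_like = 0.0
    for t in range(len(trace)):
        if t > 0:
            # Propagate one step through the transition matrix, then weight
            # by the emission density of the new observation.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(k))
                * gaussian_pdf(trace[t], means[j], sd)
                for j in range(k)
            ]
        norm = sum(alpha)
        log_like += math.log(norm)
        alpha = [a / norm for a in alpha]  # rescale to avoid underflow
    return log_like


if __name__ == "__main__":
    data = [0.1, -0.2, 1.1, 0.9, 1.2, 0.0]
    two_state = hmm_log_likelihood(
        data, [0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], [0.0, 1.0], 0.3
    )
    print("log-likelihood with K = 2:", two_state)
```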

Indeed, it was only in 2012 that the infinite HMM (iHMM) clearly and elegantly resolved this limitation by cleverly exploiting fundamentally new mathematics (Bayesian nonparametrics, discovered only in the mid-1970s). iHMMs, and Bayesian nonparametrics more broadly, have since revolutionized data science, and we believe they will inevitably alter the course of data analysis in Biophysics within the next decade.
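
As a hedged illustration of the nonparametric idea, the sketch below samples from a Chinese restaurant process, the simplest representation of a Dirichlet process prior. This is only a building block: the iHMM itself rests on a hierarchical extension of this construction, which is not shown here. The function name and the concentration parameter alpha are illustrative. The point is that the number of clusters (the analogue of states) is not fixed in advance but grows with the amount of data.

```python
import random


def chinese_restaurant_process(n_points, alpha, seed=0):
    """Sample cluster labels from a Chinese restaurant process.

    Point n (0-based) joins existing cluster k with probability
    counts[k] / (n + alpha) and opens a new cluster with probability
    alpha / (n + alpha). Requires alpha > 0.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of points currently in cluster k
    labels = []
    for n in range(n_points):
        r = rng.random() * (n + alpha)
        chosen = len(counts)  # default: open a new cluster
        cumulative = 0.0
        for k, c in enumerate(counts):
            cumulative += c
            if r < cumulative:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(0)
        counts[chosen] += 1
        labels.append(chosen)
    return labels, counts


if __name__ == "__main__":
    for n in (10, 100, 1000):
        _, counts = chinese_restaurant_process(n, alpha=1.0)
        print(n, "points ->", len(counts), "clusters")
```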

Despite this revolution in data science, to our knowledge only a single article covering the broad developments of Bayesian nonparametrics over 40 years has ever been published for a physical sciences audience: the paper by K. Hines in Biophys. J., which covers iHMMs in only a few lines. Yet articles continue to be published that circumvent difficulties of the HMM already addressed by the iHMM. Part of the challenge our community faces is understanding the iHMM literature, which is currently written for a rarefied group of Computer Scientists and Statisticians.

Our recent work takes iHMMs one step further. We are attempting to generalize iHMMs to deal with the drift common in biophysical time traces. Understandably, drift is not a problem in computer science applications and has not been dealt with thus far. Simply put, iHMMs can misinterpret drift as additional, artifactual states. For this reason, we have begun modeling drift as a continuous process that must be learned simultaneously with all other quantities determined by the iHMM. We have begun applying our method to real and synthetic data.
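
The effect of drift can be illustrated with a toy simulation. The sketch below is not the method under development; it assumes a hypothetical two-level signal with a linear drift term and uses a crude proxy for "apparent states", namely the number of distinct intensity bins the trace visits. All names and parameter values are illustrative.

```python
import random


def simulate_trace(n_frames, levels, switch_prob, drift_per_frame, noise_sd, seed=0):
    """Simulate a discrete-state trace with additive linear drift and noise.

    In each frame, with probability switch_prob the state is redrawn
    uniformly from the available levels (it may stay the same).
    Returns (trace, states).
    """
    rng = random.Random(seed)
    state = 0
    trace, states = [], []
    for t in range(n_frames):
        if rng.random() < switch_prob:
            state = rng.randrange(len(levels))
        states.append(state)
        trace.append(levels[state] + drift_per_frame * t + rng.gauss(0.0, noise_sd))
    return trace, states


def occupied_bins(trace, bin_width):
    """Count distinct intensity bins visited: a crude proxy for apparent states."""
    return len({int(x // bin_width) for x in trace})


if __name__ == "__main__":
    flat, _ = simulate_trace(500, [0.0, 1.0], 0.1, 0.0, 0.0)
    drifting, _ = simulate_trace(500, [0.0, 1.0], 0.1, 0.01, 0.0)
    print("bins without drift:", occupied_bins(flat, 0.5))
    print("bins with drift:", occupied_bins(drifting, 0.5))
```

With the same underlying state sequence, the drifting trace spreads over many more intensity bins than the two true levels, which is the kind of spurious structure a state-counting method could mistake for extra states.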

If the impact of Bayesian nonparametric methods in machine learning and data science is any indication, these deeply innovative methods are here to stay. The best my group can hope to accomplish on this front is to adapt these tools for Biophysics in order to speed up their inevitable adoption.

The PI has volunteered to give additional lectures at the qBio Summer School in Fort Collins in 2017. The lectures will cover broad areas of data analysis, including model selection, Markov models, and some aspects of time series analysis.

Beyond the three analysis codes we have already made available thanks to this award (PALM analysis, FCS analysis, and photobleaching data analysis), we have also begun developing code to analyze biophysical time traces using infinite Hidden Markov Models (iHMMs).

I have incorporated material from my Award into the Biophysics course taught in the Fall of 2016. I have hosted high school students every summer while on this Award (which ends in June 2017, absent a no-cost extension).

At ASU, I have begun coordinating with instructors teaching large undergraduate classes in order to recruit talented undergraduate students into my lab in the near future.
Effective start/end date: 1/1/17 to 6/30/18


  • NSF: Directorate for Biological Sciences (BIO): $151,382.00

