Regarding nitrogen vs ethane: because the ethane molecule (C2H6) has more rotational and vibrational degrees of freedom than the nitrogen molecule (N2), its heat capacity is higher. This makes it better at cooling things quickly. Relatedly, ethane's enthalpy of vaporization is also higher (~15 kJ/mol vs ~6 kJ/mol).
Source: NIST thermodynamic data
The really critical thing with ethane is that it is far below its boiling point when you plunge into it, so you vastly reduce the Leidenfrost effect. The heat capacity and enthalpy of vaporization also play roles, but the Leidenfrost effect is what really kills freezing.
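To put rough numbers on this, here is a minimal Python sketch. The boiling points are standard 1 atm values and the enthalpies are the approximate figures quoted above. The bath temperatures are assumptions for illustration (liquid nitrogen sitting at its own boiling point, ethane held at about 93 K by a surrounding nitrogen bath), and `subcooling_margin` is a hypothetical helper name.

```python
# Rough comparison of plunge-freezing cryogens (illustrative values only).
# Boiling points are 1 atm values; bath temperatures are ASSUMPTIONS.

CRYOGENS = {
    # name: (boiling point in K, assumed bath temperature in K,
    #        approximate enthalpy of vaporization in kJ/mol)
    "nitrogen": (77.4, 77.4, 6.0),   # assumed: open LN2 bath sits at its boiling point
    "ethane": (184.6, 93.0, 15.0),   # assumed: ethane cooled by LN2 to just above freezing
}


def subcooling_margin(boiling_point_k, bath_temp_k):
    """Kelvin of warming the liquid can absorb before it starts to boil."""
    return boiling_point_k - bath_temp_k


for name, (t_boil, t_bath, h_vap) in CRYOGENS.items():
    margin = subcooling_margin(t_boil, t_bath)
    print(f"{name}: margin before boiling = {margin:.1f} K, "
          f"enthalpy of vaporization ~ {h_vap} kJ/mol")
```

Under these assumed values nitrogen has no margin at all, so any heat from the warm grid goes straight into making an insulating vapor layer, while ethane can warm by roughly 90 K before it boils.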
Well done, but it's worth noting that ML has been applied to problems outside of just conformational heterogeneity in cryo-EM for some time. Worth shouting out a few of these below:
DeepEMhancer, which helps sharpen maps to make them more interpretable: https://www.nature.com/articles/s42003-021-02399-1
ModelAngelo, which has revolutionized model building into maps, particularly when amino acid sequences are unknown (allowing for structure-based discovery): https://www.nature.com/articles/s41586-024-07215-4
CryoSPARC's 3DFlex: https://guide.cryosparc.com/processing-data/all-job-types-in-cryosparc/variability/job-3d-flexible-refinement-3dflex-beta | https://www.nature.com/articles/s41592-023-01853-8
Blush: https://www.nature.com/articles/s41592-024-02304-8
TOPAZ particle picking: https://www.nature.com/articles/s41592-019-0575-8
crYOLO particle picking: https://pmc.ncbi.nlm.nih.gov/articles/PMC6584505/
Well done! Really enjoyed this article, as it covered much of my PhD work. Very interesting to learn about the updates in ML for cryo-EM (cryoDRGN & Hydra). It's worth noting that cryo-EM is actually a broader term with several subcategories. The one described in this article seems to be single particle analysis, but there is also electron tomography, electron crystallography, and MicroED!
Glad you enjoyed it! And my bad 😅 I assumed those were distinct fields and not underneath the cryo-EM name, thank you for the comment
Thanks for the overview! Love your writing in general. I'm on the ML side and I'm a bit confused why the model couldn't transfer from one protein to another. It seems to me that the task of 2D particle image <-> 3D reconstruction should be independent of what you are imaging and that if you trained on enough samples / had a big enough model you could have a pretty general latent encoding for biological macromolecules. Is it because there is some information loss when you go from 3D to 2D or something?
seems like there do exist some foundation-y models for cryo-EM: https://openreview.net/forum?id=T4sMzjy7fO
but it also seems like, currently, these models are mostly good as priors rather than something you can trust out of the box. which also seems to mirror most biology foundation models (outside of maybe protein models)
> This was weird! Cryo-EM wasn’t something I ever saw much. While, admittedly, I was entirely ignorant of the field until 2022, it still felt like it wasn’t a very popular topic. Most people seem to work in small molecule behavior prediction or antibody modeling or something you’d see dozens of papers about at a NeurIPS workshop.
Very interesting remark. I agree, you don't see much about cryo-EM in ML, even though it is an extremely useful technique and a domain where ML seems potentially able to offer a lot. To me, this speaks to an over-emphasis in the field on achieving big, flashy results ("we SOLVED the protein-folding problem!") and a reluctance to do the kind of less sexy grinding that is important to move science forward. To some extent, that is born out of a healthy ambition (if we can solve the big problems, why settle for less?), but IMO there is also an element of the kind of arrogance that people used to associate primarily with physicists.
Great article! I don't think experimentalists stick with their plunging technique because switching isn't easy enough, but rather because it's expensive. Vitrobots (the instruments that help with more reproducible freezing) are incompatible with nitrogen freezing because heavy airflow is needed to remove the layer of nitrogen vapor you mention. These instruments cost upwards of 100k, so departments that have invested in one probably won't buy another expensive instrument and then tinker with the experimental parameters required for reproducibility between samples frozen on a different instrument (each Vitrobot even needs to be calibrated for force and blot time, which aren't consistent between instruments). The number of optimizations required to get reproducible EM is staggering, and then preferential orientation comes in :)
This is a really great primer, even from a structural biology perspective. I particularly enjoyed your explanation of the Fourier transform in the context of 3D refinement. X-ray crystallography involves the Fourier transform too: the diffraction pattern is the Fourier transform of the electron density, and because detectors record only intensities and not phases, the phase problem arises. Just wanted to add this, as I think it's pretty cool how the FT just seems to be everywhere. Again, a really great primer!
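For anyone on the ML side who wants to see the phase problem concretely, here is a minimal sketch using only the Python standard library. It is a toy under stated assumptions: the 1D "density" values are made up, `dft` and `inverse_dft` are hypothetical helper names for a naive discrete Fourier transform, and real crystallography works in 3D with measured intensities.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a 1D sequence."""
    n = len(signal)
    return [
        sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
        for k in range(n)
    ]


def inverse_dft(coeffs):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(coeffs)
    return [
        sum(coeffs[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)).real / n
        for x in range(n)
    ]


# Toy 1D "electron density": two atoms of different weight (made-up values).
density = [0.0, 3.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]

structure_factors = dft(density)                  # complex: amplitude and phase
amplitudes = [abs(f) for f in structure_factors]  # all an intensity measurement gives you

# With the phases kept, the inverse transform recovers the density.
with_phases = inverse_dft(structure_factors)

# With the phases discarded (here: all set to zero), it does not.
without_phases = inverse_dft([complex(a, 0.0) for a in amplitudes])

print("original      :", [round(v, 2) for v in density])
print("with phases   :", [round(v, 2) for v in with_phases])
print("without phases:", [round(v, 2) for v in without_phases])
```

The amplitudes alone are consistent with many different densities, which is why crystallographers need extra information (or a good starting model) to estimate the phases.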
Good write-up! One additional constraint in TEM of biological specimens is that the electron beam delivers so much energy that samples are rapidly destroyed. It’s one of the reasons things are kept cold: damage is limited. Detector upgrades that allow faster imaging, capturing images before samples are destroyed, were another breakthrough. My sense is that a lot of the barriers to high-resolution cryo-EM revolved around being unable to get a high-information image before your sample was destroyed by the electron beam. I’m not an expert, but it’s a fascinating technology, and I’m always surprised we can extract 2 Angstrom detail out of those noisy 500px x 500px particle images.
There are some calculations that suggest a sample under the electron beam receives a radiation dose equivalent to a 10 megaton fusion bomb ~30 m away. Ouch. A nice set of notes here: https://bakerlab.ucsd.edu/courses/protected/chem165-13/Sec_IIB.pdf
Thanks for putting together all these interesting papers in one post! Perhaps in "why is there a carbon grid on top of the carbon grid?", the last "carbon" should be "copper"?