Das lab creates language model for protein interactions

A team of researchers in Assistant Professor Jishnu Das’s lab has developed a new language model that could revolutionize how scientists understand protein interactions. This model, Sliding Window Interaction Grammar (SWING), captures the nuances of interactions between proteins that current protein language models do not. 

SWING is a biologically inspired interaction language model, which uses a “sliding window” to pair residues across proteins and generate a language-like representation of the paired sequences. This approach creates a representation that is specific to the protein-protein interaction. 
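To make the sliding-window idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the Das lab's implementation: the function names, the choice to join each pair of residues into a two-letter token, and the grouping of tokens into k-length "words" are all hypothetical choices made for this example, and the article does not specify how SWING actually encodes pairs.

```python
from typing import List


def sliding_window_pairs(peptide: str, protein: str) -> List[List[str]]:
    """Slide the peptide along the protein and pair the aligned residues.

    Hypothetical encoding: each pair becomes a two-letter token
    (peptide residue followed by protein residue).
    """
    windows = []
    # One window per offset at which the peptide fits fully inside the protein.
    for start in range(len(protein) - len(peptide) + 1):
        segment = protein[start:start + len(peptide)]
        windows.append([p + q for p, q in zip(peptide, segment)])
    return windows


def to_kmers(tokens: List[str], k: int) -> List[str]:
    """Group consecutive pair tokens into overlapping k-length 'words'."""
    return ["-".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


def interaction_document(peptide: str, protein: str, k: int = 2) -> List[str]:
    """Build a language-like 'document' of words for one interacting pair."""
    words: List[str] = []
    for window in sliding_window_pairs(peptide, protein):
        words.extend(to_kmers(window, k))
    return words


if __name__ == "__main__":
    # Toy sequences chosen only for illustration.
    print(interaction_document("AC", "DEFG", k=2))
```

Because every word is built from residues of both sequences, the resulting document describes the pairing itself rather than either protein alone, which is the property the paragraph above refers to.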

PhD students Alisa Omelchenko, Jane Siwek and Prabal Chhibbar contributed to this research. “The three of us are all first co-authors,” Siwek said. “We all contributed equally, and it was definitely a team effort.” 

This was also a key collaboration with the Joglekar lab. Assistant Professor Alok Joglekar was a co-corresponding author on the paper.  

The Das lab applied SWING to peptide-protein interactions and their human counterparts. SWING can predict how missense variants alter protein interactions and determine the effect of genetic variation on peptide-protein interactions.

“SWING is very different, because not only did we create a language model, but we also ended up creating the language as well,” Chhibbar said. 

SWING differs from other interaction language models in that it focuses on the biologically formed language of protein interactions. By learning the grammar and vocabulary of that language, SWING can infer protein-protein interactions across biological contexts.

“This model is really flexible, and we have proved that you can somehow codify the interaction of two biological molecules to then understand certain interactions or binding,” Omelchenko said. 

The Das lab envisions that SWING will be a useful tool for vaccine design, for understanding immune attack and tolerance, and for studying genetic risk of disease. It will also benefit researchers doing cross-species studies because it can be applied across a variety of species, including mice and humans. In conjunction with the Joglekar lab, the Das lab is currently exploring a wide variety of uses for SWING.

“This is really the tip of the iceberg,” Das said. “Protein language models are extremely powerful, but they are not sufficient at capturing complex biological interactions. We hope that SWING will open avenues to doing this.”