Karishma: “Would you be open to a remote internship using bioinformatics approaches to open a completely new research area in my group?” was my offer to three students from different institutes in the final year of their Master’s programs. This was in response to their emails in June 2020, in the middle of a pandemic-related lockdown. Gandhar Tendulkar (MSc Bioinformatics, Sir Sitaram and Lady Shantabai Patkar College of Arts & Science and V. P. Varde College of Commerce & Economics (Autonomous), Mumbai), Shreeya Mhade (MSc Bioinformatics, Guru Nanak Khalsa College (Autonomous), Mumbai) and Stutee Panse (currently pursuing MS Biotechnology, The Pennsylvania State University, USA) were seeking research internships with me (Karishma S. Kaushik, Assistant Professor, Savitribai Phule Pune University, Pune) in what was undoubtedly a tough time to find one.
Starting the project
Karishma: As with most research groups, my lab at the University, which focused entirely on ‘bench’ experiments, was closed. This unprecedented lack of access to wet-lab facilities had prompted me to think of how we could expand our research program, which focused on human-relevant infection biology, using computer-based experimental tools. I responded to the three students with the best offer I could at the time. The offer was daunting, but the young researchers were up to the challenge.
Building a team
Shreeya: In our first meeting, we shared our skill sets and knowledge domains. While Gandhar and I were familiar with select bioinformatics tools, Stutee had a strong background in microbiology. These diverse yet complementary skill sets enabled us to build the project. While Master’s students are typically expected to seek out research projects on their own, this project offers an example of an alternative ‘team model’, in which students seek out opportunities in small groups. This provides students with a more realistic research experience (collaborators, team members from diverse backgrounds, and cross-talk between fields) and can lead to a complete project (in the form of a publication).
Arriving at the research idea
Karishma: Prior to lockdown, my research group was looking to start working with Corynebacterium striatum, an emerging, highly resistant, biofilm-forming wound pathogen. I proposed using in silico approaches to identify potential small molecules or natural inhibitors as anti-biofilm agents in C. striatum, which could serve as a filtered list for subsequent in vitro evaluation. This required a two-pronged approach: identifying a potential anti-biofilm target in C. striatum, as well as searching for possible candidate agents.
To start with, Stutee, Shreeya, and Gandhar worked together, looking at repositories of small molecules or natural inhibitors. Snehal, a former researcher in my group, helped keep the project on track in these early days. There were many unknowns, including the science and team dynamics.
Becoming a community resource
Shreeya: Since the initial idea was to identify small molecules or natural inhibitors as anti-biofilm agents, we started looking at antimicrobial peptides (AMPs) as potential candidates. Antimicrobial compounds act differently on free-floating bacterial cells compared with aggregates of bacteria, as seen in biofilms. Further, biofilm testing in laboratories is time- and resource-intensive. Given this, preliminary in silico studies such as molecular docking can help narrow down candidate anti-biofilm agents. For this, we identified an exhaustive list of AMPs and started developing structural AMP models using a range of molecular modeling tools.
Stutee: While Shreeya and Gandhar were troubleshooting the modeling tools, I started looking for candidate proteins or enzymes essential for biofilm formation in C. striatum. Based on previous literature, sortase C is important for biofilm formation in Gram-positive pathogens; however, the crystal structure of the C. striatum protein was not available. I worked with Shreeya and Gandhar to develop a homology model of the C. striatum sortase C protein. At this point, we had a potential anti-biofilm target and an array of predicted AMP structural models. The next phase involved extensive protein-peptide molecular docking, which was used to generate a preference score for ranking candidate AMPs for in vitro evaluation.
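To make the idea of a docking-based preference list concrete, here is a minimal Python sketch. It is an illustration only and not the team's actual script: the peptide identifiers and scores are invented placeholders, and it assumes a scoring convention in which lower (more negative) values indicate more favorable predicted binding.

```python
from typing import List, NamedTuple


class DockingResult(NamedTuple):
    peptide_id: str  # hypothetical identifier, not a real B-AMP entry
    score: float     # assumed convention: lower (more negative) is better


def rank_candidates(results: List[DockingResult], top_n: int) -> List[DockingResult]:
    """Return the top_n results ordered from most to least favorable score."""
    return sorted(results, key=lambda r: r.score)[:top_n]


# Placeholder values for illustration only.
results = [
    DockingResult("AMP_A", -7.2),
    DockingResult("AMP_B", -9.1),
    DockingResult("AMP_C", -5.4),
    DockingResult("AMP_D", -8.3),
]

shortlist = rank_candidates(results, top_n=2)
for rank, result in enumerate(shortlist, start=1):
    print(rank, result.peptide_id, result.score)
```

A real pipeline would read the scores from the docking software's output files, and the ranking criterion would depend on that software's scoring function.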
Karishma: We realized that, in addition to developing a pipeline for identifying candidate AMPs for anti-biofilm testing, we had built a vast library of 3D AMP structural models. Further, we had script-based filtered lists of models of AMPs with known anti-Gram-positive and anti-Gram-negative activity. Given the paucity of AMP resources for biofilm studies and the lack of structural AMP models, we decided to build the project into a community resource that could be leveraged by researchers across the fields of basic, clinical and applied microbiology, including biofilms and antibiotic resistance, and bioinformatics. What had started as a project to identify anti-biofilm candidates against a single pathogen was now turning into a large-scale repository of AMPs for biofilm studies.
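As a rough illustration of what a script-based filtered list can look like, the Python sketch below selects peptides by an activity label. The record layout, identifiers, and label strings are assumptions made for this example and do not reflect the repository's actual data format.

```python
# Hypothetical records: IDs and activity labels are placeholders,
# not actual entries from the repository.
AMP_RECORDS = [
    {"id": "AMP_001", "activities": {"anti-Gram-positive"}},
    {"id": "AMP_002", "activities": {"anti-Gram-negative"}},
    {"id": "AMP_003", "activities": {"anti-Gram-positive", "anti-Gram-negative"}},
    {"id": "AMP_004", "activities": set()},
]


def filter_by_activity(records, activity):
    """Return the IDs of records annotated with the given activity label."""
    return [rec["id"] for rec in records if activity in rec["activities"]]


gram_positive = filter_by_activity(AMP_RECORDS, "anti-Gram-positive")
gram_negative = filter_by_activity(AMP_RECORDS, "anti-Gram-negative")
print(gram_positive)
print(gram_negative)
```

A peptide with both labels appears in both lists, so the filtered lists overlap wherever an AMP has broad-spectrum activity.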
Expanding the team to collaborators
Karishma: To build the functional features of the database, we collaborated with Ragothaman M. Yennamalli (Assistant Professor) and Yatindrapravanan Narsimhan (undergraduate student) from SASTRA Deemed University, Thanjavur. Using script-based search tools, they annotated AMPs to existing biofilm literature sources. This meant that the database could also provide information on scientific articles where a particular AMP was evaluated or discussed in the context of biofilms. Overall, this underscores the fact that research projects are very dynamic, and may require bringing in colleagues with the relevant skill sets to take them further at different stages.
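One simple way such an annotation step could work is by matching peptide names against article text. The Python sketch below is a hypothetical illustration, not the collaborators' actual tool: the peptide names, article IDs, and article text are invented, and it assumes exact, case-insensitive name matches.

```python
import re


def annotate(amp_names, articles):
    """Map each AMP name to the IDs of articles whose text mentions it.

    The lookarounds prevent partial matches, so a search for "AMP-1"
    does not match inside "AMP-12".
    """
    hits = {}
    for name in amp_names:
        pattern = re.compile(
            r"(?<!\w)" + re.escape(name) + r"(?!\w)", re.IGNORECASE
        )
        hits[name] = [a["id"] for a in articles if pattern.search(a["text"])]
    return hits


# Invented placeholder articles, for illustration only.
articles = [
    {"id": "art1", "text": "Effect of AMP-1 on biofilm formation"},
    {"id": "art2", "text": "AMP-12 and AMP-1 in mixed biofilms"},
    {"id": "art3", "text": "AMP-120 stability assay"},
]

annotations = annotate(["AMP-1", "AMP-12"], articles)
print(annotations)
```

Real peptide names often have synonyms and spelling variants, so a practical tool would likely need a synonym table on top of plain pattern matching.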
Remote execution and troubleshooting
Karishma: Throughout the one and a half years of the project, we met weekly for an hour and stayed in regular email contact. This was important to keep the project on track and discuss the data as a group. Further, this turned out to be critical in fostering camaraderie amongst the team, given that we were all in different locations. I also know that Gandhar, Stutee, and Shreeya had regular meetings amongst themselves (on rare occasions, even at midnight!) to work in tandem. An example of this was seen when Shreeya and Gandhar were developing the homology model of the sortase C protein. Through her literature review, Stutee identified characteristic residues in the model that would block access to the catalytic site and likely functioned as a flexible ‘hinge’ under cellular conditions. Based on this, Shreeya and Gandhar re-developed the homology model of sortase C to mimic a more physiological conformation of the protein.
Shreeya: Unlike small-molecule ligands, peptides are structurally more flexible and can adopt numerous conformations. This, combined with a paucity of protein-peptide docking software, made the virtual screening of AMPs on a personal computer a challenge. In our late-night sessions, we brainstormed ideas and technical solutions. We had several ‘Eureka’ moments, from figuring out how to use GPU acceleration with AutoDock to understanding how CUDA frameworks accelerate compute-intensive applications.
Stutee: Taken together, the project was a steep learning curve, but we looked at every challenge as an opportunity to find solutions. At the start, modeling over 5000 AMPs seemed like a huge task, but we divided the work and ran it on multiple systems to speed up the output. This allowed us to build structural models of over 5000 AMPs using multiple software tools in a relatively short time frame. On one occasion, we were having significant issues with the molecular docking software. To overcome this, we searched for and compared over 80 existing docking programs, and listed out the pros, cons, and usage possibilities. Through this we learnt a great deal about existing programs, and could make an informed decision on the most suitable one for our use.
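Dividing a long list of peptides across several computers can be as simple as the round-robin split sketched below in Python. This is an assumed scheme for illustration, not a record of how the team actually allocated the work, and the identifiers are placeholders.

```python
def split_into_batches(items, n_systems):
    """Distribute items round-robin into n_systems batches of near-equal size."""
    if n_systems < 1:
        raise ValueError("n_systems must be at least 1")
    return [items[i::n_systems] for i in range(n_systems)]


# Placeholder identifiers; a real run would list thousands of peptides.
peptide_ids = [f"AMP_{i:04d}" for i in range(1, 11)]

batches = split_into_batches(peptide_ids, 3)
for system_index, batch in enumerate(batches, start=1):
    print(f"system {system_index}: {len(batch)} peptides")
```

Batch sizes differ by at most one, so no single system ends up with a much larger share of the modeling jobs.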
Taking the project further
Karishma: Biofilm-AMP (B‑AMP) is now a published structural and functional repository of AMPs for biofilm studies, with a vast library of diverse AMP models (in terms of source, size, structure, and activity), as well as filtered lists of AMPs and protein-peptide interaction models. The functional features of the repository include annotations to >10,000 relevant biofilm literature sources. The database is freely available to the community and has a user-friendly interface, with downloadable files for a range of in silico applications. Our work was published on 16 December 2021 in Frontiers in Cellular and Infection Microbiology.
In what was a tough professional year, Stutee, Shreeya and Gandhar exemplified the potential of student-led research, showing remarkable drive, persistence and ownership throughout the project. We met in person for the first time in August 2021, a year after we began working together, and needless to say it was special. Beyond the science, this research experience will always represent how we turned a tough situation into a collective gain for the group and a team of young researchers. The B‑AMP team will continue their association with the project by updating the repository with new structural models. We will also look to expand the features of the database in collaboration with new colleagues. In the future, we are aiming for B‑AMP to serve as a one-stop resource for AMPs for biofilm studies. Nothing remote about that!