Computational Identification and Mapping of Protein-DNA Interactions

Authors

  • Evan Ru, Department of Chemistry and Biochemistry, George Mason University, Fairfax, VA
  • Kenneth Foreman, Department of Chemistry and Biochemistry, George Mason University, Fairfax, VA

Abstract

Protein-DNA interactions play a crucial role in key biological processes such as gene regulation, replication, and transcription, and understanding them informs the development of gene editing technologies like CRISPR. Machine learning models such as AlphaFold and RoseTTAFold can generally predict protein-DNA structures, but models that can explain the critical binding interactions from protein or DNA sequence alone remain underdeveloped. A better understanding of these interactions should improve the accuracy of such models, potentially to the point of predicting ideal protein binding partners for a given DNA sequence. We developed a Python script that explicitly identifies key interactions between protein and DNA in 89 protein-DNA interfaces from the Protein Data Bank. The script facilitates visualization and analysis of interaction patterns along the DNA sequence, offering a framework for understanding the intermolecular interactions underlying known protein-DNA interfaces and for validating the presence of key interactions on DNA bases. Building on this, we have begun developing a second algorithm that determines the best match among known DNA-binding proteins for an arbitrary stretch of DNA. These algorithms could be adapted to generate modified CRISPR machinery that directly recognizes specific DNA sequences, potentially increasing CRISPR specificity while decreasing dependence on gRNA.
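
The abstract does not describe how the script detects interactions, so the following is only an illustrative sketch of one common approach, not the authors' implementation. It assumes a simple distance criterion: any protein atom within a cutoff of a DNA atom counts as a contact. The 3.5 Å cutoff, the function names find_contacts and contacts_per_nucleotide, the dictionary layout for atoms, and the toy coordinates are all assumptions made for illustration.

```python
import math
from collections import defaultdict

# Standard PDB residue names for the four deoxynucleotides.
DNA_RESIDUES = {"DA", "DT", "DG", "DC"}


def find_contacts(atoms, cutoff=3.5):
    """Return (dna_atom, protein_atom, distance) for pairs within `cutoff` angstroms.

    `atoms` is a list of dicts with keys: chain, resname, resseq, name, xyz.
    The cutoff is an assumed value, not one taken from the abstract.
    """
    dna = [a for a in atoms if a["resname"] in DNA_RESIDUES]
    protein = [a for a in atoms if a["resname"] not in DNA_RESIDUES]
    contacts = []
    for d in dna:
        for p in protein:
            dist = math.dist(d["xyz"], p["xyz"])  # Euclidean distance
            if dist <= cutoff:
                contacts.append((d, p, dist))
    return contacts


def contacts_per_nucleotide(contacts):
    """Count contacts for each DNA position, keyed by (chain, resseq, resname)."""
    counts = defaultdict(int)
    for d, _protein_atom, _dist in contacts:
        counts[(d["chain"], d["resseq"], d["resname"])] += 1
    return dict(counts)


# Hypothetical toy atoms (not real PDB coordinates).
atoms = [
    {"chain": "B", "resname": "DG", "resseq": 5, "name": "O6", "xyz": (0.0, 0.0, 0.0)},
    {"chain": "B", "resname": "DC", "resseq": 6, "name": "N4", "xyz": (10.0, 0.0, 0.0)},
    {"chain": "A", "resname": "ARG", "resseq": 42, "name": "NH1", "xyz": (2.9, 0.0, 0.0)},
    {"chain": "A", "resname": "LYS", "resseq": 50, "name": "NZ", "xyz": (20.0, 0.0, 0.0)},
]

contacts = find_contacts(atoms)
profile = contacts_per_nucleotide(contacts)
for d, p, dist in contacts:
    print(f"{d['resname']}{d['resseq']}:{d['name']} <-> "
          f"{p['resname']}{p['resseq']}:{p['name']}  {dist:.2f} A")
```

A per-nucleotide tally like profile is one way an interaction pattern along the DNA sequence could be plotted. A real analysis would read coordinates from PDB files and would likely distinguish interaction types (hydrogen bonds, salt bridges, base versus backbone contacts), which this sketch does not attempt.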

Published

2024-10-13

Section

College of Science: Department of Chemistry and Biochemistry