Slide #1.

Guiding Belief Propagation using Domain Knowledge for Protein-Structure Determination
Ameet Soni* and Jude Shavlik
Dept. of Computer Sciences
Dept. of Biostatistics and Medical Informatics
Craig Bingman
Dept. of Biochemistry
Center for Eukaryotic Structural Genomics
Presented at the ACM International Conference on Bioinformatics and


Slide #2.

Protein Structure Determination
- Proteins essential to most cellular function
  - Structural support
  - Catalysis/enzymatic activity
  - Cell signaling
- Protein structures determine function
- X-ray crystallography is main


Slide #3.

X-ray Crystallography: Background
[Figure: X-ray beam, protein crystal, collect diffraction pattern, FFT, electron-density map (3D image), interpret, protein structure]


Slide #4.

Task Overview
- Given:
  - A protein sequence
  - Electron-density map (EDM) of protein
- Do:
  - Automatically produce a protein structure (or trace) that is
    - All atom
    - Physically feasible
[Figure label: SAVRVGLAIM...]


Slide #5.

Challenges & Related Work
[Figure: resolution axis from 1Å to 4Å, placing ARP/wARP, TEXTAL & RESOLVE, and our method, ACMI]


Slide #6.

- Background
- ACMI Overview (current section)
- Inference in ACMI-BP
- Guiding Belief Propagation
- Experiments & Results


Slide #7.

Our Technique: ACMI
- Perform Local Match (ACMI-SH): a priori probability of each AA's location
- Apply Global Constraints (ACMI-BP): marginal probability of each AA's location
- Sample Structure (ACMI-PF): all-atom protein structures


Slide #8.

Previous Work [DiMaio et al, 2007]


Slide #9.

ACMI Framework
- Perform Local Match (ACMI-SH): a priori probability of each AA's location
- Apply Global Constraints (ACMI-BP): marginal probability of each AA's location
- Sample Structure (ACMI-PF): all-atom protein structures


Slide #10.

- Background
- ACMI Overview
- Inference in ACMI-BP (current section)
- Guiding Belief Propagation
- Experiments & Results


Slide #11.

ACMI-BP
- ACMI models the probability of all possible traces using a pairwise Markov Random Field (MRF)
[Figure: MRF nodes ALA, GLY, LYS, LEU, SER]


Slide #12.

ACMI-BP: Pairwise Markov Field
- Model ties adjacency constraints, occupancy constraints, and Phase 1 priors
[Figure: MRF nodes ALA, GLY, LYS, LEU, SER]
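
A minimal sketch of how a pairwise MRF could score one candidate trace, under loudly stated assumptions: locations are cells on a toy 1D grid, node potentials stand in for the Phase 1 priors, adjacency potentials link sequence neighbors, and occupancy potentials forbid two non-neighboring residues from sharing a cell. The potential functions and all numbers are hypothetical and are not ACMI's actual model.

```python
from itertools import combinations

RESIDUES = ["ALA", "GLY", "LYS", "LEU", "SER"]

def adjacency(u_i, u_j):
    """Hypothetical adjacency potential: sequence neighbors should sit one cell apart."""
    return 1.0 if abs(u_i - u_j) == 1 else 0.1

def occupancy(u_i, u_j):
    """Hypothetical occupancy potential: two residues cannot share a cell."""
    return 0.0 if u_i == u_j else 1.0

def trace_score(trace, priors):
    """Unnormalized probability of a trace (one cell index per residue)."""
    score = 1.0
    # Node potentials: prior probability of each residue's chosen cell.
    for i, cell in enumerate(trace):
        score *= priors[i][cell]
    # Pairwise potentials: adjacency for sequence neighbors, occupancy otherwise.
    for i, j in combinations(range(len(trace)), 2):
        if j == i + 1:
            score *= adjacency(trace[i], trace[j])
        else:
            score *= occupancy(trace[i], trace[j])
    return score

# Uniform toy priors over 6 cells for each residue (assumed, for illustration).
priors = [[1.0 / 6] * 6 for _ in RESIDUES]
good = trace_score([0, 1, 2, 3, 4], priors)   # extended chain, no clashes
clash = trace_score([0, 1, 0, 1, 2], priors)  # residues 0 and 2 share a cell
```

In this toy setting the clashing trace scores zero because one occupancy potential vanishes, while the extended chain keeps only its prior mass.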


Slide #13.

Approximate Inference
- P(U|M) intractable to calculate, maximize exactly
- ACMI-BP uses Loopy Belief Propagation (BP)
  - Local, message-passing scheme
  - Distributes evidence between nodes
  - Approximates marginal probabilities if graph has cycles
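
The message-passing idea can be sketched with a generic sum-product loop on a tiny cyclic graph. This is a textbook-style illustration, not ACMI-BP itself: the function name `loopy_bp`, the binary states, and the potentials are assumptions made for the example, and the fixed sweep order is just the simplest choice.

```python
def normalize(vec):
    total = sum(vec)
    return [v / total for v in vec]

def loopy_bp(node_pot, edge_pot, edges, n_iters=20):
    """Sum-product BP on an undirected graph with a shared symmetric edge potential."""
    n_states = len(next(iter(node_pot.values())))
    neighbors = {i: [] for i in node_pot}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    # messages[(i, j)][x_j] is the message from node i to node j; start uniform.
    messages = {(i, j): [1.0 / n_states] * n_states
                for i in neighbors for j in neighbors[i]}
    for _ in range(n_iters):
        for (i, j) in list(messages):  # fixed sweep order, updated in place
            new = []
            for xj in range(n_states):
                total = 0.0
                for xi in range(n_states):
                    incoming = 1.0
                    for k in neighbors[i]:
                        if k != j:  # exclude the message coming back from j
                            incoming *= messages[(k, i)][xi]
                    total += node_pot[i][xi] * edge_pot(xi, xj) * incoming
                new.append(total)
            messages[(i, j)] = normalize(new)
    # Belief at each node: its own potential times all incoming messages.
    beliefs = {}
    for i in neighbors:
        b = list(node_pot[i])
        for k in neighbors[i]:
            b = [bx * m for bx, m in zip(b, messages[(k, i)])]
        beliefs[i] = normalize(b)
    return beliefs

# Toy 3-node cycle: node A has evidence for state 0, edges favor agreement.
node_pot = {"A": [0.9, 0.1], "B": [0.5, 0.5], "C": [0.5, 0.5]}
edges = [("A", "B"), ("B", "C"), ("C", "A")]
agree = lambda x, y: 2.0 if x == y else 1.0
beliefs = loopy_bp(node_pot, agree, edges)
```

Because the graph has a cycle, the resulting beliefs are approximations of the marginals rather than exact values.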


Slide #14.

ACMI-BP: Loopy Belief Propagation
[Figure: nodes LYS31 and LEU32 with probabilities p_LYS31 and p_LEU32; message m_LYS31→LEU32]


Slide #15.

ACMI-BP: Loopy Belief Propagation
[Figure: nodes LYS31 and LEU32 with probabilities p_LYS31 and p_LEU32; message m_LEU32→LYS31]


Slide #16.

- Background
- ACMI Overview
- Inference in ACMI-BP
- Guiding Belief Propagation (current section)
- Experiments & Results


Slide #17.

Message Scheduling
- Key design choice: message-passing schedule
- When BP is approximate, ordering affects solution [Elidan et al, 2006]
- ACMI-BP uses a naïve, round-robin schedule
  - Best case: wasted resources
  - Worst case: poor information given more influence
[Figure: nodes ALA, LYS, SER]
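
A round-robin schedule can be sketched as a fixed visiting order over residues. The function name `round_robin_schedule` and the three-node chain below are assumptions for illustration; the slides do not specify how ACMI-BP enumerates its messages.

```python
def round_robin_schedule(residues, neighbors, n_sweeps):
    """Visit residues in fixed sequence order; each one sends to all its neighbors."""
    order = []
    for _ in range(n_sweeps):
        for sender in residues:
            for receiver in neighbors[sender]:
                order.append((sender, receiver))
    return order

# Hypothetical chain ALA - LYS - SER.
residues = ["ALA", "LYS", "SER"]
neighbors = {"ALA": ["LYS"], "LYS": ["ALA", "SER"], "SER": ["LYS"]}
schedule = round_robin_schedule(residues, neighbors, n_sweeps=1)
```

Every residue gets the same number of turns regardless of how informative its messages are, which is the property a guided schedule tries to improve on.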


Slide #18.

Using Domain Knowledge
- Idea: use expert to assign importance of messages
- Biochemist insight: well-structured regions of protein correlate with strong features in density map
  - e.g., helices/strands have stable conformations
- Protein disorder: regions of a structure that are unstable/hard to define


Slide #19.

Guided ACMI-BP


Slide #20.

Related Work
- Assumption: messages with largest change in value are more useful
- Residual Belief Propagation [Elidan et al, UAI 2006]
  - Calculates residual factor for each node
  - Each iteration, highest residual node passes messages
  - General BP technique
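
The residual idea can be sketched as "pick the node whose pending update changes the most". This is a simplified per-node illustration with hypothetical names (`residual`, `pick_next`) and made-up values; it assumes the residual is the largest absolute change between a node's current and proposed message, which may differ from the exact definition in Elidan et al.

```python
def residual(old, new):
    """Largest absolute change between two message vectors."""
    return max(abs(o - n) for o, n in zip(old, new))

def pick_next(current, proposed):
    """Return the node whose proposed message differs most from its current one."""
    return max(current, key=lambda node: residual(current[node], proposed[node]))

# Hypothetical current and proposed messages for three nodes.
current = {"ALA": [0.5, 0.5], "LYS": [0.5, 0.5], "SER": [0.5, 0.5]}
proposed = {"ALA": [0.6, 0.4], "LYS": [0.9, 0.1], "SER": [0.5, 0.5]}
next_node = pick_next(current, proposed)
```

Here LYS would send first because its proposed message moved furthest from its current value, while SER, whose message did not change, would wait.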


Slide #21.

- Background
- ACMI Overview
- Inference in ACMI-BP
- Guiding Belief Propagation
- Experiments & Results (current section)


Slide #22.

Message Schedulers Tested
- Our previous technique: naive, round robin (BP)
- Our proposed technique: Guidance using disorder prediction (DOBP)
  - Disorder prediction using DisEMBL [Linding et al, 2003]
  - Prioritize residues with high stability (i.e., low disorder)
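
One way to picture "prioritize residues with low disorder" is to sort residues by a per-residue disorder score. The scores below are invented for illustration and are not DisEMBL output, and the function name `disorder_guided_order` is hypothetical; the slides do not state exactly how DOBP turns the prediction into a schedule.

```python
def disorder_guided_order(residues, disorder):
    """Order residue indices from most stable (lowest predicted disorder) to least."""
    return sorted(range(len(residues)), key=lambda i: disorder[i])

residues = ["ALA", "GLY", "LYS", "LEU", "SER"]
disorder = [0.8, 0.6, 0.1, 0.2, 0.7]  # assumed scores in [0, 1], higher = more disordered
order = disorder_guided_order(residues, disorder)
names = [residues[i] for i in order]
```

With these made-up scores the well-ordered residues LYS and LEU would pass messages first, and the most disordered residue, ALA, last.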


Slide #23.

Experimental Methodology
- Run whole ACMI pipeline
  - Phase 1: Local amino-acid finder (prior probabilities)
  - Phase 2: Either BP, DOBP, or RBP
  - Phase 3: Sample all-atom structures from Phase 2 results
- Test set of 10 poor-resolution electron-density maps
  - From UW Center for Eukaryotic Structural Genomics


Slide #24.

ACMI-BP Marginal Accuracy


Slide #25.

ACMI-BP Marginal Accuracy


Slide #26.

Protein Structure Results
- Do these better marginals produce more accurate protein structures?
- RBP fails to produce structures in ACMI-PF
  - Marginals are high in entropy (28.48 vs 5.31)
  - Insufficient sampling of correct locations
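
To make the entropy comparison concrete, here is a small sketch of Shannon entropy for a discrete marginal. It assumes entropy in bits over a normalized distribution; the slide does not give the units or the exact definition behind 28.48 and 5.31, so this does not attempt to reproduce those numbers, and the two distributions are invented.

```python
import math

def entropy(marginal):
    """Shannon entropy in bits of a normalized discrete distribution."""
    return -sum(p * math.log2(p) for p in marginal if p > 0)

# Hypothetical marginals over four candidate locations.
peaked = [0.97, 0.01, 0.01, 0.01]  # confident about one location
flat = [0.25, 0.25, 0.25, 0.25]    # no location stands out

peaked_h = entropy(peaked)
flat_h = entropy(flat)
```

A high-entropy marginal spreads probability over many locations, so a sampler drawing from it is less likely to visit the correct one, which matches the "insufficient sampling" point above.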


Slide #27.

Conclusions
- Our contribution: framework for utilizing domain knowledge in BP message scheduling
  - General technique for belief propagation
  - Alternative to information-based techniques
- Our technique improves inference in ACMI
  - Disorder prediction used in our framework
  - Residual-based technique fails


Slide #28.

Acknowledgements
- Phillips Laboratory at UW-Madison
- UW Center for Eukaryotic Structural Genomics (CESG)
- NLM R01-LM008796
- NLM Training Grant T15-LM007359
- NIH Protein Structure Initiative Grant GM074901