
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%                                                                             %
% Copyright (C) 2014 Edward d'Auvergne                                        %
%                                                                             %
% This file is part of the program relax (http://www.nmr-relax.com).         %
%                                                                             %
% This program is free software: you can redistribute it and/or modify        %
% it under the terms of the GNU General Public License as published by        %
% the Free Software Foundation, either version 3 of the License, or           %
% (at your option) any later version.                                         %
%                                                                             %
% This program is distributed in the hope that it will be useful,             %
% but WITHOUT ANY WARRANTY; without even the implied warranty of              %
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the               %
% GNU General Public License for more details.                                %
%                                                                             %
% You should have received a copy of the GNU General Public License           %
% along with this program.  If not, see <http://www.gnu.org/licenses/>.       %
%                                                                             %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


% N-state model chapter.
%%%%%%%%%%%%%%%%%%%%%%%%

\chapter{The N-state model or ensemble analysis} \label{ch: N-state model}
\index{N-state model|textbf}
\index{Ensemble analysis|textbf}


\begin{figure*}[h]
\includegraphics[width=5cm, bb=0 0 1701 1701]{graphics/misc/n_state_model/phthalic_acid_ens_600x600}
\end{figure*}


% Introduction.
%%%%%%%%%%%%%%%

\section{Introduction to the N-state model}

The modelling of motion in molecules using experimental data consists of either continuous or discrete distributions.
These can be visualised respectively as either an infinite number of states or a limited set of N states.
The N-state model analysis in relax models the molecular dynamics using an ensemble of static structures.

This analysis supports a number of data types including:
\begin{itemize}
    \item Residual dipolar couplings (RDCs)\index{Residual dipolar coupling}
    \item Pseudo-contact shifts (PCSs)\index{Pseudo-contact shifts}
    \item NOEs\index{NOE}
\end{itemize}

The main idea is to evaluate the quality of a fixed ensemble of structures.
relax will not perform structural optimisations.
The evaluation includes:
\begin{itemize}
    \item Alignment tensor optimisation for the RDCs and PCSs.
    \item Optional optimisation of the position of the paramagnetic centre for the PCSs.
    \item Calculation of NOE constraint violations.
    \item Q factor calculation for the RDC, PCS, and NOE.
\end{itemize}

Note that this analysis will also handle single structures.
Hence you can use the N-state model in relax with N set to 1 to find, for example, a single alignment tensor for a single structure using RDCs, PCSs, or both together.
This is useful for comparing an ensemble to a single structure to determine if any statistically significant motions are present.

The primary references for the N-state model analysis in relax are:
\begin{itemize}
    \item \bibentry{Sun11}
    \item \bibentry{Erdelyi11}
\end{itemize}



% Data types.
%%%%%%%%%%%%%

\section{Experimental data support for the N-state model}

% RDCs.
\subsection{RDCs in the N-state model}

Residual dipolar couplings (RDCs)\index{Residual dipolar coupling|textbf} can be used to evaluate ensembles.
The ensemble interconversion is assumed to be fast relative to the timescale of the alignment process, hence a single tensor will be used for all members of the ensemble.
As such, precise superimposition of the structures using a logical frame of reference is very important.
This can be performed in relax using the \uf{structure\ufsep{}superimpose} user function.
The RDCs can be from either external or internal alignment.


% PCSs.
\subsection{PCSs in the N-state model}

Pseudo-contact shifts (PCSs)\index{Pseudo-contact shifts|textbf} can also be used to evaluate ensembles.
The same averaging process as described above for the RDC is assumed.
Hence correct structural superimposition is essential and one alignment tensor will be optimised for the entire ensemble.

One powerful feature of relax is that the paramagnetic centre can either be fixed or be allowed to move during optimisation.
This allows an unknown paramagnetic centre position to be found, or a known position to be refined to a higher accuracy than is possible with most other techniques.


% NOEs.
\subsection{NOEs in the N-state model}

Another data type which can be used to evaluate dynamics ensembles is the NOE\index{NOE|textbf}.
This is not used in optimisation but rather is used to calculate NOE constraint violations.
These violations are then compared to evaluate the ensemble.
In the stereochemistry auto-analysis, these violations will also be converted to Q factors to allow direct comparison with RDC Q factors.



% Stereochemistry.
%%%%%%%%%%%%%%%%%%

\section{Determining stereochemistry in dynamic molecules}

A published application of the N-state model in relax is:
\begin{itemize}
    \item \bibentry{Sun11}
\end{itemize}

This analysis of the stereochemistry of a small molecule consists of two steps.
The first step is to determine the relative configuration using NMR data consisting of RDCs and NOEs.
Ensembles of 10 members are created from molecular dynamics (MD) simulations\index{molecular dynamics simulation} or simulated annealing (SA)\index{simulated annealing}.
These are then ranked by the RDC Q factor and NOE violation.
The NOE violation is converted into a Q factor as:
\begin{equation}
    Q_{\textrm{NOE}}^2 = \frac{U}{\sum_i \textrm{NOE}_i^2},
\end{equation}

where $U$ is the quadratic flat bottom well potential, i.e.\ the NOE violation in \AA$^2$, and the denominator is the sum of all squared NOEs.
A combined Q factor is calculated as:
\begin{equation}
    Q_{\textrm{total}}^2 = Q_{\textrm{NOE}}^2 + Q_{\textrm{RDC}}^2.
\end{equation}

The second step is to distinguish enantiomers.
As NMR data is symmetric, it cannot distinguish enantiomers.
Therefore an optical technique such as \href{http://en.wikipedia.org/wiki/Optical\_rotatory\_dispersion}{optical rotatory dispersion} (ORD) can be used.
For molecules experiencing large amounts of motion, sampling all possible conformations, calculating the expected dispersion properties, and calculating an averaged dispersion curve is not feasible.
The idea is therefore to combine NMR and ORD by taking the best NMR ensembles from step one to use for ORD spectral prediction.


% Auto-analysis.
%~~~~~~~~~~~~~~~

\subsection{Stereochemistry -- the auto-analysis}


Step one of the stereochemistry analysis is implemented as an auto-analysis.
This is located in the module \module{auto\_analyses\pysep{}stereochem\_analysis} (see \url{http://www.nmr-relax.com/api/3.1/auto_analyses.stereochem_analysis-module.html}).
The auto-analysis is accessed via the \module{Stereochem\_\linebreak[0]analysis} class, the details of which can be seen at \url{http://www.nmr-relax.com/api/3.1/auto_analyses.stereochem_analysis.Stereochem_analysis-class.html}.


% The sample script.
%~~~~~~~~~~~~~~~~~~~

\subsection{Stereochemistry -- the sample script}

The following script was used for the analysis in \citet{Sun11}.
It is used to complete the first step of the analysis, the determination of relative configuration, and for the generation of ensembles for the second step of the analysis.
The file is located at \file{sample\osus{}scripts\ossep{}n\osus{}state\osus{}model\ossep{}stereochem\osus{}analysis.py}.
The contents of the script are:


\begin{lstlisting}
"""Script for the determination of relative stereochemistry.

The analysis is performed by using multiple ensembles of structures, randomly sampled from a given set of structures.  The discrimination is performed by comparing the sets of ensembles using NOE violations and RDC Q factors.

This script is split into multiple stages:

    1.  The random sampling of the snapshots to generate the N ensembles (NUM_ENS, usually 10,000 to 100,000) of M members (NUM_MODELS, usually ~10).  The original snapshot files are expected to be named the SNAPSHOT_DIR + CONFIG + a number from SNAPSHOT_MIN to SNAPSHOT_MAX + ".pdb", e.g. "snapshots/R647.pdb".  The ensembles will be placed into the "ensembles" directory.

    2.  The NOE violation analysis.

    3.  The superimposition of ensembles.  For each ensemble, Molmol is used for superimposition using the fit to first algorithm.  The superimposed ensembles will be placed into the "ensembles_superimposed" directory.  This stage is not necessary for the NOE analysis.

    4.  The RDC Q factor analysis.

    5.  Generation of Grace graphs.

    6.  Final ordering of ensembles using the combined RDC and NOE Q factors, whereby the NOE Q factor is defined as::

        Q^2 = U / sum(NOE_i^2),

    where U is the quadratic flat bottom well potential - the NOE violation in Angstrom^2.  The denominator is the sum of all squared NOEs - this must be given as the value of NOE_NORM.  The combined Q is given by::

        Q_total^2 = Q_NOE^2 + Q_RDC^2.
"""

# relax module imports.
from auto_analyses.stereochem_analysis import Stereochem_analysis


# Stage of analysis (see the docstring above for the options).
STAGE = 1

# Number of ensembles.
NUM_ENS = 100000

# Ensemble size.
NUM_MODELS = 10

# Configurations.
CONFIGS = ["R", "S"]

# Snapshot directories (corresponding to CONFIGS).
SNAPSHOT_DIR = ["snapshots", "snapshots"]

# Min and max number of the snapshots (corresponding to CONFIGS).
SNAPSHOT_MIN = [0, 0]
SNAPSHOT_MAX = [76, 71]

# Pseudo-atoms.
PSEUDO = [
    ["Q7", ["@H16", "@H17", "@H18"]],
    ["Q9", ["@H20", "@H21", "@H22"]],
    ["Q10", ["@H23", "@H24", "@H25"]]
]

# NOE info.
NOE_FILE = "noes"
NOE_NORM = 50 * 4**2    # The NOE normalisation factor (sum of all NOEs squared).

# RDC file info.
RDC_NAME = "PAN"
RDC_FILE = "pan_rdcs"
RDC_SPIN_ID1_COL = 1
RDC_SPIN_ID2_COL = 2
RDC_DATA_COL = 2
RDC_ERROR_COL = None

# Bond length.
BOND_LENGTH = 1.117 * 1e-10

# Log file output (only for certain stages).
LOG = True

# Number of buckets for the distribution plots.
BUCKET_NUM = 200

# Distribution plot limits.
LOWER_LIM_NOE = 0.0
UPPER_LIM_NOE = 600.0
LOWER_LIM_RDC = 0.0
UPPER_LIM_RDC = 1.0


# Set up and code execution.
analysis = Stereochem_analysis(
    stage=STAGE,
    num_ens=NUM_ENS,
    num_models=NUM_MODELS,
    configs=CONFIGS,
    snapshot_dir=SNAPSHOT_DIR,
    snapshot_min=SNAPSHOT_MIN,
    snapshot_max=SNAPSHOT_MAX,
    pseudo=PSEUDO,
    noe_file=NOE_FILE,
    noe_norm=NOE_NORM,
    rdc_name=RDC_NAME,
    rdc_file=RDC_FILE,
    rdc_spin_id1_col=RDC_SPIN_ID1_COL,
    rdc_spin_id2_col=RDC_SPIN_ID2_COL,
    rdc_data_col=RDC_DATA_COL,
    rdc_error_col=RDC_ERROR_COL,
    bond_length=BOND_LENGTH,
    log=LOG,
    bucket_num=BUCKET_NUM,
    lower_lim_noe=LOWER_LIM_NOE,
    upper_lim_noe=UPPER_LIM_NOE,
    lower_lim_rdc=LOWER_LIM_RDC,
    upper_lim_rdc=UPPER_LIM_RDC
)
analysis.run()
\end{lstlisting}

In contrast to all of the other auto-analyses, here you do not set up your own data pipe containing all of the relevant data that is then passed into the auto-analysis.
This may change in the future to allow for more flexibility in how you load structures, load the RDC and NOE base data, set up pseudo-atoms and bond lengths for the RDC, etc.

Note that you need to execute this script 6 times, once per stage, setting the \pycode{STAGE} variable to each of the values 1 to 6 in turn.
These stages are fully documented in the docstring at the start of the script.

Because the original analysis was performed before the \uf{structure\ufsep{}superimpose} user function was added to relax, the auto-analysis performs the superimposition of each ensemble using the external software \software{Molmol}.
If you wish to perform this analysis without using \software{Molmol}, please contact the relax users mailing list ``\relaxUsersML''\index{mailing list!relax-users} (see Section~\ref{sect: relax-users mailing list} on page~\pageref{sect: relax-users mailing list}).
It would be rather straightforward for the relax developers to replace the complicated \software{Molmol} superimposition code with a single call to the \uf{structure\ufsep{}superimpose} user function.
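
To illustrate how the final ordering of stage 6 follows from the two Q factor equations, the following stand-alone Python sketch converts NOE violations into Q factors and ranks a few ensembles by the combined Q factor.
It is not part of relax and does not reproduce the auto-analysis code.
The function names \pycode{noe\_q\_factor} and \pycode{combined\_q\_factor}, the ensemble labels, and all of the violation and RDC Q factor values are hypothetical and have been chosen only for illustration.

\begin{lstlisting}
"""Illustrative sketch only - not part of the relax code base."""

from math import sqrt


def noe_q_factor(violation, noe_norm):
    """Convert an NOE violation U (in Angstrom^2) into an NOE Q factor.

    The noe_norm argument is the sum of all squared NOEs (NOE_NORM).
    """
    if noe_norm <= 0.0:
        raise ValueError("The NOE normalisation factor must be positive.")
    return sqrt(violation / noe_norm)


def combined_q_factor(q_noe, q_rdc):
    """Combine the Q factors as Q_total^2 = Q_NOE^2 + Q_RDC^2."""
    return sqrt(q_noe**2 + q_rdc**2)


# Hypothetical (NOE violation, RDC Q factor) pairs for three ensembles.
ensembles = {
    "R_ens1": (8.0, 0.20),
    "R_ens2": (32.0, 0.15),
    "S_ens1": (72.0, 0.40),
}

# The same expression as used for NOE_NORM in the sample script.
noe_norm = 50 * 4**2

# Rank the ensembles by the combined Q factor, lowest (best) first.
ranked = sorted(
    (combined_q_factor(noe_q_factor(u, noe_norm), q_rdc), name)
    for name, (u, q_rdc) in ensembles.items()
)
for q_total, name in ranked:
    print("{:8s} Q_total = {:.3f}".format(name, q_total))
\end{lstlisting}

As both Q factors are dimensionless and are added in quadrature, a poor fit to either the NOE or the RDC data is sufficient to push an ensemble down the ranking.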