
% Source file: relax-5.0.0/docs/latex/n_state.tex (18 Apr 2019, 12640 bytes), from the relax-5.0.0 source package.


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%                                                                             %
% Copyright (C) 2014 Edward d'Auvergne                                        %
%                                                                             %
% This file is part of the program relax (http://www.nmr-relax.com).          %
%                                                                             %
% This program is free software: you can redistribute it and/or modify        %
% it under the terms of the GNU General Public License as published by        %
% the Free Software Foundation, either version 3 of the License, or           %
% (at your option) any later version.                                         %
%                                                                             %
% This program is distributed in the hope that it will be useful,             %
% but WITHOUT ANY WARRANTY; without even the implied warranty of              %
% MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the               %
% GNU General Public License for more details.                                %
%                                                                             %
% You should have received a copy of the GNU General Public License           %
% along with this program.  If not, see <http://www.gnu.org/licenses/>.       %
%                                                                             %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


% N-state model chapter.
%%%%%%%%%%%%%%%%%%%%%%%%

\chapter{The N-state model or ensemble analysis} \label{ch: N-state model}
\index{N-state model|textbf}
\index{Ensemble analysis|textbf}


\begin{figure*}[h]
  \includegraphics[width=5cm, bb=0 0 1701 1701]{graphics/misc/n_state_model/phthalic_acid_ens_600x600}
\end{figure*}


% Introduction.
%%%%%%%%%%%%%%%

\section{Introduction to the N-state model}

Molecular motion can be modelled from experimental data using either a continuous or a discrete distribution of states.
These can be visualised respectively as either an infinite number of states or a limited set of N states.
The N-state model analysis in relax models the molecular dynamics using an ensemble of static structures.

This analysis supports a number of data types including:
\begin{itemize}
\item Residual dipolar couplings (RDCs)\index{Residual dipolar coupling}
\item Pseudo-contact shifts (PCSs)\index{Pseudo-contact shifts}
\item NOEs\index{NOE}
\end{itemize}

The main idea is to evaluate the quality of a fixed ensemble of structures.
relax will not perform structural optimisations.
The evaluation includes:
\begin{itemize}
\item Alignment tensor optimisation for the RDCs and PCSs.
\item Optional optimisation of the position of the paramagnetic centre for the PCSs.
\item Calculation of NOE constraint violations.
\item Q factor calculation for the RDC, PCS, and NOE.
\end{itemize}

Note that this analysis will also handle single structures.
Hence you can use the N-state model in relax with N set to 1 to find, for example, a single alignment tensor for a single structure using RDCs, PCSs, or both together.
This is useful for comparing an ensemble to a single structure to determine if any statistically significant motions are present.

The primary references for the N-state model analysis in relax are:
\begin{itemize}
  \item \bibentry{Sun11}
  \item \bibentry{Erdelyi11}
\end{itemize}



% Data types.
%%%%%%%%%%%%%

\section{Experimental data support for the N-state model}

% RDCs.
\subsection{RDCs in the N-state model}

Residual dipolar couplings (RDCs)\index{Residual dipolar coupling|textbf} can be used to evaluate ensembles.
The ensemble interconversion is assumed to be fast relative to the timescale of the alignment process, hence a single tensor will be used for all members of the ensemble.
As such, precise superimposition of structures using a logical frame of reference is very important.
This can be performed in relax using the \uf{structure\ufsep{}superimpose} user function.
The RDCs can be from either external or internal alignment.


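As a hedged illustration of what a single tensor for all members of the ensemble implies, the back-calculated RDC can be written as an average over the $N$ ensemble members.
The notation below and the assumption of equally populated members are introduced here for illustration only and are not taken from this chapter:
\begin{equation}
    D_{ij} = \frac{d_j}{N} \sum_{c=1}^{N} \mu_{jc}^T \cdot A_i \cdot \mu_{jc},
\end{equation}
where $i$ is the alignment index, $j$ is the index of the spin pair, $c$ is the index of the ensemble member, $d_j$ is the dipolar constant of the spin pair, $\mu_{jc}$ is the unit vector along the interatomic vector of spin pair $j$ in member $c$, and $A_i$ is the alignment tensor.
Because $A_i$ is shared by all members, the vectors $\mu_{jc}$ must all be expressed in one common frame, which is why the superimposition of the structures matters.
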
% PCSs.
\subsection{PCSs in the N-state model}

Pseudo-contact shifts (PCSs)\index{Pseudo-contact shifts|textbf} can also be used to evaluate ensembles.
The same averaging process as described above for the RDC is assumed.
Hence correct structural superimposition is essential and one alignment tensor will be optimised for the entire ensemble.

One powerful feature of relax is that the paramagnetic centre can either be fixed or be allowed to move during optimisation.
This allows an unknown paramagnetic centre position to be found, or a known position to be refined to higher accuracy than that possible with most other techniques.


% NOEs.
\subsection{NOEs in the N-state model}

Another data type which can be used to evaluate dynamic ensembles is the NOE\index{NOE|textbf}.
The NOE is not used in optimisation but rather to calculate NOE constraint violations.
These violations are then compared to evaluate the ensemble.
In the stereochemistry auto-analysis, these violations will also be converted to Q factors to allow direct comparison with RDC Q factors.



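A minimal sketch of how an NOE constraint violation can be scored is given below, using the quadratic flat bottom well potential that this chapter later uses for the NOE Q factor.
The function name, the argument layout, and the use of simple lower and upper distance bounds in \AA{} are assumptions made for illustration; this is not the relax implementation.

\begin{lstlisting}
def flat_bottom_violation(distances, lower_bounds, upper_bounds):
    """Sum the squared bound violations (in Angstrom^2) over all NOE restraints."""
    total = 0.0
    for dist, lower, upper in zip(distances, lower_bounds, upper_bounds):
        if dist < lower:
            # Too short: add the squared distance below the lower bound.
            total += (lower - dist)**2
        elif dist > upper:
            # Too long: add the squared distance above the upper bound.
            total += (dist - upper)**2
        # Within the bounds the well is flat, so nothing is added.
    return total


# Hypothetical distances and bounds in Angstrom.
violation = flat_bottom_violation([2.0, 5.0, 3.0], [2.5, 2.0, 2.0], [4.0, 4.0, 4.0])
print(violation)    # Prints 1.25 (0.25 from the first restraint, 1.0 from the second).
\end{lstlisting}

A restraint only contributes when the distance falls outside its bounds, so an ensemble satisfying all restraints has a violation of zero.
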
% Stereochemistry.
%%%%%%%%%%%%%%%%%%

\section{Determining stereochemistry in dynamic molecules}

A published application of the N-state model in relax is:
\begin{itemize}
  \item \bibentry{Sun11}
\end{itemize}

This analysis of the stereochemistry of a small molecule consists of two steps.
The first step is to determine the relative configuration using NMR data consisting of RDCs and NOEs.
Ensembles of 10 members are created from molecular dynamics (MD) simulations\index{molecular dynamics simulation} or simulated annealing (SA)\index{simulated annealing}.
These are then ranked by the RDC Q factor and NOE violation.
To allow the two measures to be combined, the NOE violation is converted into a Q factor:
\begin{equation}
    Q_{\textrm{NOE}}^2 = \frac{U}{\sum_i \textrm{NOE}_i^2},
\end{equation}
where $U$ is the quadratic flat bottom well potential, i.e.\ the NOE violation in \AA$^2$, and the denominator is the sum of all squared NOEs.
A combined Q factor is then calculated as:
\begin{equation}
    Q_{\textrm{total}}^2 = Q_{\textrm{NOE}}^2 + Q_{\textrm{RDC}}^2.
\end{equation}

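The two equations above can be illustrated with a short Python sketch.
The function names and the input values (an NOE violation of 8~\AA$^2$ and an RDC Q factor of 0.2) are hypothetical, whereas the normalisation of $50 \times 4^2$ is the \pycode{NOE\_NORM} value used in the sample script of this chapter.

\begin{lstlisting}
from math import sqrt


def noe_q_factor(violation, noe_norm):
    """Convert the NOE violation U (Angstrom^2) into a Q factor."""
    return sqrt(violation / noe_norm)


def total_q_factor(q_noe, q_rdc):
    """Combine the Q factors via Q_total^2 = Q_NOE^2 + Q_RDC^2."""
    return sqrt(q_noe**2 + q_rdc**2)


noe_norm = 50 * 4**2                     # Sum of all squared NOEs (800).
q_noe = noe_q_factor(8.0, noe_norm)      # Hypothetical NOE violation.
q_total = total_q_factor(q_noe, 0.2)     # Hypothetical RDC Q factor.
print("Q_NOE = %.3f, Q_total = %.3f" % (q_noe, q_total))
# Prints: Q_NOE = 0.100, Q_total = 0.224
\end{lstlisting}

As both Q factors are dimensionless and add in quadrature, an ensemble must fit both the NOE and the RDC data well to obtain a low combined value.
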
The second step is to distinguish enantiomers.
As NMR data is symmetric, it cannot distinguish between enantiomers.
Therefore an optical technique such as \href{http://en.wikipedia.org/wiki/Optical\_rotatory\_dispersion}{optical rotatory dispersion} (ORD) can be used.
For molecules experiencing large amounts of motion, sampling all possible conformations, calculating the expected dispersion properties, and calculating an averaged dispersion curve is not feasible.
The idea is therefore to combine NMR and ORD by taking the best NMR ensembles from step one to use for ORD spectral prediction.


% Auto-analysis.
%~~~~~~~~~~~~~~~

\subsection{Stereochemistry -- the auto-analysis}

Step one of the stereochemistry analysis is implemented as an auto-analysis.
This is located in the module \module{auto\_analyses\pysep{}stereochem\_analysis} (see \url{http://www.nmr-relax.com/api/3.1/auto_analyses.stereochem_analysis-module.html}).
The auto-analysis is accessed via the \module{Stereochem\_\linebreak[0]analysis} class, the details of which can be seen at \url{http://www.nmr-relax.com/api/3.1/auto_analyses.stereochem_analysis.Stereochem_analysis-class.html}.


% The sample script.
%~~~~~~~~~~~~~~~~~~~

\subsection{Stereochemistry -- the sample script}

The following script was used for the analysis in \citet{Sun11}.
It is used to complete the first step of the analysis, the determination of relative configuration, and to generate the ensembles for the second step of the analysis.
The file is located at \file{sample\osus{}scripts\ossep{}n\osus{}state\osus{}model\ossep{}stereochem\osus{}analysis.py}.
The contents of the script are:


\begin{lstlisting}
"""Script for the determination of relative stereochemistry.

The analysis is performed by using multiple ensembles of structures, randomly sampled from a given set of structures.  The discrimination is performed by comparing the sets of ensembles using NOE violations and RDC Q factors.

This script is split into multiple stages:

    1.  The random sampling of the snapshots to generate the N ensembles (NUM_ENS, usually 10,000 to 100,000) of M members (NUM_MODELS, usually ~10).  The original snapshot files are expected to be named the SNAPSHOT_DIR + CONFIG + a number from SNAPSHOT_MIN to SNAPSHOT_MAX + ".pdb", e.g. "snapshots/R647.pdb".  The ensembles will be placed into the "ensembles" directory.

    2.  The NOE violation analysis.

    3.  The superimposition of ensembles.  For each ensemble, Molmol is used for superimposition using the fit to first algorithm.  The superimposed ensembles will be placed into the "ensembles_superimposed" directory.  This stage is not necessary for the NOE analysis.

    4.  The RDC Q factor analysis.

    5.  Generation of Grace graphs.

    6.  Final ordering of ensembles using the combined RDC and NOE Q factors, whereby the NOE Q factor is defined as::

        Q^2 = U / sum(NOE_i^2),

    where U is the quadratic flat bottom well potential - the NOE violation in Angstrom^2. The denominator is the sum of all squared NOEs - this must be given as the value of NOE_NORM.  The combined Q is given by::

        Q_total^2 = Q_NOE^2 + Q_RDC^2.
"""

# relax module imports.
from auto_analyses.stereochem_analysis import Stereochem_analysis


# Stage of analysis (see the docstring above for the options).
STAGE = 1

# Number of ensembles.
NUM_ENS = 100000

# Ensemble size.
NUM_MODELS = 10

# Configurations.
CONFIGS = ["R", "S"]

# Snapshot directories (corresponding to CONFIGS).
SNAPSHOT_DIR = ["snapshots", "snapshots"]

# Min and max number of the snapshots (corresponding to CONFIGS).
SNAPSHOT_MIN = [0, 0]
SNAPSHOT_MAX = [76, 71]

# Pseudo-atoms.
PSEUDO = [
    ["Q7",  ["@H16", "@H17", "@H18"]],
    ["Q9",  ["@H20", "@H21", "@H22"]],
    ["Q10", ["@H23", "@H24", "@H25"]]
]

# NOE info.
NOE_FILE = "noes"
NOE_NORM = 50 * 4**2    # The NOE normalisation factor (sum of all NOEs squared).

# RDC file info.
RDC_NAME = "PAN"
RDC_FILE = "pan_rdcs"
RDC_SPIN_ID1_COL = 1
RDC_SPIN_ID2_COL = 2
RDC_DATA_COL = 2    # Same value as RDC_SPIN_ID2_COL; set this to the column holding the RDC values in your file.
RDC_ERROR_COL = None

# Bond length.
BOND_LENGTH = 1.117 * 1e-10

# Log file output (only for certain stages).
LOG = True

# Number of buckets for the distribution plots.
BUCKET_NUM = 200

# Distribution plot limits.
LOWER_LIM_NOE = 0.0
UPPER_LIM_NOE = 600.0
LOWER_LIM_RDC = 0.0
UPPER_LIM_RDC = 1.0


# Set up and code execution.
analysis = Stereochem_analysis(
    stage=STAGE,
    num_ens=NUM_ENS,
    num_models=NUM_MODELS,
    configs=CONFIGS,
    snapshot_dir=SNAPSHOT_DIR,
    snapshot_min=SNAPSHOT_MIN,
    snapshot_max=SNAPSHOT_MAX,
    pseudo=PSEUDO,
    noe_file=NOE_FILE,
    noe_norm=NOE_NORM,
    rdc_name=RDC_NAME,
    rdc_file=RDC_FILE,
    rdc_spin_id1_col=RDC_SPIN_ID1_COL,
    rdc_spin_id2_col=RDC_SPIN_ID2_COL,
    rdc_data_col=RDC_DATA_COL,
    rdc_error_col=RDC_ERROR_COL,
    bond_length=BOND_LENGTH,
    log=LOG,
    bucket_num=BUCKET_NUM,
    lower_lim_noe=LOWER_LIM_NOE,
    upper_lim_noe=UPPER_LIM_NOE,
    lower_lim_rdc=LOWER_LIM_RDC,
    upper_lim_rdc=UPPER_LIM_RDC
)
analysis.run()
\end{lstlisting}

In contrast to all of the other auto-analyses, here you do not set up your own data pipe containing all of the relevant data that is then passed into the auto-analysis.
This may change in the future to allow for more flexibility in how you load structures, load the RDC and NOE base data, set up pseudo-atoms and bond lengths for the RDC, etc.

Note that you need to execute this script six times, once for each stage, setting the \pycode{STAGE} variable from 1 to 6 in turn.
These stages are fully documented at the start of the script.

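Stage 1, as described in the script docstring, draws random ensembles from the snapshot files.
The following Python sketch illustrates the idea under stated assumptions: the function names are hypothetical, the members of an ensemble are assumed to be drawn without replacement, and the file naming follows the docstring example \file{snapshots\ossep{}R647.pdb}.
It is not the code used by the auto-analysis.

\begin{lstlisting}
import random


def sample_ensembles(snapshot_min, snapshot_max, num_ens, num_models, seed=None):
    """Draw num_ens ensembles, each of num_models distinct snapshot numbers."""
    rng = random.Random(seed)
    snapshots = range(snapshot_min, snapshot_max + 1)
    return [sorted(rng.sample(snapshots, num_models)) for _ in range(num_ens)]


def snapshot_path(snapshot_dir, config, number):
    """Build the snapshot file name, e.g. "snapshots/R647.pdb"."""
    return "%s/%s%i.pdb" % (snapshot_dir, config, number)


# Three small ensembles for the "R" configuration (snapshots 0 to 76).
ensembles = sample_ensembles(0, 76, num_ens=3, num_models=10, seed=0)
paths = [snapshot_path("snapshots", "R", number) for number in ensembles[0]]
\end{lstlisting}

Fixing the seed makes the sampling reproducible, which can help when the later stages are run in separate executions of the script.
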
As the original analysis was performed before the \uf{structure\ufsep{}superimpose} user function was added to relax, the auto-analysis performs the superimposition of each ensemble using the external software \software{Molmol}.
If you wish to perform this analysis without using \software{Molmol}, please contact the relax users mailing list ``\relaxUsersML''\index{mailing list!relax-users} (see Section~\ref{sect: relax-users mailing list} on page~\pageref{sect: relax-users mailing list}).
It would be rather straightforward for the relax developers to replace the complicated \software{Molmol} superimposition code with a single call to the \uf{structure\ufsep{}superimpose} user function.