METHYL ASSIGNMENTS USING SATISFIABILITY (MAUS)

MAUS has been tested on systems containing 35 to 94 methyl groups, using manually picked 3D or 4D NOE data. MAUS builds a system of hard constraints and uses it to enumerate all possible assignments. Inconsistent inputs can therefore lead to a "MAUS did not run successfully" output, meaning that MAUS cannot identify any assignment that satisfies all the inputs. To avoid this, special emphasis must be placed on the instructions below, particularly for larger systems. For all systems containing more than 90 methyl groups, we highly recommend stereospecific Leu/Val labelling and recording the reference 2D methyl HMQC spectra at the highest possible resolution to reduce overlaps.

Input

Required files

As stated in our methods section, MAUS takes as input the following files:

  1. PDB structure
  2. 2D HMQC list
  3. NOESY list/s

1. PDB structure

The user can input a single crystal structure, a model, or an ensemble of structures packed into a single PDB file. A single structure is further relaxed to account for different side-chain rotamers. This step generates eight conformations in the Rosetta force field.

In the 'Structural ensemble' option, the user can input an ensemble of structures generated using external software, NMR or Rosetta. This ensemble may contain any number of structures greater than 1 and less than 15. MAUS expects the user to submit the ensemble as a single PDB file. Each individual model in the PDB file should start with a 'MODEL number' record and be terminated by 'ENDMDL'. An example of a packed PDB file for an ensemble (1jnj_ensemble.pdb) is provided in the example folder.
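Before uploading an ensemble, it may help to check locally that the packed file has the expected layout. The following is a minimal sketch, not part of MAUS: the function names `count_models` and `is_valid_ensemble_size` are hypothetical, and the coordinates in the example string are made up for illustration. It only counts MODEL/ENDMDL pairs and compares the count with the limits stated above.

```python
def count_models(pdb_text: str) -> int:
    """Count MODEL records that are closed by a matching ENDMDL record."""
    models = 0
    open_model = False
    for line in pdb_text.splitlines():
        record = line[:6].strip()  # PDB record names occupy columns 1-6
        if record == "MODEL":
            if open_model:
                raise ValueError("MODEL record found before the previous ENDMDL")
            open_model = True
        elif record == "ENDMDL":
            if not open_model:
                raise ValueError("ENDMDL record without a preceding MODEL")
            open_model = False
            models += 1
    if open_model:
        raise ValueError("last MODEL record is not closed by ENDMDL")
    return models


def is_valid_ensemble_size(n_models: int) -> bool:
    """The help text asks for more than 1 and fewer than 15 structures."""
    return 1 < n_models < 15


# Made-up two-model example (illustrative coordinates only).
example = """MODEL        1
ATOM      1  N   MET A   1      11.104  13.207   9.885  1.00 20.00           N
ENDMDL
MODEL        2
ATOM      1  N   MET A   1      11.204  13.107   9.985  1.00 20.00           N
ENDMDL
END
"""

n_models = count_models(example)
print(n_models, is_valid_ensemble_size(n_models))
```

To check a real file, read it with `open(path).read()` and pass the text to `count_models`.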

Note: Before running MAUS, users must ensure that the PDB structure that they use as input contains ALL methyl groups present in the protein construct they used to record the NMR data. Methyl groups missing in the structure (e.g., in the case of termini and loop regions) can be modeled using existing on-line servers.

2. 2D HMQC list

The 2D HMQC reference list has to be in SPARKY .list format (see here for examples) with slight modifications to the first column, also referred to as the Assignment column, which holds the label for the corresponding 2D peak (see images of examples below). In this file, the w1 column corresponds to carbon and the w2 column to hydrogen. The Data Height column shows the intensity of the peaks. The first image, which contains the X-tag, is the more generic case; the other two cases force the assignments and are hence used as constraints by MAUS.

Here, the Assignment column consists of four fields:

Residue type/s

This field indicates the residue type of a given 2D peak. A user can provide multiple residue types if they are unable to identify a unique residue type. For example, L, V, A, I, LV, ALV, MILV, etc. are all valid residue types.

X-tag

This tag is used to indicate to MAUS that the provided assignment label should not be used to force assignments. If the user has uniquely identified an assignment of the input 2D peaks and wants to provide that information to MAUS, then do not specify this tag after Residue type/s.

Residue ID or ID

This is a numeric ID given to the peak. If any two peaks are geminals, they can have the same residue ID or ID. This number can be any residue number in a sequence or structure, or an arbitrary unique number. It is used specifically to identify a given 2D peak. If you want to force some of the assignments, then this field must contain the residue ID from the input structure.

Stereospecificity of Valine or Leucine

If you have indicated in the input form that the side-chain labeling scheme is both, and you have access to stereospecific assignments of 2D peaks, you may indicate the stereospecificity of geminal methyls using the R or S tag to indicate the Pro-R (CG1 for Valine and CD1 for Leucine) or Pro-S (CG2 for Valine and CD2 for Leucine) methyl, respectively. If you do not know the stereospecificity, you may just add ?. This field can be ignored if the peak belongs to Alanine, Isoleucine or Methionine.

NOTE: At present, stereospecificity is supported for all or none of the input 2D peaks. This means that you may not mix non-stereospecific and stereospecific labels for 2D peaks in the input 2D HMQC list.

Examples

Below are a few examples of valid assignment labels in the HMQC file:

Here is the explanation of the example assignment labels provided above:

A word of caution: if you are forcing your resonance assignments, then please make sure you are 100% confident about them. Because forced assignments are used as hard constraints, a simple mistake here can leave MAUS unable to find any valid assignment.

3. NOESY list/s

MAUS supports 3D and 4D NOESY lists in SPARKY .list format (see here for examples). A user can input either 3D lists or a 4D list (but not both) by clicking the 3D or 4D radio button, respectively, which expands the input form.

3D NOESY lists (CCH format)

If you are using 3D NOESY lists, then MAUS expects two NOESY lists, one recorded with a short (typically 50 ms) and one with a long (typically 300 ms) mixing time. Here, the w1 column of a 3D list corresponds to carbon (attached to the NOE 1H), w2 corresponds to carbon (attached to the direct 1H) and w3 corresponds to the direct 1H. Below is a screenshot of a sample 50 ms NOE file.
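To make the CCH column order concrete, here is a minimal parsing sketch. It assumes a whitespace-separated file with one title row and five columns per peak (label, w1, w2, w3, data height); the names `Noe3D` and `parse_3d_list` are hypothetical, the sample values are made up, and MAUS's own reader may behave differently.

```python
from typing import List, NamedTuple


class Noe3D(NamedTuple):
    label: str       # Assignment column, e.g. ?-?-?
    c_noe: float     # w1: carbon attached to the NOE 1H
    c_direct: float  # w2: carbon attached to the direct 1H
    h_direct: float  # w3: direct 1H
    height: float    # data height (peak intensity)


def parse_3d_list(text: str) -> List[Noe3D]:
    """Parse peak rows, skipping blank lines and the non-numeric title row."""
    peaks = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # blank or incomplete line
        try:
            w1, w2, w3, height = (float(x) for x in fields[1:5])
        except ValueError:
            continue  # title row: column names are not numbers
        peaks.append(Noe3D(fields[0], w1, w2, w3, height))
    return peaks


# Made-up sample rows (illustrative values only).
sample = """Assignment    w1       w2       w3     Data Height

?-?-?       21.503   23.871   0.912    152340
?-?-?       23.871   21.503   0.745     98760
"""

peaks = parse_3d_list(sample)
for peak in peaks:
    print(peak)
```

A named tuple keeps the meaning of each dimension explicit, which helps avoid swapping the two carbon columns when preparing the short and long mixing time lists.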

4D NOESY lists (HCCH format)

If you are using 4D NOESY lists, then MAUS expects only one NOESY list, recorded with a long (typically 300 ms) mixing time. Here, the w1 column of a 4D list corresponds to the NOE 1H dimension, w2 corresponds to the carbon attached to the NOE 1H (or w1), w3 corresponds to carbon (attached to the direct 1H) and w4 corresponds to the direct 1H.

Please note that the NOESY list file format must be exactly as provided in the link above. Unlike the 2D file's Assignment column, the user need not specify any information here. You can just add ?-?-? and let MAUS handle the rest. In addition, the first row of these NOESY files contains the column titles, which you can name however you want; for instance, the 3D NOESY example file has Assignment, w1, w2 and Data Height, and the 4D NOESY example file has Assignment, H1, C1, C2, H2 and Intensity.

CAUTION: In order to ensure that NOEs are clustered to their correct 2D methyl resonances, the 3D or 4D spectra must be carefully aligned, i.e., the 3D or 4D NOE data have to be phase corrected. In addition, the NOE peaks must be picked at sufficiently high signal-to-noise ratio levels (>5).

Example input:

Below is an example folder (example.zip) containing input files whose formats are compatible with MAUS. In this example folder, we have an X-ray structure in .pdb file format (HB2M.pdb), a 2D HMQC chemical shift file (HB2M_2D.list) and two 3D NOESY files: one recorded using a short (50 ms) mixing time (HB2M_50ms_3D.list) and the other recorded using a longer (300 ms) mixing time (HB2M_300ms_3D.list). The example data was recorded with an AILV-labeled sample. If you want to test MAUS with our example, upload the respective files provided in example.zip and select AILV for the Labeling scheme. You can leave the rest of the parameters in the input form at their default values. A sample final output file is also provided. The example folder also contains an additional PDB file (1jnj_ensemble.pdb) that demonstrates a packed ensemble of structures.

example.zip

Parameters

Here are a few input parameters submitted to MAUS using fields provided in the input form.

Output

MAUS sends an e-mail from "Apache" with the subject line "MAUS results" to the user when their results become available. This e-mail contains a link to an output file that provides a detailed explanation of the NOEs and the final valid resonance assignment options. A link to the results is also available on the web interface.

MAUS job complete

If the user receives an e-mail with the MAUS job complete subject line, the job ran successfully. The output file sent to the user is divided into a summary of the user input, the NOE peaks and their contribution, resonance assignment statistics, and the actual resonance assignment options suggested by MAUS.

The first section of MAUS's output gives a summary of the input data, 2D and NOE peaks (see image below). Specifically, MAUS lists the input form information, the number of 2D peaks, the total number of NOEs (computed by taking the union of the long and short NOE peak lists), the number of diagonal NOEs (these must overlap perfectly in both the short and long NOE lists), the number of NOEs that are ambiguous and hence not used in the construction of the data graph (also referred to as complex components), and the number of NOEs that are unambiguous and therefore used in the construction of the data graph (the row that says NOEs used after the reduction of complex components). In addition, the effective degree connectivity (= 2 * number of edges / number of nodes) is provided. In other words, this metric is the average number of NOEs per residue in the data graph.
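As a worked illustration of the connectivity metric, the sketch below reads it as the average number of NOEs per node in the data graph. That reading is an assumption: each edge (NOE) touches two nodes, so the average degree of an undirected graph is twice the edge count divided by the node count. The function name and the numbers are hypothetical and not taken from any MAUS output.

```python
def average_noes_per_node(n_nodes: int, n_edges: int) -> float:
    """Average degree of an undirected graph: every edge is counted at both ends."""
    if n_nodes <= 0:
        raise ValueError("the data graph needs at least one node")
    return 2 * n_edges / n_nodes


# Hypothetical data graph: 50 methyl peaks (nodes) connected by 120 NOEs (edges).
print(average_noes_per_node(50, 120))
```

A higher value means each 2D peak is, on average, constrained by more NOEs.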

The second section lists the NOE peak information. In particular, diagonal NOEs identified by MAUS are listed first for the user to inspect if need be (see image below).

The second section also reports on non-diagonal NOEs (see screenshot below). Here, (i) the Label column indicates MAUS's label used for the NOESY peak, (ii) the Annotation column indicates whether a NOESY peak was recorded at a short or a long mixing time, (iii) C1, C2, H2 and Intensity are the carbon, carbon and proton dimensions and the data height input by the user, (iv) the Clusters column indicates which 2D peak/s the NOE peak is clustered into, and finally (v) Symmetry matches indicates which peaks symmetrize with one another. Peaks that are part of complex components are marked "complex" and are not used by MAUS.

The third section gives resonance assignments statistics. It shows a bar chart of number of valid assignment options for each residue (see image below).

The last section gives the user the final valid resonance assignment options (see image below). The Label column shows the user-defined Assignment column from the 2D HMQC file, followed by the Residue type/s specified manually. The forced assignments column is populated if the user has forced any assignments. The subsequent columns display the carbon and proton dimension values for the peak and, finally, all the valid assignment options identified by MAUS.

MAUS job failed

If the user receives an email stating "MAUS job failed", it may also contain information about troubleshooting. If the mail does not provide additional information, please check the following possibilities.
  1. In order to ensure that NOEs are clustered to their correct 2D methyl resonances, the 3D or 4D spectra must be carefully aligned, i.e., the 3D or 4D NOE data have to be phase corrected. In addition, the NOE peaks must be picked at sufficiently high signal-to-noise ratio levels (>5).
  2. If you have manually specified peak residue types, please check them to make sure you are 100% confident.
  3. If you have manually specified geminal methyl peaks, please check them to make sure you are 100% confident or use our geminal classifier.
  4. If you have provided known assignments and forced them, please check to make sure you are 100% confident about the assignments.
  5. Try adjusting the maximum distance for the short and long mixing times systematically from 10 to 6 and from 15 to 10, respectively.
  6. Try increasing the clustering tolerances in the input form. If your clustering tolerances are very low, then an NOE peak may not be clustered into the correct 2D peak/s.
  7. Try increasing the symmetry tolerances in the input form. If your symmetry tolerances are very low, then your NOE network may not contain the correct symmetry connectivities.
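The effect of the clustering tolerances in point 6 can be pictured with a small sketch. It assumes a simple box criterion (an NOE peak is clustered into every 2D peak whose carbon and proton shifts both lie within the given tolerances); this rule, the function name `cluster_noe`, the peak labels and all numeric values are illustrative assumptions, not MAUS's actual implementation or defaults.

```python
from typing import Dict, List, Tuple


def cluster_noe(
    noe_c: float,
    noe_h: float,
    hmqc_peaks: Dict[str, Tuple[float, float]],
    tol_c: float,
    tol_h: float,
) -> List[str]:
    """Return labels of 2D peaks within tolerance of the NOE's direct C/H shifts."""
    return [
        label
        for label, (c, h) in hmqc_peaks.items()
        if abs(noe_c - c) <= tol_c and abs(noe_h - h) <= tol_h
    ]


# Hypothetical 2D reference peaks: label -> (carbon ppm, proton ppm).
hmqc_peaks = {"peak1": (21.50, 0.91), "peak2": (21.62, 0.93)}

# A hypothetical NOE peak whose direct dimensions fall between the two 2D peaks.
tight = cluster_noe(21.56, 0.92, hmqc_peaks, tol_c=0.05, tol_h=0.005)
loose = cluster_noe(21.56, 0.92, hmqc_peaks, tol_c=0.10, tol_h=0.02)
print(tight)  # no 2D peak is within the tight tolerances
print(loose)  # both 2D peaks are within the looser tolerances
```

With tolerances that are too tight the NOE is not clustered into any 2D peak, while looser tolerances recover the correct peak at the cost of some ambiguity, which is why the tolerances are worth adjusting gradually.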

FAQ

  1. I received an e-mail that says my MAUS job failed. What do I do?

    Follow the steps provided in your e-mail. If you have thoroughly followed and checked every point on the help page and are still getting the job failed message, then e-mail kuttikands@chop.edu.

  2. I am getting different results each time I run MAUS, but you claim that you are exhaustive which means I should get back the same result each time. What is going on?

    There is a heuristic part to this program called Relax. Rosetta's relax protocol is an optimization protocol that relaxes your input crystal structure in its own force field. MAUS generates its structure graph with only 10 relaxed models of your input structure. If your input structure has missing electron density, or a large part of it consists of loops, there is a high chance that Relax samples many different conformations each time.
