CustomEnergyModels

Maxie D. Schmidt edited this page Dec 19, 2021 · 1 revision

Creating custom energy models to load into GTFold

Overview

The energy model DAT files can be modified with GTModify, a program written in C that is available through the gtDMMB research group. The parameter files are organized as follows:

  • .specification.dat indicates the alphabet and pairs.
  • .coaxial. are parameters for flush coaxial stacking where two helix ends stack without an intervening mismatch.
  • .coaxstack. are parameters for coaxial stacking of helices with an intervening mismatch. This is for the stack where the backbone is not continuous.
  • .dangle. are parameters for dangling ends on pairs.
  • .hexaloop. are parameters for hairpin loops of 6 unpaired nucleotides that have stabilities not well modeled by the general hairpin loop parameters.
  • .int11. are parameters for 1×1 internal loops.
  • .int21. are parameters for 2×1 internal loops.
  • .int22. are parameters for 2×2 internal loops.
  • .loop. are parameters for loop initiations.
  • .misloop. are parameters that do not fit into other tables.
  • .stack. are parameters for helical stacking.
  • .tloop. are parameters for hairpin loops of 4 unpaired nucleotides that have stabilities not well modeled by the general hairpin loop parameters.
  • .triloop. are parameters for hairpin loops of 3 unpaired nucleotides that have stabilities not well modeled by the general hairpin loop parameters.
  • .tstack. are parameters for terminal mismatches in exterior loops.
  • .tstackcoax. are parameters for coaxial stacking of helices with an intervening mismatch. This is for the stack where the backbone is continuous.
  • .tstackh. are parameters for first mismatches in hairpin loops.
  • .tstacki. are parameters for the mismatches in internal loops.
  • .tstacki23. are parameters for the mismatches in 2×3 internal loops.
  • .tstackm. are parameters for the mismatches in multibranch loops.

Note that not every file is required by every mode that can be specified in the GTFoldPython configuration.
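Since not every table is needed in every mode, it can help to check which tables a custom model directory actually provides before loading it. The following is a minimal sketch, not part of GTFold or GTFoldPython: the helper names `find_tables` and `missing_tables` are hypothetical, and it assumes that each file carries its table name as one dot-separated component of the file name (for example, a file ending in `.stack.dat`), which may not hold for every data set.

```python
from pathlib import Path

# Table names copied from the list above. How they appear in actual
# file names is an assumption of this sketch.
TABLE_NAMES = [
    "specification", "coaxial", "coaxstack", "dangle", "hexaloop",
    "int11", "int21", "int22", "loop", "misloop", "stack", "tloop",
    "triloop", "tstack", "tstackcoax", "tstackh", "tstacki",
    "tstacki23", "tstackm",
]


def find_tables(model_dir, names=TABLE_NAMES):
    """Map each table name to the files in model_dir that carry it.

    A file matches a table when the table name is one of the
    dot-separated components of its (lowercased) file name, so
    'tstacki' does not accidentally match a 'tstacki23' file.
    """
    found = {name: [] for name in names}
    for path in sorted(Path(model_dir).iterdir()):
        if not path.is_file():
            continue
        parts = path.name.lower().split(".")
        for name in names:
            if name in parts:
                found[name].append(path.name)
    return found


def missing_tables(model_dir, names=TABLE_NAMES):
    """Return the table names for which no file was found."""
    tables = find_tables(model_dir, names)
    return [name for name, files in tables.items() if not files]
```

Calling `missing_tables("path/to/model")` on a directory of parameter files returns the tables that were not found. Whether a missing table matters depends on the mode selected in the GTFoldPython configuration, so the result is a prompt for inspection rather than an error condition.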

Software for parsing the parameter list energy model formats

A guide to using these parameter lists is documented on this site. Reusable code for populating GTFold-like energy model structures can also be taken from the Simfold software project, although that is not Simfold's primary purpose.

Notes on some other semi-standardized modified energy model data sets specified by parameter lists

Several semi-standard variants of the Turner-Mathews NNDB energy model parameter files have been specified in this parameter list format, including, but not limited to, the following named data sets:

  • (MT99): The well-known Turner99 data set (link)
  • (MT09): Another variant of the Mathews-Turner data set (link)
  • (DP03): The Dirks and Pierce '03 model data set (link)
  • (DP09): The Dirks and Pierce '09 model data set (link)
  • (CC06): The Cao and Chen '06 model data set (link)
  • (CC09): The Cao and Chen '09 model data set (link)