Using Sims to dissect trajectories
Sim objects are designed to store datasets that are obtained from a single simulation, and they give a direct interface to trajectory data by way of the MDAnalysis Universe object.
To generate a Sim from scratch, we need only give it a name. This will be used to distinguish the Sim from others, though it need not be unique. We can also give it a topology and/or trajectory files, as we would an MDAnalysis Universe:
>>> from mdsynthesis import Sim
>>> s = Sim('scruffy', universe=['path/to/topology', 'path/to/trajectory'])
This will create a directory scruffy that contains a single file (Sim.<uuid>.h5). That file is a persistent representation of the Sim on disk.
We can access trajectory data by way of
>>> s.universe
<Universe with 47681 atoms>
The Sim can also store selections by giving the usual inputs to Universe.selectAtoms:
>>> s.selections.add('backbone', 'name CA', 'name N', 'name C')
And the AtomGroup can be conveniently obtained with
>>> s.selections['backbone']
<AtomGroup with 642 atoms>
Note
Only selection strings are stored, not the resulting atoms of those selections. This means that if the topology of the Universe is replaced or altered, the AtomGroup returned by a particular selection may change.
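The effect described in this note can be illustrated without MDAnalysis. In the sketch below a "topology" is only a list of atom names and a selection string has the form `name X`; the `ToySelections` class and this tiny selection syntax are hypothetical stand-ins, not part of mdsynthesis, and only mimic the idea that strings rather than atoms are kept.

```python
class ToySelections:
    """Hypothetical stand-in: stores selection strings, never atom indices."""

    def __init__(self, topology):
        self.topology = topology      # list of atom names, one per atom
        self._definitions = {}        # handle -> tuple of selection strings

    def add(self, handle, *selection):
        self._definitions[handle] = selection

    def __getitem__(self, handle):
        # Re-evaluate every stored string against the *current* topology,
        # keeping the order in which the strings were given.
        indices = []
        for sel in self._definitions[handle]:
            wanted = sel.split()[1]   # "name CA" -> "CA"
            indices += [i for i, name in enumerate(self.topology) if name == wanted]
        return indices


sels = ToySelections(["N", "CA", "C", "O", "N", "CA", "C", "O"])
sels.add("backbone", "name CA", "name N", "name C")
before = sels["backbone"]

# Replace the topology: the same stored strings now pick different atoms.
sels.topology = ["N", "CA", "C", "O"]
after = sels["backbone"]
```

Because only the strings are stored, `before` and `after` differ even though the selection definition itself was never touched.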
Multiple Universes
Often it is necessary to post-process a simulation trajectory to get it into a useful form for analysis. This may involve coordinate transformations that center on a particular set of atoms or fit to a structure, removal of water, skipping of frames, etc. This can mean that for a given simulation multiple versions of the raw trajectory may be needed.
For this reason, a Sim can store multiple Universe definitions. To add a definition, we need a topology and a trajectory file
>>> s.universes.add('anotherU', 'path/to/topology', 'path/to/trajectory')
>>> s.universes
<Universes(['anotherU', 'main'])>
and we can make this the active Universe with
>>> s.universes['anotherU']
>>> s
<Sim: 'scruffy' | active universe: 'anotherU'>
Only a single Universe may be active at a time. Atom selections that are stored correspond to the currently active Universe, since different selection strings may be required to achieve the same selection under a different Universe definition. For convenience, we can copy the selections corresponding to another Universe to the active Universe with
>>> s.selections.copy('main')
Need two Universe definitions to be active at the same time? Re-generate a second Sim instance from its representation on disk and activate the desired Universe.
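To picture how selections relate to Universe definitions, it can help to think of them as a mapping keyed first by universe handle and then by selection handle. The sketch below is only a mental model under that assumption; the `stored` dictionary and the `copy_selections` helper are hypothetical and say nothing about how mdsynthesis lays out its state file.

```python
# Hypothetical layout: universe handle -> {selection handle -> selection strings}
stored = {
    "main": {"backbone": ("name CA", "name N", "name C")},
    "anotherU": {},
}
active = "anotherU"


def copy_selections(stored, source, active):
    """Copy every selection definition of `source` into the active universe."""
    for handle, definition in stored[source].items():
        stored[active][handle] = definition


copy_selections(stored, "main", active)
```

After the copy, both universes carry a `backbone` definition, and each can later be changed independently if the post-processed trajectory needs a different selection string.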
Resnums can also be stored
Depending on the simulation package used, it may not be possible to have the resids of the protein match those given in, say, the canonical PDB structure. This can make selections by resid cumbersome at best. For this reason, residues can also be assigned resnums.
For example, say the resids for the protein in our Universe range from 1 to 214, but they should actually go from 10 to 223. If we can’t change the topology to reflect this, we could set the resnums for these residues to the canonical values
>>> prot = s.universe.selectAtoms('protein')
>>> prot.residues.set_resnum(prot.residues.resids() + 9)
>>> prot.residues.resnums()
array([ 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48,
49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61,
62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74,
75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87,
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100,
101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113,
114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126,
127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139,
140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152,
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165,
166, 167, 168, 169, 170, 171, 172, 173, 174, 175, 176, 177, 178,
179, 180, 181, 182, 183, 184, 185, 186, 187, 188, 189, 190, 191,
192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 202, 203, 204,
205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217,
218, 219, 220, 221, 222, 223])
We can now select residue 95 from the PDB structure with
>>> s.universe.selectAtoms('protein and resnum 95')
and we might save selections using resnums as well. However, resnums aren’t stored in the topology, so to avoid having to reset resnums manually each time we load the Universe, we can just store the resnum definition with
>>> s.universes.resnums('main', s.universe.residues.resnums())
and the resnum definition will be applied to the Universe both now and every time it is activated.
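The arithmetic behind this example is a constant shift of +9. The following plain-Python sketch reproduces it without MDAnalysis; the variable names are illustrative only.

```python
# Resids as written by the simulation package (1..214, as in the example).
resids = list(range(1, 215))

# Canonical numbering starts at 10, so every residue is shifted by +9.
offset = 9
resnums = [resid + offset for resid in resids]

# Lookup table from canonical resnum back to the topology's resid.
resnum_to_resid = dict(zip(resnums, resids))
```

With this table, canonical residue 95 maps back to resid 86 in the topology, which is the residue that a `resnum 95` selection picks out once the resnums are set.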
Reference: Sim
- class mdsynthesis.Sim(sim, universe=None, uname='main', location='.', coordinator=None, categories=None, tags=None)
The Sim object is an interface to data for single simulations.
Generate a new or regenerate an existing (on disk) Sim object.
Required arguments:
- sim
if generating a new Sim, the desired name to give it; if regenerating an existing Sim, string giving the path to the directory containing the Sim object’s state file
Optional arguments when generating a new Sim:
- uname
desired name to associate with universe; this universe will be made the default (can always be changed later)
- universe
arguments usually given to an MDAnalysis Universe that defines the topology and trajectory of the atoms
- location
directory to place Sim object; default is the current directory
- coordinator
directory of the Coordinator to associate with the Sim; if the Coordinator does not exist, it is created; if None, the Sim will not associate with any Coordinator
- categories
dictionary with user-defined keys and values; used to give Sims distinguishing characteristics
- tags
list with user-defined values; like categories, but useful for adding many distinguishing descriptors
Note: optional arguments are ignored when regenerating an existing Sim.
- basedir
Absolute path to the Container’s base directory.
This is a convenience property; the same result can be obtained by joining location and name.
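The join described here can be sketched with the standard library. The function below is a hypothetical helper written for illustration, not the property's actual implementation.

```python
import os


def basedir(location, name):
    """Join a container's location and name into an absolute path."""
    return os.path.abspath(os.path.join(location, name))


# With the defaults from the tutorial: location '.', name 'scruffy'.
path = basedir(".", "scruffy")
```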
- categories
The categories of the Container.
Categories are user-added key-value pairs that can be used to distinguish Containers from one another through Coordinator or Group queries. They can also be useful as flags for external code to determine how to handle the Container.
- containertype
The type of the Container.
- coordinators
The locations of the associated Coordinators.
Change this to associate the Container with one or more existing or new Coordinators.
- data
The data of the Container.
Data are user-generated pandas objects (e.g. Series, DataFrames), numpy arrays, or any other pickleable Python objects that are stored in the Container for easy recall later. Each data instance is given its own directory in the Container’s tree.
- location
The location of the Container.
Setting the location to a new path physically moves the Container to the given location. This only works if the new location is an empty or nonexistent directory.
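The constraint on the new location can be made concrete with a small standard-library sketch. `move_container` is a hypothetical helper that assumes the Container's directory keeps its name and is moved into the new location; it is not the code mdsynthesis runs.

```python
import os
import shutil
import tempfile


def move_container(basedir, new_location):
    """Move the directory `basedir` into `new_location` (hypothetical helper).

    Refuses to move if `new_location` exists and is not an empty directory.
    """
    if os.path.exists(new_location):
        if not os.path.isdir(new_location) or os.listdir(new_location):
            raise ValueError("new location must be empty or nonexistent")
    else:
        os.makedirs(new_location)
    target = os.path.join(new_location, os.path.basename(basedir))
    shutil.move(basedir, target)
    return target


# Demonstration in a throwaway directory.
root = tempfile.mkdtemp()
old = os.path.join(root, "scruffy")
os.makedirs(old)
new_base = move_container(old, os.path.join(root, "elsewhere"))
```

Checking the target before moving anything means a failed request leaves the original directory untouched.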
- name
The name of the Container.
The name of a Container need not be unique with respect to other Containers, but is used as part of the Container’s displayed representation.
- selections
Stored atom selections for the active universe.
Useful atom selections can be stored for the active universe and recalled later. Selections are stored separately for each defined universe, since the same selection may require a different selection string for different universes.
- tags
The tags of the Container.
Tags are user-added strings that can be used to distinguish Containers from one another through Coordinator or Group queries. They can also be useful as flags for external code to determine how to handle the Container.
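As a rough picture of how categories and tags distinguish Containers, the sketch below filters plain dictionaries. The records, the `query` function, and the category keys (`ff`, `temp`) are all hypothetical; this is not the Coordinator or Group query interface.

```python
# Hypothetical records standing in for Containers; not mdsynthesis objects.
containers = [
    {"name": "scruffy", "categories": {"ff": "charmm", "temp": 300}, "tags": ["apo"]},
    {"name": "fluffy", "categories": {"ff": "amber", "temp": 300}, "tags": ["holo", "long"]},
    {"name": "spotty", "categories": {"ff": "charmm", "temp": 310}, "tags": ["apo", "long"]},
]


def query(containers, tag=None, **categories):
    """Return names of records having `tag` and matching all given categories."""
    hits = []
    for c in containers:
        if tag is not None and tag not in c["tags"]:
            continue
        if all(c["categories"].get(k) == v for k, v in categories.items()):
            hits.append(c["name"])
    return hits


apo_charmm = query(containers, tag="apo", ff="charmm")
```

Categories suit characteristics that take a value (a force field, a temperature), while tags suit simple labels that are either present or absent.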
- universe
The active universe of the Sim.
Universes are interfaces to raw simulation data. The Sim can store multiple universe definitions corresponding to different versions of the same simulation output (e.g. post-processed trajectories derived from the same raw trajectory). The Sim has at most one universe definition that is “active” at one time, with stored selections for this universe directly available via Sim.selections.
To have more than one universe available as “active” at the same time, generate as many instances of the Sim object from the same statefile on disk as needed, and make a universe active for each one.
- universes
Manage the defined universes of the Sim.
Universes are interfaces to raw simulation data. The Sim can store multiple universe definitions corresponding to different versions of the same simulation output (e.g. post-processed trajectories derived from the same raw trajectory). The Sim has at most one universe definition that is “active” at one time, with stored selections for this universe directly available via Sim.selections.
The Sim can also store a preference for a “default” universe, which is activated on a call to Sim.universe when no other universe is active.
- uuid
Get Container uuid.
A Container’s uuid is used by other Containers to identify it. The uuid is given in the Container’s state file name for fast filesystem searching. For example, a Sim object with state file:
'Sim.7dd9305a-d7d9-4a7b-b513-adf5f4205e09.h5'
has uuid:
'7dd9305a-d7d9-4a7b-b513-adf5f4205e09'
Changing this string will alter the Container’s uuid. This is not generally recommended.
Returns:
- uuid
unique identifier string for this Container
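The relationship between state file name and uuid can be checked with a few lines of standard-library Python. `uuid_from_statefile` is a hypothetical helper for illustration; it assumes the `<type>.<uuid>.h5` naming shown in the example and is not taken from mdsynthesis.

```python
import uuid


def uuid_from_statefile(filename):
    """Return the uuid embedded in a name like 'Sim.<uuid>.h5'."""
    containertype, identifier, extension = filename.split(".")
    # uuid.UUID raises ValueError if the middle part is not a valid uuid.
    uuid.UUID(identifier)
    return identifier


found = uuid_from_statefile("Sim.7dd9305a-d7d9-4a7b-b513-adf5f4205e09.h5")
```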
Reference: Universes
The class mdsynthesis.core.aggregators.Universes is the interface used by a Sim to manage Universe definitions. It is not intended to be used on its own, but is shown here to give a detailed view of its methods.
- class mdsynthesis.core.aggregators.Universes(container, containerfile, logger)
Interface to universes.
- activate(handle=None)
Make the selected universe active.
Only one universe definition can be active in a Sim at one time. The active universe can be accessed from Sim.universe. Stored selections for the active universe can be accessed as items in Sim.selections.
If no handle is given, the default universe is loaded.
If a resnum definition exists for the universe, it is applied.
Arguments:
- handle
given name for selecting the universe; if None, the default universe is selected
- add(handle, topology, *trajectory)
Add a universe definition to the Sim object.
A universe is an MDAnalysis object that gives access to the details of a simulation trajectory. A Sim object can contain multiple universe definitions (topology and trajectory pairs), since it is often convenient to have different post-processed versions of the same raw trajectory.
Using an existing universe handle will replace the topology and trajectory for that definition; selections for that universe will be retained.
If there is no current default universe, then the added universe will become the default.
Arguments:
- handle
given name for selecting the universe
- topology
path to the topology file
- trajectory
path to the trajectory file; multiple files may be given and these will be used in order as frames for the trajectory
- current()
Return the name of the currently active universe.
Returns:
- handle
name of currently active universe
- deactivate()
Deactivate the current universe.
Deactivating the current universe may be necessary to conserve memory, since the universe can then be garbage collected.
- default(handle=None)
Mark the selected universe as the default, or get the default universe.
The default universe is loaded on calls to Sim.universe or Sim.selections when no other universe is attached.
If no handle is given, returns the current default universe.
Arguments:
- handle
given name for selecting the universe; if None, the default universe is unchanged
Returns:
- default
handle of the default universe
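The interplay between the default and the active universe, as described for add, activate, deactivate, current, and default, can be modeled with simple bookkeeping. `ToyUniverses` below is a hypothetical sketch that tracks handles only; it loads no trajectory data and is not the mdsynthesis implementation.

```python
class ToyUniverses:
    """Hypothetical bookkeeping of universe handles; loads no trajectory data."""

    def __init__(self):
        self.definitions = {}   # handle -> (topology, trajectory files)
        self._default = None
        self._active = None

    def add(self, handle, topology, *trajectory):
        self.definitions[handle] = (topology, trajectory)
        if self._default is None:      # first definition becomes the default
            self._default = handle

    def default(self, handle=None):
        if handle is not None:
            self._default = handle
        return self._default

    def activate(self, handle=None):
        # No handle given: fall back to the default universe.
        self._active = self._default if handle is None else handle

    def deactivate(self):
        self._active = None

    def current(self):
        return self._active


u = ToyUniverses()
u.add("main", "path/to/topology", "path/to/trajectory")
u.add("anotherU", "path/to/topology", "path/to/trajectory")
u.activate()
first_active = u.current()
u.activate("anotherU")
second_active = u.current()
```

In this model the first universe added ("main") becomes the default, so activating without a handle selects it, while an explicit handle overrides the default without changing it.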
- define(handle, pathtype='abspath')
Get the stored path to the topology and trajectory used for the specified universe.
Note: Does no checking as to whether these paths are valid. To check this, try activating the universe.
Arguments:
- handle
name of universe to get definition for
Keywords:
- pathtype
type of path to return; ‘abspath’ gives an absolute path, ‘relCont’ gives a path relative to the Sim’s state file
Returns:
- topology
path to the topology file
- trajectory
list of paths to trajectory files
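The difference between the two path types can be illustrated with the standard library. The paths below are hypothetical, and the sketch assumes that "relative to the Sim’s state file" means relative to the directory holding that file; `posixpath` is used so the result does not depend on the operating system.

```python
import posixpath

# Hypothetical locations, chosen only for illustration.
statefile_dir = "/data/sims/scruffy"
trajectory = "/data/sims/raw/run1.xtc"

# 'abspath'-style result: the full path from the filesystem root.
absolute = posixpath.abspath(trajectory)

# 'relCont'-style result: the same file, seen from the state file's directory.
relative = posixpath.relpath(trajectory, start=statefile_dir)
```

A relative path of this kind stays valid if the Sim and its trajectory files are moved together, whereas an absolute path stays valid if only the Sim is moved.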
- remove(*handle)
Remove a universe definition.
Also removes any selections associated with the universe.
Arguments:
- handle
name of universe(s) to delete
- resnums(handle, resnums)
Define resnums for the given universe.
Resnums are useful for referring to residues by their canonical resid, for instance that stored in the PDB. By giving a resnum definition for the universe, this definition will be applied to the universe on activation.
Will overwrite the existing resnum definition if one exists.
Arguments:
- handle
name of universe to apply resnums to
- resnums
list giving the resnum for each residue in the topology, in atom index order; giving None will delete the resnum definition
Reference: Selections
The class mdsynthesis.core.aggregators.Selections is the interface used by a Sim to access its stored selections. It is not intended to be used on its own, but is shown here to give a detailed view of its methods.
- class mdsynthesis.core.aggregators.Selections(container, containerfile, logger)
Selection manager for Sims.
Selections are accessible as items using their handles. Each time they are called, they are regenerated from the universe that is currently active. In this way, changes in the universe topology are reflected in the selections.
- add(handle, *selection)
Add an atom selection for the attached universe.
AtomGroups are needed to obtain useful information from raw coordinate data. It is useful to store AtomGroup selections for later use, since they can be complex and atom order may matter.
If a selection with the given handle already exists, it is replaced.
Arguments:
- handle
name to use for the selection
- selection
selection string; multiple strings may be given and their order will be preserved, which is useful for e.g. structural alignments
- asAtomGroup(handle)
Get the AtomGroup for the given named selection from the active universe.
If the named selection doesn’t exist, a KeyError is raised.
Arguments:
- handle
name of selection to return as an AtomGroup
Returns:
- AtomGroup
the named selection as an AtomGroup of the active universe
- copy(universe)
Copy defined selections of another universe to the active universe.
Arguments:
- universe
name of universe definition to copy selections from
- define(handle)
Get the selection definition for the given handle and the active universe.
If the named selection doesn’t exist, a KeyError is raised.
Arguments:
- handle
name of selection to get definition of
Returns:
- definition
list of strings defining the atom selection
- keys()
Return a list of all selection handles.