ppafm.ml#

ppafm.ml.AuxMap#

class ppafm.ml.AuxMap.AtomRfunc(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), zmin=-1.5, Rfunc=None, Rmax=10.0, drStep=0.1)[source]#

Bases: AuxMapBase

Generate AtomRfunc descriptors for molecules. Atoms are represented by disks with decay determined by Rfunc.

Parameters:
  • Rfunc – numpy.ndarray. Radial function of the bond and atom potential. Converted to numpy.float32.

  • Rmax – float. Cutoff in angstroms for the radial function. Make sure it is smaller than the maximum range of Rfunc - 3*drStep. The additional three steps are for spline interpolation.

  • drStep – float. Step dx (dr) in angstroms for sampling of radial function Rfunc.
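
As a minimal sketch of the Rmax constraint above, the following tabulates a radial function and checks a cutoff against it. The exponential decay, the helper names make_rfunc and rmax_is_safe, and the assumption that the table starts at r = 0 (so that its range is (len(Rfunc) - 1) * drStep) are illustrative and not taken from ppafm.

```python
import numpy as np

def make_rfunc(n_samples=120, dr_step=0.1, decay=1.0):
    """Tabulate a hypothetical exponentially decaying radial function."""
    r = np.arange(n_samples) * dr_step  # sample radii in angstroms, starting at 0
    return np.exp(-r / decay).astype(np.float32)

def rmax_is_safe(rfunc, r_max, dr_step):
    """Check that r_max leaves three extra table steps for spline interpolation."""
    table_range = (len(rfunc) - 1) * dr_step  # assumed maximum range of the table
    return r_max < table_range - 3 * dr_step

rfunc = make_rfunc()
print(rfunc.dtype, rmax_is_safe(rfunc, r_max=10.0, dr_step=0.1))
```

With 120 samples at 0.1 angstrom spacing the table reaches 11.9 angstroms, so the default Rmax=10.0 passes the check while a cutoff of 11.8 would not.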

eval(xyzqs, Zs, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.AtomicDisks(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), zmin=-1.2, zmax_s=-1.2, Rpp=-0.5, diskMode='sphere')[source]#

Bases: AuxMapBase

Generate Atomic Disks descriptors for molecules. Each atom is represented by a conically decaying disk.

Parameters:
  • zmax_s – float. The maximum depth of vdW sphere shell when diskMode='sphere'.

  • Rpp – float. A constant that is added to the vdW radius of each atom.

  • diskMode – 'sphere' or 'center'. With 'center', only the center coordinates are considered when deciding whether an atom is too deep. With 'sphere', the effective size of the atom is also taken into account.

eval(xyzqs, Zs, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.AuxMapBase(scan_dim, scan_window, zmin=None)[source]#

Bases: ABC

Base class for AuxMap subclasses.

Each subclass must override the method eval, which gets called when the object instance is called.

Parameters:
  • scan_dim – tuple of two ints. Indicates the pixel size of the scan in x and y.

  • scan_window – tuple ((start_x, start_y), (end_x, end_y)). The start and end coordinates of scan region in angstroms.

  • zmin – float. Deepest coordinate that is still included. Top is defined to be at 0.
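
The calling convention described above can be sketched with a stand-in base class. ToyAuxMapBase and AtomCount below are hypothetical and only mimic the documented behaviour (calling an instance forwards to eval); the per-pixel atom count is an invented descriptor chosen to keep the example short, not one that ppafm provides.

```python
from abc import ABC, abstractmethod
import numpy as np

class ToyAuxMapBase(ABC):
    """Stand-in for AuxMapBase; not the ppafm implementation."""

    def __init__(self, scan_dim, scan_window, zmin=None):
        self.scan_dim = scan_dim
        self.scan_window = scan_window
        self.zmin = zmin

    def __call__(self, *args, **kwargs):
        # Calling the instance forwards to eval
        return self.eval(*args, **kwargs)

    @abstractmethod
    def eval(self, xyzqs, Zs, pot=None, rot=None):
        ...

class AtomCount(ToyAuxMapBase):
    """Hypothetical descriptor: number of atoms falling into each pixel."""

    def eval(self, xyzqs, Zs, pot=None, rot=None):
        (x0, y0), (x1, y1) = self.scan_window
        counts, _, _ = np.histogram2d(
            xyzqs[:, 0], xyzqs[:, 1], bins=self.scan_dim, range=[[x0, x1], [y0, y1]]
        )
        return counts

aux = AtomCount(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)))
xyzqs = np.array([[0.1, 0.1, 0.0, 0.0], [1.5, -2.0, -0.5, 0.0]])
Y = aux(xyzqs, Zs=np.array([6, 8]))
print(Y.shape, Y.sum())
```

Because eval is abstract, instantiating the base class directly raises a TypeError, which is what forces every subclass to provide its own eval.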

abstract eval(Zs, pot, rot)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

prepare_projector(xyzqs, Zs, pos0, bonds2atoms=None, elem_channels=None)[source]#

class ppafm.ml.AuxMap.AuxMapFactory[source]#

Bases: object

class ppafm.ml.AuxMap.Bonds(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), zmin=-1.5, Rfunc=None, Rmax=10.0, drStep=0.1, ellipticity=0.5)[source]#

Bases: AuxMapBase

Generate Bonds descriptors for molecules. Bonds between atoms are represented by ellipses.

Parameters:
  • Rfunc – numpy.ndarray. Radial function of the bond and atom potential. Converted to numpy.float32.

  • Rmax – float. Cutoff in angstroms for the radial function. Make sure it is smaller than the maximum range of Rfunc - 3*drStep. The additional three steps are for spline interpolation.

  • drStep – float. Step dx (dr) in angstroms for sampling of radial function Rfunc.

  • ellipticity – float. Ratio between the major and minor semiaxes.

eval(xyzqs, Zs, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.ESMap(scanner, zmin=-2.0, iso=0.1)[source]#

Bases: AuxMapBase

Generate ESMap and HeightMap descriptors for molecules. Represents the charge distribution around the molecule as the z-component of the electrostatic field calculated on the surface defined by the HeightMap.

The HeightMap and ESMap descriptors are different from the other ones in that they depend on the simulation parameters. Before calling ESMap, first do a scan of the molecule with the scanner to get the forces. Giving the elements Zs as an input argument for eval is optional, since it is not used for anything.

Parameters:
  • scanner – Instance of oclr.RelaxedScanner.

  • iso – float. The value of the isosurface.

eval(xyzqs, Zs=None, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.ESMapConstant(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), height=4.0, vdW_cutoff=None, Rpp=0.5)[source]#

Bases: AuxMapBase

Generate constant-height ESMap descriptor for molecules. Represents the charge distribution around the molecule as the z-component of the electrostatic field calculated on a constant-height surface.

Parameters:
  • height – float. The height of the constant-height slice, counted up from the center of the top atom.

  • vdW_cutoff – float < 0.0 or None. Use the vdW-Spheres descriptor as a mask to cut off regions without atoms. The cutoff is the same as zmin for the vdW-Spheres descriptor. If None, don't use a cutoff, and calculate the ES Map descriptor for the whole slice.

  • Rpp – float. A constant that is added to the vdW radius of each atom if vdW_cutoff is set.
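
A rough sketch of what a constant-height electrostatic map looks like, assuming the field comes from the point charges in xyzqs only. The function es_map_constant_height, the np.linspace pixel grid, and the unit choice (V/angstrom via the Coulomb constant in eV*angstrom) are assumptions for illustration; the actual class can also take a Hartree potential and a vdW mask, which are not reproduced here.

```python
import numpy as np

COULOMB_CONST = 14.3996  # e^2 / (4 pi eps0) in eV * angstrom

def es_map_constant_height(xyzqs, scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), height=4.0):
    """Ez of point charges on a slice `height` above the center of the top atom."""
    (x0, y0), (x1, y1) = scan_window
    x = np.linspace(x0, x1, scan_dim[0])
    y = np.linspace(y0, y1, scan_dim[1])
    X, Y = np.meshgrid(x, y, indexing="ij")
    z_slice = xyzqs[:, 2].max() + height  # counted up from the top atom center
    dx = X[..., None] - xyzqs[:, 0]  # shape (nx, ny, n_atoms)
    dy = Y[..., None] - xyzqs[:, 1]
    dz = z_slice - xyzqs[:, 2]       # shape (n_atoms,)
    r3 = (dx**2 + dy**2 + dz**2) ** 1.5
    # Sum the z-component of the Coulomb field over all atoms
    return COULOMB_CONST * np.sum(xyzqs[:, 3] * dz / r3, axis=-1)

xyzqs = np.array([[0.0, 0.0, 0.0, 0.2], [1.2, 0.0, -0.3, -0.2]])
Ez = es_map_constant_height(xyzqs)
print(Ez.shape)
```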

eval(xyzqs, Zs=None, pot=None, rot=array([[1., 0., 0.], [0., 1., 0.], [0., 0., 1.]]))[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.HeightMap(scanner, zmin=-2.0, iso=0.1)[source]#

Bases: AuxMapBase

Generate HeightMap descriptors for molecules. Represents the combined interaction of the probe with the atoms in the molecule as an isosurface of the z-component of the force field around the molecule.

The HeightMap and ESMap descriptors are different from the other ones in that they depend on the simulation parameters. Before calling HeightMap, first do a scan of the molecule with the scanner to get the forces. Giving the atoms xyzqs and elements Zs as input arguments for eval is optional, since they are not used for anything.

Parameters:
  • scanner – Instance of oclr.RelaxedScanner.

  • iso – float. The value of the isosurface.

eval(xyzqs=None, Zs=None, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.MultiMapSpheres(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), zmin=-1.5, Rpp=-0.5, nChan=3, Rmin=1.4, Rstep=0.3, bOccl=0)[source]#

Bases: AuxMapBase

Generate Multimap vdW Spheres descriptors for molecules. Each atom is represented by a projection of a sphere with radius equal to the vdW radius of the element. Different sizes of atoms are separated into different channels based on their vdW radii.

Parameters:
  • Rpp – float. A constant that is added to the vdW radius of each atom.

  • nChan – int. Number of channels.

  • Rmin – float. Minimum radius.

  • Rstep – float. Size range per bin.

  • bOccl – 0 or 1. Switch occlusion of atoms 0=off 1=on.
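
One plausible reading of nChan, Rmin, and Rstep is a simple binning of atoms by radius, sketched below. The radii in TOY_VDW_RADII are illustrative numbers rather than the values ppafm uses, and both the floor-based binning and the clipping of out-of-range radii into the first or last channel are assumptions.

```python
import numpy as np

# Illustrative radii in angstroms, not the values ppafm uses.
TOY_VDW_RADII = {1: 1.49, 6: 1.91, 8: 1.66, 35: 2.22}

def radius_channels(Zs, n_chan=3, r_min=1.4, r_step=0.3):
    """Assign each atom to a channel according to the size bin of its radius."""
    radii = np.array([TOY_VDW_RADII[int(Z)] for Z in Zs])
    channels = np.floor((radii - r_min) / r_step).astype(int)
    # Assumed behaviour: radii outside the covered range go to the edge channels
    return np.clip(channels, 0, n_chan - 1)

print(radius_channels([1, 8, 6, 35]))
```

With the default Rmin=1.4 and Rstep=0.3 the three channels cover radii in [1.4, 1.7), [1.7, 2.0), and [2.0, 2.3).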

eval(xyzqs, Zs, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

class ppafm.ml.AuxMap.MultiMapSpheresElements(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), zmin=-1.5, Rpp=-0.5, elems=[['H'], ['N', 'O', 'F'], ['C', 'Si', 'P', 'S', 'Cl', 'Br']], bOccl=0)[source]#

Bases: AuxMapBase

Generate Multimap vdW Spheres descriptors for molecules. Each atom is represented by a projection of a sphere with radius equal to the vdW radius of the element. Different elements can be separated arbitrarily into different channels.

Parameters:
  • Rpp – float. A constant that is added to the vdW radius of each atom.

  • elems – list of lists of int or str. Lists of elements in each channel as the atomic numbers or symbols.

  • bOccl – 0 or 1. Switch occlusion of atoms 0=off 1=on.
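
The elems argument can be pictured as a lookup from element to channel index. The sketch below is not the library's convert_elements or get_elem_channels; the helper names, the small symbol table, and the choice of -1 for elements that appear in no channel are all assumptions.

```python
import numpy as np

# Small symbol table covering only the default elems; a full implementation
# would need the whole periodic table.
SYMBOL_TO_Z = {"H": 1, "C": 6, "N": 7, "O": 8, "F": 9, "Si": 14, "P": 15, "S": 16, "Cl": 17, "Br": 35}

def to_atomic_numbers(elems):
    """Convert symbols to atomic numbers, leaving ints untouched."""
    return [[SYMBOL_TO_Z[e] if isinstance(e, str) else int(e) for e in group] for group in elems]

def element_channels(Zs, elems):
    """Return the channel index of every atom, or -1 if its element is unlisted."""
    groups = to_atomic_numbers(elems)
    channels = np.full(len(Zs), -1, dtype=int)
    for i_chan, group in enumerate(groups):
        channels[np.isin(Zs, group)] = i_chan
    return channels

elems = [["H"], ["N", "O", "F"], ["C", "Si", "P", "S", "Cl", "Br"]]
print(element_channels(np.array([1, 6, 8, 26]), elems))
```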

convert_elements(elems)[source]#
eval(xyzqs, Zs, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

get_elem_channels(Zs)[source]#

class ppafm.ml.AuxMap.vdwSpheres(scan_dim=(128, 128), scan_window=((-8, -8), (8, 8)), zmin=-1.5, Rpp=-0.5)[source]#

Bases: AuxMapBase

Generate vdW Spheres descriptors for molecules. Each atom is represented by a projection of a sphere with radius equal to the vdW radius of the element.

Parameters:

Rpp – float. A constant that is added to the vdW radius of each atom.

eval(xyzqs, Zs, pot=None, rot=None)[source]#
Parameters:
  • xyzqs – numpy.ndarray of floats. xyz coordinates and charges of atoms in molecule

  • Zs – numpy.ndarray of ints. Elements of atoms in molecule.

  • pot – HartreePotential. Sample hartree potential.

  • rot – np.ndarray of shape (3, 3). Sample rotation.

ppafm.ml.Generator#

class ppafm.ml.Generator.GeneratorAFMtrainer(afmulator, aux_maps, sample_generator, sim_type='LJ', batch_size=30, distAbove=5.3, iZPPs=[8], Qs=None, QZs=None, rhos=None, rho_deltas=None, ignore_elements=[], density_cutoff=None)[source]#

Bases: object

Generate batches of input/output pair samples for machine learning. An iterator.

The machine learning samples are generated for every sample system returned by a generator function. The generator should return dicts with the input arguments for the simulation. Possible entries in the dict are all of the call arguments to AFMulator.eval(). At least the entries 'xyzs' and 'Zs' should be present in the dict.
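
A sample generator can be as small as the following sketch, which yields dicts with only the two required entries. The function name toy_sample_generator, the CO-like geometry, and the random shifts are illustrative; any further dict keys would have to match the call arguments of AFMulator.eval() and are not shown here.

```python
import numpy as np

def toy_sample_generator(n_samples=3, seed=0):
    """Yield sample dicts with the minimum entries 'xyzs' and 'Zs'."""
    rng = np.random.default_rng(seed)
    base_xyzs = np.array([[0.0, 0.0, 0.0], [1.13, 0.0, 0.0]])  # two atoms, angstroms
    Zs = np.array([6, 8])  # carbon and oxygen
    for _ in range(n_samples):
        shift = rng.uniform(-0.5, 0.5, size=3)  # move the whole molecule a little
        yield {"xyzs": base_xyzs + shift, "Zs": Zs}

for sample in toy_sample_generator():
    print(sample["xyzs"].shape, sample["Zs"])
```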

The following types of force fields are supported for the simulations (case-insensitive):

  • 'LJ': Lennard-Jones without electrostatics.

  • 'LJ+PC': Lennard-Jones with electrostatics from point-charges.

  • 'LJ+Hartree': Lennard-Jones with electrostatics from the Hartree potential.

  • 'FDBM': Full-density based model.

During the iteration for a batch, several callback methods are called at various points. The procedure is the following:

on_batch_start()
for each sample:
    on_sample_start()
    for each tip:
        on_afm_start()
        # Run AFM simulation
    # Run AuxMap calculations

These methods can be overridden to modify the behaviour of the simulation. For example, various parameters of the simulation can be randomized.
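
The callback order above can be tried out with a stand-in class that does no simulation at all. ToyTrainer, its run_batch method, and its log attribute are hypothetical; only the three callback names and their ordering come from the description above.

```python
class ToyTrainer:
    """Stand-in that only reproduces the documented callback order."""

    def __init__(self, samples, tips, batch_size=2):
        self.samples = samples
        self.tips = tips
        self.batch_size = batch_size
        self.log = []

    def on_batch_start(self):
        self.log.append("batch")

    def on_sample_start(self):
        self.log.append("sample")

    def on_afm_start(self):
        self.log.append("afm")

    def run_batch(self):
        self.on_batch_start()
        for _ in self.samples[: self.batch_size]:
            self.on_sample_start()
            for _ in self.tips:
                self.on_afm_start()
                # The AFM simulation would run here
            # The AuxMap calculations would run here
        return self.log

class RandomizingTrainer(ToyTrainer):
    def on_sample_start(self):
        super().on_sample_start()
        # A real subclass might call self.randomize_distance() at this point
        self.log.append("randomize")

trainer = RandomizingTrainer(samples=["mol_a", "mol_b"], tips=[8])
print(trainer.run_batch())
```

Overriding a callback in a subclass, as RandomizingTrainer does, is the intended place to hook in per-batch, per-sample, or per-image randomization.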

The iterator returns batches of samples (Xs, Ys, mols, sws):

  • Xs: AFM images as an np.ndarray of shape (batch_size, n_tip, nx, ny, nz).

  • Ys: AuxMap descriptors as an np.ndarray of shape (batch_size, n_auxmap, nx, ny).

  • mols: List of length batch_size of atomic coordinates, atomic numbers, and charges as an np.ndarray of shape (n_atoms, 5).

  • sws: Scan window bounds as an np.ndarray of shape (batch_size, n_tip, 2, 3).

See also the tutorial: Generating machine learning training data.

Parameters:
  • afmulator – An instance of AFMulator.

  • aux_maps – list of AuxMapBase.

  • sample_generator – Iterable. A generator function that returns sample dicts containing the input arguments for the simulation.

  • sim_type – str. Type of force field model to use in the AFM simulation. The contents of the dicts returned by the sample_generator need to be sufficient for the simulation type. Otherwise an error is raised at run time. If the dict contains entries not required for the chosen simulation type, then those entries are ignored.

  • batch_size – int. Number of samples per batch.

  • distAbove – float. Tip-sample distance parameter.

  • iZPPs – list of int. Atomic numbers of AFM tips. An image is produced with every tip for each sample.

  • Qs – list of arrays of length 4. Point charges for tips. Used for point-charge approximation of tip charge when the simulation type is LJ+PC.

  • QZs – list of arrays of length 4. Positions of tip charges. Used for point-charge approximation of tip charge when the simulation type is LJ+PC.

  • rhos – list of dict or TipDensity. Tip charge densities. Used for electrostatic interaction when the simulation type is LJ+Hartree or for Pauli repulsion calculation when the simulation type is FDBM.

  • rho_deltas – None or list of TipDensity. Tip delta charge density. Required when the simulation type is FDBM, where it is used for calculating the electrostatic interaction.

  • ignore_elements – list of int. Atomic numbers of elements to ignore in scan window distance and position calculations. Useful for example for ignoring surface slab atoms in centering the scan window.

  • density_cutoff – float or None. If not None, apply a cutoff to electron densities in the FDBM Pauli integral. Useful when working with all-electron densities where large density values near nuclei can cause artifacts in the resulting images. Ignored when sim_type is not 'FDBM'.

handle_distance()[source]#

Set correct distance of the scan window from the current molecule.

handle_positions()[source]#

Shift scan window laterally to center on the molecule.

on_afm_start()[source]#

Executed before every AFM image evaluation. Override to modify the parameters for each AFM image.

on_batch_start()[source]#

Executed at the start of each batch. Override to modify parameters for each batch.

on_sample_start()[source]#

Executed after loading in a new sample. Override to modify the parameters for each sample.

randomize_df_steps(minimum=4, maximum=20)[source]#

Randomize oscillation amplitude by randomizing the number of steps in df convolution.

The number of df steps is drawn uniformly at random between minimum and maximum. Modifies self.scan_dim and self.scan_size to retain the same output z dimension and the same dz step for the chosen number of df steps.

Parameters:
  • minimum – int. Minimum number of df steps (inclusive).

  • maximum – int. Maximum number of df steps (inclusive).
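
The bookkeeping behind keeping the output z dimension fixed can be sketched as follows. It assumes that the df convolution shortens the z axis by df_steps - 1 points and that dz equals the z scan size divided by the number of z points; the function randomized_z_scan and its arguments are hypothetical and do not mirror the attributes of the trainer.

```python
import numpy as np

def randomized_z_scan(nz_out, dz, minimum=4, maximum=20, rng=None):
    """Pick df_steps and return the z scan dimension and size that keep nz_out and dz."""
    rng = np.random.default_rng() if rng is None else rng
    df_steps = int(rng.integers(minimum, maximum, endpoint=True))  # both bounds inclusive
    nz_scan = nz_out + df_steps - 1  # extra points consumed by the df convolution
    z_size = nz_scan * dz            # grow the scan range so the step stays dz
    return df_steps, nz_scan, z_size

df_steps, nz_scan, z_size = randomized_z_scan(nz_out=10, dz=0.1, rng=np.random.default_rng(0))
print(df_steps, nz_scan, z_size)
```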

randomize_distance(delta=0.25)[source]#

Randomize tip-sample distance.

Parameters:

delta – float. Maximum deviation from the original value in angstroms.

randomize_tip(max_tilt=0.5)[source]#

Randomize tip tilt to simulate asymmetric adsorption of particle on tip apex.

Parameters:

max_tilt – float. Maximum deviation in xy plane in angstroms.

set_fdbm_parameters(A_pauli, B_pauli)[source]#

Set the Pauli integral parameters in an FDBM simulation. Does nothing if the simulation type is not FDBM.

Parameters:
  • A_pauli – float. Integral prefactor.

  • B_pauli – float. Integrand exponent.

class ppafm.ml.Generator.InverseAFMtrainer(afmulator, aux_maps, paths, batch_size=30, distAbove=5.3, iZPPs=[8], Qs=[[-10, 20, -10, 0]], QZs=[[0.1, 0, -0.1, 0]])[source]#

Bases: object

A data generator for training machine learning models. Generates batches of input/output pairs. An iterator.

Yields batches of samples (Xs, Ys, mols), where Xs are AFM images, Ys are aux map descriptors, and mols are molecules. Xs is a list of np.ndarray of shape (n_batch, sx, sy, sz), Ys is a list of np.ndarray of shape (n_batch, sx, sy), and mols is a list of length n_batch of np.ndarray of shape (n_atoms, 5), where n_batch is the batch size, sx, sy, and sz are the scan sizes in the x, y, and z dimensions, respectively, and n_atoms is the number of atoms. The outer lists are indexed by tip in Xs and by aux map in Ys. In mols, each row of an array holds the x, y, and z coordinates, the charge, and the element of one atom.

See also the tutorial: Generating machine learning training data.

Parameters:
  • afmulator – An instance of AFMulator.

  • aux_maps – list of AuxMapBase.

  • paths – list of paths to xyz files of molecules. The molecules are saved to the “molecules” attribute in np.ndarrays of shape (num_atoms, 5) with [x, y, z, charge, element] for each atom.

  • batch_size – int. Number of samples per batch.

  • distAbove – float. Tip-sample distance parameter.

  • iZPPs – list of ints. Elements for AFM tips. Image is produced with every tip for each sample.

  • Qs – list of arrays of length 4. Charges for tips.

  • QZs – list of arrays of length 4. Positions of tip charges.

augment_with_rotations(rotations)[source]#

Augment molecule list with rotations of the molecules.

Parameters:

rotations – list of np.ndarray. Rotation matrices.
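
A minimal sketch of rotation augmentation on arrays of shape (n_atoms, 5), assuming that each rotation matrix R acts on position vectors as x' = R x about the origin and that one rotated copy is produced per molecule and rotation. The function rotate_molecules is hypothetical; the library may center the molecule first or treat the original orientation differently.

```python
import numpy as np

def rotate_molecules(molecules, rotations):
    """Return one rotated copy of every molecule for every rotation matrix."""
    augmented = []
    for mol in molecules:
        for R in rotations:
            rotated = mol.copy()
            # Rotate the xyz columns; charge and element columns stay unchanged
            rotated[:, :3] = mol[:, :3] @ np.asarray(R).T
            augmented.append(rotated)
    return augmented

rot_z90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
mol = np.array([[1.0, 0.0, 0.0, 0.1, 6.0]])  # x, y, z, charge, element
out = rotate_molecules([mol], [np.eye(3), rot_z90])
print(len(out), out[1][0])
```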

augment_with_rotations_entropy(rotations, n_best_rotations=30)[source]#

Augment molecule list with rotations of the molecules. Rotations are sorted in terms of their “entropy”.

Parameters:
  • rotations – list of np.ndarray. Rotation matrices.

  • n_best_rotations – int. Only the first n_best_rotations with the highest “entropy” will be taken.

handle_distance()[source]#

Set correct distance from scan region for the current molecule.

handle_positions()[source]#

Set current molecule to the center of the scan window.

on_afm_end()[source]#

Executed right after evaluating an AFM image. Override to modify the parameters for each sample.

on_afm_start()[source]#

Executed right before every AFM image evaluation. Override to modify the parameters for each AFM image.

on_batch_start()[source]#

Executed right at the start of each batch. Override to modify parameters for each batch.

on_sample_start()[source]#

Executed right before evaluating the first AFM image. Override to modify the parameters for each sample.

randomize_distance(delta=0.25)[source]#

Randomize tip-sample distance.

Parameters:

delta – float. Maximum deviation from original value in angstroms.

randomize_mol_parameters(rndQmax=0.0, rndRmax=0.0, rndEmax=0.0, rndAlphaMax=0.0)[source]#

Randomize various interaction parameters for current molecule.

randomize_tip(max_tilt=0.5)[source]#

Randomize tip tilt to simulate asymmetric adsorption of particle on tip apex.

Parameters:

max_tilt – float. Maximum deviation in xy plane in angstroms.

read_xyzs()[source]#

Read molecule xyz files from selected paths.

shuffle_molecules()[source]#

Shuffle list of molecules.

ppafm.ml.Generator.getRandomUniformDisk()[source]#

Generate points uniformly distributed over a disk. See: http://mathworld.wolfram.com/DiskPointPicking.html
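
The disk point picking method from the MathWorld reference can be sketched as below: taking the square root of a uniform random number for the radius compensates for the area growing with r. The function random_uniform_disk and its rng argument are illustrative, and the library function may differ in its return format or in how it draws the numbers.

```python
import numpy as np

def random_uniform_disk(rng=None):
    """Return (x, y) uniformly distributed over the unit disk."""
    rng = np.random.default_rng() if rng is None else rng
    r = np.sqrt(rng.random())            # sqrt makes the density uniform in area
    theta = 2.0 * np.pi * rng.random()   # uniform angle
    return r * np.cos(theta), r * np.sin(theta)

rng = np.random.default_rng(0)
points = np.array([random_uniform_disk(rng) for _ in range(2000)])
print(points.shape, np.linalg.norm(points, axis=1).max() <= 1.0)
```

Sampling the radius uniformly without the square root would instead cluster points near the center of the disk.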

ppafm.ml.Generator.rotate(xyzs, rotations)[source]#
ppafm.ml.Generator.sortRotationsByEntropy(xyzs, rotations)[source]#