Python API documentation

qml.representations module

qml.representations.generate_atomic_coulomb_matrix(nuclear_charges, coordinates, size=23, sorting='distance', central_cutoff=1000000.0, central_decay=-1, interaction_cutoff=1000000.0, interaction_decay=-1, indices=None)

Creates a Coulomb Matrix representation of the local environment of a central atom. For each central atom \(k\), a matrix \(M\) is constructed with elements

\[\begin{split}M_{ij}(k) = \begin{cases} \tfrac{1}{2} Z_{i}^{2.4} \cdot f_{ik}^2 & \text{if } i = j \\ \frac{Z_{i}Z_{j}}{\| {\bf R}_{i} - {\bf R}_{j}\|} \cdot f_{ik}f_{jk}f_{ij} & \text{if } i \neq j \end{cases},\end{split}\]

where \(i\), \(j\) and \(k\) are atom indices, \(Z\) is nuclear charge and \(\bf R\) is the coordinate in euclidean space.

\(f_{ij}\) is a function that masks long range effects:

\[\begin{split}f_{ij} = \begin{cases} 1 & \text{if } \|{\bf R}_{i} - {\bf R}_{j} \| \leq r - \Delta r \\ \tfrac{1}{2} \big(1 + \cos\big(\pi \tfrac{\|{\bf R}_{i} - {\bf R}_{j} \| - r + \Delta r}{\Delta r} \big)\big) & \text{if } r - \Delta r < \|{\bf R}_{i} - {\bf R}_{j} \| \leq r \\ 0 & \text{if } \|{\bf R}_{i} - {\bf R}_{j} \| > r \end{cases},\end{split}\]

where the parameters central_cutoff and central_decay correspond to the variables \(r\) and \(\Delta r\) respectively for interactions involving the central atom, and interaction_cutoff and interaction_decay correspond to the variables \(r\) and \(\Delta r\) respectively for interactions not involving the central atom.
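The damping function can be sketched in a few lines of plain Python. This is an illustrative reference, not the library's Fortran routine: the name `cutoff` is hypothetical, and the sketch assumes a positive decay width (it does not model how the default value of -1 is treated).

```python
import math

def cutoff(distance, r, delta_r):
    """Cosine damping factor f for one interatomic distance.

    r is the cutoff radius and delta_r the decay width.
    Assumption: delta_r > 0.
    """
    if distance <= r - delta_r:
        return 1.0  # inside the undamped region
    if distance <= r:
        # Smooth decay from 1 (at r - delta_r) to 0 (at r).
        x = (distance - r + delta_r) / delta_r
        return 0.5 * (1.0 + math.cos(math.pi * x))
    return 0.0  # beyond the cutoff

# Sample the three regions for r = 4.0 and delta_r = 1.0.
samples = [cutoff(d, 4.0, 1.0) for d in (1.0, 3.5, 5.0)]
```

With these inputs the middle sample lies halfway through the decay region, so its damping factor is one half.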

If sorting = 'row-norm', the atom indices are ordered such that

\(\sum_j M_{1j}(k)^2 \geq \sum_j M_{2j}(k)^2 \geq ... \geq \sum_j M_{nj}(k)^2\)

If sorting = 'distance', the atom indices are ordered such that

\[\|{\bf R}_{1} - {\bf R}_{k}\| \leq \|{\bf R}_{2} - {\bf R}_{k}\| \leq ... \leq \|{\bf R}_{n} - {\bf R}_{k}\|\]

The upper triangular of M, including the diagonal, is concatenated to a 1D vector representation.

The representation can be calculated for a subset by either specifying indices = [0,1,...], where \([0,1,...]\) are the requested atom indices, or by specifying indices = 'C' to only calculate central carbon atoms.

The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • nuclear_charges (numpy array) – Nuclear charges of the atoms in the molecule
  • coordinates (numpy array) – 3D Coordinates of the atoms in the molecule
  • size (integer) – The size of the largest molecule supported by the representation
  • sorting (string) – How the atom indices are sorted (‘row-norm’, ‘distance’)
  • central_cutoff (float) – The distance from the central atom, where the coulomb interaction element will be zero
  • central_decay (float) – The distance over which the coulomb interaction decays from full to none
  • interaction_cutoff (float) – The distance between two non-central atoms, where the coulomb interaction element will be zero
  • interaction_decay (float) – The distance over which the coulomb interaction decays from full to none
  • indices (Nonetype/array/string) – Subset indices or atomtype
Returns:

2D representation - shape (\(N_{atoms}\), size(size+1)/2)

Return type:

numpy array

qml.representations.generate_bob(nuclear_charges, coordinates, atomtypes, size=23, asize={'C': 7, 'H': 16, 'N': 3, 'O': 3, 'S': 1})

Creates a Bag of Bonds (BOB) representation of a molecule. The representation expands on the coulomb matrix representation. For each element a bag (vector) is constructed for self interactions (e.g. (‘C’, ‘H’, ‘O’)). For each element pair a bag is constructed for interatomic interactions (e.g. (‘CC’, ‘CH’, ‘CO’, ‘HH’, ‘HO’, ‘OO’)), sorted by value. The self interaction of element \(I\) is given by

\(\tfrac{1}{2} Z_{I}^{2.4}\),

with \(Z_{I}\) being the nuclear charge of element \(I\). The interaction between atom \(i\) of element \(I\) and atom \(j\) of element \(J\) is given by

\(\frac{Z_{I}Z_{J}}{\| {\bf R}_{i} - {\bf R}_{j}\|}\)

with \({\bf R}_{i}\) being the euclidean coordinate of atom \(i\). The sorted bags are concatenated to a 1D vector representation. The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • nuclear_charges (numpy array) – Nuclear charges of the atoms in the molecule
  • coordinates (numpy array) – 3D Coordinates of the atoms in the molecule
  • size (integer) – The maximum number of atoms in the representation
  • asize (dictionary) – The maximum number of atoms of each element type supported by the representation
Returns:

1D representation

Return type:

numpy array
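The bag construction described above can be sketched in plain Python. Several details here are assumptions made for illustration and may differ from the library: bags are ordered alphabetically by element, each bag is sorted in descending order and zero-padded to the capacity implied by asize, and the name `bag_of_bonds` is hypothetical.

```python
import math
from itertools import combinations_with_replacement

def bag_of_bonds(nuclear_charges, coordinates, atomtypes, asize):
    """Reference sketch of a Bag of Bonds vector (bag order and padding assumed)."""
    elements = sorted(asize)  # assumed bag order: alphabetical
    atoms = {e: [i for i, t in enumerate(atomtypes) if t == e] for e in elements}

    def pad(bag, capacity):
        # Sort by value (descending, an assumption) and zero-pad to capacity.
        bag = sorted(bag, reverse=True)
        return bag + [0.0] * (capacity - len(bag))

    vector = []
    # One self-interaction bag per element: 0.5 * Z^2.4 for each atom.
    for e in elements:
        vector += pad([0.5 * nuclear_charges[i] ** 2.4 for i in atoms[e]], asize[e])

    # One bag per element pair: Z_i * Z_j / |R_i - R_j| for each atom pair.
    for e1, e2 in combinations_with_replacement(elements, 2):
        if e1 == e2:
            pairs = [(i, j) for i in atoms[e1] for j in atoms[e1] if i < j]
            capacity = asize[e1] * (asize[e1] - 1) // 2
        else:
            pairs = [(i, j) for i in atoms[e1] for j in atoms[e2]]
            capacity = asize[e1] * asize[e2]
        bag = [nuclear_charges[i] * nuclear_charges[j]
               / math.dist(coordinates[i], coordinates[j]) for i, j in pairs]
        vector += pad(bag, capacity)
    return vector

# A water-like toy geometry (coordinates are illustrative, not a real structure).
vec = bag_of_bonds([8, 1, 1],
                   [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]],
                   ["O", "H", "H"], {"O": 1, "H": 2})
```

The sketch does not check that a molecule fits within asize; a molecule with more atoms of an element than asize allows would silently produce an over-long bag.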

qml.representations.generate_coulomb_matrix(nuclear_charges, coordinates, size=23, sorting='row-norm')

Creates a Coulomb Matrix representation of a molecule. Sorting of the elements can either be done by sorting="row-norm" or sorting="unsorted". A matrix \(M\) is constructed with elements

\[\begin{split}M_{ij} = \begin{cases} \tfrac{1}{2} Z_{i}^{2.4} & \text{if } i = j \\ \frac{Z_{i}Z_{j}}{\| {\bf R}_{i} - {\bf R}_{j}\|} & \text{if } i \neq j \end{cases},\end{split}\]

where \(i\) and \(j\) are atom indices, \(Z\) is nuclear charge and \(\bf R\) is the coordinate in euclidean space. If sorting = 'row-norm', the atom indices are reordered such that

\(\sum_j M_{1j}^2 \geq \sum_j M_{2j}^2 \geq ... \geq \sum_j M_{nj}^2\)

The upper triangular of M, including the diagonal, is concatenated to a 1D vector representation.

If sorting = 'unsorted', the elements are kept in the same order as the input coordinates and nuclear charges.

The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • nuclear_charges (numpy array) – Nuclear charges of the atoms in the molecule
  • coordinates (numpy array) – 3D Coordinates of the atoms in the molecule
  • size (integer) – The size of the largest molecule supported by the representation
  • sorting (string) – How the atom indices are sorted (‘row-norm’, ‘unsorted’)
Returns:

1D representation - shape (size(size+1)/2,)

Return type:

numpy array
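As a reference for the construction above, the following plain-Python sketch builds the matrix, applies row-norm sorting, zero-pads to size, and flattens the upper triangle. The row-by-row flattening order is an assumption (the library's element order may differ), and `coulomb_matrix` is a hypothetical name rather than the library call.

```python
import math

def coulomb_matrix(nuclear_charges, coordinates, size, sorting="row-norm"):
    """Reference sketch of the flattened Coulomb matrix (flattening order assumed)."""
    n = len(nuclear_charges)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                m[i][j] = 0.5 * nuclear_charges[i] ** 2.4
            else:
                m[i][j] = (nuclear_charges[i] * nuclear_charges[j]
                           / math.dist(coordinates[i], coordinates[j]))

    order = list(range(n))
    if sorting == "row-norm":
        # Largest squared row norm first.
        order.sort(key=lambda i: sum(x * x for x in m[i]), reverse=True)

    # Zero-pad to size x size, applying the atom ordering.
    padded = [[0.0] * size for _ in range(size)]
    for a, i in enumerate(order):
        for b, j in enumerate(order):
            padded[a][b] = m[i][j]

    # Upper triangle including the diagonal, row by row.
    return [padded[a][b] for a in range(size) for b in range(a, size)]

# Two atoms (Z = 1 and Z = 8) one distance unit apart, padded to size 3.
vec = coulomb_matrix([1, 8], [[0.0, 0.0, 0.0], [0.0, 0.0, 1.0]], size=3)
```

The heavier atom has the larger row norm, so its diagonal element comes first in the vector, and the result has size(size+1)/2 = 6 entries.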

qml.representations.generate_eigenvalue_coulomb_matrix(nuclear_charges, coordinates, size=23)

Creates an eigenvalue Coulomb Matrix representation of a molecule. A matrix \(M\) is constructed with elements

\[\begin{split}M_{ij} = \begin{cases} \tfrac{1}{2} Z_{i}^{2.4} & \text{if } i = j \\ \frac{Z_{i}Z_{j}}{\| {\bf R}_{i} - {\bf R}_{j}\|} & \text{if } i \neq j \end{cases},\end{split}\]

where \(i\) and \(j\) are atom indices, \(Z\) is nuclear charge and \(\bf R\) is the coordinate in euclidean space. The molecular representation of the molecule is then the sorted eigenvalues of M. The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • nuclear_charges (numpy array) – Nuclear charges of the atoms in the molecule
  • coordinates (numpy array) – 3D Coordinates of the atoms in the molecule
  • size (integer) – The size of the largest molecule supported by the representation
Returns:

1D representation - shape (size, )

Return type:

numpy array

qml.representations.generate_slatm(coordinates, nuclear_charges, mbtypes, unit_cell=None, local=False, sigmas=[0.05, 0.05], dgrids=[0.03, 0.03], rcut=4.8, alchemy=False, pbc='000', rpower=6)

Generate Spectrum of London and Axilrod-Teller-Muto potential (SLATM) representation. Both global (local=False) and local (local=True) SLATM are available.

A version that works for periodic boundary conditions will be released soon.

NOTE: You will need to run the get_slatm_mbtypes() function to get the mbtypes input (or generate it manually).

Parameters:
  • coordinates (numpy array) – Input coordinates
  • nuclear_charges (numpy array) – List of nuclear charges.
  • mbtypes (list) – Many-body types for the whole dataset, including 1-, 2- and 3-body types. Could be obtained by calling get_slatm_mbtypes().
  • local (bool) – Generate a local (atomic) representation. Defaults to False, i.e. a global representation.
  • sigmas (list) – Widths of the Gaussian smearing functions for the 2- and 3-body parts. Defaults to [0.05, 0.05]; these usually do not need to be adjusted.
  • dgrids (list) – The interval between two sampled internuclear distances and angles. Defaults to [0.03, 0.03], a compromise between speed and accuracy that usually does not need to be changed.
  • rcut (float) – Cut-off radius. Defaults to 4.8 Angstrom.
  • alchemy (bool) – Switch to use the alchemy version of SLATM. (default=False)
  • pbc (string) – Defaults to ‘000’, meaning the system is a molecule; the three digits in the string correspond to the x, y and z directions.
  • rpower (float) – The power of R in the 2-body potential. Defaults to 6 (London potential).
Returns:

1D SLATM representation

Return type:

numpy array

qml.representations.get_slatm_mbtypes(nuclear_charges, pbc='000')

Get the list of minimal types of many-body terms in a dataset. The resulting list is required as the mbtypes input to the generate_slatm() function.

Parameters:
  • nuclear_charges (list of numpy arrays) – A list of the nuclear charges for each compound in the dataset.
  • pbc (string) – periodic boundary condition along x,y,z direction, defaulted to ‘000’, i.e., molecule
Returns:

A list containing the types of many-body terms.

Return type:

list

qml.representations.vector_to_matrix(v)

Converts a representation from 1D vector to 2D square matrix.

Parameters:
  • v (numpy array) – 1D input representation.
Returns:

Square matrix representation.

Return type:

numpy array
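The conversion can be illustrated with a plain-Python sketch that unpacks a flattened upper triangle into a symmetric matrix. The row-by-row element order is an assumption and may not match the library's layout; `upper_triangle_to_matrix` is a hypothetical name.

```python
import math

def upper_triangle_to_matrix(v):
    """Rebuild a symmetric n x n matrix from its flattened upper triangle."""
    # Solve n(n+1)/2 = len(v) for the matrix dimension n.
    n = (math.isqrt(8 * len(v) + 1) - 1) // 2
    if n * (n + 1) // 2 != len(v):
        raise ValueError("vector length is not a triangular number")
    m = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i, n):
            m[i][j] = m[j][i] = v[k]  # mirror across the diagonal
            k += 1
    return m

matrix = upper_triangle_to_matrix([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
```

A vector of length 6 yields a 3 x 3 matrix; lengths that are not of the form n(n+1)/2 raise a ValueError.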

qml.kernels module

qml.kernels.gaussian_kernel(A, B, sigma)

Calculates the Gaussian kernel matrix K, where \(K_{ij}\):

\(K_{ij} = \exp \big( -\frac{\|A_i - B_j\|_2^2}{2\sigma^2} \big)\)

Where \(A_{i}\) and \(B_{j}\) are representation vectors. K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of representations - shape (N, representation size).
  • B (numpy array) – 2D array of representations - shape (M, representation size).
  • sigma (float) – The value of sigma in the kernel matrix.
Returns:

The Gaussian kernel matrix - shape (N, M)

Return type:

numpy array
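For checking results on small inputs, the kernel formula can be written directly in plain Python. This is a slow reference sketch using nested lists, not the OpenMP routine; `gaussian_kernel_reference` is a hypothetical name.

```python
import math

def gaussian_kernel_reference(A, B, sigma):
    """K[i][j] = exp(-||A_i - B_j||_2^2 / (2 sigma^2)) for lists of vectors."""
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(row_a, row_b))
                      / (2.0 * sigma ** 2))
             for row_b in B]
            for row_a in A]

# N = 2 representations against M = 1 representation gives a 2 x 1 matrix.
K = gaussian_kernel_reference([[0.0, 0.0], [1.0, 0.0]], [[0.0, 0.0]], sigma=1.0)
```

Identical vectors give a kernel element of exactly 1, and the element decreases as the squared L2 distance grows.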

qml.kernels.get_local_kernels_gaussian(A, B, na, nb, sigmas)

Calculates the Gaussian kernel matrix K, for a local representation where \(K_{ij}\):

\(K_{ij} = \sum_{a \in i} \sum_{b \in j} \exp \big( -\frac{\|A_a - B_b\|_2^2}{2\sigma^2} \big)\)

Where \(A_{a}\) and \(B_{b}\) are representation vectors.

Note that the input array is one big 2D array with all atoms concatenated along the same axis. Furthermore, a series of kernels is produced, since calculating the distance matrix is expensive but getting the resulting kernel elements for several sigmas is not.

K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of descriptors - shape (total atoms A, representation size).
  • B (numpy array) – 2D array of descriptors - shape (total atoms B, representation size).
  • na (numpy array) – 1D array containing numbers of atoms in each compound.
  • nb (numpy array) – 1D array containing numbers of atoms in each compound.
  • sigmas (list) – List of the sigmas.
Returns:

The Gaussian kernel matrix - shape (nsigmas, N, M)

Return type:

numpy array
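The role of na and nb, and the reuse of one distance matrix for several sigmas, can be sketched as follows. This is a plain-Python reference under the assumption that atoms of each compound occupy consecutive rows in the order given by na and nb; `local_gaussian_kernels` is a hypothetical name.

```python
import math

def local_gaussian_kernels(A, B, na, nb, sigmas):
    """One kernel matrix per sigma, each element summed over atom pairs."""
    def row_ranges(counts):
        # (start, stop) row range of each compound in the concatenated array.
        ranges, start = [], 0
        for count in counts:
            ranges.append((start, start + count))
            start += count
        return ranges

    # Squared L2 distances between all atom pairs, computed only once.
    d2 = [[sum((x - y) ** 2 for x, y in zip(a, b)) for b in B] for a in A]

    kernels = []
    for sigma in sigmas:
        K = [[sum(math.exp(-d2[a][b] / (2.0 * sigma ** 2))
                  for a in range(a0, a1) for b in range(b0, b1))
              for (b0, b1) in row_ranges(nb)]
             for (a0, a1) in row_ranges(na)]
        kernels.append(K)
    return kernels  # indexed as kernels[sigma index][i][j]

# Two compounds (2 atoms and 1 atom) against one single-atom compound.
Ks = local_gaussian_kernels([[0.0], [1.0], [0.0]], [[0.0]],
                            na=[2, 1], nb=[1], sigmas=[1.0, 2.0])
```

The result is nested as (nsigmas, N, M), here (2, 2, 1), matching the documented return shape.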

qml.kernels.get_local_kernels_laplacian(A, B, na, nb, sigmas)

Calculates the Local Laplacian kernel matrix K, for a local representation where \(K_{ij}\):

\(K_{ij} = \sum_{a \in i} \sum_{b \in j} \exp \big( -\frac{\|A_a - B_b\|_1}{\sigma} \big)\)

Where \(A_{a}\) and \(B_{b}\) are representation vectors.

Note that the input array is one big 2D array with all atoms concatenated along the same axis. Furthermore, a series of kernels is produced, since calculating the distance matrix is expensive but getting the resulting kernel elements for several sigmas is not.

K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of descriptors - shape (total atoms A, representation size).
  • B (numpy array) – 2D array of descriptors - shape (total atoms B, representation size).
  • na (numpy array) – 1D array containing numbers of atoms in each compound.
  • nb (numpy array) – 1D array containing numbers of atoms in each compound.
  • sigmas (list) – List of the sigmas.
Returns:

The Laplacian kernel matrix - shape (nsigmas, N, M)

Return type:

numpy array

qml.kernels.laplacian_kernel(A, B, sigma)

Calculates the Laplacian kernel matrix K, where \(K_{ij}\):

\(K_{ij} = \exp \big( -\frac{\|A_i - B_j\|_1}{\sigma} \big)\)

Where \(A_{i}\) and \(B_{j}\) are representation vectors. K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of representations - shape (N, representation size).
  • B (numpy array) – 2D array of representations - shape (M, representation size).
  • sigma (float) – The value of sigma in the kernel matrix.
Returns:

The Laplacian kernel matrix - shape (N, M)

Return type:

numpy array

qml.kernels.linear_kernel(A, B)

Calculates the linear kernel matrix K, where \(K_{ij}\):

\(K_{ij} = A_i \cdot B_j\)

Where \(A_{i}\) and \(B_{j}\) are representation vectors.

K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of representations - shape (N, representation size).
  • B (numpy array) – 2D array of representations - shape (M, representation size).
Returns:

The linear kernel matrix - shape (N, M)

Return type:

numpy array

qml.kernels.matern_kernel(A, B, sigma, order=0, metric='l1')

Calculates the Matern kernel matrix K, where \(K_{ij}\):

for order = 0:
\(K_{ij} = \exp\big( -\frac{d}{\sigma} \big)\)
for order = 1:
\(K_{ij} = \exp\big( -\frac{\sqrt{3} d}{\sigma} \big) \big(1 + \frac{\sqrt{3} d}{\sigma} \big)\)
for order = 2:
\(K_{ij} = \exp\big( -\frac{\sqrt{5} d}{\sigma} \big) \big( 1 + \frac{\sqrt{5} d}{\sigma} + \frac{5 d^2}{3\sigma^2} \big)\)

Where \(A_i\) and \(B_j\) are representation vectors, and d is a distance measure.

K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of representations - shape (N, representation size).
  • B (numpy array) – 2D array of representations - shape (M, representation size).
  • sigma (float) – The value of sigma in the kernel matrix.
  • order (integer) – The order of the polynomial (0, 1, 2)
  • metric (string) – The distance metric (‘l1’, ‘l2’)
Returns:

The Matern kernel matrix - shape (N, M)

Return type:

numpy array
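The three orders and the two metrics can be compared with a small plain-Python reference. It follows the formulas above term by term and is not the library routine; `matern_kernel_reference` is a hypothetical name.

```python
import math

def matern_kernel_reference(A, B, sigma, order=0, metric="l1"):
    """Matern kernel of order 0, 1 or 2 for lists of vectors."""
    def distance(u, v):
        if metric == "l1":
            return sum(abs(a - b) for a, b in zip(u, v))
        if metric == "l2":
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
        raise ValueError("metric must be 'l1' or 'l2'")

    def kernel(d):
        if order == 0:
            return math.exp(-d / sigma)
        if order == 1:
            s = math.sqrt(3.0) * d / sigma
            return math.exp(-s) * (1.0 + s)
        if order == 2:
            s = math.sqrt(5.0) * d / sigma
            # s * s / 3 equals 5 d^2 / (3 sigma^2)
            return math.exp(-s) * (1.0 + s + s * s / 3.0)
        raise ValueError("order must be 0, 1 or 2")

    return [[kernel(distance(u, v)) for v in B] for u in A]

A = [[0.0, 0.0]]
B = [[1.0, 1.0]]
K0 = matern_kernel_reference(A, B, sigma=2.0, order=0, metric="l1")
K2 = matern_kernel_reference(A, B, sigma=2.0, order=2, metric="l2")
```

With order = 0 and metric = 'l1' the expression reduces to the Laplacian kernel form.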

qml.kernels.sargan_kernel(A, B, sigma, gammas)

Calculates the Sargan kernel matrix K, where \(K_{ij}\):

\(K_{ij} = \exp \big( -\frac{\| A_i - B_j \|_1}{\sigma} \big) \big(1 + \sum_{k} \frac{\gamma_{k} \| A_i - B_j \|_1^k}{\sigma^k} \big)\)

Where \(A_{i}\) and \(B_{j}\) are representation vectors. K is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of representations - shape (N, representation size).
  • B (numpy array) – 2D array of representations - shape (M, representation size).
  • sigma (float) – The value of sigma in the kernel matrix.
  • gammas (numpy array) – 1D array of parameters in the kernel matrix.
Returns:

The Sargan kernel matrix - shape (N, M).

Return type:

numpy array
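A plain-Python reference for the formula above is sketched below. The formula does not state where the sum over \(k\) starts; this sketch assumes that the first entry of gammas multiplies the \(k = 1\) term, which may differ from the library's convention. `sargan_kernel_reference` is a hypothetical name.

```python
import math

def sargan_kernel_reference(A, B, sigma, gammas):
    """Sargan kernel for lists of vectors; gammas[0] is assumed to be gamma_1."""
    def element(u, v):
        x = sum(abs(a - b) for a, b in zip(u, v)) / sigma  # L1 distance / sigma
        polynomial = 1.0 + sum(g * x ** k for k, g in enumerate(gammas, start=1))
        return math.exp(-x) * polynomial

    return [[element(u, v) for v in B] for u in A]

K = sargan_kernel_reference([[0.0]], [[2.0]], sigma=1.0, gammas=[0.5, 0.25])
```

With an empty gammas list the polynomial factor is 1 and the expression reduces to the Laplacian kernel form.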

qml.distance module

qml.distance.l2_distance(A, B)

Calculates the L2 distances, D, between two Numpy arrays of representations.

\(D_{ij} = \|A_i - B_j\|_2\)

Where \(A_{i}\) and \(B_{j}\) are representation vectors. D is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of descriptors - shape (N, representation size).
  • B (numpy array) – 2D array of descriptors - shape (M, representation size).
Returns:

The L2-distance matrix.

Return type:

numpy array

qml.distance.manhattan_distance(A, B)

Calculates the Manhattan distances, D, between two Numpy arrays of representations.

\(D_{ij} = \|A_i - B_j\|_1\)

Where \(A_{i}\) and \(B_{j}\) are representation vectors. D is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of descriptors - shape (N, representation size).
  • B (numpy array) – 2D array of descriptors - shape (M, representation size).
Returns:

The Manhattan-distance matrix.

Return type:

numpy array

qml.distance.p_distance(A, B, p=2)

Calculates the p-norm distances between two NumPy arrays of representations. The keyword argument p sets the norm order. E.g. p = 1.0 and p = 2.0 will yield the Manhattan and L2 distances, respectively.

\[D_{ij} = \|A_i - B_j\|_p\]

Where \(A_{i}\) and \(B_{j}\) are representation vectors. D is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • A (numpy array) – 2D array of descriptors - shape (N, representation size).
  • B (numpy array) – 2D array of descriptors - shape (M, representation size).
  • p (float) – The norm order
Returns:

The distance matrix.

Return type:

numpy array
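How p = 1 and p = 2 recover the Manhattan and L2 distances can be checked with a plain-Python reference. This is a sketch of the formula only, and `p_distance_reference` is a hypothetical name.

```python
def p_distance_reference(A, B, p=2):
    """D[i][j] = (sum_k |A_ik - B_jk|^p)^(1/p) for lists of vectors."""
    return [[sum(abs(a - b) ** p for a, b in zip(row_a, row_b)) ** (1.0 / p)
             for row_b in B]
            for row_a in A]

A = [[0.0, 0.0]]
B = [[3.0, 4.0]]
manhattan = p_distance_reference(A, B, p=1)  # |3| + |4|
euclidean = p_distance_reference(A, B, p=2)  # sqrt(3^2 + 4^2)
```

For these inputs the Manhattan distance is 7 and the L2 distance is 5.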

qml.math module

qml.math.bkf_invert(A)

Returns the inverse of a positive definite matrix, using a Cholesky decomposition via calls to LAPACK dpotrf and dpotri in the F2PY module.

Parameters:
  • A (numpy array) – Matrix (symmetric and positive definite, left-hand side).
Returns:

The inverse matrix

Return type:

numpy array

qml.math.bkf_solve(A, y)

Solves the equation

\(A x = y\)

for x using a Cholesky decomposition via calls to LAPACK dpotrf and dpotrs in the F2PY module. Preserves the input matrix A.

Parameters:
  • A (numpy array) – Matrix (symmetric and positive definite, left-hand side).
  • y (numpy array) – Vector (right-hand side of the equation).
Returns:

The solution vector.

Return type:

numpy array

qml.math.cho_invert(A)

Returns the inverse of a positive definite matrix, using a Cholesky decomposition via calls to LAPACK dpotrf and dpotri in the F2PY module.

Parameters:
  • A (numpy array) – Matrix (symmetric and positive definite, left-hand side).
Returns:

The inverse matrix

Return type:

numpy array

qml.math.cho_solve(A, y)

Solves the equation

\(A x = y\)

for x using a Cholesky decomposition via calls to LAPACK dpotrf and dpotrs in the F2PY module. Preserves the input matrix A.

Parameters:
  • A (numpy array) – Matrix (symmetric and positive definite, left-hand side).
  • y (numpy array) – Vector (right-hand side of the equation).
Returns:

The solution vector.

Return type:

numpy array
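The Cholesky approach can be illustrated without LAPACK. The plain-Python sketch below factors \(A = L L^T\) and then performs a forward and a backward substitution; it copies nothing back into A, so the input is preserved. It is a teaching reference, not the library's dpotrf/dpotrs path, and `cholesky_solve` is a hypothetical name.

```python
import math

def cholesky_solve(A, y):
    """Solve A x = y for a symmetric positive definite A via A = L L^T."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]

    # Forward substitution: L z = y.
    z = [0.0] * n
    for i in range(n):
        z[i] = (y[i] - sum(L[i][k] * z[k] for k in range(i))) / L[i][i]

    # Back substitution: L^T x = z.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

A = [[4.0, 2.0], [2.0, 3.0]]
x = cholesky_solve(A, [2.0, 1.0])
```

For this system the solution is x = (0.5, 0), which can be verified by substituting back into \(A x = y\).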

qml.Compound class

class qml.Compound(xyz=None)

Bases: object

The Compound class is used to store data from

Parameters:
  • xyz (string) – Option to initialize the Compound with data from an XYZ file.

generate_atomic_coulomb_matrix(size=23, sorting='row-norm', central_cutoff=1000000.0, central_decay=-1, interaction_cutoff=1000000.0, interaction_decay=-1, indices=None)

Creates a Coulomb Matrix representation of the local environment of a central atom. For each central atom \(k\), a matrix \(M\) is constructed with elements

\[\begin{split}M_{ij}(k) = \begin{cases} \tfrac{1}{2} Z_{i}^{2.4} \cdot f_{ik}^2 & \text{if } i = j \\ \frac{Z_{i}Z_{j}}{\| {\bf R}_{i} - {\bf R}_{j}\|} \cdot f_{ik}f_{jk}f_{ij} & \text{if } i \neq j \end{cases},\end{split}\]

where \(i\), \(j\) and \(k\) are atom indices, \(Z\) is nuclear charge and \(\bf R\) is the coordinate in euclidean space.

\(f_{ij}\) is a function that masks long range effects:

\[\begin{split}f_{ij} = \begin{cases} 1 & \text{if } \|{\bf R}_{i} - {\bf R}_{j} \| \leq r - \Delta r \\ \tfrac{1}{2} \big(1 + \cos\big(\pi \tfrac{\|{\bf R}_{i} - {\bf R}_{j} \| - r + \Delta r}{\Delta r} \big)\big) & \text{if } r - \Delta r < \|{\bf R}_{i} - {\bf R}_{j} \| \leq r \\ 0 & \text{if } \|{\bf R}_{i} - {\bf R}_{j} \| > r \end{cases},\end{split}\]

where the parameters central_cutoff and central_decay correspond to the variables \(r\) and \(\Delta r\) respectively for interactions involving the central atom, and interaction_cutoff and interaction_decay correspond to the variables \(r\) and \(\Delta r\) respectively for interactions not involving the central atom.

If sorting = 'row-norm', the atom indices are ordered such that

\(\sum_j M_{1j}(k)^2 \geq \sum_j M_{2j}(k)^2 \geq ... \geq \sum_j M_{nj}(k)^2\)

If sorting = 'distance', the atom indices are ordered such that

\[\|{\bf R}_{1} - {\bf R}_{k}\| \leq \|{\bf R}_{2} - {\bf R}_{k}\| \leq ... \leq \|{\bf R}_{n} - {\bf R}_{k}\|\]

The upper triangular of M, including the diagonal, is concatenated to a 1D vector representation.

The representation can be calculated for a subset by either specifying indices = [0,1,...], where \([0,1,...]\) are the requested atom indices, or by specifying indices = 'C' to only calculate central carbon atoms.

The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • size (integer) – The size of the largest molecule supported by the representation
  • sorting (string) – How the atom indices are sorted (‘row-norm’, ‘distance’)
  • central_cutoff (float) – The distance from the central atom, where the coulomb interaction element will be zero
  • central_decay (float) – The distance over which the coulomb interaction decays from full to none
  • interaction_cutoff (float) – The distance between two non-central atoms, where the coulomb interaction element will be zero
  • interaction_decay (float) – The distance over which the coulomb interaction decays from full to none
  • indices (Nonetype/array/string) – Subset indices or atomtype
Returns:

2D representation - shape (\(N_{atoms}\), size(size+1)/2)

Return type:

numpy array

generate_bob(size=23, asize={'C': 7, 'H': 16, 'N': 3, 'O': 3, 'S': 1})

Creates a Bag of Bonds (BOB) representation of a molecule. The representation expands on the coulomb matrix representation. For each element a bag (vector) is constructed for self interactions (e.g. (‘C’, ‘H’, ‘O’)). For each element pair a bag is constructed for interatomic interactions (e.g. (‘CC’, ‘CH’, ‘CO’, ‘HH’, ‘HO’, ‘OO’)), sorted by value. The self interaction of element \(I\) is given by

\(\tfrac{1}{2} Z_{I}^{2.4}\),

with \(Z_{I}\) being the nuclear charge of element \(I\). The interaction between atom \(i\) of element \(I\) and atom \(j\) of element \(J\) is given by

\(\frac{Z_{I}Z_{J}}{\| {\bf R}_{i} - {\bf R}_{j}\|}\)

with \({\bf R}_{i}\) being the euclidean coordinate of atom \(i\). The sorted bags are concatenated to a 1D vector representation. The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • asize – The maximum number of atoms of each element type supported by the representation
Returns:

1D representation

Return type:

numpy array

generate_coulomb_matrix(size=23, sorting='row-norm', indices=None)

Creates a Coulomb Matrix representation of a molecule. A matrix \(M\) is constructed with elements

\[\begin{split}M_{ij} = \begin{cases} \tfrac{1}{2} Z_{i}^{2.4} & \text{if } i = j \\ \frac{Z_{i}Z_{j}}{\| {\bf R}_{i} - {\bf R}_{j}\|} & \text{if } i \neq j \end{cases},\end{split}\]

where \(i\) and \(j\) are atom indices, \(Z\) is nuclear charge and \(\bf R\) is the coordinate in euclidean space. If sorting = 'row-norm', the atom indices are reordered such that

\(\sum_j M_{1j}^2 \geq \sum_j M_{2j}^2 \geq ... \geq \sum_j M_{nj}^2\)

The upper triangular of M, including the diagonal, is concatenated to a 1D vector representation. The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • size (integer) – The size of the largest molecule supported by the representation
  • sorting (string) – How the atom indices are sorted (‘row-norm’, ‘unsorted’)
Returns:

1D representation - shape (size(size+1)/2,)

Return type:

numpy array

generate_eigenvalue_coulomb_matrix(size=23)

Creates an eigenvalue Coulomb Matrix representation of a molecule. A matrix \(M\) is constructed with elements

\[\begin{split}M_{ij} = \begin{cases} \tfrac{1}{2} Z_{i}^{2.4} & \text{if } i = j \\ \frac{Z_{i}Z_{j}}{\| {\bf R}_{i} - {\bf R}_{j}\|} & \text{if } i \neq j \end{cases},\end{split}\]

where \(i\) and \(j\) are atom indices, \(Z\) is nuclear charge and \(\bf R\) is the coordinate in euclidean space. The molecular representation of the molecule is then the sorted eigenvalues of M. The representation is calculated using an OpenMP parallel Fortran routine.

Parameters:
  • size (integer) – The size of the largest molecule supported by the representation
Returns:

1D representation - shape (size, )

Return type:

numpy array

generate_fchl_representation(max_size=23, cell=None, neighbors=24, cut_distance=5.0)

Generates the representation for the FCHL-kernel. Note that this representation is incompatible with the generic qml.kernels.* kernels.

Parameters:
  • max_size (integer) – Max number of atoms in representation.

generate_slatm(mbtypes, local=False, sigmas=[0.05, 0.05], dgrids=[0.03, 0.03], rcut=4.8, pbc='000', alchemy=False, rpower=6)

Generate Spectrum of London and Axilrod-Teller-Muto potential (SLATM) representation. Both global (local=False) and local (local=True) SLATM are available.

A version that works for periodic boundary conditions will be released soon.

NOTE: You will need to run the get_slatm_mbtypes() function to get the mbtypes input (or generate it manually).

Parameters:
  • mbtypes (list) – Many-body types for the whole dataset, including 1-, 2- and 3-body types. Could be obtained by calling get_slatm_mbtypes().
  • local (bool) – Generate a local (atomic) representation. Defaults to False, i.e. a global representation.
  • sigmas (list) – Widths of the Gaussian smearing functions for the 2- and 3-body parts. Defaults to [0.05, 0.05]; these usually do not need to be adjusted.
  • dgrids (list) – The interval between two sampled internuclear distances and angles. Defaults to [0.03, 0.03], a compromise between speed and accuracy that usually does not need to be changed.
  • rcut (float) – Cut-off radius. Defaults to 4.8 Angstrom.
  • alchemy (bool) – Switch to use the alchemy version of SLATM. (default=False)
  • pbc (string) – Defaults to ‘000’, meaning the system is a molecule; the three digits in the string correspond to the x, y and z directions.
  • rpower (float) – The power of R in the 2-body potential. Defaults to 6 (London potential).
Returns:

1D SLATM representation

Return type:

numpy array

read_xyz(filename)

(Re-)initializes the Compound-object with data from an xyz-file.

Parameters:
  • filename (string) – Input xyz-filename.

qml.fchl module

qml.fchl.generate_representation(coordinates, nuclear_charges, max_size=23, neighbors=23, cut_distance=5.0, cell=None)

Generates a representation for the FCHL kernel module.

Parameters:
  • coordinates (numpy array) – Input coordinates.
  • nuclear_charges (numpy array) – List of nuclear charges.
  • max_size (integer) – Max number of atoms in representation.
  • neighbors (integer) – Max number of atoms within the cut-off around an atom. (For periodic systems)
  • cell (numpy array) – Unit cell vectors. The presence of this keyword argument will generate a periodic representation.
  • cut_distance (float) – Spatial cut-off distance - must be the same as used in the kernel function call.
Returns:

FCHL representation, shape = (size,5,neighbors).

Return type:

numpy array

qml.fchl.get_atomic_kernels(A, B, sigmas, two_body_scaling=2.8284271247461903, three_body_scaling=1.6, two_body_width=0.2, three_body_width=3.141592653589793, two_body_power=4.0, three_body_power=2.0, cut_start=1.0, cut_distance=5.0, fourier_order=1, alchemy='periodic-table', alchemy_period_width=1.6, alchemy_group_width=1.6)

Calculates the Gaussian kernel matrix K, where \(K_{ij}\):

\(K_{ij} = \exp \big( -\frac{\|A_i - B_j\|_2^2}{2\sigma^2} \big)\)

Where \(A_{i}\) and \(B_{j}\) are FCHL representation vectors. K is calculated analytically using an OpenMP parallel Fortran routine. Note that this kernel will ONLY work with FCHL representations as input.

Parameters:
  • A (numpy array) – Array of FCHL representation - shape=(N, maxsize, 5, size).
  • B (numpy array) – Array of FCHL representation - shape=(M, maxsize, 5, size).
  • sigmas (list) – List of kernel-widths.
  • two_body_scaling (float) – Weight for 2-body terms.
  • three_body_scaling (float) – Weight for 3-body terms.
  • two_body_width (float) – Gaussian width for 2-body terms
  • three_body_width (float) – Gaussian width for 3-body terms.
  • two_body_power (float) – Powerlaw for \(r^{-n}\) 2-body terms.
  • three_body_power (float) – Powerlaw for Axilrod-Teller-Muto 3-body term
  • cut_start (float) – The fraction of the cut-off radius at which cut-off damping starts.
  • cut_distance (float) – Cut-off radius. (default=5 angstrom)
  • fourier_order (integer) – 3-body Fourier-expansion truncation order.
  • alchemy (string) – Type of alchemical interpolation "periodic-table" or "off" are possible options. Disabling alchemical interpolation can yield dramatic speedups.
  • alchemy_period_width (float) – Gaussian width along periods (rows) in the periodic table.
  • alchemy_group_width (float) – Gaussian width along groups (columns) in the periodic table.
Returns:

Array of FCHL kernel matrices - shape=(n_sigmas, N, M).

Return type:

numpy array

qml.fchl.get_atomic_symmetric_kernels(A, sigmas, two_body_scaling=2.8284271247461903, three_body_scaling=1.6, two_body_width=0.2, three_body_width=3.141592653589793, two_body_power=4.0, three_body_power=2.0, cut_start=1.0, cut_distance=5.0, fourier_order=1, alchemy='periodic-table', alchemy_period_width=1.6, alchemy_group_width=1.6)

Calculates the Gaussian kernel matrix K, where \(K_{ij}\):

\(K_{ij} = \exp \big( -\frac{\|A_i - A_j\|_2^2}{2\sigma^2} \big)\)

Where \(A_{i}\) and \(A_{j}\) are FCHL representation vectors. K is calculated analytically using an OpenMP parallel Fortran routine. Note that this kernel will ONLY work with FCHL representations as input.

Parameters:
  • A (numpy array) – Array of FCHL representation - shape=(N, maxsize, 5, size).
  • sigmas (list) – List of kernel-widths.
  • two_body_scaling (float) – Weight for 2-body terms.
  • three_body_scaling (float) – Weight for 3-body terms.
  • two_body_width (float) – Gaussian width for 2-body terms
  • three_body_width (float) – Gaussian width for 3-body terms.
  • two_body_power (float) – Powerlaw for \(r^{-n}\) 2-body terms.
  • three_body_power (float) – Powerlaw for Axilrod-Teller-Muto 3-body term
  • cut_start (float) – The fraction of the cut-off radius at which cut-off damping starts.
  • cut_distance (float) – Cut-off radius. (default=5 angstrom)
  • fourier_order (integer) – 3-body Fourier-expansion truncation order.
  • alchemy (string) – Type of alchemical interpolation "periodic-table" or "off" are possible options. Disabling alchemical interpolation can yield dramatic speedups.
  • alchemy_period_width (float) – Gaussian width along periods (rows) in the periodic table.
  • alchemy_group_width (float) – Gaussian width along groups (columns) in the periodic table.
Returns:

Array of FCHL kernel matrices - shape=(n_sigmas, N, N).

Return type:

numpy array

qml.fchl.get_global_kernels(A, B, sigmas, two_body_scaling=2.8284271247461903, three_body_scaling=1.6, two_body_width=0.2, three_body_width=3.141592653589793, two_body_power=4.0, three_body_power=2.0, cut_start=1.0, cut_distance=5.0, fourier_order=1, alchemy='periodic-table', alchemy_period_width=1.6, alchemy_group_width=1.6)

Calculates the Gaussian kernel matrix K, where \(K_{ij}\):

\(K_{ij} = \exp \big( -\frac{\|A_i - B_j\|_2^2}{2\sigma^2} \big)\)

Where \(A_{i}\) and \(B_{j}\) are FCHL representation vectors. K is calculated analytically using an OpenMP parallel Fortran routine. Note that this kernel will ONLY work with FCHL representations as input.

Parameters:
  • A (numpy array) – Array of FCHL representation - shape=(N, maxsize, 5, maxneighbors).
  • B (numpy array) – Array of FCHL representation - shape=(M, maxsize, 5, maxneighbors).
  • sigmas (list) – List of kernel-widths.
  • two_body_scaling (float) – Weight for 2-body terms.
  • three_body_scaling (float) – Weight for 3-body terms.
  • two_body_width (float) – Gaussian width for 2-body terms
  • three_body_width (float) – Gaussian width for 3-body terms.
  • two_body_power (float) – Powerlaw for \(r^{-n}\) 2-body terms.
  • three_body_power (float) – Powerlaw for Axilrod-Teller-Muto 3-body term
  • cut_start (float) – The fraction of the cut-off radius at which cut-off damping start.
  • cut_distance (float) – Cut-off radius. (default=5 angstrom)
  • fourier_order (integer) – 3-body Fourier-expansion truncation order.
  • alchemy (string) – Type of alchemical interpolation "periodic-table" or "off" are possible options. Disabling alchemical interpolation can yield dramatic speedups.
  • alchemy_period_width (float) – Gaussian width along periods (columns) in the periodic table.
  • alchemy_group_width (float) – Gaussian width along groups (rows) in the periodic table.
Returns:

Array of FCHL kernel matrices - shape=(n_sigmas, N, M).

Return type:

numpy array
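
To make the kernel formula and the (n_sigmas, N, M) return shape concrete, here is a minimal NumPy sketch that evaluates the same Gaussian expression on plain, flat vectors. It is not the FCHL routine and does not accept FCHL representations; the function name gaussian_kernels and the toy data are assumptions made for illustration only.

```python
import numpy as np


def gaussian_kernels(A, B, sigmas):
    """Illustrative only: Gaussian kernels between rows of A (N, d) and B (M, d).

    Returns an array of shape (n_sigmas, N, M), one kernel matrix per sigma.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    # Squared Euclidean distances ||A_i - B_j||^2, shape (N, M).
    diff = A[:, None, :] - B[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Put the kernel widths on a leading axis, shape (n_sigmas, 1, 1).
    s = np.asarray(sigmas, dtype=float)[:, None, None]
    return np.exp(-sq_dist[None, :, :] / (2.0 * s ** 2))


rng = np.random.default_rng(0)
A = rng.normal(size=(4, 6))  # N = 4 toy vectors (not FCHL representations)
B = rng.normal(size=(3, 6))  # M = 3 toy vectors
sigmas = [0.5, 1.0, 2.0]

K = gaussian_kernels(A, B, sigmas)
print(K.shape)  # (3, 4, 3)
```

K[s] is the kernel matrix for sigmas[s], so a single width can be selected by indexing the first axis.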

qml.fchl.get_global_symmetric_kernels(A, sigmas, two_body_scaling=2.8284271247461903, three_body_scaling=1.6, two_body_width=0.2, three_body_width=3.141592653589793, two_body_power=4.0, three_body_power=2.0, cut_start=1.0, cut_distance=5.0, fourier_order=1, alchemy='periodic-table', alchemy_period_width=1.6, alchemy_group_width=1.6)

Calculates the Gaussian kernel matrix K, with elements \(K_{ij}\) given by:

\(K_{ij} = \exp \big( -\frac{\|A_i - A_j\|_2^2}{2\sigma^2} \big)\)

where \(A_{i}\) and \(A_{j}\) are FCHL representation vectors. One kernel matrix is returned for each kernel width \(\sigma\) in sigmas. K is calculated analytically using an OpenMP-parallel Fortran routine. Note that this kernel will ONLY work with FCHL representations as input.

Parameters:
  • A (numpy array) – Array of FCHL representations - shape=(N, maxsize, 5, maxneighbors).
  • sigmas (list) – List of kernel widths.
  • two_body_scaling (float) – Weight for 2-body terms.
  • three_body_scaling (float) – Weight for 3-body terms.
  • two_body_width (float) – Gaussian width for 2-body terms.
  • three_body_width (float) – Gaussian width for 3-body terms.
  • two_body_power (float) – Power-law exponent \(n\) for the \(r^{-n}\) 2-body terms.
  • three_body_power (float) – Power-law exponent for the Axilrod-Teller-Muto 3-body term.
  • cut_start (float) – The fraction of the cut-off radius at which cut-off damping starts.
  • cut_distance (float) – Cut-off radius. (default=5 angstrom)
  • fourier_order (integer) – 3-body Fourier-expansion truncation order.
  • alchemy (string) – Type of alchemical interpolation; possible options are "periodic-table" and "off". Disabling alchemical interpolation can yield dramatic speedups.
  • alchemy_period_width (float) – Gaussian width along periods (rows) in the periodic table.
  • alchemy_group_width (float) – Gaussian width along groups (columns) in the periodic table.
Returns:

Array of FCHL kernel matrices - shape=(n_sigmas, N, N).

Return type:

numpy array
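
A symmetric kernel of this form has two properties that follow directly from the formula: each K[s] equals its own transpose, and its diagonal is 1 because \(\|A_i - A_i\| = 0\). The sketch below checks both on plain toy vectors, filling only the upper triangle and mirroring it. The function gaussian_symmetric_kernels is a hypothetical stand-in written for illustration; it is not the FCHL routine, and how the Fortran code exploits symmetry is not described here.

```python
import numpy as np


def gaussian_symmetric_kernels(A, sigmas):
    """Illustrative only: symmetric Gaussian kernels for rows of A (N, d).

    Returns an array of shape (n_sigmas, N, N).
    """
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    sq_dist = np.zeros((n, n))
    # Compute each pair once (upper triangle) and mirror it; the diagonal stays 0.
    for i in range(n):
        for j in range(i + 1, n):
            d = A[i] - A[j]
            sq_dist[i, j] = sq_dist[j, i] = np.dot(d, d)
    s = np.asarray(sigmas, dtype=float)[:, None, None]
    return np.exp(-sq_dist[None, :, :] / (2.0 * s ** 2))


rng = np.random.default_rng(1)
A = rng.normal(size=(5, 6))  # N = 5 toy vectors (not FCHL representations)
K = gaussian_symmetric_kernels(A, sigmas=[1.0, 10.0])

print(K.shape)                                             # (2, 5, 5)
print(np.allclose(K, K.transpose(0, 2, 1)))                # True
print(np.allclose(np.diagonal(K, axis1=1, axis2=2), 1.0))  # True
```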

qml.fchl.get_local_kernels(A, B, sigmas, two_body_scaling=2.8284271247461903, three_body_scaling=1.6, two_body_width=0.2, three_body_width=3.141592653589793, two_body_power=4.0, three_body_power=2.0, cut_start=1.0, cut_distance=5.0, fourier_order=1, alchemy='periodic-table', alchemy_period_width=1.6, alchemy_group_width=1.6)

Calculates the Gaussian kernel matrix K, with elements \(K_{ij}\) given by:

\(K_{ij} = \exp \big( -\frac{\|A_i - B_j\|_2^2}{2\sigma^2} \big)\)

where \(A_{i}\) and \(B_{j}\) are FCHL representation vectors. One kernel matrix is returned for each kernel width \(\sigma\) in sigmas. K is calculated analytically using an OpenMP-parallel Fortran routine. Note that this kernel will ONLY work with FCHL representations as input.

Parameters:
  • A (numpy array) – Array of FCHL representations - shape=(N, maxsize, 5, maxneighbors).
  • B (numpy array) – Array of FCHL representations - shape=(M, maxsize, 5, maxneighbors).
  • sigmas (list) – List of kernel widths.
  • two_body_scaling (float) – Weight for 2-body terms.
  • three_body_scaling (float) – Weight for 3-body terms.
  • two_body_width (float) – Gaussian width for 2-body terms.
  • three_body_width (float) – Gaussian width for 3-body terms.
  • two_body_power (float) – Power-law exponent \(n\) for the \(r^{-n}\) 2-body terms.
  • three_body_power (float) – Power-law exponent for the Axilrod-Teller-Muto 3-body term.
  • cut_start (float) – The fraction of the cut-off radius at which cut-off damping starts.
  • cut_distance (float) – Cut-off radius. (default=5 angstrom)
  • fourier_order (integer) – 3-body Fourier-expansion truncation order.
  • alchemy (string) – Type of alchemical interpolation; possible options are "periodic-table" and "off". Disabling alchemical interpolation can yield dramatic speedups.
  • alchemy_period_width (float) – Gaussian width along periods (rows) in the periodic table.
  • alchemy_group_width (float) – Gaussian width along groups (columns) in the periodic table.
Returns:

Array of FCHL kernel matrices - shape=(n_sigmas, N, M).

Return type:

numpy array
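
Because these kernels only accept FCHL representations, it can be useful to check array shapes before making the call. The sketch below validates the documented input shapes, (N, maxsize, 5, maxneighbors) and (M, maxsize, 5, maxneighbors), and reports the documented output shape (n_sigmas, N, M). The helper check_fchl_inputs is hypothetical and not part of qml. It does not verify that the arrays actually contain FCHL data, nor whether maxsize and maxneighbors must agree between A and B.

```python
import numpy as np


def check_fchl_inputs(A, B, sigmas):
    """Hypothetical helper: validate input shapes, return the expected kernel shape."""
    A = np.asarray(A)
    B = np.asarray(B)
    for name, arr in (("A", A), ("B", B)):
        # Documented layout: (n, maxsize, 5, maxneighbors).
        if arr.ndim != 4 or arr.shape[2] != 5:
            raise ValueError(
                f"{name} must have shape (n, maxsize, 5, maxneighbors), got {arr.shape}"
            )
    if len(sigmas) == 0:
        raise ValueError("sigmas must contain at least one kernel width")
    return (len(sigmas), A.shape[0], B.shape[0])


# Zero-filled placeholders with the documented layout (not real representations).
A = np.zeros((4, 3, 5, 3))  # N = 4
B = np.zeros((2, 3, 5, 3))  # M = 2
print(check_fchl_inputs(A, B, [1.0, 2.5]))  # (2, 4, 2)
```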

qml.fchl.get_local_symmetric_kernels(A, sigmas, two_body_scaling=2.8284271247461903, three_body_scaling=1.6, two_body_width=0.2, three_body_width=3.141592653589793, two_body_power=4.0, three_body_power=2.0, cut_start=1.0, cut_distance=5.0, fourier_order=1, alchemy='periodic-table', alchemy_period_width=1.6, alchemy_group_width=1.6)

Calculates the Gaussian kernel matrix K, with elements \(K_{ij}\) given by:

\(K_{ij} = \exp \big( -\frac{\|A_i - A_j\|_2^2}{2\sigma^2} \big)\)

where \(A_{i}\) and \(A_{j}\) are FCHL representation vectors. One kernel matrix is returned for each kernel width \(\sigma\) in sigmas. K is calculated analytically using an OpenMP-parallel Fortran routine. Note that this kernel will ONLY work with FCHL representations as input.

Parameters:
  • A (numpy array) – Array of FCHL representations - shape=(N, maxsize, 5, maxneighbors).
  • sigmas (list) – List of kernel widths.
  • two_body_scaling (float) – Weight for 2-body terms.
  • three_body_scaling (float) – Weight for 3-body terms.
  • two_body_width (float) – Gaussian width for 2-body terms.
  • three_body_width (float) – Gaussian width for 3-body terms.
  • two_body_power (float) – Power-law exponent \(n\) for the \(r^{-n}\) 2-body terms.
  • three_body_power (float) – Power-law exponent for the Axilrod-Teller-Muto 3-body term.
  • cut_start (float) – The fraction of the cut-off radius at which cut-off damping starts.
  • cut_distance (float) – Cut-off radius. (default=5 angstrom)
  • fourier_order (integer) – 3-body Fourier-expansion truncation order.
  • alchemy (string) – Type of alchemical interpolation; possible options are "periodic-table" and "off". Disabling alchemical interpolation can yield dramatic speedups.
  • alchemy_period_width (float) – Gaussian width along periods (rows) in the periodic table.
  • alchemy_group_width (float) – Gaussian width along groups (columns) in the periodic table.
Returns:

Array of FCHL kernel matrices - shape=(n_sigmas, N, N).

Return type:

numpy array

qml.wrappers module

qml.wrappers.arad_local_kernels(mols1, mols2, sigmas, width=0.2, cut_distance=5.0, r_width=1.0, c_width=0.5)
qml.wrappers.arad_local_symmetric_kernels(mols1, sigmas, width=0.2, cut_distance=5.0, r_width=1.0, c_width=0.5)
qml.wrappers.get_atomic_kernels_gaussian(mols1, mols2, sigmas)
qml.wrappers.get_atomic_kernels_laplacian(mols1, mols2, sigmas)