IRA library documentation

GitHub repository Online documentation and tutorials

Introduction

Even though this library is called the IRA library, there are three main algorithms contained inside:

  • CShDA: Constrained Shortest Distance Assignments;

  • IRA: Iterative Rotations and Assignments;

  • SOFI: Symmetry Operation FInder.

Each of those algorithms is designed to solve a different problem related to characterization of generic atomic structures.

Some other miscallaneous routines are contained, see the section of How-to guides.

CShDA

Constrained Shortest Distance Assignments (CShDA, [ira2021]) is the really fundamental algorithm for the rest of this library. It solves the so-called Linear Assignment Problem (LAP), also referred to as the “point-set registry”, and then computes the distance between two atomic structures \(A\) and \(B\).

More precisely, it solves the problem sometimes referred to as bottleneck-LAP, and uses that solution to compute the Hausdorff distance:

\[d_H( A, B ) = \max_i \min_j \big\{ d(a_i, b_j) \big\} \qquad(1)\]

where \(a_i \in A\) and \(b_j \in B\) are atomic positions, and \(d()\) is the Cartesian distance function.

By first solving the LAP, computing the distance between two atomic structures becomes trivial. In fact, such distance is invariant to permutations of atoms \(P_B\), and variant to all other rigid transformations. CShDA returns the atomic permutations, and values of atom-atom distances w.r.t. the permutation: distances \(d(a_i, b_i)\), with indices \(i\), after applying the permutation computed in the beginning (LAP solution). Finally, the distance between \(A\) and \(B\) is taken as the maximal value in the set.

It enforces a strict one-to-one assignment of the atoms, and works also for structures containing different numbers of atoms.

Since CShDA can be relatively heavy to compute many times, a heuristic early-exit criterion is set up, and used whenever possible. The criterion is in the form of a distance threshold: as soon as it is established that the final distance value returned by CShDA cannot be below the given threshold, the computation exits with a high distance value. The heuristic can be disregarded by simply inputting a very high value for this threshold.

(under construction)

IRA

Iterative Rotations and Assignments (IRA, [ira2021]) solves the so-called shape matching problem, also sometimes called structural alignment, or similar:

\[P_B B = \mathbf{R} A + \mathbf{t} \qquad (2)\]

where \(A\) and \(B\) are two atomic structures, \(\mathbf{R}\) is a rotation/reflection matrix, \(\mathbf{t}\) is a translation vector, and \(P_B\) is a permutation matrix of atomic indices in \(B\).

The problem of finding an optimal rotation \(\mathbf{R}\) when \(A\) and \(B\) do not include permutations is known as the orthogonal Procrustes problem, and can be easily solved by Singular Value Decomposition method (SVD), see the wikipedia article. The routines to solve it are also included in this library.

The shape-matching problem of Eq. (2) is however more complicated since \(A\) and \(B\) generally do include permutations, but it can be rewritten as an optimization problem:

\[\DeclareMathOperator*{\argmin}{arg\,min} \argmin_{\bf R,t} \big\{ D({\bf R}A + {\bf t}, B) \big\} \qquad (3)\]

in which \(D\) is a general distance function between two sets, that is (i) variant under \({\bf R}\) and \({\bf t}\), (ii) invariant under permutation \(P_B\), and (iii) returns value 0 when \({\bf R}\) and \({\bf t}\) are such that Eq. (2) is satisfied, i.e. when the best match is found.

The IRA algorithm solves the Eq. (3) by parsing the space of rigid transformations associated to the atomic structure, computing \(D()\) for each candidate, and returning the single transformation that minimizes it. In presence of distortions, it calls the SVD-based algorithm to further minimize the rotations. The function \(D()\) in Eq. (3) is computed by CShDA.

When \(A\) and \(B\) are congruent, IRA is guaranteed to return the optimal solution, independently of the initial orientation and permutation of the structures, which is not entirely obvious (see benchmarks in [ira2021]). In case of nearly-congruent structures, this becomes debatable, since the notion of near-congruency is not uniquely defined, and is dependent on the involved structures.

Therefore, IRA is not recommended for applications where the value of (dis-)similarity between structures is the key quantity. It is however highly efficient in tasks like retrieving a certain structure from a list of structures, where the structure is certain to be present in the list (in any orientation and permutation).

[ira2021] (1,2,3)
  1. Gunde, N. Salles, A. Hemeryck, L. Martin-Samos, JCIM, 2021, DOI: 10.1021/acs.jcim.1c00567, arXiv: 2111.00939

SOFI

Symmetry Operation Finder (SOFI, [sofi2024]) is an algorithm for finding point group symmetry operations of atomic structures.

By definition, the transformation of a structure by a symmetry operation, should give a structure that is equivalent to the original, up to a permutation of indices:

\[P_A A = \boldsymbol{\theta} A \qquad (4)\]

where \(A\) is an atomic structure, \(P_A\) is a permutation of atomic indices in \(A\), and \(\boldsymbol{\theta}\) is a symmetry operation in the form of 3x3 orthonormal matrix.

Notice the similarity of Eq. (4) to Eq. (2): the structure \(B\) is now equal to \(A\), the rigid transformation \(\mathbf{R}\) becomes a symmetry operation \(\boldsymbol{\theta}\), and the translation \(\mathbf{t}\) vanishes.

When the structure \(A\) contains point group symmetries, Eq. (4) has degenrate solutions in form of pairs \((\boldsymbol{\theta}, P_A)\). The set of all such pairs represents the set of point group symmetry operations for the structure. SOFI solves this problem. It can be seen as an extension of IRA, where IRA gives a single, optimal solution to matching two (near-)congruent structures, SOFI gives all degenerate solutions of matching a structure to itself.

[sofi2024]
  1. Gunde, N. Salles, L. Grisanti, L. Martin-Samos, A. Hemeryck, JCP, 2024, DOI: 10.1063/5.0215689, arXiv: 2408.06131

Compilation

The compilation will produce directories IRA/lib and IRA/include, which contain the library and module files respectively. Other directories may be creted, depending on the flavour of build tool used.

Python module via pip

If you just want the ira_mod python module, it can be done by running in IRA/ root directory (this relies on cmake):

python -m pip install .

Then you can directly use it:

>>> import ira_mod
>>> print( ira_mod.version )
>>> ira = ira_mod.IRA()
>>> sofi = ira_mod.SOFI()

conda installation

Using the conda-forge channel, you can download and install the ira_mod python module by:

conda install ira

Other build tools

Other build tools are available, use one of the following:

To install with cmake, execute:

cmake -B builddir
cmake --build builddir

This will produce directories lib and include in the builddir path specified, they will contain the static library libira.a and the fortran module files, respectively. If you wish to compile the shared library, configure the build with -DBUILD_SHARED_LIBS=ON:

cmake -B builddir -DBUILD_SHARED_LIBS=ON
cmake --build builddir

The available options are:

  • -DIRA_BUILD_TESTS: build some simple tests (default OFF);

  • -DIRA_ENABLE_NATIVE_OPTIMIZATION pass the compiler flags “-march=native -ffast-math” or equivalent (default ON).

To install the library and module files to a particular location, do:

cmake -B builddir -DCMAKE_INSTALL_PREFIX=your/install/path
cmake --build builddir
cmake --install builddir

This will copy the lib and include directories and their contents to your/install/path, and create files needed for other CMake and/or pkg-config based projects to find the library.

python module ira_mod

To use the python module, you will need to pass the -DIRA_INSTALL_PYTHON=ON, install the library somewhere, and set the PYTHONPATH variable to that same path with adding /ira_mod:

cmake -B builddir -DIRA_INSTALL_PYTHON=ON -DCMAKE_INSTALL_PREFIX=your/install/path
cmake --build builddir
cmake --install builddir
export PYTHONPATH=your/install/path/ira_mod

To compile the IRA library, you need the lapack library. On a standard linux machine, it should suffice to type:

cd src/
make all

This will generate the static and shared libraries: libira.a, and libira.so. To compile only one of the libraries, type make lib or make shlib, respectively.

To clean the build type:

make clean

lapack library

The lapack library is needed for compilation. Default flag is LIBLAPACK=-llapack, however if you wish to change that, i.e. with openblas, you can specify it from the command as:

LIBLAPACK=-lopenblas make all

If the lapack library is not available on your system, you can leave the variable undefined (this will compile a local version of the needed lapack routines, which is however not optimal):

LIBLAPACK='' make all

python module ira_mod

To use the python module, you will need to set the PYTHONPATH variable:

export PYTHONPATH=/path/to/IRA/interface:$PYTHONPATH

For local machines, it is possible to use pixi [pixi-install] to get a working version of the Python bindings in a fairly automated manner.

curl -fsSL https://pixi.sh/install.sh | bash
# build the library with openblas
pixi run build_lib
pixi shell # sets the environment variable
cd examples/IRA
python python_program.py
[pixi-install]

Installation instructions here: https://pixi.sh/latest/

Required minimum fpm version 0.12.0.

fpm build --flag "-fPIC -fcheck=bounds -ffree-line-length-none -Ofast -march=native -ffast-math -funroll-loops"
fpm install --prefix .

python module ira_mod

To use the python module, you will need to set the PYTHONPATH variable:

export PYTHONPATH=/path/to/IRA/interface:$PYTHONPATH

Linking a program to libira

Linking a program to libira is simple, however note that:

  • The base-level implementations are not placed in modules, therefore all routines are in principle acessible to the caller. Care must be taken to ensure the correct type, kind, shape, etc. of the arguments, i.e. interface matching needs to be checked manually;

  • The default precision is equivalent to c_int for integers, and c_double for reals, they are defined in the IRA/src/ira_precision.f90 module.

To link your Fortran or C target in CMake with libira, it suffices to add IRA project into your CMakeLists.txt project by one of the following methods:

  • To include IRA as part of the build process of your program:

    • With FetchContent as:

      # add IRA via FetchContent:
      include( FetchContent )
      FetchContent_Declare( ira_git
        GIT_REPOSITORY https://github.com/mammasmias/IterativeRotationsAssignments.git
        GIT_TAG master
        GIT_SHALLOW 1
        )
      FetchContent_MakeAvailable( ira_git )
      
      # link it to your target:
      target_link_libraries( your-target PRIVATE ira::ira )
      
    • With add_subdirectory() as submodule as:

      # i.e. IRA is a submodule located in a directory 'path/to/IRA'
      add_subdirectory( path/to/IRA )
      
      # link it to your target:
      target_link_libraries( your-target PRIVATE ira::ira )
      
  • To include an already-compiled IRA somewhere on your machine:

    • With include() to locate the exported targets file. Note, IRA exports a config file during the build step, located in the binary/cmake/ira dir, and during install step located in the install/lib/cmake/ira dir.

      # using include() with the build directory:
      include( path-to-ira-build/cmake/ira/ira-BuildConfig.cmake )
      
      # using include() with the install directory:
      include( path-to-ira-install/lib/cmake/ira/ira-config.cmake )
      
      # link to your target:
      target_link_libraries( your-target PRIVATE ira::ira )
      
    • Or with find_package() as:

      # using find_package() with the build directory:
      find_package( ira HINTS path-to-ira-build/cmake/ira NAMES ira-Build )
      
      # using find_package() with the install directory:
      find_package( ira HINTS path-to-ira-install )
      
      # link to your target:
      target_link_libraries( your-target PRIVATE ira::ira )
      

A program compiled with gcc or gfortran can easily link the IRA library, as-is, by linking either the shared library libira.so, or the static version libira.a. They are both located in the lib/ directory after compilation. The module files are located in include/.

Example for fortran program:

gfortran -o caller_program.x caller_program.f90 -L/your/path/to/IRA/lib/ -lira -Wl,-rpath,/your/path/to/IRA/lib

The C-headers are located in the IRA/interface directory, and can be included in compilation by -I/your/path/to/IRA/interface.

When linking the static library libira.a to a C-program, you need to add the math (-lm), and fortran (-lgfortran, or equivalent) to the compilation:

gcc -I/your/path/IRA/interface -o c_prog.x c_prog.c -L/your/path/to/IRA/src -lira -Wl,-rpath,/your/path/to/IRA/src -lm -lgfortran

Tutorials and How-to

(under construction)