tatami_stats
Matrix statistics for tatami
This library contains helper functions to compute matrix statistics for tatami, loosely inspired by what the matrixStats R package does for R matrices. It currently computes sums, variances, medians, ranges, etc. along either dimension. Each function automatically chooses the most efficient algorithm based on the matrix properties (e.g., sparsity, preferred access). Calculations can be parallelized across dimension elements via tatami::parallelize(). If NaNs are present, they can be treated as missing and skipped. Low-level utilities for each algorithm are also exported for developer convenience.
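To make the NaN-skipping behaviour concrete, here is a minimal standalone sketch that sums each row of a dense row-major array. It deliberately avoids the tatami_stats API: row_sums() and its arguments are hypothetical names chosen for illustration, and the library's own implementation may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of tatami_stats: sums each row of a dense
// row-major matrix, optionally treating NaNs as missing values.
std::vector<double> row_sums(const std::vector<double>& values,
                             std::size_t nrow,
                             std::size_t ncol,
                             bool skip_nan) {
    assert(values.size() == nrow * ncol);
    std::vector<double> output(nrow);
    for (std::size_t r = 0; r < nrow; ++r) {
        double total = 0;
        for (std::size_t c = 0; c < ncol; ++c) {
            double x = values[r * ncol + c];
            if (skip_nan && std::isnan(x)) {
                continue; // missing value: leave it out of the sum
            }
            total += x;
        }
        output[r] = total;
    }
    return output;
}
```

With skip_nan set to false, a single NaN in a row propagates and the sum for that row is NaN; with skip_nan set to true, the sum is taken over the remaining values only.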
tatami_stats is a header-only library, so it can be easily used by just #include-ing the relevant source files:
Check out the API documentation for more details.
The apply() function in each statistic's namespace offers more control over the calculation of each statistic. For example, instead of creating a new vector, we can fill an existing array with the sums:
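As a rough illustration of the fill-an-existing-array pattern, the sketch below writes sums into a caller-supplied buffer for either dimension. It does not call tatami_stats: fill_sums() and its signature are hypothetical and only mimic the general idea of an apply()-style interface that takes a dimension flag and an output pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an apply()-style interface (not the library's
// actual signature). Sums a dense row-major matrix along rows or columns
// and writes the results into a buffer owned by the caller.
void fill_sums(bool by_row,
               const std::vector<double>& values,
               std::size_t nrow,
               std::size_t ncol,
               double* output) {
    assert(values.size() == nrow * ncol);
    std::size_t nout = by_row ? nrow : ncol;
    for (std::size_t i = 0; i < nout; ++i) {
        output[i] = 0; // overwrite whatever the caller left in the buffer
    }
    for (std::size_t r = 0; r < nrow; ++r) {
        for (std::size_t c = 0; c < ncol; ++c) {
            output[by_row ? r : c] += values[r * ncol + c];
        }
    }
}
```

The caller decides where the results live, so the same buffer can be reused across calls or can point into a larger pre-allocated structure, avoiding an extra allocation and copy.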
Some of the algorithms expose low-level functions for even more fine-grained control. For example, we can manage the loop over the matrix rows ourselves, computing the mean and median for each row:
These low-level functions allow developers to compute multiple statistics with a single pass through the matrix. In contrast, calling tatami_stats::sums::by_row and tatami_stats::medians::by_row separately would extract data from the matrix twice, which may be expensive for file-backed matrices.
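The single-pass idea can be sketched without tatami at all. In the example below, each row is copied into a buffer exactly once, and both the mean and the median are computed from that one copy. The names median_in_place() and row_means_and_medians() are hypothetical, the matrix is assumed to be a dense row-major array with no NaNs, and the copy stands in for what would be a (possibly expensive) row extraction from a real matrix.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: median of buffer[0..n), reordering the buffer in place.
double median_in_place(double* buffer, std::size_t n) {
    assert(n > 0);
    std::size_t half = n / 2;
    std::nth_element(buffer, buffer + half, buffer + n);
    double upper = buffer[half];
    if (n % 2 == 1) {
        return upper;
    }
    // For even n, the lower middle value is the largest element to the left
    // of the partition point.
    double lower = *std::max_element(buffer, buffer + half);
    return (lower + upper) / 2;
}

// Hypothetical helper: per-row mean and median of a dense row-major matrix,
// reading each row only once.
void row_means_and_medians(const std::vector<double>& values,
                           std::size_t nrow,
                           std::size_t ncol,
                           std::vector<double>& means,
                           std::vector<double>& medians) {
    assert(values.size() == nrow * ncol);
    assert(ncol > 0);
    means.assign(nrow, 0);
    medians.assign(nrow, 0);
    std::vector<double> buffer(ncol);
    for (std::size_t r = 0; r < nrow; ++r) {
        // The only read of this row from the matrix.
        std::copy_n(values.data() + r * ncol, ncol, buffer.begin());
        double total = 0;
        for (double x : buffer) {
            total += x;
        }
        means[r] = total / static_cast<double>(ncol);
        // Reorders the buffer, which is fine because the sum is already done.
        medians[r] = median_in_place(buffer.data(), ncol);
    }
}
```

The median step mutates the buffer, which is why the copy is needed and why it is convenient to compute order-independent statistics such as the sum from the same copy.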
FetchContent
If you're using CMake, you just need to add something like this to your CMakeLists.txt:
Then you can link to tatami_stats to make the headers available during compilation:
find_package()
You can install the library by cloning a suitable version of this repository and running the following commands:
Then you can use find_package() as usual:
If you're not using CMake, the simple approach is to just copy the files in the include/ subdirectory - either directly or with Git submodules - and include their path during compilation with, e.g., GCC's -I. You'll need to include the transitive dependencies yourself; check out extern/CMakeLists.txt for a list.