tatami_stats
Matrix statistics for tatami
tatami_stats::grouped_variances Namespace Reference

Functions for computing dimension-wise grouped variances.

Classes

struct  Options
 Grouped variance calculation options.
 

Functions

template<typename Value_ , typename Index_ , typename Group_ , typename Output_ >
void direct (const Value_ *ptr, Index_ num, const Group_ *group, size_t num_groups, const Index_ *group_size, Output_ *output_means, Output_ *output_variances, bool skip_nan, Index_ *valid_group_size)
 
template<typename Value_ , typename Index_ , typename Group_ , typename Output_ >
void direct (const Value_ *value, const Index_ *index, Index_ num_nonzero, const Group_ *group, size_t num_groups, const Index_ *group_size, Output_ *output_means, Output_ *output_variances, Index_ *output_nonzero, bool skip_nan, Index_ *valid_group_size)
 
template<typename Value_ , typename Index_ , typename Group_ , typename Output_ >
void apply (bool row, const tatami::Matrix< Value_, Index_ > *p, const Group_ *group, size_t num_groups, const Index_ *group_size, Output_ **output, const Options &sopt)
 
template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > by_row (const tatami::Matrix< Value_, Index_ > *p, const Group_ *group, const Options &sopt)
 
template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > by_row (const tatami::Matrix< Value_, Index_ > *p, const Group_ *group)
 
template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > by_column (const tatami::Matrix< Value_, Index_ > *p, const Group_ *group, const Options &sopt)
 
template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > by_column (const tatami::Matrix< Value_, Index_ > *p, const Group_ *group)
 

Detailed Description

Functions for computing dimension-wise grouped variances.

Function Documentation

◆ direct() [1/2]

template<typename Value_ , typename Index_ , typename Group_ , typename Output_ >
void tatami_stats::grouped_variances::direct ( const Value_ *  ptr,
Index_  num,
const Group_ *  group,
size_t  num_groups,
const Index_ *  group_size,
Output_ *  output_means,
Output_ *  output_variances,
bool  skip_nan,
Index_ *  valid_group_size 
)

Compute the per-group means and variances from a dense objective vector. This uses the standard two-pass algorithm with naive accumulation of the sum of squared differences; thus, it is best used with a sufficiently high-precision Output_ like double.
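As a reference for the two-pass calculation, the per-group quantities can be written as follows. This is a sketch that assumes the usual sample variance with an \(n_g - 1\) denominator, where \(n_g\) is the number of (non-NaN) values \(x_i\) assigned to group \(g\); the symbols \(\bar{x}_g\), \(s_g^2\) and \(n_g\) are introduced here for illustration and do not appear in the API.

\[
\bar{x}_g = \frac{1}{n_g} \sum_{i \,:\, \text{group}_i = g} x_i,
\qquad
s_g^2 = \frac{1}{n_g - 1} \sum_{i \,:\, \text{group}_i = g} \left( x_i - \bar{x}_g \right)^2
\]

The first pass computes \(\bar{x}_g\) and the second pass accumulates the squared differences from it. Under this assumption, \(\bar{x}_g\) is undefined for \(n_g = 0\) and \(s_g^2\) is undefined for \(n_g < 2\), which is consistent with the NaN outputs described below.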

Template Parameters
Value_  Type of the input data.
Index_  Integer type of the matrix indices.
Group_  Integer type of the group assignments.
Output_  Type of the output data.
Parameters
[in]  ptr  Pointer to an array of values of length num.
num  Length of the objective vector, i.e., length of the array at ptr.
[in]  group  Pointer to an array of length num, containing the group assignment for each entry of ptr. Entries of group should lie in \([0, N)\) where \(N\) is the number of unique groups.
num_groups  Number of groups, i.e., \(N\).
[in]  group_size  Pointer to an array of length num_groups, containing the size of each group. This can be obtained by calling tabulate_groups() on group.
[out]  output_means  Pointer to an array of length num_groups. This is filled with the per-group means on output. Values may be NaN if there are not enough (non-NaN) values in a group.
[out]  output_variances  Pointer to an array of length num_groups. This is filled with the per-group variances on output. Values may be NaN if there are not enough (non-NaN) values in a group.
skip_nan  See Options::skip_nan.
[out]  valid_group_size  Pointer to an array of length num_groups. This is used to store the number of non-NaN entries. Only used if skip_nan = true.
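
The dense calculation described above can be sketched in standalone C++ as follows. This is a minimal illustration rather than the library's implementation: the function name sketch_grouped_variances and its std::vector interface are hypothetical, the output type is fixed to double, and the sample variance with an (n - 1) denominator is assumed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of tatami_stats: two-pass per-group means and
// variances for a dense vector. Assumes an (n - 1) denominator for the variance.
template<typename Value_, typename Group_>
void sketch_grouped_variances(
    const std::vector<Value_>& values,
    const std::vector<Group_>& group,
    std::size_t num_groups,
    bool skip_nan,
    std::vector<double>& means,
    std::vector<double>& variances)
{
    assert(values.size() == group.size());
    const double nan = std::numeric_limits<double>::quiet_NaN();
    means.assign(num_groups, 0.0);
    variances.assign(num_groups, 0.0);
    std::vector<std::size_t> count(num_groups, 0); // (non-NaN) entries per group

    // First pass: per-group sums and counts, converted to means.
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double v = static_cast<double>(values[i]);
        if (skip_nan && std::isnan(v)) continue;
        const std::size_t g = static_cast<std::size_t>(group[i]);
        assert(g < num_groups);
        means[g] += v;
        ++count[g];
    }
    for (std::size_t g = 0; g < num_groups; ++g) {
        means[g] = count[g] > 0 ? means[g] / static_cast<double>(count[g]) : nan;
    }

    // Second pass: naive accumulation of squared differences from the mean.
    for (std::size_t i = 0; i < values.size(); ++i) {
        const double v = static_cast<double>(values[i]);
        if (skip_nan && std::isnan(v)) continue;
        const std::size_t g = static_cast<std::size_t>(group[i]);
        const double delta = v - means[g];
        variances[g] += delta * delta;
    }
    for (std::size_t g = 0; g < num_groups; ++g) {
        variances[g] = count[g] > 1 ? variances[g] / static_cast<double>(count[g] - 1) : nan;
    }
}
```

In this sketch, a group with fewer than two usable values yields a NaN variance, and an empty group also yields a NaN mean. With skip_nan = false, any NaN in a group propagates to that group's mean and variance.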

◆ direct() [2/2]

template<typename Value_ , typename Index_ , typename Group_ , typename Output_ >
void tatami_stats::grouped_variances::direct ( const Value_ *  value,
const Index_ *  index,
Index_  num_nonzero,
const Group_ *  group,
size_t  num_groups,
const Index_ *  group_size,
Output_ *  output_means,
Output_ *  output_variances,
Index_ *  output_nonzero,
bool  skip_nan,
Index_ *  valid_group_size 
)

Compute the per-group means and variances from a sparse objective vector. This uses the standard two-pass algorithm with naive accumulation of the sum of squared differences; thus, it is best used with a sufficiently high-precision Output_ like double.

Template Parameters
Value_  Type of the input data.
Index_  Integer type of the matrix indices.
Group_  Integer type of the group assignments.
Output_  Type of the output data.
Parameters
[in]  value  Pointer to an array of length num_nonzero, containing the values of the structural non-zeros.
[in]  index  Pointer to an array of length num_nonzero, containing the indices of the structural non-zeros. All indices should be non-negative and less than the length of the objective vector.
num_nonzero  Number of structural non-zeros.
[in]  group  Pointer to an array of length equal to the length of the objective vector, containing the group assignment for each vector element. Entries of group should lie in \([0, N)\) where \(N\) is the number of unique groups.
num_groups  Number of groups, i.e., \(N\).
[in]  group_size  Pointer to an array of length num_groups, containing the size of each group. This can be obtained by calling tabulate_groups() on group.
[out]  output_means  Pointer to an array of length num_groups. This is filled with the per-group means on output. Values may be NaN if there are not enough (non-NaN) values in a group.
[out]  output_variances  Pointer to an array of length num_groups. This is filled with the per-group variances on output. Values may be NaN if there are not enough (non-NaN) values in a group.
[out]  output_nonzero  Pointer to an array of length num_groups. On output, this is filled with the number of structural non-zeros in each group.
skip_nan  See Options::skip_nan.
[out]  valid_group_size  Pointer to an array of length num_groups. This is used to store the number of non-NaN entries. Only used if skip_nan = true.
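
One way to handle the sparse case is to loop over the structural non-zeros only and then account for the remaining zeros in closed form: each zero in a group contributes \((0 - \bar{x})^2 = \bar{x}^2\) to that group's sum of squared differences. The sketch below illustrates this under several assumptions: the function name sketch_sparse_grouped_variances and its std::vector interface are hypothetical, NaN handling is omitted, the (n - 1) denominator is assumed, and the library's internals may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of tatami_stats: per-group means and variances
// from the structural non-zeros of a sparse vector. NaN handling is omitted.
template<typename Value_, typename Index_, typename Group_>
void sketch_sparse_grouped_variances(
    const std::vector<Value_>& value,            // structural non-zero values
    const std::vector<Index_>& index,            // their positions in the full vector
    const std::vector<Group_>& group,            // group of each position in the full vector
    const std::vector<std::size_t>& group_size,  // total entries (zeros included) per group
    std::vector<double>& means,
    std::vector<double>& variances,
    std::vector<std::size_t>& nonzero)
{
    assert(value.size() == index.size());
    const std::size_t num_groups = group_size.size();
    const double nan = std::numeric_limits<double>::quiet_NaN();
    means.assign(num_groups, 0.0);
    variances.assign(num_groups, 0.0);
    nonzero.assign(num_groups, 0);

    // First pass: zeros add nothing to the sum, so only non-zeros are visited.
    for (std::size_t i = 0; i < value.size(); ++i) {
        const std::size_t pos = static_cast<std::size_t>(index[i]);
        assert(pos < group.size());
        const std::size_t g = static_cast<std::size_t>(group[pos]);
        assert(g < num_groups);
        means[g] += static_cast<double>(value[i]);
        ++nonzero[g];
    }
    for (std::size_t g = 0; g < num_groups; ++g) {
        means[g] = group_size[g] > 0 ? means[g] / static_cast<double>(group_size[g]) : nan;
    }

    // Second pass: squared differences for the non-zeros.
    for (std::size_t i = 0; i < value.size(); ++i) {
        const std::size_t g = static_cast<std::size_t>(group[static_cast<std::size_t>(index[i])]);
        const double delta = static_cast<double>(value[i]) - means[g];
        variances[g] += delta * delta;
    }

    // Each zero contributes mean^2; add these in bulk, then normalize.
    for (std::size_t g = 0; g < num_groups; ++g) {
        if (group_size[g] > 1) {
            const double num_zero = static_cast<double>(group_size[g] - nonzero[g]);
            variances[g] += num_zero * means[g] * means[g];
            variances[g] /= static_cast<double>(group_size[g] - 1);
        } else {
            variances[g] = nan;
        }
    }
}
```

The cost of this approach scales with the number of structural non-zeros rather than the length of the objective vector, apart from the per-group finishing loops.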

◆ apply()

template<typename Value_ , typename Index_ , typename Group_ , typename Output_ >
void tatami_stats::grouped_variances::apply ( bool  row,
const tatami::Matrix< Value_, Index_ > *  p,
const Group_ *  group,
size_t  num_groups,
const Index_ *  group_size,
Output_ **  output,
const Options &  sopt 
)

Compute per-group variances for each element of a chosen dimension of a tatami::Matrix.

Template Parameters
Value_  Type of the matrix value, should be numeric.
Index_  Type of the row/column indices.
Group_  Type of the group assignments for each column (if row = true) or row (otherwise).
Output_  Type of the output value. This should be a floating-point type, as variances are generally fractional.
Parameters
row  Whether to compute variances for the rows.
p  Pointer to a tatami::Matrix.
[in]  group  Pointer to an array of length equal to the number of columns (if row = true) or rows (otherwise). Each value should be an integer that specifies the group assignment. Values should lie in \([0, N)\) where \(N\) is the number of unique groups.
num_groups  Number of groups, i.e., \(N\). This can be determined by calling tatami_stats::total_groups() on group.
[in]  group_size  Pointer to an array of length num_groups, containing the size of each group.
[out]  output  Pointer to an array of pointers of length equal to the number of groups. Each inner pointer should reference an array of length equal to the number of rows (if row = true) or columns (otherwise). On output, this will contain the row/column variances for each group (indexed according to the assignment in group).
sopt  Variance calculation options.
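
The layout of output (one array per group, each indexed by row or column) can be illustrated with a standalone sketch. To stay self-contained, it uses a plain row-major std::vector in place of a tatami::Matrix and only covers the row = true case; sketch_apply_rows and sketch_by_row are hypothetical names, NaN handling is omitted, the number of groups is taken as the maximum group index plus one, and the (n - 1) denominator is assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical stand-in for the row = true case: 'mat' is a row-major array of
// nrow * ncol values instead of a tatami::Matrix. On return, output[g][r] holds
// the variance of row r across the columns assigned to group g.
inline void sketch_apply_rows(
    const std::vector<double>& mat, std::size_t nrow, std::size_t ncol,
    const std::vector<int>& group, std::size_t num_groups,
    const std::vector<std::size_t>& group_size,
    double** output)
{
    assert(mat.size() == nrow * ncol);
    assert(group.size() == ncol);
    const double nan = std::numeric_limits<double>::quiet_NaN();
    std::vector<double> mean(num_groups), ssq(num_groups);

    for (std::size_t r = 0; r < nrow; ++r) {
        const double* row = mat.data() + r * ncol;
        std::fill(mean.begin(), mean.end(), 0.0);
        std::fill(ssq.begin(), ssq.end(), 0.0);

        // First pass: per-group means for this row.
        for (std::size_t c = 0; c < ncol; ++c) {
            mean[static_cast<std::size_t>(group[c])] += row[c];
        }
        for (std::size_t g = 0; g < num_groups; ++g) {
            mean[g] = group_size[g] > 0 ? mean[g] / static_cast<double>(group_size[g]) : nan;
        }

        // Second pass: per-group sums of squared differences.
        for (std::size_t c = 0; c < ncol; ++c) {
            const std::size_t g = static_cast<std::size_t>(group[c]);
            const double delta = row[c] - mean[g];
            ssq[g] += delta * delta;
        }
        for (std::size_t g = 0; g < num_groups; ++g) {
            output[g][r] = group_size[g] > 1 ? ssq[g] / static_cast<double>(group_size[g] - 1) : nan;
        }
    }
}

// Hypothetical convenience wrapper that allocates one vector per group and
// passes their data pointers as the array of output pointers.
inline std::vector<std::vector<double>> sketch_by_row(
    const std::vector<double>& mat, std::size_t nrow, std::size_t ncol,
    const std::vector<int>& group)
{
    // Assumption: number of groups is max(group) + 1; sizes are simple counts.
    const std::size_t num_groups = group.empty()
        ? 0 : static_cast<std::size_t>(*std::max_element(group.begin(), group.end())) + 1;
    std::vector<std::size_t> group_size(num_groups, 0);
    for (int g : group) ++group_size[static_cast<std::size_t>(g)];

    std::vector<std::vector<double>> result(num_groups, std::vector<double>(nrow));
    std::vector<double*> ptrs;
    for (auto& v : result) ptrs.push_back(v.data());
    sketch_apply_rows(mat, nrow, ncol, group, num_groups, group_size, ptrs.data());
    return result;
}
```

The wrapper shows why the array-of-pointers interface is convenient: the caller owns the storage (here a vector of vectors) and only hands over raw pointers, so results can be written directly into any preallocated buffers.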

◆ by_row() [1/2]

template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > tatami_stats::grouped_variances::by_row ( const tatami::Matrix< Value_, Index_ > *  p,
const Group_ *  group,
const Options &  sopt 
)

Wrapper around apply() for row-wise grouped variances.

Template Parameters
Output_  Type of the output.
Value_  Type of the matrix value.
Index_  Type of the row/column indices.
Group_  Type of the group assignments for each column.
Parameters
p  Pointer to a tatami::Matrix.
[in]  group  Pointer to an array of length equal to the number of columns. Each value should be an integer that specifies the group assignment. Values should lie in \([0, N)\) where \(N\) is the number of unique groups.
sopt  Variance calculation options.
Returns
Vector of length equal to the number of groups. Each entry is a vector of length equal to the number of rows, containing the row-wise variances for the corresponding group.

◆ by_row() [2/2]

template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > tatami_stats::grouped_variances::by_row ( const tatami::Matrix< Value_, Index_ > *  p,
const Group_ *  group 
)

Overload with default options.

Template Parameters
Output_  Type of the output.
Value_  Type of the matrix value.
Index_  Type of the row/column indices.
Group_  Type of the group assignments for each column.
Parameters
p  Pointer to a tatami::Matrix.
[in]  group  Pointer to an array of length equal to the number of columns. Each value should be an integer that specifies the group assignment. Values should lie in \([0, N)\) where \(N\) is the number of unique groups.
Returns
Vector of length equal to the number of groups. Each entry is a vector of length equal to the number of rows, containing the row-wise variances for the corresponding group.

◆ by_column() [1/2]

template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > tatami_stats::grouped_variances::by_column ( const tatami::Matrix< Value_, Index_ > *  p,
const Group_ *  group,
const Options &  sopt 
)

Wrapper around apply() for column-wise grouped variances.

Template Parameters
Output_  Type of the output.
Value_  Type of the matrix value.
Index_  Type of the row/column indices.
Group_  Type of the group assignments for each row.
Parameters
p  Pointer to a tatami::Matrix.
[in]  group  Pointer to an array of length equal to the number of rows. Each value should be an integer that specifies the group assignment. Values should lie in \([0, N)\) where \(N\) is the number of unique groups.
sopt  Variance calculation options.
Returns
Vector of length equal to the number of groups. Each entry is a vector of length equal to the number of columns, containing the column-wise variances for the corresponding group.

◆ by_column() [2/2]

template<typename Output_ = double, typename Value_ , typename Index_ , typename Group_ >
std::vector< std::vector< Output_ > > tatami_stats::grouped_variances::by_column ( const tatami::Matrix< Value_, Index_ > *  p,
const Group_ *  group 
)

Overload with default options.

Template Parameters
Output_  Type of the output.
Value_  Type of the matrix value.
Index_  Type of the row/column indices.
Group_  Type of the group assignments for each row.
Parameters
p  Pointer to a tatami::Matrix.
[in]  group  Pointer to an array of length equal to the number of rows. Each value should be an integer that specifies the group assignment. Values should lie in \([0, N)\) where \(N\) is the number of unique groups.
Returns
Vector of length equal to the number of groups. Each entry is a vector of length equal to the number of columns, containing the column-wise variances for the corresponding group.