tatami_chunked
Helpers to create custom chunked tatami matrices
tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ > Class Template Reference

Oracular-aware cache for slabs.

#include <OracularSlabCache.hpp>

Public Member Functions

 OracularSlabCache (std::shared_ptr< const tatami::Oracle< Index_ > > oracle, size_t max_slabs)
 
 OracularSlabCache (const OracularSlabCache &)=delete
 
OracularSlabCache & operator= (const OracularSlabCache &)=delete
 
Index_ next ()
 
template<class Ifunction_ , class Cfunction_ , class Pfunction_ >
std::pair< const Slab_ *, Index_ > next (Ifunction_ identify, Cfunction_ create, Pfunction_ populate)
 
size_t get_max_slabs () const
 
size_t get_num_slabs () const
 

Detailed Description

template<typename Id_, typename Index_, class Slab_, bool track_reuse_ = false>
class tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >

Oracular-aware cache for slabs.

Template Parameters
Id_: Type of slab identifier, typically integer.
Index_: Type of row/column index produced by the oracle.
Slab_: Class for a single slab.
track_reuse_: Whether to track slabs in the cache that are re-used.

Implements an oracle-aware cache for slabs. Each slab is defined as the set of chunks required to read an element of the target dimension (or a contiguous block/indexed subset thereof) from a tatami::Matrix. This cache can be used for Matrix representations where the data is costly to load (e.g., from file) and a tatami::Oracle is provided to predict future accesses on the target dimension. In such cases, chunks of data can be loaded and cached so that any future request for an already-loaded slab is served from the cache.

It is assumed that each slab has the same size such that Slab_ instances can be effectively reused between slabs without requiring any reallocation of memory. For variable-sized slabs, consider using OracularVariableSlabCache instead.
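
The benefit of knowing future accesses can be illustrated with a toy model. The sketch below is not the library's implementation: count_slab_loads and its look-ahead policy are hypothetical, and it assumes non-negative row indices, equally sized chunks and a cache that keeps only the slabs needed by the current batch of predictions. It counts how many slabs would have to be loaded for a known sequence of row accesses.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical, not part of tatami_chunked): counts the slab loads
// needed to serve a known sequence of non-negative row indices.
inline std::size_t count_slab_loads(const std::vector<int>& predictions, int chunk_length, std::size_t max_slabs) {
    assert(chunk_length > 0 && max_slabs > 0);
    std::vector<int> cached; // identifiers of the slabs currently held
    std::size_t loads = 0;
    std::size_t pos = 0;

    while (pos < predictions.size()) {
        // Look ahead and collect up to max_slabs distinct slabs needed next.
        std::vector<int> needed;
        std::size_t end = pos;
        while (end < predictions.size()) {
            int id = predictions[end] / chunk_length;
            if (std::find(needed.begin(), needed.end(), id) == needed.end()) {
                if (needed.size() == max_slabs) {
                    break; // one more distinct slab would overflow the cache
                }
                needed.push_back(id);
            }
            ++end;
        }

        // Only slabs that are not already cached have to be loaded.
        for (int id : needed) {
            if (std::find(cached.begin(), cached.end(), id) == cached.end()) {
                ++loads;
            }
        }

        cached = needed; // slabs not needed in this batch are dropped
        pos = end;       // predictions in [pos, end) are now served from the cache
    }
    return loads;
}
```

With 10 rows per chunk, the access sequence 0, 1, 11, 2, 12, 25, 3 touches slabs 0, 0, 1, 0, 1, 2, 0. Under this toy policy, a cache of two slabs needs three loads, whereas a cache of one slab needs six.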

Constructor & Destructor Documentation

◆ OracularSlabCache() [1/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>
tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::OracularSlabCache ( std::shared_ptr< const tatami::Oracle< Index_ > >  oracle,
size_t  max_slabs 
)
inline
Parameters
oracle: Pointer to a tatami::Oracle to be used for predictions.
max_slabs: Maximum number of slabs to store in the cache.

◆ OracularSlabCache() [2/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>
tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::OracularSlabCache ( const OracularSlabCache< Id_, Index_, Slab_, track_reuse_ > &  )
delete

Deleted as the cache holds persistent pointers.

Member Function Documentation

◆ operator=()

Deleted as the cache holds persistent pointers.

◆ next() [1/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>
Index_ tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::next ( )
inline

This method is intended to be called when max_slabs = 0, to provide callers with the oracle predictions for non-cached extraction of data. Calls to this method should not be intermingled with calls to its overload below; the latter should only be called when max_slabs > 0.

Returns
The next prediction from the oracle.

◆ next() [2/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>
std::pair< const Slab_ *, Index_ > tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::next ( Ifunction_  identify,
Cfunction_  create,
Pfunction_  populate 
)
inline

Fetch the next slab according to the stream of predictions provided by the tatami::Oracle. This method should only be called if max_slabs > 0 in the constructor; otherwise, no slabs are available to be returned.

Template Parameters
Ifunction_: Function to identify the slab containing each predicted row/column.
Cfunction_: Function to create a new slab.
Pfunction_: Function to populate zero, one or more slabs with their contents.
Parameters
identify: Function that accepts i, an Index_ containing the predicted index of a single element on the target dimension. This should return a pair containing:
  1. An Id_, the identifier of the slab containing i. This is typically defined as the index of the slab on the target dimension. For example, if each chunk takes up 10 rows, attempting to access row 21 would require retrieval of slab 2.
  2. An Index_, the index of row/column i inside that slab. For example, if each chunk takes up 10 rows, attempting to access row 21 would yield an offset of 1.
create: Function that accepts no arguments and returns a Slab_ object with sufficient memory to hold a slab's contents when used in populate(). This may also return a default-constructed Slab_ object if the allocation is done dynamically per slab in populate().
populate: Function where the arguments depend on track_reuse_. The return value is ignored.
  • If track_reuse_ = false, this function accepts a single std::vector<std::pair<Id_, Slab_*> >& specifying the slabs to be populated. The first Id_ element of each pair contains the slab identifier, i.e., the first element returned by the identify function. The second Slab_* element contains a pointer to a Slab_ returned by create(). This function should iterate over the vector and populate each slab. The vector is guaranteed to be non-empty but is not guaranteed to be sorted.
  • If track_reuse_ = true, this function accepts two std::vector<std::pair<Id_, Slab_*> >& arguments. The first vector (to_populate) specifies the slabs to be populated, identical to the sole expected argument when track_reuse_ = false. The second vector (to_reuse) specifies the existing slabs in the cache to be reused. This function should iterate over to_populate and populate each slab. The function may also iterate over to_reuse to perform some housekeeping on the existing slabs (e.g., defragmentation). to_populate is guaranteed to be non-empty but to_reuse is not. Neither vector is guaranteed to be sorted.
Returns
Pair containing (1) a pointer to a slab's contents and (2) the index of the next predicted row/column inside the retrieved slab.
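
The three callbacks described above could look roughly as follows. This is a minimal sketch under stated assumptions: DenseSlab, Id, Index, chunk_length, num_columns and populate_with_reuse are hypothetical names chosen for illustration, chunks are assumed to span 10 rows as in the example above, and the populate functions write placeholder values where a real implementation would read from file.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical slab type: a dense buffer holding one block of rows.
struct DenseSlab {
    std::vector<double> values;
};

using Id = int;    // stands in for Id_
using Index = int; // stands in for Index_

constexpr Index chunk_length = 10;     // assumed: each chunk spans 10 rows
constexpr std::size_t num_columns = 4; // assumed width of the matrix

// identify: maps a predicted row to (slab identifier, offset inside the slab).
inline std::pair<Id, Index> identify(Index i) {
    return std::make_pair(static_cast<Id>(i / chunk_length), i % chunk_length);
}

// create: allocates a slab large enough to hold one block of rows.
inline DenseSlab create() {
    DenseSlab slab;
    slab.values.resize(static_cast<std::size_t>(chunk_length) * num_columns);
    return slab;
}

// populate for track_reuse_ = false: fills every requested slab.
// Placeholder contents: each value is set to the slab identifier.
inline void populate(std::vector<std::pair<Id, DenseSlab*> >& to_populate) {
    for (auto& p : to_populate) {
        assert(p.second != nullptr);
        for (auto& v : p.second->values) {
            v = static_cast<double>(p.first);
        }
    }
}

// populate for track_reuse_ = true: same as above, with an extra vector of
// reused slabs that this sketch leaves untouched (no housekeeping needed).
inline void populate_with_reuse(
    std::vector<std::pair<Id, DenseSlab*> >& to_populate,
    std::vector<std::pair<Id, DenseSlab*> >& to_reuse)
{
    populate(to_populate);
    (void) to_reuse;
}
```

Given a cache constructed with max_slabs > 0 and track_reuse_ = false, these would be passed as next(identify, create, populate), following the signature above. Here identify(21) yields slab 2 with an offset of 1, matching the example in the parameter description.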

◆ get_max_slabs()

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>
size_t tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::get_max_slabs ( ) const
inline
Returns
Maximum number of slabs in the cache.

◆ get_num_slabs()

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>
size_t tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::get_num_slabs ( ) const
inline
Returns
Number of slabs currently in the cache.

The documentation for this class was generated from the following file: