template<typename Id_, typename Index_, class Slab_, bool track_reuse_ = false>
class tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >
Oracle-aware cache for slabs.
- Template Parameters
Id_ | Type of slab identifier, typically integer. |
Index_ | Type of row/column index produced by the oracle. |
Slab_ | Class for a single slab. |
track_reuse_ | Whether to track slabs in the cache that are re-used. |
Implements an oracle-aware cache for slabs. Each slab is defined as the set of chunks required to read an element of the target dimension (or a contiguous block/indexed subset thereof) from a tatami::Matrix. This cache can be used for Matrix representations where the data is costly to load (e.g., from file) and a tatami::Oracle is provided to predict future accesses on the target dimension. In such cases, chunks of data can be loaded and cached so that any future request for an already-loaded slab is served directly from the cache.
It is assumed that each slab has the same size, so that Slab_ instances can be reused between slabs without reallocating memory. For variable-sized slabs, consider using OracularVariableSlabCache instead.
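The fixed-size assumption can be illustrated with a minimal sketch. The `DenseSlab` type and the `refill` helper below are hypothetical and not part of tatami_chunked; they only show why a buffer sized once for a full slab can be recycled for a different slab without touching the allocator.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical slab type: a dense buffer sized once for a full slab.
struct DenseSlab {
    std::vector<double> values;
    explicit DenseSlab(std::size_t capacity) : values(capacity) {}
};

// Overwrite a slab's contents in place; the buffer is never resized.
inline void refill(DenseSlab& slab, double fill) {
    for (auto& v : slab.values) {
        v = fill;
    }
}

// Returns true if recycling the slab kept the same allocation.
inline bool reuse_keeps_allocation() {
    DenseSlab slab(10 * 50); // assumed shape: 10 rows per slab, 50 columns
    const double* before = slab.values.data();
    refill(slab, 1.0); // contents for one slab
    refill(slab, 2.0); // same object recycled for another slab
    return before == slab.values.data() && slab.values.back() == 2.0;
}
```

If slabs differed in size, the second fill could need a larger buffer and trigger a reallocation, which is the case OracularVariableSlabCache is meant for.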
template<typename Id_, typename Index_, class Slab_, bool track_reuse_ = false>
template<class Ifunction_, class Cfunction_, class Pfunction_>
std::pair< const Slab_ *, Index_ > tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::next(Ifunction_ identify, Cfunction_ create, Pfunction_ populate)
inline
Fetch the next slab according to the stream of predictions provided by the tatami::Oracle. This method should only be called if num_slabs > 0 in the constructor; otherwise, no slabs are available and none can be returned.
- Template Parameters
Ifunction_ | Function to identify the slab containing each predicted row/column. |
Cfunction_ | Function to create a new slab. |
Pfunction_ | Function to populate zero, one or more slabs with their contents. |
- Parameters
identify | Function that accepts i, an Index_ containing the predicted index of a single element on the target dimension. This should return a pair containing:
- An Id_, the identifier of the slab containing i. This is typically defined as the index of the slab on the target dimension. For example, if each chunk takes up 10 rows, attempting to access row 21 would require retrieval of slab 2.
- An Index_, the index of row/column i inside that slab. For example, if each chunk takes up 10 rows, attempting to access row 21 would yield an offset of 1.
|
create | Function that accepts no arguments and returns a Slab_ object with sufficient memory to hold a slab's contents when used in populate(). This may also return a default-constructed Slab_ object if the allocation is done dynamically per slab in populate(). |
populate | Function whose arguments depend on track_reuse_. The return value is ignored.
- If track_reuse_ = false, this function accepts a single std::vector<std::pair<Id_, Slab_*> >& specifying the slabs to be populated. The first Id_ element of each pair contains the slab identifier, i.e., the first element returned by the identify function. The second Slab_* element contains a pointer to a Slab_ returned by create(). This function should iterate over the vector and populate each slab. The vector is guaranteed to be non-empty but is not guaranteed to be sorted.
- If track_reuse_ = true, this function accepts two std::vector<std::pair<Id_, Slab_*> >& arguments. The first vector (to_populate) specifies the slabs to be populated, identical to the sole expected argument when track_reuse_ = false. The second vector (to_reuse) specifies the existing slabs in the cache to be reused. This function should iterate over to_populate and populate each slab. The function may also iterate over to_reuse to perform some housekeeping on the existing slabs (e.g., defragmentation). to_populate is guaranteed to be non-empty but to_reuse is not. Neither vector is guaranteed to be sorted.
|
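The three callbacks described above might look like the following sketch for the track_reuse_ = false case. The type aliases, the sizes (`chunk_length`, `num_columns`), the `Slab` layout and the stand-in "load" that writes `row * 100 + column` are all assumptions made for illustration; a real populate() would read the chunks from file or another expensive source.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical types and sizes, for illustration only.
using Id = int;
using Index = int;
constexpr Index chunk_length = 10;       // rows per slab (assumed)
constexpr std::size_t num_columns = 4;   // columns per row (assumed)

struct Slab {
    std::vector<double> values;          // row-major block of rows
};

// identify: map a predicted row to (slab identifier, offset inside the slab).
inline std::pair<Id, Index> identify(Index i) {
    return std::make_pair(static_cast<Id>(i / chunk_length), i % chunk_length);
}

// create: allocate enough memory up front for one full slab.
inline Slab create() {
    Slab s;
    s.values.resize(static_cast<std::size_t>(chunk_length) * num_columns);
    return s;
}

// populate (track_reuse_ = false): fill every requested slab.
// The vector may be in any order, so each entry is handled independently.
inline void populate(std::vector<std::pair<Id, Slab*> >& to_populate) {
    for (auto& p : to_populate) {
        Index first_row = p.first * chunk_length;
        for (Index r = 0; r < chunk_length; ++r) {
            for (std::size_t c = 0; c < num_columns; ++c) {
                // Stand-in for an expensive load: encode (row, column) in the value.
                p.second->values[static_cast<std::size_t>(r) * num_columns + c] =
                    static_cast<double>((first_row + r) * 100) + static_cast<double>(c);
            }
        }
    }
}
```

Given an already-constructed cache object (here called `cache`, a placeholder name), these would be passed as `cache.next(identify, create, populate)`. With track_reuse_ = true, populate would take a second vector of the same type holding the slabs being reused.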
- Returns
- Pair containing (1) a pointer to a slab's contents and (2) the index of the next predicted row/column inside the retrieved slab.
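As a rough sketch of how the returned pair might be consumed, the helper below locates the requested row inside a slab. `RowSlab` and `row_pointer` are hypothetical names, and the row-major layout is an assumption; the actual layout depends entirely on how the caller's Slab_ class and populate() function store the data.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical slab: a row-major block of consecutive rows.
struct RowSlab {
    std::vector<double> values;
};

// Given the (slab pointer, offset) pair in the shape returned by next(),
// return a pointer to the start of the requested row inside the slab.
inline const double* row_pointer(const std::pair<const RowSlab*, int>& fetched,
                                 std::size_t num_columns) {
    return fetched.first->values.data() +
           static_cast<std::size_t>(fetched.second) * num_columns;
}
```

The second element of the pair is what makes this lookup possible without calling identify() again for the same prediction.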