Oracle-aware cache for slabs.

#include <OracularSlabCache.hpp>

Public Member Functions
	OracularSlabCache (std::shared_ptr< const tatami::Oracle< Index_ > > oracle, size_t max_slabs)

	OracularSlabCache (const OracularSlabCache &)=delete

OracularSlabCache &	operator= (const OracularSlabCache &)=delete

Index_	next ()

template<class Ifunction_ , class Cfunction_ , class Pfunction_ >
std::pair< const Slab_ *, Index_ >	next (Ifunction_ identify, Cfunction_ create, Pfunction_ populate)

size_t	get_max_slabs () const

size_t	get_num_slabs () const

Detailed Description

template<typename Id_, typename Index_, class Slab_, bool track_reuse_ = false>
class tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >

Oracle-aware cache for slabs.

Template Parameters

Id_	Type of slab identifier, typically integer.
Index_	Type of row/column index produced by the oracle.
Slab_	Class for a single slab.
track_reuse_	Whether to track slabs in the cache that are re-used.

Implements an oracle-aware cache for slabs. Each slab is defined as the set of chunks required to read an element of the target dimension (or a contiguous block/indexed subset thereof) from a tatami::Matrix. This cache can be used for Matrix representations where the data is costly to load (e.g., from file) and a tatami::Oracle is provided to predict future accesses on the target dimension. In such cases, chunks of data can be loaded and cached so that any future request for an already-loaded slab is served directly from the cache.

It is assumed that each slab has the same size such that Slab_ instances can be effectively reused between slabs without requiring any reallocation of memory. For variable-sized slabs, consider using OracularVariableSlabCache instead.
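To illustrate why known future accesses help, the following toy model counts how many slab loads are needed to serve a fixed sequence of predictions. It is only a sketch under stated assumptions and is not the library's implementation: `count_slab_loads` is a hypothetical helper, slabs are identified by integer division of a non-negative row index by a hypothetical `chunk_length`, and the look-ahead strategy (gather distinct slabs until the cache is full, then reuse whatever is already held) is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <utility>
#include <vector>

// Toy model (not the library's code): count the slab loads needed to serve
// 'predictions' when at most 'max_slabs' slabs can be held at once.
inline std::size_t count_slab_loads(const std::vector<int>& predictions,
                                    int chunk_length,
                                    std::size_t max_slabs) {
    assert(chunk_length > 0 && max_slabs > 0);
    std::unordered_set<int> cached; // ids of the slabs currently held
    std::size_t loads = 0, pos = 0;

    while (pos < predictions.size()) {
        // Look ahead: gather distinct slab ids until the cache would overflow.
        std::unordered_set<int> needed;
        std::size_t end = pos;
        while (end < predictions.size()) {
            int id = predictions[end] / chunk_length;
            if (needed.count(id) == 0) {
                if (needed.size() == max_slabs) {
                    break;
                }
                needed.insert(id);
            }
            ++end;
        }

        // Slabs that are already cached are reused; only the rest are loaded.
        for (int id : needed) {
            if (cached.count(id) == 0) {
                ++loads;
            }
        }

        cached = std::move(needed);
        pos = end;
    }
    return loads;
}
```

In this model, a larger `max_slabs` lets more predictions be served per batch, so a slab that is revisited later in the sequence is less likely to be loaded twice.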

Constructor & Destructor Documentation

◆ OracularSlabCache() [1/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::OracularSlabCache	(	std::shared_ptr< const tatami::Oracle< Index_ > >	oracle,
		size_t	max_slabs
	)

inline

Parameters

oracle	Pointer to a `tatami::Oracle` to be used for predictions.
max_slabs	Maximum number of slabs to store in the cache.

◆ OracularSlabCache() [2/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::OracularSlabCache ( const OracularSlabCache< Id_, Index_, Slab_, track_reuse_ > & )

delete

Deleted as the cache holds persistent pointers.

Member Function Documentation

◆ operator=()

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

OracularSlabCache & tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::operator= ( const OracularSlabCache< Id_, Index_, Slab_, track_reuse_ > & )

delete

Deleted as the cache holds persistent pointers.

◆ next() [1/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

Index_ tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::next ( )

inline

This method is intended to be called when max_slabs = 0, to provide callers with the oracle predictions for non-cached extraction of data. Calls to this method should not be intermingled with calls to its overload below; the latter should only be called when max_slabs > 0.

Returns: The next prediction from the oracle.

◆ next() [2/2]

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

template<class Ifunction_ , class Cfunction_ , class Pfunction_ >

std::pair< const Slab_ *, Index_ > tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::next	(	Ifunction_	identify,
		Cfunction_	create,
		Pfunction_	populate
	)

inline

Fetch the next slab according to the stream of predictions provided by the tatami::Oracle. This method should only be called if max_slabs > 0 in the constructor; otherwise, no slabs are available to be returned.

Template Parameters

Ifunction_	Function to identify the slab containing each predicted row/column.
Cfunction_	Function to create a new slab.
Pfunction_	Function to populate zero, one or more slabs with their contents.

Parameters

identify	Function that accepts `i`, an `Index_` containing the predicted index of a single element on the target dimension. This should return a pair containing: (1) an `Id_`, the identifier of the slab containing `i`, typically defined as the index of the slab on the target dimension; for example, if each chunk takes up 10 rows, accessing row 21 would require retrieval of slab 2. (2) An `Index_`, the index of row/column `i` inside that slab; for example, if each chunk takes up 10 rows, accessing row 21 would yield an offset of 1.
create	Function that accepts no arguments and returns a `Slab_` object with sufficient memory to hold a slab's contents when used in `populate()`. This may also return a default-constructed `Slab_` object if the allocation is done dynamically per slab in `populate()`.
populate	Function whose arguments depend on `track_reuse_`. The return value is ignored. If `track_reuse_ = false`, this function accepts a single `std::vector<std::pair<Id_, Slab_*> >&` specifying the slabs to be populated. The first `Id_` element of each pair contains the slab identifier, i.e., the first element returned by the `identify` function. The second `Slab_*` element is a pointer to a `Slab_` returned by `create()`. This function should iterate over the vector and populate each slab. The vector is guaranteed to be non-empty but is not guaranteed to be sorted. If `track_reuse_ = true`, this function accepts two `std::vector<std::pair<Id_, Slab_*> >&` arguments. The first vector (`to_populate`) specifies the slabs to be populated, identical to the sole expected argument when `track_reuse_ = false`. The second vector (`to_reuse`) specifies the existing slabs in the cache to be reused. This function should iterate over `to_populate` and populate each slab. The function may also iterate over `to_reuse` to perform some housekeeping on the existing slabs (e.g., defragmentation). `to_populate` is guaranteed to be non-empty but `to_reuse` may be empty. Neither vector is guaranteed to be sorted.
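The three callbacks might look like the sketch below for the `track_reuse_ = false` case. Everything here is hypothetical and for illustration only: `ToySlab`, `chunk_length`, `ncol` and the synthetic fill values are assumptions, with `Id_` and `Index_` both taken to be `int`, and the inner loop of `populate` stands in for whatever costly read (e.g., from file) a real matrix would perform.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical slab type: a row-major buffer of 'chunk_length * ncol' values.
struct ToySlab {
    std::vector<double> values;
};

constexpr int chunk_length = 10; // hypothetical number of rows per slab
constexpr int ncol = 4;          // hypothetical number of columns

// identify: map a predicted row to (slab identifier, offset inside the slab).
inline std::pair<int, int> identify(int i) {
    assert(i >= 0);
    return std::make_pair(i / chunk_length, i % chunk_length);
}

// create: allocate the buffer once so that it can be reused across slabs.
inline ToySlab create() {
    ToySlab slab;
    slab.values.resize(static_cast<std::size_t>(chunk_length) * ncol);
    return slab;
}

// populate: fill every requested slab; the vector is not assumed to be sorted.
inline void populate(std::vector<std::pair<int, ToySlab*> >& to_populate) {
    for (auto& request : to_populate) {
        int first_row = request.first * chunk_length;
        std::vector<double>& buffer = request.second->values;
        for (int r = 0; r < chunk_length; ++r) {
            for (int c = 0; c < ncol; ++c) {
                // Synthetic value standing in for an expensive read.
                buffer[static_cast<std::size_t>(r) * ncol + c] = (first_row + r) * 100 + c;
            }
        }
    }
}
```

With these definitions, `identify(21)` yields slab 2 and offset 1, matching the example in the description of `identify`. Because `create()` allocates the full buffer up front, `populate()` only overwrites existing memory, which is consistent with the fixed-size slab assumption.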

Returns: Pair containing (1) a pointer to a slab's contents and (2) the index of the next predicted row/column inside the retrieved slab.
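A caller would typically combine the two elements of the returned pair to locate the requested row/column inside the slab. The following is a minimal sketch, assuming a hypothetical `ToySlab` that stores its rows contiguously in row-major order with `ncol` values per row; `row_in_slab` is an illustrative helper and not part of the library.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical slab type: rows stored contiguously, 'ncol' values per row.
struct ToySlab {
    std::vector<double> values;
};

// Given the (slab pointer, offset) pair, return a pointer to the start of
// the requested row inside the slab's buffer.
inline const double* row_in_slab(const std::pair<const ToySlab*, int>& hit, int ncol) {
    assert(hit.first != nullptr && hit.second >= 0 && ncol > 0);
    return hit.first->values.data() + static_cast<std::size_t>(hit.second) * ncol;
}
```

The pointer returned by `row_in_slab` refers to memory owned by the slab, so the values should be used or copied before the slab is repopulated.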

◆ get_max_slabs()

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

size_t tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::get_max_slabs ( ) const

inline

Returns: Maximum number of slabs in the cache.

◆ get_num_slabs()

template<typename Id_ , typename Index_ , class Slab_ , bool track_reuse_ = false>

size_t tatami_chunked::OracularSlabCache< Id_, Index_, Slab_, track_reuse_ >::get_num_slabs ( ) const

inline

Returns: Number of slabs currently in the cache.

The documentation for this class was generated from the following file:

tatami_chunked/OracularSlabCache.hpp
