eminem
Parse Matrix Market files in C++
eminem::Parser< Input_ > Class Template Reference

Parse a matrix from a Matrix Market file.

#include <Parser.hpp>

Public Member Functions

 Parser (std::unique_ptr< Input_ > input, const ParserOptions &options)
 
const MatrixDetails & get_banner () const
 
Index get_nrows () const
 
Index get_ncols () const
 
Index get_nlines () const
 
void scan_preamble ()
 
template<typename Type_ = int, class Store_ >
bool scan_integer (Store_ store)
 
template<typename Type_ = double, class Store_ >
bool scan_real (Store_ &&store)
 
template<typename Type_ = double, class Store_ >
bool scan_double (Store_ store)
 
template<typename Type_ = double, class Store_ >
bool scan_complex (Store_ store)
 
template<typename Type_ = bool, class Store_ >
bool scan_pattern (Store_ store)
 

Detailed Description

template<class Input_>
class eminem::Parser< Input_ >

Parse a matrix from a Matrix Market file.

Template Parameters
Input_ Class for the source of input bytes, satisfying the byteme::PerByteInterface interface.

This parses a Matrix Market file according to the specification described at https://math.nist.gov/MatrixMarket/reports/MMformat.ps.gz. It is expected that users call scan_preamble() to determine the field type (see eminem::Field for supported values), after which they may call one of scan_integer(), scan_real(), etc. to parse the rest of the file. An error will be thrown upon detecting a non-compliant file.
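The calling order described above can be sketched without the library itself. The following is only an illustration under stated assumptions: everything in the sketch namespace (Index, Field, MatrixDetails with a field member, Entry, FakeParser, read_entries) is a hypothetical stand-in defined so that the example compiles on its own, and FakeParser simply replays two fixed integer entries instead of reading a file.

```cpp
#include <stdexcept>
#include <vector>

namespace sketch {

// Stand-ins for the library's types (assumptions, not eminem's definitions).
using Index = unsigned long long;
enum class Field { INTEGER, REAL, DOUBLE, COMPLEX, PATTERN };
struct MatrixDetails { Field field; };

struct Entry { Index row; Index column; double value; };

// Works with any parser-like type offering the member functions listed on this page.
template<class Parser_>
std::vector<Entry> read_entries(Parser_& parser) {
    parser.scan_preamble(); // first step: read the banner and the size line.

    std::vector<Entry> entries;
    auto store = [&entries](Index row, Index column, auto value) {
        entries.push_back(Entry{ row, column, static_cast<double>(value) });
    };

    // Choose the scan method that matches the field reported by the banner.
    switch (parser.get_banner().field) {
        case Field::INTEGER: parser.scan_integer(store); break;
        case Field::REAL:    parser.scan_real(store);    break;
        case Field::DOUBLE:  parser.scan_double(store);  break;
        case Field::PATTERN: parser.scan_pattern(store); break;
        default: throw std::runtime_error("complex fields are not handled in this sketch");
    }
    return entries;
}

// Hypothetical in-memory parser that replays fixed integer entries; not part of eminem.
struct FakeParser {
    MatrixDetails details{ Field::INTEGER };
    void scan_preamble() {}
    const MatrixDetails& get_banner() const { return details; }
    template<class Store_> bool scan_integer(Store_ store) {
        store(Index(1), Index(2), 7);
        store(Index(2), Index(1), -3);
        return false;
    }
    template<class Store_> bool scan_real(Store_&&) { return false; }
    template<class Store_> bool scan_double(Store_) { return false; }
    template<class Store_> bool scan_pattern(Store_) { return false; }
};

}
```

With a real Parser, the same read_entries shape would presumably apply once the stand-in types are swapped for their eminem counterparts, but that substitution is not verified here.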

As the Matrix Market specification is somewhat vague in parts, we apply the following refinements:

  • We define a "blank" character as a horizontal tab, a space (ASCII 32) or a carriage return. A new line is only defined by the line feed character (ASCII 10). The carriage return is only considered as a blank for compatibility with systems that use CRLF to define new lines, and we do not check whether the carriage return is followed by a line feed.
  • The final line of the file may or may not be newline terminated.
  • Integer values for the row/column indices, number of rows/columns and number of lines should be a sequence of one or more digits. Leading zeros are ignored and will not be interpreted as octal. An error is thrown if overflow of the Index type occurs in scan_preamble(). An error is also thrown if the row/column indices in any data line exceed the number of rows/columns in the preamble, or if the observed number of data lines exceeds the expected number for coordinate matrices/vectors.
  • Integer data values should be a sequence of one or more digits, optionally preceded by a + or - sign. Leading zeros are ignored and will not be interpreted as octal. An error is thrown if overflow of the Type_ type occurs in scan_integer().
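The digit and overflow rules for unsigned quantities such as the row/column indices can be illustrated with a small stand-alone routine. This is a minimal sketch rather than eminem's own implementation; the name parse_unsigned and its std::string argument are hypothetical.

```cpp
#include <limits>
#include <stdexcept>
#include <string>

// Parses a run of one or more decimal digits into an unsigned type.
// Leading zeros are harmless and the value is always read as decimal, never octal.
template<typename Index_>
Index_ parse_unsigned(const std::string& text) {
    if (text.empty()) {
        throw std::runtime_error("expected at least one digit");
    }
    constexpr Index_ max = std::numeric_limits<Index_>::max();
    Index_ value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') {
            throw std::runtime_error("unexpected non-digit character");
        }
        Index_ digit = static_cast<Index_>(c - '0');
        // Check before multiplying so that the arithmetic itself never wraps around.
        if (value > (max - digit) / 10) {
            throw std::runtime_error("integer overflow");
        }
        value = static_cast<Index_>(value * 10 + digit);
    }
    return value;
}
```

For example, parse_unsigned<unsigned>("010") yields ten, while parse_unsigned<unsigned char>("256") throws because 256 does not fit in an 8-bit unsigned char.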

Real data values should follow one of the following formats:

  • (sign)[digit sequence](exponent)
  • (sign)[digit sequence].(exponent)
  • (sign).[digit sequence](exponent)
  • (sign)[digit sequence].[digit sequence](exponent)

Each digit sequence should contain one or more digits, possibly with leading zeros that are ignored. The sign is optional and should be either + or -. The exponent is optional and should have the form e(sign)[digit sequence] or E(sign)[digit sequence], where the sign is again optional. We also support case-insensitive matches to inf, infinity or nan. These are converted to their corresponding IEEE special values if they are available for the provided Type_ in scan_real(); otherwise an error is thrown. For non-special values, no additional checks for overflow are applied. If IEEE arithmetic is available, overflow manifests as infinities; otherwise the behavior is undefined.
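The accepted formats for real values can be expressed as a short hand-written recognizer. The sketch below only checks the shape of a token and does not convert it to a number; it is not taken from eminem, the function names are hypothetical, and it assumes that an optional sign may also precede inf, infinity or nan, which the description above does not state.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Consumes a run of digits starting at 'pos' and reports how many were seen.
inline std::size_t skip_digits(const std::string& text, std::size_t& pos) {
    std::size_t start = pos;
    while (pos < text.size() && std::isdigit(static_cast<unsigned char>(text[pos]))) {
        ++pos;
    }
    return pos - start;
}

// Case-insensitive match of the remainder of 'text' against the special values.
inline bool is_special(const std::string& text, std::size_t pos) {
    std::string lower;
    for (; pos < text.size(); ++pos) {
        lower += static_cast<char>(std::tolower(static_cast<unsigned char>(text[pos])));
    }
    return lower == "inf" || lower == "infinity" || lower == "nan";
}

inline bool looks_like_real(const std::string& text) {
    std::size_t pos = 0;
    if (pos < text.size() && (text[pos] == '+' || text[pos] == '-')) {
        ++pos; // optional sign
    }
    if (is_special(text, pos)) {
        return true; // assumption: a sign is allowed before special values
    }

    std::size_t before = skip_digits(text, pos);
    std::size_t after = 0;
    if (pos < text.size() && text[pos] == '.') {
        ++pos;
        after = skip_digits(text, pos);
    }
    if (before + after == 0) {
        return false; // digits are needed on at least one side of the decimal point
    }

    if (pos < text.size() && (text[pos] == 'e' || text[pos] == 'E')) {
        ++pos;
        if (pos < text.size() && (text[pos] == '+' || text[pos] == '-')) {
            ++pos; // optional exponent sign
        }
        if (skip_digits(text, pos) == 0) {
            return false; // the exponent needs at least one digit
        }
    }
    return pos == text.size(); // nothing may follow the number
}
```

Under these rules, tokens such as "1.", ".5" and "-1.5e+10" are accepted, while ".", "1e" and "e5" are rejected.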

No validation is performed to determine whether coordinates are consistent with non-general symmetries. Similarly, we do not check for the existence of multiple lines with the same row/column indices in coordinate matrices/vectors.
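Callers who need these guarantees can enforce them inside their own store function. The following is a minimal sketch for a coordinate matrix with symmetric symmetry, where the Matrix Market format stores only entries on or below the diagonal. SymmetricChecker and the Index alias are hypothetical stand-ins and are not provided by eminem.

```cpp
#include <set>
#include <stdexcept>
#include <utility>

using Index = unsigned long long; // stand-in for eminem::Index (assumption)

// Tracks the coordinates seen so far and rejects entries that the parser itself would let through.
class SymmetricChecker {
public:
    void operator()(Index row, Index column) {
        if (column > row) {
            throw std::runtime_error("entry lies above the diagonal of a symmetric matrix");
        }
        // std::set::insert reports whether the pair was newly added.
        if (!seen.insert(std::make_pair(row, column)).second) {
            throw std::runtime_error("duplicate row/column indices");
        }
    }

private:
    std::set<std::pair<Index, Index>> seen;
};
```

A store function passed to one of the scan methods could call such a checker with its row and column arguments before handling the value. The set grows with the number of lines, so this trades memory for the extra validation.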

Constructor & Destructor Documentation

◆ Parser()

template<class Input_ >
eminem::Parser< Input_ >::Parser ( std::unique_ptr< Input_ > input,
const ParserOptions & options )
inline
Parameters
input Source of input bytes, typically a byteme::PerByteInterface instance.
options Further options.

Member Function Documentation

◆ get_banner()

template<class Input_ >
const MatrixDetails & eminem::Parser< Input_ >::get_banner ( ) const
inline

Retrieve the Matrix Market banner, containing information about the data format and type. This should only be called after scan_preamble().

Returns
Details about the matrix in this file.

◆ get_nrows()

template<class Input_ >
Index eminem::Parser< Input_ >::get_nrows ( ) const
inline

Get the number of rows in the matrix. This should only be called after scan_preamble(). If the object type is Object::VECTOR, the number of rows is equal to the length of the vector.

Returns
Number of rows.

◆ get_ncols()

template<class Input_ >
Index eminem::Parser< Input_ >::get_ncols ( ) const
inline

Get the number of columns in the matrix. This should only be called after scan_preamble(). If the object type is Object::VECTOR, the number of columns is set to 1.

Returns
Number of columns.

◆ get_nlines()

template<class Input_ >
Index eminem::Parser< Input_ >::get_nlines ( ) const
inline

Get the number of non-zero lines in the coordinate format. This should only be called after scan_preamble(). If the object type is Object::ARRAY, the number of lines is defined as the product of the number of rows and columns.

Returns
Number of non-zero lines.

◆ scan_preamble()

template<class Input_ >
void eminem::Parser< Input_ >::scan_preamble ( )
inline

Scan the preamble from the Matrix Market file, including the banner and the size line. This should only be called once.

◆ scan_integer()

template<class Input_ >
template<typename Type_ = int, class Store_ >
bool eminem::Parser< Input_ >::scan_integer ( Store_ store)
inline

Scan the file for integer lines, assuming that the field in the banner is Field::INTEGER.

Template Parameters
Type_ Type to represent the integer.
Store_ Function to process each line.
Parameters
store Function with the signature void(Index row, Index column, Type_ value), which is passed the corresponding values at each line. Both row and column will be 1-based indices; for Object::VECTOR, column will be set to 1. Alternatively, this function may return bool, where a false indicates that the scanning should terminate early and a true indicates that the scanning should continue.
Returns
Whether the scanning terminated early, based on store returning false.
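One way a scan loop can accept a store function that returns either void or bool is to inspect the return type at compile time. The sketch below illustrates that idea with a hypothetical replay() function that iterates over lines held in memory; it is not eminem's implementation, the Line struct and Index alias are stand-ins, and the return value follows the wording of the Returns entry above (true when the store function requested an early stop). It requires C++17 for if constexpr.

```cpp
#include <type_traits>
#include <vector>

using Index = unsigned long long; // stand-in for eminem::Index (assumption)

struct Line { Index row; Index column; int value; };

// Hypothetical stand-in for a scan loop: replays stored lines instead of reading a file.
template<class Store_>
bool replay(const std::vector<Line>& lines, Store_ store) {
    for (const auto& line : lines) {
        using Result = decltype(store(line.row, line.column, line.value));
        if constexpr (std::is_same_v<Result, bool>) {
            // A false return value from the store function stops the loop.
            if (!store(line.row, line.column, line.value)) {
                return true; // terminated early
            }
        } else {
            store(line.row, line.column, line.value);
        }
    }
    return false; // every line was processed
}
```

A store function that returns nothing always sees every line, whereas one that returns bool can stop as soon as it has what it needs, for example after reading the first few entries.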

◆ scan_real()

template<class Input_ >
template<typename Type_ = double, class Store_ >
bool eminem::Parser< Input_ >::scan_real ( Store_ && store)
inline

Scan the file for real lines, assuming that the field in the banner is Field::REAL.

Template Parameters
Type_ Type to represent the real value.
Store_ Function to process each line.
Parameters
store Function with the signature void(Index row, Index column, Type_ value), which is passed the corresponding values at each line. Both row and column will be 1-based indices; for Object::VECTOR, column will be set to 1. Alternatively, this function may return bool, where a false indicates that the scanning should terminate early and a true indicates that the scanning should continue.
Returns
Whether the scanning terminated early, based on store returning false.
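Because row and column are 1-based, a store function that writes into zero-based storage has to subtract one from each. The following is a minimal sketch of such a function object that fills a dense column-major buffer; DenseFiller and the Index alias are hypothetical and not part of eminem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Index = unsigned long long; // stand-in for eminem::Index (assumption)

// Fills a dense column-major buffer from 1-based (row, column, value) triplets.
struct DenseFiller {
    DenseFiller(Index nrows, Index ncols) :
        nrows(nrows), ncols(ncols), values(static_cast<std::size_t>(nrows * ncols), 0.0) {}

    void operator()(Index row, Index column, double value) {
        assert(row >= 1 && row <= nrows);
        assert(column >= 1 && column <= ncols);
        // Convert to zero-based offsets before computing the position in the buffer.
        values[static_cast<std::size_t>((column - 1) * nrows + (row - 1))] = value;
    }

    Index nrows, ncols;
    std::vector<double> values;
};
```

The dimensions would typically come from get_nrows() and get_ncols() after scan_preamble(). Since the scan methods take the store function by value or by forwarding reference, wrapping the filler in a lambda that captures it by reference avoids filling a copy.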

◆ scan_double()

template<class Input_ >
template<typename Type_ = double, class Store_ >
bool eminem::Parser< Input_ >::scan_double ( Store_ store)
inline

Scan the file for double-precision lines, assuming that the field in the banner is Field::DOUBLE. This is just an alias for scan_real().

Template Parameters
Type_ Type to represent the double-precision value.
Store_ Function to process each line.
Parameters
store Function with the signature void(Index row, Index column, Type_ value), which is passed the corresponding values at each line. Both row and column will be 1-based indices; for Object::VECTOR, column will be set to 1. Alternatively, this function may return bool, where a false indicates that the scanning should terminate early and a true indicates that the scanning should continue.
Returns
Whether the scanning terminated early, based on store returning false.

◆ scan_complex()

template<class Input_ >
template<typename Type_ = double, class Store_ >
bool eminem::Parser< Input_ >::scan_complex ( Store_ store)
inline

Scan the file for complex lines, assuming that the field in the banner is Field::COMPLEX.

Template Parameters
Type_ Type to represent the real and imaginary parts of the complex value.
Store_ Function to process each line.
Parameters
store Function with the signature void(Index row, Index column, std::complex<Type_> value), which is passed the corresponding values at each line. Both row and column will be 1-based indices; for Object::VECTOR, column will be set to 1. Alternatively, this function may return bool, where a false indicates that the scanning should terminate early and a true indicates that the scanning should continue.
Returns
Whether the scanning terminated early, based on store returning false.
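A store function for complex values receives a std::complex<Type_> and can apply complex-specific handling. As one illustration, the sketch below expands the stored triangle of a Hermitian matrix into both triangles by adding the conjugate for each off-diagonal entry. HermitianExpander and the Index alias are hypothetical, and the sketch assumes the caller has already confirmed from the banner that the symmetry is Hermitian.

```cpp
#include <complex>
#include <vector>

using Index = unsigned long long; // stand-in for eminem::Index (assumption)

// Collects zero-based triplets, mirroring each off-diagonal entry as its complex conjugate.
struct HermitianExpander {
    std::vector<Index> rows, columns;
    std::vector<std::complex<double>> values;

    void operator()(Index row, Index column, std::complex<double> value) {
        add(row - 1, column - 1, value);
        if (row != column) {
            add(column - 1, row - 1, std::conj(value));
        }
    }

private:
    void add(Index r, Index c, std::complex<double> v) {
        rows.push_back(r);
        columns.push_back(c);
        values.push_back(v);
    }
};
```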

◆ scan_pattern()

template<class Input_ >
template<typename Type_ = bool, class Store_ >
bool eminem::Parser< Input_ >::scan_pattern ( Store_ store)
inline

Scan the file for pattern lines, assuming that the field in the banner is Field::PATTERN. This function only works when the format field is set to Format::COORDINATE.

Template Parameters
Type_ Type to represent the presence of a non-zero entry.
Store_ Function to process each line.
Parameters
store Function with the signature void(Index row, Index column, Type_ value), which is passed the corresponding values at each line. Both row and column will be 1-based indices; for Object::VECTOR, column will be set to 1. value will always be true and can be ignored; it is only required here for consistency with the other methods. Alternatively, this function may return bool, where a false indicates that the scanning should terminate early and a true indicates that the scanning should continue.
Returns
Whether the scanning terminated early, based on store returning false.
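Since value carries no information for pattern matrices, a store function can ignore it and record only the coordinates. The following is a minimal sketch that builds zero-based adjacency lists, one per row; AdjacencyStore and the Index alias are hypothetical and not part of eminem.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Index = unsigned long long; // stand-in for eminem::Index (assumption)

// Records, for each row, the zero-based columns that hold a non-zero entry.
struct AdjacencyStore {
    explicit AdjacencyStore(Index nrows) : neighbors(static_cast<std::size_t>(nrows)) {}

    // The third argument is always true for pattern files, so it is left unnamed.
    void operator()(Index row, Index column, bool) {
        assert(row >= 1 && row <= neighbors.size());
        neighbors[static_cast<std::size_t>(row - 1)].push_back(column - 1);
    }

    std::vector<std::vector<Index>> neighbors;
};
```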
