|
| DataReader (std::vector< std::string > paths, size_t numThreads, size_t batchSize, std::string pathPrefix=std::string(), DataReaderThreadInitF init=DataReader_NoopF) |
| Please use makeDataReader() instead.
|
|
| DataReader (std::vector< std::string > paths, size_t numThreads, size_t batchSize, F transform, std::string pathPrefix=std::string(), DataReaderThreadInitF init=DataReader_NoopF) |
| Please use makeDataReader() instead.
|
|
void | shuffle () |
| Shuffle the list of paths.
|
|
template<typename FF = F> |
std::unique_ptr< DataReaderIterator< T > > | iterator (typename std::enable_if_t< std::is_same< FF, DataReader_NoTransform >::value, bool >=true) |
| Create an iterator that provides multi-threaded data access.
|
|
template<typename FF = F> |
std::unique_ptr< DataReaderTransform< T, F & > > | iterator (typename std::enable_if_t< !std::is_same< FF, DataReader_NoTransform >::value, bool >=true) |
| Create an iterator that provides multi-threaded data access.
|
|
template<typename T, typename F = DataReader_NoTransform>
class common::DataReader< T, F >
A multi-threaded reader for cerealized data.
This class only holds a list of paths to files that contain cerealized versions of T. zstd-compressed files are decompressed transparently. The actual multi-threaded reading happens in an iterator object, which is obtained by calling iterator().
Optionally, a pathPrefix can be passed to the constructor; it is prepended to every element in paths before the respective file is accessed.
If a transform function is provided, the iterator runs each batch through that function, in a dedicated thread, before returning it.
Usage example with 4 threads and batch size 32:
auto reader = makeDataReader<MyDatumType>(fileList, 4, 32);
while (training) {
  // Reshuffle the paths, then obtain a fresh iterator for this pass.
  reader.shuffle();
  auto it = reader.iterator();
  while (it->hasNext()) {
    auto batch = it->next();
  }
}
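The two iterator() overloads listed in the member table differ only in their std::enable_if_t condition on FF, so exactly one of them takes part in overload resolution for a given F. The following is a minimal, self-contained sketch of that selection mechanism. All names in it (NoTransform, PlainIterator, TransformIterator, Reader, Doubler) are hypothetical stand-ins and not the library's types, and the iterators do no actual reading.

```cpp
#include <memory>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the "no transform" tag type.
struct NoTransform {};

// Hypothetical stand-ins for the two iterator types.
struct PlainIterator {
  std::string kind() const { return "plain"; }
};
struct TransformIterator {
  std::string kind() const { return "transform"; }
};

template <typename T, typename F = NoTransform>
class Reader {
 public:
  // Viable only when FF (defaulting to F) is the NoTransform tag.
  // For any other F, substitution fails and this overload is discarded.
  template <typename FF = F>
  std::unique_ptr<PlainIterator> iterator(
      std::enable_if_t<std::is_same<FF, NoTransform>::value, bool> = true) {
    return std::make_unique<PlainIterator>();
  }

  // Viable only when a transform type other than NoTransform was supplied.
  template <typename FF = F>
  std::unique_ptr<TransformIterator> iterator(
      std::enable_if_t<!std::is_same<FF, NoTransform>::value, bool> = true) {
    return std::make_unique<TransformIterator>();
  }
};

// Hypothetical transform functor, used only to pick the second overload.
struct Doubler {
  int operator()(int x) const { return 2 * x; }
};
```

With these definitions, Reader<int>{}.iterator() yields a std::unique_ptr<PlainIterator>, while Reader<int, Doubler>{}.iterator() yields a std::unique_ptr<TransformIterator>. The condition is written in terms of the member template parameter FF rather than F so that it is evaluated at the call to iterator() and a false condition merely removes that overload instead of causing a hard error when the class is instantiated.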