|
| | DataReader (std::vector< std::string > paths, size_t numThreads, size_t batchSize, std::string pathPrefix=std::string(), DataReaderThreadInitF init=DataReader_NoopF) |
| | Please use makeDataReader() instead.
|
| |
| | DataReader (std::vector< std::string > paths, size_t numThreads, size_t batchSize, F transform, std::string pathPrefix=std::string(), DataReaderThreadInitF init=DataReader_NoopF) |
| | Please use makeDataReader() instead.
|
| |
| void | shuffle () |
| | Shuffle the list of paths.
|
| |
| template<typename FF = F> |
| std::unique_ptr< DataReaderIterator< T > > | iterator (typename std::enable_if_t< std::is_same< FF, DataReader_NoTransform >::value, bool >=true) |
| | Create an iterator that provides multi-threaded data access.
|
| |
| template<typename FF = F> |
| std::unique_ptr< DataReaderTransform< T, F & > > | iterator (typename std::enable_if_t< !std::is_same< FF, DataReader_NoTransform >::value, bool >=true) |
| | Create an iterator that provides multi-threaded data access.
|
| |
template<typename T, typename F = DataReader_NoTransform>
class common::DataReader< T, F >
A multi-threaded reader for cerealized data.
This class only holds a list of paths to files that contain cerealized instances of T. zstd-compressed files are decompressed transparently. The actual multi-threaded reading happens in an iterator object, which is obtained by calling iterator().
Optionally, a pathPrefix can be passed to the constructor; it is prepended to every element of paths before the respective file is accessed.
If a transform function is provided, the iterator runs each batch through that function, in a dedicated thread, before returning it.
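To illustrate the idea of running a batch through a transform on its own thread, here is a minimal, self-contained sketch. It does not use DataReader itself: MyDatum, sumBatch, transformInThread and demoTransform are hypothetical names, and the assumption that a transform is a callable receiving the batch as a std::vector is ours, not something this page states. Depending on the toolchain, building it may require linking the platform's thread library.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical datum type, used only for this illustration.
struct MyDatum {
  int value;
};

// Hypothetical transform: collapses a batch into the sum of its values.
// The callable signature (batch passed as std::vector<MyDatum> const&)
// is an assumption, not taken from the DataReader documentation.
inline int sumBatch(std::vector<MyDatum> const& batch) {
  return std::accumulate(
      batch.begin(), batch.end(), 0,
      [](int acc, MyDatum const& d) { return acc + d.value; });
}

// Runs `transform` on `batch` in a separate thread and returns a future
// for the result, so the caller can continue until it needs the output.
template <typename T, typename F>
auto transformInThread(std::vector<T> batch, F transform) {
  return std::async(
      std::launch::async,
      [b = std::move(batch), f = std::move(transform)]() mutable {
        return f(b);
      });
}

// Small self-check exercising the helper above.
inline void demoTransform() {
  std::vector<MyDatum> batch{{1}, {2}, {3}};
  auto result = transformInThread(batch, sumBatch);
  assert(result.get() == 6);
}
```

In this sketch the caller only blocks when it calls get() on the future, which mirrors the general idea of overlapping the transform with other work.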
Usage example with 4 threads and batch size 32:
auto reader = makeDataReader<MyDatumType>(fileList, 4, 32);
while (training) {
  // Reshuffle the paths before every pass over the data.
  reader.shuffle();
  auto it = reader.iterator();
  while (it->hasNext()) {
    // Each call to next() yields one batch.
    auto batch = it->next();
  }
}
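The two iterator() overloads listed in the member table are distinguished through std::enable_if_t on whether F is DataReader_NoTransform. The following sketch reproduces only that dispatch pattern in isolation; NoTransform, MiniReader, Doubler and demoDispatch are hypothetical stand-ins, and the string return values exist purely to show which overload was selected.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for DataReader_NoTransform.
struct NoTransform {};

// Hypothetical, simplified reader that only reports which overload is chosen.
template <typename F = NoTransform>
struct MiniReader {
  // Participates in overload resolution only when F is NoTransform.
  template <typename FF = F>
  std::string iterator(
      std::enable_if_t<std::is_same<FF, NoTransform>::value, bool> = true) {
    return "plain";
  }

  // Participates in overload resolution only when F is some other type.
  template <typename FF = F>
  std::string iterator(
      std::enable_if_t<!std::is_same<FF, NoTransform>::value, bool> = true) {
    return "transformed";
  }
};

// Hypothetical transform type; its contents do not matter for the dispatch.
struct Doubler {};

// Small self-check: the same call syntax resolves to different overloads.
inline void demoDispatch() {
  MiniReader<> plainReader;
  MiniReader<Doubler> transformingReader;
  assert(plainReader.iterator() == "plain");
  assert(transformingReader.iterator() == "transformed");
}
```

Because the condition depends on the member template parameter FF rather than directly on F, the disabled overload is dropped by SFINAE at the call site instead of causing a hard error when the class is instantiated.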