19 changes: 17 additions & 2 deletions CHANGELOG.md
@@ -5,14 +5,29 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

<!-- ---------------------
v1.6.0
--------------------- -->
## v1.6.0 - 2-12-2024

### Added

- Method `saveToFile` becomes `DTensor<T>::saveToFile(std::string pathToFile, Serialisation ser)`, i.e., the user can
choose whether to save the file as a text (ASCII) file, or a binary one

### Changed

- Method `parseFromTextFile` renamed to `parseFromFile` (supports text and binary formats)


<!-- ---------------------
v1.5.2
--------------------- -->
## v1.5.2 - 1-12-2024

### Fixed

- Quick bug fix in `DTensor::parseFromTextFile` (passing storage mode to `vectorFromFile`)
- Quick bug fix in `DTensor::parseFromTextFile` (passing storage mode to `vectorFromTextFile`)


<!-- ---------------------
@@ -23,7 +38,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
### Fixed

- Set precision in `DTensor::saveToFile` properly
- `DTensor<T>::parseFromTextFile` throws `std::invalid_argument` if `T` is unsupported
- `DTensor<T>::parseFromFile` throws `std::invalid_argument` if `T` is unsupported

<!-- ---------------------
v1.5.0
16 changes: 12 additions & 4 deletions README.md
@@ -248,7 +248,8 @@ The `DTensor` `B` will be overwritten with the solution.

### 1.8. Saving and loading tensors

Tensor data can be stored in simple text files which have the following structure
Tensor data can be stored in simple text files or binary files.
The text-based format has the following structure

```text
number_of_rows
@@ -259,13 +260,20 @@ data (one entry per line)

To save a tensor in a file, simply call `DTensor::saveToFile(filename)`.

To load a tensor from a file, the static function `DTensor<T>::parseFromTextFile(filename)` can be used. For example:
If the file extension is `.bt` (binary tensor), the data will be stored in binary format.
The structure of the binary encoding is similar to that of the text encoding:
the first three `uint64_t`-sized positions correspond to the number of rows, columns
and matrices, followed by the elements of the tensor.

To load a tensor from a file, the static function `DTensor<T>::parseFromFile(filename)` can be used. For example:

```c++
auto z = DTensor<double>::parseFromTextFile("path/to/my.dtensor");
auto z = DTensor<double>::parseFromFile("path/to/my.dtensor");
```

If necessary, you can provide a second argument to `parseFromTextFile` to specify the order in which the data are stored (the `StorageMode`).
If necessary, you can provide a second argument to `parseFromFile` to specify the order in which the data are stored (the `StorageMode`).

Soon we will release a Python API for reading and serialising (numpy) arrays to `.bt` files.
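
The binary layout described above can be illustrated without the library. The following is a minimal sketch in plain C++, assuming host byte order and a trivially copyable element type `T`; the helper names `writeBt` and `expectedBtSize` are hypothetical and are not part of the library's API.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): writes the three
// uint64_t dimensions followed by the raw elements, in host byte order.
template<typename T>
void writeBt(const std::string &path, uint64_t rows, uint64_t cols, uint64_t mats,
             const std::vector<T> &data) {
    if (data.size() != rows * cols * mats) {
        throw std::invalid_argument("data size does not match dimensions");
    }
    std::ofstream out(path, std::ios::binary);
    if (!out) throw std::runtime_error("cannot open file for writing: " + path);
    out.write(reinterpret_cast<const char *>(&rows), sizeof(uint64_t));
    out.write(reinterpret_cast<const char *>(&cols), sizeof(uint64_t));
    out.write(reinterpret_cast<const char *>(&mats), sizeof(uint64_t));
    out.write(reinterpret_cast<const char *>(data.data()),
              static_cast<std::streamsize>(data.size() * sizeof(T)));
}

// Hypothetical helper: size in bytes of a file with this layout,
// i.e. a 24-byte header plus rows * cols * mats elements of type T.
template<typename T>
uint64_t expectedBtSize(uint64_t rows, uint64_t cols, uint64_t mats) {
    return 3 * sizeof(uint64_t) + rows * cols * mats * sizeof(T);
}
```

Under these assumptions, a 3 x 6 x 4 tensor of `double` would occupy 24 + 72 * 8 bytes. Because the bytes are written in host order, a file produced this way is only portable between machines with the same endianness.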

## 2. Cholesky factorisation and system solution

81 changes: 68 additions & 13 deletions include/tensor.cuh
@@ -169,6 +169,7 @@ enum StorageMode {
defaultMajor = columnMajor
};


/**
* This library uses tensors to store and manipulate data on a GPU device.
* A tensor has three axes: [rows (m) x columns (n) x matrices (k)].
@@ -256,13 +257,16 @@ public:
*
* This static function reads data from a text file, creates a DTensor and uploads the data to the device.
*
* The data may be stored in a text file or a binary file. Binary files must have the extension .bt.
*
* @param path_to_file path to file as string
* @param mode storage mode (default: StorageMode::defaultMajor)
* @return instance of DTensor
*
* @throws std::invalid_argument if the file is not found
*/
static DTensor<T> parseFromTextFile(std::string path_to_file, StorageMode mode = StorageMode::defaultMajor);
static DTensor<T> parseFromFile(std::string path_to_file,
StorageMode mode = StorageMode::defaultMajor);

/**
* Constructs a DTensor object.
@@ -504,7 +508,12 @@ public:
/**
* Saves the current instance of DTensor to a (text) file
*
* @param pathToFile
* If the file extension is .bt, the data will be stored in a binary file.
* Writing to and reading from a binary file is significantly faster and
* the generated binary files tend to have a smaller size (about 40% of the
* size of text files for data of type double and float).
*
* @param pathToFile path to file
*/
void saveToFile(std::string pathToFile);

@@ -595,7 +604,7 @@ struct data_t {
};

template<typename T>
data_t<T> vectorFromFile(std::string path_to_file) {
data_t<T> vectorFromTextFile(std::string path_to_file) {
data_t<T> dataStruct;
std::ifstream file;
file.open(path_to_file, std::ios::in);
@@ -641,24 +650,70 @@ data_t<T> vectorFromFile(std::string path_to_file) {
}

template<typename T>
DTensor<T> DTensor<T>::parseFromTextFile(std::string path_to_file,
StorageMode mode) {
auto parsedData = vectorFromFile<T>(path_to_file);
data_t<T> vectorFromBinaryFile(std::string path_to_file) {
data_t<T> dataStruct;
/* Read from binary file */
std::ifstream inFile;
inFile.open(path_to_file, std::ios::binary);
inFile.read(reinterpret_cast<char *>(&(dataStruct.numRows)), sizeof(uint64_t));
inFile.read(reinterpret_cast<char *>(&(dataStruct.numCols)), sizeof(uint64_t));
inFile.read(reinterpret_cast<char *>(&(dataStruct.numMats)), sizeof(uint64_t));
uint64_t numElements = dataStruct.numRows * dataStruct.numCols * dataStruct.numMats;
std::vector<T> vecDataFromFile(numElements);
for (size_t i = 0; i < numElements; i++) {
T el;
inFile.read(reinterpret_cast<char *>(&el), sizeof(T));
vecDataFromFile[i] = el;
}
inFile.close();
dataStruct.data = vecDataFromFile;
return dataStruct;
}
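
The reader above does not check whether the file was opened or whether each read succeeded. A possible variant with those checks, reading the payload in a single call, might look like the sketch below. The struct `BinaryTensorData` is a hypothetical stand-in for the library's `data_t<T>`, the function name `readBinaryTensorChecked` is an assumption, and throwing `std::invalid_argument` merely mirrors the exception type documented for `parseFromFile`.

```cpp
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's data_t<T>.
template<typename T>
struct BinaryTensorData {
    uint64_t numRows = 0, numCols = 0, numMats = 0;
    std::vector<T> data;
};

// Sketch of a checked reader (name and error policy are assumptions).
template<typename T>
BinaryTensorData<T> readBinaryTensorChecked(const std::string &path) {
    std::ifstream in(path, std::ios::binary);
    if (!in.is_open()) throw std::invalid_argument("cannot open file: " + path);

    BinaryTensorData<T> d;
    in.read(reinterpret_cast<char *>(&d.numRows), sizeof(uint64_t));
    in.read(reinterpret_cast<char *>(&d.numCols), sizeof(uint64_t));
    in.read(reinterpret_cast<char *>(&d.numMats), sizeof(uint64_t));
    if (!in) throw std::invalid_argument("truncated header in: " + path);

    // Read all elements with one call instead of one call per element.
    const uint64_t numElements = d.numRows * d.numCols * d.numMats;
    d.data.resize(numElements);
    in.read(reinterpret_cast<char *>(d.data.data()),
            static_cast<std::streamsize>(numElements * sizeof(T)));
    if (!in) throw std::invalid_argument("truncated payload in: " + path);
    return d;
}
```

This sketch still trusts the dimensions stored in the header, so a corrupted header could request a very large allocation; a stricter reader could compare the expected payload size against the actual file size first.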

template<typename T>
DTensor<T> DTensor<T>::parseFromFile(std::string path_to_file,
StorageMode mode) {
// Figure out file extension
size_t pathToFileLength = path_to_file.length() ;
std::string fileNameExtension = path_to_file.substr(pathToFileLength-3);
typedef data_t<T> (*PARSER)(std::string);
PARSER parser = (fileNameExtension == ".bt") ? vectorFromBinaryFile<T> : vectorFromTextFile<T>;
auto parsedData = parser(path_to_file);
DTensor<T> tensorFromData(parsedData.data, parsedData.numRows, parsedData.numCols, parsedData.numMats, mode);
return tensorFromData;
}
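
The extension check above calls `substr(length - 3)`. For a path shorter than three characters the unsigned subtraction wraps around, and `std::string::substr` then throws `std::out_of_range`. One way to avoid this is a length-guarded suffix test, sketched below; the helper name `hasSuffix` is hypothetical and not part of the library.

```cpp
#include <string>

// Hypothetical helper: returns true if `path` ends with `suffix`.
// The size check guards against paths shorter than the suffix.
inline bool hasSuffix(const std::string &path, const std::string &suffix) {
    return path.size() >= suffix.size() &&
           path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

With such a helper, the parser selection could be written as `hasSuffix(path_to_file, ".bt") ? vectorFromBinaryFile<T> : vectorFromTextFile<T>`, and the same test could be reused in `saveToFile`.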

template<typename T>
void DTensor<T>::saveToFile(std::string pathToFile) {
std::ofstream file(pathToFile);
file << numRows() << std::endl << numCols() << std::endl << numMats() << std::endl;
std::vector<T> myData(numEl()); download(myData);
if constexpr (std::is_floating_point<T>::value) {
file << std::setprecision(std::numeric_limits<T>::max_digits10);
}
for(const T& el : myData) file << el << std::endl;
std::vector<T> myData(numEl());
download(myData);

// Figure out file extension
size_t pathToFileLength = pathToFile.length() ;
std::string fileNameExtension = pathToFile.substr(pathToFileLength-3);
// If the extension is .bt...
if (fileNameExtension == ".bt") {
uint64_t nr = (uint64_t) numRows(),
nc = (uint64_t) numCols(),
nm = (uint64_t) numMats();
std::ofstream outFile;
outFile.open(pathToFile, std::ios::binary);
outFile.write(reinterpret_cast<const char *>(&nr), sizeof(uint64_t));
outFile.write(reinterpret_cast<const char *>(&nc), sizeof(uint64_t));
outFile.write(reinterpret_cast<const char *>(&nm), sizeof(uint64_t));
for (const T &el: myData) outFile.write(reinterpret_cast<const char *>(&el), sizeof(T));
outFile.close();
} else {
std::ofstream file(pathToFile);
file << numRows() << std::endl << numCols() << std::endl << numMats() << std::endl;
if constexpr (std::is_floating_point<T>::value) {
file << std::setprecision(std::numeric_limits<T>::max_digits10);
}
for (const T &el: myData) file << el << std::endl;
}
}


template<typename T>
void DTensor<T>::reshape(size_t newNumRows, size_t newNumCols, size_t newNumMats) {
if (m_numRows == newNumRows && m_numCols == newNumCols && m_numMats == newNumMats) return;
14 changes: 10 additions & 4 deletions main.cu
@@ -6,9 +6,15 @@


int main() {
auto z = DTensor<size_t>::parseFromTextFile("../test/data/my.dtensor",
StorageMode::rowMajor);
std::cout << z;
z.saveToFile("hohoho.dtensor");
/* Write to binary file */
auto r = DTensor<double>::createRandomTensor(3, 6, 4, -1, 1);
std::string fName = "tensor.bt"; // binary tensor file extension: .bt
r.saveToFile(fName);

/* Parse binary file */
auto recov = DTensor<double>::parseFromFile(fName);
auto err = r - recov;
std::cout << "max error : " << err.maxAbs();

return 0;
}
30 changes: 27 additions & 3 deletions test/testTensor.cu
@@ -117,7 +117,7 @@ TEST_F(TensorTest, randomTensorCreation) {
}

/* ---------------------------------------
* Save to file and parse
* Save to file and parse (text)
* --------------------------------------- */

TEMPLATE_WITH_TYPE_T
@@ -128,7 +128,7 @@ void parseTensorFromFile() {
auto r = DTensor<T>::createRandomTensor(nR, nC, nM, -1, 1);
std::string fName = "myTest.dtensor";
r.saveToFile(fName);
auto a = DTensor<T>::parseFromTextFile(fName);
auto a = DTensor<T>::parseFromFile(fName);
EXPECT_EQ(nR, a.numRows());
EXPECT_EQ(nC, a.numCols());
EXPECT_EQ(nM, a.numMats());
@@ -148,7 +148,31 @@ TEST_F(TensorTest, parseTensorUnsupportedDataType) {
auto r = DTensor<double>::createRandomTensor(nR, nC, nM, -1, 1);
std::string fName = "myTest.dtensor";
r.saveToFile(fName);
EXPECT_THROW(DTensor<char>::parseFromTextFile(fName), std::invalid_argument);
EXPECT_THROW(DTensor<char>::parseFromFile(fName), std::invalid_argument);
}

/* ---------------------------------------
* Save to file and parse (binary)
* --------------------------------------- */

TEMPLATE_WITH_TYPE_T
void parseTensorFromFileBinary() {
size_t nR = 20, nC = 40, nM = 6;
auto r = DTensor<T>::createRandomTensor(nR, nC, nM, -1, 1);
std::string fName = "myTest.bt";
r.saveToFile(fName);
auto a = DTensor<T>::parseFromFile(fName);
EXPECT_EQ(nR, a.numRows());
EXPECT_EQ(nC, a.numCols());
EXPECT_EQ(nM, a.numMats());
auto diff = a - r;
T err = diff.maxAbs();
EXPECT_LT(err, 2 * std::numeric_limits<T>::epsilon());
}

TEST_F(TensorTest, parseTensorFromFileBinary) {
parseTensorFromFileBinary<float>();
parseTensorFromFileBinary<double>();
}

/* ---------------------------------------