Skip to content

Commit 6452194

Browse files
EnricoMipitrou
andauthored
GH-43057: [C++] Thread-safe AesEncryptor / AesDecryptor (#44990)
### Rationale for this change OpenSSL encryption / decryption is wrapped by AesEncryptor / AesDencryptor, which is used by multiple threads of a single scanner or by multiple concurrent scanners when scanning a dataset. Some thread may call `WipeOut` while other threads still use the instance. ### What changes are included in this PR? - Remove the `WipeOut` methods and related datastructures entirely. - Each call into `CtrEncrypt` / `CtrDecrypt` and `GcmEncrypt` / `GcmDecrypt` uses its own `EVP_CIPHER_CTX` instance, making this thread-safe. After fixing this `"AesDecryptor was wiped out"` issue, two other segmentation faults surfaced: GH-44988. This has also been addressed here as it can only be exposed after fixing the wipe-out issue. Fixes GH-43057. Fixes GH-44852. Fixes GH-44988. ### Are these changes tested? A unit test that scans a dataset concurrently reproduced the initial issue in 30% of the test runs. ### Are there any user-facing changes? No. * GitHub Issue: #43057 Lead-authored-by: Enrico Minack <github@enrico.minack.dev> Co-authored-by: Antoine Pitrou <antoine@python.org> Co-authored-by: Antoine Pitrou <pitrou@free.fr> Signed-off-by: Antoine Pitrou <pitrou@free.fr>
1 parent 9b0885a commit 6452194

21 files changed

Lines changed: 618 additions & 758 deletions

cpp/examples/parquet/low_level_api/encryption_reader_writer_all_crypto_options.cc

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -429,7 +429,7 @@ void InteropTestReadEncryptedParquetFiles(std::string root_path) {
429429

430430
// Add the current decryption configuration to ReaderProperties.
431431
reader_properties.file_decryption_properties(
432-
vector_of_decryption_configurations[example_id]->DeepClone());
432+
vector_of_decryption_configurations[example_id]);
433433

434434
// Create a ParquetReader instance
435435
std::unique_ptr<parquet::ParquetFileReader> parquet_reader =

cpp/src/arrow/dataset/file_parquet.cc

Lines changed: 11 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -76,26 +76,25 @@ parquet::ReaderProperties MakeReaderProperties(
7676
}
7777
properties.set_buffer_size(parquet_scan_options->reader_properties->buffer_size());
7878

79+
auto file_decryption_prop =
80+
parquet_scan_options->reader_properties->file_decryption_properties();
81+
7982
#ifdef PARQUET_REQUIRE_ENCRYPTION
8083
auto parquet_decrypt_config = parquet_scan_options->parquet_decryption_config;
8184

8285
if (parquet_decrypt_config != nullptr) {
83-
auto file_decryption_prop =
86+
file_decryption_prop =
8487
parquet_decrypt_config->crypto_factory->GetFileDecryptionProperties(
8588
*parquet_decrypt_config->kms_connection_config,
8689
*parquet_decrypt_config->decryption_config, path, filesystem);
87-
88-
parquet_scan_options->reader_properties->file_decryption_properties(
89-
std::move(file_decryption_prop));
9090
}
9191
#else
9292
if (parquet_scan_options->parquet_decryption_config != nullptr) {
9393
parquet::ParquetException::NYI("Encryption is not supported in this build.");
9494
}
9595
#endif
9696

97-
properties.file_decryption_properties(
98-
parquet_scan_options->reader_properties->file_decryption_properties());
97+
properties.file_decryption_properties(file_decryption_prop);
9998

10099
properties.set_thrift_string_size_limit(
101100
parquet_scan_options->reader_properties->thrift_string_size_limit());
@@ -527,9 +526,11 @@ Future<std::shared_ptr<parquet::arrow::FileReader>> ParquetFileFormat::GetReader
527526
auto self = checked_pointer_cast<const ParquetFileFormat>(shared_from_this());
528527

529528
return source.OpenAsync().Then(
530-
[=](const std::shared_ptr<io::RandomAccessFile>& input) mutable {
531-
return parquet::ParquetFileReader::OpenAsync(input, std::move(properties),
532-
metadata)
529+
[self = self, properties = std::move(properties), source = source,
530+
options = options, metadata = metadata,
531+
parquet_scan_options = parquet_scan_options](
532+
const std::shared_ptr<io::RandomAccessFile>& input) mutable {
533+
return parquet::ParquetFileReader::OpenAsync(input, properties, metadata)
533534
.Then(
534535
[=](const std::unique_ptr<parquet::ParquetFileReader>& reader) mutable
535536
-> Result<std::shared_ptr<parquet::arrow::FileReader>> {
@@ -544,7 +545,7 @@ Future<std::shared_ptr<parquet::arrow::FileReader>> ParquetFileFormat::GetReader
544545
// here we know there are no other waiters on the reader.
545546
std::move(const_cast<std::unique_ptr<parquet::ParquetFileReader>&>(
546547
reader)),
547-
std::move(arrow_properties), &arrow_reader));
548+
arrow_properties, &arrow_reader));
548549

549550
// R build with openSUSE155 requires an explicit shared_ptr construction
550551
return std::shared_ptr<parquet::arrow::FileReader>(

0 commit comments

Comments
 (0)