[Windows Build] Implement MMAP for mmap_data_loader.cpp #5164
jsidewhite wants to merge 1 commit into pytorch:main
There is no sys/mman.h or POSIX-compatible mmap() implementation on Windows. The extension data loaders use mmap() to map in data files, so this PR adds an implementation.
Test run:
& .\cmake-out\extension\data_loader\test\Debug\extension_data_loader_test.exe --gtest_brief=1 --gtest_filter=MmapDataLoader*
Running main() from ...\executorch\third-party\googletest\googletest\src\gtest_main.cc
[==========] 8 tests from 1 test suite ran. (50 ms total)
[ PASSED ] 8 tests.
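For readers unfamiliar with the Windows side, here is a minimal sketch of how a read-only mmap() can be emulated with CreateFileMapping and MapViewOfFile. It is not the code from this PR: `map_read_only`, `unmap_read_only`, and `demo_roundtrip` are hypothetical names chosen for illustration, and the POSIX branch is included only so the sketch builds on both platforms.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

#ifdef _WIN32
#include <io.h>
#include <windows.h>
#else
#include <sys/mman.h>
#include <unistd.h>
#endif

// Hypothetical helper (not from the PR): maps `length` bytes of the open file
// descriptor `fd`, starting at `offset`, as read-only memory.
// Returns nullptr on failure.
inline void* map_read_only(int fd, size_t offset, size_t length) {
  assert(length > 0);
#ifdef _WIN32
  HANDLE file = reinterpret_cast<HANDLE>(_get_osfhandle(fd));
  if (file == INVALID_HANDLE_VALUE) {
    return nullptr;
  }
  // A maximum size of 0/0 means "use the current size of the file".
  HANDLE mapping =
      CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
  if (mapping == nullptr) {
    return nullptr;
  }
  const unsigned long long off = offset;
  void* addr = MapViewOfFile(
      mapping,
      FILE_MAP_READ,
      static_cast<DWORD>(off >> 32),
      static_cast<DWORD>(off & 0xFFFFFFFFull),
      length);
  // A mapped view holds its own reference to the mapping object, so the
  // handle can be closed here without invalidating the view.
  CloseHandle(mapping);
  return addr;
#else
  void* addr = mmap(
      nullptr, length, PROT_READ, MAP_PRIVATE, fd, static_cast<off_t>(offset));
  return addr == MAP_FAILED ? nullptr : addr;
#endif
}

// Hypothetical counterpart to map_read_only().
inline void unmap_read_only(void* addr, size_t length) {
#ifdef _WIN32
  (void)length; // UnmapViewOfFile() does not take a length.
  UnmapViewOfFile(addr);
#else
  munmap(addr, length);
#endif
}

// Writes a short string to a temporary file and reads it back via the map.
inline bool demo_roundtrip() {
  FILE* f = std::tmpfile();
  if (f == nullptr) {
    return false;
  }
  const char msg[] = "executorch";
  bool ok = std::fwrite(msg, 1, sizeof(msg), f) == sizeof(msg) &&
      std::fflush(f) == 0;
#ifdef _WIN32
  const int fd = _fileno(f);
#else
  const int fd = fileno(f);
#endif
  if (ok) {
    void* p = map_read_only(fd, 0, sizeof(msg));
    ok = p != nullptr && std::memcmp(p, msg, sizeof(msg)) == 0;
    if (p != nullptr) {
      unmap_read_only(p, sizeof(msg));
    }
  }
  std::fclose(f);
  return ok;
}
```

The demo maps from offset 0. A non-zero offset has to be aligned: to the page size for mmap(), and to the system allocation granularity for MapViewOfFile().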
Hi @jsidewhite! Thank you for your pull request and welcome to our community.
Action Required
In order to merge any pull request (code, docs, etc.), we require contributors to sign our Contributor License Agreement, and we don't seem to have one on file for you.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly.
If you have received this in error or have any questions, please contact us at cla@meta.com. Thanks!
dbort left a comment:
I'm so sorry for the delay! I totally missed this PR. This looks like a good approach.
#include <errno.h>
#include <io.h>
#include <windows.h>

#include "mman_windows.h"
- Please use angle brackets and the full path to the header
- .cpp files should include their .h file first to demonstrate that the .h files don't need earlier includes to work cleanly
Suggested change:
-#include <errno.h>
-#include <io.h>
-#include <windows.h>
-#include "mman_windows.h"
+#include <executorch/extension/data_loader/mman_windows.h>
+#include <errno.h>
+#include <io.h>
+#include <windows.h>
 * LICENSE file in the root directory of this source tree.
 */

#include <executorch/extension/data_loader/mman.h>
Please move this include down to the group on line 17. Tests should always include their header-under-test first to demonstrate that the header can be included cleanly without needing other includes.
#ifdef __cplusplus
extern "C" {
#endif
ExecuTorch is always C++, so no need for this or the matching one below
#ifndef _SYS_MMAN_H_
#define _SYS_MMAN_H_

#include <sys/mman.h>
#include <unistd.h>

ET_INLINE long get_os_page_size(){return sysconf(_SC_PAGESIZE)}
Not sure if this will compile without a semicolon. And the formatting is off
Suggested change:
-ET_INLINE long get_os_page_size(){return sysconf(_SC_PAGESIZE)}
+ET_INLINE long get_os_page_size() {
+  return sysconf(_SC_PAGESIZE);
+}
#include <sys/mman.h>
#include <unistd.h>

ET_INLINE long get_os_page_size(){return sysconf(_SC_PAGESIZE)}
Now that we're wrapping this, I'd like to return size_t instead of long to be more consistent with the types that ET uses.
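A minimal sketch of what a `size_t`-returning wrapper could look like on both platforms. This is an illustration under assumptions, not the code that landed: it uses plain `inline` in place of `ET_INLINE`, and the Windows branch (GetSystemInfo) does not appear in the hunks above.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

// Sketch only: returns the OS page size as size_t on either platform.
inline size_t get_os_page_size() {
#ifdef _WIN32
  SYSTEM_INFO si;
  GetSystemInfo(&si);
  return static_cast<size_t>(si.dwPageSize);
#else
  // sysconf() returns long and reports errors as -1, so check before casting.
  const long page_size = sysconf(_SC_PAGESIZE);
  assert(page_size > 0);
  return static_cast<size_t>(page_size);
#endif
}
```

Converting to `size_t` inside the wrapper keeps the signed `long` from sysconf() out of callers' offset and alignment arithmetic.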
 * LICENSE file in the root directory of this source tree.
 */

#include <executorch/extension/data_loader/mman.h>
Please move this include down to the group on line 21 so that this .cpp file includes its .h file first.
size_t map_size = range.size;
#ifdef _WIN32
// On Windows, don't mmap-in memory past end of on-disk file.
//
// The Windows implementation of mmap uses CreateFileMapping which returns
// error STATUS_SECTION_TOO_BIG (0xc0000040) if we try to map past the end
// of the last page of a file mapped in as read-only.
if (range.start + range.size > file_size_) {
  map_size = file_size_ - range.start;
}
#endif
Seems reasonable to always do this, if it works on unix systems; we're trying to minimize the number of ifdefs in common code.
Suggested change:
-size_t map_size = range.size;
-#ifdef _WIN32
-// On Windows, don't mmap-in memory past end of on-disk file.
-//
-// The Windows implementation of mmap uses CreateFileMapping which returns
-// error STATUS_SECTION_TOO_BIG (0xc0000040) if we try to map past the end
-// of the last page of a file mapped in as read-only.
-if (range.start + range.size > file_size_) {
-  map_size = file_size_ - range.start;
-}
-#endif
+size_t map_size = range.size;
+if (range.start + map_size > file_size_) {
+  // Clamp to the end of the file.
+  //
+  // The Windows implementation of mmap uses CreateFileMapping which returns
+  // error STATUS_SECTION_TOO_BIG (0xc0000040) if we try to map past the end
+  // of the last page of a file mapped in as read-only.
+  map_size = file_size_ - range.start;
+}
srcs = [
    "mmap_data_loader.cpp",
    "mman_windows.cpp"
] if host_info().os.is_windows else [
    "mmap_data_loader.cpp"
],
Something like this would avoid repeating the common part.
Suggested change:
-srcs = [
-    "mmap_data_loader.cpp",
-    "mman_windows.cpp"
-] if host_info().os.is_windows else [
-    "mmap_data_loader.cpp"
-],
+srcs = ["mmap_data_loader.cpp"] + (
+    ["mman_windows.cpp"] if host_info().os.is_windows else []
+),
],
headers = [
    "mman.h",
    "mman_windows.h"
Please also make this header conditional on is_windows
@jsidewhite can I offer help on adding changes here?

Looks like this PR hasn't been updated in a while so we're going to go ahead and mark this as

@jsidewhite could you address the code changes mentioned above? Or if PR #8916 has addressed this, do close this issue.

@jsidewhite closing this assuming #8916 completes the changes suggested here. Re-open if necessary
For issue #4661