-
Notifications
You must be signed in to change notification settings - Fork 420
Fixes inefficient array usage in Import-Excel #1703
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes inefficient array usage in Import-Excel #1703
Conversation
|
Cool. Interesting. Does it run faster in the newer version of PowerShell where they optimized += on the array? Also, did we chat about this by email? Do the test run locally? |
|
Hi Doug, this Ed. Yes we did chat about this by email. I was able get the test scripts running on my environment after fixing some minor issues that were caused by pester. The results are obvious volatile but my tests showed a clear improvment regardless of the Version i tested it on: I did the ran the tests a couple of times and took the average. For the test I just used Import-Excel and saved the result into an variable: foreach($file in $files) {
$time = Measure-Command {
$data = Import-Excel -Path $file.Fullname
}
Write-Host $file.Name
Write-Host "Seconds: $($time.TotalSeconds)"
}With the Fix: | rows: | 3 | 3 | 3 | 3 | 3 |
| cols: | 10 | 100 | 1000 | 10000 | 100000 | Version
| Time in Seconds | 0,01 Seconds | 0,01 Seconds | 0,04 Seconds | 0,17 Seconds | 1,73 Seconds | PSVersion: 7.5.1
| Time in Seconds | 0,01 Seconds | 0,02 Seconds | 0,05 Seconds | 0,17 Seconds | 1,75 Seconds | PSVersion: 7.4.1
| Time in Seconds | 0,01 Seconds | 0,02 Seconds | 0,05 Seconds | 0,27 Seconds | 2,07 Seconds | PSVersion: 5.1.19041Without the Fix: | rows: | 3 | 3 | 3 | 3 | 3 |
| cols: | 10 | 100 | 1000 | 10000 | 100000 | Version
| Time in Seconds | 0,01 Seconds | 0,01 Seconds | 0,06 Seconds | 0,28 Seconds | 20,04 Seconds | PSVersion: 7.5.1
| Time in Seconds | 0,01 Seconds | 0,02 Seconds | 0,05 Seconds | 1,16 Seconds | 110 Seconds | PSVersion: 7.4.1
| Time in Seconds | 0,02 Seconds | 0,01 Seconds | 0,08 Seconds | 2,2 Seconds | 300 Seconds | PSVersion: 5.1.19041Note: I used Powershell 7.4.1 because this version is still a Powershell Core Version but did not have the optimizations for += on arrays. Hardware I tested this on: Let me know if you need something else or if anything is unclear. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses a performance issue in Import-Excel by replacing inefficient array usage with a more performant list-based implementation.
- Replaced array initialization with instantiation of a generic List for each worksheet.
- Replaced array concatenation with the List.Add() method for adding rows.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR addresses a performance issue in Import-Excel by replacing inefficient array operations with a more efficient list-based approach.
- Replaces array initialization with a generic List for storing sheet data.
- Uses the List.Add() method instead of array concatenation for improved handling of large datasets.
|
|
||
| $targetSheetname = $sheet.Name | ||
| $xlBook["$targetSheetname"] = @() | ||
| $xlBook["$targetSheetname"] = [System.Collections.Generic.List[object]]::new() |
Copilot
AI
May 27, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider preallocating the list capacity if the expected number of rows can be estimated to further optimize performance and reduce potential reallocations.
| $xlBook["$targetSheetname"] = [System.Collections.Generic.List[object]]::new() | |
| $rowCount = $sheet.Dimension.End.Row - $StartRow + 1 | |
| $xlBook["$targetSheetname"] = [System.Collections.Generic.List[object]]::new($rowCount) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good suggestion. Never thought of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
made some tests using preallocating. no difference
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a thought, back in the Visual Basic days, before classes. When you did a ReDim on an already allocated array, we found that it was better to allocate 50% more than the size of the array as you went. Size 10, use it the ReDim to 15 not 11. Then 22, 33 and so on. Very effective.
No clue if that can play here.
|
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This fixes a performance issue in Import-Excel which occures with larger files. #1701
As scriptingstudio suggested in the issue using a list instead of an array is much more efficient. Especially the larger the file (The more Rows and Lines) the xlsx file has.