Optimization on checking for duplicates by MecuSorin · Pull Request #10325 · dotnet/fsharp

MecuSorin · 2020-10-23T07:38:30Z

Optimized the duplicate check so that idf is computed just n times in total across all elements, instead of 2 * n^2 times.

…(n + 1) times.

En3Tho · 2020-10-23T09:08:37Z

src/fsharp/CheckDeclarations.fs

-            let id1 = (idf uc1)
-            let id2 = (idf uc2)
+    let ids = elems |> List.mapi (fun i uc -> i, idf uc)
+    for (i, id1) in ids do


Sorry, but I don't see the difference between iterating the list via List.iteri and iterating it via a double foreach. Maybe I'm missing something? What's the gain? Also, you create a new list of tuples and then destructure them, so this code now allocates more objects, while the previous version only allocated an FSharpFunc.

Yeah, I didn't see that we would end up with tuples when I suggested this. @MecuSorin, maybe my suggestion was bad.

@forki I think the idea of the optimization was to split this function into two recursive functions: one keeps track of the current start (the tail) and calls the other, which tries to find duplicates within that tail. This way you don't need the j > i check every time and can skip the unneeded iterations.
But it still needs some kind of benchmark to be sure, and it's not O(n) anyway. The only way I see to make this O(n) is to use a HashSet/Map, but that needs additional memory.
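To make the two alternatives in this comment concrete, here is a minimal sketch. It works on plain strings rather than the compiler's identifiers, and the names `duplicatesByTail` and `duplicatesByHashSet` are hypothetical, not code from this PR.

```fsharp
open System.Collections.Generic

// Recursive pairwise check: each head is compared only against its own tail,
// so no `j > i` index test is needed. Still O(n^2) comparisons.
let rec duplicatesByTail (ids: string list) : string list =
    match ids with
    | [] -> []
    | id1 :: tail ->
        // One hit per later element equal to the current head.
        let hits = tail |> List.filter (fun id2 -> id2 = id1)
        hits @ duplicatesByTail tail

// Single pass with a HashSet: O(n) expected time, O(n) extra memory.
let duplicatesByHashSet (ids: string list) : string list =
    let seen = HashSet<string>()
    // HashSet.Add returns false when the value is already present.
    ids |> List.filter (fun name -> not (seen.Add name))

let sample = [ "A"; "B"; "A"; "C" ]
let byTail = duplicatesByTail sample       // ["A"]
let byHashSet = duplicatesByHashSet sample // ["A"]
```

Note that the two are not equivalent when a name occurs more than twice: the pairwise version yields one hit per matching pair, while the HashSet version yields one hit per repeated occurrence. Whether either is actually faster here would still need a benchmark.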

@En3Tho The optimization is about how many times the function idf is invoked.

Before, for each element in the sequence, idf was invoked 2 * n times, which means 2 * n^2 invocations in total. By allocating a new list that maps each element to its idf result, we invoke that function just n times. The actual duplicate check has the same complexity as before.
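The invocation counts claimed here can be checked with a small sketch. It assumes the old code evaluated idf for both elements inside the nested loop, as this comment describes; `Item`, `countedIdf`, and both function names are hypothetical stand-ins, not the compiler's actual types.

```fsharp
// Hypothetical element type and an idf that counts its own invocations.
type Item = { Name: string }

let calls = ref 0
let countedIdf (x: Item) =
    calls.Value <- calls.Value + 1
    x.Name

// Shape of the check before the change: idf runs twice per inner iteration.
let duplicatesNested idf (elems: 'a list) =
    let found = ResizeArray<string>()
    elems |> List.iteri (fun i e1 ->
        elems |> List.iteri (fun j e2 ->
            let id1 = idf e1
            let id2 = idf e2
            if j > i && id1 = id2 then found.Add id1))
    List.ofSeq found

// Shape of the check after the change: idf runs once per element up front.
let duplicatesPrecomputed idf (elems: 'a list) =
    let ids = elems |> List.mapi (fun i e -> i, idf e)
    let found = ResizeArray<string>()
    for (i, id1) in ids do
        for (j, id2) in ids do
            if j > i && id1 = id2 then found.Add id1
    List.ofSeq found

let items = [ { Name = "A" }; { Name = "B" }; { Name = "A" }; { Name = "C" } ]

calls.Value <- 0
let nestedResult = duplicatesNested countedIdf items
let nestedCalls = calls.Value        // 2 * n * n = 32 for n = 4

calls.Value <- 0
let precomputedResult = duplicatesPrecomputed countedIdf items
let precomputedCalls = calls.Value   // n = 4
```

Both versions still perform n * n comparisons, so the sketch only confirms the difference in idf calls. It says nothing about allocation cost or wall-clock time.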

Tbf, at all call sites idf is only reading the ID property, not computing anything. So after thinking about it: there is probably not much to gain here.

I think there's a lack of evidence on both sides here:

Proof that this speeds things up, given that we're still effectively doing a nested for loop

Proof that the replacement with tuples actually allocates, and thus proof that this makes anything worse

Given the lack of evidence on either side I don't think we can proceed with either accepting or closing this.

I agree, without a proper measurement it is hard to make a decision. Maybe someone with deeper experience of the inner workings will shed some light on this. I don't have the time right now to write a proper benchmark test.

@MecuSorin, I shall close this PR. If you want to reopen it when you have time to provide some benchmarks, that would be good.

Thanks for looking at this, and I look forward to seeing the perf analysis.

Kevin

Small optimization to compute the elements idf just n times, not n * …

a2c264c

…(n + 1) times.

MecuSorin mentioned this pull request Oct 23, 2020

split TypeChecker.fs #10317

Merged

En3Tho reviewed Oct 23, 2020


KevinRansom closed this Oct 26, 2020


MecuSorin wants to merge 1 commit into dotnet:main from MecuSorin:DuplicatesCheckOptimization

MecuSorin commented Oct 23, 2020 (edited)

En3Tho Oct 23, 2020 (edited)

forki Oct 23, 2020

En3Tho Oct 23, 2020 (edited)

MecuSorin Oct 23, 2020

forki Oct 23, 2020

cartermp Oct 24, 2020

MecuSorin Oct 24, 2020

KevinRansom Oct 26, 2020

KevinRansom Oct 26, 2020


5 participants
