-
-
Notifications
You must be signed in to change notification settings - Fork 40
Create Math-Statistics package and add harmonic and geometric mean, and mode #186
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create Math-Statistics package and add harmonic and geometric mean, and mode #186
Conversation
| sum := 0. | ||
|
|
||
| self do: [ :i | i < 0 | ||
| ifTrue: [Error new signal: 'The harmonic mean cannot be calculated on negative numbers'] ifFalse: [ sum := (sum + (1 / i))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a guard clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactored this to use a guard clause.
| @@ -0,0 +1,37 @@ | |||
| Extension { #name : #Collection } | |||
|
|
|||
| { #category : #'*Math-Statistics' } | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergeStinckwich I suggested that geometric and harmonic mean computation should be in our Statistics package after some quick research on Wikipedia. Would you agree? Are there alternatives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that there is an arithmetic-geometric mean already implemented in ArbitraryPrecisionFloat.
This is absolutely not specific to statistics package.
Eventually, for small utilities like that, we can tolerate duplications, or we may extract those common utilities in a specific package...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the agm: is implemented in ArbitraryPrecisionFloat, my understanding is that you want this generalized to a collection, right? In this case, there is no conflict, just two implementations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oups, case of miss-reading, you don't want an arithmetic-geometric mean, you just want indepedent arithmetic, geometric, harmonic means? Sorry for the noise.
Also add in a test to make sure that passing in a negative number causes an error.
| "{ 4. 1. 1 / 32} geometricMean >>> (1/2)" | ||
| "#(3.14 1 4.56 0.333) geometricMean >>> 1.4776945822943937" | ||
|
|
||
| ^(self reduce: [ :a :b | a * b ] ) raisedToFraction: 1 / self size. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that this may result in Float overflow/underflow on large collections.
An alternative is to sum the logarithms. (a*b*c)^(1/3)=exp(ln(a*b*c))^(1/3)=exp((ln(a)+ln(b)+ln(c))/3)
That is something like ^((self collect: #ln) sum / self size) exp.
The price is that we may answer an inexact Float instead of an exact Integer in some edge cases.
For example{1/3. 2/3. 8/3. 16/3} geometricMean would answer 1.3333333333333333 instead of (4/3)...
Also note that the geometricMean may not be well defined for negative numbers, likewise the harmonicMean...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out. I have updated the implementation with your suggestion and added a guard clause similar to harmonicMean.
* Add in guard clause to prevent use with negative numbers as the behaviour is not well defined. * Prevent a potential float under/overflow by summing the logarithms as suggested by Nicolas Cellier.
|
Wondering whether this introduction of geometric and harmonic means should be an extension to the Collection class. Here I plead ignorance. Could an expert enlighten me? For example, if we have a collection of strings, does a geometric mean make sense? As an alternative, my thinking is we introduce the idea of a |
|
I agree, it only makes sense if the collection contains a numeric type. When I was implementing these I took inspiration from the statistical methods already defined on collections that similarly only work if they contain numbers. I'm new to smalltalk so I don't know what's considered good style in this case. |
|
@ctSkennerton hi there. Could you try to extract a |
| "#(2.5, 3, 10) harmonicMean >>> 3.6" | ||
| | sum | | ||
|
|
||
| self detect: [ :i | i < 0 ] ifFound: [ Error new signal: 'The harmonic mean cannot be calculated on negative numbers' ] . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe test i <= 0 here too
| "#(3.14 1 4.56 0.333) geometricMean >>> 1.4776945822943937" | ||
| "{1/3. 2/3. 8/3. 16/3} geometricMean >>> 1.3333333333333335" | ||
|
|
||
| self detect: [ :i | i < 0 ] ifFound: [ Error new signal: 'The geometric mean should not be calculated on negative numbers' ] . |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe test i <= 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I've changed the logic to be <=0
|
@hemalvarambhia Maybe I can sketch out what a StatisticalSample class >>> harmonicMean: aCollection
StatisticalSample class >>> geometricMean: aCollection
StatisticalSample class >>> mode: aCollection(This could then be further developed with other measures as needed) What do you think? |
You've got the idea. My thinking is that the methods would be instance ones, with the collection being a state variable. |
|
Ok, I've converted the code to use a new class |
|
Merge at your leisure, @ctSkennerton. @nicolas-cellier-aka-nice are you happy that your comments have been addressed? |
|
@hemalvarambhia I don't have permission to merge. |
|
@ctSkennerton merged. Thank you for contributing. Looking forward to your next one. :) |
|
@ctSkennerton love the documentation by the way. Very thorough! |
Hi,
Would you be interested in having a few extra basic statistics measures added into Polymath?
Best,
Connor