The really nice people that run Pi blogs invited me over with the offer of a platform to knowledge share and learn in multiple languages, amongst a growing family of blogs. So you could in the future see versions of these posts in Spanish, Polish, etc? Nice!
Hmmm, so what should the first post on the new Pi blogs site be? How better to start off than investigating how to compute Pi to a large amount of digits using Swift? I’ve often wondered how Pi can be calculated and it sounds like an interesting exercise.
I’m not interested in the fastest algorithm or breaking any records (no super computers in my house) but it will be interesting how to adapt an algorithm to compute beyond the precision limitations of the CPU. A 64 bit CPU will only give us 18 digits of precision, which won’t be that impressive. Also Swift’s new language features might offer us some neat ways to develop and write the code, hopefully in a way that will lend itself to multithreading.
I’ll also employ TDD at every step to ensure we can refactor with safety as we go.
So to start off we need an algorithm. There are many to choose from, but John Machin’s formula from 1706 seems to be popular, easy to understand and can be used with techniques to compute to large amount of digits:
π/4 = 4·arctan(1⁄5) – arctan(1⁄239)
Taking baby steps, let’s test this
class PiMachinAlgorithmTest: XCTestCase
let tolerance = 0.00000001
let π = ((4.0 * atan(1.0 / 5.0)) - atan(1.0 / 239.0)) * 4.0
XCTAssertEqualWithAccuracy(π, M_PI, tolerance)
It passes, so are we done? Well….no. The atan function returns a double so that will only give us a limited amount of decimal places. So we need to next work out how to create our own arctan function, which we can then adapt to use a custom number type that will cope with large precision numbers.
Time to look at the Taylor Series. This is an infinite series that returns the result for our arctan function. As it is an infinite series, that means the longer we go on the more accuracy we get, perfect for our Pi quest. If we calculate arctan to the required decimal places, we can get Pi using the original Machin formula.
arctan(x) = x – x3⁄3 + x5⁄5 – x7⁄7 + x9/9 ………….
As infinity seems like a long time, lets just test with 5 steps.
let x = 1.0 / 5.0
let a = atan(x)
let b = x - (pow(x, 3.0) / 3.0) + (pow(x, 5.0) / 5.0)
- (pow(x, 7.0) / 7.0) + (pow(x, 9.0) / 9.0)
XCTAssertEqualWithAccuracy(a, b, tolerance)
Another success. So now things get interesting. We know the algorithm works and can scale but it’s the Double number type we already know is holding us back from creating enough Pi digits to fill our screen. We have to create a new number type and use it in the Taylor and Machin algorithms.
Creating a new number type that works with all operators and functions is not going to be trivial. So let’s try and cut some corners. Firstly the algorithm is going to converge on the value of Pi, so the numbers involved will always be less than Pi. So our number type only needs to handle small. Next with a bit of tweaking to the maths we can ensure this new number type (lets call it a super number) will only have to perform the following operations:
- be multiplied by an integer
- be divided by an integer
- add a super number to another super number
- subtract a super number from another super number
We have two strategies to implement this super number – store digits as multiple base 10 digits or to store the number in binary as multiple binary words. Although it’s not at all optimal in the past I’ve implemented a multiple base 10 digit strategy to create a ‘BigInt’ class for handling very large integers. To handle 1000 digit integers it creates an array of 1000 integers, where each integer represents a single digit. To perform addition, subtraction, division and multiplication it uses the basic methods learnt at school. All good to code, but not very efficient either for storage or making use of the CPU. The main advantage is the structure requires no conversion to read in base 10, just peak at the array or joining the entire array into a string is trivial.
For our Pi calculation the value only needs to be read once, when we have read enough of the Taylor sequence to get the required accuracy. So this technique will not win much there, and performing arithmetic with single digits is going to be slowwwwww.
So we will look at the big binary approach. To simplify as we know the numbers will never exceed pi we do not have to implement floating point, where the point can move to accomodate both small numbers and large. Instead we will implement a fixed point number with a fixed amount of bits to represent the integer part and another fixed amount of bits to represent the digits to the right of the floating point.
Consider 4 bits for integer and 4 bits for the fractional part:
or in decimal
So, to represent 12.75 would be
Or 5.625 would be
This takes a bit of getting used to, as the fractional part does not map directly back to decimal. It’s storing as a combination of binary fractions. For example 0.123 cannot be represented using the 8 bits shown above. Instead it requires many more bits which might be an approximation rather than the exact figure. However it is a true binary representation so when we implement the arithmetic methods they will follow standard operations that are CPU friendly. In other words fast and efficient!
To simplify the binary arithmetic operations we’ll store the bits in 32 bit unsigned integers. But how many do we need? How much resolution will 16 bits give us?
1/216 = 0.00001525878906
Nearly 5 places.
1/232 = 0.00000000023283
We can approximate each 32 bits will give us at least 9 decimal places of precision. In fact a bit more calculation shows it will give us around 9.6 places. So if we want Pi to 100 decimal places the number of 32 bit words we need is
100 / 9.6 = 11 (rounding up)
Plus one more to handle the integer part.
So that’s the strategy. To prevent this blog post from turning into war and peace the complete XCode project including the tests can be found here which has the fixed number class, a formatter to convert it’s value to decimal and a taylor sequence generator to implement the arctan function.
And here’s the payoff, to 1000 digits:
Computed in 0.3 seconds. Not too shabby!
What would be interesting now would be using the tools in XCode to look at any obvious ways to improve performance, and how to split the computation onto multiple threads to make use of multiple CPU’s. But that will have to wait for another post, I’m off now to get some Pi!