Now that we have all the partial solutions in *.sums, we can merge them together to generate the final answer: the variance of all 100,000 random numbers.
If we add up all the sums and sums of squares we calculated previously, we can obtain the variance as follows:
variance = (sum_squared - sum * sum / N) / N
where N is the number of values.
See the Wikipedia entry on algorithms for calculating variance for a discussion of why this is a very naive approach!
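To make the "naive" warning concrete, here is a small sketch (not part of the tutorial's own code) that compares the formula above with a two-pass calculation. The function names and the sample values are illustrative assumptions, chosen to have a large mean and a small spread.

```python
def naive_variance(values):
    """Population variance from the running sum and sum of squares."""
    n = len(values)
    total = 0.0
    total_squared = 0.0
    for v in values:
        total += v
        total_squared += v * v
    return (total_squared - total * total / n) / n


def two_pass_variance(values):
    """Population variance computed around the mean (numerically safer)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


# Large mean, small spread: the worst case for the naive formula.
values = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print("naive:   ", naive_variance(values))
print("two-pass:", two_pass_variance(values))  # the exact answer is 22.5
```

The squares here are around 1e18, well beyond the range in which double-precision floats represent integers exactly, so the subtraction in the naive formula cancels nearly all the significant digits. When the mean is small relative to the spread, the two approaches agree closely.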
To do this, we only need to merge the values in *.sums, i.e. add up the sum and sum_squared values across all chunks. We can then apply the above (naive) formula.
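As a rough illustration of this merging step, the sketch below splits a list of random numbers into chunks, computes a (sum, sum of squares, count) triple per chunk, adds the triples together, and applies the naive formula. The chunk size, the seed, and the variable names are assumptions made for the example; the tutorial itself keeps the partial results in *.sums files rather than in memory.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only so the example is repeatable
values = [random.random() for _ in range(1000)]

# Split into chunks of 100 values, standing in for the earlier splitting step.
chunks = [values[i:i + 100] for i in range(0, len(values), 100)]

# One (sum, sum_squared, count) triple per chunk.
partials = [(sum(c), sum(v * v for v in c), len(c)) for c in chunks]

# Merging is plain addition of each component across chunks.
total = sum(p[0] for p in partials)
total_squared = sum(p[1] for p in partials)
n = sum(p[2] for p in partials)

variance = (total_squared - total * total / n) / n

# Cross-check against the standard library's population variance.
print(variance, statistics.pvariance(values))
```

Because each component is merged by simple addition, the order in which chunks are processed does not matter, which is what makes the per-chunk work easy to run in parallel.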
Merging files is straightforward in Ruffus.
The @merge decorator tells Ruffus to take all the files from the step 6 task (i.e. *.sums) and produce a single merged file, "variance.result".
Thus, if step_6_calculate_sum_of_squares created 1.sums, 2.sums, etc., the merging task function would be called once, with the list of all these file names as its input and "variance.result" as its output.
The final result is, of course, in "variance.result".
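The merge step can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the tutorial's original code: the function name calculate_variance is hypothetical, and the layout of each *.sums file (three lines holding the sum of squares, the sum, and the count) is assumed for the example. Ruffus is not imported, so that the example runs on its own; the @merge decorator appears only as a comment showing where it would go.

```python
import os
import tempfile


# In the pipeline, this function would be decorated with something like:
#     @merge(step_6_calculate_sum_of_squares, "variance.result")
# so that Ruffus calls it once with the list of all *.sums files.
def calculate_variance(input_file_names, output_file_name):
    """Add up the partial sums in every input file and write the variance.

    Assumed *.sums layout (hypothetical): three lines holding the
    sum of squares, the sum, and the number of values.
    """
    sum_squared, total, count = 0.0, 0.0, 0
    for name in input_file_names:
        with open(name) as f:
            chunk_sum_squared, chunk_sum, chunk_count = f.read().split()
        sum_squared += float(chunk_sum_squared)
        total += float(chunk_sum)
        count += int(chunk_count)
    variance = (sum_squared - total * total / count) / count
    with open(output_file_name, "w") as out:
        out.write("%s\n" % variance)


# Simulate the call Ruffus would make for two chunks, in a scratch directory.
workdir = tempfile.mkdtemp()
chunk_data = {"1.sums": [1.0, 2.0], "2.sums": [3.0, 4.0]}
for name, chunk in chunk_data.items():
    with open(os.path.join(workdir, name), "w") as f:
        f.write("%r\n%r\n%d\n" % (sum(v * v for v in chunk), sum(chunk), len(chunk)))

inputs = [os.path.join(workdir, name) for name in sorted(chunk_data)]
result_file = os.path.join(workdir, "variance.result")
calculate_variance(inputs, result_file)

with open(result_file) as f:
    print(f.read().strip())  # 1.25 for the values 1, 2, 3, 4
```

The key point is the shape of the call: a single invocation whose first argument is a list of every input file and whose second argument is the one output file.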