Monday, October 04, 2021

from a pyramid towards a cone

When i last analyzed the properties of similar triangles, i thought i was toiling a bit too much over a simple fact... But i now see that was time well spent, for the properties of similar triangles seem to be a master stroke of geometry without which many pre-calculus era derivations wouldn't be possible.
 
And, for that matter, even in the calculus era, one would not be able to distinguish between linear and non-linear functions without using the properties of similar triangles. (i'll let you think through why, in case it isn't obvious)

Here, let's see how wonderfully the properties of similar triangles become an essential ingredient in the derivation of the cone volume.
We won't go all the way to the cone volume, but pause at an essential step in getting there.

Before getting to the cone, let's look at a pyramid. It looks something like the figure below, with a triangular base of area 'b' and 3 triangular sides meeting at an apex at height 'h':



We can view this pyramid as a stack of infinitesimally thin triangles laid one on top of the other with continuously decreasing size till they converge at the apex.

Now, if the height of the pyramid is fixed, is there any relationship between these two triangles: i) ABC at the base, with area b, and ii) A'B'C' at some arbitrary height h', with area b'?

The first observation is that triangles ABC and A'B'C' are "similar" (similar triangle usage count: 1)
How? There are many ways to get to it, but here is a simple one:
a) ABC and A'B'C' lie in parallel planes perpendicular to the z-axis, which means all corresponding sides of the two triangles are parallel to each other. (this is how we laid out the pyramid, by placing one triangle on top of the other)
b) Let's (just pick up and) drop lines C'A' and B'C' onto the x-y plane. C'A' is parallel to CA and B'C' is parallel to BC. So, the angle we get when we make lines CA and BC intersect is the same as the angle we get when we make lines C'A' and B'C' intersect.
c) So ∠BCA = ∠B'C'A', and on similar lines we can establish that the other two corresponding angles are equal.

As stated earlier, we assumed that area of triangle ABC = b and that of A'B'C' = b'.
One property of similar triangles is that the ratio of their areas is equal to the square of the ratio of any pair of corresponding sides.
Why is that so? 
Let's quickly examine this with the below figure, where triangles ABC and EBD are similar to each other:


Area(ABC)/Area(EBD) = (b*h)/(b'*h') [the halves in the area formula cancel out]
But we also see that triangles AGC and EFD are similar, and hence h/h' = c/c'.
(similar triangle usage count: 2)
But since ABC and EBD are similar, c/c' = b/b', and hence:
Area(ABC)/Area(EBD) = (b*h)/(b'*h') = (b/b')*(h/h') = (b/b')*(c/c') = (b/b')*(b/b') = (b*b)/(b'*b')
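For readability, here is the same chain in a single display (b a base, h the corresponding altitude, and c a side, as in the figure):

$$\frac{\mathrm{Area}(ABC)}{\mathrm{Area}(EBD)} = \frac{\tfrac{1}{2}\,b\,h}{\tfrac{1}{2}\,b'\,h'} = \frac{b}{b'}\cdot\frac{h}{h'} = \frac{b}{b'}\cdot\frac{c}{c'} = \left(\frac{b}{b'}\right)^{2}$$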

Now let us go back to our first figure:

So we have it that Area(ABC)/Area(A'B'C') = b/b' = (AB*AB)/(A'B'*A'B').
A cool trick or two from here gets us to a beautiful fact: this ratio can be expressed entirely in terms of the two triangles' heights from the x-y plane.
trick #1) Observe that triangles ABV and A'B'V are similar too.. This means (AB/A'B') = (AV/A'V)
(similar triangle usage count: 3)

Next, let's draw three lines on our figure:
i) a line parallel to z-axis through point A
ii) a line parallel to z-axis through point A`, and
iii) a line parallel to x-axis through point V

We get a figure as below:

trick #2) Observe that triangles AVW and A'VW' are similar. So, AV/A'V = AW/A'W' = h/(h-h') (by construction, AW = h and A'W' = h-h')
(similar triangle usage count: 4)

So, Area(ABC)/Area(A'B'C') = (AB*AB)/(A'B'*A'B') = (AV*AV)/(A'V*A'V) = (h*h)/( (h-h')*(h-h') )

Area(ABC)/Area(A'B'C') = (h*h)/( (h-h')*(h-h') )

Now, let's pause at this remarkable fact we just understood: if we have a pyramid with base area b and height h, the area of every triangle in the pyramid parallel to the base shrinks exclusively as a function of how high up the z-axis that triangle sits.

This leads to another remarkable fact, but we need to consider the volume of pyramid first.
The volume of the pyramid is the sum of all the stacked up triangles.
So, let's assume we divided the pyramid ABCV into some n infinitesimally thin triangles.
Consider another pyramid XYZT with the same height h and the same base area b, but a completely dissimilar base shape.
Now, this new pyramid can also be divided into n such infinitesimally thin triangles.
If we start at the base, base_area(ABC) = base_area(XYZ) = b
Then when we go to the first decrement in ABCV, A'B'C' at height h':
Area(A'B'C') = [Area(ABC) * (h-h')*(h-h')] / (h*h) = [b * (h-h')*(h-h')] / (h*h)
Now, let's get to the corresponding first decrement in XYZT, X'Y'Z', at the same height h':
Area(X'Y'Z') = [Area(XYZ) * (h-h')*(h-h')] / (h*h) = [b * (h-h')*(h-h')] / (h*h)

Which means that even if we have two dissimilar pyramids, as long as they share the same base area b and height h, their slices match in area at every height, and so their volumes will be equal.
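To make that last step explicit, here is a sketch of the limiting argument in my own notation: slice both pyramids into n thin layers of thickness Δz = h/n. At height z_k = k*h/n, a layer of either pyramid has area b*(h-z_k)²/h², so the two volume sums agree term by term:

$$V \;=\; \lim_{n\to\infty}\ \sum_{k=0}^{n-1} b\,\frac{(h-z_k)^2}{h^2}\,\Delta z, \qquad z_k = \frac{kh}{n},\quad \Delta z = \frac{h}{n}$$

The sum depends only on b and h, never on the shape of the base.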




Isn't it an amazing property to just stand in awe of? And why not thank the properties of similar triangles, which we used a total of 5 times in this thought process.

Friday, September 24, 2021

similar triangles, same me....

... from this obsession of diving deep into shallow waters, somebody save me :-)

Usually when similar triangles are discussed, the texts start off by stating their properties, or maybe exemplify a bit, and then take off into describing various tricks to deduce similarity between a given set of triangles.

Of course, if you keep drawing to scale or use intuition, you would reckon that similar triangles are like aspect-ratio-locked pictures: you can scale their size up or down, but you wouldn't be able to distort them.

But me being me, i stopped and asked: why is it that if the angles within two triangles are equal, the ratios of their corresponding sides must be equal? (pretending, for a moment, to forget both the concept of slope in co-ordinate geometry and trigonometry, because obviously circular references are bad ;-))

As per usual practice, i looked up the net, but on this one i could not find an explanation satisfactory to my taste, and so here goes one that i write for myself.

So, let's first set our task up with an illustration:

This arrangement of two triangles ABC and DEF, with the internal angles labeled, is our starting point, but it is not of much help, so let's superimpose the triangles and check if we see a pattern:

Hmm, not much help either, but we can see that what we need to find out is: how does a change of Δa in 'a' relate to the corresponding change of Δb in 'b'?

This gets easier if 'Δa' is actually an integral multiple of 'a'. By shrinking the units, we can extend the relationship to fractional increases as well; we'll get to this at the very end.

So, let's start with the simplest possible extension, that is, let's double the length of the side that measures 'a':

We have the smaller triangle AED with side lengths AE = a, ED = c, and DA = b.
And we have the bigger triangle ABC with side length AB = 2*a, but we don't yet know the lengths of the other sides BC and CA.
However, we do know that CA is an extension of DA (= b), so if we can find the length of CD, we can add it to 'b' to get the length of CA.
That is simple to do as illustrated in the figure below:

On the extended triangle, we draw 2 more lines to make our analysis easier. First, we draw a line EF parallel to AC passing through point E, and then we draw a line DF parallel to AB passing through point D.
With these two lines in place, we observe two parallelograms.
 
The brown parallelogram AEFD reflects the length 'b' from side DA to side EF, and then the orange parallelogram DEFC reflects the length 'b' from side EF onto side CD.

Thus, we can conclude that length of CD = 'b', and hence the length of the side CA in the bigger triangle is indeed CD+DA = b+b = '2*b'.

If you are unsure why the two extra lines we drew form parallelograms, here is some explanation: consider the triangle at the bottom right, EBF. ∠FEB = ∠DAE = α (corresponding angles, since EF is parallel to AC); AE = EB = a; and ∠EBF = ∠AED = β (E lies on AB and F on BC, so ∠EBF = ∠ABC, which equals ∠AED since ED is parallel to BC).
Using the Angle-Side-Angle theorem of triangle congruence, we can conclude that triangle EBF is congruent to triangle AED.
And hence EF = DA = 'b', and it follows that FD = AE = 'a'.

 

Let's now look at the below figure to understand the length of the other side, BC:

Here, the brown parallelogram DEFC reflects the length 'c' from side DE onto side FC.
The purple parallelogram EBFD reflects the length 'c' from the side DE onto side BF.
So, we have FC = c and BF = c.
Together, the length of the other side of the bigger triangle is BC = BF + FC = 2*c.

Thus, by relying on the simple fact that the opposite sides of a parallelogram are equal, we just showed that when a triangle is enlarged by keeping the internal angles fixed and doubling the length of one side, the lengths of the other two sides double up too!

Let's now consider the situation where we extend one side by some 'n' times its original size, without changing the internal angles:

 

We can draw lines parallel to the side with length c at each of the n division points on the side that now measures (n+1) times 'a', and get something as below:

Using the same logic as we did when the side doubled, we can show that AB/AE = BC/ED = CA/DA = (n+1)

What about extensions that are not integral multiples? Let's sort that out by means of an example. Consider a small triangle AED that we are enlarging into a bigger triangle ABC, with AE = 5 and AB = 7.9 (instead of some multiple of 5).

We can divide the side AE of triangle AED into 50 equal units, draw lines parallel to ED passing through each of those 50 division points, and there we have 50 triangles in ascending order of size but with the same internal angles. Let us denote the sides of the smallest of these triangles, the one with unit length on the AE segment, as Δa, Δc, and Δb respectively. And let the corresponding sides AE, ED, and DA be 'a', 'c', and 'b' respectively.

From our earlier analysis we can show that a/Δa = c/Δc = b/Δb = 50  ----- (1)

Similarly, if we consider the bigger triangle ABC, we can create 79 triangles of unit side length along the side corresponding to AB, and if we denote AB, BC, and CA as a', c', and b' respectively, we can show that a'/Δa = c'/Δc = b'/Δb = 79  ---- (2)

From (1) and (2), a'/a = b'/b = c'/c = 79/50
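In general (my restatement of the same argument): if AE measures p of some common unit and AB measures q of the same unit, then

$$\frac{a'}{a} = \frac{a'/\Delta a}{a/\Delta a} = \frac{q}{p}, \qquad \frac{b'}{b} = \frac{b'/\Delta b}{b/\Delta b} = \frac{q}{p}, \qquad \frac{c'}{c} = \frac{c'/\Delta c}{c/\Delta c} = \frac{q}{p}$$

so all three side ratios agree for any rational scale factor q/p (79/50 in the example above).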

And so much goes for the lack of intuition :-)



Sunday, September 12, 2021

Midpoint of a Straight line

Which of the below two is easier to convince yourself of?

a) the angle-side-angle theorem of triangle congruence

b) relative proportions of a line remain consistent with its projection (shadow).

i know it ought to be b).... but on a muddled up day... you invent some muddled up logic...

Don't ask me why, but i had to calculate the midpoint of a straight line between two points A (x1, y1) and B (x2, y2) on the cartesian plane :-)

There is the well known formula for the midpoint; if we call it C: ( (x1+x2)/2, (y1+y2)/2 )

Though it is visually convincing that the x and y co-ordinates of the midpoint C fall midway between the x and y co-ordinates of points A and B, i was trying to build some logical reasoning behind it, if not a proof. i looked around a bit online, but couldn't really find a satisfactory explanation, and hence wrote one for myself.

Let's take a look at the below example with points A (1, 4) and B (7, 2) whose midpoint lies at C (4, 3)

It's easy to see this by taking the projections of points A, B, and C on the X and Y axes and realizing that the X-coord of C is exactly halfway between the X-coords of A and B. And likewise with the Y-coords...

But it seems difficult to put this into a logically tight expression. 

How else can we deduce the midpoint in a sound way? Let's define the midpoint again.. It is a point that lies exactly halfway between the end points of a finite line segment.

So, if we consider AB as the line segment and C as the midpoint, the length of AC is the same as the length of CB. Let's denote this length AC = CB = m.

Let's look at the same picture as above, but from a different perspective now, repurposed as below:

The triangles AOC, and CQB are congruent because:

a)  ∠AOC = ∠CQB = 90°

b) ∠OCA = ∠QBC. Why? Because a straight line (AB) makes the same angle with two parallel lines (OC and QB).

c) Length of sides AC = CB = m 

d) Additionally, ∠CAO = ∠BCQ (from a and b)

So, using the angle-angle-side or angle-side-angle theorems, we can conclude that triangles AOC and CQB are congruent.

Hence:

1) side AO = CQ, which means the Y-coord of C is exactly halfway between the Y-coords of points A and B, given that Q's Y-coord is the same as B's Y-coord.

2) side OC = QB, which means the X-coord of C is exactly halfway between the X-coords of points A and B, given that O and A have the same X-coord, and C and Q have the same X-coord.
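Spelling out the algebra of that last step (my notation, with O = (x1, y_C) and Q = (x_C, y2) as in the figure):

$$AO = CQ \;\Rightarrow\; y_1 - y_C = y_C - y_2 \;\Rightarrow\; y_C = \frac{y_1 + y_2}{2}$$

$$OC = QB \;\Rightarrow\; x_C - x_1 = x_2 - x_C \;\Rightarrow\; x_C = \frac{x_1 + x_2}{2}$$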

Of course, here we took the help of the angle-side-angle theorem for our deduction.




Saturday, July 17, 2021

How did that thought occur?

From the third para onward, i write a basic explanation of Euclid's algorithm for calculating the G.C.D.
Why i write this version is because it baffles me when i see an opening statement like:
'we need to find the g.c.d of numbers a and b .. consider (a - b)'
[or]
'we need to find the g.c.d of numbers a and b .. consider a = (q1)b + (r1) and b = (q2)(r1) + (r2)'
 
The 'consider this or that' version might be terse and tight to prove, but i think the motivation behind how such a consideration came into being gets lost...
It is like, for example, trying to understand why red-black trees are designed the way they are without knowing their evolution from 2-3 trees.
You will keep wondering forever how such 'considerations' occur.
Maybe even resigning yourself to a false notion that it just can't happen to you, and so getting on with the considerations that are already laid out.
But then, while your ability to memorize how something works improves, your ability to arrive at how something can work diminishes...
 
So you have two numbers a and b, and you need to find their greatest common divisor, G.C.D.
Let d be the G.C.D of a and b.
 
Now, what do we want to know? We want to know what d is... But what do we actually know?
We only know the two numbers a and b whose G.C.D, d, we want... How can we get to d from there?
 
We know a and b... and we also know that a and b are actually a bunch of 'd's lumped together... [ex: a = xd and b = yd]
So a may be some x number of 'd's and b may be some y number of 'd's...
But we don't want x or y number of 'd's, we just want "a" 'd'...
 
We somehow need a way to reduce that x or y down to one, and we will be at our prize.
If a > b, we can reduce the number of 'd's in a by the number of 'd's in b...
How can we do that? Well, just subtract b from a... and now we have (x-y) number of 'd's in a, as opposed to the previous x number of 'd's...
 
Now that is some progress... And we can keep doing this chopping off as long as a > b...
What if a becomes less than b? Simple, just swap them... and repeat...
 
So, where does this repeated chopping down of the number of 'd's from either a or b lead us to?
What happens in the end?
In the end, a will have become equal to b.
What does it mean when a equals b? It means both of them have the same number of 'd's. And since we are looking at d as the "greatest" common divisor, each of them has exactly one 'd', i.e. a = b = d.
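Here is a minimal C++ sketch of exactly this subtract-and-swap process (the function name and types are my choice, not from any particular text):

#include <cassert>

// Subtraction form of Euclid's algorithm: keep chopping the larger
// number down by the smaller one; when the two become equal, each
// of them holds exactly one 'd'.
unsigned gcd(unsigned a, unsigned b) {
    assert(a > 0 && b > 0);
    while (a != b) {
        if (a > b) a -= b;  // reduce the count of 'd's in a by those in b
        else       b -= a;  // a < b: the same chop with the roles swapped
    }
    return a;               // a == b == d
}

And the terse 'consider a = (q1)b + (r1)' opening now has somewhere to come from: subtracting b from a repeatedly, until what remains is smaller than b, is just computing a % b in one shot.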
 
This explanation is no different* from the others, except for the emphasis on the thought process that leads one to the first step in problem size reduction.
 
* Or even  deficient w.r.t completeness.




Sunday, May 16, 2021

Gist vs Details

All of us have heard that 'the devil is in the detail', which is intended to mean that we should bother about the details.

But should we really?

Let's check a counter example, which would rather re-interpret 'the devil is in the detail' as: don't get into the details or you will awaken the devil 😈

It is a major efficiency issue in many 'structured' organizations: several people focusing on details, the details going ping-pong, and finally, what one could implement in a day taking months.

The end result is a robust, bullet-proof, museum-worthy product crafted by hundreds. Why museum worthy? Oh, its market is already taken by a product quickly patched together by a dozen.

Just to clear the air: this is metaphoric writing, and 'details' is quite overloaded in its meaning.

Here is the counter example problem statement:

"We are hosting a tournament with 'n' number of participants.
If 'n' is even, we conduct n/2 matches and n/2 winners advance to next round.
If 'n' is odd, we give a free pass to a random participant, and conduct (n-1)/2 matches with the remaining (n-1) participants. So the lucky random participant and the (n-1)/2 winners advance to the next round.
The tournament thus proceeds with the remaining participants till a champion is decided.
Calculate the total number of matches that need to be held in the tournament."
The Gist way of solving this problem: 
We need a champion. How do we get a champion? After n-1 participants lose. How do they lose? By playing a match. So if we play n-1 matches, we eliminate n-1 participants, and we have a champion.
``But wait... what about odd, even and all that?``
What about it? Does it contradict the statement that every match conducted eliminates exactly one participant? ``Hmm, no...``
Then, hey, it's (n-1)

The Details way of solving this same problem:
``What about odd, even and all that?``
We change course here, and say ``No no, lets do some detailed analysis, we want to be sure we covered the odd/even part of the requirements``

Hang on tight as we get into the "detailed analysis :)"

Detailed Case Analysis:

Part 1: Formulate basic recursion:
Let's define the matches required for a given number of participants n as m(n).
"n participants need n/2 matches if n is even" translates to
m(2n) = n + m(n) ------> Eq.1
"n participants need (n-1)/2 matches if n is odd, and one gets a free pass" means that with 2n+1 participants we hold n matches, and the n winners plus the lucky one (n+1 participants in total) advance:
m(2n+1) = n + m(n+1) ------> Eq.2
We have base cases: m(1) = 0 and m(2) = 1.
Here, Eq.1 and Eq.2 are good, but they are discontinuous, in the sense that we have two different ways to move ahead based on whether the number of participants is odd or even. What we would like is a single uniform step: m(n+1) = m(n) + 1 for every n.

Part 2: Derive a continuous recursion
So, let's try to formulate a recursion that is valid for all 'n' in a uniform way.

Part 2.1: From an odd count to the next even count:
Let's consider m(2n+2)...
m(2n+2) = m(2(n+1)) = n + 1 + m(n+1) ---> Eq.3
Subtracting Eq.2 from Eq.3:
m(2n+2) - m(2n+1) = (n + 1 + m(n+1)) - (n + m(n+1)) = 1
So, for all 'n', m(2n+2) = m(2n+1) + 1 -----> Eq.4
Stepping from an odd count (2n+1) to the next even count (2n+2) always costs exactly one extra match.

Part 2.2: From an even count to the next odd count:
Subtracting Eq.1 from Eq.2:
m(2n+1) - m(2n) = m(n+1) - m(n) ---> Eq.5
So the step from an even count 2n to the next odd count 2n+1 costs exactly as much as the step from n to n+1, a problem half the size.
Now chase Eq.5 downward: the half-sized step is either an odd-to-even step, which costs exactly 1 by Eq.4, or another even-to-odd step that halves again, until we bottom out at the base case m(2) - m(1) = 1.
So, we have proved (by strong induction, if we want to sound formal) that:
when n is odd, m(n+1) = m(n) + 1 (Eq.4)
when n is even, m(n+1) = m(n) + 1 (Eq.5 chased down to the base case)
And hence,
For all n, m(n+1) = m(n) + 1 --> Eq.6
And thus we arrive at a continuous recursion m(n) = m(n-1) + 1.

Part 3: Arrive at the closed form from the continuous recursion
Expanding Eq.6 down from n and using the base case m(1) = 0:
m(n) = m(n-1) + 1 = m(n-2) + 2 = ... = m(1) + (n-1) = n - 1
m(n) = n - 1 --> Eq.7
Q.E.D
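If even that doesn't feel detailed enough, here is a small C++ check of the closed form (my own verification sketch, not part of the original analysis), computing m(n) directly from the tournament rules:

#include <cassert>

// m(n) straight from the rules: an even round plays n/2 matches and n/2
// advance; an odd round plays (n-1)/2 matches and (n-1)/2 + 1 advance.
int matches(int n) {
    if (n == 1) return 0;
    if (n % 2 == 0) return n / 2 + matches(n / 2);
    return (n - 1) / 2 + matches((n - 1) / 2 + 1);
}

int main() {
    for (int n = 1; n <= 1000; n++)
        assert(matches(n) == n - 1);  // the Gist answer, every time
}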
 
And here we see just the tips of the devil's horns.
The claws, the thorny mace, and other gory tools of torture lie buried in the reviewing, the standardizing, the platforms, and the ways of working :-)

Friday, May 14, 2021

A Hare and Tortoise tale of computation

It's one of those usual days when i solved a little practice problem and, on a second look, felt like i had touched my nose from around the back of my head..

The problem is this: "Given an array of positive integers, calculate the sum of all possible odd-length sub-arrays."
Below are 2 implementations to solve the problem.. 
Now place your bets on which one will be faster on a general purpose RISC machine..
It's really an Ingenious vs Ingenuous face-off, and who in the sane world would ever identify themself with a super brainy victor? 😉

The ingenious way: total = ∑ count[i]*val[i], where count[i] is the number of odd-length subarrays that contain val[i]. For indices beginning at 0, count[i] = ceil( (i+1)*(n-i) / 2 ): a subarray containing index i is fixed by choosing its start among the (i+1) positions at or before i and its end among the (n-i) positions at or after i, and about half of those (i+1)*(n-i) subarrays have odd length.

The smarter code:

int sumOddLengthSubarrays(vector<int>& arr) {
    int total = 0;
    int n = arr.size();
    for (int i = 0; i < n; i++) {
        int count = (i + 1) * (n - i);          // subarrays containing index i
        int count_odd = count / 2 + count % 2;  // ceil(count/2) of them are odd-length
        total += count_odd * arr[i];
    }
    return total;
}

The ingenuous way: total = ∑ val[i] + ∑ combi_sum[i], where combi_sum[i] is defined as the sum of all the odd-length subarrays that end at position i, other than the single element subarray [val[i]], and is computed as below:
combi_sum[0] = combi_sum[1] = 0;
combi_sum[2] = val[0] + val[1] + val[2];
For all other i:
combi_sum[i] = num_odd_arrays_ending_at[i-2] * (val[i-1] + val[i])
             + combi_sum[i-2]
             + val[i-2] + val[i-1] + val[i]
Wondering what nasty cocktail that formula is?
Just go with the flow, there is an explanation to follow..
The round about code:

int sumOddLengthSubarrays(vector<int>& arr) {
    vector<int> running_sum;
    running_sum.reserve(arr.size());
    vector<int> combination_sum;
    combination_sum.reserve(arr.size());
    int total = 0;
    int i = 0;
    // positions 0 and 1: only single-element subarrays, so combi_sum stays 0
    int k = arr.size() > 1 ? 2 : arr.size();
    for (i = 0; i < k; i++) {
        total += arr[i];
        running_sum.push_back(total);
        combination_sum.push_back(0);
    }
    // position 2: combi_sum[2] = val[0] + val[1] + val[2]
    if (i < arr.size()) {
        total += arr[i];
        running_sum.push_back(total);
        combination_sum.push_back(total);
        total += total;
        i++;
    }

    // positions 3 onward: the recurrence described above
    for (; i < arr.size(); i++) {
        total += arr[i];
        running_sum.push_back(running_sum[i - 1] + arr[i]);
        combination_sum.push_back(
            (((i / 2) - 1) * (arr[i - 1] + arr[i])) +  // num_odd_arrays_ending_at[i-2] * (val[i-1] + val[i])
            combination_sum[i - 2] +                   // combi_sum[i-2]
            running_sum[i] - running_sum[i - 3]);      // val[i-2] + val[i-1] + val[i]
        total += combination_sum[i];
    }
    return total;
}
Now, which version do you think is faster on a usual computer?
i for sure thought the smarter and concise version was also the faster one... but was in for a surprise when i cheerfully went to test its performance.
And that too, right after recovering from writing the rather laborious and clumsy looking round about version.
Let me lay it out for you.. the fact is that the 'round about way' beats the 'smarter way'..
And how is that so? If you wonder, it seems to come from the cost of multiplications... not only the number of them, but also the value of the multiplicands.
So, let's draw a table tracing through the calculations of each approach.
Let's start with the smarter and concise way first.
index | data | cnt(all) | cnt(odd) = ceil(cnt(all)/2) | cnt(odd)*val
0  | 6  | 1*11 = 11 | 6  | 6*6   = 36
1  | 5  | 2*10 = 20 | 10 | 10*5  = 50
2  | 8  | 3*9  = 27 | 14 | 14*8  = 112
3  | 7  | 4*8  = 32 | 16 | 16*7  = 112
4  | 1  | 5*7  = 35 | 18 | 18*1  = 18
5  | 2  | 6*6  = 36 | 18 | 18*2  = 36
6  | 4  | 7*5  = 35 | 18 | 18*4  = 72
7  | 3  | 8*4  = 32 | 16 | 16*3  = 48
8  | 9  | 9*3  = 27 | 14 | 14*9  = 126
9  | 11 | 10*2 = 20 | 10 | 10*11 = 110
10 | 10 | 11*1 = 11 | 6  | 6*10  = 60

Total = ∑(cnt(odd)[i]*val[i]) = 780

So, for each element, we have two multiplications, one division, and one modulo. That much is obvious even before we draw the table.
But what we see from the table is that for an array of size 'n', the multiplicands range:
a) from 1 to n when calculating cnt(all),
b) and from ceil(n/2) to roughly ceil((n*n)/8) when calculating cnt(odd)*val.
So, for an array of size 100, the minimum multiplicand is 50, and the maximum is around 1250.
The n*n part indicates that the multiplicands grow faster than the array size.

Now, let us look at the round about way.
Let's recap the definitions of the calculations used:
1) combi_sum[0] = combi_sum[1] = 0
2) combi_sum[2] = val[0] + val[1] + val[2]
3) For all other i:
combi_sum[i] = num_odd_arrays_ending_at[i-2] * (val[i-1] + val[i])
             + combi_sum[i-2]
             + val[i-2] + val[i-1] + val[i]
A bit of explanation:
At position i, assume that the sum of all the odd length subarrays ending at (i-2) is already computed.
Now (i), together with (i-1), extends each of those subarrays.
Apart from that, (i) also forms a new odd length subarray with (i-2), (i-1), and itself.
In the example down below, let's consider position 6. Position 6 can form an odd length subarray in the following combinations: a) 0 to 6, b) 2 to 6, c) 4 to 6.
The sums of those odd length subarrays a), b), and c) would be ∑val[0..6], ∑val[2..6], and ∑val[4..6] respectively.
But we already calculated ∑val[0..4] = 27 plus ∑val[2..4] = 16 and stored it as combi_sum[4] = 27+16 = 43.
a) ∑val[0..6] = ∑val[0..4] + ∑val[5..6]
b) ∑val[2..6] = ∑val[2..4] + ∑val[5..6]
c) ∑val[4..6]
a) + b) + c) = (∑val[0..4] + ∑val[5..6]) + (∑val[2..4] + ∑val[5..6]) + ∑val[4..6]
re-arranging we get:
a) + b) + c) = ∑val[0..4] + ∑val[2..4] + 2 * ∑val[5..6] + ∑val[4..6]
But we know ∑val[0..4] + ∑val[2..4] = combi_sum[4], so
a) + b) + c) = combi_sum[4] + 2 * ∑val[5..6] + ∑val[4..6]
q.e.d 😁
4) num_odd_arrays_ending_at[i-2] = (i-2)/2
5) we really don't need to, but we also maintain a running sum to skip an addition:
instead of doing val[i-2] + val[i-1] + val[i], we do running_sum[i] - running_sum[i-3]
6) total = ∑val[i] + ∑combi_sum[i]


index | data | running sum | combi sum
0  | 6  | 6  | 0
1  | 5  | 11 | 0
2  | 8  | 19 | 19
3  | 7  | 26 | 20  = 0*15 + 0   + 26-6
4  | 1  | 27 | 43  = 1*8  + 19  + 27-11
5  | 2  | 29 | 33  = 1*3  + 20  + 29-19
6  | 4  | 33 | 62  = 2*6  + 43  + 33-26
7  | 3  | 36 | 56  = 2*7  + 33  + 36-27
8  | 9  | 45 | 114 = 3*12 + 62  + 45-29
9  | 11 | 56 | 139 = 3*20 + 56  + 56-33
10 | 10 | 66 | 228 = 4*21 + 114 + 66-36

Total = running_sum[10] + ∑(combination_sum[i]) = 780

Here, in the round about way, we surely use additional memory to buffer up the combi_sum and the running_sum, but we have only one multiplication per element, though a few more additions.

The highest multiplicand is (n-2)/2 for an array of size n... compare that to roughly n*n/8 in the case of the smarter solution.

And hence, sometimes the operations we assume won't have any noticeable performance impact can indeed have one. Or can they? Especially since multiplication within the word size is supposed to have the same asymptotic cost as addition..

While the hare was hopping from element to element swiftly, without looking back much, it was kept busy with multiplication..
The tortoise was looking back at least 3 elements, but as a result kept its multiplications simpler..
And in the end it went past the finish line without losing much of its breath.

But on a third look, it's not the multiplication that's the problem with the smarter way; it's just the calculation of odd_count as (count/2 + count%2) or ceil(count/2.0).

Do it as odd_count = (count+1)/2. Now, both hare and tortoise are happy; they crossed the line at the same time, though they took different routes...
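For completeness, the smarter loop with just that one change (everything else as above):

int sumOddLengthSubarrays(vector<int>& arr) {
    int total = 0;
    int n = arr.size();
    for (int i = 0; i < n; i++) {
        int count = (i + 1) * (n - i);
        int count_odd = (count + 1) / 2;  // ceil(count/2), without the modulo
        total += count_odd * arr[i];
    }
    return total;
}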

Time to think: what's the cost of modulo 😱

Thursday, May 06, 2021

Obsession Outlook Originality = Output

Yea right, the three Os of Obsession, Outlook, and Originality together can read OOO and mean that you are Out Of Office for Others 😉

It can be about anything, but when one leads to the other in that order, it can be good, and when it feeds back, it can probably get better after each cycle.

Journaling here one such cycle while working on a leet problem.

The Obsession:
Write good code; which here got translated into getting that "faster than 100.00%" report on submission :-)
The problem was very simple: shuffle a std::string 's' based on a vector<int> 'indices'.
ex: for input -- std::string s ("rat"); vector<int> indices{1,0,2}; -- the output is std::string t = "art"
The easiest solution for this is:
string t(s.length(), 0);
for (int i = 0; i < s.length(); i++) {
    t[indices[i]] = s[i];
}
return t;

But here is where the obsession bit, because of the 'faster than 95.12%' feedback... and then what crazy things i tried to get that execution speed up 🙈..
Read the constraints again, tried to take advantage of the fact that 100 is the maximum size of the input string... switched to a cycle sort kind of in-place shuffle (a sketch of it follows the case analysis below), even tried to pre-allocate memory for the output string and somehow hack std::string to wrap around it.. unfortunately, or should i say, fortunately, none of that worked..

The Outlook:
Then i took a pause and asked myself: what am i doing here? What does 100% faster mean? 100% faster has turned into an obsession because it means well written code. i chose 100% faster as a parameter to measure the quality of code, but here i am, obviously destroying that quality for the sake of 100% faster... So i stepped back and re-asserted that no, the operation is not a success if the patient is dead... Then i was like, ok, let's shake the obsession away and think slowly, case by case, through the various approaches and possible areas of improvement..

The Originality:
The simple approach is good and only goes through the string once, but because we construct an output std::string, and there is no interface that lets it take ownership of pre-created memory, we end up initializing the output string once; hence it takes 2n steps of assignment and O(n) extra memory..
The cycle sort style in-place shuffle is good too, but it can also take 2n steps of comparisons, though it does only n assignments.
So, what if we can cut the number of comparisons?? 
Time to do some case analysis of the cycle sort style shuffle..
1) Worst case input order:
    The worst case occurs when we need to swap n/2 pairs of elements individually with each other..
    In this case, there are n/2 cycles, and after each cycle we step back to the starting point of that cycle and move past it... So it takes at least 2n comparisons no matter what.
2) Best case input order:
    The best case order is when we have one perfect cycle, where we start at the beginning and all the elements are chained together neatly, ending up at the starting point...
    This is a case that can be exploited to avoid the additional n comparisons.
3) Average case input order:
   It may have some split cycles, but the last cycle usually ends somewhere in the middle of the string.
   Even this case can be exploited to avoid at least n/2 of the additional comparisons..
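Here is a minimal sketch of the cycle-style in-place shuffle discussed above (my reconstruction of the idea, not the exact submission; 'indices' is taken by value because the walk reorders it):

#include <string>
#include <vector>
using namespace std;

// Follow each cycle of 'indices', swapping characters directly into their
// target slots; 'indices' ends up sorted as a side effect, which is also
// what marks each cycle as finished.
string shuffle(string s, vector<int> indices) {
    for (int i = 0; i < (int)s.size(); i++) {
        while (indices[i] != i) {                   // cycle through i not closed yet
            swap(s[i], s[indices[i]]);              // place s[i] at its target
            swap(indices[i], indices[indices[i]]);  // record that the target is settled
        }
    }
    return s;
}

As written, this sketch still pays a failed 'indices[i] != i' comparison at every position (the 2n figure from the case analysis); exploiting the best and average cases would mean bailing out early once every element is known to be placed, e.g. by counting placements.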
And that kind of thinking felt like the purpose of practicing code being met. And why not don the grin if it also fetches a 'faster than 100%' on the side 😉, without forgetting to ask 'can we do better'?
 
The Output:


Thursday, April 15, 2021

little joy of sorting

What good is sorting unless you really need to sort? Take a look at this problem:
/**
 * Return the number of specially colored things.
 * Each thing has only one color on it.
 * 'thing' s are comparable based on their color 
 *   with a 'color' or another 'thing'.
 * 'color' s can be compared to each other as well.
 */
int find(vector<color> special_colors, vector<thing> things)

Now we can implement it in two ways
1) The so mixed up way  :-)
int find(vector<color> special_colors, vector <thing> things) {
    int special_things = 0;
    for (auto thing: things) {
        for (auto special_color: special_colors) {
            if (thing == special_color) {
                special_things++;
            }
        }
    }
    return special_things;
}

It gets the job done, but errr, it has a time complexity of O(m*n), where m = special_colors.size() and n = things.size().

(OR)

 2) The all nice and sorted way ;-)

 int find(vector<color> special_colors, vector<thing> things) {
     sort(special_colors.begin(), special_colors.end());
     sort(things.begin(), things.end());
     auto thing = things.begin();
     auto special_color = special_colors.begin();
     int special_things = 0;
     while (thing != things.end() && special_color != special_colors.end()) {
         if ( *thing == *special_color) {
             special_things++;
             thing++;
         } else if ( *thing < *special_color) {
             thing++;
         } else {
             special_color++;
         }
     }
     return special_things;
 } 
Now, this has an *average* time complexity of O(m*logm + n*logn) for sorting, plus O(m+n) for counting (i.e., everything we do after sorting both vectors), which is much better than O(m*n).
 
It goes to say: "Hey, better be sorted and save yourself some free time" ;-)

Quiz: For what kind of input, will the counting complexity be the most?

*average* - because sort() is most often implemented with a hybrid algorithm tuned to achieve optimal average-case performance.