Thought Valve: 2020

Thursday, September 03, 2020

Now this may be silly..

But yes, i didn't think its safe to swap two numbers numbers represented in fixed bits without using a temporary variable

We have this famous trick from college days,

$a = a + b$

$b = a - b$ # a + b - b = a

$a = a - b$ # a + b -a = b

Which i practically never needed to use, up until in a recent hobby project where the number of registers available are only two.

well my college days happened some 20 years back, so maybe i understood it at that time but not any more... use it or lose it you see..

When i wanted to use this trick now, what really concerned me was the overflow, and turns out the overflow would not matter. i just try to explain in simple terms as below..

For an $n$ bit width field, the largest unsigned number that can be represented is $2^n - 1$.

So for an overflow to occur on adding numbers $a$ and $b$, we would need $a$ as:

$a = 2^{n-1} + x$

and $b$ such that $x + b = 2^{n-1} + y$ where $y \ge 0$

From above, we can re-express $b$ as $b = 2^{n-1} + y - x$ Lets remember this as $eq(1)$

When we do

$a := a + b$, we have

$a := 2^{n-1} + x + b = 2^{n-1} + 2^{n-1} + y = 2^n + y$

But because we only have $n$ bits, the value stored at a would be:

$a = y$

And then we set out to do

$b : = a - b = y - b$, so how would we do '$a - b$'?

We would first take 2's compliment of $b$ and then add it to $a$

2's compliment of $b$ is $2^n - b$

Let's now substiture $b$ from $eq(1)$, then we have,

$b := a - b = y - b = y + 2^n - b = y + 2^n - (2^{n-1} + y - x)$

Thus we get

$b := \cancel{y} + 2^n - 2^{n-1} - \cancel{y} + x = 2^{n-1} + x$

Now this last expression is exactly the same as the value originally at $a$.

Similary, we can convince ourselves for the step $a = a - b$ that follows

But does it mean i understand the elegance of the modular arithmetic involved here?? No way :-)

Saturday, March 14, 2020

Why only 2 Weighted Medians at most?

It is intuitive to observe that in a distinct ordering of numbers, there are at most two medians.If the number of elements is odd, i.e. $2n-1$, then $n^{th}$ element is the median.
Whereas, if the number of elements is even, i.e $2n$, then $n^{th}$ and $(n+1)^{th}$ elements form the median.

Apart from median, there is also a concept called "weighted median".

The Definition

In a given distinct ordering of pairs $\{(i_1 ,w_1) , (i_2, w_2) .. , (i_n, w_n)\}$:
------> Let $i$ represent any arbitrary identity of a given element; $w$ represents the weight of that element in the sequence; and $W$ the sum of weights of all the elements
------> Then, an element $i_m$ is a weighted median if $\displaystyle\sum_{1 \le j \lt m} w_j \le \frac{W}{2}$ and $\displaystyle\sum_{m \lt j \le n} w_j \le \frac{W}{2}$

The Interpretation

One good use of weighted median is to compute the median of all the values when there are multiples/repetitions of same values in the sampled data.
So if you have $n$ values with average repetition $w_{avg}$ unordered, the median could be computed at $O(n.w_{avg})$.
But if you organize or have the data as distinct values and their repetitions, the complexity to find the median could come down to $O(n)$

Let's see why this is so:

After expanding an ordering of the form: (Lets refer to this as abridged ordering)

$\{(i_1 ,w_1) , (i_2, w_2) .. , (i_n, w_n)\}$ .....

to an ordering of the form: (lets call this expanded ordering)

$\{i_1, i_1,...(w_1times), i_2, i_2, ....(w_2 times), ...., i_n, i_n, ... (w_n times) \}$

The total number of elements in the ordering would be:

$\displaystyle\sum_{1\le j \le n} w_j = W$.

Now we can easily say that the median of such an ordering would be $(\frac{W+1}{2})^{th}$ element if $W$ turns out to be odd.
And if its even, the median comprises of $(\frac{W}{2})^{th}$and $(\frac{W}{2} + 1)^{th}$elements.
Let a given repetition of $i_m$ be the median in the above expanded ordering, and lets also consider $W$ to be odd.
Then, the number of elements in the expanded ordering before and after the selected instance of $i_m$ must exactly be $W/2$.
Note that, for any element other than median, this property will not hold good.
Now if we extract out the copies of $i_m$ on its left, and right, if any, we can establish that the number of non $i_m$ elements before and after $i_m$ is at most $W/2$.
Hence, if we calculate the weighted median of abridged ordering, it would be same as the median of the expanded ordering.
Similar argument can be made for when $W$ is even, and also the special case when such ordering has two distinct medians $i_m$ and $i_{m+1}$
From this analysis on using weights as repetitions, it is easy to see, that like median, the weighted medians can also be at most two.

The observation

But what if we focus on just the plain simple definition of weighted median without interpreting the weights as repetitions or such, and were to "observe" that at most, there can just be two weighted medians.... How would we establish that fact? Well, its not too hard.
Let $i_m$ be a weighted median in a given abridged ordering, then,

$\displaystyle\sum_{1 \le j \lt m} w_j + w_m + \displaystyle\sum_{m \lt j \le n} w_j = W$

Let $A = \displaystyle\sum_{1 \le j \lt m} w_j$, and $B = \displaystyle\sum_{m \lt j \le n} w_j$

Then $A + w_m + B = W$ and by definition of weighted median, $A \le W/2$ and $B \le W/2$

Now, if we assume that: $w_m + B \le W/2$, then we can claim that $w_{m-1}$ is also a weighted median.

This is because: $A \le W/2 \implies A - w_{m-1} \le W/2$ and of course by our assumption, we have $w_m + B \le W/2$

We have three relations to satisfy here: $1: A + w_m + B = W; 2: w_m + B \le W/2;$ and $3: A \le W/2$.

All the three can be satisfied if and only if $4: A = W/2$ and $5: w_m + B = W/2$

Given that we already have weighted medians $w_{m-1}$ and $w_m$, the requirements in 4: and 5: above make it impossible to have any other weighted median.

Similar argument can also be made assuming that $A + w_m \le W/2$

For me this understanding proved to be more fun than actually writing the R-Select code to find out the weighted median itself, and hence come these notes 😀Though, i actually ended up describing the algorithm here 😛

Thursday, March 05, 2020

2-D Peak-A-Proof

Hmm 😵.. Capacity for proofs... how important is that?

No logic is sound unless it is proved. The ability to prove is a testament to one of the three things:

a) clarity of thought,

b) loudness of mouth,

c) willingness to self deceive 😛.

But more often than not, i guess its useful when its a); c) is not bad either, for it counter balances its other opposite, which is self doubt 😉

Somewhere between self doubt and self deceit lies self confidence, right? Not sure of b) but who knows, its probably used pretty widely, or? 😆

Ahem ahem... course correction.. back to the utility of proofs...
Take for example the rather simple 2-D Peak finding problem.
The 2-D Peak finding problem:

From a given n-by-n matrix A, find a (i.e., any) number A(i, j) such that all valid elements in A at positions A(i-1, j), A(i+1, j), A(i, j-1), A(i, j+1) are less than or equal to A(i, j).

Valid element means that the position computed is a valid legal position in the matrix.

For example: A(-1, 0) is NOT a valid element.

Simply put, an element is a peak, when the elements if any, immediately above, below, to the left, and to the right of it, are less than or equal to the given element.

In the example 5-by-5 matrix below, the peak elements are highlighted.

60	100	103	80	95
125	120	108	101	45
130	30	93	90	156
115	40	98	65	125
75	50	96	55	25

The Divide and Conquer algorithm(s) to efficiently find "a" peak is(are) rather simple, but, is the proof of their correctness as simple?
Well, i guess it varies from person to person. But for some reason, i have this inherent discomfort with anything more than one dimension.
And i get the urge to write about it as i have a feeling there maybe a few more confused souls like me who feel dazed thinking in more than one dimension 😛.

i tried for several hours to come up with a less than $O(n^2)$ algorithm, but just could not.

Started looking up for hints and tried the....
1-D Peak finding problem:
A peak in an array A means any A(i) such that both A(i-1) and A(i+1) are less than or equal to A(i).
For the beginning and end elements of the array, A(0) is a peak if A(1) ≤ A(0) and A(end) is a peak if A(end-1) ≤ A(end). Find such a peak from the array A.

1-D Peak finding algorithm
"A" (i.e. any) peak in this array can be found in O(logn) time using the following strategy:
1. Take a middle element, A(m) and check if it is a peak.If so, return A(m).
2. If not, either A(m+1) or A(m-1) or both would be greater than A(m)
3. Continue with step 1 on the right half, if A(m+1) >A(m)
Continue with step 1 on the left half, if A(m-1) > A(m)
If both A(m-1), and A(m+1) are greater than A(m), either half is fine.
Is this algorithm correct? Seems trivial enough to prove so.

[proof:1-D Peak finding algorithm]
Suppose A(m) is not a peak and A(m+1) > A(m).
In this case, A(m+1) will be a peak unless its own right neighbor, A(m+2) > A(m+1).
Lets assume A(m+2) meets this condition. Inductively, for A(m+2) to not be a peak, A(m+3) > A(m+2) and it can go on and on, only till the end of the array where the end element will turn out to be a peak.
Hence, when we follow the half where A(m+1) > A(m), we are guaranteed to find a peak.
We can argue similarly for the case when A(m-1) > A(m).
[proof:end]

Surprisingly so, this new found wisdom on the 1-D case got me no where with the 2-D case, and finally i looked up the algorithm 😁
However, i still could not convince myself very easily that the 2-D algorithm i looked up is actually correct.
Guess this is always the case with borrowed analysis, because it is less of an analysis and more of an assurance.

Anyways putting a stop to the sob, here is the most simple of the divide and conquer algorithms for the 2-D case:
O(nlogn) 2-D Peak finding algorithm
1. Find the maximum element of the middle column, let it be A(i, m).
If its a peak, return it and terminate.
2.a) If A(i, m-1) >= A(i, m) and A(i, m+1) ≤ A(i, m); then recurse in the left half of matrix.
2.b) Else If A(i, m+1) > A(i, m) and A(i, m+1) ≤= A(i, m); then recurse in the right half of matrix.
2.c) Else If A(i, m-1) >= A(i, m) and A(i, m+1) >= A(i, m); then recurse in either left or right half. (i.e., either of the halves is guaranteed to have a peak)
3. If the recursion reduces the matrix to a single column, just return the max of that column and terminate.

Sounds neat and simple to code up, but is the algorithm correct?
How do we go about its correctness? It was a bit of mind muddling experience for me, because for once in the evening i thought i got the proof but couldn't really write it down. And when i tried to write it down before bed, i couldn't really come up with the proof again.. Wonder if the evening was a moment of self deceit or a volatile intuition 🤔. Well, so i decided to give it a go in the morning instead when luckily i guess i could reason the correctness correctly.

So here is a proof for the above binary search algorithm for the 2-D peak:
i changed my original proof a bit starting with the base case first. This way, it is simpler to understand and takes fewer words i think.
[proof:O(nlogn) 2-D Peak finding algorithm]
Consider the base case (Step 3), a single column and n rows of elements. Lets call this column as c.
Its easy to see that the max of that column is a "peak" in that 1 column matrix. Lets call this peak as p.
Now, lets unwind the recursion, and add another column, c+1 of n elements to this matrix.
p ceases to be a peak in the new 2 column matrix only if column c+1 has an element greater than p in the same row as p.
But, Step 2.a) in the above algorithm ensures that not only the one next to p, but, all of the elements in column c+1 are less than p if column c+1 was skipped out of the search.
Fine, lets unwind a bit more, and add another column, c-1 of n elements to this matrix.
p ceases to be a peak in the new 3 column matrix only if column c-1 has an element greater than p in the same row as p.
However, Step 2.a) ensured that all the elements in column c-1 are indeed less than p if column c-1 was skipped out of the search.
Hence, inductively we can prove that when the algorithm hits the base case, it always finds a peak which is valid for the undivided input matrix
[proof:end]

The complexity of this algorithm is O(nlogn).
If you look at the design of this algorithm, you will notice, the main ingredient is to use the middle column as a virtual valley with the knowledge that the half next to this column has at least one element greater than any of the elements in this virtual valley, and hence there is a guarantee to find a peak in that particular half.

Now lets looks at the O(n) algorithm:
O(n) 2-D Peak finding algorithm

The O(n) algorithm extends this valley building effort both across the middle column as well as the middle row, and recurses into one of the quadrants where a peak is guaranteed to be found.
The algorithm goes as below:
1. Find the maximum element among all the elements of the middle row, $m_r$ and middle column, $m_c$
If the maximum is at the intersection of the row and column, it is definitely a peak, so return it and terminate.
2. If the maximum is a peak, return it and terminate.
3. If the maximum is not a peak, recurse is the quadrant where adjacent neighbor of the maximum is greater than the maximum.
4. When recursion is left with only one row, return the max of that row and terminate. Or, if the recursion is left with only one column, return that max of that column and terminate. Or, if the recursion is only left with one element, that one element must be the peak, return it, and terminate.

Bit more details on step3
i am just writing more details on step3 to clarify and maybe also to aid the proof which is coming up.
3.a. If the maximum A($m_r$, j) is in the middle row, and j > $m_c$, and A($m_r$ + 1, j) > A($m_r$, j), then recurse in the quadrant bounded by rows $m_r$ +1 to end, and columns $m_c$ + 1 to end.Below 7 by 7 table gives an example of this case:
$m_c$ = $m_r$ = 4, maximum is A(4, 5) = 4520.
But A(5, 5) = 4610 > A(4, 5), Hence we recurse in the bottom right quadrant.

1923	3317	1784	4236	126	526	2583
4060	1068	1667	1263	3989	2461	4342
2764	139	4552	4029	502	2764	653
1347	2172	1614	2852	4520	4305	2199
4780	353	2606	1504	4610	3475	4515
1581	1132	719	38	1396	4285	4456
2607	3458	1651	3426	1078	4388	559
3328	2332	42	3865	378	3989	3959

3.b. Else if the maximum A($m_r$, j) is in the middle row, and j > $m_c$, and A($m_r$ - 1, j) > A($m_r$, j), then recurse in the quadrant bounded by rows 0 to $m_r$ -1, and columns $m_c$ + 1 to end. Below table gives an example of this case:
$m_c$ = 2, $m_r$ = 2, maximum is A(2, 3) = 4456.
But A(1, 3) = 4515 > A(2, 3), Hence we recurse in the top right quadrant.

4610	3475	4515
1396	4285	4456
1078	4388	559
378	3989	3959

Similarly it follows for other cases as below...
3.c. Else If the maximum A($m_r$, j) is in the middle row and j < $m_c$ , and A($m_r$ - 1, j) > A($m_r$, j), then recurse in the quadrant bounded by rows 0 to $m_r$ -1, and columns 0 to $m_c$ -1
3.d. Else if the maximum A($m_r$, j) is in the middle row and j < $m_c$ , and A($m_r$ + 1, j) > A($m_r$, j), then recurse in the quadrant bounded by rows $m_r$ +1 to end, and columns 0 to $m_c$ -1
3.e. Else if the maximum A(i, $m_c$) is in the middle column and i < $m_r$, and A(i, $m_c$ - 1) > A(i, $m_c$), then recurse in the quadrant bounded by rows 0 to $m_r$ -1 and columns 0 to $m_c$ -1
3.f. Else if the maximum A(i, $m_c$) is in the middle column and i > $m_r$, and A(i, $m_c$ - 1) > A(i, $m_c$), then recurse in the quadrant bounded by rows $m_r$ + 1 to end, and columns 0 to $m_c$ -1
3.g. Else if the maximum A(i, $m_c$) is in the middle column and i < $m_r$, and A(i, $m_c$ + 1) > A(i, $m_c$), then recurse in the quadrant bounded by rows 0 to $m_r$ -1 and columns $m_c$ + 1 to end.
3.h. Else if the maximum A(i, $m_c$) is in the middle column and i > $m_r$, and A(i, $m_c$ + 1) > A(i, $m_c$), then recurse in the quadrant bounded by rows $m_r$ + 1 to end, and columns $m_c$ + 1 to end.

Now that we stated the algorithm, rather elaborately, lets try to prove it by starting with base case.

[proof: O(n) 2-D Peak finding algorithm]

Consider the base case when only one row $r$ and columns $c_x$ to $c_y$ are left.

The maximum of that row, say, A(r,j) is the peak of that one row, r.

Lets unwind the recursion and add back the row r-1 to the array.

A(r, j) ceases to be a peak only if A(r-1, j) > A(r, j)

And steps 3.a, 3.d, 3.f, 3.h are the ones responsible for peeling off rows above.

Because of preconditions in the those steps, not only A(r-1, j), but for all c such that $c_x$ < c < $c_y$; A(r-1, c) ≤= A(r, j)

On similar lines, the proof can be extended to the case when we add back row r+1, as well as when the base case is a column.

[proof:end]

60	100	103	80	95
125	120	108	101	45
130	30	93	90	156
115	40	98	65	125
75	50	96	55	25

60	100	103	80	95
125	120	108	101	45
130	30	93	90	156
115	40	98	65	125
75	50	96	55	25

60	100	103	80	95
125	120	108	101	45
130	30	93	90	156
115	40	98	65	125
75	50	96	55	25