February 16, 2015

Finding Switch - Bulb Association

Problem 1: There are three light bulbs in a room. Outside the room are three switches which operate those bulbs. You know that the bulbs are initially turned off and you can toggle the switches as much as you want. While you toggle the switches, you cant peek inside the room. Finally when you are done with the switches, you enter the room once and you have to tell which switch is associated with which bulb.

Solution: show

Problem 2: Same problem as 1, but now we dont know if the bulbs are initially on or off. So in a way we do not know if turning the switch up turns on the bulb or turns it off. All we know is that all the bulbs are in same state, i.e., all are either on or off. Now we have to toggle the switches and enter the room once and tell which switch operates which bulb.

July 2, 2014

Solving Sudoku Puzzles

Recently, I got hooked on to Sudoku (thanks to a wonderful IPhone app). I found it to be a slightly more constructive time sink and plan to teach it to my daughter some day (she is one year old as of yet). For this, I thought of writing an assistive software to teach kids how to simplify a Sudoku. This motivated me to first build a Sudoku solver and this blog is about it.

A Sudoku is a matrix of cells, typically 9x9, but it can be any number. In this post, we consider N x N Sudoku (where N is a perfect square, i.e. 1x1, 4x4, 9x9, 16x16). Now for such a Sudoku, we first construct a groups of cells called as blocks. We get N row blocks, N column blocks and N sub-grid blocks. All the blocks contain N cells each. A cell can have a value from 1 to N, such that, if a cell is assigned a value, then no other cell in the same block can be assigned that value. To check if a Sudoku is solved, all the blocks must sum to N*(N+1)/2. Another check is one where number uniqueness constraints are satisfied per block and each cell is assigned a value from 1 to N.

To simplify a Sudoku by hand, I typically use the following techniques:
  1. A number is assigned to a cell,
    • If a cell can hold only that number.
    • If that number can be held by only that cell (in at least one block).
    Once a cell is assigned a number, other cells (in the same block) get their possible hints (or values) pruned.

  2. Two numbers are assigned to two cells if those two cells share at least one block (Note: two cells can share maximum of two blocks),
    • The two numbers are referred exclusively by these two cells in a given block. In this case other hints of the two numbers are pruned.
    • The two cells refer to those two numbers only. In this case the two numbers are removed from the hints of other cells in their shared block.

  3. To finally solve it, simplify Sudoku by applying (1) and (2) until further simplification is not possible. Then pick a cell with least number of hints. Assign it one of those hints and repeat this brute-force method. If failed, then try another hint. If all hints fail then Sudoku is not solvable.
Algorithm 1a, 1b are pretty straightforward. Algorithm 2a and 2b are also easy to see (if you have played Sudoku before). In spirit Algo 2 can be extended for k>2 but the benefit of simplification it provides is marginal compared to the cost of finding set of k cells or k numbers that satisfy the unique reference aspect. Now to encode these algorithms, I wrote a Java program which solves Sudoku of size N x N (where N is a perfect square).

The program is oriented to be event driven, where a cell is asked to remove its hint based on the logic of (1,2,3). In this process if a cell has only one hint remaining, it triggers a fix event. When the fix event is called, this cell is assigned the value and that value is removed from the hint of cells that share block with it. If at anytime any inconsistencies are found, the events return a “false”, indicating that Sudoku is not solvable. This is a fail fast technique and it quickly helps in determining if a Sudoku is non-solvable or not. Otherwise, we would wait for all cells to get filled and then realize that it failed (wasting precious time). A simple brute-force strategy is very easy to code, however the possibilities to check are huge. For e.g., we might end up checking O(9^81) possibilities (which can be prohibitive). The simplification algorithms (1) and (2) come to the rescue by drastically simplifying the Sudoku. For most of the easy problems - the recursive step (3) is not even triggered.

I searched online for Sudoku problems and found that Peter Norvig has a blog about it. He used algo(1,3) but not (2) and has shared a set of easy, hard, and hardest Sudoku puzzles. It amazes me to say that for most hard problem, I take at least 8-10 minutes (and sometimes a hint) but a computer can solve it in order of ~0.1 ms. If you are interested in the maths of Sudoku (e.g. number of possible solutions), please check out the wikipedia page: maths of sudoku.

In the first analysis below, we see how many cells different algorithms are able to solve.
 Input    (1)  (2a) (2b)  (2) (1+2a)  (1+2b)   (*)
Easy 35% 89% 54% 68% 70%   96%  94% 96%
Hard 25% 31% 26% 26% 26%   41%  31% 41%
Hardest 30% 45%  35%  35%  36%    50%  45% 50% 

Here we see that algorithm (1) itself can solve 89% of the cells for easy problems. However, the joint application of (1+2) is able to solve 96% of the easy cells and 41% of the hard cells (and roughly provide a 10-15% improvement. This is a great simplification, especially for easy Sudoku problems. Amongst (2a) and (2b), I find (2a) to be a better strategy to run in conjunction with (1) as (1+2a) has same result as (*).

In the next analysis, we see how many hints are left after application of different algorithms
 Input   (1)  (2a) (2b)  (2) (1+2a)  (1+2b)   (*)
Easy  474  29 126  89  82   10    16  10
Hard  544 230 252 260 250   184  228 181
Hardest  508 161  191  193  189    141  158 141 

Similar analysis as above. 2a appears better than 2b. But (*) must better than (1) or (2). So combining these strategies is much more effective. We see that clearly (1+2a) as good as (*) so we can simply drop algorithm 2b. To see how much (if at all) there is a penalty of running the (2b) algo, we turn to time taken.

For measuring time, I use Java's System.NanoTime() function, which gives very precise timing. Experiments are run on my MacBook Pro with 2.6Ghz intel I7 processor. To get the timings, I solve a puzzle 500 times and take the average solving time. We get the following timings.

 Brute Force      (1)   (1+2a)   (1+2b)      (*)
Easy    0.14 ms 0.04 ms  0.03 ms  0.03 ms   0.03 ms 
Hard    21 ms 1.16 ms  0.44 ms  0.55 ms  0.42 ms
Hardest    0.61 ms 0.17 ms  0.20 ms   0.19 ms   0.22 ms 

Clearly brute force without simplification is a bad idea. (*) improves timing over (1) for hard problems at least. For other problems, numbers are very close to conclude. My experience over other random problems that I tried, suggests that (*) leads to fastest simplification (i.e. reduced iterations) and hence better.

I also tried the problem that Peter mentions taking most time and noted its timing through my Java implementation. It took 6134 ms for algo (1), but only 10ms for (*). A clear evidence of the joint algorithms' usefulness. This improved timing comes from the fact that Sudoku gets drastically simplified before entering the brute-force stage and even for each choice of brute-force, it further gets simplified eliminating possible hints.

The Sudoku program is generic in the sense that it can solve 1x1, 4x4, 9x9, 16x16, … problems. A quick run of the program is as follows:

Solving a Sudoku puzzle of size 1 ...
 ===
| * |
 ===
[status] solved=0 hints=1
 ===
| 1 |
 ===
[status] SOLVED
Time taken=0.022 ms

Solving a Sudoku puzzle of size 4 ...
 ===== =====
| 4 * | 2 * |
| * * | 4 1 |
 ===== =====
| 1 * | * * |
| * * | 1 4 |
 ===== =====
[status] solved=7 hints=36
 ===== =====
| 4 1 | 2 3 |
| 2 3 | 4 1 |
 ===== =====
| 1 4 | 3 2 |
| 3 2 | 1 4 |
 ===== =====
[status] SOLVED
Time taken=0.132 ms

Solving a Sudoku puzzle of size 9 ...
 ======= ======= =======
| * * * | 7 * * | 6 * 3 |
| 7 * 3 | 9 * * | * 1 2 |
| * * * | 5 3 * | * * * |
 ======= ======= =======
| 1 * * | * * * | * * 7 |
| 9 * * | * * * | * * * |
| * 4 5 | * * * | * 9 6 |
 ======= ======= =======
| 3 * * | * * * | 9 * * |
| * * * | * * 1 | * 6 5 |
| * 1 * | 3 6 * | * 7 4 |
 ======= ======= =======
[status] solved=27 hints=486
 ======= ======= =======
| 5 9 4 | 7 1 2 | 6 8 3 |
| 7 8 3 | 9 4 6 | 5 1 2 |
| 6 2 1 | 5 3 8 | 7 4 9 |
 ======= ======= =======
| 1 6 2 | 8 5 9 | 4 3 7 |
| 9 3 7 | 6 2 4 | 1 5 8 |
| 8 4 5 | 1 7 3 | 2 9 6 |
 ======= ======= =======
| 3 5 6 | 4 8 7 | 9 2 1 |
| 4 7 8 | 2 9 1 | 3 6 5 |
| 2 1 9 | 3 6 5 | 8 7 4 |
 ======= ======= =======
[status] SOLVED
Time taken=2.198 ms

Solving a Sudoku puzzle of size 16 ...
 ============= ============= ============= =============
| 8  7  *  *  | *  *  *  *  | *  3  *  *  | 13 *  4  *  |
| *  5  14 *  | *  *  3  10 | 15 9  1  *  | *  6  *  *  |
| 16 *  *  *  | 5  8  7  *  | *  14 *  *  | 9  *  11 12 |
| *  *  4  *  | *  14 6  13 | *  11 10 12 | *  7  *  3  |
 ============= ============= ============= =============
| 14 *  *  8  | *  *  1  *  | *  *  *  3  | 7  4  12 *  |
| 9  *  *  *  | *  6  15 12 | *  *  13 14 | *  3  1  *  |
| 11 *  10 3  | *  *  13 *  | *  8  *  1  | *  *  6  *  |
| 6  *  *  1  | 14 *  4  *  | *  5  *  9  | 11 *  *  13 |
 ============= ============= ============= =============
| *  *  *  *  | 15 *  *  *  | *  *  9  *  | 5  *  2  10 |
| 10 1  *  *  | 6  *  5  *  | 13 15 7  16 | *  *  *  *  |
| *  *  16 11 | *  4  *  8  | 2  *  *  *  | *  13 *  7  |
| *  9  *  7  | 1  3  *  2  | 6  *  8  10 | 16 15 14 4  |
 ============= ============= ============= =============
| 7  *  13 *  | 9  16 *  5  | *  *  14 4  | 3  8  *  2  |
| *  *  3  *  | 10 *  *  *  | *  *  *  *  | *  16 15 *  |
| 1  *  9  *  | *  *  14 4  | *  *  *  *  | *  *  7  *  |
| *  6  8  *  | 3  *  *  *  | 10 7  *  *  | *  *  *  *  |
 ============= ============= ============= =============
[status] solved=116 hints=2240
 ============= ============= ============= =============
| 8  7  11 10 | 2  12 9  1  | 5  3  16 6  | 13 14 4  15 |
| 12 5  14 13 | 4  11 3  10 | 15 9  1  7  | 2  6  16 8  |
| 16 3  1  6  | 5  8  7  15 | 4  14 2  13 | 9  10 11 12 |
| 2  15 4  9  | 16 14 6  13 | 8  11 10 12 | 1  7  5  3  |
 ============= ============= ============= =============
| 14 13 15 8  | 11 2  1  9  | 16 10 6  3  | 7  4  12 5  |
| 9  4  7  5  | 8  6  15 12 | 11 2  13 14 | 10 3  1  16 |
| 11 2  10 3  | 7  5  13 16 | 12 8  4  1  | 15 9  6  14 |
| 6  16 12 1  | 14 10 4  3  | 7  5  15 9  | 11 2  8  13 |
 ============= ============= ============= =============
| 3  8  6  12 | 15 13 16 7  | 14 4  9  11 | 5  1  2  10 |
| 10 1  2  4  | 6  9  5  14 | 13 15 7  16 | 8  12 3  11 |
| 15 14 16 11 | 12 4  10 8  | 2  1  3  5  | 6  13 9  7  |
| 13 9  5  7  | 1  3  11 2  | 6  12 8  10 | 16 15 14 4  |
 ============= ============= ============= =============
| 7  11 13 15 | 9  16 12 5  | 1  6  14 4  | 3  8  10 2  |
| 5  12 3  14 | 10 7  8  6  | 9  13 11 2  | 4  16 15 1  |
| 1  10 9  2  | 13 15 14 4  | 3  16 5  8  | 12 11 7  6  |
| 4  6  8  16 | 3  1  2  11 | 10 7  12 15 | 14 5  13 9  |
 ============= ============= ============= =============
[status] SOLVED
Time taken=1.346 ms

Solving all puzzles from: ./src/sudoku/hardest.txt
Solved       :  11/11
Solved Cells :  270(input) 891(finally) 891(expected)
Cell Hints   :  5589(input) 0(finally)
Time Taken   :  2.59014144 milli-seconds
Avg time     :  0.23546740363636365 milli-seconds

Solving all puzzles from: ./src/sudoku/hard.txt
Solved       :  95/95
Solved Cells :  1953(input) 7695(finally) 7695(expected)
Cell Hints   :  51678(input) 0(finally)
Time Taken   :  41.772783616 milli-seconds
Avg time     :  0.4397135117473684 milli-seconds

Solving all puzzles from: ./src/sudoku/easy.txt
Solved       :  50/50
Solved Cells :  1418(input) 4050(finally) 4050(expected)
Cell Hints   :  23688(input) 0(finally)
Time Taken   :  1.868110848 milli-seconds
Avg time     :  0.03736221696 milli-seconds

A word of advice to readers. While building the Sudoku (or similar programs), you might be tempted to optimize it. I suggest its more important for the code to be readable and debuggable and concise. I tried to optimize my code, but realized its not worth it. Using some tricks i got speed up of 1ms for a specific alto, but that meant highly cryptic data structure. This would lead to increased timing in another algo and a nightmare to debug. Just do basic optimizations. As a side-effect of my attempt, i learnt few useful things to speed up the code in Java.
  1. bytes is more expensive that int. So use int.
  2. int check (e.g. i==0) is very-very slightly faster than boolean check. But don't use it if it compromises readability. 
  3. Exception handling (even if exception is never thrown) is expensive.
  4. Java doesn't do tail call optimization. So don't worry too must about trying formatting your recursive routine.
  5. keyword final has no effect on performance in Java. So if you think a one liner function is inlined, its not.
  6. Avoid using array list and use fixed array. Even if array list is of size say k and array of n where k < n, its faster to use array. UNLESS k <<< n.
  7. Hashmap, Hashsets would easily hog performance. It can easily double the running time. So if you can avoid it, its better.
  8. 1-d array is cheaper than 2-d array.
  9. Using objects is extremely cheap. I find next to nil penalty on using objects. So use them to simplify code.
  10. Caching can be useful. for example instead of referring A[i] several times, its faster to set x =A[i] and refer to x.
  11. private, public, and other qualifies only increase space of byte code but has no performance impact. So use them for good design principles and not for performance improvements.
Code is available at https://github.com/n1balgo/sudoku. Feel free to check it out and drop in comments or email for feedback (and suggestions and bugs).

May 2, 2014

Stick Breaking

Problem Set 1: Given a stick of length 1 meter. It is cut into two parts randomly.
  • Problem 1a: What is the expected length of the smaller stick?
  • Problem 1b: What is the expected length of the bigger stick?

Problem Set 2: Given a stick of length 1 meter. It is cut into three parts randomly.
  • Problem 2a: What is the expected length of the smallest stick?
  • Problem 2b: What is the expected length of the largest stick?
  • Problem 2c: What is the expected length of the stick with second smallest (or second largest) length?
  • Problem 2d: What is the expected length of the middle stick (stick that is between the two cuts)?
  • Problem 2e: What is the probability that the three sticks would form a triangle?
  • Problem 2f: What is the probability that the three sticks would form a right angle triangle?
  • Problem 2g: What is the probability that the three sticks would form an equilateral triangle?

Problem Set 3: Given a stick of length 1 meter. It is cut into two parts randomly. Then bigger stick is picked and is again cut into two parts randomly.
  • Problem 3a: What is the expected length of the smallest stick?
  • Problem 3b: What is the expected length of the largest stick?
  • Problem 3c: What is the expected length of the stick with second smallest (or second largest) length?
  • Problem 3d: What is the expected length of the middle stick (stick that is between the two cuts)?
  • Problem 3e: What is the probability that the three sticks would form a triangle?
  • Problem 3f: What is the probability that the three sticks would form a right angle triangle?
  • Problem 3g: What is the probability that the three sticks would form an equilateral triangle?

Generalized Cut Problem: Find the solution to problem 2a-2c, 3a-3c for n-1 cuts.

Solution: show

May 1, 2014

Defective Balls

For problem 1-3 consider a weighing scale (with two sides) and it only tells which side is heavier (it doesnt tell by how much).

Problem 1: You have 9 balls of same weight, except that 1 ball is defective and it is known to be lighter. Find the defective ball in two weighing.

Problem 2: You have 9 balls of same weight, except that 1 ball is defective (it can be heavier or lighter, we dont know). Find the defective ball in three weighing.

Problem 3: You have 9 balls of same weight, except that 2 balls are defective and they are known to be lighter. Find the defective balls in four weighing.

Problem 4: There are 9 machines that produce the balls. You get 9 balls from each machine during a production cycle. You are given a weighing scale that gives the exact weight. You are told that one machine produced defective balls in a given production cycle. Find the defective machine through a single weight measurement. Assume that a perfect ball weight 1lbs and defective ball weight 1.1lbs.

Problem 5: Consider problem 4. Now assume that the weight of the defective ball is unknown. Now find the defective machine in two weighing.

Problem 6: There are two jars A and B. A contains 50 perfect balls and B contains 50 defective balls. Arrange the 100 balls in the two jars in a way that the probability of randomly picking a ball (by first randomly picking a jar) is maximized.

Solution: show

December 9, 2013

Palindrome Numbers

Problem 1: Given a number n, check whether it is a palindrome or not. For e.g., n = 303 is a palindrome, where as n = 302 is not. Give an O(log n) algorithm.

Problem 2: Given a number n, find the largest palindrome that is less than or equal to n. For e.g., n = 36, then 33 is the largest palindrome less than or equal to n. Give an O(log n) algorithm.

Problem 3: Given a number n, find all the palindrome less than or equal to n. Give an efficient O(k) algorithm, where k = number of palindromes less than n.

Problem 4: Prove that all palindromes with even length is divisible by 11. For e.x. 6996 is divisible by 11.

Problem 5: Consider a palindrome number generator as follows.

Step 1: n = n + reverse(n)
Step 2: If !is_palindrome(n): Goto Step 1

Given a palindrome number m, find the smallest n which can generate m using this generator. Is it possible that no such n exist?

For e.g., m = 121, we get n = 19. To see this, we get, n1 = 19+91 = 110. In the next step we get n2 = 110 + 011 = 121.

December 1, 2013

Finding Relevant Keywords from Large Corpus of Text

Problem 1: Given a large paragraph of words containing n words (separated by space), and k keywords. Find the smallest distance between these keywords in the paragraph.

To illustrate the problem, consider that the paragraph is A B A C E D O A B. Let us say that we are interested in two keywords A and D. Then the closest they appear is at position 8 and 6 (D O A) and the minimum distance is 1 (i.e. number of non-keyword words between them).

Build an efficient algorithm with O(k) space and O(n log k) running cost.

Solution: show

Problem 2: Given a large paragraph of words (separated by space). Consider N tuples of names and interests. If a given person and their interest appears within K words then its a match else not. Find all matches from the paragraph.

To illustrate the problem, consider that the paragraph is "John likes swimming and Jack likes Tennis". Let the interest map is <john swimming>, <john tennis>, <jack swimming>, <jack tennis>. For k = 3, 4, 5, 6 the matches found in the paragraph are <john swimming>, <jack swimming>, <jack tennis>.

Propose an efficient algorithm for two cases: k is small, k is very large.

August 12, 2012

Bulls and Cows

Bulls and Cows is a two player game. First player thinks of a N digit secret number and second player guesses that number. This is done by second player making a guess and first player telling the number of bulls and cows in the guess. If the digits of guess matches secret and they are in right position, it's a bull, if they match but on different position, they are cows. For e.g. Let secret be 0123 and guess be 0012. The guess has 1 bull, 2 cows. The game continues, until second player gets a response of N bulls.

The problem is to write a computer solver for bulls and cows that would try to guess the secret making only a small number of attempts.

Solution: show

July 21, 2012

Number of Paths in a Rectangular Grid

Problem 1: Consider a rectangular grid of n x m rectangular cells. The problem is to find the number of shortest (or monotonic in this case) paths along the edges of the cell that start at (0,0) and end at (n,m). How do we print all such paths?

A monotonic path consists of moving up or right but not down or left. Figure 1 illustrates such a path for a grid of size 5 x 3.

Figure 1
Solution: show

Problem 2: Now we need to print all monotonic paths that do not go above the diagonal y = x. Figure 2 shows such a path. Note that the path in figure 1 goes above the diagonal, hence not desirable in this case. How many such paths exist for n x m grid (n >= m)?

Figure 2
Solution: show

Problem 3: Now let us consider that apart from up and right moves we can take up-right (diagonal) moves. Figure 3 shows such a path. How many such paths exist for n x m grid?

Figure 3
Solution: show

July 9, 2012

Finding Top K Elements in Large Dataset

Problem: Consider the problem of finding top K frequent numbers from a file with N numbers. Now consider that N is very large such that it cannot fit in the memory M available to the program. How do we find the top K frequent elements now with the assumption that K < M < N.

Deterministic solution: show

Single pass probabilistic solution: show

July 3, 2012

Finding Top K Frequent Items


Problem: Consider a file containing N numbers. For e.x. {2,3,4,3,1,78,78,3} is a file containing N=8 numbers. Assume that N is small enough to fit in the computer's memory M. So in this case N << M. Now we need to find top K frequent numbers in the file. For the above example, output should be {3,78} for K = 2.

Running Time: Time complexity should be either O(N), O(N log K), O(N + K log N)

Solution: show

March 26, 2010

Problems solvable using Hashtable

Hashtable are extremely useful data-structure as they provide storage and retrieval in O(1) time (amortized). Several problems of algorithms can be very efficiently solved using hashtables which otherwise turn out to be quite expensive. In this post, we consider some of these problems:

Problem 1: Remove duplicate elements from an unsorted array of size N.

Solution: show

Problem 2: Find intersection of K unsorted array of N elements each. Intersection consists of elements that appear in all the K arrays.

Problem 3: How to make a linked list support operations in O(1) time. The operations on linked list can be insertion after any arbitrary valued node, deletion of any arbitrary valued node.

Problem 4: Find all unique pairs of element in an array that sum to S. For ex. If array = {2,4,6,4,6} and S = 8 then answer is {(2,6), (4,4)}

Problem 5: Consider an array containing unique elements. Find a triplet of elements in the array that sum to S (extension of problem 4). Can hash-tables improve the running time of your algorithm.

Problem 6: Consider two strings of size M, N. Perform string matching in size O(M+N).

March 19, 2010

Dragon and Knight

Problem: A dragon and knight live on an island. This island has seven poisoned wells, numbered 1 to 7. If you drink from a well, you can only save yourself by drinking from a higher numbered well. Well 7 is located at the top of a high mountain, so only the dragon can reach it.

One day they decide that the island isn't big enough for the two of them, and they have a duel. Each of them brings a glass of water to the duel, they exchange glasses, and drink. After the duel, the knight lives and the dragon dies.

Why did the knight live? Why did the dragon die?

Solution: show

Problem: Now consider that Dragon and Knight are equally intelligent then who is expected to die.

Answer: Both survive the battle.

Solution: show

March 18, 2010

Number and Age of David's Kids

Problem: The product of the ages of David's children is the square of the sum of their ages. David has less than eight children. None of his children have the same age. None of his children is more than 14 years old. All of his children is at least two years old. How many children does David have, and what are their ages?

Answer: (12,6,4,2)

Solution: show

Source Code: david_kids_ages.py

March 11, 2010

Largest Sum of Consecutive Numbers

Problem: Given an array of N integers (both positive and negative), find the sub-sequence with largest sum.

For ex: Let A = {1 2 -5 4 5 -1 2 -11} then largest sum is 10 (start = 4, end = 7)

Solution: show

Problem: Given an array of N integers (both positive and negative), find the sub-sequence with largest absolute sum. This problem differs from the one at the top as we want the absolute sum (taking mod)

For ex: Let A = {1 2 -5 4 5 -1 2 -11} then largest absolute sum is 11.

Solution: show

February 3, 2010

Polynomial Evaluation

Problem: Given a polynomial with degree bound of n=m^r. How do we evaluate the polynomial at n different points such that the polynomial can be recovered from these n points (i.e. the coefficients can be calculated back using the points).

Note: We do not want to recover the coefficients but cleverly choosing the points as that impacts the running time of the algorithm.

Solution: show

February 15, 2009

Longest Monotone Sequence and Palindromes

Problem: Given an array of integers, find the longest subsequence of elements which monotonically increases. for ex. array = {1 4 8 2 5 7 3 4 6}, the longest subsequence = {1 2 3 4 6}

Solution: show

Problem: Given a string, find the longest size palindrome in the string. for ex. string = "aaabbccaccbaaa", solution = "bccaccb"

Solution: show

January 31, 2009

Loops in Linked List

Problems with linked lists occur mainly while building or using a bad memory manager. Let us discuss two well known linked list problems.

Problem: Find if two linked lists short at some node. Find the node at which the two lists short.

Solution: show

Problem: A linked list might contain a loop. How do we detect existence of the loop and find the node from which loop starts. Propose an O(n) time algorithm that doesn't take extra space and doesn't modify the linked list.

Solution: show

Monty Hall Problem

This is a very famous probability problem. This problem illustrates that probability is sometimes more convincing than "what we perceive as obvious". I'll post another problem that defeats "obvious reasoning".

Monty Hall problem: Suppose you're on a game show, and you're given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what's behind the doors, opens another door, say No. 3, which has a goat. He then says to you, "Do you want to pick door No. 2?" Is it to your advantage to switch your choice?

Solution: show

Red Green Cards: A box has three cards. First card has both sides green, second card has both sides red, third card has one side green and one side red. A card is picked at random and its one side is observed to be green. What is the probability that the other side of the card is also green.

Solution: show

January 30, 2009

Finding Second Smallest Element

Jargon: Order Statistics problem is to find the kth smallest element in an unsorted array A[1..n]. This problem can be solved in O(n) time in average case using randomized select algorithm or in O(n) time in worst case time using an algorithm that chooses the pivot element carefully.

Problem: How to find second smallest element in an array A[1..n] of size n. We would like to do in much less than 2n comparisons.

Answer: Time Complexity = n + ceil(log n) - 2

Solution: show

Code for n + log n - 2 algorithm: second_smallest.py
Alternate implementation using linked list: second_smallest_linkedlist.py

January 23, 2009

String Permutations

Printing all permutations of a string is a very common interview question. We'll discuss this problem and some interesting variations of it. The original problem of string permutation says, "print all permutations of a string". As an example, if the string is "abc" there are 6 permutations {abc, acb, bac, bca, cab, cba}. Assume that string contains no duplicate elements.

Solution: show

Source Code: permute_unique_chars.py

Lets us assume that string can contain duplicates. How do we print all non-redundant permutations of the string. For ex. If string is "abab" the permutations are {baab, abba, abab, baba, bbaa, aabb}

Solution: show

Source Code: permute_chars.py

Some other constraints could be added to make the problem more interesting. One such constrain could be of partial ordering of characters. For Ex. 'a' should appear before 'c', and 'b' should appear before 'c' in all permutations. There can exist ordering that leads to no permutation of the string such as 'a' precedes 'b', 'b' precedes 'c', 'c' precedes 'a'. Additionally implementing the permutation algorithm in these cases require checking for ordering violation at each permutation level or just at the leaf of recursion, depending on whether checking for ordering violations is more costly or generating all permutations.

An interesting problem, that talks of special kind of ordering and can be very efficiently computed is as follows: Assume that the string is made up of equal number of '{' and '}'. You need to print all permutations of this string which could satisfy a C compiler i.e. an ending bracket is balanced by an opening bracket. For ex. for '{{}}' the permutations are [{{}}, {}{}].  

Solution: show

Source Code: permute_brackets.py

January 22, 2009

Party Friends

Problem: In any party, there exists two people with same number of friends. Is is true or false. Assume that party has more than one people.

Solution: show