## Tuesday, April 2, 2013

### SPOJ Problem Set (Classical) 5970. Finding Primes

Fun problem! First it wants you to run a sieve from 2 to N. Then you do a similar pass from N down to 2, except this time you mark the factors of N, find the next largest number N1 that is not yet marked, mark the factors of N1, and so on. Finally, you have to tell how many numbers up to N are left unmarked by both algorithms.

The first observation is that you only need to consider the primes, so we can run a sieve up to 10^7, which lets us test whether a number is prime in O(1) time.
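As a rough illustration of that lookup table, here is a minimal Sieve of Eratosthenes sketch. The function name `buildSieve` and the `std::vector<bool>` layout are my own choices for the example, not something the problem prescribes.

```cpp
#include <cstddef>
#include <vector>

// Sieve of Eratosthenes: after the call, isPrime[i] is true exactly when i is prime,
// for every 0 <= i <= limit. buildSieve is a hypothetical helper name.
std::vector<bool> buildSieve(std::size_t limit) {
    std::vector<bool> isPrime(limit + 1, true);
    isPrime[0] = false;
    if (limit >= 1) isPrime[1] = false;
    for (std::size_t p = 2; p * p <= limit; ++p) {
        if (!isPrime[p]) continue;
        // Multiples of p below p*p were already crossed out by smaller primes.
        for (std::size_t m = p * p; m <= limit; m += p) isPrime[m] = false;
    }
    return isPrime;
}
```

Calling `buildSieve(10000000)` once at start-up should be enough for this problem, and `std::vector<bool>` keeps the table to roughly one bit per number.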

Now we can pre-calculate the result for every N. If you know ans[n-1] and want to determine ans[n] from it, note that the inequality |ans[n] - ans[n-1]| <= 1 holds: either the current N is added as a new unmarked number, or a number you added in the past is removed because it is a factor of this N. And if N is not a prime, it has already been removed by the first algorithm, i.e. the sieve.

If N is a prime number, then no N1 has removed this N yet, so in this case ans[n] = ans[n-1] + 1.

If N is an even number and N/2 is a prime, then you counted N/2 in your result earlier, but it gets marked as a factor of N at this stage and has to be removed. So in such a case, ans[n] = ans[n-1] - 1.

Why don't you need to consider other factors like N/3, N/5, ...? If N/p is a prime for some odd prime p, then 2 * (N/p) < N, so N/p was already removed at the earlier step 2 * (N/p). Only for p = 2 is N itself the first larger number that has N/p as a factor. So the basic idea is as follows:

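```
// isPrime[] is the sieve lookup table; ans[1] = 0
for (int i = 2; i <= N; i++) {
    ans[i] = ans[i - 1];
    if (i % 2 == 0 && isPrime[i / 2]) ans[i]--;  // prime i/2 is marked as a factor of i
    if (isPrime[i]) ans[i]++;                    // prime i is left unmarked by both passes
}
```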
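For completeness, here is a self-contained sketch of the whole pre-calculation, assuming a plain sieve for the primality table. The name `buildAnswers` is hypothetical. Summing the recurrence up also suggests that ans[n] is simply the number of primes in the range (n/2, n], which makes a handy sanity check.

```cpp
#include <vector>

// Returns ans[0..maxN], where ans[n] is the answer for N = n (ans[0] = ans[1] = 0).
// buildAnswers is a hypothetical helper name; maxN is assumed to be non-negative.
std::vector<int> buildAnswers(int maxN) {
    // Plain Sieve of Eratosthenes for the O(1) primality lookups.
    std::vector<bool> isPrime(maxN + 1, true);
    isPrime[0] = false;
    if (maxN >= 1) isPrime[1] = false;
    for (long long p = 2; p * p <= maxN; ++p)
        if (isPrime[p])
            for (long long m = p * p; m <= maxN; m += p) isPrime[m] = false;

    std::vector<int> ans(maxN + 1, 0);
    for (int i = 2; i <= maxN; ++i) {
        ans[i] = ans[i - 1];
        if (i % 2 == 0 && isPrime[i / 2]) --ans[i];  // the prime i/2 is marked as a factor of i
        if (isPrime[i]) ++ans[i];                    // the prime i itself stays unmarked
    }
    return ans;
}
```

With the table built once, each query is a single array lookup such as `buildAnswers(10000000)[n]`.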

I had absolutely no clue at the first glance. Really nice problem.

## Monday, March 25, 2013

### SPOJ Problem Set (classical) 2737. Perfect Rhyme

A perfect rhyme is not a crime,
it is something that exceeds time,
a bit of science, a piece of art,
soft as a pillow, sharp as a dart.

I really love this little rhyme.

Basically the problem is this: you are given a dictionary of words and some query words. For each query word q, you have to find a dictionary word u such that u != q and the common suffix of q and u is of the maximum possible length. In case there is a tie, the problem requires the lexicographically smallest such word.

This problem can be solved using STL maps, storing the suffixes along with their sorted ids, but there is a more elegant solution using a Trie, i.e. a prefix tree. As we are interested in maximizing the common suffix length, we store the strings in the Trie in reversed form, so that the suffixes become prefixes in this tree. For each dictionary word, we also store its index in sorted order as the termination marker for that word, so that we can easily find the id of a word during a query. On each node, we keep two additional pieces of information: the ids of the lexicographically smallest and second smallest strings passing through that node, which we call min1 and min2. Initially both should contain an infinite value.

Now for each query word, we also search for the word in reversed form. If it is not found in the tree (a path may exist, but the end marker is not present), the task is simple: we just return the lexicographically smallest id, which is min1, from the current node, i.e. the node we reached while trying to match the query string.

But extra care is needed when the query string is found in the tree, because then we have to look for another candidate, which is why we keep min2, i.e. the second lexicographically smallest index. If going from the current node cur to the next node x would inevitably lead to the query word itself and nothing else, we cannot follow that path. Instead, we decide which index to return at cur: if its min1 refers to the query word itself, we return min2, otherwise we return min1. And if we have no other choice but to end up at the exact matching point, then we can be sure that at least one other string follows the same path but does not end at our current node, i.e. at least two different words pass through it. Then, depending on our query word, we select min1 or min2 from the current node.
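Here is one way the reversed Trie with min1 and min2 could look. This is a sketch, not my submitted code: the names (`RhymeTrie`, `touch`, `query`) are made up for the example, and instead of deciding node by node whether to step further, it records the matched path and then walks back from the deepest node until it finds an id that is not the query word's own. The min1/min2 choice at each node is the same as described above.

```cpp
#include <algorithm>
#include <climits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Sketch of the reversed-word trie; all names are illustrative.
struct RhymeTrie {
    struct Node {
        std::map<char, int> next;  // child node index per character
        int endId = -1;            // sorted id of the word ending here, or -1
        int min1 = INT_MAX;        // smallest id passing through this node
        int min2 = INT_MAX;        // second smallest id passing through this node
    };
    std::vector<std::string> words;  // sorted, distinct dictionary
    std::vector<Node> nodes;         // nodes[0] is the root

    explicit RhymeTrie(std::vector<std::string> dict) : words(std::move(dict)), nodes(1) {
        std::sort(words.begin(), words.end());
        words.erase(std::unique(words.begin(), words.end()), words.end());
        for (int id = 0; id < (int)words.size(); ++id) {
            int cur = 0;
            touch(cur, id);
            for (auto it = words[id].rbegin(); it != words[id].rend(); ++it) {
                auto found = nodes[cur].next.find(*it);
                if (found == nodes[cur].next.end()) {
                    int created = (int)nodes.size();
                    nodes[cur].next[*it] = created;
                    nodes.emplace_back();
                    cur = created;
                } else {
                    cur = found->second;
                }
                touch(cur, id);
            }
            nodes[cur].endId = id;
        }
    }

    // Ids arrive in increasing order, so the first two seen at a node are min1 and min2.
    void touch(int node, int id) {
        if (nodes[node].min1 == INT_MAX) nodes[node].min1 = id;
        else if (nodes[node].min2 == INT_MAX) nodes[node].min2 = id;
    }

    // Returns the best rhyme for q, or "" if the dictionary holds no word other than q.
    std::string query(const std::string& q) const {
        std::vector<int> path{0};  // nodes matched along reversed q, root first
        for (auto it = q.rbegin(); it != q.rend(); ++it) {
            auto found = nodes[path.back()].next.find(*it);
            if (found == nodes[path.back()].next.end()) break;
            path.push_back(found->second);
        }
        // q is a dictionary word only if all of q matched and an end marker sits there.
        int self = (path.size() == q.size() + 1) ? nodes[path.back()].endId : -1;
        // Deepest node first: the first node offering an id other than q's own wins.
        for (auto it = path.rbegin(); it != path.rend(); ++it) {
            const Node& n = nodes[*it];
            int pick = (n.min1 != self) ? n.min1 : n.min2;
            if (pick != INT_MAX) return words[pick];
        }
        return "";
    }
};
```

With the dictionary {perfect, rhyme, crime, time}, querying "crime" walks down to the node for the suffix "ime", where min1 is crime itself, so min2 (time) is returned.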

So, if you know how to code a Trie, it is not really a hard one, but indeed a tricky one.

## Sunday, March 24, 2013

### SPOJ Problem Set (Classical) 224. Vonny and her dominos

This is exactly the same problem as DominoesFinding from the 2006 TopCoder Collegiate Challenge. I am not sure whether this problem can be solved with a bipartite matching algorithm or dynamic programming; probably both would exceed the time limit. But it can be solved by straightforward backtracking with a little bit of pruning. The backtracking idea is pretty simple: just keep track of which tiles are used (each tile must be used exactly once), and try filling the grid in row-major fashion. So there is no point discussing the solution; it is better to discuss why backtracking can be used here. I am not going to write this in my own words as it has already been written, so I will repost the analysis from TopCoder.

### DominoesFinding

#### by soul-net

Backtracking. Yes, that's it. Knowing that a problem is in fact solvable with a backtracking approach is most times a matter of intuition gained with experience. Anyway, in this and some other cases, there can be found more formal estimators that the idea is in fact THE idea.

I'll describe a possible backtracking approach, possibly the easiest to implement, but there are other possibilities. The idea is based on the fact that all squares must be used. For example, if we take the upper-left square of the board, we can see that we must connect it with one of its two neighbors. With this in mind, we can iterate over all squares and, each time we find an unused one, we know that we must match it with one of its two (or one) remaining neighbors -- or both, if we iterate in a column-row or a row-column fashion; when we find an unused square, we know that everybody in its upper-left rectangle is already used.

As we do this, we go marking each used piece and only continue trying if the new piece made by each new matching is "new". In this way, if we finally get all squares to be used, we know also that all pieces are used (because we managed to get no repeats) and then, we add 1 to the counter.

To be sure this approach works perfectly in time, you can conduct a little experiment and run the algorithm over an empty board without the "new piece" pruning. This will show you that there are less than 1.3 million ways to divide the board (1,292,697 actually), so it is perfectly feasible to try every one of them. Of course, the pruning of the "new piece" will reduce the running time dramatically in most cases.

There is also a good theoretical estimator that the approach will work in time, to convince ourselves before programming anything (many programmers think this is a must). There is a total of 56 squares in the board, our algorithm does nothing for half of them (when it finds them already used) and tries 2 or less cases for the other half (the ones it finds unused). This means the total number of leaves in the search tree will be bounded by 2^(56/2) = 2^28, which is roughly 268 million. This is pretty big, but considering it is a wide margin upper bound, it can be pretty well used as a "proof" that time limits won't bother.

View original analysis page from TopCoder.
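The row-major backtracking from the analysis can be sketched roughly as below. This is an illustration under my own assumptions, not the reference solution: the names (`Board`, `DominoCounter`, `countLayouts`) are hypothetical, the board is taken to be 7 x 8 with pip values 0 to 6, and the `requireDistinct` flag exists only so that the "new piece" pruning can be switched off to repeat the experiment on an empty board.

```cpp
#include <array>

constexpr int ROWS = 7, COLS = 8;  // a double-six set covers 7 x 8 = 56 squares
using Board = std::array<std::array<int, COLS>, ROWS>;  // pip values assumed to be 0..6

struct DominoCounter {
    const Board& board;
    bool requireDistinct;          // false switches off the "new piece" pruning
    bool covered[ROWS][COLS] = {};
    bool usedTile[7][7] = {};      // usedTile[lo][hi] with lo <= hi
    long long count = 0;

    DominoCounter(const Board& b, bool distinct) : board(b), requireDistinct(distinct) {}

    // Lay a domino on the two free squares (r, c) and (r2, c2), recurse, then undo.
    void place(int pos, int r, int c, int r2, int c2) {
        int lo = board[r][c], hi = board[r2][c2];
        if (lo > hi) { int t = lo; lo = hi; hi = t; }
        if (requireDistinct && usedTile[lo][hi]) return;  // this piece is already on the board
        bool saved = usedTile[lo][hi];
        usedTile[lo][hi] = covered[r][c] = covered[r2][c2] = true;
        solve(pos + 1);
        covered[r][c] = covered[r2][c2] = false;
        usedTile[lo][hi] = saved;
    }

    // Scan squares in row-major order; every square before pos is already covered.
    void solve(int pos) {
        while (pos < ROWS * COLS && covered[pos / COLS][pos % COLS]) ++pos;
        if (pos == ROWS * COLS) { ++count; return; }
        int r = pos / COLS, c = pos % COLS;
        if (c + 1 < COLS && !covered[r][c + 1]) place(pos, r, c, r, c + 1);  // horizontal
        if (r + 1 < ROWS) place(pos, r, c, r + 1, c);  // vertical; the square below is always free
    }
};

// Counts the layouts of the board; with requireDistinct = true every piece is used at most once.
long long countLayouts(const Board& board, bool requireDistinct = true) {
    DominoCounter counter(board, requireDistinct);
    counter.solve(0);
    return counter.count;
}
```

Calling `countLayouts(board, false)` on any board counts every way to divide the 7 x 8 grid, which should reproduce the 1,292,697 figure quoted in the analysis, while the default call only counts layouts in which no piece repeats.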