Talk on Graph Theory

The topics covered in the talk were Representation and Types of Graphs, and Searching in Graphs (BFS + DFS). The slides used for the talk are attached.

Some related problems are:
Connected Components using BFS/DFS
http://www.codechef.com/problems/CDOLP1
http://www.codechef.com/problems/FIRESC
http://www.codechef.com/problems/EMRGENCY
http://www.codechef.com/problems/GALACTIK
http://www.codechef.com/problems/CC
http://www.codechef.com/problems/INS01

Textbooks that can be used for reference in Graph Theory are CLRS and Steven Skiena. Since there was a request for resources to be uploaded, I’ve put them up here.
https://drive.google.com/folderview?id=0Bw94k-7BT0eua3Azam9QZlR5Yms&usp=sharing


Dynamic Programming Tutorial – Level 1 (Easy-Medium)- part 2/2


For those of you who read the first part of this tutorial: if you implemented the “Sum of paths in a triangle” problem according to what I had mentioned, you probably (or maybe definitely) encountered a “Time Limit Exceeded” error. Ok.. now please don’t say that I was wrong… I purposely gave that code so that you would actually understand what slow and possibly incorrect Dynamic Programming looks like. “THAT” was an example of what I had meant by a slow algorithm in the case of the Subset-Sum problem. The splitting up of sub-problems was right, the recursion was right, but still there was some tiny (or large.. :P) error that made it slow. This was because of the so-called “overlapping sub-problems”, or more informally (.. :P) doing the same thing again and again and again… and again.. 😛 . Let me explain what it actually means…
This was the example considered..
5
1 7
6 3 4
Here, while finding out path(1) and path(7), we have
path(1) = 1 + max(path(6), path(3))
path(7) = 7 + max(path(3), path(4))

Here we are finding path(3) twice, which really is not necessary, as we can reuse the result of the first computation in the second. So if we can somehow store this result somewhere and access it as and when it is required, the problem can be solved more efficiently. This technique is what makes dynamic programming so effective at solving problems that could otherwise never be solved in time. It is called “memoization”. “memoization” literally means memorization, but I have no idea why the creators of the word removed the “r” to make it complex.. :P. Try solving the triangle problem now and see if the solution gets accepted… The following is the code snippet (and PS:- it is correct this time.. :P)

int memo[105][105];

int computedp(int a[105][105], int n, int row, int col)
{
    if (row == n) return a[row][col];

    // SET ALL ELEMENTS OF MEMO TO -1 INITIALLY TO CHECK IF VISITED OR NOT
    if (memo[row][col] != -1) return memo[row][col];

    memo[row][col] = max(a[row][col] + computedp(a, n, row + 1, col),
                         a[row][col] + computedp(a, n, row + 1, col + 1));

    return memo[row][col];
}

Always look for a possibility of having overlapping sub-problems. If this happens, yeah you got it…. Use MEMOIZATION.

Ok as now nearly all the basics have been covered, let me move on to some of the well-known applications of dynamic programming…

1. Longest Increasing Subsequence (LIS)
Never say “Longest Increasing Subsequence”. Always say “LIS”. People start feeling that you are some pro programmer (yeah.. pro-programmer… :P) when you start using short forms. Ok.. let me begin by giving the definition of what a subsequence is.. Okay, forget boring definitions, I’ll explain by giving examples..
Consider an Array –{1,2,3,4,5,6,7,8,9}
Some of the subsequences are…
1 3 5 6 7 8 9
1 2 4 5
1 6 9
2 5
3 7
4 8 9
5 7 8

Examples which are not subsequences…
4 1 3
6 2 8
1 5 6 7 8 0
9 8 7 6 5 4
3 2

I hope the examples have conveyed the definition of a subsequence. Any selection of elements that keeps their left-to-right order from the original array (you may skip elements, but you may not reorder them) is a subsequence.

Consider another example… – {4,1,5,2,7,6}
The subsequences are…
4 1 2
4 7 6
1 5 2
1 2 7 6

Examples which are not subsequences…
1 2 4
1 6 7

Ok now the problem of “Longest Increasing Subsequence” would be to find the length of the longest subsequence such that the elements keep on increasing from left to right.
The answer in the previous example would be :- 3
There are many possibilities such as {1,5,6},{1,2,7},{1,2,6}..etc.

REMEMBER :- The LIS solution only gives the length of the longest subsequence and not the elements that comprise it…

Ok.. first of all, this is a dynamic programming problem…
The aim of Dynamic Programming is to solve smaller sub-problems and use them to solve larger problems. That’s exactly what I’ll be doing here. But this time I’ll use a slightly different approach. Previously I showed you an example of a top-down dp solution with memoization. Now I’ll show you how memoization can be directly avoided (but still be indirectly present in the solution) using the so-called bottom-up dp approach. It’s the same thing as its top-down counterpart, but instead of going from a larger problem to smaller problems, we start from smaller problems and move on towards the larger problem. (As an exercise, I encourage you to solve the triangle problem using the bottom-up dp approach).

Okay, so coming to defining what the smaller subproblems are…
Let me be a little conclusive over here…. Defining smaller problems is not something that can be learnt by reading tutorials; it’s something that develops over time, the more you practice. Seeing and solving different problems gives you a hint on how the subproblems should be defined so as to solve the main problem. So now…. what exactly could be the smaller problems for the LIS problem? The LIS problem is to find the Longest Increasing Subsequence of an array, say of length ‘n’. So a possible option would be to find the LIS for smaller prefixes of length n-1, n-2, … and so on, down to a single element.
Finding solutions for these and combining them will give you the answer for the entire array.

Let’s see an example :- 5 1 2 4 3, an array of length 5

As LIS always deals in an order, that is from left to right,
LIS(1) = 1 {5 is the only element present. Hence length of longest increasing subsequence is 1}
LIS(2) = 1 { remember, the 2 over here is the length of the prefix of the array from left to right; the possible sequences are either 5 or 1, and 5,1 is not possible as it is not increasing }
LIS(3) = 2 {1,2 is increasing}
LIS(4) = 3
LIS(5) = 3

Hence the Longest Increasing Subsequence of the above example has length 3. Seems pretty easy, eh..? Solving subproblems and combining them…? So in this example we used our so-called super-human brains to solve it. But to program it on a stupid computer is not going to be that easy. Care should be taken to make it perfect…

int dp[105];

int LIS(int a[], int N)
{
    // dp[i] = length of the longest increasing subsequence ending at position i
    for (int i = 1; i <= N; i++) dp[i] = 1;
    // initialise all LISs to 1.. you figure out why..? 😛

    int best = 1;
    for (int i = 1; i <= N; i++)
    {
        for (int j = 1; j < i; j++)
            if (a[j] < a[i])                    // only extend if it stays increasing
                dp[i] = max(dp[i], dp[j] + 1);
        best = max(best, dp[i]);
    }
    return best;   // the LIS can end at any position, so take the best of all
}

The above is the code to find the LIS of a sequence. I am not explaining the code line by line, I’ll just introduce you to the logic applied. We check from left to right, starting from an array of length 1 up to an array of length N. …Oh yeah, and by the way, the array is 1-indexed (i.e. it starts at 1 and ends at N). Each time, as we go from left to right, we add 1 element. We compare this element to each of the previous elements. If it is greater (as it is called the longest INCREASING subsequence… duh.. :P), we add 1 to the LIS found at that intermediate element. This may be hard to understand, but just read the code a few times and you should be comfortable with it.
I’ll take the following example to show it..
4 1 2 3 – N=4

If LIS(3) is being considered.. (this means that we are finding the LIS of the prefix of length 3 in the array, that is, up to the element 2), we check if the element 2 is greater than each of the previous elements. If it is greater, we add 1 (because this element is greater, thus it increases the length by 1) to the LIS at that point. Continuing this till N gives us the LIS for the entire array. This is the final answer… And the reason max() is used is to find the longest among the many possible cases where the element could be greater than several of the intermediate elements…

2. Coin Denomination Problem

I’ll just briefly state the problem and the solution. I leave it up to you to write the code….
The CD problem (lol.. CD was a bad short form (and just for the info, I made it up)… :P) requires you to find the minimum number of coins to be used so that the total value of these coins equals a given sum.
For example: consider you have coins of denominations 1, 3 and 4, and you have to make 10 with them. The best way would be 3 + 3 + 4 = 10. The method to find this would again be bottom-up dp, solving the subproblems first (where a subproblem means making a sum of 1, a sum of 2, … etc.). Following an approach similar to that for LIS, the solution should be obtained…

So that’s pretty much it for easy dp problems. Again, as I said before, just knowing the solutions won’t help. When a random real-world problem is given to you, you should be in a position to figure out that it’s a dp problem and apply the concepts to solve it. I have introduced nearly all the concepts needed to get you started in the vast world of dynamic programming.. It’s up to you to use them wisely.. :P (that was LAMEEEEEE…). So, that concludes level 1. Level 2 will be mostly about difficult dp problems where the application of dp concepts is not direct. I am still in the process of learning those, so in the meanwhile, log in to your favourite programming website, and start DPing….!!!! TC N HF…!!!

Dynamic Programming Tutorial – Level 1 (Easy-Medium)- part 1/2


WARNING : This tutorial is only for newbies who have no clue about what Dynamic Programming is and want to do well in it in programming contests. The tutorial is best if read from top to bottom, but you are free to skip to topics of your desire.
Ok..Hmmm… Dynamic Programming… Dynamic.. eh? What is the first thing that comes to your mind when you hear the word dynamic? Things moving…Things which are not static..Yes, those are precisely what dynamic means.

But now the catch…, Dynamic Programming has actually got nothing to do with those things. Dynamic Programming is just a fancy way of telling some normal guy (not a programmer) that you cheated while solving an algorithmic problem. Normally, algorithmic problems involve searching for/finding something, which requires tedious work when a human does it but can be done faster by a computer. But there are also cases when even a computer does it slowly, because of a slow algorithm (people normally call this brute force, meaning checking all possibilities blindly). Such algorithms are normally exponential in the size of the input (which is very BAD).
For example, consider a list of elements – 5 3 8 1 2 10 23. Find the number of distinct sets of elements such that the sum is, say, 12. Now for this there could be many possibilities like {8,3,1}, {10,2} and so on. Finding the number of distinct sets in this example is easy. But what if the size of the input was 100 or 1,000 or even 100,000? Then things could start getting messed up. Now try programming this on a computer. The simplest method would be to actually try all possible combinations like {5,3}, {5}, {5,3,8}, {8,1}. This would be easy to code, but would perform a huge number of unnecessary operations. Hence the time taken to execute the program would be enormous (when you actually code it, you may have to wait nearly 4-5 minutes for the program to end..).

The aim of dynamic programming is to make this time less than the actual time you take to say… DYNAMIC PROGRAMMING. 😛 .
This problem is formally known as the Subset-Sum Problem (duh… 😛 ), which is an example of the Knapsack Problem (leave these for now.. you will become familiar with them as you practice more and more…).

Now, a formal definition of Dynamic Programming..(actually a bit informal..:P)
-Trying to split a large problem into smaller problems such that the smaller problems are solved in the best way possible.
If the smaller problems are solved in the best way possible then, as the large problem is a collection of smaller problems, the larger problem can also be solved in the best way possible.

Now the aim is to solve the smaller problems. Like in the Subset-Sum problem, a smaller problem can be thought of as finding the solution, say, for a sum of 1 or 2. Using this, an efficient solution for the main problem can be found.
Another way to think of a dp solution is as a recursion (a function which depends on the results of smaller inputs, such as the following..)
Fibonacci – fib(n)=fib(n-1)+fib(n-2).
So if you can get the recursion for your dynamic programming problem, you are nearly done with your solution.. 😀 . Ok.. now coming towards the basics..

Try Solving this problem on codechef.. one of my favourites.. It’s pretty easy..
http://www.codechef.com/problems/SUMTRIAN

People learn more when they are shown the problem and then the solution, rather than the opposite. So solve the problem first (or at least try to.. 😛 ) and please DO NOT look at the solution before trying it.
OK.. for those people who have solved it after hearing of dynamic programming for the first time, CONGRATS..!!, even some of the world’s top programmers haven’t been able to achieve what you have done. But for others, don’t feel discouraged; after all, some of the world’s top programmers were also like you at some point.. 😛 . Ok, coming to the solution.., this problem is a typical example of dynamic programming.
Remember the aim – Solve smaller sub-problems and use them to solve the large problems.
For the purpose of this tutorial, I am considering the following example…
5
1 7
6 3 4
One of the methods is the following. Start from the top.. (also known as the top-down approach). The longest path from 5 can be given as…
path(5) = 5 + max(path(1), path(7))
This means that if we solve the problem in row 2, i.e. elements 1 and 7, we automatically have the solution for row 1. Each row depends on the ones below it. Continuing this till we reach the last row, we solve the problem. In the last row, we just have individual elements and not paths, so there is no need to define a path for a row after the last one. The code snippet is shown below

int computedp(int a[105][105], int n, int row, int col)
{
    if (row == n) return a[row][col];
    return max(a[row][col] + computedp(a, n, row + 1, col),
               a[row][col] + computedp(a, n, row + 1, col + 1));
}
Don’t just memorize dp solutions. It involves more of a technique than an algorithm. And like all techniques, it only gets better with practice.

Try solving the following problems. If you are a beginner in dp, then you may find these problems hard. But no worries, it happens to everyone. Read tutorials for these problems on any blog/forum/programming site and practise more. Hard work is the key to Dynamic Programming. And PS:- just for the info, solving 300 problems from easy to hard makes you only an above-average dynamic programmer.. 😛 . So people wanting to do well in programming contests should focus on this. The difference between the world’s best competitive coders who come 1st and 2nd is their skill in dynamic programming..!!

READ THE TUTORIALS FOR THESE PROBLEMS IF YOU CANNOT SOLVE THEM. Take your time .. it may take days to understand the following problems and implement them.
These are the easiest dp problems. So do not skip them, if you don’t know how to solve them.

http://www.codechef.com/problems/FCBARCA
http://www.codechef.com/problems/DBOY
http://codeforces.com/problemset/problem/166/E
http://codeforces.com/problemset/problem/363/B
http://codeforces.com/problemset/problem/118/D
http://codeforces.com/problemset/problem/159/D
http://www.spoj.com/problems/PARTY/

The next part of this tutorial deals with some of the well known dp problems such as Longest Increasing Subsequence, Coin Denomination problem, Knapsack etc.

First talk on STL & Implementation

NITK-CCC conducted the first talk of the semester with the motive of getting more first and second years inclined towards competitive programming and participating in CodeChef contests. The talk focussed on getting acquainted with programming in C++, with a lot of stress on using the Standard Template Library to write clean and easy code.

Here is the presentation that was made at the talk.

Rabin-Karp String matching algorithm

Moving a level above the naive approach to string matching can be done by making use of the comparisons done during each iteration for the next iteration. In the brute force approach, say we compare T[a…b] with the pattern P in one iteration; in the next iteration we compare T[a+1…b+1]. So we are comparing the same characters again and again, which leads to inefficiency.

The Rabin-Karp algorithm makes a better attempt at solving the above problem. Before giving the implementation, let us define a few steps to easily comprehend it.

1. Let us first define the string as a collection of digits only, say over the set Q = {0…9}, from which the pattern and text are generated. Later we will carry the idea over to characters; for now, there are 10 “characters” in the alphabet Q.
2. Let p denote the decimal value of P[1…m], where m is the length of the pattern. For example, if the pattern is “32145”, the corresponding decimal value is 32,145.
Let t(s) denote the decimal value of the text window txt[s+1…s+m], where s is the shift value.

3. We can compute p in O(m) time using HORNER’S RULE:
p = P[m] + 10 (P[m-1] + 10 (P[m-2] + … + 10 (P[2] + 10 P[1])…))   → the 10 is the number of characters

Similarly, for shift value s = 0 we can calculate the value t(0) of txt[1…m] in O(m) time, and t(1), the value of txt[2…m+1], can again be calculated in O(m) time.

But in fact we can compute each subsequent value incrementally:

t(s+1) = 10 (t(s) – 10^(m-1) T[s+1]) + T[s+m+1]

This is nothing but removal of the first digit and addition of a new last digit. Subtracting 10^(m-1) T[s+1] removes the highest-order digit and adding T[s+m+1] brings in the appropriate lowest digit.
Hence we observe that we can calculate t(s+1) from t(s) in constant time, provided we calculate t(0) in O(m) time.
We can pre-compute the constant 10^(m-1) in O(m) time with a simple loop, as in my implementation (with fast exponentiation it could even be done in O(log m)).

But a major problem we will face is that the decimal values may be too large, leading to overflow. Hence we compute p and the t(s) values modulo a suitable prime number; all n−m+1 window values together then take only O(n−m+1) time.

Also, if we extend the decimal-value concept to characters (the 256 possible characters), this actually amounts to computing a hash value of each text window and comparing it with the hash value of the pattern.
But the hash value of t(s) matching p does not necessarily imply that the two strings are the same. Hence, if the two values are equal, we proceed to check the equality of the characters themselves. If the two hash values are equal but the characters are not the same, it is called a SPURIOUS HIT.

Example of calculating t(s+1) % mod from t(s) % mod, where mod is a prime number (window length m = 5):

Precompute :     10000 % 97 = 9        (this is 10^(m-1) % 97)
Previous Hash :  41592 % 97 = 76
Next Hash  :     15926 % 97 = ??

OBSERVATION :  15926 % 97 = (((41592 % 97) – ((4 % 97) * (10000 % 97))) * (10 % 97) + (6 % 97)) % 97
            =  ((76 – 4 * 9) * 10 + 6) % 97
            =  406 % 97
            =  18

The worst-case complexity is again O(m(n-m+1)), which makes it inefficient in theory, but its best and average case is O(n+m), which makes it better than the naive approach. It is also extremely useful in 2D pattern matching.
So, here’s the final implementation in C++


/*
Rabin Karp Fingerprint Algorithm:
The average and best case running time of the Rabin-Karp algorithm is O(n+m).
Worst-case time is O(nm).
*/

#include<iostream>
#include<cstring>
#define d 256                  // The dictionary of 256 characters
using namespace std;

void search(char *txt, char *pat, int q){
    int m = strlen(pat);
    int n = strlen(txt);

    int i, j;
    int p = 0;                 // hash value for pattern
    int t = 0;                 // hash value for text
    int h = 1;                 // d^(m-1) % q value

    // Compute the value of d^(m-1) --> pre-processing in linear time
    for(i = 0; i < m-1; i++){
        h = (d*h) % q;
    }

    // Multiply each character by its place value and obtain the hash value.
    // Initial hash values of the current sliding text window and of the pattern.
    for(i = 0; i < m; i++){
        p = (d*p + pat[i]) % q;
        t = (d*t + txt[i]) % q;
    }

    for(i = 0; i <= n-m; i++){

        // Check if the current sliding window of text and the pattern have the same hash value
        if(t == p){
            // Check if all characters are the same or it's a SPURIOUS HIT!
            for(j = 0; j < m; j++){
                if(txt[i+j] != pat[j]) break;
            }
            if(j == m) cout << "Pattern matched at index " << i << endl;
        }

        // Now compute the hash of the next sliding window using the previous value..
        if(i < n-m){
            t = (d*(t - txt[i]*h) + txt[i+m]) % q;

            // We might get a negative value of t; convert it to positive
            if(t < 0)
                t = (t + q);
        }
    }
}

int main(){
    char a[500], b[500];
    cout << "Enter the string" << endl;
    cin.getline(a, 500);       // gets() is unsafe (and removed from modern C++), so use getline
    cout << "Enter the sub-string you wish to search for" << endl;
    cin.getline(b, 500);
    int q = 7;                 // Let the prime number for computation be 7
                               // (a larger prime would give fewer spurious hits)
    search(a, b, q);
    return 0;
}

Fast Multiplication – explanation

Here is our first tutorial on fast multiplication. This article is contributed by Aditya Kadam of 3rd year.

I will try to explain the working of the fast multiplication function in brief. If we want to do an operation like 10^18 * 10^18 % MOD, where MOD is also of the order of 10^18, direct multiplication will overflow even unsigned long long: the largest integer type in C or C++, unsigned long long, cannot store numbers beyond about 1.8 * 10^19. This was the reason many people got WA for MTRICK in the January 14 Long on CodeChef.

The best way to solve this is to split the multiplication into steps so that the modulo operator can be applied at intermediate steps to avoid overflow. The idea is similar to the one used in fast exponentiation. We can write a * b as 2 * (a * (b/2)) when b is even, and as a + 2 * (a * (b/2)) when b is odd (here b/2 is integer division). Again, a * (b/2) can be split the same way. This is the recursion we use to calculate the value of a * b. As we are dividing b by 2 at each step, we get the answer in logarithmic time, i.e. O(log b).

Now if we want to multiply a by b and b is, say, 13, then a * b = a * 13 can be written

a * 13 = a + 2 * (13/2 * a)                 (13 is odd)
       = a + 2 * (6 * a)
       = a + 2 * (2 * (6/2 * a))            (6 is even)
       = a + 2 * (2 * (3 * a))
       = a + 2 * (2 * (a + 2 * (3/2 * a)))  (3 is odd)
       = a + 2 * (2 * (a + 2 * (1 * a)))

This splitting can easily be done by recursion. The main advantage of splitting is that we can take the modulo in the intermediate steps, which is not possible in direct multiplication. Note: b>>1 gives the quotient obtained when b is divided by 2.

Now consider the code:

long long multiple(long long a, long long b, long long c)  // Code for a * b % c
{
    if (b == 0) {   // Base case: a * 0 = 0
        return 0;
    }

    long long ret = multiple(a, b >> 1, c);  // Multiply a by (b>>1).

    ret = (ret + ret) % c;  // We double the value of ret, i.e. 2 * (a * (b>>1)).
                            // Take MOD in this step.

    if (b & 1) {  // implies b is ODD
        ret = (ret + a) % c;
            // If b is odd then we express it as a * b = a + 2 * (a * (b>>1)).
            // We have already computed a * (b>>1) by recursion and doubled it
            // in the previous step; we now add the extra a to it.
    }

    return ret;
}

Hope the explanation is clear. Please feel free to ask anything if you have doubts.

(P.S. The original tutorial was posted on Codechef here. You might want to have a look at it for further details)

Competitive Programming Lecture Series

We are planning to hold a series of lectures on the essentials of Data Structures and Algorithms required for competitive programming. We will be covering the following topics during the semester :

1. Basics of Standard Template Library (STL) and implementation problems
2. Binary Search and Applications
3. Dynamic programming, Greedy and recursion
4. Matrix exponentiation
5. Data structures-
– Tries
– Suffix trees and arrays
– BIT
– Segment Trees
6. Graph algorithms
– searching (BFS+DFS)
– Components
– Cut vertices

7. String Algorithms
– String Matching (Exact and Approximate)
– Manacher’s algorithm

8. Fast Fourier Transformation for convolutions.

If you wish to have any other topics covered as a part of the series, please feel free to mention them in the comments.

All the talks will be conducted by those who have been a part of teams that took part in ACM-ICPC this year.

Looking forward to great participation, especially from the 1st and 2nd years.

PS: Exciting CodeChef merchandise will be given during the sessions

All those interested in attending the session please fill up the form below
https://docs.google.com/forms/d/1FTN9tSDnbltt_aoBfBH9z_wFvXXq0PHHBFnWrVXt_lo/viewform