Lab 1: Flows and matchings

In this lab you will, in three steps, write a program that takes a bipartite graph as input and produces a matching of maximum size as output, by reducing (transforming) the matching problem to the flow problem. You will check the correctness and efficiency of the solutions to the three steps by sending them to Kattis, see separate instructions. You have to be approved by Kattis on each of the three steps, and then show the code to a teaching assistant. Kattis both checks that the program is correct and that it solves the problem fast enough.

Step 1: Reduce the problem to the flow problem

You should write a program that solves the matching problem by using a black box that solves the flow problem. The program should work in the following way:

Read the input instance for the matching problem from the standard input.
Translate the matching instance to a flow problem instance.
Write the input to the flow problem to the standard output (ensure that output is flushed!).
The black box solves the flow problem.
Read the output of the flow problem from the standard input.
Translate the solution to the flow problem to a solution to the matching problem.
Write the output of the matching problem to the standard output.

See below how the input and output formats for the matching and flow problems are defined.

Your program should solve the problem efficiently. Kattis will run the program on bipartite grapha of up to (5000+5000) vertices and up to 10000 edges. Kattis knows the problem as adkreducetoflow. There is a program skeleton for stage one in a few different languages in the directory /info/algokomp12/labb2/exempelprogram

Step 2: Solve the flow problem

Now you are going to write a program that solves the flow problem. The program will read input from standard input and write the solution to the standard output. See below how the input and output formats for the flow problem are defined.

Your program should solve the problem efficiently. Kattis will run the program on general flow graphs of up to 2000 vertices and 10000 edges. Kattis knows the problem as adkmaxflow.

Step 3: Combine step 1 & 2

In Step 1, you solved the matching problem by using a solution to the flow problem. In Step 2, you solved the flow problem. Now you should combine these solutions into a single program by changing the communication of the flow instance by using a fuction call instead of using standard input and standard output. The program will continue to read input from standard input and write the solution to the standard output. Your program should solve the problem efficiently. Kattis will run the program on bipartite graphs of up to (5000+5000) vertices and up to 10000 edges. Kattis knows the problem as adkbipmatch.

The matching problem

Given a bipartite graph G = (X,Y,E) find a maximum matching.

Input

The first line consists of two integers representing the number of vertices in X and Y, respectively.
The second line consists of a number indicating |E|, i.e. the number of edges in the graph. The following |E| rows each consist of two integers corresponding to an edge.

The corners are numbered from 1 upwards. If you entered a vertices in X and b vertices in Y we let X = {1, 2,..., a} and Y = {a+1, a+2,..., a+b}. An edge is indicated by the endpoints (first the X-vertex, and then the Y-vertex).

Example: A graph can for example be represented as follows.

This graph thus has X = {1, 2} and Y = {3, 4, 5}. The edge set E contains the edges (1, 3), (1, 4), (2, 3) and (2, 5).

Output

First write a line which is the same as the first in the input, and then a line consisting of an integer that specifies the number of edges of the found matching. Thereafter write one line for each edge in the matching. The edge is described by a pair of integers in the same way as in the input.

Example: If we have the graph above as input, the output can look like this:

The flow problem

Given a flow graph G = (V,E) find a maximum flow. Solve the flow problem using Edmonds-Karp's algorithm, that is Ford Fulkerson's algorithm where you always use the shortest path found by breadth first search.

Ford Fulkerson's algorithm in pseudocode

c[u,v] is the capacity from u to v, f[u,v] is the flow, cf[u,v] is the residual capacity.

foreach edge (u,v) in the graph do
    f[u,v]:=0; f[v,u]:=0
    cf[u,v]:=c[u,v]; cf[v,u]:=c[v,u]
while there is a path p from s to t in the residual flow graph do
    r:=min(cf[u,v]: (u,v) is in p)
    foreach edge (u,v) in p do
         f[u,v]:=f[u,v]+r; f[v,u]:= -f[u,v]
         cf[u,v]:=c[u,v] - f[u,v]; cf[v,u]:=c[v,u] - f[v,u]

Input

The first line contains an integer specifying the number of vertices in V.
. The second line consists of two integers s and t defining the vertices of the source and outlet. The third line consists of a number indicating |E|, i.e. the number of edges in the graph.
The following |E| lines consist each of three integers corresponding to an edge.

The vertices are numbered from 1 upwards. If you entered a vertices of V we let V = {1, 2,..., a}. An edge is represented by its endpoints (first the from-vertex and them the to-vertex) followed by its capacity.

Example: A graph may e.g. be represented in the following way.

Output

The first line contains an integer specifying the number of vertices in V.
The second line consists of three integers s,t, and the flow from s to t.
The third line consists of an integer that specifies the number of edges with positive flow.
Thereafter there is one line for each such edge. The edge is represented by three numbers in a similar way as the input, but instead of the capacity, we now have the flow.

Example: If we have the graph above as input, the output might look like this.

Testing

In the directory /info/algokomp12/labb1 you find the programs bipgen, flowgen, maxflow, combine and matchtest that you can use to test your programs.

The program bipgen generates a random bipartite graph. The graph is written to standard output in the matching problem input format above.

/info/algokomp12/labb1/bipgen x y e

gives a graph of x vertices in X, y vertices in Y and e edges.
The program flowgen generates a random flow graph. The graph is written to standard output in the flow problem input format above.

/info/algokomp12/labb1/flowgen v e c

gives a graph of v vertices and e edges which capacities are positive integers not greater than c.
The program maxflow solves the flow problem and can be used as the black box in step 1. maxflow takes a flow graph on standard input and writes a maximum flow on standard output.
The program combine is a small program that you can use in step 1 in order to get your program to speak to the black box.

/info/algokomp12/labb1/combine java MatchReduce \; /info/algokomp12/labb1/maxflow < graphfile > matchingfile

will run java MatchReduce as solution to step 1, and use the maxflow-program as black box. The input graph is read from the file graphfile and the output is written to the file matchingfile.
The program matchtest reads a graph followed by output from a matchings program (i.e., first the graph and then the matching) and checks that the matchinge is of maximum size. The output is written to standard output and is either Matchning av maximal storlek (meaning matching of maximum size), Matchning av mindre än maximal storlek (matching of less than maximum size) or Ingen matchning (no matching).
Here is an example showing how you can use bipgen and matchtest in order to test your solution of step 3 (mylab).
/info/algokomp12/labb1/bipgen 5000 5000 10000 > graphfile
./mylab < graphfile > matchingfile
cat graphfile matchingfile | /info/algokomp12/labb1/matchtest
Some documented test cases for testing the three steps can be found in the directory /info/algokomp12/labb1/testfall/

If you are now aware of the meaning of the characters >, < and | in the examples above, please look it up in the Introduction to Unix booklet or ask a teaching assistant. In order to check the execution time of your program on your own test cases you can use the Unix command time and look at the user time..

Theory exercises

To be solved individually or in pairs and handed in at the lab session February 8, 2012.

Compare the time complexity of the algorithm when the graph is implemented as a neighboring matrix and when it is implemented with neighbor lists. (In order to implement the statement f[v,u]:= -f[u,v] efficiently, the neighbor list implementation has to be expanded so that each edge has a pointer to the opposite edge.)
Express the time complexity in n and m where n is the total number of vertices and m is the number of edges in the bipartite graph. Then select the implementation which is fastest when m=O(n), i.e. the graph is sparse.
Donald believes that if we start with a bipartite graph G and transform it into a flow graph H with a new source s and a new outlet t, then the distance from s to t will be 3.
Donald hence thinks that the BFS-step always will find a path of length 3 in the residual flow graph (if there is any path).
Donald's first claim is true, but not the second. Why haven't the paths that BFS finds in the residual flow graph necessarily the length 3? How long can they be?
The reason for the fact that bipartite matching can be reduced to a flow problem is that a solution to the flow problem can be interpreted as a solution to the matching problem. This applies only if the flow of the algorithm provides an integer flow (where the flow in each edge is an integer), which in this case means that the flow along an edge is either 0 or 1. Luckily, this is true. Prove that the Ford-Fulkerson algorithm always generates integer flows if the edge capacities are integers!