### A. Cluster Analysis

Author: Suhendry Effendy.
Prepared by: Suhendry Effendy, Felix Halim.
Problem: A – Cluster Analysis

This is the easiest problem in the contest.

The low constraint (N ≤ 100) allows contestants to solve this problem with less efficient solutions. For example, you can construct a boolean N × N matrix where element (i, j) denotes whether the ith and jth elements differ by at most K. Then you can use a disjoint-set data structure and join i and j for each true element in the matrix to form the clusters. Alternatively, you can use a graph traversal algorithm (e.g., BFS or DFS) to count the clusters.

All the methods mentioned above are overkill (I know several teams used them).

There is an easy O(N lg N) solution to the problem:

1. Sort all integers in array A; call the sorted array A’;
2. Count the adjacent pairs (A’[i], A’[i+1]) in A’ whose difference is more than K (each such gap separates two clusters); let this count be X;
3. The output is X + 1.
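
The three steps above can be sketched in C++ as follows. This is a minimal sketch rather than the judges' code: the function name `countClusters` and its parameters are made up for illustration, and input parsing is left out.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns the number of clusters for values `a` and threshold `k`.
// Two values belong to the same cluster if they are linked by a chain of
// differences of at most k.
int countClusters(std::vector<long long> a, long long k) {
    if (a.empty()) return 0;            // no elements, no clusters
    std::sort(a.begin(), a.end());      // step 1: A' is the sorted copy of A
    int gaps = 0;                       // step 2: X, the number of gaps larger than k
    for (std::size_t i = 0; i + 1 < a.size(); ++i) {
        if (a[i + 1] - a[i] > k) ++gaps;
    }
    return gaps + 1;                    // step 3: the answer is X + 1
}
```

For example, with A = {1, 5, 2, 10} and K = 1 the sorted array is {1, 2, 5, 10}, which has two gaps larger than 1, so the function returns 3.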

### 19 Responses to “ACM-ICPC 2014 Jakarta”

1. The BEST thing about the Jakarta regional contest, rarely seen in any other contest (aside from the dinner party), is the editorial to the problems. That is, this blog is the best part of the event. Not even the World Finals has such an editorial.

I would like to share a rather strange experience as a contestant. This is for problem D. As Suhendry wrote, that problem is purely combinatorial. Do some basic combinatorial problems and this problem should not be hard.

Our first submission got ‘wrong answer’, although we were sure our solution was correct. We then double-checked for overflow errors, but we could not find any. We literally applied the MOD operator after each statement. There was no statement such as {long long} = {int} * {int}. I remember we had some {long long} = {int} * {long long}, but that should be all right, shouldn’t it? I also printed long long with %I64d, as the previous contests at Binus used that format.

In desperation, we made the following ‘unimportant’ changes:
- no ints are used in mathematical operations.
- we scan and print using ints to avoid the %I64d format.
- after scanning, we typecast ints to long longs.
- before printing, we typecast long longs to ints.
Suddenly it got accepted! We still can’t explain why it didn’t get accepted the first time.

————————–

Okay, now this is for problem E.
The intended solution was explained in the blog. But one of my friends, who was not on my team, told me that their team solved it using a simple DFS. Well, it should get TLE, but if you are lucky you can get it accepted that way.

No wonder so many teams got it accepted while some top teams struggled with this problem.

• I don’t know what happened to your D, but I experienced a similar thing when preparing the problem. I made a mistake in the mod operation (a += b mod m, which is wrong because only b is reduced, not the sum). A common error on problem D is overflow (not sure whether this also happened to your D).
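
The two pitfalls mentioned here can be illustrated with a short C++ sketch. The helper names are hypothetical, the values are assumed to be non-negative, and the modulus is assumed to be small enough (around 10^9) that the product of two reduced values fits in 64 bits.

```cpp
#include <cstdint>

// Buggy pattern "a += b mod m": only b is reduced, so the result can exceed m - 1.
std::int64_t addModWrong(std::int64_t a, std::int64_t b, std::int64_t m) {
    a += b % m;
    return a;
}

// Reduce the whole sum instead of only one operand.
std::int64_t addMod(std::int64_t a, std::int64_t b, std::int64_t m) {
    return (a % m + b % m) % m;
}

// Multiplying two ints directly would be done in int and could overflow
// before the assignment; widening one operand first makes the
// multiplication happen in 64 bits.
std::int64_t mulMod(int a, int b, std::int64_t m) {
    return (static_cast<std::int64_t>(a) % m) * (b % m) % m;
}
```

Note that an expression of the form {int} * {long long} is already evaluated in long long, so it is safe as long as the mathematical product fits in 64 bits.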

As for problem E, some contestants also reported to me that they got accepted with a DFS for each Q query. It should be TLE. The mistake is on us: we did not code the DFS solution or any other linear-time-per-query solution (even though we were aware of it), and went straight to preparing data that (supposedly) makes such a solution TLE. It seems the data was not strong enough, so some teams got accepted with it.

2. I have a question about problem F.
When we check the second swords that have been satisfied by the first sword, what step must we take if no first sword has satisfied one of them?
In my opinion we must take one number (the length) to satisfy the second sword.
But I am still confused about how the greedy technique chooses that number.

• Let’s say each remaining (second) sword is a line segment [b, c]; then our task is to select a minimum number of points such that they cover all the given segments.

To get the idea of the greedy solution, imagine considering the points one by one, starting from the leftmost (e.g., 1) and moving one step to the right each time. At each point, ask yourself whether this point must be chosen. The keyword here is “must”: if we did not choose this point, some segment would be left uncovered, because the point is the right end of a segment that is not yet covered. Try to figure out by yourself why this strategy finds the minimum number of points needed to cover all the segments (it’s not hard to prove by contradiction, i.e., assume there is a better solution that chooses a point to the left of the “must” point).
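
A common way to implement this “must” idea without scanning every coordinate is to sort the segments by right endpoint and place a point at the right end of each segment that is still uncovered. The sketch below assumes closed segments [b, c]; `minPointsToCover` and `Segment` are illustrative names, and it covers only this sub-step, not the full solution to problem F.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

using Segment = std::pair<long long, long long>;  // (b, c) with b <= c

// Returns the minimum number of points such that every segment contains one.
int minPointsToCover(std::vector<Segment> segs) {
    // Process segments in increasing order of right endpoint.
    std::sort(segs.begin(), segs.end(),
              [](const Segment& x, const Segment& y) { return x.second < y.second; });
    int chosen = 0;
    bool havePoint = false;
    long long last = 0;  // most recently chosen point, valid once havePoint is true
    for (const Segment& s : segs) {
        if (!havePoint || s.first > last) {
            // The segment is not covered by the last point, so its right end
            // is a "must" point.
            last = s.second;
            havePoint = true;
            ++chosen;
        }
    }
    return chosen;
}
```

For the segments [1, 3], [2, 5], and [4, 6], the points 3 and 6 are chosen, so the result is 2.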

• How to handle cases like:
1
2
1 1 1
1 1 1

?

• @Joe:

I’ve just checked your case with my code, and the output is 2. After re-reading the problem and the solution in this post: suppose there are two dragons (as in your input), A and B; you need 2 swords to kill A (even though B <= A <= C), and the same set of swords can be used to kill B; thus, you only need 2 swords. The first sentence of the last paragraph of the analysis (F) contains the key phrase.

3. Problem E
For query Q A B
Did you check it by finding the parents of A and B, as in a disjoint-set data structure, for every query?

• Yes. We reverse the input order so that we can use a disjoint-set data structure for each Q A B query.

• So in the worst case, if we have a skewed binary tree with 5000 Q A B queries, where A is the last node and B is the second-to-last node, won’t that get TLE?

• Err… no. A disjoint-set (union-find) data structure, implemented as a forest with union by rank and path compression, has an amortized O(log* N) time complexity for each find query, i.e., for answering Q A B. Note that it’s log*, which grows much more slowly than log.

This disjoint set data structure is constructed using the input tree, but it is NOT the input tree.

You can find more about disjoint set data structure on the net, e.g.,
http://en.wikipedia.org/wiki/Disjoint-set_data_structure
Find “disjoint set forests”, especially the “union by rank”, of course combined with “path compression”.

It is one of the basic data structures you should know, and it’s very useful, for example, when you implement Kruskal’s algorithm to find a minimum spanning tree of a graph.
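
For reference, a disjoint-set forest with union by rank and path compression can be sketched in a few lines of C++. The struct and method names are illustrative, and nodes are assumed to be numbered from 0 to n - 1.

```cpp
#include <numeric>
#include <utility>
#include <vector>

struct DisjointSet {
    std::vector<int> parent;  // parent[x] == x means x is a root
    std::vector<int> rank_;   // upper bound on the height of each root's tree

    explicit DisjointSet(int n) : parent(n), rank_(n, 0) {
        std::iota(parent.begin(), parent.end(), 0);  // every node starts as its own set
    }

    // Path compression: every node visited is re-attached directly to the root.
    int find(int x) {
        if (parent[x] != x) parent[x] = find(parent[x]);
        return parent[x];
    }

    // Union by rank: the root with the smaller rank is hung under the other one.
    // Returns false if a and b were already in the same set.
    bool unite(int a, int b) {
        a = find(a);
        b = find(b);
        if (a == b) return false;
        if (rank_[a] < rank_[b]) std::swap(a, b);
        parent[b] = a;
        if (rank_[a] == rank_[b]) ++rank_[a];
        return true;
    }

    bool same(int a, int b) { return find(a) == find(b); }
};
```

With union by rank the trees stay shallow (height O(log N)), so the recursive `find` does not recurse deeply even when the input tree itself is skewed.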

4. Could you please explain the implementation of O(N.K.E.M) in problem C? Thank you

• @Lost Boy: for problem C, the naive transition from state dp[x][k][e][p] to another costs O(L), that is, brute-forcing the 4 possible outcomes of assist levels.

It can be reduced to O(1) by precalculating the top two transitions based on the state of [x][k][e] in O(L). Then, to pick the best outcome, we only need to decide from the two possibilities, thus the transition cost is now O(1).

• @FH: Oh? so is it O(N.K.E.L) or O(N.K.E.M) with M = 2?

@Lost Boy: Theoretically, both O(N.K.E.L^2) and O(N.K.E.L) are O(N.K.E) since L in this problem is constant (4). However, it seems this constant plays an important role here, so it’s good to know.

• @Suhendry: it’s O(N.K.E.L + N.K.E.L). One term is for precomputing the top two outcomes, the other for actually applying them.

Actually, I suggested that Suhendry bump L up to 10 so that it would make a bigger difference :P, but since we didn’t have enough time to modify the problem set, we just used L = 4.

5. Hi, sorry to bother you again, but after some time, I still don’t know how to solve D. Could you give any tips?

• You can count, for each position in [K+1..N], the cases where p will be selected (of course p will not be selected if it lies in [1..K]). In other words, focus on one particular position of p first (then you only need to evaluate all possible positions for p).

Let’s say p lies at P: [1, …, K, …, P, …, N]. Note that P should be between K+1 and N for p to be selected. There are 3 parts that you should analyse: [1..K-1], [K..P-1], and [P+1..N]. The segment [K..P-1] should be in increasing order (the ranks get worse), otherwise p at P will not be selected, and p should be < the rank at P-1. Gathering all the details, you just have to play with combinatorics (only nCk and n!). I don't remember the details, but that's the general idea.
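
The two building blocks named here, n! and nCk under a modulus, are usually precomputed once. The sketch below is generic and is not the solution to problem D: the modulus 1000000007 is an assumption (the problem's actual modulus is not stated in this post), it must be prime for the inverse computation to work, and the names `Combinatorics` and `powMod` are illustrative.

```cpp
#include <cstdint>
#include <vector>

const std::int64_t MOD = 1000000007;  // assumed prime modulus, not taken from the problem

// Fast exponentiation: base^exp mod MOD.
std::int64_t powMod(std::int64_t base, std::int64_t exp) {
    std::int64_t result = 1;
    base %= MOD;
    while (exp > 0) {
        if (exp & 1) result = result * base % MOD;
        base = base * base % MOD;
        exp >>= 1;
    }
    return result;
}

struct Combinatorics {
    std::vector<std::int64_t> fact, invFact;

    // Precomputes i! and (i!)^-1 modulo MOD for 0 <= i <= n.
    explicit Combinatorics(int n) : fact(n + 1), invFact(n + 1) {
        fact[0] = 1;
        for (int i = 1; i <= n; ++i) fact[i] = fact[i - 1] * i % MOD;
        invFact[n] = powMod(fact[n], MOD - 2);  // Fermat's little theorem
        for (int i = n; i > 0; --i) invFact[i - 1] = invFact[i] * i % MOD;
    }

    // nCk modulo MOD; 0 when k is out of range.
    std::int64_t choose(int n, int k) const {
        if (k < 0 || k > n) return 0;
        return fact[n] * invFact[k] % MOD * invFact[n - k] % MOD;
    }
};
```

Every intermediate product is reduced immediately, so no value exceeds roughly MOD squared, which fits in a signed 64-bit integer.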

6. I have a question about problem B.
I still do not understand how to verify a complete graph within the graph that we suspect is a “barbell”. Could you help me? Thanks.

• Pick an arbitrary node with degree |V|/2 – 1 and label it p. The neighbours of p are neighbour(p) = a1, a2, a3, …, ax, where x should be |V|/2 – 1. Now for each neighbour q of p, you need to check whether q is also connected to all nodes in ({p} ∪ neighbour(p)) \ {q}, i.e., p together with its neighbours, excluding q. A simple iteration suffices.

For example, let p = 1 and neighbour(p) = 2, 3, 4, 5. Now you have to check whether node 2 also connects to {1, 3, 4, 5}. Check whether node 3 connects to {1, 2, 4, 5}, and so on, do this to all p’s neighbours.
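
The check described above can be sketched as follows, assuming the graph is stored as an adjacency matrix with nodes numbered from 0. The function name is hypothetical; choosing p by its degree and the remaining checks for problem B are left to the caller.

```cpp
#include <vector>

// Returns true if p together with all of its neighbours forms a complete graph.
bool closedNeighbourhoodIsClique(const std::vector<std::vector<bool>>& adj, int p) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> group;  // p followed by neighbour(p)
    group.push_back(p);
    for (int v = 0; v < n; ++v) {
        if (v != p && adj[p][v]) group.push_back(v);
    }
    // Every pair of nodes inside the group must be connected.
    for (std::size_t i = 0; i < group.size(); ++i) {
        for (std::size_t j = i + 1; j < group.size(); ++j) {
            if (!adj[group[i]][group[j]]) return false;
        }
    }
    return true;
}
```

The pairwise loop is quadratic in the group size, which is the same amount of work as checking each neighbour q against the rest of the group.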

• Ah, I understand now. Thanks