Nov 08, 2013
 

F. Pasti Pas!

Author: Suhendry Effendy
(link to problem F)

First, notice that a greedy decomposition works: always take the shortest substring that is both a prefix and a suffix of the remaining string S, cut it off both ends, and repeat on what is left.

For the example given in the problem statement (“ABCADDABCA”), we can start by omitting the first (and the last) A, then proceed by omitting BC, then A, and lastly D.

: ABCADDABCA
: A-BCADDABC-A
: A-BC-ADDA-BC-A
: A-BC-A-DD-A-BC-A
: A-BC-A-D-D-A-BC-A

The next question is how to find such a substring. As the string is quite long (50,000 characters), naively comparing substrings character by character will give you Time Limit Exceeded.
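For reference, the greedy decomposition with plain substring comparison can be sketched as follows. This is only an illustration of the naive approach, not a contest solution: the function name `count_parts_naive` is made up here, and the worst case is quadratic, which is exactly what the time limit rules out.

```python
def count_parts_naive(s: str) -> int:
    """Greedy decomposition using direct substring comparison (O(N^2) worst case)."""
    lo, hi = 0, len(s)  # the part still to be decomposed is s[lo:hi]
    ans = 0
    while lo < hi:
        n = hi - lo
        # Look for the shortest block that is both a prefix and a suffix
        # of s[lo:hi] without the two copies overlapping.
        for k in range(1, n // 2 + 1):
            if s[lo:lo + k] == s[hi - k:hi]:
                ans += 2   # one block on each end
                lo += k
                hi -= k
                break
        else:
            ans += 1       # no matching pair: the rest is one middle block
            break
    return ans


print(count_parts_naive("ABCADDABCA"))  # A-BC-A-D-D-A-BC-A
```

On a string such as a long run of distinct-looking prefixes with no early match, the inner loop rescans up to half of the remaining string, which is where the quadratic behaviour comes from.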

Hashing
The expected solution to this problem employs a hashing technique to find the matching substring. We can use a rolling hash like the one used in the Rabin-Karp string matching algorithm.

INPUT
  S   input string
 
L = 0, R = 0, P = 1
i = 1, j = S.length()
ans = 0
 
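while i <= j
  L = L * x + S[i]             # shift the prefix hash by one position, then append S[i]
  R = R + S[j] * P             # prepend S[j] to the suffix hash; P = x^(length so far)
  P = P * x                    # x is some large prime number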
  if L = R AND i < j
    ans = ans + 2
    L = 0, R = 0, P = 1
  i = i + 1
  j = j - 1
 
if P <> 1
  ans = ans + 1
 
return ans

Each hash value can be updated in O(1): we only need to “add” the current character to the previously computed hash (that is the property of a rolling hash!). As there are N characters to consider, the overall complexity of this approach is O(N).
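A runnable sketch of the hashing approach is given below, under a few assumptions that are not part of the pseudocode. The names `count_parts`, `BASE` and `MASK` are made up for illustration. The base 1000003 is the prime mentioned in the comments. `MASK` emulates the wraparound of unsigned 64-bit arithmetic, since Python integers do not overflow. The sketch uses 0-based indices and a `pending` flag instead of testing P, and it trusts equal hashes without verifying the substrings, so a collision would give a wrong answer.

```python
MASK = (1 << 64) - 1   # assumption: emulate unsigned 64-bit overflow
BASE = 1000003         # prime suggested in the comments


def count_parts(s: str) -> int:
    """Greedy decomposition using a rolling hash grown from both ends, O(N)."""
    left = right = 0   # hashes of the current prefix block and suffix block
    power = 1          # BASE ** (length of the current block), modulo 2**64
    pending = False    # True while an unmatched block is being accumulated
    ans = 0
    i, j = 0, len(s) - 1
    while i <= j:
        # Append s[i] to the prefix hash: shift by one position, then add.
        left = (left * BASE + ord(s[i])) & MASK
        # Prepend s[j] to the suffix hash: it takes the highest power so far.
        right = (right + ord(s[j]) * power) & MASK
        power = (power * BASE) & MASK
        pending = True
        if i < j and left == right:
            ans += 2                  # matched a prefix block and a suffix block
            left = right = 0
            power = 1
            pending = False
        i += 1
        j -= 1
    if pending:
        ans += 1                      # leftover middle block
    return ans


print(count_parts("ABCADDABCA"))
print(count_parts("PASTIPAS"))
```

Both hashes give the leftmost character of a block the highest power of the base, so two equal blocks always produce equal values. That is why the prefix hash is multiplied by the base itself at each step rather than by the running power.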

Suffix Array
At first, using a Suffix Array and LCP for this problem seems to be a good idea; however, this approach will not solve the problem within the given time limit. Notice that the time limit for this problem is quite strict, i.e. 2 seconds. Suffix Array construction has O(N lg N) and also Ω(N lg N) time complexity, which means this approach “hits” the N lg N order in every case (unlike the bruteforce + “smart” pruning approach below).

I implemented a Suffix Array solution and ran it on the judge’s PC; it needs ~6 seconds to pass all tests. With an O(N lg^2 N) Suffix Array, it needs ~15 seconds. I did not try a more sophisticated Suffix Tree algorithm (e.g., Ukkonen’s algorithm), which runs in O(N), but I guess it is overkill.

Bruteforce + “smart” pruning
Any bruteforce algorithm that runs in O(N^2) is expected to get TLE. However, I observed that some teams got accepted on this problem using this approach combined with some “smart” pruning (actually it is a simple pruning, but we did not prepare for it). After carefully reading the solutions, it appears that they were still O(N^2), but there is no case in the test data that traps the pruning. Lucky them :-(. A plain O(N^2) bruteforce solution still gets TLE though, as reported by team koalabiru.

Anyway, this should be a lesson for us (the authors and testers) to prepare the problem better and not let such “cheap” solutions pass in the future.


  12 Responses to “ACM-ICPC 2013 Jakarta – Problems and Analysis”

  1. are you sure there isn’t mod operation ?

    and this : j = j + 1, why it isn’t j = j-1 ?

    • Ah, you’re right, it should be j = j - 1 (updated).

      A MOD operation is not needed. What we really need to know is whether two substrings have the same hash value or not, not the hash value itself. There are only addition and multiplication operators, so overflowing the result doesn’t matter.

  2. ^that comment above is for pasti pas

  3. Hello Mr. Suhendry.
    I really want to be able to compete in ACM ICPC. I am a second-year student at a university in Bandung, but my coding skills are not very good; I am still confused even by sorting algorithms. What are your tips for learning algorithms properly so that I can someday make it to ACM ICPC?

  4. How to determine which prime number should be used? I tried some prime numbers and the result is not as expected.

    • (Problem F). Just use a large prime number (larger is better, as its chance of collision is smaller). I used 1000003 and it’s fine.

      • I have implemented it and submitted on Live Archive, but got WA 😀
        Is there something wrong with my implementation? Could you please check it? http://pastebin.com/ECJX9yEa

      • I didn’t read your code, but it failed the first sample input: PASTIPAS.

        • Yes and it works if I do MOD operation, but still WA on Live Archive. Does your solution with above algorithm still work on LA? Maybe you/others have added more test cases to break that hash function? Or my implementation wrong?

        • I’m the one who sent the data to LA, and I’m pretty sure LA’s admin was too busy to make any changes to the dataset.

          BTW, I’ve just noticed an error in F’s pseudocode. There is one line which is wrong. Try to figure that out! 🙂

          hint: pay attention to the rolling hash (when you do the multiplication with prime number); you might want to check other resources on rolling hash.

  5. ^ problem F Pasti Pas!

  6. Problem H can be solved by Dynamic Programming. Here I find two states: one is the index of the current question, the other is the number of wrong answers used/left so far, and then I minimize the answer. But this solution does not work, though the sample test case gives the correct output. I saw another boolean state in some accepted codes on HUST, which tries both minimizing and maximizing. Can you please explain why another state is needed here?
