DoorKickers的博客

博客

后缀排序

2020-02-26 12:26:19 By DoorKickers

Pre-defination

S : a string named S that consists of several characters

|S| : the size of S

S[l:r] : a sub-string of S which consists of the characters S[l], ... , S[r]

Suf(i) : a suffix of S that means S[i:|S|]

Algorithm

Sqy Algorithm I is an algorithm that can sort the suffix of a given string S in O(nlogn) time, and the result of this algorithm will be stored in a array called Suffix Array.

Let me show the code to you first, and then I will explain some intricacy details

(Assumed that you have already got the Radix Sort which can sort an array in O(n) time.)

inline void SA_sort(char *s, int n) {
    for (int i = 1; i <= n; i++) ++c[x[i] = s[i]];
    for (int i = 2; i <= m; i++) c[i] += c[i - 1];
    for (int i = n; i >= 1; i--) sa[c[x[i]]--] = i;
    for (int k = 1; k <= n; k <<= 1) {
        int num = 0;
        for (int i = n - k + 1; i <= n; i++) y[++num] = i;
        for (int i = 1; i <= n; i++) {
            if (sa[i] > k)
                y[++num] = sa[i] - k;
        }
        for (int i = 1; i <= m; i++) c[i] = 0;
        for (int i = 1; i <= n; i++) ++c[x[i]];
        for (int i = 2; i <= m; i++) c[i] += c[i - 1];
        for (int i = n; i >= 1; i--) sa[c[x[y[i]]]--] = y[i], y[i] = 0;
        swap(x, y);
        x[sa[1]] = 1, num = 1;
        for (int i = 2; i <= n; i++) {
            x[sa[i]] = (y[sa[i]] == y[sa[i - 1]] && y[sa[i] + k] == y[sa[i - 1] + k]) ? num : ++num;
        }
        if (num == n) break;
        m = num;

    }
}

Ok, that's the code.

Above all, the most important part that can make you better understanding of that snippet is the meaning of each array.

To simplify, we can use the function concept instead of array concept.

A function f is a machine that can get input and return an output.

Good, looks like you understand it completely.

 

Next step : look through the arrays that mentioned in that snippet.

You will comprehend them via function concept (expect c[])

 

x[] : input a initial position i, and return the "value" of the first-key whose length is k and whose start position is i.

y[] : input a current partial rank k of second-keys, and return the initial position i of the first-key which corresponds to the No.k second-key.

sa[] : input a current global rank k of the combination of first-keys and it's corresponding second-key, and return the initial position i which represents for that rank k string.

Remember it!!!!!!!!!!!!!

Ok, the writter is fucked up.

So

To be continued.

评论

Ou Ri Zlt
  • 2020-02-26 22:14:04
  • Reply
[+8]
这两天天下大雨
  • 2020-02-27 01:49:59
  • Reply
[+1]
orzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltqlorzltql
  • 2020-03-10 22:33:58
  • Reply

发表评论

可以用@mike来提到mike这个用户,mike会被高亮显示。如果你真的想打“@”这个字符,请用“@@”。