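The KMP algorithm solves the problem of finding a pattern string P inside a string S.
The naive method uses two nested loops and runs in O(m*n) time, where n is the length of S and m is the length of P. KMP runs in O(m+n).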
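For comparison, the naive two-loop search could be sketched as follows. The class name `NaiveSearch` and the method `indexOf` are hypothetical, chosen only for this illustration; returning 0 for an empty pattern is also an assumption.

```java
// Hypothetical helper class; the name and signature are illustrative only.
class NaiveSearch {
    // Returns the index of the first occurrence of p in s, or -1 if absent.
    static int indexOf(String s, String p) {
        int n = s.length(), m = p.length();
        for (int i = 0; i + m <= n; i++) {      // try every candidate start in s
            int j = 0;
            while (j < m && s.charAt(i + j) == p.charAt(j)) {
                j++;                            // extend the current match
            }
            if (j == m) {
                return i;                       // all m characters matched
            }
            // On a mismatch, i advances by one and j restarts at 0, so
            // characters of s can be compared again: O(m*n) in the worst case.
        }
        return -1;
    }
}
```

The re-comparison after each mismatch is the work that KMP avoids.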
KMP itself is not complicated. It has two steps:
- Build the Partial Match Table for P
- Use the table to search S
The key step is building the Partial Match Table for P. Each entry in the table is:
The length of the longest proper prefix in the (sub)pattern that matches a proper suffix in the same (sub)pattern
where:
Proper prefix: All the characters in a string, with one or more cut off the end. “S”, “Sn”, “Sna”, and “Snap” are all the proper prefixes of “Snape”
Proper suffix: All the characters in a string, with one or more cut off the beginning. “agrid”, “grid”, “rid”, “id”, and “d” are all proper suffixes of “Hagrid”
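To make the definition concrete, here is a brute-force sketch that computes each table entry directly from the definition, without any of KMP's shortcuts. The class `PrefixTableByDefinition` and its methods are hypothetical names for this illustration. The indexing is an assumption too: `table[i]` describes the sub-pattern made of the first `i` characters, so the array has `p.length() + 1` entries.

```java
// Hypothetical helper class; slow on purpose, it only restates the definition.
class PrefixTableByDefinition {
    // Length of the longest proper prefix of s that is also a proper suffix of s.
    static int longestBorder(String s) {
        // Try the longest candidate first; "proper" excludes the whole string.
        for (int len = s.length() - 1; len > 0; len--) {
            String suffix = s.substring(s.length() - len);
            if (s.startsWith(suffix)) {
                return len;
            }
        }
        return 0;
    }

    // table[i] describes the sub-pattern made of the first i characters of p.
    static int[] table(String p) {
        int[] table = new int[p.length() + 1];
        for (int i = 1; i <= p.length(); i++) {
            table[i] = longestBorder(p.substring(0, i));
        }
        return table;
    }
}
```

For the pattern "abababca", this yields `[0, 0, 0, 1, 2, 3, 4, 0, 1]`. For example, the sub-pattern "ababab" has the proper prefix "abab", which is also a proper suffix, so its entry is 4.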
The implementation is as follows:
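    public int[] kmpTable(String p) {
        // table[i] is the length of the longest proper prefix of the first i
        // characters of p that is also a proper suffix of them. The array has
        // p.length() + 1 entries. An earlier version used p.length() entries,
        // one per position, but it looped forever when table[1] = 1.
        int[] table = new int[p.length() + 1];
        int i = 2, cnt = 0; // table[0] and table[1] stay 0
        while (i <= p.length()) {
            if (p.charAt(i - 1) == p.charAt(cnt)) {
                table[i++] = ++cnt;   // extend the current match by one character
            } else if (cnt > 0) {
                cnt = table[cnt];     // fall back to the next shorter candidate
            } else {
                table[i++] = 0;       // no prefix matches a suffix here
            }
        }
        return table;
    }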
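The second step, searching S with the table, is not shown above. A minimal sketch, assuming the same `p.length() + 1` table layout as `kmpTable`, might look like this. The class `KmpSearch` and the method `indexOf` are hypothetical names, `kmpTable` is repeated as a static method only so that the block compiles on its own, and returning 0 for an empty pattern is a convention borrowed from `String.indexOf`.

```java
// Hypothetical wrapper class for illustration.
class KmpSearch {
    // Same construction as kmpTable above, made static for this sketch.
    static int[] kmpTable(String p) {
        int[] table = new int[p.length() + 1];
        int i = 2, cnt = 0;
        while (i <= p.length()) {
            if (p.charAt(i - 1) == p.charAt(cnt)) {
                table[i++] = ++cnt;
            } else if (cnt > 0) {
                cnt = table[cnt];
            } else {
                table[i++] = 0;
            }
        }
        return table;
    }

    // Returns the index of the first occurrence of p in s, or -1 if absent.
    static int indexOf(String s, String p) {
        if (p.isEmpty()) {
            return 0; // assumption: same convention as String.indexOf
        }
        int[] table = kmpTable(p);
        int j = 0; // number of pattern characters currently matched
        for (int i = 0; i < s.length(); i++) {
            // On a mismatch, shrink the match using the table; i never moves back.
            while (j > 0 && s.charAt(i) != p.charAt(j)) {
                j = table[j];
            }
            if (s.charAt(i) == p.charAt(j)) {
                j++;
            }
            if (j == p.length()) {
                return i - j + 1; // start index of the match ending at i
            }
        }
        return -1;
    }
}
```

Because `i` only moves forward and each fallback shrinks `j`, the search does O(n) work on top of the O(m) table construction. As a usage example, `KmpSearch.indexOf("aaaab", "aab")` returns 2.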
References