Problem
Given a string, find the length of the longest substring without repeating characters.
Examples:
Given "abcabcbb", the answer is "abc", which the length is 3.
Given "bbbbb", the answer is "b", with the length of 1.
Given "pwwkew", the answer is "wke", with the length of 3. Note that the answer must be a substring, "pwke" is a subsequence and not a substring.
大致意思:给定一个字符串,找到其中最大不重复子串,返回其长度。
My View
一开始的思路就是暴力解决,两个循环来检查是否满足条件,然后就有了下面的代码
class Solution {
public int lengthOfLongestSubstring(String s) {
if(s.equals("")){
return 0;
}else{
int count;
int ecount = 0;
for(int i = 0;i<s.length(); i++){
count = 1;
for(int j = i+1; j < s.length(); j++){
if(s.substring(i,j).contains(String.valueOf(s.charAt(j)))){
break;
}
count++;
}
if(count > ecount){
ecount = count;
}
}
return ecount;
}
}
}
很可惜,
得到了这样的结果,982/983,最后一组数据因为太长,规定时间耗尽,然后不符合题目要求。
第一反应是HashMap,因为刚做做过的第一题优解是采用HashMap的,可是搞了半天还是搞不明白,对Hashmap了解甚少……感觉Java白学了,挖个坑复习去。
Solution
solution1
那么下面是官方的solution
To enumerate all substrings of a given string, we enumerate the start and end indices of them. Suppose the start and end indices are ii and jj, respectively. Then we have 0 \leq i \lt j \leq n0≤i<j≤n (here end index jj is exclusive by convention). Thus, using two nested loops with ii from 0 to n - 1n−1 and jj from i+1i+1 to nn, we can enumerate all the substrings of s.
To check if one string has duplicate characters, we can use a set. We iterate through all the characters in the string and put them into the set one by one. Before putting one character, we check if the set already contains it. If so, we return false. After the loop, we return true.
具体的意思就是建立一个双层循环,来取得子字符串,然后通过检查函数来检查此字符串满不满足条件,而检查函数的写法是通过一个set集合来储存被比较值,然后不断拿比较值来比较是否存在。
public class Solution {
public int lengthOfLongestSubstring(String s) {
int n = s.length();
int ans = 0;
for (int i = 0; i < n; i++)
for (int j = i + 1; j <= n; j++)
if (allUnique(s, i, j)) ans = Math.max(ans, j - i);
return ans;
}
public boolean allUnique(String s, int start, int end) {
Set<Character> set = new HashSet<>();
for (int i = start; i < end; i++) {
Character ch = s.charAt(i);
if (set.contains(ch)) return false;
set.add(ch);
}
return true;
}
}
可以看到allUnique函数,建立一个数据集,然后在比较范围内,没有包含字符就添加进结果集,有包含的话直接返回false。
然后通过max来找到最后的最大值。
由于用了三层循环,所以是O(n3)的复杂度,结果自然是[Time Limit Exceeded]了。
solution2
然后官方第二种solution
Sliding Window(滑动窗口)
The naive approach is very straightforward. But it is too slow. So how can we optimize it?
上来就来一句:明显的嘲讽……
In the naive approaches, we repeatedly check a substring to see if it has duplicate character. But it is unnecessary. If a substring s_{ij}
from index ii to j - 1j−1 is already checked to have no duplicate characters. We only need to check if s[j] is already in the substring s_{ij}
那么这里的意思是,如果我们已经检查了某个字串不含重复字符的话,就不用在多检查一次了,而我们用双层循环的时候,其实是对已经检查过的串又检查一遍。
By using HashSet as a sliding window, checking if a character in the current can be done in O(1).
这里就给出了解决方案,用一个HashSet集合来作为一个滑动窗口来检查字符串,可以降低我们的复杂度,那么突破点也就在这。
A sliding window is an abstract concept commonly used in array/string problems.
那么滑动窗口多用来解决一些数组或者字符串的问题,是一个抽象的概念
Back to our problem. We use HashSet to store the characters in current window [i, j) (j ==i initially). Then we slide the index j to the right. If it is not in the HashSet, we slide j further. Doing so until s[j] is already in the HashSet. At this point, we found the maximum size of substrings without duplicate characters start with index ii. If we do this for all ii, we get our answer.
回到问题中,我们用哈希集合来储存一个[i,j)区间的子串作为一个窗口,然后我们移动右边值,如果发现的新值不在集合里就接着找,直到找到一个在集合里已经有的值,记录下此时的不重复长度,那么接下来把窗口左值向右移动就可以遍历完所有的子串。从而最后得到我们的最大值。
接下来代码
public class Solution {
public int lengthOfLongestSubstring(String s) {
//取得字符串长度
int n = s.length();
//建立HashSet
Set<Character> set = new HashSet<>();
int ans = 0, i = 0, j = 0;
//遍历字符串
while (i < n && j < n) {
// try to extend the range [i, j]
//如果数据集不包含查找到的新字符
//就把他放到集合里
if (!set.contains(s.charAt(j))){
set.add(s.charAt(j++));
//返回可以找到的最大值
ans = Math.max(ans, j - i);
}
else {
//否则本次查找结束,窗口左边向右移动
set.remove(s.charAt(i++));
}
}
return ans;
}
}
那么很显然这个算法的复杂度是O(n),那么空间复杂度的话,是取决于Hash占用的,当然也取决于字符串的长度。
在这里会发现Hash在解决数组或字符串问题上很有用,完全可以替代两层循环的复杂度,在下面官方还给了一个滑动窗口改进版,
public class Solution {
public int lengthOfLongestSubstring(String s) {
int n = s.length(), ans = 0;
Map<Character, Integer> map = new HashMap<>(); // current index of character
// try to extend the range [i, j]
for (int j = 0, i = 0; j < n; j++) {
if (map.containsKey(s.charAt(j))) {
i = Math.max(map.get(s.charAt(j)), i);
}
ans = Math.max(ans, j - i + 1);
map.put(s.charAt(j), j + 1);
}
return ans;
}
}
太深奥,慢慢理解吧。