Date and Time: Oct 11, 2024, 16:38 (EST)
Link: https://leetcode.com/problems/substring-with-concatenation-of-all-words/
You are given a string s and an array of strings words. All the strings of words are of the same length.
A concatenated string is a string that exactly contains all the strings of any permutation of words concatenated.
- For example, if
words = ["ab","cd","ef"], then"abcdef","abefcd","cdabef","cdefab","efabcd", and"efcdab"are all concatenated strings."acdbef"is not a concatenated string because it is not the concatenation of any permutation ofwords.
Return an array of the starting indices of all the concatenated substrings in s. You can return the answer in any order.
Example 1:
Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation:
The substring starting at 0 is "barfoo". It is the concatenation of ["bar","foo"] which is a permutation of words.
The substring starting at 9 is "foobar". It is the concatenation of ["foo","bar"] which is a permutation of words.
Example 2:
Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation:
There is no concatenated substring.
Example 3:
Input: s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]
Output: [6,9,12]
Explanation:
The substring starting at 6 is "foobarthe". It is the concatenation of ["foo","bar","the"].
The substring starting at 9 is "barthefoo". It is the concatenation of ["bar","the","foo"].
The substring starting at 12 is "thefoobar". It is the concatenation of ["the","foo","bar"].
-
1 <= s.length <= 10^4 -
1 <= words.length <= 5000 -
1 <= words[i].length <= 30 -
sandwords[i]consist of lowercase English letters.
-
Use
word_counts{}to record each word inwordsand their occurancy. -
Start from
range(word_len)(we can form substring starting from range(0, word_len) and each time we step by word_len), we createl = i, counts = 0, sub_counts{}. And we check fromltor in range(l, len(s)-word_len + 1, word_len), so each increment ofrwill let us create asub_wordand we check if thissub_wordinword_countsor not. -
In the for loop, we want to check the range of
l, rand find if the words in this range equal to the counts ofcounts, and if we detect repeated word in this range, ie.sub_counts[sub_word] > word_counts[sub_word], we incrementl += word_lenand decrementcounts -= 1. Later, if we find equalcounts == len(words), we find a substring of all words, and we append thislindex tores[]. -
If a current
sub_worddoes not exist inword_counts, we need to incrementl = r + word_lenand clearsub_counts{}, resetcounts = 0. -
Finally, return
res.
Example 1: from index 0, we can form substring = barfoothefoobarm, so we can find any valid concatenation from this substring. Later, we go to index 1, we can form arfoothefoobarma, for index 2, we can form rfoothefoobarman. This is how we use range(word_len) to find all the concatenations.
class Solution:
def findSubstring(self, s: str, words: List[str]) -> List[int]:
# Count words frequency
# start from each index of s, try if we can find all words from i
# if yes, append i to res, otherwise, continue
# if word in curr_counts > word_counts, decrement by moving l
# TC: O(n^2), SC: O(n)
res = []
word_counts = {}
word_len = len(words[0])
for word in words:
word_counts[word] = word_counts.get(word, 0) + 1
for i in range(word_len):
l = i
counts = 0
sub_counts = {}
# Check sub_word
for r in range(l, len(s) - word_len + 1, word_len):
sub_word = s[r:r + word_len]
# Check if sub_word exists in word_counts
if sub_word in word_counts:
sub_counts[sub_word] = sub_counts.get(sub_word, 0) + 1
counts += 1
# If repeated word found, shrink the window from left
while sub_counts[sub_word] > word_counts[sub_word]:
sub_counts[s[l:l + word_len]] -= 1
l += word_len
counts -= 1
if counts == len(words):
res.append(l)
# If sub_word doesn't exist, update l
else:
l = r + word_len
sub_counts.clear()
counts = 0
return resTime Complexity:
Space Complexity: