# Can interval algorithm problems be solved in seconds with line segment tree?

2022-01-27 04:15:32 lucifer

# The interval algorithm problem can be solved in seconds with a line segment tree ？

## background

Give an array of two , One of the arrays is A [1,2,3,4], Another array is B [5,6,7,8]. Let you find the of the large array after the combination of two arrays ：

• Maximum
• minimum value
• The sum of the

Is the question very simple ？ We can directly and easily in Time to solve , among m and n They are arrays A and B Size .

Then if I could modify A and B Some values , And I ask Many times Maximum , What about the minimum and the sum ？

The simple idea is to modify the array in place , then Time to recalculate . Obviously, this does not take advantage of the previously calculated results , Efficiency is not high . Is there a more efficient way ？

Yes ！ Segment tree can solve .

## What is a segment tree ？

A segment tree is essentially a tree . More precisely , It's a binary tree , And it's a balanced binary tree . About why balanced binary tree , We'll talk about later , Here we first have this understanding .

Although it is a binary tree , But we usually use arrays to simulate tree structure , Instead of the traditional definition TreeNode .

On the one hand, it's easy to implement , On the other hand, because the segment tree is actually a complete binary tree , Therefore, using array direct simulation will be very efficient . The reason for this has been mentioned in the implementation of binary heap in the heap topic I wrote earlier , You can put it in my official account 《 Force buckle plus 》 reply Pile up obtain .

## What problem to solve

It's just like its name , Line segment tree and line segment （ Section ） of . Each tree node of the line segment tree actually stores a Section （ paragraph ） Information about . Then the information of these intervals if Satisfy certain properties You can use the line segment tree to improve the performance .

that ：

1. What kind of nature ？
2. How to improve the performance of ？

### What kind of nature ？

For example, the maximum value we mentioned earlier , The minimum and the sum satisfy this Certain properties . That is, I can according to several （ Here are two ） Subset deduces an index of the union of subsets .

Take the example above , We can put arrays A and Array B As two sets . that ： aggregate A The maximum and set of B The maximum value of is known , We can go straight through max(Amax, Bmax) Find the set A And assemble B The maximum value of the union of . among Amax and Bmax They are set A And collection B The maximum of . The minimum and the sum are the same , I won't repeat . Therefore, if the statistical information satisfies this nature , We can use the segment tree . But do you want to use , It still depends on using the segment tree , Whether it can improve performance .

### How to improve the performance of ？

About improving performance , I'll sell it first , When we finish the implementation later , Let's talk .

## Realization

Take the summation at the beginning of the article as an example .

We can put the interval A and Section B As the left and right nodes of a tree , And will A The interval sum of and B The interval and are stored in the left and right child nodes respectively .

Next , take A The interval of is divided into left and right parts , Empathy B It is also divided into left and right parts . Continue this process until you cannot continue .

To sum up, the interval is continuously divided into two , The interval information is stored in the left and right nodes respectively . If it's summation , Then interval information is the sum of intervals . The line segment tree at this time is like this ：

*

The blue font indicates the interval and .

*

Be careful , All leaf nodes of this tree have a total of n individual （n Is the length of the original array ）, And each one corresponds to a value of the original array .

It's also easy to reflect on the code . Use it directly Subsequent traversal Can solve . This is because , We need to know the statistics of the left and right nodes , To calculate the statistics of the current node .

*

Those who are not familiar with post order traversal can take a look at my previous tree topic , The official account can be obtained by adding the reply tree.

*

Just like binary heap , We can use arrays to represent trees , use `2 * i + 1` and `2 * 1 + 2` To represent the indexes of the left and right nodes , among i Corresponds to the current node in tree Index on .

*

tree Is an array used to build a line segment tree , Similar to binary pile . It's just tree[i] At present, only interval information is stored .

*

As I described above, there is obvious recursion when building trees , So we can build trees recursively . say concretely , You can define a build(tree_index, l, r) Method To build trees . among l and r Is the left and right endpoints of the corresponding interval , such l and r You can uniquely determine an interval . tree_index In fact, it is used to mark that the current interval information should be updated to tree Where in the array .

We are tree Upper storage interval information , Then you can finally use tree[tree_index] = .... To update the interval information .

Core code ：

``def build(self, tree_index:int, l:int, r:int):    '''     Create a line segment tree recursively     tree_index :  The position of the segment tree node in the array     l, r :  The left side of the interval represented by this node , Right border     '''    if l == r:        self.tree[tree_index] = self.data[l]        return    mid = (l+r) // 2 #  The midpoint of the interval , Corresponding to the end of the left child interval , The right child starts with the interval     left, right = 2 * tree_index + 1, 2 * tree_index + 2 # tree_index  Left and right subtree index     self.build(left, l, mid)    self.build(right, mid+1, r)    #  Typical post order traversal     #  Interval sum can be added , If it's not interval and, change the following line of code     self.tree[tree_index] = self.tree[left] + self.tree[right]``

The array of the above code self.tree[i] In fact, it is used to store intervals and... Similar to the blue font in the above figure . Every interval is in tree There is a place on it , Save its interval and .

Complexity analysis

• Time complexity ： By the recurrence relation T(n) = 2*T(n/2) + 1, So the time complexity is zero
*

I don't know how to get it ? You can have a look at my 《 The way of algorithmic clearance 》 The first chapter of . https://leetcode-solution.cn/book-intro

*
• Spatial complexity ：tree The size and n Isomorphism , So the space complexity is zero

Finally built the tree , But I know it's not efficient at all . What we need to do is to deal with it efficiently Interval query with frequent updates .

The method based on this line segment tree , How to update and query interval information ？

### Interval query

First answer the simple question Interval query What is the principle .

If you query the information of an interval . Here is also the use of post order traversal ok 了 . For example, I want to find an interval [l,r] And .

So if the current left node ：

• Fall completely on [l,r] Inside . such as [2,3] Fall completely on [1,4] Inside . We will directly tree The left node in the middle is used for the interval sum of , Might as well be extremely lsum.
• Part falls on [l,r] Inside . such as [1,3] Part falls on [2,4]. At this time, we continue to recurse , Until it falls completely within the interval （ The situation above ）, At this time, we will directly tree The left node in the middle is used for the interval sum of
• The answer is to add up all the previous values taken out for standby

The processing of the right node is the same , I won't repeat .

Complexity analysis

• Time complexity ： The query does not need to process two leaf nodes at each time , In fact, the number of processing is roughly the same as the height of the tree . And trees are balanced , So the complexity is

Or by the recurrence relation T(n) = T(n/2) + 1, So the time complexity is zero

*

I don't know how to get it ? You can have a look at my 《 The way of algorithmic clearance 》 The first chapter of . https://leetcode-solution.cn/book-intro

*

You can understand this complexity in combination with the following code .

### Interval modification

So if I modify A by 1 Well ？

If not modified tree, So obviously, the query interval as long as it contains A It must be wrong , For example, query interval [1,3] And You'll get the wrong answer . So we have to revise A Modify at the same time tree.

The problem lies in What are we going to modify tree Value , How much is it ？

Answer the first question first , Modify what tree The value of ？

We know , The leaf nodes of the segment tree are the values on the original array , Also said , Of the line segment tree n A leaf node corresponds to the original array . So we must first Find the leaf node corresponding to the position we modified , Change its value .

That's the end of it ？

It's not over . actually , We modify all the parent nodes and grandfather nodes of the leaf node （ If any ） All need to be changed . That is to say, we need From this leaf node to the root node , And modify the interval information along the way

*

This process is similar to the browser's event model

*

Now answer the last question , What is the specific modification ？

For sum , We need to first change the leaf node to the modified value , In addition, the sum of the intervals from all leaf nodes to the points on the root node path is added with delta, among delta Is the difference between before and after the change .

*

How to update the maximum and minimum values ？ Let's think for ourselves .

*

Modify which nodes , The problem of how much to modify has been solved , Then the code implementation is easy .

Complexity analysis

• Time complexity ： The modification does not need to deal with two leaf nodes at each time , In fact, the number of processing is roughly the same as the height of the tree . And trees are balanced , So the complexity is

Or by the recurrence relation T(n) = T(n/2) + 1, So the time complexity is zero

*

I don't know how to get it ? You can have a look at my 《 The way of algorithmic clearance 》 The first chapter of . https://leetcode-solution.cn/book-intro

*

You can understand this complexity in combination with the following code .

## Code

The line segment tree code has been placed on the question brushing plug-in , official account 《 Force buckle plus 》 Reply to the plug-in to get .

``class SegmentTree:    def __init__(self, data:List[int]):        '''        data:  The array passed in         '''        self.data = data        self.n = len(data)        #   apply  4  times  data  Length space to store line segment tree nodes         self.tree = [None] * (4 * self.n) #  Indexes  i  The left child index is  2i+1, The right child is  2i+2        if self.n:            self.build(0, 0, self.n-1)    #  It is essentially a bottom-up updating process     #  Therefore, you can use post order traversal , That is, update the parent node when the function returns .    def update(self, tree_index, l, r, index):        '''        tree_index:  A root node index         l, r :  This root node represents the left and right boundaries of the interval         index :  The index of the updated value         '''        if l == r==index :            self.tree[tree_index] = self.data[index]            return        mid = (l+r)//2        left, right = 2 * tree_index + 1, 2 * tree_index + 2        if index > mid:            #  The interval to be updated is in the right subtree             self.update(right, mid+1, r, index)        else:            #  The interval to be updated is in the left subtree  index<=mid            self.update(left, l, mid, index)        #  The query interval is partly in the left subtree and partly in the right subtree         #  Interval sum can be added , If it's not interval and, change the following line of code         self.tree[tree_index] = self.tree[left] + self.tree[right]    def updateSum(self,index:int,value:int):        self.data[index] = value        self.update(0, 0, self.n-1, index)    def query(self, tree_index:int, l:int, r:int, ql:int, qr:int) -> int:        '''         Recursive query interval  [ql,..,qr]  Value         tree_index :  The index of a root node         l, r :  The left and right boundaries of the interval represented by this node         ql, qr:  The left and right boundaries of the interval to be queried         '''        if l == ql and r == qr:            return self.tree[tree_index]        #  The midpoint of the interval , Corresponding to the end of the left child interval , The right child starts with the interval         mid = (l+r) // 2        left, right = tree_index * 2 + 1, tree_index * 2 + 2        if qr <= mid:            #  The query interval is all in the left subtree             return self.query(left, l, mid, ql, qr)        elif ql > mid:            #  The query interval is all in the right subtree             return self.query(right, mid+1, r, ql, qr)        #  The query interval is partly in the left subtree and partly in the right subtree         #  Interval sum can be added , If it's not interval and, change the following line of code         return self.query(left, l, mid, ql, mid) + self.query(right, mid+1, r, mid+1, qr)    def querySum(self, ql:int, qr:int) -> int:        '''         Return to interval  [ql,..,qr]  And         '''        return self.query(0, 0, self.n-1, ql, qr)    def build(self, tree_index:int, l:int, r:int):        '''         Create a line segment tree recursively         tree_index :  The position of the segment tree node in the array         l, r :  The left side of the interval represented by this node , Right border         '''        if l == r:            self.tree[tree_index] = self.data[l]            return        mid = (l+r) // 2 #  The midpoint of the interval , Corresponding to the end of the left child interval , The right child starts with the interval         left, right = 2 * tree_index + 1, 2 * tree_index + 2 # tree_index  Left and right subtree index         self.build(left, l, mid)        self.build(right, mid+1, r)        #  Interval sum can be added , If it's not interval and, change the following line of code         self.tree[tree_index] = self.tree[left] + self.tree[right]``

## Related topics

• Pile up

You can take a look at the binary heap implementation of the topic I wrote before . Then contrast learning , By the way, I also learned to pile , Beauty is not true ？

• Tree array

A tree array is similar to a line segment tree , The difficulty is a little higher than the line segment tree . I have the opportunity to write an article on tree array for you .

• immutablejs

The front-end partner should know immutable Well ？ and immutablejs Is a very famous implementation immutable The tool library . West France has written an article before immutable Principle analysis article , You can see what you are interested in https://lucifer.ren/blog/2020/06/13/immutable-js/

### Why is a balanced binary tree ？

The previous time complexity is also based on the premise that the tree is a balanced binary tree . So must a line segment tree be a balanced binary tree ？ Yes . This is because the segment tree is a complete binary tree , The complete binary tree is balanced .

Of course, there is another premise , That is, the total number of nodes of the tree is of the same order as the length of the original array , That is to say n The magnitude of . About why they are of the same order , It's also easy to calculate , Just according to the recursive formula, we can get .

### Why can segment trees improve performance ？

As long as you understand me Implementation part Time complexity of , Then it's not difficult to understand this problem . Because the time complexity of modification and query is , Instead of using the violent method of line segment tree, the query complexity is as high as . The corresponding cost is to make achievements Space , Therefore, segment tree is also a typical space for time algorithm .

Finally, let's talk about it . Can interval algorithm problems be solved by line segment tree seconds ？ In fact, this has been answered in the article , It depends on whether two points are met ：

1. Whether it can be based on several （ Here are two ） Subset deduces an index of the union of subsets .
2. Whether it can improve performance （ Compared to the simple violent solution ）. Usually faced with Frequent modification You can consider using segment trees in all scenarios Optimize the query time consumption after modification .