Merging Overlapping Ranges Efficiently: Merge Intervals
O(n log n)O(n)Write-up
Problem Summary
Youâre given a list of intervals where each interval is a pair [start, end]. The task is to merge all intervals that overlap and return a list of non-overlapping intervals that cover the same ranges. Intervals are inclusive and may touch at boundaries.
Constraints keep the input manageable. The list length is in [1, 10â´], each interval has exactly two integers, and all values fall in [0, 10â´] with start <= end. These bounds allow an O(n log n) sort followed by a linear pass without performance issues.
My Approach
Sort the intervals by start time, then make a single pass to merge. Sorting clusters potentially overlapping intervals next to each other. Once the list is ordered by start, you only need to compare the current interval with the last merged one to decide whether to extend or to finalize and start a new merged interval.
Hereâs how the scan works. Maintain a running merged interval [s, e]. Initialize it to the first interval after sorting. For each next interval [cs, ce], if cs is greater than e thereâs a gap, so push [s, e] to the result and reset [s, e] to [cs, ce]. Otherwise they overlap or touch, so update e to max(e, ce) and continue. At the end, push the final [s, e]. This greedy choice is safe because once sorted by start, thereâs no earlier interval left that could overlap with the current one without also overlapping the previously considered ones.
Why pick this strategy. Itâs simple, minimizes passes over the data, and aligns with the structure of the problem. Overlap detection depends only on the previous merged interval once sorting groups potential overlaps.
Time complexity is dominated by sorting at O(n log n). The merge pass is O(n). Space is O(n) in the output size; if reusing the input array with a write index, auxiliary space stays O(1) beyond the output.
Why This Works
Sorting by start time creates a monotonic property. After sorting, every intervalâs start is greater than or equal to the previous intervalâs start. That ensures the only possible overlap for the current interval is with the last interval weâve merged so far, not with any earlier finalized interval.
Think about the invariant during the scan. We keep a current merged interval [s, e] that represents the union of all intervals processed so far that overlap with each other and begin at or before s. When we consider the next interval [cs, ce], there are two possibilities.
- If cs is greater than e, then [cs, ce] starts after the end of the current merged interval. Thereâs a true gap, so we can finalize [s, e] and start a new merged interval with [cs, ce].
- If cs is less than or equal to e, then [cs, ce] overlaps or touches [s, e]. Because intervals are inclusive, touching at the boundary means they should be merged. Extending e to max(e, ce) captures the union exactly.
No interval can âsneak inâ between two non-overlapping merged intervals once the list is sorted. If one existed, it would have to start after s and before cs, contradicting the sort order. This argument validates the greedy strategy and proves correctness.
Edge behavior on boundaries is also handled. With inclusive endpoints, [1, 4] and [4, 5] should merge. The condition used is cs > e to detect a gap, so cs == e triggers a merge, which is correct.
Code Walkthrough
Below is the implementation in C++ that follows the sort-and-scan approach. It uses the default lexicographic sort for vectors of size two, which orders by start then end. Inline comments explain each step.
C++
1 class Solution { 2 public: 3 vector<vector<int>> merge(vector<vector<int>>& intervals) { 4 // Sort intervals by start, then by end if starts are equal 5 sort(intervals.begin(), intervals.end()); 6 7 // Initialize the current merged interval to the first element 8 int s = intervals[0][0], e = intervals[0][1]; 9 vector<vector<int>> ans; // Result container for merged intervals 10 11 // Process remaining intervals 12 for (int i = 1; i < intervals.size(); ++i){ 13 int current_s = intervals[i][0], current_e = intervals[i][1]; 14 15 // If there is a gap, finalize the current merged interval 16 if (current_s > e) { 17 ans.push_back({s, e}); 18 s = current_s; 19 e = current_e; 20 } else { 21 // Overlapping or touching: extend the end as needed 22 e = max(e, current_e); 23 } 24 } 25 26 // Push the last merged interval 27 ans.push_back({s, e}); 28 return ans; 29 } 30 };
Key sections worth calling out.
- Sorting. The call to sort without a custom comparator works because vector<int> compares lexicographically. That effectively does pairwise sort by start, then end. A custom comparator is fine too if you prefer explicitness.
- Overlap check. The condition current_s > e flags a true gap. Using > rather than >= allows intervals that touch at the boundary to merge.
- Accumulation. The algorithm keeps a single [s, e] and emits it only when encountering a gap or reaching the end.
Complexity.
- Time is O(n log n) due to sorting. The scan is O(n).
- Space is O(n) for the output list. If you write back into the input and resize, the auxiliary overhead stays constant.
Alternative Approaches
The sort-and-scan solution is the right tool for most cases, but there are scenarios where other methods may be considered.
-
Sweep line with event points. Convert each interval [s, e] into two events, a +1 at s and a â1 at e+epsilon. Sorting events and maintaining an active count reconstructs coverage. This shines when you need more than merged ranges, like tracking peak overlap counts or handling very large inputs with custom event types. Itâs still O(n log n) to sort events and O(n) to sweep.
-
Bucketing on small coordinate ranges. When endpoints are integers in a small domain, e.g., [0, 10â´], you can use a difference array over the coordinate axis. Mark +1 at start and â1 at end+1, prefix sum, and rebuild contiguous positive segments. This trades sorting for O(U) processing where U is the universe size. Itâs practical if U is small relative to n log n and youâre okay with linear memory in U.
-
Segment tree or interval tree. These structures are useful when you have many online queries like insert interval, delete interval, query overlap, or union length. For a one-time batch merge theyâre overkill, but in a dynamic system they provide logarithmic updates and queries.
-
Union-Find on overlapping graph. Consider each interval a node and connect edges for overlaps. Each connected componentâs union becomes a merged interval. Building all edges naively is O(n^2), then DSU merges components. This is inefficient for large n unless your input has structure that limits potential overlaps or you have a spatial index to prune edges. Mostly a teaching tool rather than a production path here.
-
Dynamic programming. There isnât a natural substructure that DP exploits well for simple merging. It complicates an otherwise straightforward problem and costs O(n^2) in the worst case. I wouldnât pick this unless reframing the problem into a different optimization objective.
When to pick something else. If you need peak concurrency or counts of overlaps, the sweep line naturally extends. If youâre working on a streaming or dynamic set of intervals, interval trees or ordered maps keyed by starts are more appropriate. When endpoints are bounded and dense, a difference array can be faster and simpler than sorting.
Pitfalls and Edge Cases
Merging intervals looks easy, but there are a few traps worth calling out.
-
Wrong overlap predicate. Using current_s >= e to signal a gap treats touching intervals as separate. If the problem defines [a, b] and [b, c] as overlapping or mergeable, use current_s > e to detect a gap. Always align the predicate with endpoint semantics.
-
Not handling equal starts. Intervals with the same start but different ends, like [1, 2] and [1, 10], require sorting by start then end. The default lexicographic sort handles this. If you write your own comparator, ensure ties sort by end to avoid premature splits.
-
Forgetting to append the last merged interval. After the loop you must push the final [s, e]. Skipping this drops the last merged range.
-
Empty input assumptions. The given constraints guarantee at least one interval. If you generalize this code to arbitrary inputs, handle the empty case before accessing intervals[0].
-
Integer overflow concerns. Not an issue here given endpoints are in [0, 10â´]. If you port to a domain with larger endpoints, use 64-bit integers and watch for end+1 operations in sweep line or bucketing approaches.
-
Sorting comparator mistakes. In some languages, incorrect comparator returns or non-transitive comparisons can cause undefined behavior. Keep it simple. Compare starts, then ends.
-
Mutating input unexpectedly. If an API contract requires not changing the input order, either work on a copy or clearly document the in-place sort.
-
Stability is not required. Donât rely on stable sort for correctness. Only the relative order of starts matters.
-
Data with single-element ranges. Intervals like [3, 3] are valid. They should merge with [3, 5] and [1, 3] appropriately using the same logic.
-
Negative or reversed intervals. Problem constraints ensure start <= end and non-negative values. If you ever relax those constraints, normalize intervals first by swapping ends where needed.
Test cases that exercise tricky boundaries help validate behavior.
- [[1,3], [2,6], [8,10], [15,18]] should produce [[1,6], [8,10], [15,18]]
- [[1,4], [4,5]] should produce [[1,5]] due to touching endpoints
- [[1,4], [0,2], [3,5]] should produce [[0,5]]
- [[1,1], [1,1], [1,1]] should produce [[1,1]]
- [[0,0]] should produce [[0,0]]
Further Thinking
If you want to strengthen this skill set, try a few related problems that connect with sorting, sweeping, and interval reasoning.
-
Insert Interval. Given a sorted list of non-overlapping intervals and a new interval, insert and merge as needed. Hint: Do a single pass, append all intervals ending before the new one, merge overlaps with the new one, then append the rest.
-
Meeting Rooms II. Given intervals, find the minimum number of rooms needed to hold all meetings. Hint: Sweep line with start and end events or two-pointer technique over sorted start and end arrays.
-
Employee Free Time. Given K employeesâ schedules as sorted non-overlapping intervals, find common free time. Hint: Merge all busy intervals across employees, then collect the gaps.
-
Minimum Number of Arrows to Burst Balloons. Intervals as balloons that burst with a single arrow shot at some point. Hint: Sort by end and greedily shoot an arrow at the earliest finishing balloonâs end.
-
Interval List Intersections. Two lists of non-overlapping intervals sorted by start, compute their intersections. Hint: Two-pointer technique advancing the interval with the smaller end.
-
Merge K Sorted Lists of Intervals. Each list is individually merged and sorted. Hint: Use a min-heap on starts or a k-way merge, then run a final linear merge pass.
-
Count of Range Overlaps at Each Point. For a set of intervals, report how many cover each query point. Hint: Sweep line with prefix sums or binary search on separate start and end arrays.
-
Skyline Problem. Buildings as intervals with heights, compute the skyline outline. Hint: Event sweep with a multiset or heap to track active heights.
This family of problems trains the same instincts. Sort when order matters, sweep when events change state, and keep invariants tight during linear passes.
Conditions
- 1 <= intervals.length <= 10â´
- intervals[i].length == 2
- 0 <= starti <= endi <= 10â´
Examples
Example 1
- input: intervals = [[1,3],[2,6],[8,10],[15,18]]
- output: [[1,6],[8,10],[15,18]]
- explanation: Since intervals [1,3] and [2,6] overlap, merge them into [1,6].
<strong class="
Example 2
- input: intervals = [[1,4],[4,5]]
- output: [[1,5]]
- explanation: Intervals [1,4] and [4,5] are considered overlapping.
<strong class="
Example 3
- input: intervals = [[4,7],[1,4]]
- output: [[1,7]]
- explanation: Intervals [1,4] and [4,7] are considered overlapping.
My Notes
Sorting order is really important.
Solution
1 class Solution { 2 public: 3 vector<vector<int>> merge(vector<vector<int>>& intervals) { 4 // Sort intervals based on the starting times 5 sort(intervals.begin(), intervals.end()); 6 7 // Initialize the first interval 8 int s = intervals[0][0], e = intervals[0][1]; 9 vector<vector<int>> ans; // Vector to hold merged intervals 10 11 // Loop through the intervals starting from the second one 12 for (int i = 1; i < intervals.size(); ++i){ 13 int current_s = intervals[i][0], current_e = intervals[i][1]; 14 15 // If the current interval does not overlap with the last merged one 16 if (current_s > e) { 17 ans.push_back({s, e}); // Add the merged interval 18 s = current_s, e = current_e; // Update to the current interval 19 } else { 20 // Merge the intervals by updating the end time 21 e = max(e, current_e); 22 } 23 } 24 // Add the last merged interval to the result 25 ans.push_back({s, e}); 26 return ans; // Return the merged intervals 27 } 28 };
Key Points
- âThe intervals must be sorted by their starting times for optimal merging.
- âA greedy approach is used to decide whether to merge overlapping intervals or not.
- âIntervals are merged based on the comparison of the current interval's start with the last added interval's end.
- âSpace complexity is primarily due to the storage of merged intervals.
- âMerging takes place in a single pass post sorting.
Recommended Approaches
O(n log n)UsedSorting the intervals followed by a single scan to merge overlapping intervals is efficient and straightforward. After sorting, we can easily check each interval against the last added interval in the result.
O(n log n)A segment tree could be built to keep track of overlapping intervals, which would allow for merging queries and updates. This is more complex and may be unnecessary for this problem's constraints.
O(n log n)Using a sweep line algorithm with event points could be used to identify starts and ends of intervals. It can handle merging efficiently and is suitable when intervals represent events that need to be tracked over a range.
O(n log n)Although union-find is typically useful for connectivity problems, it could be adapted for merging intervals. Each interval could be a node, and overlaps can be seen as connections. This is overkill for this problem.
O(n^2)A DP approach could theoretically store results for previously computed intervals, but since intervals can overlap in many combinations, a direct DP solution would be inefficient and complicated without clear state definitions.
Key Idea for Similar Problems
Sort intervals by start, then scan once to merge by maintaining a running [s, e] and extending or finalizing based on the next start. The greedy approach is correct because sorting ensures potential overlaps occur only with the last merged interval. Time is O(n log n) for sorting, with a linear merge pass and O(n) output space.
Related Topics
- âgreedy
- âsorting