Python – Identify the Longest Consecutive Subsequence in an Array
In many data structure and algorithm interviews, you might be asked to find the longest consecutive subsequence in an array. This means identifying the maximum length of a sequence of consecutive integers (regardless of their order in the array). In this tutorial, we’ll break down the problem, show some sample inputs and outputs, and explain a step-by-step solution with a Python program.
Problem Statement
Given an unsorted array of integers, your task is to identify the length of the longest subsequence of consecutive numbers. Note that the consecutive subsequence does not need to be contiguous in the array.
Sample Input and Output
Example 1:
# Input:
arr = [100, 4, 200, 1, 3, 2]
# Output:
# The longest consecutive subsequence is [1, 2, 3, 4]
# Length: 4
Example 2:
# Input:
arr = [0, -1, 1, 2, -2, 3, 5]
# Output:
# The longest consecutive subsequence is [-2, -1, 0, 1, 2, 3]
# Length: 6
Solution Approach
There are several ways to solve this problem, but one efficient approach uses a hash set to achieve an O(n) time complexity. The idea is as follows:
- Create a set: Convert the array into a set to allow O(1) lookups.
- Identify the start of a sequence: For each number in the set, check if the number just before it (num – 1) exists in the set. If it does not, then the current number is the start of a sequence.
- Count consecutive numbers: Starting from this number, count how many consecutive numbers exist in the set.
- Update the maximum length: Compare and store the maximum length found.
Python Program
The following Python program demonstrates this approach:
def longest_consecutive_subsequence(arr):
# Convert the array to a set for O(1) lookups
num_set = set(arr)
max_length = 0
# Iterate over each number in the set
for num in num_set:
# Only try to build sequences from numbers that are the start of a sequence
if num - 1 not in num_set:
current_num = num
current_length = 1
# Count consecutive numbers
while current_num + 1 in num_set:
current_num += 1
current_length += 1
# Update maximum length if current sequence is longer
max_length = max(max_length, current_length)
return max_length
# Sample Test Cases
# Example 1
arr1 = [100, 4, 200, 1, 3, 2]
print("Longest consecutive subsequence length (Example 1):", longest_consecutive_subsequence(arr1))
# Expected Output: 4
# Example 2
arr2 = [0, -1, 1, 2, -2, 3, 5]
print("Longest consecutive subsequence length (Example 2):", longest_consecutive_subsequence(arr2))
# Expected Output: 6
This program efficiently identifies the longest sequence of consecutive numbers by ensuring that we only start counting from the smallest element of a potential sequence. This avoids redundant work and ensures an optimal solution.
Conclusion
In this tutorial, we learned how to find the longest consecutive subsequence in an array using a hash set to achieve O(n) time complexity. The key steps included identifying the start of a sequence and counting consecutive elements.