Program to find number of unique people from list of contact mail ids in Python
Suppose we have a list of contact entries, where each entry contains multiple email addresses belonging to the same person. We need to find the number of unique people by identifying contacts that share common email addresses. A contact is considered duplicate when it shares any email with a previously processed contact.
For example, if the input is contacts = [["alex@gmail.com", "alex@yahoo.com"], ["alex_25@yahoo.com", "alex@gmail.com"], ["bob15@gmail.com"]], then the output will be 2. The first and second contacts share "alex@gmail.com", so they represent the same person; together with the third contact, that makes two unique people.
Algorithm
To solve this, we will follow these steps:
- Initialize ans := 0 (counter for unique people)
- Initialize found := a new set (to track seen emails)
- For each contact c in contacts, do
  - Set duplicate := False
  - For each email in c, do
    - If email is not in found, then
      - Add email to found set
    - Otherwise,
      - Set duplicate := True
  - If duplicate is False, then
    - Increment ans by 1
- Return ans
Example
Let us see the following implementation to get a better understanding:
def solve(contacts):
    ans = 0        # number of unique people counted so far
    found = set()  # every email seen in the contacts processed so far
    for c in contacts:
        duplicate = False
        for email in c:
            if email not in found:
                found.add(email)
            else:
                # This email appeared in an earlier contact
                duplicate = True
        if not duplicate:
            ans += 1
    return ans

contacts = [
    ["alex@gmail.com", "alex@yahoo.com"],
    ["alex_25@yahoo.com", "alex@gmail.com"],
    ["bob15@gmail.com"]
]
print(solve(contacts))
The output of the above code is:
2
How It Works
The algorithm processes each contact sequentially. For the first contact ["alex@gmail.com", "alex@yahoo.com"], both emails are new, so we add them to the found set and count this as a unique person. For the second contact ["alex_25@yahoo.com", "alex@gmail.com"], we find that "alex@gmail.com" already exists in our set, marking this contact as a duplicate. The third contact ["bob15@gmail.com"] contains a new email, so it represents another unique person.
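The walkthrough above can be reproduced with a small tracing variant of the function. This is only an illustrative sketch: the name solve_with_trace and the shape of the trace records are assumptions made for this example, not part of the original program.

```python
def solve_with_trace(contacts):
    """Same logic as solve(), but also records the decision made per contact.

    The function name and the (contact, duplicate, running_count) tuples
    are hypothetical, introduced only for illustration.
    """
    ans = 0
    found = set()
    trace = []
    for c in contacts:
        duplicate = False
        for email in c:
            if email in found:
                duplicate = True   # seen in an earlier contact
            else:
                found.add(email)   # first time this email appears
        if not duplicate:
            ans += 1
        trace.append((c, duplicate, ans))
    return ans, trace


contacts = [
    ["alex@gmail.com", "alex@yahoo.com"],
    ["alex_25@yahoo.com", "alex@gmail.com"],
    ["bob15@gmail.com"],
]

count, trace = solve_with_trace(contacts)
for contact, duplicate, running in trace:
    label = "duplicate" if duplicate else "new person"
    print(f"{contact} -> {label} (count so far: {running})")
print(count)
```

Running it prints one line per contact, marking the first and third as new people and the second as a duplicate, followed by the final count of 2.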
Conclusion
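This greedy, single-pass approach identifies unique people by tracking the email addresses seen so far. Because set membership tests and insertions take constant time on average, the algorithm runs in O(n×m) time, where n is the number of contacts and m is the average number of emails per contact. The found set stores each distinct email once, so the extra space is also O(n×m) in the worst case.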
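Because each contact is compared only against contacts processed before it, the count can depend on input order. For instance, with [["a@x.com"], ["b@x.com"], ["a@x.com", "b@x.com"]] the greedy pass counts 2, since the first two contacts are only linked by the third. If one instead wanted contacts linked through any chain of shared emails to count as one person, which goes beyond the problem definition used here, a union-find structure is one common option. The following is a minimal sketch under that assumption; the name count_people_transitive and its helpers are hypothetical, and it assumes every contact has at least one email.

```python
def count_people_transitive(contacts):
    """Count groups of contacts linked by any chain of shared emails.

    Hypothetical variant for illustration only: it solves a different
    (transitive) version of the problem than the greedy solve() does.
    Contacts with no emails are ignored.
    """
    parent = {}  # maps each email to its parent in the union-find forest

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    for c in contacts:
        for email in c:
            union(c[0], email)  # tie every email to the contact's first email

    # Each distinct root corresponds to one person
    return len({find(email) for email in list(parent)})


linked_late = [["a@x.com"], ["b@x.com"], ["a@x.com", "b@x.com"]]
print(count_people_transitive(linked_late))  # 1 (the greedy pass gives 2)

contacts = [
    ["alex@gmail.com", "alex@yahoo.com"],
    ["alex_25@yahoo.com", "alex@gmail.com"],
    ["bob15@gmail.com"],
]
print(count_people_transitive(contacts))  # 2, same as the greedy pass
```

On the article's example both approaches agree; they differ only when a later contact connects earlier ones that were already counted separately.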
