Largest and Smallest in a Python collection

Today I came across a Python module that makes it easy to find the n largest or smallest values in a collection. Take a simple scenario: finding the 10 largest numbers in a list of the numbers from 0 to 999. The obvious first approach is the one below.

values = range(1000)  # 0..999, already in ascending order
print(list(values[-10:]))  # the last 10 items are the 10 largest

One more obvious point about the example above: the collection has to be sorted first if it is not already. Because I populated this collection with range(), the values were already in ascending order, so I did not need to sort them.
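When the input is not already ordered, the same slicing idea needs a sort in front of it. The sketch below assumes a shuffled list as a stand-in for unsorted data; the variable names are illustrative only.

```python
import random

# Hypothetical unsorted input: the numbers 0..999 in shuffled order.
values = list(range(1000))
random.shuffle(values)

# Sort a copy in ascending order, then slice off the last 10 items,
# which are the 10 largest.
largest_ten = sorted(values)[-10:]
print(largest_ten)
```

This sorts the whole list just to keep 10 items, which is the cost the heapq approach later in this post avoids.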

Let's take another scenario: a list of employee records, where we need the top 10 employees ranked by some field, such as years of experience with the current company or salary. I have given some sample data below. Each record contains the employee's details, including their hire date and pay. This sample data set was fetched from Kaggle. The entire data set can be fetched from the link below.
Download Dataset

[{'Abutments/Hour Wk 2': '0', 'Abutments/Hour Wk 1': '0', 'Date of Hire': '8/1/2011', 'Manager Name': 'Elisa Bramante', 'Position': 'Production Manager', 'Daily Error Rate': '0', 'Pay': ' $54.50 ', 'Performance Score': 'Fully Meets', '90-day Complaints': '0', 'Reason for Term': 'N/A - still employed', 'Race Desc': 'White', 'Department': 'Production       ', 'Employment Status': 'Active', 'TermDate': '', 'Employee Name': 'Albert, Michael  '},
 {'Abutments/Hour Wk 2': '0', 'Abutments/Hour Wk 1': '0', 'Date of Hire': '9/30/2013', 'Manager Name': 'Elisa Bramante', 'Position': 'Production Manager', 'Daily Error Rate': '0', 'Pay': ' $50.50 ', 'Performance Score': 'Fully Meets', '90-day Complaints': '0', 'Reason for Term': 'retiring', 'Race Desc': 'Asian', 'Department': 'Production       ', 'Employment Status': 'Voluntarily Terminated', 'TermDate': '8/7/2014', 'Employee Name': 'Bozzi, Charles'},
 {'Abutments/Hour Wk 2': '0', 'Abutments/Hour Wk 1': '0', 'Date of Hire': '1/28/2016', 'Manager Name': 'Elisa Bramante', 'Position': 'Production Manager', 'Daily Error Rate': '0', 'Pay': ' $55.00 ', 'Performance Score': 'Exceeds', '90-day Complaints': '0', 'Reason for Term': 'N/A - still employed', 'Race Desc': 'White', 'Department': 'Production       ', 'Employment Status': 'Active', 'TermDate': '', 'Employee Name': 'Butler, Webster  L'},
 {'Abutments/Hour Wk 2': '0', 'Abutments/Hour Wk 1': '0', 'Date of Hire': '9/18/2014', 'Manager Name': 'Elisa Bramante', 'Position': 'Production Manager', 'Daily Error Rate': '0', 'Pay': ' $51.00 ', 'Performance Score': 'Fully Meets', '90-day Complaints': '0', 'Reason for Term': 'N/A - still employed', 'Race Desc': 'White', 'Department': 'Production       ', 'Employment Status': 'Active', 'TermDate': '', 'Employee Name': 'Dunn, Amy  '},
 {'Abutments/Hour Wk 2': '0', 'Abutments/Hour Wk 1': '0', 'Date of Hire': '6/2/2015', 'Manager Name': 'Elisa Bramante', 'Position': 'Production Manager', 'Daily Error Rate': '0', 'Pay': ' $54.00 ', 'Performance Score': 'Fully Meets', '90-day Complaints': '0', 'Reason for Term': 'N/A - still employed', 'Race Desc': 'White', 'Department': 'Production       ', 'Employment Status': 'Active', 'TermDate': '', 'Employee Name': 'Gray, Elijiah  '}]
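For the seniority ranking mentioned earlier, one possible sort-and-slice sketch ranks employees by their 'Date of Hire' field. It uses trimmed copies of the sample records (names and dates only), assumes the dates are written as month/day/year, and treats the hire date alone as a proxy for experience, ignoring 'TermDate'. The helper name hire_date is hypothetical.

```python
from datetime import datetime

# Trimmed copies of the sample records: only the two fields this sketch needs.
employees = [
    {'Employee Name': 'Albert, Michael', 'Date of Hire': '8/1/2011'},
    {'Employee Name': 'Bozzi, Charles', 'Date of Hire': '9/30/2013'},
    {'Employee Name': 'Butler, Webster L', 'Date of Hire': '1/28/2016'},
    {'Employee Name': 'Dunn, Amy', 'Date of Hire': '9/18/2014'},
    {'Employee Name': 'Gray, Elijiah', 'Date of Hire': '6/2/2015'},
]

def hire_date(record):
    # Assumption: dates are month/day/year, as in the sample data.
    return datetime.strptime(record['Date of Hire'], '%m/%d/%Y')

# The earliest hire date means the longest tenure, so sort ascending
# and keep the first two records.
most_senior = sorted(employees, key=hire_date)[:2]
for emp in most_senior:
    print(emp['Employee Name'], emp['Date of Hire'])
```

Parsing the dates matters here: comparing the raw strings would order '9/30/2013' after '1/28/2016'.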

Now let's see how to get the 10 highest-paid employees with the new module, heapq.

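import heapq

# data is the list of employee dicts loaded from the dataset above.
# The key strips the spaces and "$" from 'Pay' so it compares as a number.
top_pay = heapq.nlargest(10, data, key=lambda x: float(x['Pay'].strip().replace("$", "")))
for emp in top_pay:
    print(emp['Employee Name'], emp['Pay'])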

Output (tuple-style, as Python 2 prints it):

('Butler, Webster L', ' $55.00 ')
('Liebig, Ketsia', ' $55.00 ')
('Sullivan, Kissy ', ' $55.00 ')
('Albert, Michael ', ' $54.50 ')
('Gray, Elijiah ', ' $54.00 ')
('Miller, Brannon', ' $53.00 ')
('Stanley, David ', ' $53.00 ')
('Spirea, Kelley', ' $52.00 ')
('Dunn, Amy ', ' $51.00 ')
('Bozzi, Charles', ' $50.50 ')
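To try this without downloading the full dataset, here is a minimal self-contained sketch. It uses trimmed copies of the five sample records (name and pay only) and asks for the top 2 rather than the top 10; the helper name pay_amount is hypothetical.

```python
import heapq

# Trimmed copies of the sample records, keeping only name and pay.
data = [
    {'Employee Name': 'Albert, Michael', 'Pay': ' $54.50 '},
    {'Employee Name': 'Bozzi, Charles', 'Pay': ' $50.50 '},
    {'Employee Name': 'Butler, Webster L', 'Pay': ' $55.00 '},
    {'Employee Name': 'Dunn, Amy', 'Pay': ' $51.00 '},
    {'Employee Name': 'Gray, Elijiah', 'Pay': ' $54.00 '},
]

def pay_amount(record):
    # ' $54.50 ' -> 54.5: drop the surrounding spaces and the dollar sign.
    return float(record['Pay'].strip().replace('$', ''))

# nlargest returns the records in descending order of the key.
top_two = heapq.nlargest(2, data, key=pay_amount)
for emp in top_two:
    print(emp['Employee Name'], emp['Pay'].strip())
```

A named function does the same job as the lambda and keeps the parsing rule in one place.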

It's that simple with the heapq module in Python. In the same way, the lowest-paid employees can be fetched with the nsmallest function in the heapq module. Both functions accept three arguments, as below.

1. The number of elements to return.
2. The target collection (any iterable).
3. An optional key function that extracts, from each element, the value to compare by. It ranks the elements; it does not filter them.
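The three arguments above can be sketched with a few plain values. The pay rates and words here are made up for illustration; the final assertions show that both functions return the same result as sorting and slicing.

```python
import heapq

# Hypothetical hourly pay values, used only for illustration.
pay_rates = [54.5, 50.5, 55.0, 51.0, 54.0]

# Arguments 1 and 2: how many elements to return, and the collection.
lowest_two = heapq.nsmallest(2, pay_rates)
highest_two = heapq.nlargest(2, pay_rates)
print(lowest_two)
print(highest_two)

# Argument 3: with a key, elements are ranked by the key's return value.
words = ['heap', 'a', 'queue', 'priority']
longest = heapq.nlargest(1, words, key=len)
print(longest)

# Both functions give the same result as sorting and slicing.
assert lowest_two == sorted(pay_rates)[:2]
assert highest_two == sorted(pay_rates, reverse=True)[:2]
```

The Python documentation notes that these functions perform best for smaller values of n; for larger values, sorted() is usually more efficient, and for n equal to 1, min() and max() are the better fit.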

Explore the module further and enjoy.
