We have some data in a table (check the table below). In the table, there is entry_id, which is the primary key, there is city which is the name of the city, there is date, and rain_measurement.
It contains rain measurement data for different cities.
Our job is to find the rain measurement of the latest entry date for each city. So, for each city let’s get the latest record by date.
Table name: rain_data
Use the following query to create the table and populate data, if you want to follow along.
CREATE TABLE `rain_data` ( `entry_id` int(11) NOT NULL AUTO_INCREMENT, `city` varchar(100) DEFAULT NULL, `date` date DEFAULT NULL, `rain_measurement` decimal(10,1) DEFAULT NULL, PRIMARY KEY (`entry_id`) ) ENGINE=InnoDB; INSERT INTO `rain_data`(`entry_id`,`city`,`date`,`rain_measurement`) values (1,'city1','2012-01-01',1.0), (2,'city1','2012-02-02',1.0), (3,'city1','2012-03-03',1.3), (4,'city2','2012-01-01',1.0), (5,'city2','2012-02-02',3.6), (6,'city2','2012-03-03',1.8), (7,'city3','1012-01-01',1.0), (8,'city3','2012-02-02',2.1), (9,'city3','2012-03-03',1.0), (10,'city3','2012-03-04',0.9);
Though it can be approached in a few different ways, we want to approach it in two different ways here. The first one is by using a group by of one column (city) and trying to get the max of another column (date) and fetch data from that row. And the second one is by using ROW_NUMBER() window function(this is MySQL specific implementation).
Method #1: [General] Using Join
In this method, we will use
GROUP BY and
JOIN. This is a general process and will work on any relational database system.
The following queries are written for MySQL, but the same process and steps will work for any database.
Let’s discuss the process step by step.
Step #1: Group by and get the MAX date for each
If we do a GROUP BY with any(one or more) column (in this case, we are performing a GROUP BY city), we will get distinct cities, and using MAX(date) we will get the maximum date for that city:
SELECT city, MAX(date) AS max_date FROM rain_data GROUP BY city;
Using this query we will get results like the below:
This will give us each city and the MAX date for that city.
Step #2: Join the main table with the subquery result
Now we will use it as a subquery and INNER JOIN with the main table. The join needs to be based on the columns that we selected in the previous query (city and max_date).
The final query will be like the below:
SELECT rd.* FROM rain_data rd INNER JOIN (SELECT city, MAX(`date`) AS max_date FROM rain_data GROUP BY city) grd ON rd.city=grd.city AND rd.date=grd.max_date;
We will get entry_id, city, date, and rain_easurement for the last date for each city. The output will be something like this:
Method #2: [For MySQL] Using ROW_NUMBER()
In this method we will use MySQL 8 Window Function ROW_NUMBER() to rank the rows of data for each city and based on the rank we will filter the latest row.
This method will not work in the MySQL versions lower than 8. As ROW_NUMBER() window function is available in MySQL 8 (and above).
Let’s discuss the process step by step.
Step #1: Rank data using ROW_NUMBER()
Use the following query to rank data for each city based on date.
SELECT *, ROW_NUMBER() OVER (PARTITION BY city ORDER BY `date` DESC) AS rank_no FROM rain_data ORDER BY city ASC, rank_no ASC
We are rating rows for each city based on the date, and ranking it in descending order, so the row which has the latest date will have the lowest rank (rank_no = 1) for each city. This query will give you result like the below:
Step #2: Filter and take rows with rank_no= 1
Now use the filter
rank_no = 1 in the select query, like below.
SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY city ORDER BY `date` DESC) AS rank_no FROM rain_data) rd WHERE rd.rank_no = 1
Or you can write it in the format below if you prefer.
WITH ranked_rain_data AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY city ORDER BY `date` DESC) AS rank_no FROM rain_data ) SELECT * FROM ranked_rain_data WHERE rank_no= 1;
Both will give the same result, like below.
You will get the following output:
I hope that it will help you.
2 thoughts on “SQL Tips: Select Latest Record for Each Group”
Lifesaver! Thank You.
Happy to help, Andre