Working on a recent project, I had a task that required me to number the rows of result set; however, I needed to reset the numbering based on the distinct value of a field, or group of fields.
I accomplished my task by leveraging ROW_NUMBER() and PARTITION BY. To summarize what each of these functions do:
- ROW_NUMBER() allows the assignment of consecutive numbers to rows in a result set.
- PARTITION BY divides the result set into partitions (groups based on distinct values), allowing a function to operate on each partition rather than on an entire result set.
The query below allows me to generate a row number for each row of a partition, resetting the row number back to the initial value of each partition. The example below leverages the PARTITION BY across 2 fields; however, it could be more or less depending upon the requirements of the query. I would also highly recommend ensuring that an ORDER BY clause is used to ensure your data is ordered consistently.
Obviously, the variables below need to be replaced prior the the execution of the query, but this should give a good example of how to structure the query. A real-world example is also been shown below.
SELECT {FIELD1}, {FIELD2}, {FIELD3}, ROW_NUMBER() OVER (PARTITION BY {FIELD1}, {FIELD2} ORDER BY {FIELD1} ASC) AS ROWNUM FROM {TABLENAME}
To show a real-world example, using a WebSphere Commerce database, the query below will assign a row number to each CATENTRY_ID within the distinct group of CATGROUP_ID and CATALOG_ID. The data is pulled from the CATGPENREL table.
SELECT CATALOG_ID, CATGROUP_ID, CATENTRY_ID, ROW_NUMBER() OVER (PARTITION BY CATALOG_ID, CATGROUP_ID ORDER BY CATENTRY_ID ASC) AS ROWNUM FROM CATGPENREL
NOTE: The above queries are designed to work in DB2. The may or may not work on other DB platforms.