Friday 2 November 2018

How to select the nth row in a SQL database table?

I'm interested in learning some (ideally) database agnostic ways of selecting the nth row from a database table. It would also be interesting to see how this can be achieved using the native functionality of the following databases:
  • SQL Server
  • MySQL
  • PostgreSQL
  • SQLite
  • Oracle
I am currently doing something like the following in SQL Server 2005, but I'd be interested in seeing other's more agnostic approaches:
WITH Ordered AS (
SELECT ROW_NUMBER() OVER (ORDER BY OrderID) AS RowNumber, OrderID, OrderDate
FROM Orders)
SELECT *
FROM Ordered
WHERE RowNumber = 1000000

 Answers


There are ways of doing this in optional parts of the standard, but a lot of databases support their own way of doing it.
Basically, PostgreSQL and MySQL supports the non-standard:
SELECT...
LIMIT y OFFSET x 
Oracle, DB2 and MSSQL supports the standard windowing functions:
SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY key ASC) AS rownumber,
    columns
  FROM tablename
) AS foo
WHERE rownumber <= n
(which I just copied from the site linked above since I never use those DBs)
Update: As of PostgreSQL 8.4 the standard windowing functions are supported, so expect the second example to work for PostgreSQL as well.



I'm not sure about any of the rest, but I know SQLite and MySQL don't have any "default" row ordering. In those two dialects, at least, the following snippet grabs the 15th entry from the_table, sorting by the date/time it was added:
SELECT * FROM the_table ORDER BY added DESC LIMIT 1,15
(of course, you'd need to have an added DATETIME field, and set it to the date/time that entry was added...)



I suspect this is wildly inefficient but is quite a simple approach, which worked on a small dataset that I tried it on.
select top 1 field
from table
where field in (select top 5 field from table order by field asc)
order by field desc
This would get the 5th item, change the second top number to get a different nth item
SQL server only (I think) but should work on older versions that do not support ROW_NUMBER().



Verify it on SQL Server:
Select top 10 * From emp 
EXCEPT
Select top 9 * From emp
This will give you 10th ROW of emp table!



Oracle:
select * from (select foo from bar order by foo) where ROWNUM = x



SQL SERVER

Select n' th record from top
SELECT * FROM (
SELECT 
ID, NAME, ROW_NUMBER() OVER(ORDER BY ID) AS ROW
FROM TABLE 
) AS TMP 
WHERE ROW = n
select n' th record from bottom
SELECT * FROM (
SELECT 
ID, NAME, ROW_NUMBER() OVER(ORDER BY ID DESC) AS ROW
FROM TABLE 
) AS TMP 
WHERE ROW = n



ADD:
LIMIT n,1
That will limit the results to one result starting at result n.



Here's a generic version of a sproc I recently wrote for Oracle that allows for dynamic paging/sorting - HTH
-- p_LowerBound = first row # in the returned set; if second page of 10 rows,
--                this would be 11 (-1 for unbounded/not set)
-- p_UpperBound = last row # in the returned set; if second page of 10 rows,
--                this would be 20 (-1 for unbounded/not set)

OPEN o_Cursor FOR
SELECT * FROM (
SELECT
    Column1,
    Column2
    rownum AS rn
FROM
(
    SELECT
        tbl.Column1,
        tbl.column2
    FROM MyTable tbl
    WHERE
        tbl.Column1 = p_PKParam OR
        tbl.Column1 = -1
    ORDER BY
        DECODE(p_sortOrder, 'A', DECODE(p_sortColumn, 1, Column1, 'X'),'X'),
        DECODE(p_sortOrder, 'D', DECODE(p_sortColumn, 1, Column1, 'X'),'X') DESC,
        DECODE(p_sortOrder, 'A', DECODE(p_sortColumn, 2, Column2, sysdate),sysdate),
        DECODE(p_sortOrder, 'D', DECODE(p_sortColumn, 2, Column2, sysdate),sysdate) DESC
))
WHERE
    (rn >= p_lowerBound OR p_lowerBound = -1) AND
    (rn <= p_upperBound OR p_upperBound = -1);



For example, if you want to select every 10th row in MSSQL, you can use;
SELECT * FROM (
  SELECT
    ROW_NUMBER() OVER (ORDER BY ColumnName1 ASC) AS rownumber, ColumnName1, ColumnName2
  FROM TableName
) AS foo
WHERE rownumber % 10 = 0
Just take the MOD and change number 10 here any number you want.



For SQL Server, a generic way to go by row number is as such: SET ROWCOUNT @row --@row = the row number you wish to work on.
For Example:
set rowcount 20 --sets row to 20th row
select meat, cheese from dbo.sandwich --select columns from table at 20th row
set rowcount 0 --sets rowcount back to all rows
This will return the 20th row's information. Be sure to put in the rowcount 0 afterward.
I know noobish, but I am a SQL noob and I have used it so what can I say?



SELECT * FROM emp a
WHERE  n = (SELECT COUNT( _rowid)
              FROM emp b
             WHERE a. _rowid >= b. _rowid);



unbelievable that you can find a SQL engine executing this one ...
WITH sentence AS
(SELECT 
    stuff,
    row = ROW_NUMBER() OVER (ORDER BY Id)
FROM 
    SentenceType
    )
SELECT
    sen.stuff
FROM sentence sen
WHERE sen.row = (ABS(CHECKSUM(NEWID())) % 100) + 1



This is how I'd do it within DB2 SQL, I believe the RRN (relative record number) is stored within the table by the O/S;
SELECT * FROM (                        
   SELECT RRN(FOO) AS RRN, FOO.*
   FROM FOO                         
   ORDER BY RRN(FOO)) BAR             
 WHERE BAR.RRN = recordnumber



It seems to me that, to be efficient, you need to 1) generate a random number between 0 and one less than the number of database records, and 2) be able to select the row at that position. Unfortunately, different databases have different random number generators and different ways to select a row at a position in a result set - usually you specify how many rows to skip and how many rows you want, but it's done differently for different databases. Here is something that works for me in SQLite:
select * 
from Table 
limit abs(random()) % (select count(*) from Words), 1;
It does depend on being able to use a subquery in the limit clause (which in SQLite is LIMIT <recs to skip>,<recs to take>) Selecting the number of records in a table should be particularly efficient, being part of the database's meta data, but that depends on the database's implementation. Also, I don't know if the query will actually build the result set before retrieving the Nth record, but I would hope that it doesn't need to. Note that I'm not specifying an "order by" clause. It might be better to "order by" something like the primary key, which will have an index - getting the Nth record from an index might be faster if the database can't get the Nth record from the database itself without building the result set.

0 comments:

Post a Comment