2gb table with 10 million rows, late pagination select is slow
I have table in MySQL with 10 million rows with 2 GB data
selecting IN LIFO format data is slow
Table engine is = InnoDB
table has one primary key and one unique key
SELECT * FROM link LIMIT 999999 , 50;
how I improve the performance of the table. ?
id int(11) NO PRI NULL auto_increment
url varchar(255) NO UNI NULL
website varchar(100) NO NULL
state varchar(10) NO NULL
type varchar(100) NO NULL
prio varchar(100) YES NULL
change varchar(100) YES NULL
last varchar(100) YES NULL
SELECT * FROM link LIMIT 1 , 50; is taking .9ms but current sql is taking 1000ms its 100 time taking more
This most likely is due to “early row lookup”
MySQL can be forced to do “late row lookup”. Try below query
SELECT l.* FROM ( SELECT id FROM link ORDER BY id LIMIT 999999 , 50 ) q JOIN link l ON l.id = q.id
Check this article
For the Next and Prev buttons you can use a
WHERE clause instead of
LIMIT 10 – Sample data explained below): You are on some page which shows you 10 rows with the ids
[2522,2520,2514,2513,2509,2508,2506,2504,2497,2496]. This in my case is created with
select * from link l order by l.id desc limit 10 offset 999000
For the next page you would use
limit 10 offset 999010
getting rows with ids
For the previous page you would use
limit 10 offset 998990
getting rows with ids
All above queries execute in 500 msec. Using the “trick” suggested by Sanj it still takes 250 msec.
Now with the given page with
maxId=2522 we can create queries for the Next and Last buttons using the
select * from link l where l.id < :minId -- =2496 order by l.id desc limit 10
select * from link l where l.id > :maxId -- =2522 order by l.id asc limit 10
To reverse the order you can use the query in a subselect:
select * from ( select * from link l where l.id > 2522 order by l.id asc limit 10 ) sub order by id desc
These queries execute in “no time” (less than 1 msec) and provide the same result.
You can not use this solution to create page numbers. But i don’t think you are going to output 200K page numbers.
Data used for the example and benchmarks has been created with
CREATE TABLE `link` ( `id` INT(11) NOT NULL AUTO_INCREMENT, `url` VARCHAR(255) NOT NULL, `website` VARCHAR(100) NULL DEFAULT NULL, `state` VARCHAR(10) NULL DEFAULT NULL, `type` VARCHAR(100) NULL DEFAULT NULL, `prio` VARCHAR(100) NULL DEFAULT NULL, `change` VARCHAR(100) NULL DEFAULT NULL, `last` VARCHAR(100) NULL DEFAULT NULL, PRIMARY KEY (`id`), UNIQUE INDEX `url` (`url`) ) COLLATE='utf8_general_ci' ENGINE=InnoDB; insert into link select i.id , concat(id, '-', rand()) url , rand() website , rand() state , rand() `type` , rand() prio , rand() `change` , rand() `last` from test._dummy_indexes_2p23 i where i.id <= 2000000 and rand() < 0.5
test._dummy_indexes_2p23 is a table containing 2^23 ids (about 8M). So the data contains about 1M rows randomly missing every second id. Table size: 228 MB
Due to the large amount of data,
There are few tips for improve the query response time:
- Change the storage engine Innodb to myisam.
- Create table partitioning
- Mysql cluster (http://dev.mysql.com/doc/refman/5.7/en/mysql-cluster-overview.html)
- Increase hardware capacity.
First of all running on your table without any order doesn’t guaranty your query will return the same data if ran twice.
It’s better adding an
ORDER BY clause. Taking
id as a good candidate, as it’s your primary key and seems unique (as it’s an
You could use this as your base:
SELECT * FROM link ORDER BY id LIMIT 50;
This will give you the first 50 rows in your table.
Now for the next 50 rows, instead of using
OFFSET, we could save our last location in the query.
You would save the id from the last row last id from the previous query and use it in the next query:
SELECT * FROM link WHERE id > last_id ORDER BY id LIMIT 50;
This will give you the next 50 rows after the last id.
The reason your query runs slowly on high values of
OFFSET is because mysql has to run on all rows in the given
OFFSET and return the last
LIMIT number of rows. This means that the bigger
OFFSET is the slower the query will run.
The solution I showed above, doesn’t depend on
OFFSET, thus the query will run at the same speed independent of the current page.
See also this useful article that explains a few other options you can choose from: http://www.iheavy.com/2013/06/19/3-ways-to-optimize-for-paging-in-mysql/
I have updated my SQL Query to this and this is taking less amount of time.
SELECT * FROM link ORDER BY id LIMIT 999999 , 50 ;