I have had some experience with optimizing the my.cnf file but my database has around 4 million records (MyISAM). I am trying to restore from a mysqldump but every time I do I eventually get the dreaded "Repair With Keycache", that may take days. Is there any way to get past this and let it roll as "Repair By Sorting"?
I have 2GB RAM, Dual Cores, lots of extra hard-drive space.
Snip out of my.cnf:
set-variable = max_connections=650
set-variable = key_buffer=256M
set-variable = myisam_sort_buffer_size=64M
set-variable = join_buffer=1M
set-variable = record_buffer=1M
set-variable = sort_buffer_size=2M
set-variable = read_buffer_size=2M
set-variable = query_cache_size=32M
set-variable = table_cache=1024
set-variable = thread_cache_size=256
set-variable = wait_timeout=7200
set-variable = connect_timeout=10
set-variable = max_allowed_packet=16M
set-variable = max_connect_errors=10
set-variable = thread_concurrency=8
Answers
"Repair by sorting" uses the filesort routine, which in turn creates several temporary files (usually) in your tmpdir.
If your tmpdir does not have enough space for them, it will revert to "Repair by keycache". This is extremely bad as it's much slower AND creates less optimal indexes.
There are some other conditions but I haven't identified them.
Working out the size of tmpdir you need for filesort() is nontrivial; the format data are stored in the filesort buffer is not the same as MYD files, it typically uses a lot more space.
So if your tmpdir points at a small /tmp (or tmpfs), you might want to change it to a larger /var/tmp - if that exists.
I accidentially ran a repair table quick on a new database which I had not set up to be fast reg. myisam_max_sort_file_size which was way too small compared to the .MID file (which is 88279393280 byes large, about 88GB). The data file is 85GB. The table is 1.2 billion records, consisting of an ID, two dates, a tinytext ,a few bigints and a double. My server (2GB virtual linux running in a box under windows7) only have one core of the 4 on the windows server, but it is running 3+ GHZ. I was fearing this "repair by keycache" event would take forever - given horror stories with far smaller tables.
Fortunately it "only" took 1 day , 10 hours and 20.72 seconds to complete the repair table quick operation.
What I miss the most is some way of knowing how far into the operation that mysql is, and how soon it might be finished. This is still unknown to me.
I have now changed my my.ini file and double checked with df that I have ample disk space for those large temporary files.
Anyway.. my main point, which might be very useful knowledge to the next guy who falls into this trap.. is in fact... don't panic! it might be slow, but it is possible on rather sub-par hardware to get 1+ billion records sorted out within a day or two. Got three indexes, one on a date field, one on a bigint field, and one primary on the ID field.
I would've posted this as a comment to one of the solutions, but I can't seem to figure how to do this, with the user interface here, so I'll drop it off as a solution. Don't upvote me, it's just a note that I would have loved to have here, I was almost going to kill my "sort by keycache" thread as I thought it could take a week or more. 2 days per billion records is manageable..
Edit: And now, a repair table on the same database, but with a large enough mysiam_max_sort_file_size setting took 10 hours, 20 minutes using repair by sorting. The most diskspace used was about 250GB, but i had set myisam_max_sort_file_size a lot higher, reflecting how much disk space is actually free on the server.
Tracking progress is hard. Disk space went up and down while the individual indexes were built, but there were hour long pauses where no changes were made reg. disk space usage (as reported by df).
According to the MySQL Reference Manual, disk space must be available "in the file system containing the directory where the original index file is located" (http://dev.mysql.com/doc/refman/5.5/en/server-system-variables.html#sysvar_myisam_max_sort_file_size) -- this applies to (at least) v5.0 and above. This contradicts some of the above answers, that claim that increasing the disk space for the tmp directory would help.
I can confirm the behaviour described in the Reference Manual: temporary disk space is used where the table's data (
*.MYD
) & index files (*.MYI
) are stored, but not in tmpdir
.
0 comments:
Post a Comment