2. MySQL Data Types
As a programmer, understanding the data types is curial in understanding the working of the underlying database system.
MySQL supports many data types, grouped in 3 categories:
- Numeric: including integers, floating-point numbers, bits and boolean.
- String: including fixed-length and variable-length strings, binary data, and collections (enumeration and set).
- Date/Time: Date and time are extremely important in database applications. This is because business records often carry date/time information (e.g.,
orderDate
,paymentDate
). There is also a need to time-stamp the creation and last update of records for auditing and security considerations. MySQL provides many date/time data types (e.g.,DATETIME
,DATE
,TIME
,YEAR
,TIMESTAMP
) and built-in functions for date/time manipulation.
2.1 Numeric - Integers
Integers, by default, are signed integers; unless
UNSIGNED
is declared.
You could set the display width, using the syntax
INTEGER_TYPE(n)
, where n
is the display field-width of up to 255, e.g., INT(10)
. You could also specify ZEROFILL
to pad the displayed numbers with leading zeros (for UNSIGNED
only) instead of blanks. The field-width affects only the display, and not the number stored.
MySQL supports many integer types with various precisions and ranges.
- TINYINT: 8-bit precision. The range is
[-128, 127]
for signed integer, and[0, 255]
for unsigned. - SMALLINT: 16-bit precision. The range is
[-32768, 32767]
for signed integer, and[0, 65535]
for unsigned. - MEDIUMINT: 24-bit precision. The range is
[-8388608, 8388607]
for signed integer, and[0, 16777215]
for unsigned. - INT (or INTEGER): 32-bit precision. The range is
[-2147483648, 2147483647]
for signed integer, and[0, 4294967295]
for unsigned. - BIGINT: 64-bit precision. The range is
[-9223372036854775808, 9223372036854775807]
for signed integer, and[0, 18446744073709551615]
for unsigned. - BIT(n): A n-bit column. To input a bit-value, use the syntax
b'value'
or0bvalue
, e.g.,b'1001'
,0b10010
. - BOOLEAN (or BOOL): same as
TINYINT(1)
(with range of -128 to 127, display field-width of 1). Value of zero is considered false; non-zero is considered true. You could also useBIT(1)
to store boolean value.
Choosing the right integer type is important for optimizing storage usage and computational efficiency. For example, for a integer column with a range of 1 to 9999, use
SMALLINT(4) UNSIGNED
: for 1 to 9999999, use MEDIUMINT(7) UNSIGNED
(which shall be sufficient for a small business, but always allow more room for future expansion).
Integers (and floating-point numbers to be discussed later) could be declared as
AUTO_INCREMENT
, with default starting value of 1. When you insert a NULL
(recommended) (or 0, or a missing value) into an AUTO_INCREMENT
column, the maximum value of that column plus 1 would be inserted. You can also insert any valid value to an AUTO_INCREMENT
column, bypassing the auto-increment. Take note that further INSERT
would fail if the last INSERT
has reached the maximum value. Hence, it is recommended to use INT
for AUTO_INCREMENT
column to avoid handling over-run. There can only be one AUTO_INCREMENT
column in a table and the column must be defined as a key.
Example (Testing the Integer Data Types): Read "
integer_arena.sql
".
Example (Testing AUTO_INCREMENT): autoincrement_arena.sql.
2.2 Numeric - Fixed-Point & Floating-Point Numbers
MySQL supports both approximated floating points (
FLOAT
and DOUBLE
) and exact fixed-point point (DECIMAL
).- FLOAT: 32-bit single precision floating-point numbers. You can specify
UNSIGNED
to disallow negative number, andZEROFILL
to pad the displayed number with zeros. - DOUBLE (or DOUBLE PRECISION or REAL): 64-bit double precision floating-point numbers.
- DECIMAL(n, d) (or DEC or NUMERIC or FIXED): fixed-point decimal numbers, where
n
is the number of digits withd
decimal places. For example,DECIMAL(6, 2)
specifies 6 total digits (not including the decimal point) with 2 digit after the decimal point, which has the range of-9999.99
to9999.99
.
UnlikeINT(n)
, wheren
specifies the display field-width and does not affect the number stored;n
inDECIMAL(n, d)
specifies the range and affects the number stored.
The fixed-pointDECIMAL
is not available in popular languages such as C/C++/Java/C#.DECIMAL
is an exact representation, as it is represented as integer with a fix decimal point. On the other hand,FLOAT
andDOUBLE
values are approximated and inexact, which create problems in comparisonand cause rounding errors in arithmetic operations. Attempt to compare inexact values using'='
and'<>'
leads to unexpected result.DECIMAL
type was created for representing currency, which is exact and does not suffer from rounding errors in addition and subtraction. For example,DECIMAL(n, 2)
.DECIMAL
values can be compared with'='
or'<>'
. - FLOAT(n, d) or DOUBLE(n, d): Stored with
n
total digits withd
decimal places. Exact or Approximated?? Advisable to useDECIMAL(n, d)
.
The fixed-point and floating-point numbers can also be declared as
UNSIGNED
to exclude negative numbers.
Example (Testing the Floating-point Data Type): Read "
floatingpoint_arena.sql
".2.3 Character Sets and Collations
Strings are made up of a sequence of characters. Before presenting the various string data types, it is important to understand the so-called character set and collating sequence (or collation).
Character Sets
A character set (also called charset, character encoding, character map, code page) defines a set of character and maps each character to a unique numeric code. For example, in ASCII, character
'A'
is assigned 65, 'a'
is 97, '0'
is 48 and space is 32. The most commonly-used character sets are the 8-bit Latin-1 (ISO/IEC 8859-1) for English characters (which is backward compatible with 7-bit ASCII), and multi-byte Unicode (ISO/IEC 10646) for internationalization. Unicode can be encoded in variable-length UTF8 (1-3 bytes) for fixed-length UCS2 (2 bytes).
MySQL's UCS-2 is in big-endian without BOM (Byte Order Mark), i.e., UCS-2BE or UTF2BE. MySQL uses no BOM for UTF-8.
Collating Sequences
A collating sequence (or collation) refers to the orders in which individual characters should be ranked in comparing or sorting the strings. In other words, a collation is a set of rules for ranking characters in a character set.
A string may contain uppercase and lowercase characters, digits and symbols. As an example, suppose we wish to sort three strings:
"apple"
, "BOY"
, and "Cat"
encoded in ASCII. If we rank the characters according to the underlying ASCII code numbers, the order would be "BOY"
, "apple"
and"Cat".
It is because uppercase letters have smaller code numbers than lowercase letters in ASCII code. Ranking strings according to their underlying code numbers is known as binary collation. Binary collation does not agree with the so-called dictionary order, where the same uppercase and lowercase letters have the same rank.
There are many other collating sequences available. Some case-insensitive (
ci
) dictionary-order collating sequences have the same rank for the same uppercase and lowercase letter, i.e., 'A'
, 'a'
⇒ 'B'
, 'b'
⇒ ... ⇒ 'Z'
, 'z'
. Some case-sensitive (cs
) dictionary-order collating sequences put the uppercase letter before its lowercase counterpart, i.e., 'A'
⇒'a'
⇒ 'B' ⇒
'b'
⇒ ... ⇒ 'Z'
⇒ 'z'
. Typically, space is ranked before digits '0'
to '9'
, followed by the alphabets.
Collating sequence is also language dependent, as different languages use different sets of characters, e.g., a, A, α, Ä, á, é. It becomes more complicated if more than one languages are used, e.g., you need to decide how to rank 'a' and 'α', by choosing an appropriate collating sequence.
MySQL Character Sets and Collations
MySQL 5.5 supports 39 character sets with 197 collations.
You could use command
SHOW CHARACTER SET
to display all the supported character sets and their default collation; and SHOW COLLATION
to display all the collations and the charsets their support.mysql> SHOW CHARACTER SET; +----------+-----------------------------+---------------------+--------+ | Charset | Description | Default collation | Maxlen | +----------+-----------------------------+---------------------+--------+ | latin1 | cp1252 West European | latin1_swedish_ci | 1 | | utf8 | UTF-8 Unicode | utf8_general_ci | 3 | | ucs2 | UCS-2 Unicode | ucs2_general_ci | 2 | | ascii | US ASCII | ascii_general_ci | 1 | ...... 39 rows in set (0.06 sec) mysql> SHOW COLLATION; +--------------------------+----------+-----+---------+----------+---------+ | Collation | Charset | Id | Default | Compiled | Sortlen | +--------------------------+----------+-----+---------+----------+---------+ | big5_chinese_ci | big5 | 1 | Yes | Yes | 1 | | big5_bin | big5 | 84 | | Yes | 1 | ...... 197 rows in set (0.01 sec)
The default character set in MySQL is
latin1
(aka ISO-8859-1). The default collation for latin1
is latin1_swdish_ci
, which is a case-insensitive collation suitable for Swedish/finnish (MySQL was created by some Swedishs). You can list the available collations for latin1
, as follows:mysql> SHOW COLLATION LIKE 'latin1%'; +-------------------+---------+----+---------+----------+---------+ | Collation | Charset | Id | Default | Compiled | Sortlen | +-------------------+---------+----+---------+----------+---------+ | latin1_german1_ci | latin1 | 5 | | Yes | 1 | | latin1_swedish_ci | latin1 | 8 | Yes | Yes | 1 | | latin1_danish_ci | latin1 | 15 | | Yes | 1 | | latin1_german2_ci | latin1 | 31 | | Yes | 2 | | latin1_bin | latin1 | 47 | | Yes | 1 | | latin1_general_ci | latin1 | 48 | | Yes | 1 | | latin1_general_cs | latin1 | 49 | | Yes | 1 | | latin1_spanish_ci | latin1 | 94 | | Yes | 1 | +-------------------+---------+----+---------+----------+---------+
latin1_general_ci
is a case-insensitive collation for multilingual western european. latin1_general_cs
is case-sensitive. latin1_bin
is the binary collation according to latin1
encoding.
I recommend that you use
utf8
charset for applications that require internationalization (i18n) support (because utf8
supports all the languages in this world). The default collation for utf8
is utf8_general_ci
. Other collations for utf8
includes:mysql> SHOW COLLATION LIKE 'utf8%'; +--------------------+---------+-----+---------+----------+---------+ | Collation | Charset | Id | Default | Compiled | Sortlen | +--------------------+---------+-----+---------+----------+---------+ | utf8_general_ci | utf8 | 33 | Yes | Yes | 1 | | utf8_bin | utf8 | 83 | | Yes | 1 | | utf8_unicode_ci | utf8 | 192 | | Yes | 8 | | utf8_icelandic_ci | utf8 | 193 | | Yes | 8 | | utf8_latvian_ci | utf8 | 194 | | Yes | 8 | | utf8_romanian_ci | utf8 | 195 | | Yes | 8 | ......
You may want to try out some of these collations by ranking space, numbers, uppercase and lowercase letters, and some typical strings.
Specifying Character Set and Collation in MySQL
You could specify the character set and collation via keyword
CHARACTER SET
(or CHARSET
) and COLLATE
. They could be set at 4 levels: server, database, table, and column.
To set the charset and collation for the server, start
mysqld
with --character-set-server=charset
and --collation-server=collation
options. The default charset is latin1
.
You can set the default charset and collation for an database in
CREATE DATABASE
. For example,CREATE DATABASE databaseName DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
You can set the character set and collation for table and column in
CREATE TABLE
. For example,CREATE TABLE IF NOT EXISTS StringArena ( cString1 VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_general_cs, -- case-sensitive 'A' -> 'a' -> 'B' -> 'b' cString2 CHAR(10) CHARACTER SET latin1, -- default collation of latin1_swedish_ci cString3 TEXT ) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci; -- case-insensitive 'A' and 'a' are the same
In the above example, the default character set for all the string columns for the table is
utf8
(for unicode support) with collating sequence of utf8_unicode_ci
(default). However, column cString1
uses character set latin1
with collation latin1_general_cs
; column cString2
uses character set Latin1
with default collation latin1_swedish_ci
(not from table default but from the charset default); column cString3
uses the default character set and collation of the table.
Example (Testing Character Sets and Collations): Read "
charset_arena.sql
".Character Set and Collation Related System Variables
There are quite a number of system variables related to charset and collation, with names starting with "
character_set_
" and "collation_
", respectively. You can list them as follows:mysql> SHOW VARIABLES LIKE 'character\_set\_%'; -- Use the escape'\_'
for'_'
+--------------------------+--------+ | Variable_name | Value | +--------------------------+--------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | <= default charset for database | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | <= default charset for server | character_set_system | utf8 | <= meta-data +--------------------------+--------+ 7 rows in set (0.00 sec) mysql> SHOW VARIABLES LIKE 'collation\_%'; +----------------------+-------------------+ | Variable_name | Value | +----------------------+-------------------+ | collation_connection | latin1_swedish_ci | | collation_database | latin1_swedish_ci | <= default collation for database | collation_server | latin1_swedish_ci | <= default collation for server +----------------------+-------------------+ 3 rows in set (0.00 sec)
The system variables
character_set_server
, character_set_database
and collation_server
, collation_database
maintain the default charset and collation for the server and the current database respectively.
In MySQL, columns, tables, databases may use different character sets. However, the meta-data (such as column names and attributes) must be stored in the same character set, and include all languages. Hence, meta-data, by default, stored in UTF-8, as maintained in the system variable
character_set_system
.Character Set for Client-Server Communication
A client connects to the server via a so-called connection object. The client sends the SQL statements over the connection to the server. The server returns the results via the connection to the client. Server maintains a connection for each client, and maintains connection-related character set and collation system variables for each client connection.
The client sends its statements using
character_set_client
. They are translated to character_set_connection
, and uses collation_connection
for comparing string literals. The server returns the results to the client using character_set_results
. You can use command "SET NAMEScharset
" (e.g., "SET NAMES 'utf8'
") to change the character set used for client-server communication. "SET NAMES
", in effect, changes system variables character_set_client
, character_set_connection
, and character_set_results
. For example,-- Default charset for client, connection, and results is latin1 mysql> SHOW VARIABLES LIKE 'character\_set\_%'; +--------------------------+--------+ | Variable_name | Value | +--------------------------+--------+ | character_set_client | latin1 | | character_set_connection | latin1 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | latin1 | | character_set_server | latin1 | | character_set_system | utf8 | +--------------------------+--------+ 7 rows in set (0.00 sec) mysql> SET NAMES 'utf8'; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'character\_set\_%'; +--------------------------+--------+ | Variable_name | Value | +--------------------------+--------+ | character_set_client | utf8 | <== | character_set_connection | utf8 | <== | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | utf8 | <== | character_set_server | latin1 | | character_set_system | utf8 | +--------------------------+--------+ 7 rows in set (0.00 sec)
When a client (e.g.,
mysql
, mysqlshow
, mysqladmin
) connects to the server, it sends the name of the character set that it wants to use to the server (default is latin1
; or specify in --default-character-set
startup option of the client). The server uses this name to set the character_set_client
, character_set_results
, and character_set_connection
system variables for this client connection. In effect, the server performs a "SET NAMES
".2.4 String Data Types
MySQL supports fixed-length and variable-length string types:
- CHAR(n): fix-length string of
n
characters, right-pad with spaces. The maximumn
is 255 characters. - VARCHAR(n): variable-length string of up to
n
characters. The maximumn
is 65535 bytes of storage (utf8
characters range from 1-4 bytes). - TINYTEXT: up to 255 Bytes
- TEXT: up to 64 KBytes
- MEDIUMTEXT: up to 16 MBytes
- LONGTEXT: up to 4 GBytes
Example: [TODO]
String Literals and String Introducer
A string literal has on optional charset introducer and collate clause. The introducer tells the parser that the string that is followed uses a certain character set.
// Syntax [_charsetName]'string' [COLLATE collationName] // Example SELECT 'Test String'; SELECT _latin1'Test String'; SELECT _latin1'Test String' COLLATE ... SELECT HEX('abc'); -- 616263 SELECT _latin1 x'616263'; -- 'abc': can express string in hex code SELECT HEX(_utf8'abc'); -- 616263
CONVERT(... USING ...) Function
CONVERT(expr USING charset)
can be used to convert string between different character sets, e.g.,SELECT CONVERT(_latin1'....' USING utf8); INSERT INTO utf8Table (utf8Column) VALUES ((SELECT CONVERT(latin1Column USING utf8) FROM latin1Table)));
String introducer merely interprets the string literal and does not change its value; whereas
CONVERT
() returns a new value of the specified type. For example,-- default charset is latin1 mysql> SELECT HEX(CONVERT('abc' USING ucs2)); +--------------------------------+ | HEX(CONVERT('abc' USING ucs2)) | +--------------------------------+ | 006100620063 | +--------------------------------+ -- Returns a new UCS2 representation. -- Each character in UCS2 takes two bytes mysql> SELECT HEX(_ucs2'abc'), _ucs2'abc'; +-----------------+-----+ | HEX(_ucs2'abc') | ? | +-----------------+-----+ | 00616263 | ?? | +-----------------+-----+ -- Interpret 'abc' (or x'616263') as two UCS2 characters
2.5 UTF8 Charset
For internationalization, UTF8 charset shall be used.
UTF8-Encoded Chinese Characters in Windows' CMD Shell
In brief, do not use CMD shell to work on UTF8 charset. Use a tool such as NetBeans, Eclipse, MySQL workbench or PhpMyAdmin which fully supports UTF8 charset.
If you want to display a text file encoded in a particular character set (e.g., ASCII, Latin1, UTF8) in CMD correctly, you need to choose a matching "codepage". For example, ANSI/Latin1 requires codepage 1252, Simplified Chinese GBK (GB2312) requires 936, UTF8 requires 65001, UCS2LE requires 1200, UCS2BE requires 1201, and the original DOS extended-ASCII uses codepage 437.
CMD is NOT able to detect the character set automatically and switch to the matching codepage. To set the codepage in CMD, issue command "
chcp color-page-number
", e.g. "chcp 65001
" to choose UTF8 codepage. This codepage could display UTF8 text correctly, and display garbage if the text is encoded using other character set, including UCS2, GB2312.
You also have to choose a font (such as Consolas, Lucida Console, but NOT raster font) that supports the characters. To choose the font for CMD, click on the CMD icon ⇒ "Properties" ⇒ "Font".
There are a number of character encoding schemes for Chinese characters, e.g., Unicode (UTF8, UCS2GE, UCS2LE), GB2312, GBK and BIG5, which are not compatible.
Unicode is an international standard, which supports all languages, including Chinese, Japanese and Korean (CJK). Unicode has two frequently-used encoding schemes: UTF8 (1-3 bytes variable-length, the 1-byte codes are backward-compatible with ASCII) and UCS2 (2 byte fixed-length). UTF8 is ideal if your text consists of mainly ASCII (English text), with occasional non-ASCII text. I strongly recommend using UTF8 character set for MySQL columns that require internationalization support. GB2312 is a Chinese national standard for simplified chinese, it is not compatible with UCS2 or UTF8. For example, for the chinese characters "您好", the UCS2 is "
60A8 597D
", the UTF8 is "E682A8 E5A5BD
", the GB1232 is "C4FA BAC3
".
Unfortunately, in CMD, the Microsoft Pinyin IME (MSPY), the Chinese character input tool, is disabled for codepage 65001 (UTF8) (?!?!). MSPY works on GB2312 (codepage 936). If you run your MySQL client in CMD, using codepage 936 (GB2312) so that you can input chinese characters (e.g., for
INSERT
and UPDATE
), but your database character columns are using UTF8, you need to handle this situation with great care.
In MySQL client, you can issue the command "
SET NAMES 'charset
'" to specify the character set used for the client/server communication. (Alternatively, you could start 'mysql'
client with the --default-character-set=charset
option.)
In the above situation, you need to issue "
SET NAMES 'gb2312'
", as your input is encoded in GB2312, which is sent to the server. The server receives the characters in GB2312, converts them into the character set defined for the column, which is UTF8, before storing them. During query, the server retrieves the characters from the columns in UTF8, convert to GB2312 (according to the SET NAMES
), and send to the client in CMD, which uses codepage 936 to correctly display the text. Try the following commands:mysql> SET NAMES 'gb2312'; Query OK, 0 rows affected (0.00 sec) mysql> SHOW VARIABLES LIKE 'character\_set\_%'; +--------------------------+--------+ | Variable_name | Value | +--------------------------+--------+ | character_set_client | gb2312 | | character_set_connection | gb2312 | | character_set_database | latin1 | | character_set_filesystem | binary | | character_set_results | gb2312 | | character_set_server | latin1 | | character_set_system | utf8 | +--------------------------+--------+ 7 rows in set (0.00 sec)
Observe that
SET NAMES
affects the character_set_client
, character_set_connection
, and character_set_results
.
Suppose that you interacts with the server via a Java program, written in Eclipse or others. You should set the encoding to UTF8, and include "
SET NAMES 'utf8'
" in your MySQL script.
Similarly, suppose that you are running a SQL script in batch mode or using source command in CMD, and your script in encoded in UTF8, then you should include "
SET NAMES 'utf8'
" in the script ("SET NAMES 'gb2312'
" results in garbage inserted into the database). But to query interactively in CMD using code page 936, you need to "SET NAMES 'gb2312'
" to convert UTF8 to GB2312 (cp936).
I would expect that, in CMD using codepage 65001, I could display the UTF8 column from MySQL correctly with "
SET NAMES 'utf8'
". But it does not work in my computer (?!?!).
Take note that you could use
HEX()
function to request the hex code of the characters, stored in the database.SELECT `id`, `description`, HEX(`description`) FROM `ChineseUtf8Arena` ORDER BY `description`, `id`;
Example (Testing UTF8-encoded Chinese Characters):
Notes on Using UTF8-Encoded Characters in Eclipse/Java
- Open Run Dialog ⇒ "your application" ⇒ Common Tab ⇒ Console Encoding ⇒ Other ⇒ "set it to UTF-8".
- Open Run Dialog ⇒ "your application" ⇒ Arguments Tab ⇒ VM Arguments ⇒ Add "
-Dfile.encoding=UTF-8
"; or add "-Dfile.encoding=UTF-8
" to "eclipse.ini
". - Window Menu ⇒ General ⇒ Workspace ⇒ Text file encoding ⇒ set to "UTF-8".
2.6 Binary Data and BLOB
MySQL support both character string (a sequence of characters) and binary string (a sequence of bytes). Binary data (e.g., images and codes) are special string with character set of
binary
, that is, all characters are treated as binary raw bytes.- BINARY(n): Similar to
CHAR(n)
, but store binary data with fixed-length ofn
bytes. - VARBINARY(n): Similar to
VARCHAR(n)
, but store binary data of variable-length of up ton
bytes. - TINYBLOB: up to 255 bytes
- BLOB: up to 64KBytes
- MEDIUMBLOB: up to 16MBytes
- LONGBOLB: up to 4GBytes
Example: The
LOAD_FILE()
function, which reads from a file and returns a string, can be used to load a BLOB
/TEXT
field with a binary/text file.CREATE TABLE IF NOT EXISTS `blob_arena` ( `id` INT UNSIGNED ZEROFILL NOT NULL AUTO_INCREMENT, `photo` BLOB, PRIMARY KEY(`ID`) ); INSERT INTO `blob_arena` (`photo`) VALUES (LOAD_FILE('d:/path/test.jpg'));
Example: Storing a thumbnail image [Refer to Rental Database Example].
2.7 ENUM
ENUM
is a special string type. An ENUM
(or enumeration) is a string with a value chosen from a list of allowed values (or members). The list of allowed values are defined explicitly in the column definition of the CREATE TABLE
command. For example,CREATE TABLE Shoes ( brand ENUM('abc', 'xyz', 'qqq'); size ENUM('S', 'M', 'L', 'XL', 'XXL'); customerRating ENUM('1', '2', '3', '4', '5'); );
The value must be a quoted string literal. Duplicate values are not allowed.
NULL
is allowed only if the column is declared to allow NULL
. If you insert a value not in the ENUM
list, an empty string (''
) would be inserted, which signals an error value.
Each
ENUM
type is associated with an index, beginning with 1 for the first allowable value. The empty string has index of 0 (to denotes an error value). The index of NULL
is NULL
.
ENUM is typically represented in 16-bit, which allows 65,535 members.
Example (Testing ENUM Type):
- Create a table with
ENUM
columns. Observe thatENUM
's values are quoted string literals. Numbers must be stored as quoted strings.CREATE TABLE IF NOT EXISTS `enum_arena` ( `desc` VARCHAR(50) NULL, `brand` ENUM('abc', 'xyz', 'qqq') NOT NULL DEFAULT 'qqq', `size` ENUM('S', 'M', 'L', 'XL', 'XXL') NOT NULL, `customerRating` ENUM('5', '4', '3', '2', '1') DEFAULT NULL ); DESCRIBE `enum_arena`; +----------------+------------------------------+------+-----+---------+-------+ | Field | Type | Null | Key | Default | Extra | +----------------+------------------------------+------+-----+---------+-------+ | desc | varchar(50) | YES | | NULL | | | brand | enum('abc','xyz','qqq') | NO | | qqq | | | size | enum('S','M','L','XL','XXL') | NO | | NULL | | | customerRating | enum('5','4','3','2','1') | YES | | NULL | | +----------------+------------------------------+------+-----+---------+-------+
- Insert values into
ENUM
fields. Check case-sensitivity and values not in the enum list. Observe that theENUM
string is NOT case-sensitive (in MySQL). Empty string (''
), denoting erroneous value, would be inserted for value not in theENUM
list.INSERT INTO `enum_arena` (`description`, `brand`, `size`) VALUES ('Valid value', 'abc', 'm'), ('Uppercase value', 'XYZ', 'M'), ('Invalid value', 'qqq', 'unknown'); UPDATE `enum_arena` SET customerRating='3'; SELECT * FROM `enum_arena`; +-----------------+-------+------+----------------+ | description | brand | size | customerRating | +-----------------+-------+------+----------------+ | Valid value | abc | M | 3 | | Uppercase value | xyz | M | 3 | | Invalid value | qqq | | 3 | +-----------------+-------+------+----------------+
- Using numeric index as
ENUM
's value: Instead of the literal string value, you could also use the numeric index associated with eachENUM
type. The numeric index begins at 1, for the first value is the list.INSERT INTO `enum_arena` VALUES ('Use numeric index', 'qqq', 2, 1); SELECT * FROM `enum_arena` WHERE `description`='Use numeric index'; +-------------------+-------+------+----------------+ | description | brand | size | customerRating | +-------------------+-------+------+----------------+ | Use numeric index | qqq | M | 5 | +-------------------+-------+------+----------------+
- Selecting the error records: The empty string (erroneous value) has index of 0 (error code). You could use the error index of 0 to retrieve the erroneous records as follows.
SELECT * FROM Shirts WHERE brand=0 OR size=0 OR customerRating=0; +-------+------+----------------+ | brand | size | customerRating | +-------+------+----------------+ | qqq | | 3 | +-------+------+----------------+
- To retrieve all the allowable values of an
ENUM
, you could query the MySQL systemdatabase information_schema
, tablecolumns
, as follows:SELECT column_type FROM information_schema.columns WHERE table_schema='test' AND table_name='enum_arena' AND column_name='brand'; +-------------------------+ | column_type | +-------------------------+ | enum('abc','xyz','qqq') | +-------------------------+
2.8 SET
SET
is also a special string type. SET
is similar to ENUM
, but you could choose zero (empty string) or more values from the allowable values; or NULL
if configured. For a SET
('a'
, 'b'
, 'c'
), the valid values are ''
(empty string), 'a'
, 'b'
, 'c'
, 'a,b'
, 'a,c'
, 'b,c'
and 'a,b,c'
. Commas are used to delimit multiple values, with no spaces around the commas.
A set can have a maximum of 64 members. In MySQL, a set values are stored as a 64-bit integer, with the least-significant bit corresponding to the first member. For example, for a
SET
('a'
, 'b'
, 'c'
), selected numeric value are:SET member | Binary Value | Decimal Value |
---|---|---|
'a' | 001 | 1 |
'b' | 010 | 2 |
'c' | 100 | 4 |
'' | 000 | 0 |
'a,b,c' | 111 | 7 |
Similar to
ENUM
, you can use the numeric value.
Example (Testing SET Data Type):
[TODO] Explanation.
2.9 Date/Time
Date and time (as well as currency) are of particular interest for database applications. It is because business records often carry date/time information (e.g.,
orderDate
, deliveryDate
, paymentDate
), as well as the need to time-stamp the last update of the records for auditing and security.
MySQL provides these built-in functions for getting the current date and/or time:
- NOW(): returns the current date and time in the format of 'YYYY-MM-DD HH:MM:SS'.
- CURDATE() (or CURRENT_DATE(), or CURRENT_DATE): returns the current date in the format of 'YYYY-MM-DD'.
- CURTIME() (or CURRENT_TIME(), or CURRENT_TIME): returns the current time in the format of 'HH:MM:SS'.
For examples,
mysql> select now(), curdate(), curtime(); +---------------------+------------+-----------+ | now() | curdate() | curtime() | +---------------------+------------+-----------+ | 2010-04-08 16:28:06 | 2010-04-08 | 16:28:06 | +---------------------+------------+-----------+
MySQL provides these date/time data type:
- DATETIME: stored both date and time in the format of 'YYYY-MM-DD HH:MM:SS' where the date and time are separated by a space. The valid range is
'1000-01-01 00:00:00'
to'9999-12-31 23:59:59'
. You could apply functionsNOW()
orCURDATE()
(time will be set to'00:00:00'
) on this field, but notCURTIME()
. - DATE: stored date only in the format of 'YYYY-MM-DD'. The range is
'1000-01-01'
to'9999-12-31'
. You could useCURDATE()
orNOW()
(the returned time portion discarded) for this field. - TIME: stored time only in the format of 'HH:MM:SS'. You could use
CURTIME()
orNOW()
(the returned date portion discarded) for this field. - YEAR(4|2): in
'YYYY'
or'YY'
. The range of years is 1901 to 2155. UseDATE
type for year outside this range. You could also useCURDATE()
to retrieve the current year value for this field (month and day discarded).
- TIMESTAMP: similar to
DATETIME
but stored the number of seconds since January 1, 1970 UTC (Unix-style). The range is'1970-01-01 00:00:00'
to'2037-12-31 23:59:59'
.
The differences betweenDATETIME
andTIMESTAMP
are:- the range,
- support for time zone,
TIMESTAMP
column could be declared withDEFAULT CURRENT_TIMESTAMP
to set the default value to the current date/time. (All other data types' default, includingDATETIME
, must be a constant and not a function return value). You can also declared withON UPDATE CURRENT_TIMESTAMP
to capture the timestamp of the last update.
The date/time value can be entered manually as a string literal (e.g.,
'2010-12-31 23:59:59'
for DATAETIME
). MySQL will issue a warning and insert all zeros (e.g., '0000-00-00 00:00:00' for DATAETIME
), if the value of date/time to be inserted is invalid or out-of-range. '0000-00-00'
is called a "dummy" date.
Example (Testing DATE/TIME Data Types):
- Create a table with various date/time columns. Only the
TIMESTAMP
column can have theDEFAULT
andON UPDATE
asCURRENT_TIMESTAMP
.CREATE TABLE IF NOT EXISTS `datetime_arena` ( `desc` VARCHAR(50) DEFAULT NULL, `cDateTime` DATETIME DEFAULT '0000-00-00 00:00:00', `cDate` DATE DEFAULT '0000-00-00', `cTime` TIME DEFAULT '00:00:00', `cYear` YEAR DEFAULT '0000', `cYear2` YEAR(2) DEFAULT '00', `cTimeStamp` TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP ); DESCRIBE `datetime_arena`; +-------------+-------------+------+-----+---------------------+-----------------------------+ | Field | Type | Null | Key | Default | Extra | +-------------+-------------+------+-----+---------------------+-----------------------------+ | description | varchar(50) | YES | | NULL | | | cDateTime | datetime | YES | | 0000-00-00 00:00:00 | | | cDate | date | YES | | 0000-00-00 | | | cTime | time | YES | | 00:00:00 | | | cYear | year(4) | YES | | 0000 | | | cYear2 | year(2) | YES | | 00 | | | cTimeStamp | timestamp | NO | | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP | +-------------+-------------+------+-----+---------------------+-----------------------------+
- Insert values manually using string literals.
INSERT INTO `datetime_arena` (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`) VALUES ('Manual Entry', '2001-01-01 23:59:59', '2002-02-02', '12:30:30', '2004', '05'); SELECT * FROM `datetime_arena` WHERE description='Manual Entry'; +--------------+---------------------+------------+----------+-------+--------+---------------------+ | description | cDateTime | cDate | cTime | cYear | cYear2 | cTimeStamp | +--------------+---------------------+------------+----------+-------+--------+---------------------+ | Manual Entry | 2001-01-01 23:59:59 | 2002-02-02 | 12:30:30 | 2004 | 05 | 2010-04-08 14:44:37 | +--------------+---------------------+------------+----------+-------+--------+---------------------+
- Checking the on-update for
TIMSTAMP
.UPDATE `datetime_arena` SET `cYear2`='99' WHERE description='Manual Entry'; SELECT * FROM `datetime_arena` WHERE description='Manual Entry'; +--------------+---------------------+------------+----------+-------+--------+---------------------+ | description | cDateTime | cDate | cTime | cYear | cYear2 | cTimeStamp | +--------------+---------------------+------------+----------+-------+--------+---------------------+ | Manual Entry | 2001-01-01 23:59:59 | 2002-02-02 | 12:30:30 | 2004 | 99 | 2010-04-08 14:44:48 | +--------------+---------------------+------------+----------+-------+--------+---------------------+
- Insert values using MySQL built-in functions
now()
,curdate()
,curtime()
.INSERT INTO `datetime_arena` (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`) VALUES ('Built-in Functions', now(), curdate(), curtime(), now(), now()); SELECT * FROM `datetime_arena` WHERE description='Built-in Functions'; +--------------------+---------------------+------------+----------+-------+--------+---------------------+ | description | cDateTime | cDate | cTime | cYear | cYear2 | cTimeStamp | +--------------------+---------------------+------------+----------+-------+--------+---------------------+ | Built-in Functions | 2010-04-08 14:45:48 | 2010-04-08 | 14:45:48 | 2010 | 10 | 2010-04-08 14:45:48 | +--------------------+---------------------+------------+----------+-------+--------+---------------------+
- Insert invalid or out-of-range values. MySQL replaces with all zeros.
INSERT INTO `datetime_arena` (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`) VALUES ('Error Input', '2001-13-31 23:59:59', '2002-13-31', '12:61:61', '99999', '999'); SELECT * FROM `datetime_arena` WHERE description='Error Input'; +-------------+---------------------+------------+----------+-------+--------+---------------------+ | description | cDateTime | cDate | cTime | cYear | cYear2 | cTimeStamp | +-------------+---------------------+------------+----------+-------+--------+---------------------+ | Error Input | 0000-00-00 00:00:00 | 0000-00-00 | 00:00:00 | 0000 | 00 | 2010-04-08 14:46:10 | +-------------+---------------------+------------+----------+-------+--------+---------------------+
- An useful built-in function
INTERVAL
can be used to compute a future date, e.g.,SELECT `cDate`, `cDate` + INTERVAL 30 DAY, `cDate` + INTERVAL 1 MONTH FROM `datetime_arena`; +------------+---------------------------+----------------------------+ | cDate | `cDate` + INTERVAL 30 DAY | `cDate` + INTERVAL 1 MONTH | +------------+---------------------------+----------------------------+ | 2002-02-02 | 2002-03-04 | 2002-03-02 | | 2010-04-08 | 2010-05-08 | 2010-05-08 | | 0000-00-00 | NULL | NULL | +------------+---------------------------+----------------------------+
Example (Creation and Last Update TIMESTAMP): In MySQL, you can have only one
TIMESTAMP
column with DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
. To maintain both the creation and last updated timestamp, set the creation column to default 0 and insert a NULL
to get the CURRENT_TIMESTAMP
.CREATE TABLE timestamp_arena ( `desc` VARCHAR(30), created TIMESTAMP DEFAULT 0, last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP); -- INSERT NULL to created which results in CURRENT_TIMESTAMP INSERT INTO timestamp_arena VALUE ('Created', NULL, NULL); SELECT * FROM timestamp_arena; +---------+---------------------+---------------------+ | desc | created | last_updated | +---------+---------------------+---------------------+ | Created | 2012-10-29 20:22:06 | 2012-10-29 20:22:06 | +---------+---------------------+---------------------+ UPDATE timestamp_arena SET `desc`='Updated'; SELECT * FROM timestamp_arena; +---------+---------------------+---------------------+ | desc | created | last_updated | +---------+---------------------+---------------------+ | Updated | 2012-10-29 20:22:06 | 2012-10-29 20:23:04 | +---------+---------------------+---------------------+
2.10 Type Casting
To parse a string (of digits) into numbers, use function
CAST(... AS type)
or +0
. For example,mysql> SET @aStr = '1234'; mysql> SELECT @aStr, @aStr+0, CAST(@aStr AS UNSIGNED); +-------+---------+-------------------------+ | @aStr | @aStr+0 | CAST(@aStr AS UNSIGNED) | +-------+---------+-------------------------+ | 1234 | 1234 | 1234 | +-------+---------+-------------------------+
2.11 Spatial Data Types
- POINT: A point in (x, y) pair.
- LINESTRING:
- POLYGON:
- GEOMETRY:
- MULTIPOINT:
- MULTILINESTRING:
- MULTIPOLYGON:
- GEOMETRYCOLLECTION:
0 comments:
Post a Comment