Monday, 24 December 2018

MySQL Data Types

2.  MySQL Data Types

As a programmer, understanding the data types is curial in understanding the working of the underlying database system.
MySQL supports many data types, grouped in 3 categories:
  1. Numeric: including integers, floating-point numbers, bits and boolean.
  2. String: including fixed-length and variable-length strings, binary data, and collections (enumeration and set).
  3. Date/Time: Date and time are extremely important in database applications. This is because business records often carry date/time information (e.g., orderDatepaymentDate). There is also a need to time-stamp the creation and last update of records for auditing and security considerations. MySQL provides many date/time data types (e.g., DATETIMEDATETIMEYEARTIMESTAMP) and built-in functions for date/time manipulation.

2.1  Numeric - Integers

Integers, by default, are signed integers; unless UNSIGNED is declared.
You could set the display width, using the syntax INTEGER_TYPE(n), where n is the display field-width of up to 255, e.g., INT(10). You could also specify ZEROFILL to pad the displayed numbers with leading zeros (for UNSIGNED only) instead of blanks. The field-width affects only the display, and not the number stored.
MySQL supports many integer types with various precisions and ranges.
  • TINYINT: 8-bit precision. The range is [-128, 127] for signed integer, and [0, 255] for unsigned.
  • SMALLINT: 16-bit precision. The range is [-32768, 32767] for signed integer, and [0, 65535] for unsigned.
  • MEDIUMINT: 24-bit precision. The range is [-8388608, 8388607] for signed integer, and [0, 16777215] for unsigned.
  • INT (or INTEGER): 32-bit precision. The range is [-2147483648, 2147483647] for signed integer, and [0, 4294967295] for unsigned.
  • BIGINT: 64-bit precision. The range is [-9223372036854775808, 9223372036854775807] for signed integer, and [0, 18446744073709551615] for unsigned.
  • BIT(n): A n-bit column. To input a bit-value, use the syntax b'value' or 0bvalue, e.g., b'1001'0b10010.
  • BOOLEAN (or BOOL): same as TINYINT(1) (with range of -128 to 127, display field-width of 1). Value of zero is considered false; non-zero is considered true. You could also use BIT(1) to store boolean value.
Choosing the right integer type is important for optimizing storage usage and computational efficiency. For example, for a integer column with a range of 1 to 9999, use SMALLINT(4) UNSIGNED: for 1 to 9999999, use MEDIUMINT(7) UNSIGNED (which shall be sufficient for a small business, but always allow more room for future expansion).
Integers (and floating-point numbers to be discussed later) could be declared as AUTO_INCREMENT, with default starting value of 1. When you insert a NULL (recommended) (or 0, or a missing value) into an AUTO_INCREMENT column, the maximum value of that column plus 1 would be inserted. You can also insert any valid value to an AUTO_INCREMENT column, bypassing the auto-increment. Take note that further INSERT would fail if the last INSERT has reached the maximum value. Hence, it is recommended to use INT for AUTO_INCREMENT column to avoid handling over-run. There can only be one AUTO_INCREMENT column in a table and the column must be defined as a key.
Example (Testing the Integer Data Types): Read "integer_arena.sql".
Example (Testing AUTO_INCREMENT): autoincrement_arena.sql.

2.2  Numeric - Fixed-Point & Floating-Point Numbers

MySQL supports both approximated floating points (FLOAT and DOUBLE) and exact fixed-point point (DECIMAL).
  • FLOAT: 32-bit single precision floating-point numbers. You can specify UNSIGNED to disallow negative number, and ZEROFILL to pad the displayed number with zeros.
  • DOUBLE (or DOUBLE PRECISION or REAL): 64-bit double precision floating-point numbers.
  • DECIMAL(nd) (or DEC or NUMERIC or FIXED): fixed-point decimal numbers, where n is the number of digits with d decimal places. For example, DECIMAL(6, 2) specifies 6 total digits (not including the decimal point) with 2 digit after the decimal point, which has the range of -9999.99 to 9999.99.
    Unlike INT(n), where n specifies the display field-width and does not affect the number stored; n in DECIMAL(nd) specifies the range and affects the number stored.
    The fixed-point DECIMAL is not available in popular languages such as C/C++/Java/C#. DECIMAL is an exact representation, as it is represented as integer with a fix decimal point. On the other hand, FLOAT and DOUBLE values are approximated and inexact, which create problems in comparisonand cause rounding errors in arithmetic operations. Attempt to compare inexact values using '=' and '<>' leads to unexpected result.
    DECIMAL type was created for representing currency, which is exact and does not suffer from rounding errors in addition and subtraction. For example, DECIMAL(n, 2)DECIMAL values can be compared with '=' or '<>'.
  • FLOAT(nd) or DOUBLE(nd): Stored with n total digits with d decimal places. Exact or Approximated?? Advisable to use DECIMAL(nd).
The fixed-point and floating-point numbers can also be declared as UNSIGNED to exclude negative numbers.
Example (Testing the Floating-point Data Type): Read "floatingpoint_arena.sql".

2.3  Character Sets and Collations

Strings are made up of a sequence of characters. Before presenting the various string data types, it is important to understand the so-called character set and collating sequence (or collation).
Character Sets
character set (also called charsetcharacter encodingcharacter mapcode page) defines a set of character and maps each character to a unique numeric code. For example, in ASCII, character 'A' is assigned 65, 'a' is 97, '0' is 48 and space is 32. The most commonly-used character sets are the 8-bit Latin-1 (ISO/IEC 8859-1) for English characters (which is backward compatible with 7-bit ASCII), and multi-byte Unicode (ISO/IEC 10646) for internationalization. Unicode can be encoded in variable-length UTF8 (1-3 bytes) for fixed-length UCS2 (2 bytes).
MySQL's UCS-2 is in big-endian without BOM (Byte Order Mark), i.e., UCS-2BE or UTF2BE. MySQL uses no BOM for UTF-8.
Collating Sequences
collating sequence (or collation) refers to the orders in which individual characters should be ranked in comparing or sorting the strings. In other words, a collation is a set of rules for ranking characters in a character set.
A string may contain uppercase and lowercase characters, digits and symbols. As an example, suppose we wish to sort three strings: "apple""BOY", and "Cat" encoded in ASCII. If we rank the characters according to the underlying ASCII code numbers, the order would be "BOY""apple" and"Cat". It is because uppercase letters have smaller code numbers than lowercase letters in ASCII code. Ranking strings according to their underlying code numbers is known as binary collation. Binary collation does not agree with the so-called dictionary order, where the same uppercase and lowercase letters have the same rank.
There are many other collating sequences available. Some case-insensitive (ci) dictionary-order collating sequences have the same rank for the same uppercase and lowercase letter, i.e., 'A''a' ⇒ 'B''b' ⇒ ... ⇒ 'Z''z'. Some case-sensitive (cs) dictionary-order collating sequences put the uppercase letter before its lowercase counterpart, i.e., 'A' ⇒'a' ⇒ 'B' ⇒ 'b' ⇒ ... ⇒ 'Z' ⇒ 'z'. Typically, space is ranked before digits '0' to '9', followed by the alphabets.
Collating sequence is also language dependent, as different languages use different sets of characters, e.g., a, A, α, Ä, á, é. It becomes more complicated if more than one languages are used, e.g., you need to decide how to rank 'a' and 'α', by choosing an appropriate collating sequence.
MySQL Character Sets and Collations
MySQL 5.5 supports 39 character sets with 197 collations.
You could use command SHOW CHARACTER SET to display all the supported character sets and their default collation; and SHOW COLLATION to display all the collations and the charsets their support.
mysql> SHOW CHARACTER SET;
+----------+-----------------------------+---------------------+--------+
| Charset  | Description                 | Default collation   | Maxlen |
+----------+-----------------------------+---------------------+--------+
| latin1   | cp1252 West European        | latin1_swedish_ci   |      1 |
| utf8     | UTF-8 Unicode               | utf8_general_ci     |      3 |
| ucs2     | UCS-2 Unicode               | ucs2_general_ci     |      2 |
| ascii    | US ASCII                    | ascii_general_ci    |      1 |
  ......
39 rows in set (0.06 sec)
 
mysql> SHOW COLLATION;
+--------------------------+----------+-----+---------+----------+---------+
| Collation                | Charset  | Id  | Default | Compiled | Sortlen |
+--------------------------+----------+-----+---------+----------+---------+
| big5_chinese_ci          | big5     |   1 | Yes     | Yes      |       1 |
| big5_bin                 | big5     |  84 |         | Yes      |       1 |
 ......
197 rows in set (0.01 sec)
The default character set in MySQL is latin1 (aka ISO-8859-1). The default collation for latin1 is latin1_swdish_ci, which is a case-insensitive collation suitable for Swedish/finnish (MySQL was created by some Swedishs). You can list the available collations for latin1, as follows:
mysql> SHOW COLLATION LIKE 'latin1%';
+-------------------+---------+----+---------+----------+---------+
| Collation         | Charset | Id | Default | Compiled | Sortlen |
+-------------------+---------+----+---------+----------+---------+
| latin1_german1_ci | latin1  |  5 |         | Yes      |       1 |
| latin1_swedish_ci | latin1  |  8 | Yes     | Yes      |       1 |
| latin1_danish_ci  | latin1  | 15 |         | Yes      |       1 |
| latin1_german2_ci | latin1  | 31 |         | Yes      |       2 |
| latin1_bin        | latin1  | 47 |         | Yes      |       1 |
| latin1_general_ci | latin1  | 48 |         | Yes      |       1 |
| latin1_general_cs | latin1  | 49 |         | Yes      |       1 |
| latin1_spanish_ci | latin1  | 94 |         | Yes      |       1 |
+-------------------+---------+----+---------+----------+---------+
latin1_general_ci is a case-insensitive collation for multilingual western european. latin1_general_cs is case-sensitivelatin1_bin is the binary collation according to latin1 encoding.
I recommend that you use utf8 charset for applications that require internationalization (i18n) support (because utf8 supports all the languages in this world). The default collation for utf8 is utf8_general_ci. Other collations for utf8 includes:
mysql> SHOW COLLATION LIKE 'utf8%';
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8_general_ci    | utf8    |  33 | Yes     | Yes      |       1 |
| utf8_bin           | utf8    |  83 |         | Yes      |       1 |
| utf8_unicode_ci    | utf8    | 192 |         | Yes      |       8 |
| utf8_icelandic_ci  | utf8    | 193 |         | Yes      |       8 |
| utf8_latvian_ci    | utf8    | 194 |         | Yes      |       8 |
| utf8_romanian_ci   | utf8    | 195 |         | Yes      |       8 |
  ......
You may want to try out some of these collations by ranking space, numbers, uppercase and lowercase letters, and some typical strings.
Specifying Character Set and Collation in MySQL
You could specify the character set and collation via keyword CHARACTER SET (or CHARSET) and COLLATE. They could be set at 4 levels: server, database, table, and column.
To set the charset and collation for the server, start mysqld with --character-set-server=charset and --collation-server=collation options. The default charset is latin1.
You can set the default charset and collation for an database in CREATE DATABASE. For example,
CREATE DATABASE databaseName
DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;
You can set the character set and collation for table and column in CREATE TABLE. For example,
CREATE TABLE IF NOT EXISTS StringArena (
   cString1 VARCHAR(50) CHARACTER SET latin1 COLLATE latin1_general_cs,  -- case-sensitive 'A' -> 'a' -> 'B' -> 'b'
   cString2 CHAR(10) CHARACTER SET latin1,                               -- default collation of latin1_swedish_ci
   cString3 TEXT
) DEFAULT CHARACTER SET utf8 COLLATE utf8_unicode_ci;                    -- case-insensitive 'A' and 'a' are the same
In the above example, the default character set for all the string columns for the table is utf8 (for unicode support) with collating sequence of utf8_unicode_ci (default). However, column cString1 uses character set latin1 with collation latin1_general_cs; column cString2 uses character set Latin1 with default collation latin1_swedish_ci (not from table default but from the charset default); column cString3 uses the default character set and collation of the table.
Example (Testing Character Sets and Collations): Read "charset_arena.sql".
Character Set and Collation Related System Variables
There are quite a number of system variables related to charset and collation, with names starting with "character_set_" and "collation_", respectively. You can list them as follows:
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
            --  Use the escape '\_' for '_'
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |   <= default charset for database
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |   <= default charset for server
| character_set_system     | utf8   |   <= meta-data
+--------------------------+--------+
7 rows in set (0.00 sec)
   
mysql> SHOW VARIABLES LIKE 'collation\_%';
+----------------------+-------------------+
| Variable_name        | Value             |
+----------------------+-------------------+
| collation_connection | latin1_swedish_ci |
| collation_database   | latin1_swedish_ci |  <= default collation for database
| collation_server     | latin1_swedish_ci |  <= default collation for server
+----------------------+-------------------+
3 rows in set (0.00 sec)
The system variables character_set_servercharacter_set_database and collation_servercollation_database maintain the default charset and collation for the server and the current database respectively.
In MySQL, columns, tables, databases may use different character sets. However, the meta-data (such as column names and attributes) must be stored in the same character set, and include all languages. Hence, meta-data, by default, stored in UTF-8, as maintained in the system variable character_set_system.
Character Set for Client-Server Communication
A client connects to the server via a so-called connection object. The client sends the SQL statements over the connection to the server. The server returns the results via the connection to the client. Server maintains a connection for each client, and maintains connection-related character set and collation system variables for each client connection.
The client sends its statements using character_set_client. They are translated to character_set_connection, and uses collation_connection for comparing string literals. The server returns the results to the client using character_set_results. You can use command "SET NAMEScharset" (e.g., "SET NAMES 'utf8'") to change the character set used for client-server communication. "SET NAMES", in effect, changes system variables character_set_clientcharacter_set_connection, and character_set_results. For example,
-- Default charset for client, connection, and results is latin1
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | latin1 |
| character_set_connection | latin1 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | latin1 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)
   
mysql> SET NAMES 'utf8';
Query OK, 0 rows affected (0.00 sec)
   
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | utf8   |  <==
| character_set_connection | utf8   |  <==
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | utf8   |  <==
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)
When a client (e.g., mysqlmysqlshowmysqladmin) connects to the server, it sends the name of the character set that it wants to use to the server (default is latin1; or specify in --default-character-set startup option of the client). The server uses this name to set the character_set_clientcharacter_set_results, and character_set_connection system variables for this client connection. In effect, the server performs a "SET NAMES".

2.4  String Data Types

MySQL supports fixed-length and variable-length string types:
  • CHAR(n): fix-length string of n characters, right-pad with spaces. The maximum n is 255 characters.
  • VARCHAR(n): variable-length string of up to n characters. The maximum n is 65535 bytes of storage (utf8 characters range from 1-4 bytes).
  • TINYTEXT: up to 255 Bytes
  • TEXT: up to 64 KBytes
  • MEDIUMTEXT: up to 16 MBytes
  • LONGTEXT: up to 4 GBytes
Example: [TODO]
String Literals and String Introducer
A string literal has on optional charset introducer and collate clause. The introducer tells the parser that the string that is followed uses a certain character set.
// Syntax
[_charsetName]'string' [COLLATE collationName]
 
// Example
SELECT 'Test String';
SELECT _latin1'Test String';
SELECT _latin1'Test String' COLLATE ...
 
SELECT HEX('abc');         -- 616263
SELECT _latin1 x'616263';  -- 'abc': can express string in hex code
SELECT HEX(_utf8'abc');    -- 616263
CONVERT(... USING ...) Function
CONVERT(expr USING charset) can be used to convert string between different character sets, e.g.,
SELECT CONVERT(_latin1'....' USING utf8);
INSERT INTO utf8Table (utf8Column) VALUES
   ((SELECT CONVERT(latin1Column USING utf8) FROM latin1Table)));
String introducer merely interprets the string literal and does not change its value; whereas CONVERT() returns a new value of the specified type. For example,
-- default charset is latin1
 
mysql> SELECT HEX(CONVERT('abc' USING ucs2));
+--------------------------------+
| HEX(CONVERT('abc' USING ucs2)) |
+--------------------------------+
| 006100620063                   |
+--------------------------------+
    -- Returns a new UCS2 representation.
    -- Each character in UCS2 takes two bytes
 
mysql> SELECT HEX(_ucs2'abc'), _ucs2'abc';
+-----------------+-----+
| HEX(_ucs2'abc') | ?   |
+-----------------+-----+
| 00616263        | ??  |
+-----------------+-----+
    -- Interpret 'abc' (or x'616263') as two UCS2 characters

2.5  UTF8 Charset

For internationalization, UTF8 charset shall be used.
UTF8-Encoded Chinese Characters in Windows' CMD Shell
In brief, do not use CMD shell to work on UTF8 charset. Use a tool such as NetBeans, Eclipse, MySQL workbench or PhpMyAdmin which fully supports UTF8 charset.
If you want to display a text file encoded in a particular character set (e.g., ASCII, Latin1, UTF8) in CMD correctly, you need to choose a matching "codepage". For example, ANSI/Latin1 requires codepage 1252, Simplified Chinese GBK (GB2312) requires 936, UTF8 requires 65001, UCS2LE requires 1200, UCS2BE requires 1201, and the original DOS extended-ASCII uses codepage 437.
CMD is NOT able to detect the character set automatically and switch to the matching codepage. To set the codepage in CMD, issue command "chcp color-page-number", e.g. "chcp 65001" to choose UTF8 codepage. This codepage could display UTF8 text correctly, and display garbage if the text is encoded using other character set, including UCS2, GB2312.
You also have to choose a font (such as Consolas, Lucida Console, but NOT raster font) that supports the characters. To choose the font for CMD, click on the CMD icon ⇒ "Properties" ⇒ "Font".
There are a number of character encoding schemes for Chinese characters, e.g., Unicode (UTF8, UCS2GE, UCS2LE), GB2312, GBK and BIG5, which are not compatible.
Unicode is an international standard, which supports all languages, including Chinese, Japanese and Korean (CJK). Unicode has two frequently-used encoding schemes: UTF8 (1-3 bytes variable-length, the 1-byte codes are backward-compatible with ASCII) and UCS2 (2 byte fixed-length). UTF8 is ideal if your text consists of mainly ASCII (English text), with occasional non-ASCII text. I strongly recommend using UTF8 character set for MySQL columns that require internationalization support. GB2312 is a Chinese national standard for simplified chinese, it is not compatible with UCS2 or UTF8. For example, for the chinese characters "您好", the UCS2 is "60A8 597D", the UTF8 is "E682A8 E5A5BD", the GB1232 is "C4FA BAC3".
Unfortunately, in CMD, the Microsoft Pinyin IME (MSPY), the Chinese character input tool, is disabled for codepage 65001 (UTF8) (?!?!). MSPY works on GB2312 (codepage 936). If you run your MySQL client in CMD, using codepage 936 (GB2312) so that you can input chinese characters (e.g., for INSERT and UPDATE), but your database character columns are using UTF8, you need to handle this situation with great care.
In MySQL client, you can issue the command "SET NAMES 'charset'" to specify the character set used for the client/server communication. (Alternatively, you could start 'mysql' client with the --default-character-set=charset option.)
In the above situation, you need to issue "SET NAMES 'gb2312'", as your input is encoded in GB2312, which is sent to the server. The server receives the characters in GB2312, converts them into the character set defined for the column, which is UTF8, before storing them. During query, the server retrieves the characters from the columns in UTF8, convert to GB2312 (according to the SET NAMES), and send to the client in CMD, which uses codepage 936 to correctly display the text. Try the following commands:
mysql> SET NAMES 'gb2312';
Query OK, 0 rows affected (0.00 sec)
   
mysql> SHOW VARIABLES LIKE 'character\_set\_%';
+--------------------------+--------+
| Variable_name            | Value  |
+--------------------------+--------+
| character_set_client     | gb2312 |
| character_set_connection | gb2312 |
| character_set_database   | latin1 |
| character_set_filesystem | binary |
| character_set_results    | gb2312 |
| character_set_server     | latin1 |
| character_set_system     | utf8   |
+--------------------------+--------+
7 rows in set (0.00 sec)
Observe that SET NAMES affects the character_set_clientcharacter_set_connection, and character_set_results.
Suppose that you interacts with the server via a Java program, written in Eclipse or others. You should set the encoding to UTF8, and include "SET NAMES 'utf8'" in your MySQL script.
Similarly, suppose that you are running a SQL script in batch mode or using source command in CMD, and your script in encoded in UTF8, then you should include "SET NAMES 'utf8'" in the script ("SET NAMES 'gb2312'" results in garbage inserted into the database). But to query interactively in CMD using code page 936, you need to "SET NAMES 'gb2312'" to convert UTF8 to GB2312 (cp936).
I would expect that, in CMD using codepage 65001, I could display the UTF8 column from MySQL correctly with "SET NAMES 'utf8'". But it does not work in my computer (?!?!).
Take note that you could use HEX() function to request the hex code of the characters, stored in the database.
SELECT `id`, `description`, HEX(`description`) FROM `ChineseUtf8Arena` ORDER BY `description`, `id`;
Example (Testing UTF8-encoded Chinese Characters): 
Notes on Using UTF8-Encoded Characters in Eclipse/Java
  1. Open Run Dialog ⇒ "your application" ⇒ Common Tab ⇒ Console Encoding ⇒ Other ⇒ "set it to UTF-8".
  2. Open Run Dialog ⇒ "your application" ⇒ Arguments Tab ⇒ VM Arguments ⇒ Add "-Dfile.encoding=UTF-8"; or add "-Dfile.encoding=UTF-8" to "eclipse.ini".
  3. Window Menu ⇒ General ⇒ Workspace ⇒ Text file encoding ⇒ set to "UTF-8".

2.6  Binary Data and BLOB

MySQL support both character string (a sequence of characters) and binary string (a sequence of bytes). Binary data (e.g., images and codes) are special string with character set of binary, that is, all characters are treated as binary raw bytes.
  • BINARY(n): Similar to CHAR(n), but store binary data with fixed-length of n bytes.
  • VARBINARY(n): Similar to VARCHAR(n), but store binary data of variable-length of up to n bytes.
  • TINYBLOB: up to 255 bytes
  • BLOB: up to 64KBytes
  • MEDIUMBLOB: up to 16MBytes
  • LONGBOLB: up to 4GBytes
Example: The LOAD_FILE() function, which reads from a file and returns a string, can be used to load a BLOB/TEXT field with a binary/text file.
CREATE TABLE IF NOT EXISTS `blob_arena` (
   `id`     INT UNSIGNED ZEROFILL NOT NULL AUTO_INCREMENT,
   `photo`  BLOB,
    PRIMARY KEY(`ID`)
);
   
INSERT INTO `blob_arena` (`photo`) VALUES (LOAD_FILE('d:/path/test.jpg'));
Example: Storing a thumbnail image [Refer to Rental Database Example].

2.7  ENUM

ENUM is a special string type. An ENUM (or enumeration) is a string with a value chosen from a list of allowed values (or members). The list of allowed values are defined explicitly in the column definition of the CREATE TABLE command. For example,
CREATE TABLE Shoes (
   brand           ENUM('abc', 'xyz', 'qqq');
   size            ENUM('S', 'M', 'L', 'XL', 'XXL');
   customerRating  ENUM('1', '2', '3', '4', '5');
);
The value must be a quoted string literal. Duplicate values are not allowed. NULL is allowed only if the column is declared to allow NULL. If you insert a value not in the ENUM list, an empty string ('') would be inserted, which signals an error value.
Each ENUM type is associated with an index, beginning with 1 for the first allowable value. The empty string has index of 0 (to denotes an error value). The index of NULL is NULL.
ENUM is typically represented in 16-bit, which allows 65,535 members.
Example (Testing ENUM Type):
  1. Create a table with ENUM columns. Observe that ENUM's values are quoted string literals. Numbers must be stored as quoted strings.
    CREATE TABLE IF NOT EXISTS `enum_arena` (
       `desc`            VARCHAR(50) NULL,
       `brand`           ENUM('abc', 'xyz', 'qqq') NOT NULL DEFAULT 'qqq',
       `size`            ENUM('S', 'M', 'L', 'XL', 'XXL') NOT NULL,
       `customerRating`  ENUM('5', '4', '3', '2', '1') DEFAULT NULL
    );
      
    DESCRIBE `enum_arena`;
    +----------------+------------------------------+------+-----+---------+-------+
    | Field          | Type                         | Null | Key | Default | Extra |
    +----------------+------------------------------+------+-----+---------+-------+
    | desc           | varchar(50)                  | YES  |     | NULL    |       |
    | brand          | enum('abc','xyz','qqq')      | NO   |     | qqq     |       |
    | size           | enum('S','M','L','XL','XXL') | NO   |     | NULL    |       |
    | customerRating | enum('5','4','3','2','1')    | YES  |     | NULL    |       |
    +----------------+------------------------------+------+-----+---------+-------+
  2. Insert values into ENUM fields. Check case-sensitivity and values not in the enum list. Observe that the ENUM string is NOT case-sensitive (in MySQL). Empty string (''), denoting erroneous value, would be inserted for value not in the ENUM list.
    INSERT INTO `enum_arena` 
       (`description`, `brand`, `size`)
    VALUES
       ('Valid value', 'abc', 'm'),
       ('Uppercase value', 'XYZ', 'M'),
       ('Invalid value', 'qqq', 'unknown');
       
    UPDATE `enum_arena` SET customerRating='3';
       
    SELECT * FROM `enum_arena`;
    +-----------------+-------+------+----------------+
    | description     | brand | size | customerRating |
    +-----------------+-------+------+----------------+
    | Valid value     | abc   | M    | 3              |
    | Uppercase value | xyz   | M    | 3              |
    | Invalid value   | qqq   |      | 3              |
    +-----------------+-------+------+----------------+
  3. Using numeric index as ENUM's value: Instead of the literal string value, you could also use the numeric index associated with each ENUM type. The numeric index begins at 1, for the first value is the list.
    INSERT INTO `enum_arena` VALUES
       ('Use numeric index', 'qqq', 2, 1);
       
    SELECT * FROM `enum_arena` WHERE `description`='Use numeric index';
    +-------------------+-------+------+----------------+
    | description       | brand | size | customerRating |
    +-------------------+-------+------+----------------+
    | Use numeric index | qqq   | M    | 5              |
    +-------------------+-------+------+----------------+
  4. Selecting the error records: The empty string (erroneous value) has index of 0 (error code). You could use the error index of 0 to retrieve the erroneous records as follows.
    SELECT * FROM Shirts WHERE brand=0 OR size=0 OR customerRating=0;
    +-------+------+----------------+
    | brand | size | customerRating |
    +-------+------+----------------+
    | qqq   |      | 3              |
    +-------+------+----------------+
  5. To retrieve all the allowable values of an ENUM, you could query the MySQL system database information_schema, table columns, as follows:
    SELECT column_type FROM information_schema.columns
    WHERE table_schema='test' AND table_name='enum_arena' AND column_name='brand';
    +-------------------------+
    | column_type             |
    +-------------------------+
    | enum('abc','xyz','qqq') |
    +-------------------------+

2.8  SET

SET is also a special string type. SET is similar to ENUM, but you could choose zero (empty string) or more values from the allowable values; or NULL if configured. For a SET ('a''b''c'), the valid values are '' (empty string), 'a''b''c''a,b''a,c''b,c' and 'a,b,c'. Commas are used to delimit multiple values, with no spaces around the commas.
A set can have a maximum of 64 members. In MySQL, a set values are stored as a 64-bit integer, with the least-significant bit corresponding to the first member. For example, for a SET ('a''b''c'), selected numeric value are:
SET memberBinary ValueDecimal Value
'a'0011
'b'0102
'c'1004
''0000
'a,b,c'1117
Similar to ENUM, you can use the numeric value.
Example (Testing SET Data Type):
[TODO] Explanation.

2.9  Date/Time

Date and time (as well as currency) are of particular interest for database applications. It is because business records often carry date/time information (e.g., orderDatedeliveryDatepaymentDate), as well as the need to time-stamp the last update of the records for auditing and security.
MySQL provides these built-in functions for getting the current date and/or time:
  • NOW(): returns the current date and time in the format of 'YYYY-MM-DD HH:MM:SS'.
  • CURDATE() (or CURRENT_DATE(), or CURRENT_DATE): returns the current date in the format of 'YYYY-MM-DD'.
  • CURTIME() (or CURRENT_TIME(), or CURRENT_TIME): returns the current time in the format of 'HH:MM:SS'.
For examples,
mysql> select now(), curdate(), curtime();
+---------------------+------------+-----------+
| now()               | curdate()  | curtime() |
+---------------------+------------+-----------+
| 2010-04-08 16:28:06 | 2010-04-08 | 16:28:06  |
+---------------------+------------+-----------+
MySQL provides these date/time data type:
  • DATETIME: stored both date and time in the format of 'YYYY-MM-DD HH:MM:SS' where the date and time are separated by a space. The valid range is '1000-01-01 00:00:00' to '9999-12-31 23:59:59'. You could apply functions NOW() or CURDATE() (time will be set to '00:00:00') on this field, but not CURTIME().
  • DATE: stored date only in the format of 'YYYY-MM-DD'. The range is '1000-01-01' to '9999-12-31'. You could use CURDATE() or NOW() (the returned time portion discarded) for this field.
  • TIME: stored time only in the format of 'HH:MM:SS'. You could use CURTIME() or NOW() (the returned date portion discarded) for this field.
  • YEAR(4|2): in 'YYYY' or 'YY'. The range of years is 1901 to 2155. Use DATE type for year outside this range. You could also use CURDATE() to retrieve the current year value for this field (month and day discarded).
  • TIMESTAMP: similar to DATETIME but stored the number of seconds since January 1, 1970 UTC (Unix-style). The range is '1970-01-01 00:00:00' to '2037-12-31 23:59:59'.
    The differences between DATETIME and TIMESTAMP are:
    1. the range,
    2. support for time zone,
    3. TIMESTAMP column could be declared with DEFAULT CURRENT_TIMESTAMP to set the default value to the current date/time. (All other data types' default, including DATETIME, must be a constant and not a function return value). You can also declared with ON UPDATE CURRENT_TIMESTAMP to capture the timestamp of the last update.
The date/time value can be entered manually as a string literal (e.g., '2010-12-31 23:59:59' for DATAETIME). MySQL will issue a warning and insert all zeros (e.g., '0000-00-00 00:00:00' for DATAETIME), if the value of date/time to be inserted is invalid or out-of-range. '0000-00-00' is called a "dummy" date.
Example (Testing DATE/TIME Data Types):
  1. Create a table with various date/time columns. Only the TIMESTAMP column can have the DEFAULT and ON UPDATE as CURRENT_TIMESTAMP.
    CREATE TABLE IF NOT EXISTS `datetime_arena` (
       `desc`        VARCHAR(50)  DEFAULT NULL,
       `cDateTime`   DATETIME     DEFAULT '0000-00-00 00:00:00',
       `cDate`       DATE         DEFAULT '0000-00-00',
       `cTime`       TIME         DEFAULT '00:00:00',
       `cYear`       YEAR         DEFAULT '0000',
       `cYear2`      YEAR(2)      DEFAULT '00',
       `cTimeStamp`  TIMESTAMP    DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
    );
       
    DESCRIBE `datetime_arena`;
    +-------------+-------------+------+-----+---------------------+-----------------------------+
    | Field       | Type        | Null | Key | Default             | Extra                       |
    +-------------+-------------+------+-----+---------------------+-----------------------------+
    | description | varchar(50) | YES  |     | NULL                |                             |
    | cDateTime   | datetime    | YES  |     | 0000-00-00 00:00:00 |                             |
    | cDate       | date        | YES  |     | 0000-00-00          |                             |
    | cTime       | time        | YES  |     | 00:00:00            |                             |
    | cYear       | year(4)     | YES  |     | 0000                |                             |
    | cYear2      | year(2)     | YES  |     | 00                  |                             |
    | cTimeStamp  | timestamp   | NO   |     | CURRENT_TIMESTAMP   | on update CURRENT_TIMESTAMP |
    +-------------+-------------+------+-----+---------------------+-----------------------------+
  2. Insert values manually using string literals.
    INSERT INTO `datetime_arena` 
       (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`)
    VALUES 
       ('Manual Entry', '2001-01-01 23:59:59', '2002-02-02', '12:30:30', '2004', '05');
       
    SELECT * FROM `datetime_arena` WHERE description='Manual Entry';
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | description  | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | Manual Entry | 2001-01-01 23:59:59 | 2002-02-02 | 12:30:30 |  2004 |     05 | 2010-04-08 14:44:37 |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
  3. Checking the on-update for TIMSTAMP.
    UPDATE `datetime_arena` SET `cYear2`='99' WHERE description='Manual Entry';
       
    SELECT * FROM `datetime_arena` WHERE description='Manual Entry';
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | description  | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
    | Manual Entry | 2001-01-01 23:59:59 | 2002-02-02 | 12:30:30 |  2004 |     99 | 2010-04-08 14:44:48 |
    +--------------+---------------------+------------+----------+-------+--------+---------------------+
  4. Insert values using MySQL built-in functions now()curdate()curtime().
    INSERT INTO `datetime_arena` 
       (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`)
    VALUES
       ('Built-in Functions', now(), curdate(), curtime(), now(), now());
       
    SELECT * FROM `datetime_arena` WHERE description='Built-in Functions';
    +--------------------+---------------------+------------+----------+-------+--------+---------------------+
    | description        | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +--------------------+---------------------+------------+----------+-------+--------+---------------------+
    | Built-in Functions | 2010-04-08 14:45:48 | 2010-04-08 | 14:45:48 |  2010 |     10 | 2010-04-08 14:45:48 |
    +--------------------+---------------------+------------+----------+-------+--------+---------------------+
  5. Insert invalid or out-of-range values. MySQL replaces with all zeros.
    INSERT INTO `datetime_arena` 
       (`description`, `cDateTime`, `cDate`, `cTime`, `cYear`, `cYear2`)
    VALUES
       ('Error Input', '2001-13-31 23:59:59', '2002-13-31', '12:61:61', '99999', '999');
       
    SELECT * FROM `datetime_arena` WHERE description='Error Input';
    +-------------+---------------------+------------+----------+-------+--------+---------------------+
    | description | cDateTime           | cDate      | cTime    | cYear | cYear2 | cTimeStamp          |
    +-------------+---------------------+------------+----------+-------+--------+---------------------+
    | Error Input | 0000-00-00 00:00:00 | 0000-00-00 | 00:00:00 |  0000 |     00 | 2010-04-08 14:46:10 |
    +-------------+---------------------+------------+----------+-------+--------+---------------------+
  6. An useful built-in function INTERVAL can be used to compute a future date, e.g.,
    SELECT `cDate`, `cDate` + INTERVAL 30 DAY, `cDate` + INTERVAL 1 MONTH FROM `datetime_arena`;
    +------------+---------------------------+----------------------------+
    | cDate      | `cDate` + INTERVAL 30 DAY | `cDate` + INTERVAL 1 MONTH |
    +------------+---------------------------+----------------------------+
    | 2002-02-02 | 2002-03-04                | 2002-03-02                 |
    | 2010-04-08 | 2010-05-08                | 2010-05-08                 |
    | 0000-00-00 | NULL                      | NULL                       |
    +------------+---------------------------+----------------------------+
Example (Creation and Last Update TIMESTAMP): In MySQL, you can have only one TIMESTAMP column with DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP. To maintain both the creation and last updated timestamp, set the creation column to default 0 and insert a NULL to get the CURRENT_TIMESTAMP.
CREATE TABLE timestamp_arena (
   `desc` VARCHAR(30),
   created TIMESTAMP DEFAULT 0,
   last_updated TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP);

-- INSERT NULL to created which results in CURRENT_TIMESTAMP
INSERT INTO timestamp_arena VALUE ('Created', NULL, NULL);

SELECT * FROM timestamp_arena;
+---------+---------------------+---------------------+
| desc    | created             | last_updated        |
+---------+---------------------+---------------------+
| Created | 2012-10-29 20:22:06 | 2012-10-29 20:22:06 |
+---------+---------------------+---------------------+

UPDATE timestamp_arena SET `desc`='Updated';

SELECT * FROM timestamp_arena;
+---------+---------------------+---------------------+
| desc    | created             | last_updated        |
+---------+---------------------+---------------------+
| Updated | 2012-10-29 20:22:06 | 2012-10-29 20:23:04 |
+---------+---------------------+---------------------+

2.10  Type Casting

To parse a string (of digits) into numbers, use function CAST(... AS type) or +0. For example,
mysql> SET @aStr = '1234';
   
mysql> SELECT @aStr, @aStr+0, CAST(@aStr AS UNSIGNED);
+-------+---------+-------------------------+
| @aStr | @aStr+0 | CAST(@aStr AS UNSIGNED) |
+-------+---------+-------------------------+
| 1234  |    1234 |                    1234 |
+-------+---------+-------------------------+

2.11  Spatial Data Types

  • POINT: A point in (x, y) pair.
  • LINESTRING:
  • POLYGON:
  • GEOMETRY:
  • MULTIPOINT:
  • MULTILINESTRING:
  • MULTIPOLYGON:
  • GEOMETRYCOLLECTION:

0 comments:

Post a Comment