SQL user guide

                                   

 

 

 

 

 

 

             SQL USER GUIDE

 

                                 

 

 

 

            

                                                                                                     

 

                                                                     

 

 

 

 

SQL User guide Table of Contents

  • Introduction
  • SQL DML and DDL
  • SQL Table
  • SQL SELECT Statement
  • SQL DISTINCT
  • SQL WHERE
  • SQL AND OR
  • SQL IN
  • SQL BETWEEN
  • SQL Wildcard
  • SQL LIKE
  • SQL ORDER BY
  • SQL Aggregate Functions
  • SQL Average
  • SQL COUNT
  • SQL MAX
  • SQL MIN
  • SQL SUM
  • SQL GROUP BY
  • SQL HAVING
  • SQL ALIAS
  • SQL JOIN
  • SQL OUTER JOIN
  • SQL CONCATENATE Function
  • SQL SUBSTRING Function
  • SQL TRIM Function
SQL Table Manipulation Functions
  • SQL CREATE TABLE
  • SQL CONSTRAINT
  • SQL NOT NULL
  • SQL DEFAULT
  • SQL UNIQUE
  • SQL CHECK
  • SQL PRIMARY KEY
  • SQL FOREIGN KEY
  • SQL View
  • SQL CREATE VIEW
  • SQL Index
  • SQL CREATE INDEX
  • SQL ALTER TABLE
  • SQL DROP TABLE
  • SQL TRUNCATE TABLE
  • SQL USE
  • SQL INSERT INTO
  • SQL UPDATE
  • SQL DELETE FROM


Introduction

SQL is short for Structured Query Language and is a widely used database language, providing means of data manipulation (store, retrieve, update, delete) and database creation.
Almost all modern Relational Database Management Systems like MS SQL Server, Microsoft Access, MSDE, Oracle, DB2, Sybase, MySQL, Postgres and Informix use SQL as standard database language. Now a word of warning here, although all those RDBMS use SQL, they use different SQL dialects. For example MS SQL Server specific version of the SQL is called T-SQL, Oracle version of SQL is called PL/SQL, MS Access version of SQL is called JET SQL, etc.
Our SQL User guide will teach you how to use commonly used SQL commands and you will be able to apply most of the knowledge gathered from this SQL User guide to any of the databases above.

SQL DML and DDL

SQL can be divided into two parts: The Data Manipulation Language (DML) and the Data Definition Language (DDL).
The query and update commands form the DML part of SQL:
  • SELECT - extracts data from a database
  • UPDATE - updates data in a database
  • DELETE - deletes data from a database
  • INSERT INTO - inserts new data into a database
The DDL part of SQL permits database tables to be created or deleted. It also define indexes (keys), specify links between tables, and impose constraints between tables. The most important DDL statements in SQL are:
  • CREATE DATABASE - creates a new database
  • ALTER DATABASE - modifies a database
  • CREATE TABLE - creates a new table
  • ALTER TABLE - modifies a table
  • DROP TABLE - deletes a table
  • CREATE INDEX - creates an index (search key)
  • DROP INDEX - deletes an index



SQL Table
SQL Database Tables are the foundation of every RDBMS (Relational Database Management System).
The foundation of every Relational Database Management System is a database object called table. Every database consists of one or more tables, which store the database’s data/information. Each table has its own unique name and consists of columns and rows.
The database table columns (called also table fields) have their own unique names and have a pre-defined data types. Table columns can have various attributes defining the column functionality (the column is a primary key, there is an index defined on the column, the column has certain default value, etc.).
While table columns describe the data types, the table rows contain the actual data for the columns.
Here is an example of a simple database table, containing customer’s data. The first row, listed in bold, contains the names of the table columns:
Table: Customers
First Name
Last Name
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
Now that we’ve learned what a database table is, we can continue with our sql User guide and learn how to manipulate the data within the database tables.







SQL SELECT
Learn how to use the SELECT SQL statement to retrieve data from a SQL database table.
The SQL SELECT statement is used to select data from a SQL database table. This is usually the very first SQL command every SQL newbie learns and this is because the SELECT SQL statement is one of the most used SQL commands.
Please have a look at the general SQL SELECT syntax:

SELECT Column1, Column2, Column3,
FROM Table1
The list of column names after the SQL SELECT command determines which columns you want to be returned in your result set. If you want to select all columns from a database table, you can use the following SQL statement:

SELECT *
FROM Table1
When the list of columns following the SELECT SQL command is replaced with asterix (*) all table columns are returned. Word of caution here, it’s always better to explicitly specify the columns in the SELECT list, as this will improve your query performance significantly.
The table name following the SQL FROM keyword (in our case Table1) tells the SQL interpreter which table to use to retrieve the data.







SQL SELECT INTO
Learn how to use the SQL SELECT INTO statement to copy data between database tables.
The SQL SELECT INTO statement is used to select data from a SQL database table and to insert it to a different table at the same time.
The general SQL SELECT INTO syntax looks like this:

SELECT Column1, Column2, Column3,
INTO Table2
FROM Table1
The list of column names after the SQL SELECT command determines which columns will be copied, and the table name after the SQL INTO keyword specifies to which table to copy those rows.
If we want to make an exact copy of the data in our Customers table, we need the following SQL SELECT INTO statement:

SELECT *
INTO Customers_copy
FROM Customers









SQL DISTINCT
Learn how to use the SQL DISTINCT clause together with the SQL SELECT keyword, to return a dataset with unique entries for certain database table column.
The SQL DISTINCT clause is used together with the SQL SELECT keyword, to return a dataset with unique entries for certain database table column.
We will use our Customers database table to illustrate the usage of SQL DISTINCT.
First Name
Last Name
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
For example if we want to select all distinct surnames from our Customers table, we will use the following SQL DISTINCT statement:

SELECT DISTINCT LastName
FROM Customers
The result of the SQL DISTINCT expression above will look like this:
LastName
Smith
Goldfish
Brown





SQL WHERE
The SQL WHERE command is used to specify selection criteria, thus restricting the result of a SQL query.
The SQL WHERE clause is used to select data conditionally, by adding it to already existing SQL SELECT query. We are going to use the Customers table from the previous chapter, to illustrate the use of the SQL WHERE command.
Table: Customers
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
If we want to select all customers from our database table, having last name 'Smith' we need to use the following SQL syntax:

SELECT *
FROM Customers
WHERE LastName = 'Smith'
The result of the SQL expression above will be the following:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
In this simple SQL query we used the "=" (Equal) operator in our WHERE criteria:
LastName = 'Smith'
But we can use any of the following comparison operators in conjunction with the SQL WHERE clause:


<> (Not Equal)

SELECT *
FROM Customers
WHERE LastName <> 'Smith'
> (Greater than)

SELECT *
FROM Customers
WHERE DOB > '1/1/1970'
>= (Greater or Equal)

SELECT *
FROM Customers
WHERE DOB >= '1/1/1970'
< (Less than)

SELECT *
FROM Customers
WHERE DOB < '1/1/1970'
<= (Less or Equal)

SELECT *
FROM Customers
WHERE DOB =< '1/1/1970'

LIKE (similar to)

SELECT *
FROM Customers
WHERE Phone LIKE '626%'
Note the LIKE syntax is different with the different RDBMS (SQL Server syntax used above). Check the SQL LIKE article for more details.
Between (Defines a range)

SELECT *
FROM Customers
WHERE DOB BETWEEN '1/1/1970' AND '1/1/1975'
SQL LIKE
The SQL LIKE clause is used along with the SQL WHERE clause and specifies criteria based on a string pattern.
We will use the Customers table to illustrate the SQL LIKE clause usage:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
The SQL LIKE clause is very useful when you want to specify a search condition within your SQL WHERE clause, based on a part of a column contents. For example if you want to select all customers having FirstName starting with 'J' you need to use the following SQL statement:

SELECT *
FROM Customers
WHERE FirstName LIKE 'J%'
Here is the result of the SQL statement above:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
If you want to select all Customers with phone numbers starting with '416' you will use this SQL expression:

SELECT *
FROM Customers
WHERE Phone LIKE '416%'
The '%' is a so called wildcard character and represents any string in our pattern.
You can put the wildcard anywhere in the string following the SQL LIKE clause and you can put as many wildcards as you like too.
Note that different databases use different characters as wildcard characters, for example '%' is a wildcard character for MS SQL Server representing any string, and '*' is the corresponding wildcard character used in MS Access.
Another wildcard character is '_' representing any single character.
The '[]' specifies a range of characters. Have a look at the following SQL statement:

SELECT *
FROM Customers
WHERE Phone LIKE '[4-6]_6%'
This SQL expression will return all customers satisfying the following conditions:
·       The Phone column starts with a digit between 4 and 6 ([4-6])
·       Second character in the Phone column can be anything (_)
·       The third character in the Phone column is 6 (6)
·       The remainder of the Phone column can be any character string (%)
Here is the result of this SQL expression:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
SQL INSERT INTO
Learn how to use the SQL INSERT INTO clause to insert data into a SQL database table.
The SQL INSERT INTO syntax has 2 main forms and the result of either of them is adding a new row into the database table.
The first syntax form of the INSERT INTO SQL clause doesn't specify the column names where the data will be inserted, but just their values:

INSERT INTO Table1
VALUES (value1, value2, value3…)
The second form of the SQL INSERT INTO command, specifies both the columns and the values to be inserted in them:

INSERT INTO Table1 (Column1, Column2, Column3…)
VALUES (Value1, Value2, Value3…)
As you might already have guessed, the number of the columns in the second INSERT INTO syntax form must match the number of values into the SQL statement, otherwise you will get an error.
If we want to insert a new row into our Customers table, we are going to use one of the following 2 SQL statements:

INSERT INTO Customers
VALUES ('Peter', 'Hunt', 'peter.hunt@tgmail.net', '1/1/1974', '626 888-8888')


INSERT INTO Customers (FirstName, LastName, Email, DOB, Phone)
VALUES ('Peter', 'Hunt', 'peter.hunt@tgmail.net', '1/1/1974', '626 888-8888')
The result of the execution of either of the 2 INSERT INTO SQL statements will be a new row added to our Customers database table:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
Peter
Hunt
peter.hunt@tgmail.net
1/1/1974
626 888-8888
If you want to enter data for just a few of the table columns, you’ll have to use the second syntax form of the SQL INSERT INTO clause, because the first form will produce an error if you haven’t supplied values for all columns.
To insert only the FirstName and LastName columns, execute the following SQL statement:

INSERT INTO Customers (FirstName, LastName)
VALUES ('Peter', 'Hunt')









SQL UPDATE
Learn how to use the SQL UPDATE statement to update data in a SQL database table.
The SQL UPDATE general syntax looks like this:

UPDATE Table1
SET Column1 = Value1, Column2 = Value2
WHERE Some_Column = Some_Value
The SQL UPDATE clause changes the data in already existing database row(s) and usually we need to add a conditional SQL WHERE clause to our SQL UPDATE statement in order to specify which row(s) we intend to update.
If we want to update the Mr. Steven Goldfish's date of birth to '5/10/1974' in our Customers database table
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
we need the following SQL UPDATE statement:

UPDATE Customers
SET DOB = '5/10/1974'
WHERE LastName = 'Goldfish' AND FirstName = 'Steven'
If we don’t specify a WHERE clause in the SQL expression above, all customers' DOB will be updated to '5/10/1974', so be careful with the SQL UPDATE command usage.
We can update several database table rows at once, by using the SQL WHERE clause in our UPDATE statement. For example if we want to change the phone number for all customers with last name Smith (we have 2 in our example Customers table), we need to use the following SQL UPDATE statement:

UPDATE Customers
SET Phone = '626 555-5555'
WHERE LastName = 'Smith'
After the execution of the UPDATE SQL expression above, the Customers table will look as follows:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 555-5555
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
626 555-5555
SQL DELETE
Learn how to use the SQL DELETE statement to delete data from a SQL database table.
So far we’ve learnt how to select data from a database table and how to insert and update data into a database table. Now it’s time to learn how to remove data from a database. Here comes the SQL DELETE statement!
The SQL DELETE command has the following generic SQL syntax:

DELETE FROM Table1
WHERE Some_Column = Some_Value
If you skip the SQL WHERE clause when executing SQL DELETE expression, then all the data in the specified table will be deleted. The following SQL statement will delete all the data from our Customers table and we’ll end up with completely empty table:

DELETE FROM Table1
If you specify a WHERE clause in your SQL DELETE statement, only the table rows satisfying the WHERE criteria will be deleted:

DELETE FROM Customers
WHERE LastName = 'Smith'
The SQL query above will delete all database rows having LastName 'Smith' and will leave the Customers table in the following state:
FirstName
LastName
Email
DOB
Phone
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
SQL ORDER BY
Learn how to use the SQL ORDER BY statement to sort the data retrieved in your SQL query.
The SQL ORDER BY clause comes in handy when you want to sort your SQL result sets by some column(s). For example if you want to select all the persons from the already familiar Customers table and order the result by date of birth, you will use the following statement:

SELECT * FROM Customers
ORDER BY DOB
The result of the above SQL expression will be the following:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
As you can see the rows are sorted in ascending order by the DOB column, but what if you want to sort them in descending order? To do that you will have to add the DESC SQL keyword after your SQL ORDER BY clause:

SELECT * FROM Customers
ORDER BY DOB DESC
The result of the SQL query above will look like this:
FirstName
LastName
Email
DOB
Phone
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
If you don't specify how to order your rows, alphabetically or reverse, than the result set is ordered alphabetically, hence the following to SQL expressions produce the same result:

SELECT * FROM Customers
ORDER BY DOB

SELECT * FROM Customers
ORDER BY DOB ASC
You can sort your result set by more than one column by specifying those columns in the SQL ORDER BY list. The following SQL expression will order by DOB and LastName:

SELECT * FROM Customers
ORDER BY DOB, LastName
SQL OR & AND
Learn how to use the SQL OR & AND keywords together with the SQL WHERE clause to add several conditions to your SQL statement.
The SQL AND clause is used when you want to specify more than one condition in your SQL WHERE clause, and at the same time you want all conditions to be true.
For example if you want to select all customers with FirstName "John" and LastName "Smith", you will use the following SQL expression:

SELECT * FROM Customers
WHERE FirstName = 'John' AND LastName = 'Smith'
The result of the SQL query above is:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
The following row in our Customer table, satisfies the second of the conditions (LastName = 'Smith'), but not the first one (FirstName = 'John'), and that's why it's not returned by our SQL query:
FirstName
LastName
Email
DOB
Phone
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
The SQL OR statement is used in similar fashion and the major difference compared to the SQL AND is that OR clause will return all rows satisfying any of the conditions listed in the WHERE clause.
If we want to select all customers having FirstName 'James' or FirstName 'Paula' we need to use the following SQL statement:

SELECT * FROM Customers
WHERE FirstName = 'James' OR FirstName = 'Paula'
The result of this query will be the following:
FirstName
LastName
Email
DOB
Phone
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
You can combine AND and OR clauses anyway you want and you can use parentheses to define your logical expressions.
Here is an example of such a SQL query, selecting all customers with LastName 'Brown' and FirstName either 'James' or 'Paula':

SELECT * FROM Customers
WHERE (FirstName = 'James' OR FirstName = 'Paula') AND LastName = 'Brown'


The result of the SQL expression above will be:
FirstName
LastName
Email
DOB
Phone
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
SQL IN
The SQL IN clause allows you to specify discrete values in your SQL WHERE search criteria.
The SQL IN clause allows you to specify discrete values in your SQL WHERE search criteria.
THE SQL IN syntax looks like this:

SELECT Column1, Column2, Column3, …
FROM Table1
WHERE Column1 IN (Valu1, Value2, …)
Let’s use the EmployeeHours table to illustrate how SQL IN works:
Employee
Date
Hours
John Smith
5/6/2004
8
Allan Babel
5/6/2004
8
Tina Crown
5/6/2004
8
John Smith
5/7/2004
9
Allan Babel
5/7/2004
8
Tina Crown
5/7/2004
10
John Smith
5/8/2004
8
Allan Babel
5/8/2004
8
Tina Crown
5/8/2004
9
Consider the following SQL query using the SQL IN clause:

SELECT *
FROM EmployeeHours
WHERE Date IN ('5/6/2004', '5/7/2004')


This SQL expression will select only the entries where the column Date has value of '5/6/2004' or '5/7/2004', and you can see the result below:
Employee
Date
Hours
John Smith
5/6/2004
8
Allan Babel
5/6/2004
8
Tina Crown
5/6/2004
8
John Smith
5/7/2004
9
Allan Babel
5/7/2004
8
Tina Crown
5/7/2004
10
We can use the SQL IN statement with another column in our EmployeeHours table:

SELECT *
FROM EmployeeHours
WHERE Hours IN (9, 10)
The result of the SQL query above will be:
Employee
Date
Hours
John Smith
5/7/2004
9
Tina Crown
5/7/2004
10
Tina Crown
5/8/2004
9
SQL BETWEEN
The SQL BETWEEN & AND keywords define a range of data between 2 values.
The SQL BETWEEN & AND keywords define a range of data between 2 values.
The SQL BETWEEN syntax looks like this:

SELECT Column1, Column2, Column3, …
FROM Table1
WHERE Column1 BETWEEN Value1 AND Value2
The 2 values defining the range for SQL BETWEEN clause can be dates, numbers or just text.
In contrast with the SQL IN keyword, which allows you to specify discrete values in your SQL WHERE criteria, the SQL BETWEEN gives you the ability to specify a range in your search criteria.
We are going to use the familiar Customers table to show how SQL BETWEEN works:
FirstName
LastName
Email
DOB
Phone
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
Consider the following SQL BETWEEN statement:

SELECT *
FROM Customers
WHERE DOB BETWEEN '1/1/1975' AND '1/1/2004'
The SQL BETWEEN statement above will select all Customers having DOB column between '1/1/1975' and '1/1/2004' dates. Here is the result of this SQL expression:
FirstName
LastName
Email
DOB
Phone
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888






SQL Aliases
SQL aliases can be used with database tables and/or with database table columns, depending on task you are performing.
SQL aliases can be used with database tables and with database table columns, depending on task you are performing.
SQL column aliases are used to make the output of your SQL queries easy to read and more meaningful:

SELECT Employee, SUM(Hours) As SumHoursPerEmployee
FROM EmployeeHours
GROUP BY Employee
In the example above we created SQL alias SumHoursPerEmployee and the result of this SQL query will be the following:
Employee
SumHoursPerEmployee
John Smith
25
Allan Babel
24
Tina Crown
27
Consider the following SQL statement, showing how to use SQL table aliases:

SELECT Emp.Employee
FROM EmployeeHours AS Emp
Here is the result of the SQL expression above:
Employee
John Smith
Allan Babel
Tina Crown
The SQL table aliases are very useful when you select data from multiple tables.

SQL COUNT
The SQL COUNT aggregate function is used to count the number of rows in a database table.
The SQL COUNT aggregate function is used to count the number of rows in a database table.
The SQL COUNT syntax is simple and looks like this:

SELECT COUNT(Column1)
FROM Table1
If we want to count the number of customers in our Customers table, we will use the following SQL COUNT statement:

SELECT COUNT(LastName) AS NumberOfCustomers
FROM Customers
The result of this SQL COUNT query will be:
NumberOfCustomers
4










SQL MAX
The SQL MAX aggregate function allows us to select the highest (maximum) value for a certain column.
The SQL MAX aggregate function allows us to select the highest (maximum) value for a certain column.
The SQL MAX function syntax is very simple and it looks like this:

SELECT MAX(Column1)
FROM Table1
If we use the Customers table from our previous chapters, we can select the highest date of birth with the following SQL MAX expression:

SELECT MAX(DOB) AS MaxDOB
FROM Customers











SQL MIN
The SQL MIN aggregate function allows us to select the lowest (minimum) value for a certain column.
The SQL MIN aggregate function allows us to select the lowest (minimum) value for a certain column.
The SQL MIN function syntax is very simple and it looks like this:

SELECT MIN(Column1)
FROM Table1
If we use the Customers table from our previous chapters, we can select the lowest date of birth with the following SQL MIN expression:

SELECT MIN(DOB) AS MinDOB
FROM Customers











SQL AVG
The SQL AVG aggregate function selects the average value for a certain table column.
The SQL AVG aggregate function selects the average value for certain table column.
Have a look at the SQL AVG syntax:

SELECT AVG(Column1)
FROM Table1
If we want to find out what is the average SaleAmount in the Sales table, we will use the following SQL AVG statement:

SELECT AVG(SaleAmount) AS AvgSaleAmount
FROM Sales
which will result in the following dataset:
AvgSaleAmount
$195.73









SQL SUM
The SQL SUM aggregate function allows selecting the total for a numeric column.
The SQL SUM aggregate function allows selecting the total for a numeric column.
The SQL SUM syntax is displayed below:

SELECT SUM(Column1)
FROM Table1
We are going to use the Sales table to illustrate the use of SQL SUM clause:
Sales:
CustomerID
Date
SaleAmount
2
5/6/2004
$100.22
1
5/7/2004
$99.95
3
5/7/2004
$122.95
3
5/13/2004
$100.00
4
5/22/2004
$555.55
Consider the following SQL SUM statement:

SELECT SUM(SaleAmount)
FROM Sales
This SQL statement will return the sum of all SaleAmount fields and the result of it will be:
SaleAmount
$978.67
Of course you can specify search criteria using the SQL WHERE clause in your SQL SUM statement. If you want to select the total sales for customer with CustomerID = 3, you will use the following SQL SUM statement:

SELECT SUM(SaleAmount)
FROM Sales
WHERE CustomerID = 3
The result will be:
SaleAmount
$222.95

















SQL GROUP BY
The SQL GROUP BY statement is used along with the SQL aggregate functions like SUM to provide means of grouping the result dataset by certain database table column(s).
The SQL GROUP BY statement is used along with the SQL aggregate functions like SUM to provide means of grouping the result dataset by certain database table column(s).
The best way to explain how and when to use the SQL GROUP BY statement is by example, and that’s what we are going to do.
Consider the following database table called EmployeeHours storing the daily hours for each employee of a factious company:
Employee
Date
Hours
John Smith
5/6/2004
8
Allan Babel
5/6/2004
8
Tina Crown
5/6/2004
8
John Smith
5/7/2004
9
Allan Babel
5/7/2004
8
Tina Crown
5/7/2004
10
John Smith
5/8/2004
8
Allan Babel
5/8/2004
8
Tina Crown
5/8/2004
9
If the manager of the company wants to get the simple sum of all hours worked by all employees, he needs to execute the following SQL statement:

SELECT SUM (Hours)
FROM EmployeeHours
But what if the manager wants to get the sum of all hours for each of his employees?
To do that he need to modify his SQL query and use the SQL GROUP BY statement:

SELECT Employee, SUM (Hours)
FROM EmployeeHours
GROUP BY Employee
The result of the SQL expression above will be the following:
Employee
Hours
John Smith
25
Allan Babel
24
Tina Crown
27
As you can see we have only one entry for each employee, because we are grouping by the Employee column.
The SQL GROUP BY clause can be used with other SQL aggregate functions, for example SQL AVG:

SELECT Employee, AVG(Hours)
FROM EmployeeHours
GROUP BY Employee
The result of the SQL statement above will be:
Employee
Hours
John Smith
8.33
Allan Babel
8
Tina Crown
9
In our Employee table we can group by the date column too, to find out what is the total number of hours worked on each of the dates into the table:

SELECT Date, SUM(Hours)
FROM EmployeeHours
GROUP BY Date
Here is the result of the above SQL expression:
Date
Hours
5/6/2004
24
5/7/2004
27
5/8/2004
25
SQL HAVING
The SQL HAVING clause is used to restrict conditionally the output of a SQL statement, by a SQL aggregate function used in your SELECT list of columns.
The SQL HAVING clause is used to restrict conditionally the output of a SQL statement, by a SQL aggregate function used in your SELECT list of columns.
You can't specify criteria in a SQL WHERE clause against a column in the SELECT list for which SQL aggregate function is used. For example the following SQL statement will generate an error:

SELECT Employee, SUM (Hours)
FROM EmployeeHours
WHERE SUM (Hours) > 24
GROUP BY Employee
The SQL HAVING clause is used to do exactly this, to specify a condition for an aggregate function which is used in your query:

SELECT Employee, SUM (Hours)
FROM EmployeeHours
GROUP BY Employee
HAVING SUM (Hours) > 24
The above SQL statement will select all employees and the sum of their respective hours, as long as this sum is greater than 24. The result of the SQL HAVING clause can be seen below:
Employee
Hours
John Smith
25
Tina Crown
27



The SQL JOIN clause is used whenever we have to select data from 2 or more tables.
The SQL JOIN clause is used whenever we have to select data from 2 or more tables.
To be able to use SQL JOIN clause to extract data from 2 (or more) tables, we need a relationship between certain columns in these tables.


The SQL JOIN clause is used whenever we have to select data from 2 or more tables.
The SQL JOIN clause is used whenever we have to select data from 2 or more tables.
To be able to use SQL JOIN clause to extract data from 2 (or more) tables, we need a relationship between certain columns in these tables.
We are going to illustrate our SQL JOIN example with the following 2 tables:

Customers:
CustomerID
FirstName
LastName
Email
DOB
Phone
1
John
Smith
John.Smith@yahoo.com
2/4/1968
626 222-2222
2
Steven
Goldfish
goldfish@fishhere.net
4/4/1974
323 455-4545
3
Paula
Brown
pb@herowndomain.org
5/24/1978
416 323-3232
4
James
Smith
jim@supergig.co.uk
20/10/1980
416 323-8888
Sales:
CustomerID
Date
SaleAmount
2
5/6/2004
$100.22
1
5/7/2004
$99.95
3
5/7/2004
$122.95
3
5/13/2004
$100.00
4
5/22/2004
$555.55
As you can see those 2 tables have common field called CustomerID and thanks to that we can extract information from both tables by matching their CustomerID columns.
Consider the following SQL statement:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers, Sales
WHERE Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
The SQL expression above will select all distinct customers (their first and last names) and the total respective amount of dollars they have spent.
The SQL JOIN condition has been specified after the SQL WHERE clause and says that the 2 tables have to be matched by their respective CustomerID columns.

Here is the result of this SQL statement:
FirstName
LastName
SalesPerCustomers
John
Smith
$99.95
Steven
Goldfish
$100.22
Paula
Brown
$222.95
James
Smith
$555.55
The SQL statement above can be re-written using the SQL JOIN clause like this:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
There are 2 types of SQL JOINS – INNER JOINS and OUTER JOINS. If you don't put INNER or OUTER keywords in front of the SQL JOIN keyword, then INNER JOIN is used. In short "INNER JOIN" = "JOIN" (note that different databases have different syntax for their JOIN clauses).
The INNER JOIN will select all rows from both tables as long as there is a match between the columns we are matching on. In case we have a customer in the Customers table, which still hasn't made any orders (there are no entries for this customer in the Sales table), this customer will not be listed in the result of our SQL query above.
If the Sales table has the following rows:
CustomerID
Date
SaleAmount
2
5/6/2004
$100.22
1
5/6/2004
$99.95
And we use the same SQL JOIN statement from above:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
We'll get the following result:
FirstName
LastName
SalesPerCustomers
John
Smith
$99.95
Steven
Goldfish
$100.22
Even though Paula and James are listed as customers in the Customers table they won't be displayed because they haven't purchased anything yet.
But what if you want to display all the customers and their sales, no matter if they have ordered something or not? We’ll do that with the help of SQL OUTER JOIN clause.
The second type of SQL JOIN is called SQL OUTER JOIN and it has 2 sub-types called LEFT OUTER JOIN and RIGHT OUTER JOIN.
The LEFT OUTER JOIN or simply LEFT JOIN (you can omit the OUTER keyword in most databases), selects all the rows from the first table listed after the FROM clause, no matter if they have matches in the second table.
If we slightly modify our last SQL statement to:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers LEFT JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName
and the Sales table still has the following rows:
CustomerID
Date
SaleAmount
2
5/6/2004
$100.22
1
5/6/2004
$99.95
The result will be the following:
FirstName
LastName
SalesPerCustomers
John
Smith
$99.95
Steven
Goldfish
$100.22
Paula
Brown
NULL
James
Smith
NULL
As you can see we have selected everything from the Customers (first table). For all rows from Customers, which don’t have a match in the Sales (second table), the SalesPerCustomer column has amount NULL (NULL means a column contains nothing).
The RIGHT OUTER JOIN or just RIGHT JOIN behaves exactly as SQL LEFT JOIN, except that it returns all rows from the second table (the right table in our SQL JOIN statement).

SQL LEFT JOIN

The LEFT JOIN keyword returns all rows from the left table (table_name1), even if there are no matches in the right table (table_name2).

SQL LEFT JOIN Syntax

SELECT column_name(s)
FROM table_name1
LEFT JOIN table_name2
ON table_name1.column_name=table_name2.column_name
PS: In some databases LEFT JOIN is called LEFT OUTER JOIN.

SQL LEFT JOIN Example

The "Persons" table:
P_Id
LastName
FirstName
Address
City
1
Hansen
Ola
Timoteivn 10
Sandnes
2
Svendson
Tove
Borgvn 23
Sandnes
3
Pettersen
Kari
Storgt 20
Stavanger
The "Orders" table:
O_Id
OrderNo
P_Id
1
77895
3
2
44678
3
3
22456
1
4
24562
1
5
34764
15
Now we want to list all the persons and their orders - if any, from the tables above.
We use the following SELECT statement:
SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo
FROM Persons
LEFT JOIN Orders
ON Persons.P_Id=Orders.P_Id
ORDER BY Persons.LastName
The result-set will look like this:
LastName
FirstName
OrderNo
Hansen
Ola
22456
Hansen
Ola
24562
Pettersen
Kari
77895
Pettersen
Kari
44678
Svendson
Tove

The LEFT JOIN keyword returns all the rows from the left table (Persons), even if there are no matches in the right table (Orders).

SQL RIGHT JOIN

The RIGHT JOIN keyword Return all rows from the right table (table_name2), even if there are no matches in the left table (table_name1).

SQL RIGHT JOIN Syntax

SELECT column_name(s)
FROM table_name1
RIGHT JOIN table_name2
ON table_name1.column_name=table_name2.column_name
PS: In some databases RIGHT JOIN is called RIGHT OUTER JOIN.

SQL RIGHT JOIN Example

The "Persons" table:
P_Id
LastName
FirstName
Address
City
1
Hansen
Ola
Timoteivn 10
Sandnes
2
Svendson
Tove
Borgvn 23
Sandnes
3
Pettersen
Kari
Storgt 20
Stavanger
The "Orders" table:
O_Id
OrderNo
P_Id
1
77895
3
2
44678
3
3
22456
1
4
24562
1
5
34764
15
Now we want to list all the orders with containing persons - if any, from the tables above.
We use the following SELECT statement:
SELECT Persons.LastName, Persons.FirstName, Orders.OrderNo
FROM Persons
RIGHT JOIN Orders
ON Persons.P_Id=Orders.P_Id
ORDER BY Persons.LastName
The result-set will look like this:
LastName
FirstName
OrderNo
Hansen
Ola
22456
Hansen
Ola
24562
Pettersen
Kari
77895
Pettersen
Kari
44678


34764
The RIGHT JOIN keyword returns all the rows from the right table (Orders), even if there are no matches in the left table (Persons).
Trim
The TRIM function in SQL is used to remove specified prefix or suffix from a string. The most common pattern being removed is white spaces. This function is called differently in different databases:
  • MySQL: TRIM(), RTRIM(), LTRIM()
  • Oracle: RTRIM(), LTRIM()
  • SQL Server: RTRIM(), LTRIM()
The syntax for these trim functions are:
TRIM([[LOCATION] [remstr] FROM ] str): [LOCATION] can be either LEADING, TRAILING, or BOTH. This function gets rid of the [remstr] pattern from either the beginning of the string or the end of the string, or both. If no [remstr] is specified, white spaces are removed.
LTRIM(str): Removes all white spaces from the beginning of the string.
RTRIM(str): Removes all white spaces at the end of the string.
Example 1:
SELECT TRIM('   Sample   ');
Result:
'Sample'
Example 2:
SELECT LTRIM('   Sample   ');
Result:
'Sample   '
Example 3:
SELECT RTRIM('   Sample   ');
Result:
'   Sample'

 Wildcard
There are times when we want to match on a string pattern. To do that, we will need to employ the concept of wildcard. In SQL, there are two wildcards:
% (percent sign) represents zero, one, or more characters.
_ (underscore) represents exactly one character.
Wildcards are used with the LIKE keyword in SQL.
Below are some wildcard examples:
• 'A_Z': All string that starts with 'A', another character, and end with 'Z'. For example, 'ABZ' and 'A2Z' would both satisfy the condition, while 'AKKZ' would not (because there are two characters between A and Z instead of one).
• 'ABC%': All strings that start with 'ABC'. For example, 'ABCD' and 'ABCABC' would both satisfy the condition.
• '%XYZ': All strings that end with 'XYZ'. For example, 'WXYZ' and 'ZZXYZ' would both satisfy the condition.
• '%AN%': All strings that contain the pattern 'AN' anywhere. For example, 'LOS ANGELES' and 'SAN FRANCISCO' would both satisfy the condition.
• '_AN%': All strings that contain a character, then 'AN', followed by anything else. For example, 'SAN FRANCISCO' would satisfy the condition, while 'LOS ANGELES' would not satisfy the condition.
 Replace
The Replace function in SQL is used to update the content of a string. The function call is REPLACE() for MySQL, Oracle, and SQL Server. The syntax of the Replace function is:
Replace(str1, str2, str3): In str1, find where str2 occurs, and replace it with str3.
Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
If we apply the following Replace function:
SELECT REPLACE(region_name, 'ast', 'astern')
FROM Geography;
Result:
region_name
Eastern
Eastern
West
West

AS
In the SQL Alias section, we saw that the syntax for using table and column aliases is as follows:
SELECT "table_alias"."column_name1" "column_alias"
FROM "table_name" "table_alias"
The keyword AS is used to assign an alias to the column or a table. It is insert between the column name and the column alias or between the table name and the table alias. The syntax for using AS is as follows:
SELECT "table_alias"."column_name1" AS "column_alias"
FROM "table_name" AS "table_alias"
Let's take a look at the same example as we used in SQL Alias. Assume we have the following table, Store_Information,
Table Store_Information
store_name
Sales
Date
Los Angeles
$1500
Jan-05-1999
San Diego
$250
Jan-07-1999
Los Angeles
$300
Jan-08-1999
Boston
$700
Jan-08-1999
To find total sales by store using AS as part of the table and column alias, we type in:
SELECT A1.store_name Store, SUM(A1.Sales) AS "Total Sales"
FROM Store_Information AS A1
GROUP BY A1.store_name
Result:
Store
 
Total Sales
Los Angeles
 
$1800
San Diego
 
$250
Boston
 
$700

Join
Now we want to look at joins. To do joins correctly in SQL requires many of the elements we have introduced so far. Let's assume that we have the following two tables,
Table Store_Information
store_name
Sales
Date
Los Angeles
$1500
Jan-05-1999
San Diego
$250
Jan-07-1999
Los Angeles
$300
Jan-08-1999
Boston
$700
Jan-08-1999
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
and we want to find out sales by region. We see that table Geography includes information on regions and stores, and table Store_Information contains sales information for each store. To get the sales information by region, we have to combine the information from the two tables. Examining the two tables, we find that they are linked via the common field, "store_name". We will first present the SQL statement and explain the use of each segment later:
SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name
Result:
REGION
 
SALES
East
 
$700
West
 
$2050
The first two lines tell SQL to select two fields, the first one is the field "region_name" from table Geography (aliased as REGION), and the second one is the sum of the field "Sales" from table Store_Information (aliased as SALES). Notice how the table aliases are used here: Geography is aliased as A1, and Store_Information is aliased as A2. Without the aliasing, the first line would become
SELECT Geography.region_name REGION, SUM(Store_Information.Sales) SALES
which is much more cumbersome. In essence, table aliases make the entire SQL statement easier to understand, especially when multiple tables are included.
Next, we turn our attention to line 3, the WHERE statement. This is where the condition of the join is specified. In this case, we want to make sure that the content in "store_name" in table Geography matches that in table Store_Information, and the way to do it is to set them equal. This WHERE statement is essential in making sure you get the correct output. Without the correct WHERE statement, a Cartesian Join will result. Cartesian joins will result in the query returning every possible combination of the two (or whatever the number of tables in the FROM statement) tables. In this case, a Cartesian join would result in a total of 4 x 4 = 16 rows being returned.
Outer Join
Previously, we had looked at left join, or inner join, where we select rows common to the participating tables to a join. What about the cases where we are interested in selecting elements in a table regardless of whether they are present in the second table? We will now need to use the SQL OUTER JOIN command.
The syntax for performing an outer join in SQL is database-dependent. For example, in Oracle, we will place an "(+)" in the WHERE clause on the other side of the table for which we want to include all the rows.
Let's assume that we have the following two tables,
Concatenate
Sometimes it is necessary to combine together (concatenate) the results from several different fields. Each database provides a way to do this:
  • MySQL: CONCAT()
  • Oracle: CONCAT(), ||
  • SQL Server: +

The syntax for CONCAT() is as follows:
CONCAT(str1, str2, str3, ...): Concatenate str1, str2, str3, and any other strings together. Please note the Oracle CONCAT() function only allows two arguments -- only two strings can be put together at a time using this function. However, it is possible to concatenate more than two strings at a time in Oracle using '||'.
Let's look at some examples. Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
Example 1:
MySQL/Oracle:
SELECT CONCAT(region_name,store_name) FROM Geography
WHERE store_name = 'Boston';
Result:
'EastBoston'
Example 2:
Oracle:
SELECT region_name || ' ' || store_name FROM Geography
WHERE store_name = 'Boston';
Result:
'East Boston'
Example 3:
SQL Server:
SELECT region_name + ' ' + store_name FROM Geography
WHERE store_name = 'Boston';
Result:
'East Boston'
Substring
The Substring function in SQL is used to grab a portion of the stored data. This function is called differently for the different databases:
  • MySQL: SUBSTR(), SUBSTRING()
  • Oracle: SUBSTR()
  • SQL Server: SUBSTRING()
The most frequent uses are as follows (we will use SUBSTR() here):
SUBSTR(str,pos): Select all characters from <str> starting with position <pos>. Note that this syntax is not supported in SQL Server.
SUBSTR(str,pos,len): Starting with the <pos>th character in string <str> and select the next <len> characters.
Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
Example 1:
SELECT SUBSTR(store_name, 3)
FROM Geography
WHERE store_name = 'Los Angeles';
Result:
's Angeles'
Example 2:
SELECT SUBSTR(store_name,2,4)
FROM Geography
WHERE store_name = 'San Diego';
Result:
'an D'
Length
The Length function in SQL is used to get the length of a string. This function is called differently for the different databases:
  • MySQL: LENGTH()
  • Oracle: LENGTH()
  • SQL Server: LEN()
The syntax for the Length function is as follows:
Length(str): Find the length of the string str.
Let's take a look at some examples. Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
Example 1:
SELECT Length(store_name)
FROM Geography
WHERE store_name = 'Los Angeles';
Result:
11
Example 2:
SELECT region_name, Length(region_name)
FROM Geography;
Result:
region_name
Length(region_name)
East
4
East
4
West
4
West
4

Trim
The TRIM function in SQL is used to remove specified prefix or suffix from a string. The most common pattern being removed is white spaces. This function is called differently in different databases:
  • MySQL: TRIM(), RTRIM(), LTRIM()
  • Oracle: RTRIM(), LTRIM()
  • SQL Server: RTRIM(), LTRIM()
The syntax for these trim functions are:
TRIM([[LOCATION] [remstr] FROM ] str): [LOCATION] can be either LEADING, TRAILING, or BOTH. This function gets rid of the [remstr] pattern from either the beginning of the string or the end of the string, or both. If no [remstr] is specified, white spaces are removed.
LTRIM(str): Removes all white spaces from the beginning of the string.
RTRIM(str): Removes all white spaces at the end of the string.
Example 1:
SELECT TRIM('   Sample   ');
Result:
'Sample'
Example 2:
SELECT LTRIM('   Sample   ');
Result:
'Sample   '
Example 3:
SELECT RTRIM('   Sample   ');
Result:
'   Sample'

 Wildcard
There are times when we want to match on a string pattern. To do that, we will need to employ the concept of wildcard. In SQL, there are two wildcards:
% (percent sign) represents zero, one, or more characters.
_ (underscore) represents exactly one character.
Wildcards are used with the LIKE keyword in SQL.
Below are some wildcard examples:
• 'A_Z': All string that starts with 'A', another character, and end with 'Z'. For example, 'ABZ' and 'A2Z' would both satisfy the condition, while 'AKKZ' would not (because there are two characters between A and Z instead of one).
• 'ABC%': All strings that start with 'ABC'. For example, 'ABCD' and 'ABCABC' would both satisfy the condition.
• '%XYZ': All strings that end with 'XYZ'. For example, 'WXYZ' and 'ZZXYZ' would both satisfy the condition.
• '%AN%': All strings that contain the pattern 'AN' anywhere. For example, 'LOS ANGELES' and 'SAN FRANCISCO' would both satisfy the condition.
• '_AN%': All strings that contain a character, then 'AN', followed by anything else. For example, 'SAN FRANCISCO' would satisfy the condition, while 'LOS ANGELES' would not satisfy the condition.
 Replace
The Replace function in SQL is used to update the content of a string. The function call is REPLACE() for MySQL, Oracle, and SQL Server. The syntax of the Replace function is:
Replace(str1, str2, str3): In str1, find where str2 occurs, and replace it with str3.
Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
If we apply the following Replace function:
SELECT REPLACE(region_name, 'ast', 'astern')
FROM Geography;
Result:
region_name
Eastern
Eastern
West
West

AS
In the SQL Alias section, we saw that the syntax for using table and column aliases is as follows:
SELECT "table_alias"."column_name1" "column_alias"
FROM "table_name" "table_alias"
The keyword AS is used to assign an alias to the column or a table. It is insert between the column name and the column alias or between the table name and the table alias. The syntax for using AS is as follows:
SELECT "table_alias"."column_name1" AS "column_alias"
FROM "table_name" AS "table_alias"
Let's take a look at the same example as we used in SQL Alias. Assume we have the following table, Store_Information,
Table Store_Information
store_name
Sales
Date
Los Angeles
$1500
Jan-05-1999
San Diego
$250
Jan-07-1999
Los Angeles
$300
Jan-08-1999
Boston
$700
Jan-08-1999
To find total sales by store using AS as part of the table and column alias, we type in:
SELECT A1.store_name Store, SUM(A1.Sales) AS "Total Sales"
FROM Store_Information AS A1
GROUP BY A1.store_name
Result:
Store
 
Total Sales
Los Angeles
 
$1800
San Diego
 
$250
Boston
 
$700

Join
Now we want to look at joins. To do joins correctly in SQL requires many of the elements we have introduced so far. Let's assume that we have the following two tables,
Table Store_Information
store_name
Sales
Date
Los Angeles
$1500
Jan-05-1999
San Diego
$250
Jan-07-1999
Los Angeles
$300
Jan-08-1999
Boston
$700
Jan-08-1999
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
and we want to find out sales by region. We see that table Geography includes information on regions and stores, and table Store_Information contains sales information for each store. To get the sales information by region, we have to combine the information from the two tables. Examining the two tables, we find that they are linked via the common field, "store_name". We will first present the SQL statement and explain the use of each segment later:
SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name
Result:
REGION
 
SALES
East
 
$700
West
 
$2050
The first two lines tell SQL to select two fields, the first one is the field "region_name" from table Geography (aliased as REGION), and the second one is the sum of the field "Sales" from table Store_Information (aliased as SALES). Notice how the table aliases are used here: Geography is aliased as A1, and Store_Information is aliased as A2. Without the aliasing, the first line would become
SELECT Geography.region_name REGION, SUM(Store_Information.Sales) SALES
which is much more cumbersome. In essence, table aliases make the entire SQL statement easier to understand, especially when multiple tables are included.
Next, we turn our attention to line 3, the WHERE statement. This is where the condition of the join is specified. In this case, we want to make sure that the content in "store_name" in table Geography matches that in table Store_Information, and the way to do it is to set them equal. This WHERE statement is essential in making sure you get the correct output. Without the correct WHERE statement, a Cartesian Join will result. Cartesian joins will result in the query returning every possible combination of the two (or whatever the number of tables in the FROM statement) tables. In this case, a Cartesian join would result in a total of 4 x 4 = 16 rows being returned.
Outer Join
Previously, we had looked at left join, or inner join, where we select rows common to the participating tables to a join. What about the cases where we are interested in selecting elements in a table regardless of whether they are present in the second table? We will now need to use the SQL OUTER JOIN command.
The syntax for performing an outer join in SQL is database-dependent. For example, in Oracle, we will place an "(+)" in the WHERE clause on the other side of the table for which we want to include all the rows.
Let's assume that we have the following two tables,
Concatenate
Sometimes it is necessary to combine together (concatenate) the results from several different fields. Each database provides a way to do this:
  • MySQL: CONCAT()
  • Oracle: CONCAT(), ||
  • SQL Server: +

The syntax for CONCAT() is as follows:
CONCAT(str1, str2, str3, ...): Concatenate str1, str2, str3, and any other strings together. Please note the Oracle CONCAT() function only allows two arguments -- only two strings can be put together at a time using this function. However, it is possible to concatenate more than two strings at a time in Oracle using '||'.
Let's look at some examples. Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
Example 1:
MySQL/Oracle:
SELECT CONCAT(region_name,store_name) FROM Geography
WHERE store_name = 'Boston';
Result:
'EastBoston'
Example 2:
Oracle:
SELECT region_name || ' ' || store_name FROM Geography
WHERE store_name = 'Boston';
Result:
'East Boston'
Example 3:
SQL Server:
SELECT region_name + ' ' + store_name FROM Geography
WHERE store_name = 'Boston';
Result:
'East Boston'
Substring
The Substring function in SQL is used to grab a portion of the stored data. This function is called differently for the different databases:
  • MySQL: SUBSTR(), SUBSTRING()
  • Oracle: SUBSTR()
  • SQL Server: SUBSTRING()
The most frequent uses are as follows (we will use SUBSTR() here):
SUBSTR(str,pos): Select all characters from <str> starting with position <pos>. Note that this syntax is not supported in SQL Server.
SUBSTR(str,pos,len): Starting with the <pos>th character in string <str> and select the next <len> characters.
Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
Example 1:
SELECT SUBSTR(store_name, 3)
FROM Geography
WHERE store_name = 'Los Angeles';
Result:
's Angeles'
Example 2:
SELECT SUBSTR(store_name,2,4)
FROM Geography
WHERE store_name = 'San Diego';
Result:
'an D'
Length
The Length function in SQL is used to get the length of a string. This function is called differently for the different databases:
  • MySQL: LENGTH()
  • Oracle: LENGTH()
  • SQL Server: LEN()
The syntax for the Length function is as follows:
Length(str): Find the length of the string str.
Let's take a look at some examples. Assume we have the following table:
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
Example 1:
SELECT Length(store_name)
FROM Geography
WHERE store_name = 'Los Angeles';
Result:
11
Example 2:
SELECT region_name, Length(region_name)
FROM Geography;
Result:
region_name
Length(region_name)
East
4
East
4
West
4
West
4

Create Table Statement
Tables are the basic structure where data is stored in the database. Given that in most cases, there is no way for the database vendor to know ahead of time what your data storage needs are, chances are that you will need to create tables in the database yourself. Many database tools allow you to create tables without writing SQL, but given that tables are the container of all the data, it is important to include the CREATE TABLE syntax in this User guide.
Before we dive into the SQL syntax for CREATE TABLE, it is a good idea to understand what goes into a table. Tables are divided into rows and columns. Each row represents one piece of data, and each column can be thought of as representing a component of that piece of data. So, for example, if we have a table for recording customer information, then the columns may include information such as First Name, Last Name, Address, City, Country, Birth Date, and so on. As a result, when we specify a table, we include the column headers and the data types for that particular column.
So what are data types? Typically, data comes in a variety of forms. It could be an integer (such as 1), a real number (such as 0.55), a string (such as 'sql'), a date/time expression (such as '2000-JAN-25 03:22:22'), or even in binary format. When we specify a table, we need to specify the data type associated with each column (i.e., we will specify that 'First Name' is of type char(50) - meaning it is a string with 50 characters). One thing to note is that different relational databases allow for different data types, so it is wise to consult with a database-specific reference first.




The SQL syntax for CREATE TABLE is
CREATE TABLE "table_name"
("column 1" "data_type_for_column_1",
"column 2" "data_type_for_column_2",
... )
So, if we are to create the customer table specified as above, we would type in
CREATE TABLE customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)
Sometimes, we want to provide a default value for each column. A default value is used when you do not specify a column's value when inserting data into the table. To specify a default value, add "Default [value]" after the data type declaration. In the above example, if we want to default column "Address" to "Unknown" and City to "Mumbai", we would type in
CREATE TABLE customer
(First_Name char(50),
Last_Name char(50),
Address char(50) default 'Unknown',
City char(50) default 'Mumbai',
Country char(25),
Birth_Date date)
You can also limit the type of information a table / a column can hold. This is done through the CONSTRAINT keyword, which is discussed next.

Constraint
You can place constraints to limit the type of data that can go into a table. Such constraints can be specified when the table when the table is first created via the CREATE TABLE statement, or after the table is already created via the ALTER TABLE statement.
Common types of constraints include the following:
Each constraint is discussed in the following sections.

NOT NULL Constraint
By default, a column can hold NULL. If you not want to allow NULL value in a column, you will want to place a constraint on this column specifying that NULL is now not an allowable value.
For example, in the following statement,
CREATE TABLE Customer
(SID integer NOT NULL,
Last_Name varchar (30) NOT NULL,
First_Name varchar(30));
Columns "SID" and "Last_Name" cannot include NULL, while "First_Name" can include NULL.
An attempt to execute the following SQL statement,
INSERT INTO Customer (Last_Name, First_Name) values ('Wong','Ken');
will result in an error because this will lead to column "SID" being NULL, which violates the NOT NULL constraint on that column.
DEFAULT Constraint
The DEFAULT constraint provides a default value to a column when the INSERT INTO statement does not provide a specific value. For example, if we create a table as below:
CREATE TABLE Student
(Student_ID integer Unique,
Last_Name varchar (30),
First_Name varchar (30),
Score DEFAULT 80);
and execute the following SQL statement,
INSERT INTO Student (Student_ID, Last_Name, First_Name) values ('10','Johnson','Rick');
The table will look like the following:
Student_ID
Last_Name
First_Name
Score
10
Johnson
Rick
80
Even though we didn't specify a value for the "Score" column in the INSERT INTO statement, it does get assigned the default value of 80 since we had already set 80 as the default value for this column.
UNIQUE Constraint
The UNIQUE constraint ensures that all values in a column are distinct.
For example, in the following CREATE TABLE statement,
CREATE TABLE Customer
(SID integer Unique,
Last_Name varchar (30),
First_Name varchar(30));
column "SID" has a unique constraint, and hence cannot include duplicate values. Such constraint does not hold for columns "Last_Name" and "First_Name". So, if the table already contains the following rows:
SID
Last_Name
First_Name
1
Johnson
Stella
2
James
Gina
3
Aaron
Ralph
Executing the following SQL statement,
INSERT INTO Customer values ('3','Lee','Grace');
will result in an error because '3' already exists in the SID column, thus trying to insert another row with that value violates the UNIQUE constraint.
Please note that a column that is specified as a primary key must also be unique. At the same time, a column that's unique may or may not be a primary key. In addition, multiple UNIQUE constraints can be defined on a table.
CHECK Constraint
The CHECK constraint ensures that all values in a column satisfy certain conditions. Once defined, the database will only insert a new row or update an existing row if the new value satisfies the CHECK constraint. The CHECK constraint is used to ensure data quality
For example, in the following CREATE TABLE statement,
CREATE TABLE Customer
(SID integer CHECK (SID > 0),
Last_Name varchar (30),

First_Name varchar(30));
Column "SID" has a constraint -- its value must only include integers greater than 0. So, attempting to execute the following statement,
INSERT INTO Customer values ('-3','Gonzales','Lynn');
will result in an error because the values for SID must be greater than 0.
Please note that the CHECK constraint does not get enforced by MySQL at this time.
Primary Key
A primary key is used to uniquely identify each row in a table. It can either be part of the actual record itself , or it can be an artificial field (one that has nothing to do with the actual record). A primary key can consist of one or more fields on a table. When multiple fields are used as a primary key, they are called a composite key.
Primary keys can be specified either when the table is created (using CREATE TABLE) or by changing the existing table structure (using ALTER TABLE).
Below are examples for specifying a primary key when creating a table:
Oracle:
CREATE TABLE Customer
(SID integer PRIMARY KEY,
Last_Name varchar(30),
First_Name varchar(30));
Below are examples for specifying a primary key by altering a table:
Oracle:
ALTER TABLE Customer ADD PRIMARY KEY (SID);
Note: Before using the ALTER TABLE command to add a primary key, you'll need to make sure that the field is defined as 'NOT NULL' -- in other words, NULL cannot be an accepted value for that field.
Foreign Key
A foreign key is a field (or fields) that points to the primary key of another table. The purpose of the foreign key is to ensure referential integrity of the data. In other words, only values that are supposed to appear in the database are permitted.
For example, say we have two tables, a CUSTOMER table that includes all customer data, and an ORDERS table that includes all customer orders. The constraint here is that all orders must be associated with a customer that is already in the CUSTOMER table. In this case, we will place a foreign key on the ORDERS table and have it relate to the primary key of the CUSTOMER table. This way, we can ensure that all orders in the ORDERS table are related to a customer in the CUSTOMER table. In other words, the ORDERS table cannot contain information on a customer that is not in the CUSTOMER table.
The structure of these two tables will be as follows:
Table CUSTOMER
column name
characteristic
SID
Primary Key
Last_Name
 
First_Name
 
Table ORDERS
column name
characteristic
Order_ID
Primary Key
Order_Date
 
Customer_SID
Foreign Key
Amount
 
In the above example, the Customer_SID column in the ORDERS table is a foreign key pointing to the SID column in the CUSTOMER table.
Below we show examples of how to specify the foreign key when creating the ORDERS table:
Oracle:
CREATE TABLE ORDERS
(Order_ID integer primary key,
Order_Date date,
Customer_SID integer references CUSTOMER(SID),
Amount double);
Below are examples for specifying a foreign key by altering a table. This assumes that the ORDERS table has been created, and the foreign key has not yet been put in:
Oracle:
ALTER TABLE ORDERS
ADD (CONSTRAINT fk_orders1) FOREIGN KEY (customer_sid) REFERENCES CUSTOMER(SID);
View
A view is a virtual table. A view consists of rows and columns just like a table. The difference between a view and a table is that views are definitions built on top of other tables (or views), and do not hold data themselves. If data is changing in the underlying table, the same change is reflected in the view. A view can be built on top of a single table or multiple tables. It can also be built on top of another view. In the SQL Create View page, we will see how a view can be built.
Views offer the following advantages:
1. Ease of use: A view hides the complexity of the database tables from end users. Essentially we can think of views as a layer of abstraction on top of the database tables.
2. Space savings: Views takes very little space to store, since they do not store actual data.
3. Additional data security: Views can include only certain columns in the table so that only the non-sensitive columns are included and exposed to the end user. In addition, some databases allow views to have different security settings, thus hiding sensitive data from prying eyes.
Create View Statement
Views can be considered as virtual tables. Generally speaking, a table has a set of definition, and it physically stores the data. A view also has a set of definitions, which is build on top of table(s) or other view(s), and it does not physically store the data.
The syntax for creating a view is as follows:
CREATE VIEW "VIEW_NAME" AS "SQL Statement"
"SQL Statement" can be any of the SQL statements we have discussed in this User guide.
Let's use a simple example to illustrate. Say we have the following table:
TABLE Customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)
and we want to create a view called V_Customer that contains only the First_Name, Last_Name, and Country columns from this table, we would type in,
CREATE VIEW V_Customer
AS SELECT First_Name, Last_Name, Country
FROM Customer
Now we have a view called V_Customer with the following structure:
View V_Customer
(First_Name char(50),
Last_Name char(50),
Country char(25))
We can also use a view to apply joins to two tables. In this case, users only see one view rather than two tables, and the SQL statement users need to issue becomes much simpler. Let's say we have the following two tables:
Table Store_Information
store_name
Sales
Date
Los Angeles
$1500
Jan-05-1999
San Diego
$250
Jan-07-1999
Los Angeles
$300
Jan-08-1999
Boston
$700
Jan-08-1999
Table Geography
region_name
store_name
East
Boston
East
New York
West
Los Angeles
West
San Diego
and we want to build a view that has sales by region information. We would issue the following SQL statement:
CREATE VIEW V_REGION_SALES
AS SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name
This gives us a view, V_REGION_SALES, that has been defined to store sales by region records. If we want to find out the content of this view, we type in,


SELECT * FROM V_REGION_SALES
Result:
REGION
SALES
East
$700
West
$2050

Index
Indexes help us retrieve data from tables quicker. Let's use an example to illustrate this point: Say we are interested in reading about how to grow peppers in a gardening book. Instead of reading the book from the beginning until we find a section on peppers, it is much quicker for us to go to the index section at the end of the book, locate which pages contain information on peppers, and then go to these pages directly. Going to the index first saves us time and is by far a more efficient method for locating the information we need.
The same principle applies for retrieving data from a database table. Without an index, the database system reads through the entire table (this process is called a 'table scan') to locate the desired information. With the proper index in place, the database system can then first go through the index to find out where to retrieve the data, and then go to these locations directly to get the needed data. This is much faster.
Therefore, it is often desirable to create indexes on tables. An index can cover one or more columns. The syntax for creating a table index is shown in the CREATE INDEX section. Below we discuss some general strategies when building and using an index:
1. Build index on columns of integer type
Integers take less space to store, which means the query will be faster. If the column you want to build an index for is not of type integer, consider creating a surrogate integer key (or simply a surrogate column of type integer) which maps one-to-one to the column you want to build the index for.
2. Keep index as narrow as possible
Narrower indexes take less space, require less time to process, which in turn means the query will run faster.
3. Column order is important
For indexes covering multiple columns, the order of the columns in the index is important. The best practice is to use the column with the lowest cardinality first, and the column with the highest cardinality last. Recall cardinality means the number of distinct values for that column. So, if "SELECT DISTINCT (COLUMN1) FROM TABLE_NAME;" returns 5, that means the cardinality for COLUMN1 is 5.
4. Make sure the column you are building an index for is declared NOT NULL
This can decrease the size of the index, which in turn will speed up the query.  
Create Index Statement
As mentioned in the Index overview page, a table index helps SQL statements run faster. The syntax for creating an index is:
CREATE INDEX "INDEX_NAME" ON "TABLE_NAME" (COLUMN_NAME)
Let's assume that we have the following table,
TABLE Customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)
and we want to create an index on the column Last_Name, we would type in,
CREATE INDEX IDX_CUSTOMER_LAST_NAME
on CUSTOMER (Last_Name)
If we want to create an index on both City and Country, we would type in,
CREATE INDEX IDX_CUSTOMER_LOCATION
on CUSTOMER (City, Country)
There is no strict rule on how to name an index. The generally accepted method is to place a prefix, such as "IDX_", before an index name to avoid confusion with other database objects. It is also a good idea to provide information on which table and column(s) the index is used on.
Please note that the exact syntax for CREATE INDEX may be different for different databases. You should consult with your database reference manual for the precise syntax.
Alter Table Statement
Once a table is created in the database, there are many occasions where one may wish to change the structure of the table. Typical cases include the following:
- Add a column
- Drop a column
- Change a column name
- Change the data type for a column
Please note that the above is not an exhaustive list. There are other instances where ALTER TABLE is used to change the table structure, such as changing the primary key specification or adding a unique constraint to a column.
The SQL syntax for ALTER TABLE is
ALTER TABLE "table_name"
[alter specification]
[alter specification] is dependent on the type of alteration we wish to perform. For the uses cited above, the [alter specification] statements are:
  • Add a column: ADD "column 1" "data type for column 1"
  • Drop a column: DROP "column 1"
  • Change a column name: CHANGE "old column name" "new column name" "data type for new column name"
  • Change the data type for a column: MODIFY "column 1" "new data type"
Let's run through examples for each one of the above, using the "customer" table created in the CREATE TABLE section:
Table customer
Column Name
Data Type
First_Name
char(50)
Last_Name
char(50)
Address
char(50)
City
char(50)
Country
char(25)
Birth_Date
date
First, we want to add a column called "Gender" to this table. To do this, we key in:
ALTER table customer add Gender char(1)
Resulting table structure:
Table customer
Column Name
Data Type
First_Name
char(50)
Last_Name
char(50)
Address
char(50)
City
char(50)
Country
char(25)
Birth_Date
date
Gender
char(1)
Next, we want to rename "Address" to "Addr". To do this, we key in,
ALTER table customer change Address Addr char(50)
Resulting table structure:
Table customer
Column Name
Data Type
First_Name
char(50)
Last_Name
char(50)
Addr
char(50)
City
char(50)
Country
char(25)
Birth_Date
date
Gender
char(1)
Then, we want to change the data type for "Addr" to char(30). To do this, we key in,
ALTER table customer modify Addr char(30)
Resulting table structure:
Table customer
Column Name
Data Type
First_Name
char(50)
Last_Name
char(50)
Addr
char(30)
City
char(50)
Country
char(25)
Birth_Date
date
Gender
char(1)
Finally, we want to drop the column "Gender". To do this, we key in,
ALTER table customer drop Gender
Resulting table structure:
Table customer
Column Name
Data Type
First_Name
char(50)
Last_Name
char(50)
Addr
char(30)
City
char(50)
Country
char(25)
Birth_Date
date

Drop Table Statement
USE
The USE keyword is used to select a database in MySQL. The syntax is as follows:
USE "Database Name"
For example, if you want to connect to a database called "Scores", you can type in the following:
USE Scores;
In MySQL, you can access tables in multiple databases by specifying [Database Name].[Table Name]. If the table you want to access is currently in the database you use, there is no need to specify the database name.
For example, if you want to access table "Course_110" from database "Scores" and table "Students" from database "Personnel", you can type in the following:
USE Scores;
SELECT ...
FROM Course_110, Personnel.Students
WHERE ... ;
Delete From Statement
Sometimes we may wish to get rid of records from a table. To do so, we can use the DELETE FROM command. The syntax for this is
DELETE FROM "table_name"
WHERE {condition}
It is easiest to use an example. Say we currently have a table as below:
Table Store_Information
store_name
Sales
Date
Los Angeles
$1500
Jan-05-1999
San Diego
$250
Jan-07-1999
Los Angeles
$300
Jan-08-1999
Boston
$700
Jan-08-1999
and we decide not to keep any information on Los Angeles in this table. To accomplish this, we type the following SQL:
DELETE FROM Store_Information
WHERE store_name = "Los Angeles"
Now the content of table would look like,
Table Store_Information
store_name
Sales
Date
San Diego
$250
Jan-07-1999
Boston
$700
Jan-08-1999




Comments

Popular posts from this blog

Date/time constants in JCL

JCL ABeND --

IEBGENER usage examples