Relational Database 4 | JOIN Operation, Conditional Expressions, Nested Queries, and Aggregate…

Series: Relational Database

Relational Database 4 | JOIN Operation, Conditional Expressions, Nested Queries, and Aggregate Operators

  1. JOIN Operation

(0) Basic Setup

Suppose we have two following tables,

(1) CROSS JOIN

Returns a cartesian product between two sets of data.

SELECT *
FROM <table1>
CROSS JOIN <table2>;

(2) INNER JOIN

Returns rows appearing in both tables.

SELECT *
FROM <table1>
INNER JOIN <table2>
ON <table1>.<colname1>=<table2>.<colname2>;

With table1.sid = table2.sid,

(3) LEFT JOIN

Return all rows from the left table, and the matched rows from the right table.

SELECT *
FROM <table1>
LEFT JOIN <table2>
ON <table1>.<colname1>=<table2>.<colname2>;

(4) RIGHT JOIN

Return all rows from the right table, and the matched rows from the left table.

SELECT *
FROM <table1>
RIGHT JOIN <table2>
ON <table1>.<colname1>=<table2>.<colname2>;

(5) FULL JOIN

Return all rows appearing in any of the tables.

SELECT *
FROM <table1>
FULL JOIN <table2>
ON <table1>.<colname1>=<table2>.<colname2>;

2. Conditional Expressions

(1) Basic Syntax

CASE WHEN <condition1> THEN <result1>
WHEN <condition2> THEN <result2>
...
ELSE <result>
END

(2) Useful Functions

  • Return the first not NULL value, else return NULL
COALESCE(<colname1>, <colname2>, ...)
  • Return the NULL if value1 = value2, else return 1
NULLIF(<colname1>, <colname2>)
  • Return the largest value (Ignore NULL value)
GREATEST(<colname1>, <colname2>, ...)
  • Return the smallest value (Ignore NULL value)
LEAST(<colname1>, <colname2>, ...)

3. Nested Queries

(1) Create a Subquery

The syntax of a subquery is,

SELECT *
FROM (SELECT *
FROM <table>
WHERE <balabala>
) AS <new_tablename>
WHERE <balabala>;

(2) Subquery Expressions

  • EXISTS: check rows, if returns at least one row, then true
SELECT *
FROM <table>
WHERE EXISTS (SELECT *
FROM <table1>
WHERE <balabala>
);
  • IN: check columns, if returns at least one row, then true
SELECT *
FROM <table>
WHERE <colname>
IN (SELECT <colname>
FROM <table1>
WHERE <balabala>
);
  • ANY: check any conditions, if returns at least one row, then true
SELECT *
FROM <table>
WHERE <colname> <=
ANY (SELECT <colname>
FROM <table1>
WHERE <balabala>
);
  • ALL: check all conditions, if the condition holds for all rows, then true
SELECT *
FROM <table>
WHERE <colname> <=
ALL (SELECT <colname>
FROM <table1>
WHERE <balabala>
);
  • EXCEPT: reduce some items in another table
SELECT *
FROM <table>
EXCEPT
SELECT *
FROM <table1>
WHERE <balabala>;

4. Aggregate Operators

  • GROUP BY: group by a certain column with a math operation (AVG / MIN / MAX / ABS / etc.)
GROUP BY <colname>
  • HAVING: gives the condition of a group

For example, to make a group larger than 2 items,

HAVING COUNT(*) >= 2

Here’s an example of why HAVING is good syntax. For Leetcode #569, the question is:

There is a table courses with columns: student and class. Please list out all classes which have more than or equal to 5 students. For example, the table:

+---------+------------+
| student | class |
+---------+------------+
| A | Math |
| B | English |
| C | Math |
| D | Biology |
| E | Math |
| F | Computer |
| G | Math |
| H | Math |
| I | Math |
+---------+------------+

Should output:

+---------+
| class |
+---------+
| Math |
+---------+

Note:
The students should not be counted duplicate in each course.

If we are not using HAVING, we are going to write the following code,

SELECT class
FROM
(
SELECT class, COUNT(DISTINCT student) AS num
FROM courses
GROUP BY class
) AS studentsnumber
WHERE num >= 5;

If we use HAVING for our code, then,

SELECT class
FROM courses
GROUP BY class
HAVING COUNT(DISTINCT student) >= 5;