Summary  
This chapter covers using the SQL GROUP BY clause to group rows by specified columns, apply aggregate functions like COUNT and AVG, and use aliases to rename result columns for clarity.  

General domain of usage  
Transportation system performance analysis

Welcome to the **Intermediate SQL** course! 


In the first section, we're diving into how we can **group and aggregate data** within our tables. 


Let's understand what "grouping data" means using a simple example of an employees table:

## Grouping Data

We have a **task** to **find out the number of employees in each department.** To do this, we will group the data by the `department` column and use aggregation with the `COUNT(*)` function. 
 
Here's what the implementation will look like:

```sql
SELECT department, COUNT(*) AS number_of_employees
FROM employees
GROUP BY department
```

So, as you can see, **the syntax** for grouping data looks like this:

```sql
SELECT column1, AGG_FUNC(column2)
FROM table
GROUP BY column1
```

`AGG_FUNC` stands for an aggregate function such as `MAX`, `MIN`, `COUNT`, `SUM`, or `AVG`.

Note

Grouping lets you **compute an aggregate value separately for each distinct value of the grouping column**: the query returns one row per group, not one row per original record.
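To make the grouping concrete, here is a minimal sketch that can be run in an empty scratch database. The table layout and all rows are hypothetical, invented only for illustration; the multi-row `INSERT` is assumed to be supported by your database (it is in PostgreSQL, MySQL, and SQLite).

```sql
-- Hypothetical sample data, invented for illustration only.
CREATE TABLE employees (
    name       VARCHAR(50),
    department VARCHAR(50),
    salary     INTEGER
);

INSERT INTO employees (name, department, salary) VALUES
    ('Alice', 'Sales', 50000),
    ('Bob',   'Sales', 60000),
    ('Carla', 'IT',    70000),
    ('Dan',   'IT',    80000),
    ('Eve',   'HR',    55000);

-- Five input rows collapse into one output row per distinct department.
SELECT department, COUNT(*) AS number_of_employees
FROM employees
GROUP BY department
ORDER BY department;  -- sorting only makes the output order predictable

-- Result for the sample rows above:
-- HR    | 1
-- IT    | 2
-- Sales | 2
```

Without the `ORDER BY`, the groups would still be the same, but the database would be free to return them in any order.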

Let's consider another example: we've been tasked with **finding the department with the highest average salary.**

To retrieve such data, we first **group** the rows by the `department` column and use the `AVG()` function to calculate each department's average salary. The department with the highest value in the result is the answer:

```sql
SELECT department, AVG(salary) AS average_salary
FROM employees
GROUP BY department
```
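The query above returns one row per department. If only the top department is wanted, one option is to sort the groups by their average and keep the first row. The sketch below assumes a dialect that supports `LIMIT` (PostgreSQL, MySQL, SQLite; SQL Server uses `TOP` instead) and uses hypothetical sample rows in an empty scratch database.

```sql
-- Hypothetical sample data, invented for illustration only.
CREATE TABLE employees (
    name       VARCHAR(50),
    department VARCHAR(50),
    salary     INTEGER
);

INSERT INTO employees (name, department, salary) VALUES
    ('Alice', 'Sales', 50000),
    ('Bob',   'Sales', 60000),
    ('Carla', 'IT',    70000),
    ('Dan',   'IT',    80000),
    ('Eve',   'HR',    55000);

-- Averages per department: Sales 55000, IT 75000, HR 55000.
-- Sorting in descending order puts the highest average first,
-- and LIMIT 1 keeps only that row.
SELECT department, AVG(salary) AS average_salary
FROM employees
GROUP BY department
ORDER BY average_salary DESC
LIMIT 1;

-- For the sample rows above, this returns the IT department.
```

If two departments tie for the highest average, `LIMIT 1` returns only one of them, so this approach suits cases where a single answer is enough.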

In this part of the section, we will work with the **Montreal Metro system database**, which contains the `metro_travel_time` table. 



This table contains the **metro line** each station belongs to (`line_name`), the **station name** (`station_name`), and **the amount of time** it takes a train **to travel from that station to the next one** (`time_to_next_station`).

Here is what this **table** looks like and the **data preview** in it:


As you can see, this is **not a complex table**. Let's think about where we can **use grouping** here.

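The most obvious option is **grouping by metro line**, that is, by the `line_name` column, where each line is identified by its color. Once the rows are grouped this way, we can compute an aggregate for each line, such as the number of stations or the total travel time.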
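As a rough sketch of what grouping by line could look like, the script below builds a tiny stand-in for `metro_travel_time` in an empty scratch database. The column names come from the description above; the column types, station names, and time values are assumptions made up for illustration and are not the course's real data.

```sql
-- Stand-in table: column names follow the lesson, types are assumed.
CREATE TABLE metro_travel_time (
    line_name            VARCHAR(50),
    station_name         VARCHAR(100),
    time_to_next_station INTEGER
);

-- Placeholder rows, not real Montreal Metro data.
INSERT INTO metro_travel_time (line_name, station_name, time_to_next_station) VALUES
    ('Green',  'Station A', 2),
    ('Green',  'Station B', 3),
    ('Green',  'Station C', 4),
    ('Yellow', 'Station D', 5),
    ('Yellow', 'Station E', 3);

-- One row per line: how many stations it has and the summed travel time.
SELECT line_name,
       COUNT(*)                  AS number_of_stations,
       SUM(time_to_next_station) AS total_time
FROM metro_travel_time
GROUP BY line_name
ORDER BY line_name;

-- Result for the placeholder rows above:
-- Green  | 3 | 9
-- Yellow | 2 | 8
```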




## Alias

In the assignments, you’ll often use a concept called an **alias**. An alias is essentially a "nickname" for a column you retrieve with a `SELECT` statement. It’s specified using the following syntax:

```sql
SELECT column AS alias
```

An alias only affects how the column is labeled in the query result; it does not rename anything in the table itself.

For example, instead of `MAX(time)`, the column could be called `max_time` if you assign that alias. This makes the output more readable and clear.
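The effect of an alias can be seen by running the same aggregate with and without one. This sketch again uses a small stand-in for `metro_travel_time` in an empty scratch database; the column types and all rows are placeholders invented for illustration.

```sql
-- Stand-in table: column names follow the lesson, types are assumed.
CREATE TABLE metro_travel_time (
    line_name            VARCHAR(50),
    station_name         VARCHAR(100),
    time_to_next_station INTEGER
);

-- Placeholder rows, not real Montreal Metro data.
INSERT INTO metro_travel_time (line_name, station_name, time_to_next_station) VALUES
    ('Green',  'Station A', 2),
    ('Green',  'Station B', 3),
    ('Green',  'Station C', 4),
    ('Yellow', 'Station D', 5),
    ('Yellow', 'Station E', 3);

-- Without an alias, the database chooses the header of the second column
-- (often the raw expression text, depending on the system).
SELECT line_name, MAX(time_to_next_station)
FROM metro_travel_time
GROUP BY line_name
ORDER BY line_name;

-- With an alias, the second column is labeled max_time.
SELECT line_name, MAX(time_to_next_station) AS max_time
FROM metro_travel_time
GROUP BY line_name
ORDER BY line_name;

-- Both queries return the same values for the placeholder rows:
-- Green  | 4
-- Yellow | 5
```

Only the column label differs between the two results; the underlying table and its column names are unchanged.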
