View on GitHub

Northwind Company Analysis

By Abdullah Subuh

Project Overview

The Northwind database contains the sales data for a fictitious company called Northwind Traders, which imports and exports specialty foods from around the world.

My task in this case study is to perform an analysis of the performance of this company using SQL.


Table of Contents


Dataset

The dataset consists of 14 tables, including Customers, Orders, Products, and Employees. Below is the Entity Relationship Diagram (ERD) for the Northwind Traders database:

Image


Data Dictionary

The Northwind Traders database consists of 14 tables, each containing specific information about the company’s operations. Below is a detailed breakdown of the tables and their columns:


Categories Stores information about product categories. - **category_id** (Primary Key): Unique identifier for each category. - **category_name**: Name of the category. - **description**: Description of the category. - **picture**: Image representing the category.
Products Contains details about the products sold by Northwind Traders. - **product_id** (Primary Key): Unique identifier for each product. - **product_name**: Name of the product. - **supplier_id** (Foreign Key): Links to the `suppliers` table. - **category_id** (Foreign Key): Links to the `categories` table. - **quantity_per_unit**: Quantity of the product per unit (e.g., "24 boxes"). - **unit_price**: Price per unit of the product. - **units_in_stock**: Number of units currently in stock. - **units_on_order**: Number of units on order. - **reorder_level**: Minimum stock level before reordering. - **discontinued**: Indicates whether the product is discontinued (1 = Yes, 0 = No).
Suppliers Stores information about suppliers. - **supplier_id** (Primary Key): Unique identifier for each supplier. - **company_name**: Name of the supplier company. - **contact_name**: Name of the contact person. - **contact_title**: Title of the contact person. - **address**: Address of the supplier. - **city**: City where the supplier is located. - **region**: Region or state where the supplier is located. - **postal_code**: Postal code of the supplier. - **country**: Country where the supplier is located. - **phone**: Phone number of the supplier. - **fax**: Fax number of the supplier. - **homepage**: Website of the supplier.
Employees Contains information about employees. - **employee_id** (Primary Key): Unique identifier for each employee. - **last_name**: Last name of the employee. - **first_name**: First name of the employee. - **title**: Job title of the employee. - **title_of_courtesy**: Courtesy title (e.g., Mr., Ms.). - **birth_date**: Birth date of the employee. - **hire_date**: Date the employee was hired. - **address**: Address of the employee. - **city**: City where the employee resides. - **region**: Region or state where the employee resides. - **postal_code**: Postal code of the employee. - **country**: Country where the employee resides. - **home_phone**: Home phone number of the employee. - **extension**: Office extension of the employee. - **photo**: Photo of the employee. - **notes**: Additional notes about the employee. - **reports_to** (Foreign Key): Links to the `employees` table (manager of the employee). - **photo_path**: Path to the employee's photo.
Order Details Stores details about each order. - **order_id** (Foreign Key): Links to the `orders` table. - **product_id** (Foreign Key): Links to the `products` table. - **unit_price**: Price per unit of the product at the time of the order. - **quantity**: Quantity of the product ordered. - **discount**: Discount applied to the product.
Customers Contains information about customers. - **customer_id** (Primary Key): Unique identifier for each customer. - **company_name**: Name of the customer company. - **contact_name**: Name of the contact person. - **contact_title**: Title of the contact person. - **address**: Address of the customer. - **city**: City where the customer is located. - **region**: Region or state where the customer is located. - **postal_code**: Postal code of the customer. - **country**: Country where the customer is located. - **phone**: Phone number of the customer. - **fax**: Fax number of the customer.
Orders Stores information about orders placed by customers. - **order_id** (Primary Key): Unique identifier for each order. - **customer_id** (Foreign Key): Links to the `customers` table. - **employee_id** (Foreign Key): Links to the `employees` table. - **order_date**: Date the order was placed. - **required_date**: Date the order is required. - **shipped_date**: Date the order was shipped. - **ship_via** (Foreign Key): Links to the `shippers` table. - **freight_name**: Name of the freight company. - **ship_address**: Shipping address. - **ship_city**: City where the order is shipped. - **ship_region**: Region or state where the order is shipped. - **ship_postal_code**: Postal code of the shipping address. - **ship_country**: Country where the order is shipped.
Shippers Stores information about shipping companies. - **shipper_id** (Primary Key): Unique identifier for each shipper. - **company_name**: Name of the shipping company. - **phone**: Phone number of the shipping company.
Customer Demographics Stores demographic information about customers. - **customer_type_id** (Primary Key): Unique identifier for each customer type. - **customer_desc**: Description of the customer demographic.
Customer Customer Demo Links customers to their demographic information. - **customer_id** (Foreign Key): Links to the `customers` table. - **customer_type_id** (Foreign Key): Links to the `customer_demographics` table.
Employee Territories Links employees to their assigned territories. - **employee_id** (Foreign Key): Links to the `employees` table. - **territory_id** (Foreign Key): Links to the `territories` table.
Territories Stores information about sales territories. - **territory_id** (Primary Key): Unique identifier for each territory. - **territory_description**: Description of the territory. - **region_id** (Foreign Key): Links to the `region` table.
Region Stores information about regions. - **region_id** (Primary Key): Unique identifier for each region. - **region_description**: Description of the region.
US States Contains information about U.S. states. - **us_state_id** (Primary Key): Unique identifier for each state. - **us_state_name**: Name of the state. - **state_abbr**: Abbreviation of the state. - **state_region**: Region of the state.

Business Questions and Solutions

Question 1

Business Problem: For their annual review of the company pricing strategy, the Product Team wants to look at the products that are currently being offered for a specific price range ($20 to $50). In order to help them, they asked you to provide them with a list of products with the following information:

Filtered on the following conditions:

Finally, order the results by unit price in a descending order (highest first).

Solution

-- Solution Query
SELECT
	product_name,
	unit_price
FROM products
WHERE unit_price BETWEEN 20 AND 50
AND discontinued = 0
ORDER BY unit_price DESC;
Output | product_name | unit_price | |---------------------------------------|------------| | Tarte au sucre | 49.3 | | Ipoh Coffee | 46 | | Vegie-spread | 43.9 | | Schoggi Schokolade | 43.9 | | Northwoods Cranberry Sauce | 40 | | Gnocchi di nonna Alice | 38 | | Queso Manchego La Pastora | 38 | | Gudbrandsdalsost | 36 | | Mozzarella di Giovanni | 34.8 | | Camembert Pierrot | 34 | | Wimmers gute Semmelknödel | 33.25 | | Mascarpone Fabioli | 32 | | Gumbär Gummibärchen | 31.23 | | Ikura | 31 | | Uncle Bob's Organic Dried Pears | 30 | | Sirop d'érable | 28.5 | | Gravad lax | 26 | | Nord-Ost Matjeshering | 25.89 | | Grandma's Boysenberry Spread | 25 | | Pâté chinois | 24 | | Tofu | 23.25 | | Chef Anton's Cajun Seasoning | 22 | | Flotemysost | 21.5 | | Louisiana Fiery Hot Pepper Sauce | 21.05 | | Queso Cabrales | 21 | | Gustaf's Knäckebröd | 21 | | Maxilaku | 20 |

Question 2: Logistics Performance in 1998

Business Problem: The Logistics Team wants to do a retrospection of their performances for the year 1998, in order to identify for which countries they didn’t perform well. They asked you to provide them a list of countries with the following information:

Filtered on the following conditions:

Finally, order the results by country name in ascending order (lowest first).

Solution

-- Solution Query
WITH cte_avg_days AS (
	SELECT
		ship_country,
		ROUND(AVG(
			EXTRACT(DAY FROM (shipped_date - order_date) * INTERVAL '1 DAY')
			)::NUMERIC,
		2) AS average_days_between_order_shipping,
		COUNT(*) AS total_number_orders
	FROM orders
	WHERE EXTRACT(YEAR FROM order_date) = 1998
	GROUP BY 
		ship_country
	ORDER BY ship_country
	)
SELECT * FROM cte_avg_days
WHERE average_days_between_order_shipping >= 5
AND total_number_orders > 10;
Output | ShipCountry | AvgShippingDelay | TotalOrders | |--------------|------------------|-------------| | Austria | 5.89 | 11 | | Brazil | 8.12 | 28 | | France | 9.43 | 23 | | Germany | 5.38 | 34 | | Spain | 7.83 | 12 | | Sweden | 13.29 | 14 | | UK | 6.25 | 16 | | USA | 7.89 | 39 | | Venezuela | 8.73 | 18 |

Question 3: Employee Age and Manager Information

Business Problem: The HR Team wants to know for each employee what was their age on the date they joined the company and who they currently report to. Provide them with a list of every employee with the following information:

Finally, order the results by employee age and employee full name in ascending order (lowest first).

Solution

-- Solution Query
SELECT
    CONCAT(e.first_name, ' ', e.last_name) AS employee_full_name,
	e.title AS employee_title,
	EXTRACT(YEAR FROM AGE(e.hire_date, e.birth_date))::INT AS employee_age,
	CONCAT(m.first_name, ' ', m.last_name) AS manager_full_name,
	m.title AS manager_title
FROM
    employees AS e
INNER JOIN employees AS m 
ON m.employee_id = e.reports_to
ORDER BY
    employee_age,
	employee_full_name;
Output | employee_full_name | employee_title | employee_age | manager_full_name | manager_title | |----------------------|--------------------------|--------------|---------------------|-------------------------| | Anne Dodsworth | Sales Representative | 28 | Steven Buchanan | Sales Manager | | Janet Leverling | Sales Representative | 28 | Andrew Fuller | Vice President, Sales | | Michael Suyama | Sales Representative | 30 | Steven Buchanan | Sales Manager | | Robert King | Sales Representative | 33 | Steven Buchanan | Sales Manager | | Laura Callahan | Inside Sales Coordinator | 36 | Andrew Fuller | Vice President, Sales | | Steven Buchanan | Sales Manager | 38 | Andrew Fuller | Vice President, Sales | | Nancy Davolio | Sales Representative | 43 | Andrew Fuller | Vice President, Sales | | Margaret Peacock | Sales Representative | 55 | Andrew Fuller | Vice President, Sales |

Question 4: Global Logistics Performance (1997-1998)

Business Problem: The Logistics Team wants to do a retrospection of their global performances over 1997-1998, in order to identify for which month they perform well. They asked you to provide them a list with:

Filtered on the following conditions:

Finally, order the results by total freight in descending order.

Solution

-- Solution Query
WITH cte_freight AS (
	SELECT
		CONCAT(EXTRACT(YEAR FROM order_date), 
			   '-', 
			   EXTRACT(MONTH FROM order_date), 
			   '-01'
			  ) AS year_month,
		COUNT(*) AS total_number_orders,
		ROUND(
			SUM(freight)
			)::INT AS total_freight
	FROM orders
	WHERE order_date >= '1997-01-01' AND order_date < '1998-01-31'
	GROUP BY 
		CONCAT(EXTRACT(YEAR FROM order_date), 
			   '-', 
			   EXTRACT(MONTH FROM order_date), 
			   '-01'
			  )
)
SELECT * FROM cte_freight
WHERE total_number_orders > 35
ORDER BY total_freight DESC;
Output | year_month | total_number_orders | total_freight | |--------------|---------------------|---------------| | 1998-4-01 | 74 | 6394 | | 1998-1-01 | 55 | 5463 | | 1998-3-01 | 73 | 5379 | | 1998-2-01 | 54 | 4273 | | 1997-10-01 | 38 | 3946 | | 1997-12-01 | 48 | 3758 | | 1997-9-01 | 37 | 3237 |

Question 5: Category Performance by Price Range

Business Problem: The Pricing Team wants to know how each category performs according to their price range. In order to help them, they asked you to provide them a list of categories with:

Finally, order the results by category name then price range (both ascending order).

Solution

--Solution query
SELECT
	c.category_name,
	CASE 
		WHEN p.unit_price < 20 THEN '1. Below $20'
		WHEN p.unit_price >= 20 AND p.unit_price <= 50 THEN '2. $20 - $50'
		WHEN p.unit_price > 50 THEN '3. Over $50'
		END AS price_range,
	ROUND(SUM(d.unit_price * d.quantity)) AS total_amount,
	COUNT(DISTINCT d.order_id) AS total_number_orders
FROM categories AS c
INNER JOIN products AS p
ON c.category_id =  p.category_id
INNER JOIN order_details AS d
ON d.product_id =  p.product_id
GROUP BY 
	c.category_name,
	price_range
ORDER BY 
	c.category_name,
	price_range;
Output | category_name | price_range | total_amount | total_number_orders | |-----------------|----------------|--------------|---------------------| | Beverages | 1. Below $20 | 111464 | 317 | | Beverages | 2. $20 - $50 | 25079 | 28 | | Beverages | 3. Over $50 | 149984 | 24 | | Condiments | 1. Below $20 | 28622 | 85 | | Condiments | 2. $20 - $50 | 85073 | 121 | | Confections | 1. Below $20 | 57369 | 197 | | Confections | 2. $20 - $50 | 96094 | 106 | | Confections | 3. Over $50 | 23636 | 16 | | Dairy Products | 1. Below $20 | 17886 | 81 | | Dairy Products | 2. $20 - $50 | 157148 | 204 | | Dairy Products | 3. Over $50 | 76296 | 54 | | Grains/Cereals | 1. Below $20 | 25364 | 99 | | Grains/Cereals | 2. $20 - $50 | 75363 | 91 | | Meat/Poultry | 1. Below $20 | 5121 | 36 | | Meat/Poultry | 2. $20 - $50 | 76504 | 96 | | Meat/Poultry | 3. Over $50 | 96563 | 36 | | Produce | 1. Below $20 | 2566 | 13 | | Produce | 2. $20 - $50 | 57960 | 81 | | Produce | 3. Over $50 | 44743 | 39 | | Seafood | 1. Below $20 | 69673 | 217 | | Seafood | 2. $20 - $50 | 39963 | 70 | | Seafood | 3. Over $50 | 31988 | 27 |

Question 6: Regional Supplier Stock Analysis

Business Problem: The Logistics Team wants to know what is the current state of our regional suppliers’ stocks for each category of product. In order to help them, they asked you to provide them a list of categories with:

Finally, order the results by supplier region, then category name, then price range (each in ascending order).

Solution

--Solution query
SELECT
	c.category_name,
	CASE
		WHEN s.country IN ('Australia', 'Singapore', 'Japan' ) THEN 'Asia-Pacific'
		WHEN s.country IN ('US', 'Brazil', 'Canada') THEN 'America'
		ELSE 'Europe'
	END AS supplier_region,
	p.unit_in_stock AS units_in_stock,
	p.unit_on_order AS units_on_order,
	p.reorder_level 
FROM suppliers AS s
INNER JOIN products AS p
ON s.supplier_id = p.supplier_id
INNER JOIN categories AS c
ON p.category_id = c.category_id
WHERE s.region IS NOT NULL
ORDER BY 
	supplier_region,
	c.category_name,
	p.unit_price;
Output | category_name | supplier_region | units_in_stock | units_on_order | reorder_level | |-----------------|-----------------|----------------|----------------|---------------| | Condiments | America | 113 | 0 | 25 | | Confections | America | 17 | 0 | 0 | | Meat/Poultry | America | 21 | 0 | 10 | | Meat/Poultry | America | 115 | 0 | 20 | | Beverages | Asia-Pacific | 15 | 10 | 30 | | Condiments | Asia-Pacific | 24 | 0 | 5 | | Confections | Asia-Pacific | 29 | 0 | 10 | | Grains/Cereals | Asia-Pacific | 38 | 0 | 25 | | Meat/Poultry | Asia-Pacific | 0 | 0 | 0 | | Meat/Poultry | Asia-Pacific | 0 | 0 | 0 | | Produce | Asia-Pacific | 20 | 0 | 10 | | Seafood | Asia-Pacific | 42 | 0 | 0 | | Beverages | Europe | 52 | 0 | 10 | | Beverages | Europe | 111 | 0 | 15 | | Beverages | Europe | 20 | 0 | 15 | | Condiments | Europe | 4 | 100 | 20 | | Condiments | Europe | 76 | 0 | 0 | | Condiments | Europe | 0 | 0 | 0 | | Condiments | Europe | 53 | 0 | 0 | | Condiments | Europe | 120 | 0 | 25 | | Condiments | Europe | 6 | 0 | 0 | | Dairy Products | Europe | 22 | 30 | 30 | | Dairy Products | Europe | 86 | 0 | 0 | | Produce | Europe | 15 | 0 | 10 | | Seafood | Europe | 85 | 0 | 10 | | Seafood | Europe | 123 | 0 | 30 |

Key Insights and Recommendations

Key Insights

Business Recommendations