Databricks Fundamentals: A Beginner's Guide

Managing Files in the Workspace


Definition

In Databricks, there is a clear distinction between Workspace Files (your notebooks and code) and Data Objects (your tables and raw files). The Catalog is the modern gateway used to manage and discover these data objects.

One of the first things you need to learn is that Databricks has "two sides to the house." One side is for your work - your scripts and notebooks. The other side is for the actual data you are analyzing. Understanding where each lives will save you a lot of frustration when you start writing code.

Workspace Files: Where your code lives

When you click on the Workspace tab in the sidebar, you are looking at a file system for your logic.

  • This is where you create folders, sub-folders, and notebooks.
  • You can also store non-notebook files here, like small Python scripts or requirement files.
  • Important: these are not "data tables." You don't store a 100GB CSV file here. This area is for your intellectual property - the code that tells Databricks what to do.

The Catalog: Where your data lives

When you want to see your data, you go to the Catalog tab. In the past, Databricks relied heavily on something called DBFS (Databricks File System). While you might still see references to DBFS in older documentation, it is now considered a legacy approach.

Today, we use the Catalog (powered by Unity Catalog). This provides a structured, "SQL-like" way to view your data:

  • Catalogs: the top-level logical grouping of schemas (e.g., production_data or marketing_data).
  • Schemas (also called databases): organize tables within a catalog, along with Volumes (see below), ML models, and functions.
  • Tables: the actual rows and columns you will query.
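These three levels combine into a fully qualified name, catalog.schema.table, which is how you reference a table in SQL or Python. A minimal sketch (the catalog, schema, and table names here are hypothetical):

```python
def fq_name(catalog: str, schema: str, table: str) -> str:
    """Build the three-level Unity Catalog name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"

# Hypothetical names for illustration; on Databricks you would pass this
# string to spark.read.table(...) or use it directly in a SQL query.
orders = fq_name("production_data", "sales", "orders")
print(orders)  # production_data.sales.orders
```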

Volumes: Handling Raw Files

Sometimes you have data that isn't a table yet - like a raw CSV or an image file. In the modern Databricks UI, these are stored in Volumes. Think of a Volume as a bridge between the old "folder" way of thinking and the new, secure "Catalog" way of thinking. You can browse these volumes directly inside the Catalog UI to see your raw files before they are loaded into tables.
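Files in a Volume are addressed with a path of the form /Volumes/&lt;catalog&gt;/&lt;schema&gt;/&lt;volume&gt;/&lt;file&gt;. Here is a small sketch that builds such a path (all names below are hypothetical); on a Databricks cluster you could then hand the result to pandas or Spark to read the raw file:

```python
from pathlib import PurePosixPath

def volume_path(catalog: str, schema: str, volume: str, *parts: str) -> str:
    """Build a Unity Catalog Volume path: /Volumes/<catalog>/<schema>/<volume>/..."""
    return str(PurePosixPath("/Volumes", catalog, schema, volume, *parts))

# Hypothetical names; on Databricks, pd.read_csv(raw_csv) would read this file.
raw_csv = volume_path("main", "landing", "raw_files", "sales.csv")
print(raw_csv)  # /Volumes/main/landing/raw_files/sales.csv
```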

Why does the distinction matter?

It all comes down to security and governance. By keeping code in the Workspace and data in the Catalog, Databricks lets administrators grant a user permission to edit a notebook without necessarily granting them permission to see the sensitive data inside a table. This separation of concerns is what makes Databricks an enterprise-grade platform.
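In practice, this separation is expressed with Unity Catalog GRANT statements. The sketch below only composes the SQL strings (the privilege, table, and group names are hypothetical); on Databricks an administrator would run them via spark.sql(...) or in the SQL editor:

```python
def grant_sql(privilege: str, table: str, principal: str) -> str:
    """Compose a Unity Catalog GRANT statement as a plain string."""
    return f"GRANT {privilege} ON TABLE {table} TO `{principal}`"

# Hypothetical example: members of `analysts` may read the table, while a
# user who can edit the notebook but lacks this grant cannot see the data.
stmt = grant_sql("SELECT", "production_data.sales.orders", "analysts")
print(stmt)  # GRANT SELECT ON TABLE production_data.sales.orders TO `analysts`
```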

1. If you want to create a new folder to organize your Python Notebooks, which sidebar tab should you use?

2. What is the modern, recommended way to manage and discover data tables in Databricks?

3. Which legacy term might you see in older Databricks documentation that is now being replaced by the Catalog and Volumes?



Section 2. Chapter 5
