Writing and Running SQL Code
Stryg for at vise menuen
SQL (Structured Query Language) is the industry-standard language for interacting with relational databases. In Databricks, SQL is used to query tables in the Lakehouse, build visualizations, and perform data analysis within notebooks.
While Python is powerful for data engineering, SQL remains the most popular language for data analysis. Databricks treats SQL as a first-class citizen. Even if your notebook's default language is set to Python, you can easily switch to SQL to perform quick data lookups or complex joins.
Using SQL in a SQL Notebook
If you created your notebook with SQL as the default language, every cell you create will automatically be a SQL cell. You simply type your query - such as SELECT * FROM table_name - and run it. The output is displayed in a clean, interactive table format where you can sort columns or search for specific values without writing extra code.
The "Magic" Command: %sql
If you are working in a Python notebook, you don't need to create a whole new file just to run a SQL query. Databricks uses "Magic Commands" to switch languages on the fly. By typing %sql at the very beginning of a cell, you tell the cluster to ignore Python and execute that specific cell as SQL code.
%sql
SELECT "Hello from the SQL engine" AS message
Querying the Sample Data
In Section 2, we uploaded a file called diamonds. Now, we can query it directly. In a SQL cell, try the following code:
SELECT
Region,
Item_Type,
Total_Revenue
FROM
main.default.sample_sales_records
LIMIT 10
Notice the naming convention: catalog.schema.table. This three-part name ensures you are pulling the correct data from the Catalog we explored earlier.
Interactive Results and Visualizations
When a SQL cell finishes running, Databricks provides an interactive result set. Below the results table, you will see a "+" icon followed by "Visualization". This allows you to turn your SQL results into bar charts, line graphs, or pie charts instantly. Unlike traditional SQL environments where you might need to export data to Excel to see a chart, Databricks allows you to build the analysis and the visual in the same space.
Formatting SQL for Readability SQL can become difficult to read as queries grow in complexity. Databricks includes a built-in SQL Formatter. You can find this in the cell toolbar (often represented by a small "magic wand" icon or under the "Edit" menu). Clicking this will automatically indent your keywords and align your columns, making your code easier for your teammates to review in the Workspace.
1. If you are in a Python notebook and want to write a SQL query in a single cell, which "Magic Command" must you type at the top?
2. What is the correct way to reference a table in Databricks using the three-part naming convention?
Tak for dine kommentarer!
Spørg AI
Spørg AI
Spørg om hvad som helst eller prøv et af de foreslåede spørgsmål for at starte vores chat