We will be setting-up DuckDB to work in Jupyter-powered Quarto Documents.
Set-up jupysql to be able to write sql queries directly
To do it with DuckDB, basically following this guide, just need to make sure jupysql, SQLAlchemy and duckdb-engine are installed, besides the core libraries (notebook, pandas, duckdb). If any of them mssing, simply pip install them.
Step 1 is then to import extension. It enables SQL cells in Jupyter. It supports inline SQL using %sql and a whole SQL cell starting it with %%sql.
import duckdbimport pandas as pdimport sqlalchemy # No need to import duckdb_engine, # SQLAlchemy will auto-detect %load_ext sql%config SqlMagic.autopandas =True%config SqlMagic.feedback =False%config SqlMagic.displaycon =False
Loading configurations from /home/runner/work/strom/strom/pyproject.toml.
Settings changed:
Config
value
feedback
True
autopandas
True
displaycon
False
dsn_filename
./connections.ini
I prefer Quarto to edit my notebooks, and the above still works. However, it seems Quarto’s SQL engine is still only for R since it requires knitr and does not seem to support the combo ipython-sql-SQLAlchemy. So you cannot simply use an SQl chunk like this
```{sql}SELECT*FROM test;```
But you have to use a standard python chunk and use the %sql or %%sql to be able to write SQL direcly.
Step 2 is to fire-up DuckDB, either in memory or pointing to a file.
We will need to make sure we load the extension for all Quarto documents, as they will all be different jupyter notebooks (TODO: double-check this). So we will use Quarto includes to keep the source files not-so-verbose.
Note that we use an underscore (_) prefix for the included file. You should always use an underscore prefix with included files so that they are automatically ignored (i.e. not treated as standalone files) by a quarto render of a project).
In Quarto projects, an include path that begins with a leading slash will be interpreted as project relative, meaning that you should be able to use an include such as:
"{{< include /_codebit.qmd >}}"
to include from the project root no matter the subdirectory of the file itself.