Notebooks

EasyFabric relies almost entirely on notebooks to load and process data into the different stages of the Fabric workspace.

Default notebooks

EasyFabric ships with the following default notebooks:

  • DAG_Multiloader
  • DAG_Gold
  • Load_Bronze
  • Load_Silver
  • Load_Gold

These notebooks depend on the EasyFabric wheel package.

You can open a notebook and run it in the web browser as usual. For example, suppose you want to load a table from Bronze to Silver, based on the settings of the object named MyTable from the source MySource.

Run a notebook: Load_Silver

Steps to follow when using the direct notebook approach:

  • Open Load_Silver
  • Go to the parameter cell (the cell marked 'Parameters' in its right corner)
  • Replace the value of 'tablefile' with the path to the desired object in the Files section of the Meta lakehouse
  • Run all
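
As an illustration of the third step, the parameter cell could end up holding a value like the one below. The `build_tablefile` helper is hypothetical and only mirrors the `Files/Objects/<source>/<object>.yaml` layout of the example path used on this page; your Meta lakehouse may organize its object files differently.

```python
def build_tablefile(source: str, table: str) -> str:
    """Build an object path following the layout of the example on this page.

    The layout is an assumption taken from a single example path,
    not a documented EasyFabric convention.
    """
    return f"Files/Objects/{source}/{table}.yaml"


# Value to place in the parameter cell of Load_Silver
tablefile = build_tablefile("MySource", "MyTable")
print(tablefile)
```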

What's happening during the run?

  1. A session is started
  2. The EasyFabric wheel package is installed
  3. The parameters are set
  4. The configmanager is set, based on the default yaml config file from the Meta lakehouse
  5. The configmanager and the tablefile are used to run the load_data_silver.run method
  6. Info-level logging is displayed in the notebook during the run
  7. The log file is saved to the Meta lakehouse, and the log is also written to a table in the Meta lakehouse
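
The logging behaviour in the last two steps can be pictured with Python's standard `logging` module. This is only a sketch under stated assumptions: `run_load`, the logger name, the `default.yaml` file name, and the local log path are all hypothetical stand-ins, and EasyFabric's own logging and its `load_data_silver.run` signature may look quite different.

```python
import logging
import tempfile
from pathlib import Path


def run_load(tablefile: str, log_path: Path) -> None:
    """Hypothetical stand-in for steps 3 to 7; not EasyFabric's real code."""
    logger = logging.getLogger("load_silver_sketch")
    logger.setLevel(logging.INFO)
    logger.propagate = False   # avoid duplicate lines via the root logger
    logger.handlers.clear()

    stream_handler = logging.StreamHandler()      # step 6: shown in the cell output
    file_handler = logging.FileHandler(log_path)  # step 7: persisted log file
    logger.addHandler(stream_handler)
    logger.addHandler(file_handler)

    # Step 4: placeholder for the configmanager (file name is assumed)
    config = {"config_file": "default.yaml"}
    logger.info("Loading %s with config %s", tablefile, config["config_file"])

    # Step 5 would call load_data_silver.run here; its arguments are not
    # shown on this page, so the call is deliberately left out.
    logger.info("Finished %s", tablefile)

    # Flush and detach the handlers so the file can be read back safely
    file_handler.close()
    logger.removeHandler(file_handler)
    logger.removeHandler(stream_handler)


log_file = Path(tempfile.mkdtemp()) / "load_silver.log"
run_load("Files/Objects/MySource/MyTable.yaml", log_file)
log_text = log_file.read_text()
```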

Run a notebook from a new custom notebook (preferred)

You can also open a new notebook and call an existing notebook with the following Python script:


# Load bronze table via DAG runMultiple
tablefile = "Files/Objects/MySource/MyTable.yaml"

DAG = {
    "activities": [
        {
            "name": "Load_bronze_1",  # activity name, must be unique within the DAG
            "path": "Load_Bronze",  # name of the notebook to run
            "timeoutPerCellInSeconds": 900,  # max timeout for each cell, defaults to 90 seconds
            "args": {"tablefile": tablefile},  # values passed to the notebook's parameter cell
        }
    ]
}

notebookutils.notebook.runMultiple(DAG)

Run a DAG notebook: DAG_Multiloader

Loading a single item is straightforward, but in real-world scenarios you often need to load multiple items into your lakehouses. This is where a DAG (Directed Acyclic Graph) becomes valuable. For example, with DAG_Multiloader you can orchestrate multiple loading operations in a single run. This notebook starts both a Load_Bronze and a Load_Silver operation for each object found in the folder passed as its parameter.
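
To make the idea concrete, here is a minimal sketch of how a DAG with one Load_Bronze and one Load_Silver activity per object could be assembled. The `build_dag` helper and the activity naming scheme are hypothetical, and it is an assumption that Silver should wait for Bronze of the same object; DAG_Multiloader itself may build its DAG differently. The `dependencies` key is part of the DAG format accepted by `notebookutils.notebook.runMultiple`.

```python
from pathlib import PurePosixPath


def build_dag(tablefiles, timeout_per_cell=900):
    """Return a runMultiple-style DAG: Bronze, then Silver, per object file."""
    activities = []
    for tablefile in tablefiles:
        path = PurePosixPath(tablefile)
        # Combine source folder and object name so activity names stay unique
        key = f"{path.parent.name}_{path.stem}"
        bronze_name = f"Load_Bronze_{key}"
        activities.append({
            "name": bronze_name,
            "path": "Load_Bronze",
            "timeoutPerCellInSeconds": timeout_per_cell,
            "args": {"tablefile": tablefile},
        })
        activities.append({
            "name": f"Load_Silver_{key}",
            "path": "Load_Silver",
            "timeoutPerCellInSeconds": timeout_per_cell,
            "args": {"tablefile": tablefile},
            # Silver starts only after Bronze for the same object has finished
            "dependencies": [bronze_name],
        })
    return {"activities": activities}


dag = build_dag([
    "Files/Objects/MySource/MyTable.yaml",
    "Files/Objects/MySource/OtherTable.yaml",  # hypothetical second object
])

# Inside a Fabric notebook, where notebookutils is predefined:
# notebookutils.notebook.runMultiple(dag)
```

Activities without a dependency on each other (here, the two Bronze loads) are free to run in parallel, while each Silver load waits for its own Bronze load.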

DAG in Microsoft Fabric

  • A DAG (Directed Acyclic Graph) represents a workflow structure in Microsoft Fabric's data pipelines
  • It is a collection of tasks (activities) connected by directed dependencies, with no cycles
  • In Fabric, DAGs enable orchestration of data workflows, notebooks, and pipeline activities
  • Each node in a DAG represents a task, while edges show dependencies between tasks
  • DAGs ensure tasks execute in the correct order while preventing circular dependencies
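
The ordering and no-cycles properties can be demonstrated outside Fabric with Python's standard `graphlib` module (Python 3.9+). This is a generic illustration, not how Fabric schedules activities internally; the Bronze, Silver, Gold dependency chain is an assumption made for the example.

```python
from graphlib import CycleError, TopologicalSorter

# Map each task to the set of tasks it depends on (the edges of the DAG)
dependencies = {
    "Load_Bronze": set(),
    "Load_Silver": {"Load_Bronze"},
    "Load_Gold": {"Load_Silver"},
}

# A valid execution order places every task after its dependencies
order = list(TopologicalSorter(dependencies).static_order())
print(order)  # ['Load_Bronze', 'Load_Silver', 'Load_Gold']

# A circular dependency is not a DAG and is rejected
cyclic = {"A": {"B"}, "B": {"A"}}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print(has_cycle)  # True
```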