
Connecting a source to bronze

EasyFabric is uniform from the bronze table onward: once data lands in bronze, every object flows through the same bronze → silver → gold path. The default flow covers the vast majority of cases — you can add per-object exceptions, but you rarely need to.

The one genuinely variable part is the decoupling point between the source and bronze — getting the raw data into the bronze landing in the first place. A source can be almost anything; databases, files and APIs (Microsoft Graph / Entra, a REST endpoint, a SaaS product) are the most common.

First, check whether the EasyFabric wheel already handles your source; if it does, you write no code at all. It ships built-in loaders for the common cases:

  • Azure Blob Storage → the built-in azblob connection pulls the files for you.
  • Files already dropped into Fabric (by a third party or another process) → the built-in fabricfiles connection.

For these you only need a connection YAML and an object YAML — no notebook. See Connection config.

You write a connection notebook only when there is no built-in loader for the source, for example a database, a REST API or a SaaS product. Good practice is one reusable notebook per source type (an Azure SQL notebook, your own API notebook), not a one-off per table. EasyFabric drives it: when it loads an object, the wheel runs your notebook and passes it three free parameters (Param001 to Param003) plus the object's YAML and standard context (sourcetable, sourcefilter, bronzefolder, keyvault). The notebook uses those to know what to fetch:

  • Azure SQL: Param001 names the Key Vault secret holding the connection string, and Param002 carries the query.
  • A REST API such as Entra: pass whatever identifies the slice (a group id, an endpoint) in Param001.

The notebook's job ends once it has written the raw data into the Bronze Files section. From there the standard flow takes over identically for every source.

This page walks through a connection notebook end to end, using a reusable Pull_Entra notebook as the example.

The pattern

The notebook is the source-fetch step, not the loader. It drops one or more files into the Bronze Files section; the standard bronze flow then loads them into Bronze (landing + history) and on into Silver — exactly as it would for any third-party file.

That means three artifacts, not one:

  1. A connection YAML of type fabricfiles — tells EasyFabric to look for a file in the Bronze Files section.
  2. An object (table) YAML — defines the table, its columns and primary key, and references your notebook via PreBronzeNotebook.
  3. Your notebook — fetches the data and writes the file the object expects.

Let EasyFabric build the tables, not your notebook

Your notebook's only job is the fetch: write the raw data into Bronze Files and stop there. Don't let it create the destination table — EasyFabric builds and manages every medallion table from the connection and object YAML. If a notebook writes its own table instead, the framework never sees that table, so the data gets no Silver table, no history and no lineage.

1. Connection

A fabricfiles connection points at a folder in the Bronze lakehouse Files section. See Connection config.

ConnectionName: med-entra
ConnectionPrefix: med
ConnectionType: fabricfiles
BronzeFolder: Entra
FileType: json
FileExtension: json

2. Object (table)

The object defines the table that the data lands in, and wires in the notebook with PreBronzeNotebook. See Object config.

Table:
  Connection: med-entra
  SourceTable: groupmembers.json # the file Pull_Entra writes; the standard load reads it
  DataPlatformObjectname: groupmembers
  PreBronzeNotebook:
    Notebook: Pull_Entra # one reusable Entra notebook, not one per table
    Param001: groupMembers # what to request from Graph; passed to the notebook
  KeepHistory: true
Columns:
  - SourceColumn: GroupId
    SourceDataType: varchar(100)
    IsPrimaryKey: true
  - SourceColumn: MemberId
    SourceDataType: varchar(100)
    IsPrimaryKey: true
  - SourceColumn: DisplayName
    SourceDataType: varchar(255)

SourceTable is the file Pull_Entra writes into BronzeFolder (here groupmembers.json); the standard fabricfiles load then reads it. What to request comes from a parameter, not from SourceTable: Param001 tells Pull_Entra which Entra data to fetch. A second object, say users, reuses the same Pull_Entra notebook with SourceTable: users.json and its own Param001; nothing in the notebook changes. With KeepHistory: true and a primary key, Bronze also writes a Bronze.his history table and the data propagates to Silver automatically.

3. The notebook contract

PreBronzeNotebook runs before the standard bronze load. The notebook reads its inputs to learn what to fetch, pulls the data, and writes it where the loader expects. Three rules make it work:

  • Read what to fetch from the parameters; don't hardcode it. param001 to param003 (plus the optional sourcefilter) tell the notebook which slice to pull: for example, param001=groupMembers fetches group members from Graph and param001=users fetches users. This is what lets one notebook serve many objects.
  • Write the result into bronzefolder under the name SourceTable, in the connection's FileType. sourcetable, bronzefolder and the full object_yaml_file tell the notebook where to write. The output is a single file (groupmembers.json) or a folder of part files — e.g. a SQL extract written as part_00000.parquet, part_00001.parquet…; the fabricfiles loader reads whatever it finds there.
  • Declare the standard parameter cell, so EasyFabric can pass those values in:

sourcetable=""
sourcefilter=""
bronzefolder=""
keyvault=""
object_yaml_file=""
param001=""
param002=""
param003=""

EasyFabric supplies sourcetable, sourcefilter, bronzefolder, keyvault and object_yaml_file automatically; param001 to param003 carry the optional Param001 to Param003 from the object YAML. Because everything the notebook needs arrives as a parameter, and object_yaml_file lets it read the full object definition, one notebook can serve every object that uses this source type. That is the connection-notebook-per-type practice: write it once for Azure SQL or your API, and reuse it across tables.
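
A minimal sketch of a notebook body that follows the three rules above. The fetcher functions, the FETCHERS mapping, run_pull and the files_root argument are hypothetical names chosen for illustration, not part of EasyFabric, and the fetchers return canned rows instead of calling Microsoft Graph.

```python
import json
from pathlib import Path
from typing import Callable


# Hypothetical fetchers: a real notebook would call the source API here.
def fetch_group_members() -> list[dict]:
    return [{"GroupId": "g1", "MemberId": "m1", "DisplayName": "Example User"}]


def fetch_users() -> list[dict]:
    return [{"UserId": "u1", "DisplayName": "Example User"}]


# Rule 1: the value of param001 selects what to fetch; nothing is hardcoded.
FETCHERS: dict[str, Callable[[], list[dict]]] = {
    "groupMembers": fetch_group_members,
    "users": fetch_users,
}


def run_pull(param001: str, bronzefolder: str, sourcetable: str,
             files_root: str = "Files") -> Path:
    """Fetch the slice named by param001 and write it as one JSON file."""
    if param001 not in FETCHERS:
        raise ValueError(
            f"Unknown param001 {param001!r}; expected one of {sorted(FETCHERS)}"
        )
    rows = FETCHERS[param001]()

    # Rule 2: write into bronzefolder under the name given by sourcetable.
    target = Path(files_root) / bronzefolder / sourcetable
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(rows), encoding="utf-8")
    return target
```

In a notebook the call would be run_pull(param001, bronzefolder, sourcetable), using the values from the parameter cell. Supporting another object then means adding one entry to FETCHERS, not writing a new notebook.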

Keep source specifics in metadata

If your notebook currently hardcodes its inputs — for example a list of Entra groups in a param.py — move them into Param001 (or SourceFilter) on the object YAML. That keeps what is fetched in metadata and how it is fetched in the notebook, which is the whole point of the metadata-driven model.
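
As an illustration, a list that used to live in param.py could travel in Param001 as a single string. The comma-separated convention and the parse_param_list helper below are assumptions made for this sketch; EasyFabric does not prescribe a format for the parameter value.

```python
def parse_param_list(value: str) -> list[str]:
    """Split a comma-separated parameter value into trimmed, non-empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


# Example: Param001: "Finance, HR ,Sales" on the object YAML
groups = parse_param_list("Finance, HR ,Sales")
```

The object YAML then decides which groups are fetched, and the notebook only knows how to fetch a group.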

Authentication

Most sources need a token, and you should never embed secrets in the notebook — read them from Key Vault at runtime. The recommended pattern (used by the Azure SQL example below):

  1. The service principal's client id comes from config_manager.applicationclientid; its client secret is a Key Vault secret.
  2. Read the secret with notebookutils.credentials.getSecret(...), then request an AAD token via the OAuth2 client_credentials grant for the target scope — e.g. https://database.windows.net/.default for Azure SQL, https://graph.microsoft.com/.default for Entra.
  3. Use that token to authenticate: for Azure SQL, inject it into the pyodbc connection via the SQL_COPT_SS_ACCESS_TOKEN attribute; for a REST API, send it as a Bearer header.
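
The steps above can be sketched with the standard library alone. This is an illustration, not the code of the shipped notebook: the function names are hypothetical, the tenant id is assumed to be available from your own configuration, and the HTTP call itself is left out so the sketch stays offline.

```python
import struct
from urllib.parse import urlencode

# Connection attribute of the Microsoft ODBC driver used to pass an access token.
SQL_COPT_SS_ACCESS_TOKEN = 1256


def build_token_request(tenant_id: str, client_id: str,
                        client_secret: str, scope: str) -> tuple[str, bytes]:
    """Return the URL and form body for an OAuth2 client_credentials request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,  # read from Key Vault at runtime
        "scope": scope,
    }).encode("ascii")
    return url, body


def pack_token_for_pyodbc(access_token: str) -> bytes:
    """Encode a token as the driver expects: 4-byte length, then UTF-16-LE bytes."""
    raw = access_token.encode("utf-16-le")
    return struct.pack(f"<I{len(raw)}s", len(raw), raw)


def bearer_header(access_token: str) -> dict[str, str]:
    """Header for REST sources such as Microsoft Graph."""
    return {"Authorization": f"Bearer {access_token}"}
```

With pyodbc, the packed token is passed as pyodbc.connect(connection_string, attrs_before={SQL_COPT_SS_ACCESS_TOKEN: pack_token_for_pyodbc(token)}). The connection string must then not contain a user name or password.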

Cache the token and refresh it before expiry if the pull is long-running.
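
One way to do that is a small cache that asks for a new token shortly before the old one expires. TokenCache is a hypothetical helper written for this page; it assumes the fetch callable returns the access token together with its lifetime in seconds (the expires_in value of the token response).

```python
import time
from typing import Callable, Optional, Tuple


class TokenCache:
    """Return a cached token and refetch it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]],
                 refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.time) -> None:
        self._fetch = fetch            # returns (access_token, expires_in_seconds)
        self._margin = refresh_margin  # refresh this many seconds before expiry
        self._clock = clock            # injectable so the cache can be tested
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at - self._margin:
            token, expires_in = self._fetch()
            self._token = token
            self._expires_at = now + expires_in
        return self._token
```

A long-running pull calls cache.get() before each request or batch instead of holding on to a single token string.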

Example: an Azure SQL pull notebook

templates/demo-platform/Generator/Dataplatform/DP/Notebooks/Bronze/Pull_azuresql is a complete, production-grade connection notebook for Azure SQL. Use it as a starting point — the essence is below.

It is wired as a PreBronzeNotebook and parameterised per object:

  • Param001 — Key Vault secret name holding the SQL connection string
  • Param002 — the query to run

The notebook reads the object to learn where to land the data, runs the query, and writes the result as parquet into the Bronze Files section; the standard fabricfiles flow then loads it:

from easyfabric import load_meta_data, initialize_config, log_segment

# param001, param002 and object_yaml_file come from the standard parameter cell.
# get_aad_token, get_secret, make_sql_engine and extract_to_parquet are helpers
# assumed to be defined elsewhere in the notebook; they are not shown in this excerpt.

with log_segment(type="Technical", name="Bronze"):
    config_manager = initialize_config()

    # 1. Credentials: SPN client id from config, secret from Key Vault → AAD token
    token = get_aad_token(
        config_manager.applicationclientid,
        config_manager.keyvault,
        scope="https://database.windows.net/.default",
    )
    connection_string = get_secret(config_manager.keyvault, param001)
    engine = make_sql_engine(connection_string, token)  # injects token into pyodbc

    # 2. Where to land: read the object for bronzefolder + sourcetable
    obj = load_meta_data.get_object_by_file(object_yaml_file)
    target = f"Files/{obj.bronzefolder}/{obj.sourcetable}"

    # 3. Extract and write to Bronze Files (the standard loader picks it up)
    extract_to_parquet(engine, query=param002, target_path=target)

The full notebook adds the production concerns a real source needs: chunked extraction for large tables, retries with backoff, connection pooling, a serverless-DB wakeup ping, and cleaning the target path before writing so retries stay idempotent.
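
Two of those concerns, retries with backoff and cleaning the target before writing, can be sketched as follows. This is not the code from Pull_azuresql: with_retries, clean_target and part_name are hypothetical helpers, and a real notebook would catch only the transient errors of its driver instead of every Exception.

```python
import shutil
import time
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(func: Callable[[], T], attempts: int = 3,
                 base_delay: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call func, retrying on failure with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")


def clean_target(target: Path) -> None:
    """Remove anything left at target by an earlier run, then recreate the folder."""
    if target.is_dir():
        shutil.rmtree(target)
    elif target.exists():
        target.unlink()
    target.mkdir(parents=True, exist_ok=True)


def part_name(index: int, extension: str = "parquet") -> str:
    """Zero-padded part file name, e.g. part_00001.parquet."""
    return f"part_{index:05d}.{extension}"
```

Cleaning first means a retried run never mixes new part files with leftovers from a failed attempt, so the loader always sees one complete extract.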

When to use a full override instead

The pattern above keeps EasyFabric in charge of the load and uses the notebook only to fetch a file. If you instead need the notebook to produce the Bronze table itself — no intermediate file — use BronzeNotebook on the object (see Object config) to override the default bronze flow. You still need the object YAML to define the table, columns and keys; only the loading step changes.

Prefer PreBronzeNotebook + a file unless you have a specific reason not to: it keeps freshness checks, history and the standard load behaviour intact.

Verify it worked

The failure mode here is silent: a notebook can run happily and write a table that EasyFabric never registered. Confirm the wiring before you rely on it.

1. EasyFabric recognizes the object. Run this in a Fabric notebook — if it prints your columns and keys, the object YAML is being picked up:

from easyfabric import load_meta_data
from pprint import pprint

pprint(load_meta_data.get_object_by_file("Files/Objects/med-entra/groupmembers.yaml"))

2. The managed tables exist after deployment. Because EasyFabric is schema-on-write, deploying a wired-in object creates its tables — empty — before any data is loaded. A correctly wired source produces three: the Bronze landing table, the Bronze.his history table (because KeepHistory: true), and the Silver table. A bronze run fills these; it does not create them. If they never appear after a deploy, the object isn't wired in — re-check the connection and object YAML.

Summary

| You created | EasyFabric sees |
| --- | --- |
| Only a notebook that writes a table | Nothing: a loose Delta table; no managed Bronze, history, Silver or lineage |
| Notebook + connection + object YAML | A fully managed source: Bronze landing, Bronze history, Silver, lineage and metadata |