Bronze
Overview
Pull raw data from configured sources (CSV/JSON/XML/Parquet/Excel) into bronze Delta tables. When keephistory: true is set in the per-table YAML, bronze also writes versioned change records to a Bronze.his history table; this is where row history lives. Both the Fabric item generator and the runtime loader consume the same per-table YAML.
What gets generated
| Stage | Component | Output |
|---|---|---|
| Build | EasyFabric Generator (EFG) | GenerateFabricObjects → Fabric lakehouse tables (bronze) |
| Runtime | EasyFabric Runtime (EFR) | easyfabric.load_data_bronze.run, easyfabric.load_data_bronze.dataframeloader |
Example YAML
Dataplatform/DP/Objects/AdvWorks/products.yaml
Table:
  Connection: adv-advworks
  SourceTable: products.csv
  DataPlatformObjectname: products
  PreBronzeNotebook:
    Notebook: Pull_Github
    Param001: products.csv
    Param002: products.csv
  KeepHistory: true
  Columns:
    - SourceColumn: ProductID
      SourceDataType: int
      IsPrimaryKey: true
    - SourceColumn: ProductName
      SourceDataType: varchar(255)
    - SourceColumn: Category
      SourceDataType: varchar(255)
    - SourceColumn: ListPrice
      SourceDataType: decimal(28,5)
Schema reference
Required fields are marked with *.
Fabric Object
| Name | Type | Description |
|---|---|---|
| Connection * | String | Connection to use for this object (as defined in connection) |
| Fields * | List<FabricAttribute> | Fields from the source object |
| SourceTable * | String | Name of the object in the source |
Optional fields
| Name | Type | Description |
|---|---|---|
| DataPlatformObjectName | String | To override the source name that is used in the dataplatform |
| Description | String | Description of the source |
| IsActive | true/false | Set to active for generating this object (default=true) |
| KeepHistory | true/false | Set to true if history is required in the silver layer (default=true). Primary key required. |
| Prefix | String | Set a prefix (default = '') |
| SourceSchema | String | Source schema in the source (if applicable) |
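The KeepHistory row above notes that a primary key is required. A minimal sketch of that check follows, assuming the YAML has already been parsed into a plain dict and that the column list sits under `Columns` as in the example file. `missing_primary_key` is a hypothetical helper for illustration, not an EasyFabric function.

```python
# Hypothetical helper; not part of EasyFabric. The dict mirrors the example YAML.
def missing_primary_key(table: dict) -> bool:
    """Return True when KeepHistory is on but no column is flagged IsPrimaryKey."""
    keep_history = bool(table.get("KeepHistory", True))  # documented default is true
    columns = table.get("Columns", [])
    has_key = any(col.get("IsPrimaryKey", False) for col in columns)
    return keep_history and not has_key


products = {
    "Connection": "adv-advworks",
    "SourceTable": "products.csv",
    "KeepHistory": True,
    "Columns": [
        {"SourceColumn": "ProductID", "SourceDataType": "int", "IsPrimaryKey": True},
        {"SourceColumn": "ProductName", "SourceDataType": "varchar(255)"},
    ],
}

print(missing_primary_key(products))  # False: ProductID is the primary key
```

Running a check like this before a build would surface a missing key early; how EFG or EFR actually react to a missing key is not stated here.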
Fabric Attribute
| Name | Type | Description |
|---|---|---|
| SourceColumn * | String | Name of the attribute in the source system |
| SourceDataType * | String | Datatype of the source (use source datatype, will be converted to dataplatform types automatically) |
Optional fields
| Name | Type | Description |
|---|---|---|
| Classification | ClassificationType | Classification of the field: sensitive, restricted, internal, or public (default = public) |
| IsActive | true/false | Attribute is active (default=true) |
| IsNullable | true/false | Attribute can be null (default = true) |
| IsPrimaryKey | true/false | Attribute is part of the primary key (default = false) |
| Name | String | Overrides the name used in the dataplatform (defaults to the SourceColumn name) |
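The documented defaults for optional attribute fields can be sketched as a small resolver. This is an illustration only: `ATTRIBUTE_DEFAULTS` and `resolve_attribute` are hypothetical names, and the real generator may apply defaults differently.

```python
# Hypothetical sketch of the documented attribute defaults; not EasyFabric code.
ATTRIBUTE_DEFAULTS = {
    "Classification": "public",
    "IsActive": True,
    "IsNullable": True,
    "IsPrimaryKey": False,
}


def resolve_attribute(attr: dict) -> dict:
    """Fill in documented defaults; Name falls back to SourceColumn."""
    resolved = {**ATTRIBUTE_DEFAULTS, **attr}
    resolved.setdefault("Name", resolved["SourceColumn"])
    return resolved


list_price = resolve_attribute(
    {"SourceColumn": "ListPrice", "SourceDataType": "decimal(28,5)"}
)
print(list_price["Name"], list_price["IsNullable"])  # ListPrice True
```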
Runtime-only fields
These fields are read by the EFR at runtime and have no EFG counterpart.
- filetype
- skipifsourceunchanged
- bronzeloadskip
- layers
EasyFabric Runtime
load_data_bronze.run
def run(tablefile: str, config_manager: ConfigManager = None) -> str
Runs the bronze loader for a specified table configuration: pulls files from the source, processes them, and loads them into the bronze layer.
Workflow:
- Validates table configuration and layer settings
- Computes stable Bronze folder path for file tracker
- Loads previous file snapshot from tracker (if exists)
- Optionally performs Azure Blob pre-check for unchanged sources
- Executes pre-bronze notebook (if configured)
- Pulls files from source system
- Validates file freshness (Validation 1)
- Checks for file changes using tracker snapshot (if skipifsourceunchanged enabled)
- Truncates Bronze table and loads file data by type
- Executes mid-bronze notebook and reloads data (if configured)
- Loads history records and validates correlation (Validation 2)
- Saves file snapshot for next run
- Executes post-bronze notebook (if configured)
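The snapshot steps in the workflow above (load previous snapshot, check for changes, save snapshot) can be illustrated with a small sketch. It assumes a snapshot is a mapping of file name to size and modification time; the tracker's actual format and comparison rule are not documented here, and `take_snapshot` and `source_unchanged` are hypothetical names.

```python
import os
import tempfile


def take_snapshot(folder: str) -> dict:
    """Map each file name in folder to (size in bytes, mtime in nanoseconds)."""
    snapshot = {}
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file():
                stat = entry.stat()
                snapshot[entry.name] = (stat.st_size, stat.st_mtime_ns)
    return snapshot


def source_unchanged(previous, current) -> bool:
    """A first run (no previous snapshot) always counts as changed."""
    return previous is not None and previous == current


with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "products.csv")
    with open(path, "w") as handle:
        handle.write("ProductID,ProductName\n1,Widget\n")
    previous = take_snapshot(folder)
    print(source_unchanged(previous, take_snapshot(folder)))  # True: nothing changed
    with open(path, "a") as handle:
        handle.write("2,Gadget\n")
    print(source_unchanged(previous, take_snapshot(folder)))  # False: size differs
```

Comparing size and modification time is cheap but can miss a same-size rewrite within the timestamp resolution; a content hash would be stricter at the cost of reading every file.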
Arguments:
- tablefile (str) - Path to the YAML file representing a table's configuration.
- config_manager (ConfigManager) - An instance of ConfigManager used for accessing the application's configuration settings. Defaults to global config if not provided.
Returns:
str - A message indicating the outcome, such as file count, skip reason, or error details. Returns None if the table is inactive or skipped.
Raises:
Exception - If ConfigManager is not initialized, table config is invalid, or filetype is unsupported.
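Because run can return a message, return None, or raise, a caller looping over several table files has three outcomes to handle. The sketch below shows one way to collect them. `load_tables` and `fake_run` are hypothetical; the loader is passed in as a callable so the example runs without EasyFabric installed.

```python
from typing import Callable, Dict, List, Optional


def load_tables(
    run_fn: Callable[[str], Optional[str]], tablefiles: List[str]
) -> Dict[str, str]:
    """Call run_fn per table file and record a message, a skip, or a failure."""
    outcomes = {}
    for tablefile in tablefiles:
        try:
            message = run_fn(tablefile)
        except Exception as exc:  # run is documented to raise plain Exception
            outcomes[tablefile] = f"failed: {exc}"
            continue
        outcomes[tablefile] = message if message is not None else "inactive or skipped"
    return outcomes


def fake_run(tablefile: str) -> Optional[str]:
    """Stand-in for the real loader, used only to exercise the wrapper."""
    if tablefile.endswith("broken.yaml"):
        raise Exception("filetype is unsupported")
    if tablefile.endswith("inactive.yaml"):
        return None
    return "1 file loaded"


print(load_tables(fake_run, ["products.yaml", "inactive.yaml", "broken.yaml"]))
```

In real use, `easyfabric.load_data_bronze.run` would take the place of `fake_run`, assuming a ConfigManager has been initialized beforehand.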
Configuration Options:
- skipifsourceunchanged (bool) - Enable skip-if-unchanged detection
- bronzeloadskip (bool) - Skip entire bronze load for this table
- keephistory (bool) - Maintain history records and validation
- prebronzenotebook (str) - Path to notebook to run before loading
- midbronzenotebook (str) - Path to notebook to run between load and history
- postbronzenotebook (str) - Path to notebook to run after loading
- sourceorder (bool) - Sort files by sourceorder instead of name
- bronzefolder (str) - Override Bronze folder from connection
- max_file_age_hours (int) - Maximum file age for freshness validation (per connection)
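The max_file_age_hours option implies a comparison between a file's timestamp and the current time. A minimal sketch of such a freshness check follows, assuming the file's last-modified time is the timestamp used and that the limit is inclusive; neither detail is confirmed here, and `is_fresh` is a hypothetical name.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_fresh(
    modified_at: datetime, max_file_age_hours: int, now: Optional[datetime] = None
) -> bool:
    """Return True when the file is no older than max_file_age_hours."""
    now = now or datetime.now(timezone.utc)
    return now - modified_at <= timedelta(hours=max_file_age_hours)


reference = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
print(is_fresh(reference - timedelta(hours=3), 24, now=reference))   # True
print(is_fresh(reference - timedelta(hours=30), 24, now=reference))  # False
```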
load_data_bronze.dataframeloader
def dataframeloader(data_frame: DataFrame, load_config: LoadConfig,
table_config: TableConfig, config_manager: ConfigManager=None)
Loads a DataFrame into a specified data platform table using the provided configuration and manager.
This function handles the loading operation by using detailed configurations for the DataFrame, table, and the application configuration manager. It sets up logging, ensures required parameters are initialized, and supports specific settings for different layers (e.g., bronze layer). The function handles exception logging and provides mechanisms to stop processing upon encountering errors based on configuration settings.
Arguments:
- data_frame (DataFrame) - The data to be loaded into the specified table.
- load_config (LoadConfig) - Contains configuration for the loading process, including destination table.
- table_config (TableConfig) - Holds table-specific settings, e.g., table name identifiers and layers.
- config_manager (ConfigManager) - Manages and validates application-level configurations.
LoadConfig fields
Runtime parameter bag — construct in code and pass to the loader. All fields are optional unless flagged below.
| Field | Type | Description |
|---|---|---|
| _layer | str | Operational layer associated with the configuration. Defaults to "not set". |
| dry_run | bool | Indicates if the process should be executed in dry-run mode. Defaults to True. |
| auto_null_column | bool | Determines if null values should be automatically managed for columns. Defaults to True. |
| load_type | LoadType | Specifies the type of load operation. Defaults to LoadType.FULL. |
| stop_at_error | bool | Specifies whether the process should stop when an error occurs. Defaults to True. |
| business_key_check | bool | Indicates if business keys should be validated during the load. Defaults to True. |
| log_row_count | bool | Determines if row counts should be logged during the process. Defaults to False. |
| key_violation_action | str | Action to be taken when key violations occur. Defaults to "raise". |
| destination_schema | str | Schema of the destination table. Defaults to "dbo". |
| destination_table | Optional[str] | Name of the destination table. Defaults to None. |
Returns:
str - Message indicating the result of the DataFrame loading process, including the target table name and error details if applicable.
Raises:
- Exception - If the destination table name is missing from LoadConfig.
- Exception - If the ConfigManager is not properly initialized.
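To make the LoadConfig defaults and the missing-destination error concrete, here is a stand-in dataclass that mirrors the documented fields. It is not the real `LoadConfig`: `LoadConfigSketch`, `require_destination`, and the `LoadType` enum value are placeholders, and only the field names and defaults come from the table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class LoadType(Enum):
    FULL = "full"  # only member named in the docs; the value is a placeholder


@dataclass
class LoadConfigSketch:
    """Stand-in mirroring the documented LoadConfig fields and defaults."""
    _layer: str = "not set"
    dry_run: bool = True
    auto_null_column: bool = True
    load_type: LoadType = LoadType.FULL
    stop_at_error: bool = True
    business_key_check: bool = True
    log_row_count: bool = False
    key_violation_action: str = "raise"
    destination_schema: str = "dbo"
    destination_table: Optional[str] = None


def require_destination(cfg: LoadConfigSketch) -> str:
    """Mimic the documented failure when destination_table is missing."""
    if not cfg.destination_table:
        raise Exception("destination_table is missing from LoadConfig")
    return f"{cfg.destination_schema}.{cfg.destination_table}"


cfg = LoadConfigSketch(destination_table="products", dry_run=False)
print(require_destination(cfg))  # dbo.products
```

Since dry_run defaults to True, a caller presumably has to set it to False explicitly for a load that should write data; that reading follows from the table, not from a confirmed behavior.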
Related data classes
- easyfabric.data.TableConfig
- easyfabric.data.Column
- easyfabric.data.Connection