Wheel Package
What is a Python Wheel Package?
A Python wheel (.whl) is a built-package format for Python that can be directly installed using pip. Wheels offer several advantages over traditional source distributions:
- Pre-built Distribution: Wheels are "built distributions" that don't require a build step during installation
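- Faster Installation: Because nothing is built at install time (any compiled extensions ship already compiled), installation is typically faster than from a source distribution
- Dependency Management: Wheels declare their dependencies in metadata, which pip reads to resolve and install them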
- Platform Specificity: Can be created for specific platforms or be platform-agnostic
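Platform specificity is visible in the wheel's filename, which the wheel specification defines as `{distribution}-{version}(-{build tag})?-{python tag}-{abi tag}-{platform tag}.whl`. As a minimal sketch, the helper below splits a filename into those parts. The names `WheelName`, `parse_wheel_filename`, and `is_platform_agnostic` are hypothetical and not part of pip or EasyFabric.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    """Components of a wheel filename (hypothetical helper type)."""
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        # No optional build tag present.
        dist, version, py_tag, abi_tag, plat_tag = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py_tag, abi_tag, plat_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return WheelName(dist, version, build, py_tag, abi_tag, plat_tag)


def is_platform_agnostic(name: WheelName) -> bool:
    """A wheel tagged 'none' (ABI) and 'any' (platform) installs on any platform."""
    return name.abi_tag == "none" and name.platform_tag == "any"
```

For example, an illustrative filename such as `easyfabric-1.0.0-py3-none-any.whl` would parse as platform-agnostic, while `example_pkg-2.1.0-cp311-cp311-manylinux_2_17_x86_64.whl` would not. Both filenames are made up for this example.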
The EasyFabric Wheel Package
The EasyFabric package is a specialized tool designed to facilitate data loading into a medallion architecture using YAML-based configurations.
Key Components
- ConfigManager: Manages configuration settings from YAML files
- load_data_bronze: Core functionality for loading data into the bronze layer
- load_data_silver: Core functionality for loading data from bronze into the silver layer
- load_data_gold: Core functionality for loading data into the gold layer
Basic Usage
Here's a step-by-step guide to using the package:
Import Required Modules:
from easyfabric import load_data_bronze, ConfigManager
Initialize Configuration:
cfg = ConfigManager.from_yaml_file()
Configure Optional Settings:
# Skip notebook pre-bronze processing (optional)
cfg.skip_notebookprebronze()
# Set batch ID (if needed); batchid is a value you define beforehand
cfg.batch_id = batchid
# Enable stop-at-error functionality (optional); uncomment to enable
# cfg.stop_at_error()
Execute Data Loading:
result = load_data_bronze.run(
    tablefile="Files/Objects/MySource/MyTable.yaml",
    config_manager=cfg,
)
Key Features
- Flexible Configuration: Uses YAML files for easy configuration management
- Secure Authentication: Integrated with Key Vault for secure credential management
- Error Handling: Optional stop-at-error functionality
- Batch Processing: Support for batch ID-based processing
- Pre-processing Control: Ability to skip pre-bronze notebook processing
Best Practices
- Always verify your YAML configuration files before running the data load
- Secure your authentication credentials using Key Vault
- Use batch IDs when processing related data sets
- Monitor the results returned by the load_data_bronze.run() function
- Keep your wheel package updated to the latest version
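The first practice, verifying configuration files before a load, can be partly automated. The sketch below is a minimal pre-flight check using only the standard library. The function name `find_config_problems` is hypothetical, and the check covers only the file extension, existence, and emptiness. It does not validate the YAML structure that EasyFabric expects, and it assumes the paths resolve from the current working directory, which may not hold in every environment.

```python
from pathlib import Path
from typing import Iterable, List


def find_config_problems(tablefiles: Iterable[str]) -> List[str]:
    """Return a human-readable problem for each table file that fails a basic check."""
    problems: List[str] = []
    for name in tablefiles:
        path = Path(name)
        if path.suffix.lower() not in {".yaml", ".yml"}:
            problems.append(f"{name}: expected a .yaml or .yml file")
        elif not path.is_file():
            problems.append(f"{name}: file not found")
        elif path.stat().st_size == 0:
            problems.append(f"{name}: file is empty")
    return problems
```

An empty list means every file passed these basic checks. A non-empty list can be logged or raised before any load is started.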
Error Handling
The package provides several error handling mechanisms:
- Optional stop-at-error functionality
- Return values from the run function indicating success/failure
- Detailed error logging
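To illustrate how stop-at-error behavior and error logging fit together, here is a minimal sketch of the general pattern. It is not EasyFabric's implementation. The helper `run_tables` and its parameters are hypothetical, and because the shape of the value returned by run() is not documented here, the sketch treats only raised exceptions as failures.

```python
import logging
from typing import Callable, Dict, Iterable

logger = logging.getLogger("load_runner")


def run_tables(
    run: Callable[[str], object],
    tablefiles: Iterable[str],
    stop_at_error: bool = False,
) -> Dict[str, bool]:
    """Call run() once per table file and record whether each call succeeded."""
    outcomes: Dict[str, bool] = {}
    for tablefile in tablefiles:
        try:
            run(tablefile)
            outcomes[tablefile] = True
        except Exception:
            # Log the full traceback so the failure can be diagnosed later.
            logger.exception("Load failed for %s", tablefile)
            outcomes[tablefile] = False
            if stop_at_error:
                # Skip the remaining table files after the first failure.
                break
    return outcomes
```

With EasyFabric, the `run` argument could be a small wrapper such as `lambda path: load_data_bronze.run(tablefile=path, config_manager=cfg)`, following the call shown under Basic Usage.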
Additional Considerations
- The package is designed to work within a medallion architecture
- Configuration files should follow the expected YAML structure
- Authentication is handled securely through Key Vault integration
- The package supports both batch and real-time processing scenarios
This documentation provides a basic overview of the wheel package and its usage. For specific configuration options and advanced features, please refer to the YAML configuration documentation and example files.