easyfabric.loaders.pull_files
json
logging
os
datetime
notebookutils
ConfigManager
get_current_date
get_current_time
get_random_string
ObjectInfo
TableConfig
download_blob
list_blobs
has_files_changed
pull_files
def pull_files(config_manager: ConfigManager,
table_config: TableConfig) -> list[ObjectInfo]
Pulls files from a remote source to a local destination based on the configuration and connection type provided. It handles various connection types, retrieves the file details, processes them, and returns a list of ObjectInfo instances representing the pulled files.
Arguments:
config_manager - Manages configurations used throughout the application.
table_config - Contains specific information and settings for a particular table, including connection type, source folder, and bronze folder.
Returns:
List[ObjectInfo] - A list containing ObjectInfo instances, each representing a pulled file with relevant details such as its bronze file name, path, and source log.
Raises:
Exception - If a connection type is not set on the object for "customperfile".
Exception - If an unsupported connection type is detected.
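The two error conditions listed under Raises can be sketched as below. This is a minimal illustration under stated assumptions, not the easyfabric implementation: FakeTableConfig, its object_connection_type field, resolve_connection_type and the SUPPORTED_TYPES set are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of supported types; this documentation only names
# "azblob" and "customperfile".
SUPPORTED_TYPES = {"azblob"}


@dataclass
class FakeTableConfig:
    """Stand-in for TableConfig (field names are assumptions)."""
    connection_type: str
    object_connection_type: Optional[str] = None


def resolve_connection_type(table_config: FakeTableConfig) -> str:
    """Return the effective connection type, or raise as the Raises list describes."""
    connection_type = table_config.connection_type
    if connection_type == "customperfile":
        # With "customperfile", the type must be set on the object itself.
        if not table_config.object_connection_type:
            raise Exception('Connection type is not set on the object for "customperfile"')
        connection_type = table_config.object_connection_type
    if connection_type not in SUPPORTED_TYPES:
        raise Exception(f"Unsupported connection type: {connection_type}")
    return connection_type
```

Callers that want to distinguish the two failures would have to inspect the message, since both are raised as plain Exception.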
check_source_unchanged
def check_source_unchanged(config_manager: ConfigManager,
table_config: TableConfig,
previous: list[dict]) -> bool
Pre-checks whether Azure Blob source files are unchanged compared to the previous tracker snapshot, WITHOUT downloading any files.
Only applicable for the azblob connection type. Returns False
(assume changed) for any other connection type or when previous is
empty (first run).
Uses list_blobs to retrieve current blob metadata and compares
it against previous via file_tracker.has_files_changed.
The basename of each blob name is used as the stable key so it aligns
with the partial_filename stored in the tracker.
Arguments:
config_manager - Application-level ConfigManager.
table_config - TableConfig for the table being processed.
previous - Tracker entries from the previous tracker snapshot.
Returns:
bool - True if all blobs are unchanged; False if any blob changed, is new, or the snapshot is empty / unavailable.
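The comparison described above, keyed on the blob basename, might look roughly like this. It is a sketch only: the real logic lives in has_files_changed, and the function name blobs_unchanged and the metadata fields name, size and last_modified are assumptions made for the example. Only partial_filename is taken from the description.

```python
import posixpath


def blobs_unchanged(current_blobs: list[dict], previous: list[dict]) -> bool:
    """Return True only if every current blob matches its tracker entry."""
    if not previous:
        # Empty snapshot (first run): assume the source has changed.
        return False
    # Index the previous snapshot by the stable key stored in the tracker.
    prior = {entry["partial_filename"]: entry for entry in previous}
    for blob in current_blobs:
        # Blob names may include folder prefixes; the basename is the key.
        key = posixpath.basename(blob["name"])
        old = prior.get(key)
        if old is None:
            return False  # new blob
        # Hypothetical metadata fields used to detect a change.
        if (blob["size"], blob["last_modified"]) != (old["size"], old["last_modified"]):
            return False  # changed blob
    return True
```

The sketch only inspects blobs that currently exist; how deleted blobs are treated is not stated in the description and is left out here.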