
easyfabric.loaders.pull_files

json

logging

os

datetime

notebookutils

ConfigManager

get_current_date

get_current_time

get_random_string

ObjectInfo

TableConfig

download_blob

list_blobs

has_files_changed

pull_files

def pull_files(config_manager: ConfigManager,
table_config: TableConfig) -> list[ObjectInfo]

Pulls files from a remote source to a local destination based on the configuration and connection type provided. It handles various connection types, retrieves the file details, processes them, and returns a list of ObjectInfo instances representing the pulled files.

Arguments:

  • config_manager - Manages configurations used throughout the application.
  • table_config - Contains specific information and settings for a particular table including connection type, source folder, and bronze folder.

Returns:

  • list[ObjectInfo] - A list containing ObjectInfo instances, each representing a pulled file with relevant details such as its bronze file name, path, and source log.

Raises:

  • Exception - If a connection type is not set on the object for "customperfile".
  • Exception - If an unsupported connection type is detected.
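The two error cases above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the helper `resolve_connection_type` and the `SUPPORTED_CONNECTION_TYPES` set are hypothetical, and only `azblob` and `customperfile` are connection types actually named in this documentation.

```python
from typing import Optional

# Assumption: only "azblob" is named in this documentation; the real set of
# supported connection types is not shown here.
SUPPORTED_CONNECTION_TYPES = {"azblob"}


def resolve_connection_type(table_type: str, object_type: Optional[str] = None) -> str:
    """Hypothetical helper: pick the connection type to use for one object."""
    if table_type == "customperfile":
        # Assumption: with "customperfile", each object carries its own type.
        if not object_type:
            raise Exception('Connection type not set on object for "customperfile"')
        table_type = object_type
    if table_type not in SUPPORTED_CONNECTION_TYPES:
        raise Exception(f"Unsupported connection type: {table_type}")
    return table_type
```

Because both failures surface as a plain Exception, a caller that wants to tell them apart would have to inspect the message.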

check_source_unchanged

def check_source_unchanged(config_manager: ConfigManager,
table_config: TableConfig,
previous: list[dict]) -> bool

Pre-checks whether Azure Blob source files are unchanged compared to the previous tracker snapshot, WITHOUT downloading any files.

Only applicable for the azblob connection type. Returns False (assume changed) for any other connection type or when previous is empty (first run).

Uses list_blobs to retrieve current blob metadata and compares it against previous via file_tracker.has_files_changed. The basename of each blob name is used as the stable key so that it aligns with the partial_filename stored in the tracker.
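A minimal sketch of the pre-check described above, assuming blob metadata and tracker entries are plain dicts. The function name `source_unchanged_sketch` and the `size` and `last_modified` fields are assumptions made for illustration; the real comparison is delegated to has_files_changed, whose exact rules are not shown here. Treating a blob that has disappeared as a change is also an assumption.

```python
import posixpath


def source_unchanged_sketch(connection_type, current_blobs, previous):
    """Return True only if every current blob matches the previous snapshot."""
    # Any connection type other than azblob, or an empty snapshot (first run),
    # is treated as "changed".
    if connection_type != "azblob" or not previous:
        return False

    # Assumption: each entry carries "size" and "last_modified" fields.
    def fingerprint(entry):
        return (entry["size"], entry["last_modified"])

    # Key both sides by basename so "landing/a.csv" lines up with "a.csv".
    current = {posixpath.basename(b["name"]): fingerprint(b) for b in current_blobs}
    before = {p["partial_filename"]: fingerprint(p) for p in previous}

    # Equal dicts mean the same set of files with the same metadata.
    return current == before
```

No file content is read at any point; only listing metadata is compared, which is what keeps the check cheap.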

Arguments:

  • config_manager - Application-level ConfigManager.
  • table_config - TableConfig for the table being processed.
  • previous - Tracker entries from the previous tracker snapshot.

Returns:

  • bool - True if all blobs are unchanged; False if any blob changed, is new, or the snapshot is empty / unavailable.
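One way the two functions might be combined is to run the pre-check first and download only when it reports a change. The wrapper `pull_if_changed` below is hypothetical, and whether the library wires the functions together this way is an assumption. The two callables are passed in so that the sketch runs without easyfabric installed.

```python
from typing import Callable


def pull_if_changed(check: Callable, pull: Callable,
                    config_manager, table_config, previous: list) -> list:
    """Hypothetical wrapper: skip the download when nothing has changed."""
    # A True result from the pre-check means "unchanged", so nothing is pulled.
    if check(config_manager, table_config, previous):
        return []
    # False covers changed or new blobs, a first run, and non-azblob sources.
    return pull(config_manager, table_config)
```

With the real functions this would be called as `pull_if_changed(check_source_unchanged, pull_files, config_manager, table_config, previous)`.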