DataFrame Display Extension
The display extension provides a convenient way to visualize PySpark DataFrames within Microsoft Fabric notebooks. It detects whether the notebook is running interactively and renders the rich visualization only in that case.
Overview
In a Microsoft Fabric environment, visualizing data typically requires calling specialized display functions. This extension monkey-patches the PySpark DataFrame class, allowing you to call .display() directly on any DataFrame object.
How it Works
The display method checks the current runtime context:
- Interactive Run: If the notebook is being run interactively (by a user), it uses notebookutils.visualization.display to render a rich, interactive table.
- Non-Interactive Run: If the notebook is running as part of a pipeline or job, it skips the rich visualization (to save resources and avoid errors) and prints a message instead.
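The context check and the monkey-patch can be sketched roughly as follows. This is a minimal illustration, not easyfabric's actual source: `make_display`, `install_display`, the `is_interactive` callable, and the printed message are hypothetical names chosen for the example, and the way the renderer accepts `summary` is an assumption.

```python
from typing import Any, Callable


def make_display(
    is_interactive: Callable[[], bool],
    render: Callable[..., Any],
) -> Callable[..., None]:
    """Build a display() method that renders only in interactive runs.

    Both arguments are injected so the sketch runs outside Fabric:
    - is_interactive: hypothetical check of the runtime context
    - render: the rich renderer (in Fabric, presumably
      notebookutils.visualization.display)
    """

    def display(self, summary: bool = False) -> None:
        if is_interactive():
            # Interactive run: hand the DataFrame to the rich renderer.
            # Passing summary as a keyword is an assumption of this sketch.
            render(self, summary=summary)
        else:
            # Pipeline or job run: skip rendering and leave a note in the log.
            print("Non-interactive run: skipping DataFrame display.")

    return display


def install_display(
    df_class: type,
    is_interactive: Callable[[], bool],
    render: Callable[..., Any],
) -> None:
    """Monkey-patch df_class so every instance gains a .display() method."""
    df_class.display = make_display(is_interactive, render)


# In a Fabric notebook the wiring might look like this (assumed, not verified):
#   from pyspark.sql import DataFrame
#   from notebookutils.visualization import display as vis_display
#   install_display(DataFrame, my_interactive_check, vis_display)
```

Injecting the context check and the renderer keeps the patch testable outside a notebook; a library would more likely perform this wiring once at import time.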
Usage
Once easyfabric is imported, the display method becomes available on all DataFrames.
import easyfabric as ef
# Define your DataFrame
df = spark.table("Silver.dbo.MyTable")
# Use the extension to visualize the data
df.display()
# You can also include a summary (statistics)
df.display(summary=True)
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| summary | bool | False | If set to True, the display will include descriptive statistics (count, mean, stddev, min, max) for the columns. |
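To make the five statistics concrete, the sketch below computes them for a single numeric column in plain Python. The `describe_column` helper and the sample values are hypothetical, and the use of the sample standard deviation is an assumption (it matches what PySpark's `describe()` reports, but the source does not say how the summary view computes it).

```python
import statistics
from typing import Dict, Sequence


def describe_column(values: Sequence[float]) -> Dict[str, float]:
    """Return count, mean, stddev, min and max for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        # Sample standard deviation (n - 1 in the denominator); an assumption.
        "stddev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical column values, for illustration only.
stats = describe_column([10.0, 12.0, 14.0])
print(stats)
```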
Benefits
- Syntactic Sugar: Cleaner code compared to calling vis_display(df).
- Environment Aware: Prevents pipeline failures or unnecessary processing when running in non-interactive modes.
- Consistency: Provides a uniform way to look at data throughout the project.