Options to Profile your data

Start by loading your DataFrame as you normally would, e.g. by using a YData Connector. To generate the standard profiling report, run:

profile = ProfileReport(df, title="YData Profiling Report")

Using inside Jupyter Notebooks

There are two interfaces for consuming the report inside a Jupyter notebook: widgets and an embedded HTML report.


The first interface displays the report as a set of widgets. In a Jupyter Notebook, run:

profile.to_widgets()

The HTML report can be directly embedded in a cell in a similar fashion:

profile.to_notebook_iframe()


Exporting the report to a file

To generate an HTML report file, assign the ProfileReport to a variable and call its to_file() method:

profile.to_file("ydata_html_report.html")

Alternatively, the report's data can be obtained as JSON, either as a string or written to a file:

# As a JSON string
json_data = profile.to_json()

# As a file
profile.to_file("ydata_json_report.json")
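
Because to_json() returns a plain string, the result can be parsed with Python's standard json module and inspected like any dictionary. The sketch below is only an illustration of that round trip: it uses a small hand-written stand-in string instead of a real report, so the keys shown (title, n_rows) are hypothetical and do not describe the actual report schema.

```python
import json

# Stand-in for `json_data = profile.to_json()`; the keys are hypothetical
# and only illustrate the round trip, not the real report schema.
json_data = '{"title": "YData Profiling Report", "n_rows": 3}'

# Parse the JSON string into a regular Python dictionary.
report = json.loads(json_data)

# List the top-level keys to see which sections are available.
top_level_keys = sorted(report)
print(top_level_keys)
```

With a real report, replacing the stand-in string with the value returned by profile.to_json() should be the only change needed.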

Command line usage

For CSV files in a standard format (those that pandas can read directly without additional settings), the ydata-sdk executable can be used from the command line. The example below processes the data.csv dataset with a configuration file called default.yaml and writes a report titled YData Profiling Report to the file report.html.

ydata --title "YData Profiling Report" --config_file default.yaml data.csv report.html
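
When the same command has to be issued from a script, for example once per dataset, it can help to build the argument list in Python. The following is a minimal sketch that assumes the executable name and flags exactly as written in the command above; build_profile_command is a hypothetical helper introduced for illustration and is not part of the package.

```python
import shlex
import subprocess


def build_profile_command(csv_path, report_path, title, config_file):
    """Return the argument list for the CLI call shown above.

    Hypothetical helper: it only mirrors the documented command line.
    """
    return [
        "ydata",
        "--title", title,
        "--config_file", config_file,
        csv_path,
        report_path,
    ]


command = build_profile_command(
    "data.csv", "report.html", "YData Profiling Report", "default.yaml"
)

# shlex.join quotes arguments that contain spaces, such as the title,
# so the printed line can be pasted into a POSIX shell.
print(shlex.join(command))

# To actually run it (requires the executable and the input files):
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids manual quoting of the title, which contains spaces.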

Information about all available options and arguments can be viewed through the command below.

ydata -h


Deeper profiling

The contents, behaviour and appearance of the report are easily customizable. The example below uses the explorative mode, which enables a deeper analysis than the default configuration.

profile = ProfileReport(df, title="YData Profiling Report", explorative=True)