Bases: BaseSynthesizer
Source code in /opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/ydata/sdk/synthesizers/timeseries.py
fit(X, sortbykey, privacy_level=PrivacyLevel.HIGH_FIDELITY, entities=None, generate_cols=None, exclude_cols=None, dtypes=None, target=None, name=None, anonymize=None, condition_on=None)
Fit the synthesizer.
The synthesizer accepts either a pandas DataFrame or a YData DataSource as its training dataset.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | Union[DataSource, DataFrame] | Training dataset | required |
sortbykey | Union[str, List[str]] | column(s) to use to sort timeseries datasets | required |
privacy_level | PrivacyLevel | Synthesizer privacy level (defaults to high fidelity) | HIGH_FIDELITY |
entities | Union[str, List[str]] | (optional) columns representing entities ID | None |
generate_cols | List[str] | (optional) columns that should be synthesized | None |
exclude_cols | List[str] | (optional) columns that should not be synthesized | None |
dtypes | Dict[str, Union[str, DataType]] | (optional) datatype mapping that will overwrite the datasource metadata column datatypes | None |
target | Optional[str] | (optional) Metadata associated to the datasource | None |
name | Optional[str] | (optional) Synthesizer instance name | None |
anonymize | Optional[str] | (optional) fields to anonymize and the anonymization strategy | None |
condition_on | Optional[List[str]] | (optional) list of features to condition upon | None |
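The keyword arguments in the table above can be assembled as in the following minimal sketch. `fit_kwargs`, `RecordingSynthesizer`, and the column names (`sensor_id`, `timestamp`, `comment`) are hypothetical and not part of the SDK. The stand-in class only records the call so that the snippet runs without the SDK or credentials. With the real synthesizer, `X` would be a pandas DataFrame or a YData DataSource rather than a list of dicts.

```python
from typing import Any, Dict, List, Optional, Union


def fit_kwargs(
    sortbykey: Union[str, List[str]],
    entities: Optional[Union[str, List[str]]] = None,
    exclude_cols: Optional[List[str]] = None,
    condition_on: Optional[List[str]] = None,
    name: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect fit() keyword arguments, dropping those left at None."""
    candidates = {
        "sortbykey": sortbykey,
        "entities": entities,
        "exclude_cols": exclude_cols,
        "condition_on": condition_on,
        "name": name,
    }
    return {key: value for key, value in candidates.items() if value is not None}


class RecordingSynthesizer:
    """Hypothetical stand-in: records the fit() call instead of training."""

    def fit(self, X, sortbykey, **options):
        self.call = {"X": X, "sortbykey": sortbykey, **options}


# Hypothetical sensor readings; the real fit() expects a DataFrame or DataSource.
readings = [
    {"sensor_id": "a", "timestamp": 1, "value": 0.5, "comment": "ok"},
    {"sensor_id": "a", "timestamp": 2, "value": 0.7, "comment": "ok"},
]

synth = RecordingSynthesizer()
synth.fit(
    readings,
    **fit_kwargs(sortbykey="timestamp", entities="sensor_id", exclude_cols=["comment"]),
)
print(sorted(synth.call))
```

Dropping the `None` entries leaves every omitted option at the default listed in the table, so only the required `sortbykey` and the options actually set are forwarded.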
sample(n_entities, condition_on=None)
Sample from a TimeSeriesSynthesizer instance.
If the training dataset did not use any entity column, the synthesizer assumes a single entity.
A TimeSeriesSynthesizer always samples the full trajectory of its entities.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
n_entities | int | number of entities to sample | required |
condition_on | Optional[dict] | (optional) conditional sampling parameters | None |
Returns:
Type | Description |
---|---|
DataFrame | synthetic data |
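Two points in the description of `sample` can be illustrated with a toy, pure-Python stand-in: a single implicit entity when no entity column was used, and whole trajectories rather than individual rows. This sketch does not reproduce the SDK's model. `group_trajectories`, `toy_sample`, and the column names are hypothetical, rows are plain dicts instead of a DataFrame, and "sampling" here is plain resampling of stored trajectories, used only to show the shape of the result.

```python
import random
from collections import defaultdict
from typing import Dict, List, Optional

Row = Dict[str, object]


def group_trajectories(
    rows: List[Row], sortbykey: str, entities: Optional[str] = None
) -> List[List[Row]]:
    """Split rows into one time-ordered trajectory per entity.

    With no entity column, every row belongs to one implicit entity.
    """
    groups = defaultdict(list)
    for row in rows:
        key = row[entities] if entities is not None else "single-entity"
        groups[key].append(row)
    return [sorted(group, key=lambda r: r[sortbykey]) for group in groups.values()]


def toy_sample(
    trajectories: List[List[Row]], n_entities: int, seed: int = 0
) -> List[Row]:
    """Return n_entities complete trajectories, tagged with a new entity id.

    Toy resampling only: a real synthesizer generates new values.
    """
    rng = random.Random(seed)
    sampled: List[Row] = []
    for new_id in range(n_entities):
        source = rng.choice(trajectories)
        # Copy every row of the chosen trajectory: never a partial slice.
        sampled.extend({**row, "synthetic_entity": new_id} for row in source)
    return sampled


rows = [
    {"sensor_id": "a", "timestamp": 2, "value": 0.7},
    {"sensor_id": "a", "timestamp": 1, "value": 0.5},
    {"sensor_id": "b", "timestamp": 1, "value": 1.2},
    {"sensor_id": "b", "timestamp": 2, "value": 1.1},
    {"sensor_id": "b", "timestamp": 3, "value": 1.4},
]

single = group_trajectories(rows, sortbykey="timestamp")
per_sensor = group_trajectories(rows, sortbykey="timestamp", entities="sensor_id")

print(len(single), len(per_sensor))
print(len(toy_sample(single, n_entities=3)))
```

In this toy, the number of returned rows is determined by `n_entities` and the trajectory lengths, not by a row count, which mirrors why `sample` takes a number of entities rather than a number of rows.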
PrivacyLevel
Bases: StringEnum
Privacy level exposed to the end-user.
BALANCED_PRIVACY_FIDELITY = 'BALANCED_PRIVACY_FIDELITY' (class-attribute, instance-attribute)
Balanced privacy/fidelity
HIGH_FIDELITY = 'HIGH_FIDELITY' (class-attribute, instance-attribute)
High fidelity
HIGH_PRIVACY = 'HIGH_PRIVACY' (class-attribute, instance-attribute)
High privacy
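Because each member's value equals its name, a privacy level stored as text (for example in a configuration file) can be mapped back to a member. The sketch below is an approximation under stated assumptions. `PrivacyLevelSketch` and `parse_privacy_level` are hypothetical names. The SDK's `StringEnum` base is replaced by the standard library's `str` plus `Enum`, which may differ in detail from the real class.

```python
from enum import Enum


class PrivacyLevelSketch(str, Enum):
    """Stand-in mirroring the three documented members (not the SDK class)."""

    HIGH_FIDELITY = "HIGH_FIDELITY"
    HIGH_PRIVACY = "HIGH_PRIVACY"
    BALANCED_PRIVACY_FIDELITY = "BALANCED_PRIVACY_FIDELITY"


def parse_privacy_level(value: str) -> PrivacyLevelSketch:
    """Map a configuration string to a member, with a readable error."""
    try:
        # Normalise case and whitespace before the lookup by value.
        return PrivacyLevelSketch(value.strip().upper())
    except ValueError:
        allowed = ", ".join(member.value for member in PrivacyLevelSketch)
        raise ValueError(
            f"unknown privacy level {value!r}; expected one of: {allowed}"
        ) from None


level = parse_privacy_level("high_privacy")
print(level.name)
```

With the real enum, the resulting member would be passed as `privacy_level` to `fit`, whose documented default is `PrivacyLevel.HIGH_FIDELITY`.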