Synthesizer

Synthesizer(
    method,
    **kwargs
)

Synthetic data generator supporting multiple synthesis methods.

Parameters

  • method (str): Synthesis method
    • ‘default’: Use SDV-GaussianCopula
    • ‘custom_data’: Load custom data from file
    • ‘sdv-single_table-{method}’: Use SDV provided methods
      • copulagan: CopulaGAN generative model
      • ctgan: CTGAN generative model
      • gaussiancopula: Gaussian Copula model
      • tvae: TVAE generative model
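The method strings above follow a library, table type, model pattern separated by hyphens. As a minimal sketch of how such a string could be split into its parts, the snippet below uses a hypothetical helper, `parse_method`, whose name and returned keys are assumptions made for illustration and are not part of PETsARD. Only the method names and the meaning of 'default' are taken from the list above.

```python
# Model names listed under 'sdv-single_table-{method}' above.
SDV_SINGLE_TABLE_MODELS = {"copulagan", "ctgan", "gaussiancopula", "tvae"}


def parse_method(method: str) -> dict:
    """Hypothetical helper (not a PETsARD function): split a method string."""
    name = method.lower()
    if name == "default":
        # Per the list above, 'default' means SDV's GaussianCopula.
        name = "sdv-single_table-gaussiancopula"
    if name == "custom_data":
        # No library or model is involved; data is loaded from a file.
        return {"library": None, "table_type": None, "model": "custom_data"}

    parts = name.split("-")
    if len(parts) != 3 or parts[0] != "sdv" or parts[1] != "single_table":
        raise ValueError(f"Unsupported method: {method!r}")
    if parts[2] not in SDV_SINGLE_TABLE_MODELS:
        raise ValueError(f"Unknown SDV single-table model: {parts[2]!r}")
    return {"library": parts[0], "table_type": parts[1], "model": parts[2]}


parsed_default = parse_method("default")
parsed_ctgan = parse_method("sdv-single_table-ctgan")
```

Here `parsed_default` and `parsed_ctgan` both describe SDV single-table models, while an unlisted name such as 'sdv-single_table-foo' raises `ValueError`.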

Examples

from petsard import Synthesizer


# Using SDV's GaussianCopula
syn = Synthesizer(method='sdv-single_table-gaussiancopula')

# Using default method
syn = Synthesizer(method='default')

# Synthesizing (df is the training pd.DataFrame; metadata is its Metadata object)
syn.create(data=df, metadata=metadata)
syn.fit_sample()
synthetic_data = syn.data_syn

Methods

create()

syn.create(data, metadata=None)

Initialize synthesizer.

Parameters

  • data (pd.DataFrame): Training dataset
  • metadata (Metadata, optional): Dataset’s Metadata object
    • Default: None

Returns

None. Initializes the synthesizer object

fit()

syn.fit()

Train the synthesis model.

Parameters

None

Returns

None. Updates synthesizer’s internal state

sample()

syn.sample(
    sample_num_rows=None,
    reset_sampling=False,
    output_file_path=None
)

Generate synthetic data.

Parameters

  • sample_num_rows (int, optional): Number of rows to generate
    • Default: None (use original data row count)
  • reset_sampling (bool, optional): Whether to reset sampling state
    • Default: False
  • output_file_path (str, optional): Output file path
    • Default: None

Returns

None. Generated data is stored in the data_syn attribute

fit_sample()

syn.fit_sample(**kwargs)

Run training and generation in sequence. Combines the functionality of fit() and sample().

Parameters

Same as sample()

Returns

None. Generated data is stored in the data_syn attribute
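
To illustrate the call order described above (create(), then fit(), then sample(), with fit_sample() chaining the last two), here is a minimal sketch built around a stand-in class. `ToySynthesizer` is hypothetical and is not PETsARD's implementation: it only resamples training rows with the standard library, and it models only the `sample_num_rows` parameter, whose default of None falls back to the training row count.

```python
import random


class ToySynthesizer:
    """Stand-in for illustration only; NOT PETsARD's Synthesizer."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._rows = None     # training rows, set by create()
        self._fitted = False
        self.data_syn = None  # filled by sample(), like the attribute above

    def create(self, data):
        self._rows = list(data)

    def fit(self):
        if self._rows is None:
            raise RuntimeError("call create() before fit()")
        self._fitted = True   # a real model would be trained here

    def sample(self, sample_num_rows=None):
        if not self._fitted:
            raise RuntimeError("call fit() before sample()")
        # None means: generate as many rows as the training data has.
        n = len(self._rows) if sample_num_rows is None else sample_num_rows
        self.data_syn = [self._rng.choice(self._rows) for _ in range(n)]

    def fit_sample(self, **kwargs):
        # Train, then forward every keyword argument to sample().
        self.fit()
        self.sample(**kwargs)


training_rows = [{"age": 31}, {"age": 45}, {"age": 27}]
toy = ToySynthesizer()
toy.create(training_rows)

toy.fit_sample()                   # default row count
default_rows = len(toy.data_syn)

toy.fit_sample(sample_num_rows=5)  # explicit row count
custom_rows = len(toy.data_syn)
```

Both calls return None and leave their result in `data_syn`, which mirrors the Returns description above. Calling sample() before fit(), or fit() before create(), raises an error in this sketch; whether PETsARD enforces the same order is not stated here.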

Attributes

  • data_syn: Generated synthetic data (pd.DataFrame)
  • config: Configuration dictionary containing:
    • method (str): Synthesis method name
    • method_code (int): Method type code
    • Additional parameters specific to each method
  • synthesizer: Instantiated synthesizer object (for SDV methods)
  • loader: Loader object (for ‘custom_data’ method only)