Adapter

petsard.adapter

The Adapter module provides wrapper classes that standardize the execution interface for all PETsARD pipeline components. Each adapter encapsulates a specific module (Loader, Synthesizer, etc.) and provides consistent methods for configuration, execution, and result retrieval.

Design Overview

The Adapter system follows the adapter (wrapper) pattern: each core module is wrapped in a class that exposes a standardized interface for pipeline execution. This keeps behavior consistent across all pipeline components while leaving room for module-specific functionality.

Key Principles

Standardization: All adapters implement the same base interface for consistent pipeline execution
Encapsulation: Each adapter wraps its corresponding module, handling configuration and execution details
Error Handling: Comprehensive error logging and exception handling across all adapters
Metadata Management: Consistent metadata handling using the Metadater system
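
The encapsulation and error-handling principles can be illustrated with a minimal sketch. `ToyModule` and `ToyAdapter` are hypothetical names and not part of PETsARD. The sketch only shows one plausible shape: a wrapper that owns its module, logs any failure, and re-raises it.

```python
import logging

logger = logging.getLogger("adapter_sketch")


class ToyModule:
    """Hypothetical core module; stands in for a Loader, Synthesizer, etc."""

    def process(self, values):
        if not values:
            raise ValueError("no input values")
        return [v * 2 for v in values]


class ToyAdapter:
    """Hypothetical wrapper: owns the module and logs any failure."""

    def __init__(self, config):
        self.config = config
        self.module = ToyModule()
        self.result = None

    def run(self, input):
        try:
            self.result = self.module.process(input["values"])
        except Exception:
            # Log with traceback, then re-raise so the caller still sees the error
            logger.exception("ToyAdapter failed")
            raise

    def get_result(self):
        return self.result


adapter = ToyAdapter(config={})
adapter.run({"values": [1, 2, 3]})
doubled = adapter.get_result()
```

Re-raising after logging keeps the failure visible to whatever drives the pipeline while still leaving a record in the log.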

Base Classes

`BaseAdapter`

BaseAdapter(config)

Abstract base class defining the standard interface for all adapters.

Parameters

config (dict): Configuration parameters for the adapter

Methods

run(input): Execute the adapter’s functionality
set_input(status): Configure input data from pipeline status
get_result(): Retrieve the adapter’s output data
get_metadata(): Retrieve metadata associated with the output
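
As a rough illustration of this interface, the sketch below defines an abstract base class with the four methods listed above and one toy subclass. It is not the PETsARD source: `SketchBaseAdapter` and `UppercaseAdapter` are hypothetical names, the status is assumed to be a plain dict, and returning `None` from `get_metadata()` by default is an assumption.

```python
from abc import ABC, abstractmethod


class SketchBaseAdapter(ABC):
    """Hypothetical stand-in for BaseAdapter; not the PETsARD source."""

    def __init__(self, config: dict):
        self.config = config
        self._result = None

    @abstractmethod
    def set_input(self, status: dict) -> dict:
        """Build the input for run() from the pipeline status."""

    @abstractmethod
    def run(self, input: dict) -> None:
        """Execute the wrapped functionality and store the result."""

    def get_result(self):
        return self._result

    def get_metadata(self):
        # Assumption: adapters without metadata simply return None
        return None


class UppercaseAdapter(SketchBaseAdapter):
    """Toy adapter: upper-cases the text found in the status."""

    def set_input(self, status: dict) -> dict:
        return {"text": status[self.config["source"]]}

    def run(self, input: dict) -> None:
        self._result = input["text"].upper()


adapter = UppercaseAdapter({"source": "raw"})
adapter.run(adapter.set_input({"raw": "hello"}))
result = adapter.get_result()
```

Because `set_input` and `run` are abstract, the base class itself cannot be instantiated; only subclasses that implement both can.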

Adapter Classes

`LoaderAdapter`

LoaderAdapter(config)

Wraps the Loader module for data loading operations.

Configuration Parameters

filepath (str): Path to the data file
method (str, optional): Loading method (‘default’ for benchmark data)
column_types (dict, optional): Column type specifications
header_names (list, optional): Custom header names
na_values (str/list/dict, optional): Custom NA value definitions

Key Methods

get_result(): Returns loaded DataFrame
get_metadata(): Returns SchemaMetadata for the loaded data
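
A configuration using some of the optional parameters might look like the following. The column names and NA markers are made-up values for illustration, and `column_types` is left out because its expected dict layout is not described here.

```python
# Hypothetical LoaderAdapter configuration; values are illustrative only
loader_config = {
    "filepath": "data.csv",
    "header_names": ["age", "income", "city"],  # made-up column names
    "na_values": ["NA", "?"],                   # list form of the str/list/dict option
}
```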

`SplitterAdapter`

SplitterAdapter(config)

Wraps the Splitter module for data splitting operations.

Configuration Parameters

train_split_ratio (float): Ratio for training data (default: 0.8)
num_samples (int): Number of split samples (default: 1)
random_state (int/float/str, optional): Random seed
method (str, optional): ‘custom_data’ for loading pre-split data

Key Methods

get_result(): Returns dict with ‘train’ and ‘validation’ DataFrames
get_metadata(): Returns updated SchemaMetadata with split information
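
To make the result shape concrete, here is a small stand-alone sketch of a seeded train/validation split over a list of rows. It is not the Splitter's algorithm, and `split_rows` is a hypothetical helper. It only mirrors the documented `train_split_ratio` and `random_state` parameters and the ‘train’/‘validation’ keys, using plain lists instead of DataFrames.

```python
import random


def split_rows(rows, train_split_ratio=0.8, random_state=None):
    """Shuffle row indices with a seed, then cut them into two parts."""
    rng = random.Random(random_state)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    cut = round(len(rows) * train_split_ratio)
    # Sort each part so rows keep their original relative order
    train_idx = sorted(indices[:cut])
    validation_idx = sorted(indices[cut:])
    return {
        "train": [rows[i] for i in train_idx],
        "validation": [rows[i] for i in validation_idx],
    }


rows = list(range(10))
split = split_rows(rows, train_split_ratio=0.8, random_state=42)
```

With ten rows and a ratio of 0.8, eight rows land in ‘train’ and two in ‘validation’; reusing the same `random_state` reproduces the same split.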

`PreprocessorAdapter`

PreprocessorAdapter(config)

Wraps the Processor module for data preprocessing operations.

Configuration Parameters

method (str): Processing method (‘default’ or ‘custom’)
sequence (list, optional): Custom processing sequence
config (dict, optional): Processor-specific configuration

Key Methods

get_result(): Returns preprocessed DataFrame
get_metadata(): Returns updated SchemaMetadata

`SynthesizerAdapter`

SynthesizerAdapter(config)

Wraps the Synthesizer module for synthetic data generation.

Configuration Parameters

method (str): Synthesis method (e.g., ‘sdv’)
model (str): Model type (e.g., ‘GaussianCopula’)
Additional parameters specific to the chosen method

Key Methods

get_result(): Returns synthetic DataFrame

`PostprocessorAdapter`

PostprocessorAdapter(config)

Wraps the Processor module for data postprocessing operations.

Configuration Parameters

method (str): Processing method (‘default’ or ‘custom’)

Key Methods

get_result(): Returns postprocessed DataFrame

`ConstrainerAdapter`

ConstrainerAdapter(config)

Wraps the Constrainer module for applying data constraints.

Configuration Parameters

field_combinations (list): Field combination constraints
target_rows (int, optional): Target number of rows
sampling_ratio (float, optional): Sampling ratio for resampling
max_trials (int, optional): Maximum resampling attempts

Key Methods

get_result(): Returns constrained DataFrame
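
One way to read `target_rows`, `sampling_ratio`, and `max_trials` together is as a resampling loop: generate a batch, keep the rows that pass the constraints, and repeat until enough rows are collected or the trial limit is hit. The sketch below assumes that interpretation. `resample_until_satisfied`, `generate`, and `is_valid` are hypothetical names, and the Constrainer's actual behavior may differ.

```python
import random


def resample_until_satisfied(generate, is_valid, target_rows, sampling_ratio, max_trials):
    """Collect valid rows batch by batch until target_rows is reached."""
    kept = []
    # Assumption: each trial generates target_rows * sampling_ratio candidates
    batch_size = max(1, int(target_rows * sampling_ratio))
    for trial in range(1, max_trials + 1):
        kept.extend(row for row in generate(batch_size) if is_valid(row))
        if len(kept) >= target_rows:
            return kept[:target_rows], trial
    # Trial limit reached: return whatever was collected
    return kept, max_trials


rng = random.Random(0)
rows, trials = resample_until_satisfied(
    generate=lambda n: [rng.randint(0, 99) for _ in range(n)],
    is_valid=lambda value: value % 2 == 0,  # toy constraint: even numbers only
    target_rows=5,
    sampling_ratio=2.0,
    max_trials=50,
)
```

A larger sampling ratio trades more generated rows per trial for fewer trials overall, which matters when the constraints reject most candidates.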

`EvaluatorAdapter`

EvaluatorAdapter(config)

Wraps the Evaluator module for data quality assessment.

Configuration Parameters

method (str): Evaluation method (e.g., ‘sdmetrics’)
Additional parameters specific to the chosen method

Key Methods

get_result(): Returns dict of evaluation results by metric type

`DescriberAdapter`

DescriberAdapter(config)

Wraps the Describer module for descriptive data analysis.

Configuration Parameters

method (str): Description method
Additional parameters specific to the chosen method

Key Methods

get_result(): Returns dict of descriptive analysis results

`ReporterAdapter`

ReporterAdapter(config)

Wraps the Reporter module for result export and reporting.

Configuration Parameters

method (str): Report method (‘save_data’ or ‘save_report’)
source (str/list): Source modules for data export
granularity (str): Report granularity (‘global’, ‘columnwise’, ‘pairwise’)
output (str, optional): Output filename prefix

Key Methods

get_result(): Returns generated report data
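
The two documented methods could be configured along these lines. The values are illustrative, and each method may require further keys that are not listed in this section.

```python
# Hypothetical ReporterAdapter configurations; values are illustrative only
save_data_config = {
    "method": "save_data",
    "source": "Synthesizer",   # str form; a list of module names is also documented
    "output": "my_run",        # made-up filename prefix
}

save_report_config = {
    "method": "save_report",
    "granularity": "global",   # or "columnwise" / "pairwise"
}
```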

Usage Examples

Basic Adapter Usage

from petsard.adapter import LoaderAdapter

# Create and configure adapter
config = {"filepath": "data.csv"}
loader_adapter = LoaderAdapter(config)

# Set input (typically done by Executor).
# `status` is the pipeline status object, which the Executor normally supplies.
input_data = loader_adapter.set_input(status)

# Execute operation
loader_adapter.run(input_data)

# Retrieve results
data = loader_adapter.get_result()
metadata = loader_adapter.get_metadata()

Pipeline Integration

from petsard.config import Config
from petsard.executor import Executor

# Adapters are typically used through Config and Executor
config_dict = {
    "Loader": {"load_data": {"filepath": "data.csv"}},
    "Synthesizer": {"synth": {"method": "sdv", "model": "GaussianCopula"}},
    "Evaluator": {"eval": {"method": "sdmetrics"}}
}

config = Config(config_dict)
executor = Executor(config)
executor.run()

Architecture Benefits

1. Consistent Interface

Standardized methods: All adapters implement the same base interface
Predictable behavior: Consistent execution patterns across all modules

2. Error Handling

Comprehensive logging: Detailed logging for debugging and monitoring
Exception management: Consistent error handling and reporting

3. Pipeline Integration

Status management: Seamless integration with the Status system
Data flow: Standardized data passing between pipeline stages
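
The data flow between stages can be pictured with a toy loop in which each stage reads its input from a shared status object and writes its result back. `SketchStatus`, `LoadStage`, and `SortStage` are hypothetical names; the real Status system and Executor are more involved.

```python
class SketchStatus:
    """Hypothetical status object: stores each stage's result by name."""

    def __init__(self):
        self.results = {}


class LoadStage:
    name = "Loader"

    def set_input(self, status):
        return {}  # the first stage needs nothing from earlier stages

    def run(self, input):
        self.result = [3, 1, 2]

    def get_result(self):
        return self.result


class SortStage:
    name = "Sorter"

    def set_input(self, status):
        # Pull the previous stage's output from the shared status
        return {"data": status.results["Loader"]}

    def run(self, input):
        self.result = sorted(input["data"])

    def get_result(self):
        return self.result


status = SketchStatus()
for stage in (LoadStage(), SortStage()):
    stage.run(stage.set_input(status))
    status.results[stage.name] = stage.get_result()
```

Because every stage exposes the same three calls, the driving loop does not need to know what any individual stage does.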

4. Modularity

Separation of concerns: Each adapter handles one specific functionality
Extensibility: Easy to add new adapters for new modules

The Adapter system provides the foundation for PETsARD’s modular pipeline architecture, ensuring consistent and reliable execution across all data processing stages.
