Custom Evaluation
Besides the built-in evaluation methods, you can create your own. This is particularly useful when you have evaluation needs the built-in methods do not cover.
---
Loader:
  data:
    filepath: 'benchmark/adult-income.csv'
Splitter:
  demo:
    num_samples: 1
    train_split_ratio: 0.8
Preprocessor:
  demo:
    method: 'default'
Synthesizer:
  demo:
    method: 'default'
Postprocessor:
  demo:
    method: 'default'
Evaluator:
  custom:
    method: 'custom_method'
    module_path: 'custom-evaluation.py'  # Path to your custom evaluator
    class_name: 'MyEvaluator_Pushover'   # Evaluator class name
Reporter:
  save_report_global:
    method: 'save_report'
    granularity: 'global'
  save_report_columnwise:
    method: 'save_report'
    granularity: 'columnwise'
  save_report_pairwise:
    method: 'save_report'
    granularity: 'pairwise'
...
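To run this pipeline, save the YAML above next to your custom-evaluation.py and pass it to PETsARD's Executor. A minimal sketch, assuming the Executor API from PETsARD's quick-start; the file name config.yaml is a placeholder:

from petsard import Executor

# 'config.yaml' is assumed to contain the YAML configuration above,
# stored in the same directory as 'custom-evaluation.py'
executor = Executor(config='config.yaml')
executor.run()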
Creating a Custom Evaluator
When implementing a custom evaluation, you can freely choose whether or not to inherit from BaseEvaluator; the integration of your evaluation program with PETsARD relies primarily on two essential class constants:

self.REQUIRED_INPUT_KEYS (list[str]): Defines the dictionary keys required in the input data. Standard keys are ori (original data), syn (synthetic data), and control (control data). Whether you require control determines if your custom evaluation can run without a data-splitting step.

self.AVAILABLE_SCORES_GRANULARITY (list[str]): Defines the granularity options for evaluation results. Available options are global (overall evaluation), columnwise (column-by-column evaluation), pairwise (column-pair evaluation), and details (custom detailed evaluation).
import pandas as pd


class MyEvaluator_Pushover:
    REQUIRED_INPUT_KEYS: list[str] = ["ori", "syn", "control"]
    AVAILABLE_SCORES_GRANULARITY: list[str] = [
        "global",
        "columnwise",
        "pairwise",
        "details",
    ]

    def __init__(self, config: dict):
        """
        Args:
            config (dict): The configuration assigned by the Evaluator
        """
        self.config: dict = config

    def eval(self, data: dict[str, pd.DataFrame]) -> dict[str, pd.DataFrame]:
        # Implement your evaluation logic here; this pushover gives everything 100
        eval_result: dict[str, int] = {"score": 100}

        colnames: list[str] = data["ori"].columns.tolist()
        # Lower-triangle column pairs, including each column paired with itself
        pairs: list[tuple[str, str]] = [
            (col1, col2)
            for i, col1 in enumerate(colnames)
            for j, col2 in enumerate(colnames)
            if j <= i
        ]
        lorem_text: str = (
            "Lorem ipsum dolor sit amet, consectetur adipiscing elit, "
            "sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. "
            "Ut enim ad minim veniam, "
            "quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. "
            "Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. "
            "Excepteur sint occaecat cupidatat non proident, "
            "sunt in culpa qui officia deserunt mollit anim id est laborum."
        )

        return {
            # Overall evaluation results: a single-row DataFrame
            "global": pd.DataFrame(eval_result, index=["result"]),
            # Per-column evaluation results: must contain every column name
            "columnwise": pd.DataFrame(eval_result, index=colnames),
            # Column-relationship evaluation results: must contain every column pair
            "pairwise": pd.DataFrame(
                eval_result, index=pd.MultiIndex.from_tuples(pairs)
            ),
            # Detailed evaluation results: the format is unrestricted
            "details": pd.DataFrame({"lorem_text": lorem_text.split(". ")}),
        }
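You can exercise the evaluator directly before wiring it into a YAML pipeline. A minimal sketch continuing from the class definition above; the toy frames are placeholders, not actual PETsARD output:

# Toy stand-ins for the original, synthetic, and control data
toy = pd.DataFrame({"age": [25, 32, 47], "income": [30000, 45000, 60000]})
data = {"ori": toy, "syn": toy.copy(), "control": toy.copy()}

evaluator = MyEvaluator_Pushover(config={})
results = evaluator.eval(data)

print(results["global"])      # a single row indexed by "result"
print(results["columnwise"])  # one row per column: age, income
print(results["pairwise"])    # rows (age, age), (income, age), (income, income)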
Required Methods
For your own evaluator, you only need to implement a single eval() method that returns a dictionary of evaluation results at different granularity levels. The keys of this dictionary must match those defined in AVAILABLE_SCORES_GRANULARITY, and each value must be a pd.DataFrame that conforms to the corresponding format described below.
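Since AVAILABLE_SCORES_GRANULARITY is declared per evaluator, you only need to return the granularities you actually support. A minimal sketch of the smallest contract-conforming evaluator; the class name and score are hypothetical, not part of PETsARD:

import pandas as pd


class MinimalEvaluator:
    # No 'control' key, so this evaluator works without a data-splitting step
    REQUIRED_INPUT_KEYS: list[str] = ["ori", "syn"]
    AVAILABLE_SCORES_GRANULARITY: list[str] = ["global"]

    def __init__(self, config: dict):
        self.config: dict = config

    def eval(self, data: dict[str, pd.DataFrame]) -> dict[str, pd.DataFrame]:
        # Report the synthetic-to-original row-count ratio as the only score
        ratio: float = len(data["syn"]) / len(data["ori"])
        return {"global": pd.DataFrame({"row_ratio": ratio}, index=["result"])}

With such an evaluator, any Reporter granularity setting should be one the evaluator declares (here, only 'global').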
Format Requirements for Dictionary Key-Value Pairs
global (Global Evaluation Results): a single-row DataFrame showing overall scores or an evaluation summary
columnwise (Column-level Results): a DataFrame with one row per original data column, using the column names as its index
pairwise (Column-pair Results): a DataFrame with one row per column pair, using a MultiIndex to represent the pairs
details (Custom Details): a DataFrame in a custom format that can contain any kind of detailed evaluation information
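These format rules can be checked mechanically. A short sketch of a hypothetical test helper (not part of PETsARD) that validates an eval() result against them:

import pandas as pd


def check_granularity_formats(
    results: dict[str, pd.DataFrame], ori: pd.DataFrame
) -> None:
    """Sanity-check eval() output against the format rules above."""
    if "global" in results:
        # global: exactly one summary row
        assert len(results["global"]) == 1
    if "columnwise" in results:
        # columnwise: one row per original column, indexed by column name
        assert list(results["columnwise"].index) == list(ori.columns)
    if "pairwise" in results:
        # pairwise: one row per column pair, indexed by a MultiIndex
        assert isinstance(results["pairwise"].index, pd.MultiIndex)
    if "details" in results:
        # details: any DataFrame is acceptable
        assert isinstance(results["details"], pd.DataFrame)

For example, check_granularity_formats(evaluator.eval(data), data["ori"]) raises an AssertionError if any returned granularity is malformed.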