datetime

`DatetimeTransformer`

Bases: TransformerWrapper

A transformer that converts datetime features to numeric features before applying an underlying (wrapped) transformer. The datetime features are converted to nanoseconds since the epoch, and missing values are assigned 0.0 under the `AugmentMissingnessStrategy`.

Parameters:

Name	Type	Description	Default
`transformer`	`ColumnTransformer`	The `ColumnTransformer` to wrap.	required

After applying the transformer, the following attributes will be populated:

Attributes:

Name	Type	Description
`original_column_name`		The name of the original column.

Source code in src/nhssynth/modules/dataloader/transformers/datetime.py

class DatetimeTransformer(TransformerWrapper):
    """
    A transformer that converts datetime features to numeric features before applying an underlying (wrapped) transformer.
    The datetime features are converted to nanoseconds since the epoch, and missing values are assigned 0.0 under the `AugmentMissingnessStrategy`.

    Args:
        transformer: The [`ColumnTransformer`][nhssynth.modules.dataloader.transformers.base.ColumnTransformer] to wrap.

    After applying the transformer, the following attributes will be populated:

    Attributes:
        original_column_name: The name of the original column.
    """

    def __init__(self, transformer: ColumnTransformer) -> None:
        super().__init__(transformer)

    def apply(self, data: pd.Series, missingness_column: Optional[pd.Series] = None, **kwargs) -> pd.DataFrame:
        """
        First, the datetime data is floored to the nanosecond level. Next, the floored data is converted to float nanoseconds since the epoch.
        The float value that `pd.NaT` takes under this conversion is then replaced with `np.nan` to ensure missing values are represented correctly.
        Finally, the wrapped transformer is applied to the data.

        Args:
            data: The column of data to transform.
            missingness_column: The column of missingness indicators to augment the data with.

        Returns:
            The transformed data.
        """
        self.original_column_name = data.name
        floored_data = pd.Series(data.dt.floor("ns").to_numpy().astype(float), name=data.name)
        nan_corrected_data = floored_data.replace(pd.to_datetime(pd.NaT).to_numpy().astype(float), np.nan)
        return super().apply(nan_corrected_data, missingness_column, **kwargs)

    def revert(self, data: pd.DataFrame, **kwargs) -> pd.DataFrame:
        """
        The wrapped transformer's `revert` method is applied to the data. The data is then converted back to datetime format.

        Args:
            data: The full dataset including the column(s) to be reverted to their pre-transformer state.

        Returns:
            The reverted data.
        """
        reverted_data = super().revert(data, **kwargs)
        data[self.original_column_name] = pd.to_datetime(
            reverted_data[self.original_column_name].astype("Int64"), unit="ns"
        )
        return data

`apply(data, missingness_column=None, **kwargs)`

First, the datetime data is floored to the nanosecond level. Next, the floored data is converted to float nanoseconds since the epoch. The float value that `pd.NaT` takes under this conversion is then replaced with `np.nan` to ensure missing values are represented correctly. Finally, the wrapped transformer is applied to the data.

Parameters:

Name	Type	Description	Default
`data`	`Series`	The column of data to transform.	required
`missingness_column`	`Optional[Series]`	The column of missingness indicators to augment the data with.	`None`

Returns:

Type	Description
`DataFrame`	The transformed data.

Source code in src/nhssynth/modules/dataloader/transformers/datetime.py

def apply(self, data: pd.Series, missingness_column: Optional[pd.Series] = None, **kwargs) -> pd.DataFrame:
    """
    First, the datetime data is floored to the nanosecond level. Next, the floored data is converted to float nanoseconds since the epoch.
    The float value that `pd.NaT` takes under this conversion is then replaced with `np.nan` to ensure missing values are represented correctly.
    Finally, the wrapped transformer is applied to the data.

    Args:
        data: The column of data to transform.
        missingness_column: The column of missingness indicators to augment the data with.

    Returns:
        The transformed data.
    """
    self.original_column_name = data.name
    floored_data = pd.Series(data.dt.floor("ns").to_numpy().astype(float), name=data.name)
    nan_corrected_data = floored_data.replace(pd.to_datetime(pd.NaT).to_numpy().astype(float), np.nan)
    return super().apply(nan_corrected_data, missingness_column, **kwargs)

`revert(data, **kwargs)`

The wrapped transformer's `revert` method is applied to the data. The data is then converted back to datetime format.

Parameters:

Name	Type	Description	Default
`data`	`DataFrame`	The full dataset including the column(s) to be reverted to their pre-transformer state.	required

Returns:

Type	Description
`DataFrame`	The reverted data.
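The final conversion step in `revert` can be sketched on its own as follows. This assumes the wrapped transformer has already returned float nanoseconds since the epoch; the column name and values are hypothetical, and the floats are whole numbers so that the nullable integer cast is lossless.

```python
import numpy as np
import pandas as pd

# Hypothetical output of the wrapped transformer's revert step:
# float nanoseconds since the epoch, with np.nan marking a missing value.
reverted = pd.Series([1.0e9, np.nan, 0.0], name="admission_date")

# Cast to the nullable Int64 dtype so np.nan becomes <NA> rather than
# forcing a float dtype, then interpret the integers as nanoseconds.
as_datetime = pd.to_datetime(reverted.astype("Int64"), unit="ns")

print(as_datetime)
```

The missing entry comes back as `pd.NaT`, and the other two entries become timestamps one second after the epoch and at the epoch itself.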

Source code in src/nhssynth/modules/dataloader/transformers/datetime.py

def revert(self, data: pd.DataFrame, **kwargs) -> pd.DataFrame:
    """
    The wrapped transformer's `revert` method is applied to the data. The data is then converted back to datetime format.

    Args:
        data: The full dataset including the column(s) to be reverted to their pre-transformer state.

    Returns:
        The reverted data.
    """
    reverted_data = super().revert(data, **kwargs)
    data[self.original_column_name] = pd.to_datetime(
        reverted_data[self.original_column_name].astype("Int64"), unit="ns"
    )
    return data