load_csv
Module for loading CSV data into pandas DataFrames with custom error handling and logging.
Functions:
Name | Description |
---|---|
load_devices_datasets | Loads device datasets from CSV files specified in the pipeline configuration. |
Constants
NA_VALUES : List[str] A list of strings that should be treated as NA values when loading CSV files.
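As a rough illustration of what such a list does, the sketch below passes a list of NA markers to `pandas.read_csv` through its `na_values` parameter. The name `NA_VALUES_EXAMPLE` and its contents are hypothetical; the actual contents of `NA_VALUES` are not shown in this reference.

```python
import io

import pandas as pd

# Hypothetical markers for illustration only; not the module's real NA_VALUES.
NA_VALUES_EXAMPLE = ["unknown", "-"]

# In-memory CSV so the example needs no file on disk.
csv_text = "device_id,reading\nA1,3.5\nA2,unknown\nA3,-\n"

# Cells matching any listed string are parsed as NaN, in addition to
# the markers pandas recognizes by default.
df = pd.read_csv(io.StringIO(csv_text), na_values=NA_VALUES_EXAMPLE)

# Count the readings that were treated as missing.
missing_count = int(df["reading"].isna().sum())
```

Because the markers become NaN during parsing, the `reading` column is read as a float column rather than as strings.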
load_devices_datasets(pipeline_config)
Loads the device datasets from CSV files specified in the pipeline configuration.
Expects pipeline_config to contain a dataset_config dictionary with dataset names as
keys and their respective loading parameters as values, for example:
pipeline_config.dataset_config = {
    "dataset_name": {
        "file_path": "path/to/csv_file.csv",
        # Additional parameters for loading the CSV file can be included here.
    },
    ...
}
The function loads each dataset into a pandas DataFrame and adds it to that dataset's
dictionary under the key "data". If a dataset's dictionary already contains a "data" key, that entry is removed
before the new DataFrame is loaded.
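The behavior described above can be sketched roughly as follows. This is not the module's implementation: `load_datasets_sketch` is a hypothetical stand-in that takes the `dataset_config` dictionary directly, and it assumes that every key other than `"file_path"` and `"data"` is forwarded to `pandas.read_csv` as a keyword argument. Error handling and logging are omitted.

```python
import os
import tempfile
from typing import Any, Dict

import pandas as pd


def load_datasets_sketch(
    dataset_config: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Hypothetical sketch: load each dataset's CSV into its "data" key."""
    for params in dataset_config.values():
        # Drop any previously loaded DataFrame before reloading.
        params.pop("data", None)
        # Assumption: all remaining keys except "file_path" are read_csv kwargs.
        read_kwargs = {k: v for k, v in params.items() if k != "file_path"}
        params["data"] = pd.read_csv(params["file_path"], **read_kwargs)
    return dataset_config


# Usage: write a small semicolon-separated file and load it.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = os.path.join(tmp_dir, "devices.csv")
    with open(csv_path, "w", encoding="utf-8") as handle:
        handle.write("device_id;reading\nA1;3.5\nA2;4.0\n")

    config = {"devices": {"file_path": csv_path, "sep": ";", "data": "stale"}}
    loaded = load_datasets_sketch(config)
```

The sketch mutates and returns the same dictionary it receives, which matches the documented return value of "the input dictionary with an additional key".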
Parameters:
Name | Type | Description | Default |
---|---|---|---|
pipeline_config | Config | The configuration object containing dataset information. | required |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, Any]] | The input dictionary with an additional key "data" in each inner dictionary, containing the loaded DataFrame. |
Raises:
Type | Description |
---|---|
NoDatasetsProvidedError | If the |