Refactor standardize function for enhanced flexibility
This merge request makes several significant changes to the standardize function. While the overall functionality remains similar, I have simplified the function, removed a couple of parameters, deprecated others, and changed the functionality of the 'labels' parameter.
- Removed the 'start_labels_at_1' parameter in favour of an enhanced 'labels' parameter.
- Unified the 'table' and 'path' parameters into a single 'annotations' parameter to simplify the function's interface. Added deprecation warnings for 'table' and 'path' to maintain backward compatibility. The 'annotations' parameter accepts either a pandas DataFrame (like the old 'table' argument) or a path to a CSV file (like the old 'path' argument).
- Removed 'mapper' argument and deprecated '_create_label_dict' to encourage direct manipulation of DataFrame columns before calling standardize, streamlining the function’s operation.
- Removed the missing_columns function in favour of checking for missing columns directly in the standardize function code.
- Improved label handling logic to adapt dynamically to the provided 'labels' configuration, such as custom mappings and automatic integer mapping. The following options are possible:
  - auto (default): the function automatically maps all the labels in the table to integers starting from 0.
  - None: no mapping is done. This is useful if the user already has the label configuration they want.
  - list: maps the labels in the list to integers starting from 0. Any remaining label is mapped to -1.
  - dict: full control over how the labels are mapped. Any label not in the dict is mapped to -1.
- Updated documentation, tests and examples to reflect new functionality and argument handling.
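
For reviewers, the four 'labels' options described above can be sketched roughly as follows. This is an illustration only, not the code in this merge request: `map_labels` is a hypothetical helper, it works on a plain list rather than a DataFrame column, and ordering the automatic mapping by sorted label value is an assumption.

```python
def map_labels(values, labels="auto"):
    """Hypothetical helper illustrating the 'labels' options (not the real API)."""
    values = list(values)

    # None: leave the labels exactly as the user provided them.
    if labels is None:
        return values

    if isinstance(labels, str):
        if labels != "auto":
            raise ValueError(f"Unsupported labels option: {labels!r}")
        # auto: every distinct label gets an integer starting from 0.
        # Assumption: integers are assigned in sorted label order.
        mapping = {label: i for i, label in enumerate(sorted(set(values)))}
    elif isinstance(labels, dict):
        # dict: the user controls the mapping completely.
        mapping = labels
    elif isinstance(labels, (list, tuple)):
        # list: position in the list becomes the integer label.
        mapping = {label: i for i, label in enumerate(labels)}
    else:
        raise TypeError("labels must be 'auto', None, a list, or a dict")

    # Anything not covered by the mapping falls back to -1.
    return [mapping.get(value, -1) for value in values]


example = ["whale", "bird", "whale", "noise"]
print(map_labels(example))                    # automatic integer mapping
print(map_labels(example, None))              # unchanged
print(map_labels(example, ["whale", "bird"])) # 'noise' becomes -1
print(map_labels(example, {"whale": 5}))      # only 'whale' is kept
```

With the 'mapper' argument removed, any other renaming is expected to be done on the DataFrame columns before calling standardize.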
This merge request does not handle the modifications discussed in #31.
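
As a rough illustration of the backward-compatibility behaviour for 'table' and 'path', the deprecated arguments could be folded into 'annotations' along these lines. `resolve_annotations` is a hypothetical name, and the warning text and precedence rules are assumptions rather than the actual implementation in this merge request.

```python
import os
import warnings


def resolve_annotations(annotations=None, table=None, path=None):
    """Hypothetical helper: fold deprecated 'table'/'path' into 'annotations'."""
    for name, value in (("table", table), ("path", path)):
        if value is not None:
            warnings.warn(
                f"'{name}' is deprecated; use 'annotations' instead.",
                DeprecationWarning,
                stacklevel=2,
            )
            # Assumption: an explicit 'annotations' value takes precedence.
            if annotations is None:
                annotations = value

    if annotations is None:
        raise ValueError("No annotations were provided.")

    # A string or path-like value is treated as a CSV file to load.
    if isinstance(annotations, (str, os.PathLike)):
        import pandas as pd  # imported lazily; only needed for CSV paths

        return pd.read_csv(annotations)

    # Otherwise the value is assumed to already be a pandas DataFrame.
    return annotations
```

Old calls that pass 'table' or 'path' would keep working under this sketch, but would emit a DeprecationWarning pointing at the caller.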