pandas_utils

Pandas utils are used to handle common actions performed on pandas dataframes.

convert_hourly_time_series(df, target_resolution_seconds, value_distribution)

Resamples an hourly time series dataframe to a finer resolution.

This function expands an input dataframe with hourly resolution (3600 seconds) into smaller time intervals based on the specified target_resolution_seconds. Each new row maintains the original variable_value or distributes it proportionally, depending on the selected value_distribution method.
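To make the two value_distribution modes concrete, here is a minimal standalone sketch of the expansion step. It is not the library's implementation: the helper name expand_hourly_rows is hypothetical, and all input validation is left out.

```python
import numpy as np
import pandas as pd


def expand_hourly_rows(df, target_resolution_seconds, value_distribution):
    """Hypothetical helper: split each hourly row into finer intervals."""
    factor = 3600 // target_resolution_seconds
    # Repeat every hourly row once per new interval.
    out = df.loc[df.index.repeat(factor)].reset_index(drop=True)
    # Position of each repeated row inside its hour: 0, 1, ..., factor - 1.
    steps = np.tile(np.arange(factor), len(df))
    offsets = pd.to_timedelta(steps * target_resolution_seconds, unit="s")
    out["start_time_lb_utc"] = out["start_time_lb_utc"] + offsets
    out["stop_time_lb_utc"] = out["start_time_lb_utc"] + pd.Timedelta(
        seconds=target_resolution_seconds
    )
    if value_distribution == "divide":
        # Split the hourly value evenly across the new intervals.
        out["variable_value"] = out["variable_value"] / factor
    out["resolution_seconds"] = target_resolution_seconds
    return out


hourly = pd.DataFrame(
    {
        "start_time_lb_utc": [pd.Timestamp("2025-02-10 12:00:00", tz="UTC")],
        "stop_time_lb_utc": [pd.Timestamp("2025-02-10 13:00:00", tz="UTC")],
        "variable_id": ["A"],
        "variable_value": [100],
        "resolution_seconds": [3600],
    }
)
divided = expand_hourly_rows(hourly, 900, "divide")
same = expand_hourly_rows(hourly, 900, "same")
print(divided["variable_value"].tolist())  # [25.0, 25.0, 25.0, 25.0]
print(same["variable_value"].tolist())     # [100, 100, 100, 100]
```

"divide" suits quantities that add up over time (for example energy), while "same" suits quantities that describe a level (for example power or price).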

Parameters:

Name Type Description Default
df DataFrame

The input dataframe containing hourly data. Expected columns:
- "start_time_lb_utc" (datetime, UTC): Start timestamp of each hourly interval.
- "stop_time_lb_utc" (datetime, UTC): End timestamp of each hourly interval.
- "variable_id" (str): Identifier for the variable.
- "variable_value" (int | float): Measured value for the time interval.
- "resolution_seconds" (int): Original resolution of the data (must be 3600).

required
target_resolution_seconds int

The desired resolution for resampling. Must be a factor of 3600 (e.g., 900 for quarter-hourly, 300 for 5-minute).

required
value_distribution str

Specifies how variable_value is handled when resampling.
- "same": Each new interval retains the original hourly value.
- "divide": The original value is evenly split across the new intervals.

required

Returns: pd.DataFrame: A dataframe with the updated resolution, containing columns: "start_time_lb_utc", "stop_time_lb_utc", "variable_id", "variable_value", and "resolution_seconds".

Raises:

Type Description
ValueError

If the input dataframe is missing required columns.

ValueError

If resolution_seconds is not 3600 in the input dataframe.

ValueError

If target_resolution_seconds is not a factor of 3600.

ValueError

If value_distribution is not one of "same" or "divide".
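The "factor of 3600" rule can be checked before calling the function. The sketch below uses a hypothetical helper name, is_valid_target_resolution; the positivity guard is an addition of this sketch and is not stated in the documentation above.

```python
def is_valid_target_resolution(target_resolution_seconds: int) -> bool:
    """Hypothetical helper: True if the value evenly divides one hour."""
    # The `> 0` guard is an assumption of this sketch: it rejects negative
    # values and avoids a ZeroDivisionError for 0.
    return target_resolution_seconds > 0 and 3600 % target_resolution_seconds == 0


candidates = (60, 300, 700, 900, 1800, 7200)
valid = [s for s in candidates if is_valid_target_resolution(s)]
print(valid)  # [60, 300, 900, 1800]
```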

Example
import pandas as pd

from physical_operations_utils.pandas_utils import convert_hourly_time_series

df = pd.DataFrame({
    "start_time_lb_utc": [pd.Timestamp("2025-02-10 12:00:00", tz="UTC")],
    "stop_time_lb_utc": [pd.Timestamp("2025-02-10 13:00:00", tz="UTC")],
    "variable_id": ["A"],
    "variable_value": [100],
    "resolution_seconds": [3600]
})

df_resampled = convert_hourly_time_series(df, target_resolution_seconds=900, value_distribution="divide")
print(df_resampled)
Source code in physical_operations_utils/pandas_utils.py
def convert_hourly_time_series(
    df: pd.DataFrame, target_resolution_seconds: int, value_distribution: str
):
    """
    Resamples an hourly time series dataframe to a finer resolution.

    This function expands an input dataframe with hourly resolution (3600 seconds)
    into smaller time intervals based on the specified `target_resolution_seconds`.
    Each new row maintains the original `variable_value` or distributes it
    proportionally, depending on the selected `value_distribution` method.

    Parameters:
        df (pd.DataFrame): The input dataframe containing hourly data.
            Expected columns:
            - "start_time_lb_utc" (datetime, UTC): Start timestamp of each hourly interval.
            - "stop_time_lb_utc" (datetime, UTC): End timestamp of each hourly interval.
            - "variable_id" (str): Identifier for the variable.
            - "variable_value" (int | float): Measured value for the time interval.
            - "resolution_seconds" (int): Original resolution of the data (must be 3600).

        target_resolution_seconds (int): The desired resolution for resampling.
            Must be a factor of 3600 (e.g., 900 for quarter-hourly, 300 for 5-minute).

        value_distribution (str): Specifies how `variable_value` is handled when resampling.
            - "same": Each new interval retains the original hourly value.
            - "divide": The original value is evenly split across the new intervals.

    Returns:
    pd.DataFrame: A dataframe with the updated resolution, containing columns:
    "start_time_lb_utc", "stop_time_lb_utc", "variable_id", "variable_value",
    and "resolution_seconds".


    Raises:
        ValueError: If the input dataframe is missing required columns.
        ValueError: If `resolution_seconds` is not 3600 in the input dataframe.
        ValueError: If `target_resolution_seconds` is not a factor of 3600.
        ValueError: If `value_distribution` is not one of "same" or "divide".

    Example:
        ```python
        import pandas as pd

        from physical_operations_utils.pandas_utils import convert_hourly_time_series

        df = pd.DataFrame({
            "start_time_lb_utc": [pd.Timestamp("2025-02-10 12:00:00", tz="UTC")],
            "stop_time_lb_utc": [pd.Timestamp("2025-02-10 13:00:00", tz="UTC")],
            "variable_id": ["A"],
            "variable_value": [100],
            "resolution_seconds": [3600]
        })

        df_resampled = convert_hourly_time_series(df, target_resolution_seconds=900, value_distribution="divide")
        print(df_resampled)
        ```
    """
    required_columns = {
        "start_time_lb_utc",
        "stop_time_lb_utc",
        "variable_id",
        "variable_value",
        "resolution_seconds",
    }

    if df.empty:
        return df

    # 🔹 If 'start_time_lb_utc' is in index, reset it to columns
    if "start_time_lb_utc" not in df.columns and df.index.name == "start_time_lb_utc":
        df = df.reset_index()

    # 🔹 Ensure no extra spaces or formatting issues in column names
    df.columns = df.columns.str.strip()

    # 🔹 Check for missing required columns
    missing_columns = required_columns - set(df.columns)
    if missing_columns:
        raise ValueError(
            f"Input dataframe must contain columns: {required_columns}. Missing: {missing_columns}"
        )

    # Validate UTC datetime columns
    validate_df_column_is_utc_datetime(df, "start_time_lb_utc")
    validate_df_column_is_utc_datetime(df, "stop_time_lb_utc")

    # Ensure the resolution is hourly (3600 seconds)
    if any(df["resolution_seconds"] != 3600):
        raise ValueError("Input dataframe must have a resolution of 3600 seconds.")

    # Validate that target resolution is a factor of the input resolution
    if 3600 % target_resolution_seconds != 0:
        raise ValueError(
            "Target resolution must be a factor of the input resolution (3600 seconds)."
        )

    # Validate value_distribution option
    if value_distribution not in {"same", "divide"}:
        raise ValueError(
            "Invalid value_distribution option. Choose either 'same' or 'divide'."
        )

    factor = 3600 // target_resolution_seconds

    df_expanded = df.loc[df.index.repeat(factor)].reset_index(drop=True)

    df_expanded["start_time_lb_utc"] = df_expanded.groupby(df_expanded.index // factor)[
        "start_time_lb_utc"
    ].transform(
        lambda x: x
        + pd.to_timedelta(range(factor), unit="s") * target_resolution_seconds
    )

    df_expanded["stop_time_lb_utc"] = df_expanded["start_time_lb_utc"] + pd.Timedelta(
        seconds=target_resolution_seconds
    )

    if value_distribution == "divide":
        df_expanded["variable_value"] /= factor

    df_expanded["resolution_seconds"] = target_resolution_seconds

    return df_expanded

filter_dataframe_by_resolution_seconds(df, keep_resolutions)

Filters a dataframe by the resolution in seconds of each row. Keeps only the rows whose resolution is listed in keep_resolutions.

Parameters:

Name Type Description Default
df DataFrame

The input dataframe to be filtered. Must contain an int64 column called resolution_seconds.

required
keep_resolutions List[int]

A list of resolutions to keep. Must contain only integers and cannot be empty.

required

Returns:

Type Description
DataFrame

pd.DataFrame: A DataFrame containing only those rows from the input where the resolution is in keep_resolutions.

Raises:

Type Description
ValueError

If the dataframe does not have an int64 column resolution_seconds, if keep_resolutions is empty, or if keep_resolutions has non-integer elements.
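The int64 requirement is easy to trip over when the column contains missing values, because pandas then stores it as float64. The following sketch, written with plain pandas rather than the library function, shows the row filter and one possible way to restore the dtype; dropping the incomplete rows is an assumption about what the caller wants.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "column1": ["keep", "discard", "keep"],
        "resolution_seconds": [3600, 666, 900],
    }
)
# Row filter: keep rows whose resolution is in the allowed list.
mask = df["resolution_seconds"].isin([3600, 900])
kept = df.loc[mask].reset_index(drop=True)
print(kept["resolution_seconds"].tolist())  # [3600, 900]

# A missing value silently turns the column into float64.
with_gap = pd.DataFrame({"resolution_seconds": [3600, float("nan"), 900]})
print(with_gap["resolution_seconds"].dtype)  # float64

# One option: drop incomplete rows, then cast back to int64.
cleaned = with_gap.dropna(subset=["resolution_seconds"]).astype(
    {"resolution_seconds": "int64"}
)
print(cleaned["resolution_seconds"].dtype)  # int64
```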

Example
import pandas as pd

from physical_operations_utils.pandas_utils import (
    filter_dataframe_by_resolution_seconds,
)

df = pd.DataFrame(
    data={
        "column1": ["keep", "discard", "keep", "discard", "keep", "keep"],
        "resolution_seconds": [3600, 666, 900, 666, 900, 900],
    }
)
keep_resolutions = [3600, 900]

res = filter_dataframe_by_resolution_seconds(
    df=df, keep_resolutions=keep_resolutions
)

print(res)
Source code in physical_operations_utils/pandas_utils.py
def filter_dataframe_by_resolution_seconds(
    df: pd.DataFrame, keep_resolutions: List[int]
) -> pd.DataFrame:
    """
    Filters a dataframe by the resolution in seconds of each row. Keeps only the rows whose resolution is listed in keep_resolutions.

    Parameters:
        df (pd.DataFrame): The input dataframe to be filtered. Must contain an int64 column called resolution_seconds.
        keep_resolutions (List[int]): A list of resolutions to keep. Must contain only integers and cannot be empty.

    Returns:
        pd.DataFrame: A DataFrame containing only those rows from the input where the resolution is in keep_resolutions.

    Raises:
        ValueError: If the dataframe does not have an int64 column resolution_seconds, if keep_resolutions is empty, or if keep_resolutions has non-integer elements.

    Example:
        ```python
        import pandas as pd

        from physical_operations_utils.pandas_utils import (
            filter_dataframe_by_resolution_seconds,
        )

        df = pd.DataFrame(
            data={
                "column1": ["keep", "discard", "keep", "discard", "keep", "keep"],
                "resolution_seconds": [3600, 666, 900, 666, 900, 900],
            }
        )
        keep_resolutions = [3600, 900]

        res = filter_dataframe_by_resolution_seconds(
            df=df, keep_resolutions=keep_resolutions
        )

        print(res)
        ```
    """
    if len(keep_resolutions) < 1:
        raise ValueError(
            "Input keep_resolutions must have at least one integer element"
        )
    if not all(isinstance(el, int) for el in keep_resolutions):
        raise ValueError("Input keep_resolutions must only contain integers")
    if "resolution_seconds" not in df.columns:
        raise ValueError("Input dataframe must have a column `resolution_seconds`")
    if df["resolution_seconds"].dtype != "int64":
        raise ValueError("Column `resolution_seconds` must be of type `int64`")
    df_filtered = df.copy(deep=True)
    return df_filtered.loc[
        df_filtered["resolution_seconds"].isin(keep_resolutions), :
    ].reset_index(drop=True)

generate_empty_df_with_start_stop_time_lb_utc_variable_id_and_variable_value(start_time_lb_utc, stop_time_lb_utc, resolution_seconds, variable_ids, variable_value_data_type)

Generates an empty pandas DataFrame with time intervals and variable metadata.

This function generates a DataFrame with time intervals between start_time_lb_utc and stop_time_lb_utc at the given resolution for each given variable_id. It also adds a variable_value column initialized based on the specified data type if the variable_value_data_type is not "None".

Parameters:

Name Type Description Default
start_time_lb_utc datetime

The starting time (UTC) for the generated intervals. Represents the start_time_lb_utc of the first row.

required
stop_time_lb_utc datetime

The stopping time (UTC) for the generated intervals. Represents the stop_time_lb_utc of the last row.

required
resolution_seconds int

The interval resolution in seconds.

required
variable_ids List[str]

A list of variable IDs to include in the DataFrame.

required
variable_value_data_type str

The data type of the variable_value column. Must be one of "str", "int", "float", "bool", or "None".

required

Returns:

Type Description
DataFrame

pd.DataFrame: A DataFrame containing time intervals, variable IDs, and initialized values.

Raises:

Type Description
ValueError

If variable_ids is empty or variable_value_data_type is invalid.
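The number of rows follows from how the intervals are laid out: start times run from start_time_lb_utc up to, but not including, stop_time_lb_utc. A minimal sketch with plain pandas, using the same inputs as the example on this page:

```python
import pandas as pd

start = pd.Timestamp("2025-02-01 12:00", tz="UTC")
stop = pd.Timestamp("2025-02-01 14:00", tz="UTC")
resolution_seconds = 3600

# inclusive="left" drops the stop time itself from the list of interval starts.
starts = pd.date_range(
    start, stop, freq=f"{resolution_seconds}s", inclusive="left"
)
stops = starts + pd.Timedelta(seconds=resolution_seconds)

print(len(starts))         # 2
print(stops[-1] == stop)   # True
```

With two variable IDs, as in the example, this yields 2 x 2 = 4 rows in total.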

Example
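from datetime import datetime
from zoneinfo import ZoneInfo

from physical_operations_utils.pandas_utils import (
    generate_empty_df_with_start_stop_time_lb_utc_variable_id_and_variable_value,
)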

df = generate_empty_df_with_start_stop_time_lb_utc_variable_id_and_variable_value(
    datetime(2025, 2, 1, 12, 0, tzinfo=ZoneInfo("UTC")),
    datetime(2025, 2, 1, 14, 0, tzinfo=ZoneInfo("UTC")),
    3600,
    ["temperature", "humidity"],
    "float"
)
print(df.head())
Source code in physical_operations_utils/pandas_utils.py
def generate_empty_df_with_start_stop_time_lb_utc_variable_id_and_variable_value(
    start_time_lb_utc: datetime,
    stop_time_lb_utc: datetime,
    resolution_seconds: int,
    variable_ids: List[str],
    variable_value_data_type: str,
) -> pd.DataFrame:
    """
    Generates an empty pandas DataFrame with time intervals and variable metadata.

    This function generates a DataFrame with time intervals between `start_time_lb_utc` and `stop_time_lb_utc` at the given resolution
    for each given `variable_id`. It also adds a `variable_value` column initialized based on the specified data type if the `variable_value_data_type` is not "None".

    Parameters:
        start_time_lb_utc (datetime): The starting time (UTC) for the generated intervals. Represents the start_time_lb_utc of the first row.
        stop_time_lb_utc (datetime): The stopping time (UTC) for the generated intervals. Represents the stop_time_lb_utc of the last row.
        resolution_seconds (int): The interval resolution in seconds.
        variable_ids (List[str]): A list of variable IDs to include in the DataFrame.
        variable_value_data_type (str): The data type of the `variable_value` column.
            Must be one of `"str"`, `"int"`, `"float"`, `"bool"`, or `"None"`.

    Returns:
        pd.DataFrame: A DataFrame containing time intervals, variable IDs, and initialized values.

    Raises:
        ValueError: If `variable_ids` is empty or `variable_value_data_type` is invalid.

    Example:
        ```python
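        from datetime import datetime
        from zoneinfo import ZoneInfo

        from physical_operations_utils.pandas_utils import (
            generate_empty_df_with_start_stop_time_lb_utc_variable_id_and_variable_value,
        )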

        df = generate_empty_df_with_start_stop_time_lb_utc_variable_id_and_variable_value(
            datetime(2025, 2, 1, 12, 0, tzinfo=ZoneInfo("UTC")),
            datetime(2025, 2, 1, 14, 0, tzinfo=ZoneInfo("UTC")),
            3600,
            ["temperature", "humidity"],
            "float"
        )
        print(df.head())
        ```
    """
    validate_datetime_in_utc(start_time_lb_utc)
    validate_datetime_in_utc(stop_time_lb_utc)
    validate_start_time_before_end_time(start_time_lb_utc, stop_time_lb_utc)
    if len(variable_ids) == 0:
        raise ValueError("Variable IDs list cannot be empty.")
    if variable_value_data_type not in ["str", "int", "float", "bool", "None"]:
        raise ValueError(
            "Variable value data type must be one of the following: 'str', 'int', 'float', 'bool', or 'None'."
        )
    frames = []
    for variable_id in variable_ids:
        # One block of time intervals per variable ID
        df = pd.DataFrame()
        df["start_time_lb_utc"] = pd.date_range(
            start_time_lb_utc,
            stop_time_lb_utc,
            freq=f"{resolution_seconds}s",
            tz=ZoneInfo("UTC"),
            inclusive="left",
        )
        df["stop_time_lb_utc"] = df["start_time_lb_utc"] + pd.Timedelta(
            seconds=resolution_seconds
        )
        df["variable_id"] = variable_id
        frames.append(df)
    # Concatenate once instead of growing a dataframe inside the loop
    final_df = pd.concat(frames)
    if variable_value_data_type == "str":
        final_df["variable_value"] = ""
    elif variable_value_data_type == "int":
        final_df["variable_value"] = 0
        final_df["variable_value"] = final_df["variable_value"].astype("int64")
    elif variable_value_data_type == "float":
        final_df["variable_value"] = 0.0
        final_df["variable_value"] = final_df["variable_value"].astype("float64")
    elif variable_value_data_type == "bool":
        final_df["variable_value"] = False
        final_df["variable_value"] = final_df["variable_value"].astype("bool")
    final_df["resolution_seconds"] = resolution_seconds
    final_df["resolution_seconds"] = final_df["resolution_seconds"].astype("int64")
    return final_df.reset_index(drop=True)

generate_formated_html_table_string_from_df(df, mark_current_utc_hour_bold_cursive=False, column_to_color_scale_map=None, thicker_border_columns=None, parse_snake_case_column_headers=None, scrollable=False, max_height_px=1000)

Generates an HTML table string from a pandas DataFrame with optional formatting features.

This function:
1. Converts a given DataFrame into an HTML table string.
2. Allows marking rows in bold and italic if the current UTC hour matches start_time_lb_utc.
3. Supports coloring for specific columns based on provided mappings, including:
   - "long_red_short_blue_coloring": Applies a gradient in which negative values are blue-shaded and positive values are red-shaded.
   - "red_negative_green_positive": Colors negative values red and positive values green.
   - "red_positive_green_negative": Colors positive values red and negative values green.
   - "risk_meter": Applies a gradient in which values below -0.1 are blue-shaded and values above 0.1 are red-shaded.
4. Adds thicker borders to specified columns (left or right).
5. Allows parsing snake_case column headers into properly formatted titles.

Parameters:

Name Type Description Default
df DataFrame

The DataFrame to convert into an HTML table.

required
mark_current_utc_hour_bold_cursive bool

If True, rows containing the current UTC hour in the "start_time_lb_utc" or "start_time_lb_stockholm" column will be highlighted in bold and italic.

False
column_to_color_scale_map Dict[str, str]

A mapping of column names to color scale names.
- "long_red_short_blue_coloring": Applies a gradient in which negative values are blue-shaded and positive values are red-shaded.
- "red_negative_green_positive": Applies red for negative values and green for positive values.
- "red_positive_green_negative": Applies red for positive values and green for negative values.
- "risk_meter": Uses a predefined gradient for risk values.
- Any unrecognized color scale name logs a warning, and the cell keeps the default color.

None
thicker_border_columns Dict[str, str]

A mapping of column names to border positions ("left" or "right") for applying a thicker border.

None
parse_snake_case_column_headers List[str]

A list of column names to format from snake_case to Title Case. Example: "start_time_lb_utc" -> "Start Time Lb Utc".

None
scrollable bool

If True, makes the table scrollable with a fixed height.

False
max_height_px int

The maximum height of the scrollable div in pixels. Only applied if scrollable is True. Default is 1000.

1000

Returns:

Name Type Description
str str

The generated HTML table as a string.

Raises:

Type Description
KeyError

If a required column for formatting (e.g., "start_time_lb_utc") is missing.
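The header conversion applied to columns listed in parse_snake_case_column_headers is a one-line string transformation in the source listing on this page. The sketch below isolates it; the function name snake_case_to_title is hypothetical.

```python
def snake_case_to_title(column_name: str) -> str:
    """Hypothetical helper: turn 'snake_case' into 'Snake Case'."""
    # Replace underscores with spaces, then capitalize each word.
    return " ".join(column_name.split("_")).title()


print(snake_case_to_title("start_time_lb_utc"))  # Start Time Lb Utc
print(snake_case_to_title("risk_score"))         # Risk Score
```

Columns that are not listed keep their original name as the header.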

Example
import pandas as pd

df = pd.DataFrame({
    "start_time_lb_utc": ["2025-02-01 14:00:00+00:00"],
    "value": [10],
    "risk_score": [-0.5]
})

html_table = generate_formated_html_table_string_from_df(
    df,
    mark_current_utc_hour_bold_cursive=True,
    column_to_color_scale_map={"value": "red_negative_green_positive", "risk_score": "risk_meter"},
    thicker_border_columns={"value": "right"},
    parse_snake_case_column_headers=["start_time_lb_utc", "risk_score"]
)
print(html_table)  # Outputs an HTML table string
Source code in physical_operations_utils/pandas_utils.py
def generate_formated_html_table_string_from_df(  # noqa: C901
    df: pd.DataFrame,
    mark_current_utc_hour_bold_cursive: bool = False,
    column_to_color_scale_map: Dict[str, str] = None,
    thicker_border_columns: Dict[str, str] = None,
    parse_snake_case_column_headers: List[str] = None,
    scrollable: bool = False,
    max_height_px: int = 1000,
) -> str:
    """
    Generates an HTML table string from a pandas DataFrame with optional formatting features.

    This function:
    1. Converts a given DataFrame into an HTML table string.
    2. Allows marking rows in **bold and italic** if the current UTC hour matches `start_time_lb_utc`.
    3. Supports **coloring** for specific columns based on provided mappings, including:
       - "long_red_short_blue_coloring": Applies a gradient in which negative values are blue-shaded and positive values are red-shaded.
       - "red_negative_green_positive": Colors negative values red and positive values green.
       - "red_positive_green_negative": Colors positive values red and negative values green.
       - "risk_meter": Applies a gradient in which values below -0.1 are blue-shaded and values above 0.1 are red-shaded.
    4. Adds **thicker borders** to specified columns (left or right).
    5. Allows **parsing snake_case column headers** into properly formatted titles.

    Parameters:
        df (pd.DataFrame): The DataFrame to convert into an HTML table.
        mark_current_utc_hour_bold_cursive (bool, optional): If `True`, rows containing the current UTC hour
            in the "start_time_lb_utc" or "start_time_lb_stockholm" column will be highlighted in **bold** and *italic*.
        column_to_color_scale_map (Dict[str, str], optional): A mapping of column names to color scale names.
            - "long_red_short_blue_coloring": Applies a gradient in which negative values are blue-shaded and positive values are red-shaded.
            - "red_negative_green_positive": Applies red for negative values and green for positive values.
            - "red_positive_green_negative": Applies red for positive values and green for negative values.
            - "risk_meter": Uses a predefined gradient for risk values.
            - Any unrecognized color scale name logs a warning, and the cell keeps the default color.
        thicker_border_columns (Dict[str, str], optional): A mapping of column names to border positions ("left" or "right")
            for applying a **thicker border**.
        parse_snake_case_column_headers (List[str], optional): A list of column names to format from snake_case to Title Case.
            Example: "start_time_lb_utc" -> "Start Time Lb Utc".
        scrollable (bool, optional): If `True`, makes the table scrollable with a fixed height.
        max_height_px (int, optional): The maximum height of the scrollable div in pixels. Only applied if `scrollable` is `True`. Default is 1000.

    Returns:
        str: The generated HTML table as a string.

    Raises:
        KeyError: If a required column for formatting (e.g., "start_time_lb_utc") is missing.

    Example:
        ```python
        import pandas as pd

        df = pd.DataFrame({
            "start_time_lb_utc": ["2025-02-01 14:00:00+00:00"],
            "value": [10],
            "risk_score": [-0.5]
        })

        html_table = generate_formated_html_table_string_from_df(
            df,
            mark_current_utc_hour_bold_cursive=True,
            column_to_color_scale_map={"value": "red_negative_green_positive", "risk_score": "risk_meter"},
            thicker_border_columns={"value": "right"},
            parse_snake_case_column_headers=["start_time_lb_utc", "risk_score"]
        )
        print(html_table)  # Outputs an HTML table string
        ```
    """
    if column_to_color_scale_map is None:
        column_to_color_scale_map = {}
    if thicker_border_columns is None:
        thicker_border_columns = {}
    if parse_snake_case_column_headers is None:
        parse_snake_case_column_headers = []

    scrollable_div_open = (
        f'<div style="max-height: {max_height_px}px; overflow-y: auto; margin: 10px;">'
        if scrollable
        else "<div style='margin: 10px;'>"
    )
    scrollable_div_close = "</div>"

    html_table = '<table style="border: 1px solid black; width: 100%; border-collapse: separate; border-spacing: 0;">'
    html_table += '<thead style="position: sticky; top: 0; background-color: #ffffff; z-index: 100;"><tr>'

    # Apply a thicker border if the column is in thicker_border_columns
    for col in df.columns:
        if col in thicker_border_columns.keys() and thicker_border_columns[col] in [
            "right",
            "left",
        ]:
            border_side = thicker_border_columns[col]
            border_style = f"border-{border_side}: 4px solid black"
        else:
            border_style = ""
        if col in parse_snake_case_column_headers:
            col_header = " ".join(col.split("_")).title()
        else:
            col_header = col
        html_table += f'<th style="border: 1px solid black; {border_style}; background-color: #ffffff; z-index: 100; padding: 2px;">{col_header}</th>'
    html_table += "</tr></thead><tbody>"

    # Add rows
    for _, row in df.iterrows():
        # Apply bold and cursive style to the row if the current UTC hour is in the row
        if mark_current_utc_hour_bold_cursive:
            if "start_time_lb_utc" in df.columns:
                is_current_hour = is_datetime_in_current_utc_hour(
                    row["start_time_lb_utc"]
                )
            elif "start_time_lb_stockholm" in df.columns:
                is_current_hour = is_datetime_in_current_utc_hour(
                    row["start_time_lb_stockholm"].tz_convert(ZoneInfo("UTC"))
                )
            else:
                is_current_hour = False
            if is_current_hour:
                html_table += '<tr style="font-weight: bold; font-style: italic;">'
            else:
                html_table += "<tr>"
        else:
            html_table += "<tr>"

        # Add and format columns
        for col in df.columns:
            # Apply a thicker border if the column is in thicker_border_columns
            if col in thicker_border_columns.keys() and thicker_border_columns[col] in [
                "right",
                "left",
            ]:
                border_side = thicker_border_columns[col]
                border_style = f"border-{border_side}: 4px solid black;"
            else:
                border_style = ""

            # Apply color scale if the column is in column_to_color_scale_map
            value = row[col]
            color = DEFUALT_COLOR
            if col in column_to_color_scale_map.keys():
                if column_to_color_scale_map[col] == "long_red_short_blue_coloring":
                    color = get_long_red_short_blue_coloring(value)
                elif column_to_color_scale_map[col] == "red_negative_green_positive":
                    color = get_color_red_negative_green_positive(value)
                elif column_to_color_scale_map[col] == "red_positive_green_negative":
                    color = get_color_red_negative_green_positive(value, invert=True)
                elif column_to_color_scale_map[col] == "risk_meter":
                    color = get_color_risk_meter(value)
                else:
                    logging.warning(
                        f"Not implemented color scale for column {col}: {column_to_color_scale_map[col]}"
                    )
            html_table += f'<td style="border: 1px solid black; background-color: {color}; {border_style}; padding: 2px;">{value}</td>'
        html_table += "</tr>"

    html_table += "</tbody></table>"
    return scrollable_div_open + html_table + scrollable_div_close

get_color_red_negative_green_positive(value, invert=False)

Returns a color code based on whether the value is positive or negative.

This function:
- Returns red (#FF0000) if the value is negative.
- Returns green (#008000) if the value is positive.
- Returns DEFUALT_COLOR for zero or non-numeric values.

Parameters:

Name Type Description Default
value int | float

The numeric value to evaluate.

required
invert bool

If True, the color mapping is inverted (red for positive, green for negative). Default is False.

False

Returns:

Name Type Description
str str

A hex color code representing the mapped color.
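"Non-numeric" here follows Python's isinstance check against int and float, which has a consequence for values taken from NumPy-backed data. The sketch below re-creates the sign mapping from the source listing on this page under a hypothetical name, sign_color, and uses a placeholder DEFAULT_COLOR because the actual value of DEFUALT_COLOR is not shown here.

```python
import numpy as np

DEFAULT_COLOR = "#FFFFFF"  # placeholder, not the library's actual default


def sign_color(value, invert=False):
    """Hypothetical re-creation of the sign-to-color mapping."""
    if isinstance(value, (int, float)):
        red, green = "#FF0000", "#008000"
        negative, positive = (green, red) if invert else (red, green)
        if value < 0:
            return negative
        if value > 0:
            return positive
    return DEFAULT_COLOR


print(sign_color(-10))                  # #FF0000
print(sign_color(-10, invert=True))     # #008000
print(sign_color(np.float64(-10)))      # #FF0000 (np.float64 subclasses float)
print(sign_color(np.int64(-10)))        # #FFFFFF (np.int64 is not a Python int)
print(sign_color(int(np.int64(-10))))   # #FF0000 after an explicit conversion
```

If a column unexpectedly stays uncolored, checking the type of the individual cell values is a reasonable first step.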

Example
get_color_red_negative_green_positive(-10)  # Output: "#FF0000"
get_color_red_negative_green_positive(20)   # Output: "#008000"
get_color_red_negative_green_positive(0)    # Output: DEFUALT_COLOR
Source code in physical_operations_utils/pandas_utils.py
def get_color_red_negative_green_positive(value, invert: bool = False) -> str:
    """
    Returns a color code based on whether the value is positive or negative.

    This function:
    - Returns **red (`#FF0000`)** if the value is negative.
    - Returns **green (`#008000`)** if the value is positive.
    - Returns `DEFUALT_COLOR` for zero or non-numeric values.

    Parameters:
        value (int | float): The numeric value to evaluate.
        invert (bool, optional): If `True`, the color mapping is inverted (red for positive, green for negative). Default is `False`.

    Returns:
        str: A hex color code representing the mapped color.

    Example:
        ```python
        get_color_red_negative_green_positive(-10)  # Output: "#FF0000"
        get_color_red_negative_green_positive(20)   # Output: "#008000"
        get_color_red_negative_green_positive(0)    # Output: DEFUALT_COLOR
        ```
    """
    if isinstance(value, (int, float)):
        red = "#FF0000"
        green = "#008000"
        if value < 0:
            if invert:
                return green
            else:
                return red
        if value > 0:
            if invert:
                return red
            else:
                return green
    return DEFUALT_COLOR

get_color_risk_meter(value)

Returns a color code based on the provided numerical value, applying a gradient where negative values are blue-shaded and positive values are red-shaded.

This function:
- Maps values below -0.1 to varying shades of blue.
- Maps values above 0.1 to varying shades of red/orange.
- Returns DEFUALT_COLOR for values from -0.1 to 0.1 (inclusive) and for non-numeric values.

Parameters:

Name Type Description Default
value int | float

The numeric value to evaluate.

required

Returns:

Name Type Description
str str

A hex color code representing the mapped color.
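The band edges are easier to see in a table-driven form. The sketch below is an alternative formulation, not the library's code: it uses bisect on the absolute value with the thresholds and hex codes from the source listing on this page. The name risk_color is hypothetical and DEFAULT_COLOR is a placeholder for the library's DEFUALT_COLOR.

```python
from bisect import bisect_left

DEFAULT_COLOR = "#FFFFFF"  # placeholder, not the library's actual default
BOUNDS = [0.1, 0.3, 0.5, 0.7, 1.0]
# Index 0 means |value| <= 0.1, which has no shade.
RED_SHADES = [None, "#FEB24C", "#FD8D3C", "#FC4E2A", "#E31A1C", "#B10026"]
BLUE_SHADES = [None, "#D8E9F3", "#B2D3E6", "#8DBDD8", "#68A7CA", "#4390BC"]


def risk_color(value):
    """Hypothetical table-driven version of the risk meter mapping."""
    if not isinstance(value, (int, float)):
        return DEFAULT_COLOR
    shades = RED_SHADES if value > 0 else BLUE_SHADES
    # bisect_left counts the bounds strictly below |value|, so a value that
    # sits exactly on a bound falls into the band closer to zero.
    band = bisect_left(BOUNDS, abs(value))
    return shades[band] or DEFAULT_COLOR


print(risk_color(-0.4))  # #B2D3E6
print(risk_color(0.6))   # #FC4E2A
print(risk_color(0.1))   # #FFFFFF (the edge itself is not colored)
print(risk_color(-20))   # #4390BC
print(risk_color(30))    # #B10026
```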

Example
get_color_risk_meter(-0.4)  # Output: "#B2D3E6"
get_color_risk_meter(0.6)   # Output: "#FC4E2A"
get_color_risk_meter(0)     # Output: DEFUALT_COLOR
Source code in physical_operations_utils/pandas_utils.py
def get_color_risk_meter(value) -> str:
    """
    Returns a color code based on the provided numerical value, applying
    a gradient where negative values are blue-shaded and positive values
    are red-shaded.

    This function:
    - Maps values **below -0.1** to progressively darker shades of **blue** (darkest below -1.0).
    - Maps values **above 0.1** to progressively darker shades of **orange/red** (darkest above 1.0).
    - Returns `DEFUALT_COLOR` for values from -0.1 to 0.1 (inclusive) and for non-numeric values.

    Parameters:
        value (int | float): The numeric value to evaluate.

    Returns:
        str: A hex color code representing the mapped color.

    Example:
        ```python
        get_color_risk_meter(-0.4)  # Output: "#B2D3E6"
        get_color_risk_meter(0.6)   # Output: "#FC4E2A"
        get_color_risk_meter(0)    # Output: DEFUALT_COLOR
        ```
    """
    if isinstance(value, (int, float)):
        if -0.3 <= value < -0.1:
            return "#D8E9F3"
        if -0.5 <= value < -0.3:
            return "#B2D3E6"
        if -0.7 <= value < -0.5:
            return "#8DBDD8"
        if -1.0 <= value < -0.7:
            return "#68A7CA"
        if value < -1.0:
            return "#4390BC"
        if 0.1 < value <= 0.3:
            return "#FEB24C"
        if 0.3 < value <= 0.5:
            return "#FD8D3C"
        if 0.5 < value <= 0.7:
            return "#FC4E2A"
        if 0.7 < value <= 1.0:
            return "#E31A1C"
        if value > 1.0:
            return "#B10026"
    return DEFUALT_COLOR
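
The chain of comparisons above is symmetric around zero: negative buckets include their lower edge and positive buckets include their upper edge, so the bucket depends only on the magnitude of the value. A minimal table-driven sketch of the same mapping, using `bisect` from the standard library, might look like this. The function name `risk_meter_color_lookup` and the `"#FFFFFF"` value for `DEFUALT_COLOR` are assumptions for illustration; the edges and palettes are copied from the source above.

```python
from bisect import bisect_left

# Assumed placeholder; the real DEFUALT_COLOR is defined elsewhere in pandas_utils.
DEFUALT_COLOR = "#FFFFFF"

# Bucket edges and palettes copied from get_color_risk_meter, lightest to darkest.
EDGES = [0.1, 0.3, 0.5, 0.7, 1.0]
BLUES = ["#D8E9F3", "#B2D3E6", "#8DBDD8", "#68A7CA", "#4390BC"]
REDS = ["#FEB24C", "#FD8D3C", "#FC4E2A", "#E31A1C", "#B10026"]


def risk_meter_color_lookup(value) -> str:
    """Hypothetical table-driven equivalent of get_color_risk_meter."""
    if not isinstance(value, (int, float)):
        return DEFUALT_COLOR
    # Count the edges strictly below |value|; 0 means the neutral band [-0.1, 0.1].
    bucket = bisect_left(EDGES, abs(value))
    if bucket == 0:
        return DEFUALT_COLOR
    palette = BLUES if value < 0 else REDS
    return palette[bucket - 1]


print(risk_meter_color_lookup(-0.4))  # "#B2D3E6"
print(risk_meter_color_lookup(0.6))   # "#FC4E2A"
print(risk_meter_color_lookup(-20))   # "#4390BC" (darkest blue, below -1.0)
print(risk_meter_color_lookup(30))    # "#B10026" (darkest red, above 1.0)
```

Keeping the edges and colors in lists makes the thresholds easier to audit than ten separate `if` statements, at the cost of a less literal reading.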

get_long_red_short_blue_coloring(value)

Returns a color code based on the provided numerical value, applying a gradient where negative values are blue-shaded and positive values are red-shaded.

This function:
- Maps values below -5 to progressively darker shades of blue (darkest below -45).
- Maps values above 5 to progressively darker shades of orange/red (darkest above 45).
- Returns DEFUALT_COLOR for values from -5 to 5 (inclusive) and for non-numeric values.

Parameters:

Name Type Description Default
value int | float

The numeric value to evaluate.

required

Returns:

Name Type Description
str str

A hex color code representing the mapped color.

Example
get_long_red_short_blue_coloring(-20)  # Output: "#B2D3E6"
get_long_red_short_blue_coloring(30)   # Output: "#FC4E2A"
get_long_red_short_blue_coloring(0)    # Output: DEFUALT_COLOR
Source code in physical_operations_utils/pandas_utils.py
def get_long_red_short_blue_coloring(value) -> str:
    """
    Returns a color code based on the provided numerical value, applying
    a gradient where negative values are blue-shaded and positive values
    are red-shaded.

    This function:
    - Maps values **below -5** to progressively darker shades of **blue** (darkest below -45).
    - Maps values **above 5** to progressively darker shades of **orange/red** (darkest above 45).
    - Returns `DEFUALT_COLOR` for values from -5 to 5 (inclusive) and for non-numeric values.

    Parameters:
        value (int | float): The numeric value to evaluate.

    Returns:
        str: A hex color code representing the mapped color.

    Example:
        ```python
        get_long_red_short_blue_coloring(-20)  # Output: "#B2D3E6"
        get_long_red_short_blue_coloring(30)   # Output: "#FC4E2A"
        get_long_red_short_blue_coloring(0)    # Output: DEFUALT_COLOR
        ```
    """
    if isinstance(value, (int, float)):
        if -15 <= value < -5:
            return "#D8E9F3"
        if -25 <= value < -15:
            return "#B2D3E6"
        if -35 <= value < -25:
            return "#8DBDD8"
        if -45 <= value < -35:
            return "#68A7CA"
        if value < -45:
            return "#4390BC"
        if 5 < value <= 15:
            return "#FEB24C"
        if 15 < value <= 25:
            return "#FD8D3C"
        if 25 < value <= 35:
            return "#FC4E2A"
        if 35 < value <= 45:
            return "#E31A1C"
        if value > 45:
            return "#B10026"
    return DEFUALT_COLOR
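
Because the bounded buckets here all have a width of 10 and start at a magnitude of 5, the bucket index can also be computed arithmetically instead of through chained comparisons. The sketch below illustrates that idea. The name `long_short_color_arithmetic` and the `"#FFFFFF"` value for `DEFUALT_COLOR` are assumptions for illustration, and for floats within rounding distance of a bucket edge the division could in principle land in a neighboring bucket, so treat it as an illustration rather than a drop-in replacement.

```python
import math

# Assumed placeholder; the real DEFUALT_COLOR is defined elsewhere in pandas_utils.
DEFUALT_COLOR = "#FFFFFF"

# Palettes copied from get_long_red_short_blue_coloring, lightest to darkest.
BLUES = ["#D8E9F3", "#B2D3E6", "#8DBDD8", "#68A7CA", "#4390BC"]
REDS = ["#FEB24C", "#FD8D3C", "#FC4E2A", "#E31A1C", "#B10026"]


def long_short_color_arithmetic(value) -> str:
    """Hypothetical arithmetic equivalent of get_long_red_short_blue_coloring."""
    if not isinstance(value, (int, float)):
        return DEFUALT_COLOR
    magnitude = abs(value)
    if not magnitude > 5:
        # Neutral band [-5, 5]; NaN also ends up here because NaN > 5 is False.
        return DEFUALT_COLOR
    if magnitude > 45:
        bucket = 4  # open-ended darkest bucket
    else:
        # Magnitudes in (5, 15], (15, 25], (25, 35], (35, 45] map to 0..3.
        bucket = math.ceil((magnitude - 5) / 10) - 1
    palette = BLUES if value < 0 else REDS
    return palette[bucket]


print(long_short_color_arithmetic(-20))  # "#B2D3E6"
print(long_short_color_arithmetic(30))   # "#FC4E2A"
print(long_short_color_arithmetic(5))    # placeholder default: 5 is inside the neutral band
```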