---
title: Utils
keywords: fastai
sidebar: home_sidebar
summary: "General utilities. Should probably split up into `utils.time` and `utils.download`"
description: "General utilities. Should probably split up into `utils.time` and `utils.download`"
nb_path: "notebooks/01_utils.ipynb"
---
{% raw %}
{% endraw %} {% raw %}
{% endraw %}

Time format strings

These are the format strings that these utils convert between.

An identifier with xxx_dt_format_xxx in its name denotes a full datetime format, as opposed to a date-only format.
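
As a rough illustration of the naming convention, the sketch below defines four format strings and converts one timestamp between them. The constant names and values are assumptions made for this example; the module's actual constants are not shown on this page.

```python
from datetime import datetime

# Hypothetical names and values, chosen only to illustrate the convention.
nasa_date_format = "%Y-%j"              # date only: year and day of year
nasa_dt_format = "%Y-%jT%H:%M:%S"       # full datetime, day-of-year based
iso_date_format = "%Y-%m-%d"            # date only
iso_dt_format = "%Y-%m-%dT%H:%M:%S"     # full datetime

# Parse with one format string, write out with another.
parsed = datetime.strptime("2010-110T10:12:14", nasa_dt_format)
converted = parsed.strftime(iso_dt_format)
print(converted)  # 2010-04-20T10:12:14
```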

{% raw %}
{% endraw %} {% raw %}

nasa_date_to_datetime[source]

nasa_date_to_datetime(datestr:str)

Parameters:

  • datestr : <class 'str'>

    Date string of the form Y-j

Returns:

  • <class 'datetime.datetime'>
{% endraw %} {% raw %}
{% endraw %} {% raw %}

nasa_date_to_iso[source]

nasa_date_to_iso(datestr:str, with_hours:bool=False)

Convert the day-number based NASA date format to ISO.

Parameters:

  • datestr : <class 'str'>

    Date string of the form Y-j

  • with_hours : <class 'bool'>, optional

    If True, return the full datetime including the time part (i.e. isoformat)

Returns:

  • <class 'str'>

    Datestring in either Y-m-d or ISO-format.

{% endraw %} {% raw %}
{% endraw %} {% raw %}
nasa_date = "2010-110"
iso_date = "2010-4-20"
{% endraw %} {% raw %}
assert nasa_date_to_iso(nasa_date, with_hours=True) == "2010-04-20T00:00:00"
assert nasa_date_to_iso(nasa_date) == "2010-04-20"
{% endraw %} {% raw %}
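
One way such a conversion could work is sketched below, using only the standard library. The function name with the `_sketch` suffix is hypothetical and this is not necessarily how `nasa_date_to_iso` is implemented.

```python
from datetime import datetime


def nasa_date_to_iso_sketch(datestr: str, with_hours: bool = False) -> str:
    """Illustrative stand-in, not the library function itself."""
    # %j parses the day of the year (001..366)
    parsed = datetime.strptime(datestr, "%Y-%j")
    if with_hours:
        # full ISO datetime; the time part is midnight
        return parsed.isoformat()
    # date part only
    return parsed.date().isoformat()
```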

iso_to_nasa_date[source]

iso_to_nasa_date(datestr:str)

Convert iso date to day-number based NASA date.

Parameters:

  • datestr : <class 'str'>

    Date string of the form Y-m-d

Returns:

  • <class 'str'>

    Datestring in NASA standard yyyy-jjj

{% endraw %} {% raw %}
{% endraw %} {% raw %}
assert iso_to_nasa_date(iso_date) == nasa_date
{% endraw %} {% raw %}
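
The assertion above passes even though `iso_date` has no zero-padded month. A minimal sketch of the reverse direction, assuming the parsing is done with `strptime` (which accepts single-digit months), could look like this; the `_sketch` name is hypothetical.

```python
from datetime import datetime


def iso_to_nasa_date_sketch(datestr: str) -> str:
    """Illustrative stand-in, not the library function itself."""
    # strptime's %m and %d accept both "4" and "04"
    parsed = datetime.strptime(datestr, "%Y-%m-%d")
    # %j writes the zero-padded, three-digit day of the year
    return parsed.strftime("%Y-%j")
```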

nasa_datetime_to_iso[source]

nasa_datetime_to_iso(dtimestr:str)

Convert the day-number based NASA datetime format to ISO.

Note: This converts a full dateTIME, whereas nasa_date_to_iso converts a DATE only.

Parameters:

  • dtimestr : <class 'str'>

    Datetime string of the form Y-jTH:M:S

{% endraw %} {% raw %}
{% endraw %} {% raw %}
nasa_datetime = "2010-110T10:12:14"
nasa_datetime_with_ms = nasa_datetime + ".123000"
iso_datetime = "2010-04-20T10:12:14"
iso_datetime_with_ms = iso_datetime + ".123000"
{% endraw %} {% raw %}
assert nasa_datetime_to_iso(nasa_datetime) == iso_datetime
assert nasa_datetime_to_iso(nasa_datetime_with_ms) == iso_datetime_with_ms
{% endraw %} {% raw %}
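
The two assertions above cover inputs with and without fractional seconds. The following sketch shows one possible way to handle both cases by picking the format string based on the input; it is an assumption for illustration, and the `_sketch` name is hypothetical.

```python
from datetime import datetime


def nasa_datetime_to_iso_sketch(dtimestr: str) -> str:
    """Illustrative stand-in, not the library function itself."""
    # Use the fractional-seconds format only when a "." is present.
    if "." in dtimestr:
        fmt = "%Y-%jT%H:%M:%S.%f"
    else:
        fmt = "%Y-%jT%H:%M:%S"
    # isoformat() appends microseconds only when they are non-zero
    return datetime.strptime(dtimestr, fmt).isoformat()
```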

iso_to_nasa_datetime[source]

iso_to_nasa_datetime(dtimestr:str)

Convert iso datetime to day-number based NASA datetime.

Parameters:

  • dtimestr : <class 'str'>

    Datetime string of the form yyyy-mm-ddTHH:MM:SS

{% endraw %} {% raw %}
{% endraw %} {% raw %}
assert iso_to_nasa_datetime(iso_datetime) == nasa_datetime
assert iso_to_nasa_datetime(iso_datetime_with_ms) == nasa_datetime_with_ms
{% endraw %} {% raw %}
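
For the ISO to NASA direction, a minimal sketch under the assumption that the input is parseable by `datetime.fromisoformat` might look as follows. The `_sketch` name is hypothetical.

```python
from datetime import datetime


def iso_to_nasa_datetime_sketch(dtimestr: str) -> str:
    """Illustrative stand-in, not the library function itself."""
    parsed = datetime.fromisoformat(dtimestr)
    # Keep the fractional part only if the input carried one.
    if parsed.microsecond:
        fmt = "%Y-%jT%H:%M:%S.%f"
    else:
        fmt = "%Y-%jT%H:%M:%S"
    return parsed.strftime(fmt)
```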

replace_all_nasa_times[source]

replace_all_nasa_times(df:DataFrame)

Find all NASA times in dataframe and replace with ISO.

The incoming dataframe is modified in place!

This will be done for all columns with the word TIME in the column name.

Parameters:

  • df : <class 'pandas.core.frame.DataFrame'>

    DataFrame with NASA time columns

{% endraw %} {% raw %}
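
The in-place behavior described above can be sketched as follows. The helper names and the example column names are hypothetical, and the per-value conversion is a simplified assumption rather than the library's own code.

```python
from datetime import datetime

import pandas as pd


def nasa_to_iso(value: str) -> str:
    # simplified converter for day-of-year based datetime strings
    fmt = "%Y-%jT%H:%M:%S.%f" if "." in value else "%Y-%jT%H:%M:%S"
    return datetime.strptime(value, fmt).isoformat()


def replace_nasa_times_sketch(df: pd.DataFrame) -> None:
    """Convert every column with TIME in its name; modifies df in place."""
    for col in df.columns:
        if "TIME" in str(col):
            df[col] = df[col].map(nasa_to_iso)


# Hypothetical example data
df = pd.DataFrame(
    {"START_TIME": ["2010-110T10:12:14"], "PRODUCT_ID": ["abc"]}
)
replace_nasa_times_sketch(df)  # returns None, df itself has changed
```

Because the function returns nothing and mutates its argument, pass a copy (`df.copy()`) if the original values are still needed.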
{% endraw %}

Network utils

{% raw %}

parse_http_date[source]

parse_http_date(text:str)

Parse date string retrieved via urllib.request.

Parameters:

  • text : <class 'str'>

    datestring from urllib.request

Returns:

  • <class 'datetime.datetime'>

    dt.datetime object from given datetime string

{% endraw %} {% raw %}
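
HTTP headers such as `Last-Modified` carry dates like `Wed, 21 Oct 2015 07:28:00 GMT`. A minimal sketch of parsing such a string with the standard library is shown below; whether `parse_http_date` uses this approach is not stated here, and the `_sketch` name is hypothetical.

```python
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_http_date_sketch(text: str) -> datetime:
    """Illustrative stand-in, not the library function itself."""
    # parsedate_to_datetime understands the RFC 2822 style dates
    # used in HTTP headers.
    return parsedate_to_datetime(text)


stamp = parse_http_date_sketch("Wed, 21 Oct 2015 07:28:00 GMT")
```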

get_remote_timestamp[source]

get_remote_timestamp(url:str)

Get the timestamp of a remote file.

Useful for checking if there's an updated file available.

Parameters:

  • url : <class 'str'>

    URL to check timestamp for

Returns:

  • <class 'datetime.datetime'>
{% endraw %} {% raw %}
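
A remote timestamp check of this kind can be sketched with a HEAD request and the `Last-Modified` header. This is an assumption about the general technique, not the library's implementation; the `_sketch` name and the `timeout` parameter are hypothetical.

```python
import urllib.request
from datetime import datetime
from email.utils import parsedate_to_datetime


def get_remote_timestamp_sketch(url: str, timeout: float = 10.0) -> datetime:
    """Illustrative stand-in, not the library function itself."""
    # HEAD asks for the headers only, so the file is not downloaded.
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        last_modified = response.headers["Last-Modified"]
    if last_modified is None:
        raise ValueError("No Last-Modified header available for " + url)
    return parsedate_to_datetime(last_modified)
```

Comparing the returned value with the modification time of a local copy is one way to decide whether a new download is needed.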

url_retrieve[source]

url_retrieve(url:str, outfile:str, chunk_size:int=128)

Improved urlretrieve with progressbar, timeout and chunker.

This downloader has a built-in progress bar based on tqdm. By using the requests package, it improves on standard urllib behavior by adding time-out capability.

I tested different chunk_sizes and most of the time 128 was actually fastest, YMMV.

Inspired by https://stackoverflow.com/a/61575758/680232

Parameters:

  • url : <class 'str'>

    The URL to download

  • outfile : <class 'str'>

    The path where to store the downloaded file.

  • chunk_size : <class 'int'>, optional

    The chunk size for the iter_content call on the requests response. Default: 128

{% endraw %} {% raw %}
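
The combination of streaming, chunking, time-out and progress bar described above could be sketched as below. This is a minimal illustration and not the library's code; the `_sketch` name and the `timeout` parameter are assumptions.

```python
import requests
from tqdm import tqdm


def url_retrieve_sketch(url, outfile, chunk_size=128, timeout=10):
    """Illustrative stand-in, not the library function itself."""
    # stream=True avoids loading the whole body into memory at once
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        # content-length may be missing; tqdm then shows no total
        total = int(response.headers.get("content-length", 0)) or None
        with open(outfile, "wb") as fh, tqdm(
            total=total, unit="B", unit_scale=True, desc=str(outfile)
        ) as progress:
            for chunk in response.iter_content(chunk_size=chunk_size):
                fh.write(chunk)
                progress.update(len(chunk))
```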

have_internet[source]

have_internet()

Fastest way to check for active internet connection.

From https://stackoverflow.com/a/29854274/680232

{% endraw %} {% raw %}
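
A connectivity check in the spirit of the linked answer can be sketched with `http.client`: a HEAD request with a short time-out either succeeds or raises. The host, the `timeout` default and the `_sketch` name are assumptions for this example.

```python
import http.client


def have_internet_sketch(host: str = "8.8.8.8", timeout: float = 5.0) -> bool:
    """Illustrative stand-in, not the library function itself."""
    conn = http.client.HTTPSConnection(host, timeout=timeout)
    try:
        # HEAD keeps the exchange small; only reachability matters
        conn.request("HEAD", "/")
        return True
    except (OSError, http.client.HTTPException):
        # covers DNS failures, refused connections, time-outs, SSL errors
        return False
    finally:
        conn.close()
```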
{% endraw %}

Image processing helpers

{% raw %}

height_from_shadow[source]

height_from_shadow(shadow_in_pixels:float, sun_elev:float)

Calculate height of an object from its shadow length.

Note that your image might have been binned; in that case, correct shadow_in_pixels accordingly.

Parameters:

  • shadow_in_pixels : <class 'float'>

    Measured length of shadow in pixels

  • sun_elev : <class 'float'>

    Angle of sun over horizon in degrees

Returns:

  • <class 'float'>

    Height [meter]

{% endraw %} {% raw %}
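
The underlying geometry is a right triangle: for a sun elevation angle above the horizon, height equals shadow length times the tangent of that angle. The sketch below assumes flat terrain and returns the height in the same units as the shadow length; the `_sketch` name is hypothetical and any scaling to meters is left to the caller.

```python
import math


def height_from_shadow_sketch(shadow_length: float, sun_elev: float) -> float:
    """Height in the units of shadow_length; sun_elev is in degrees."""
    return shadow_length * math.tan(math.radians(sun_elev))


# A shadow 10 units long under a sun 45 degrees above the horizon
# belongs to an object about 10 units high.
height = height_from_shadow_sketch(10, 45)
```

For an image binned by a factor of 2, a shadow measured as 10 binned pixels corresponds to 20 unbinned pixels, so the shadow length would be doubled before the call.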

get_gdal_center_coords[source]

get_gdal_center_coords(imgpath:Union[str, Path])

Get center rows/cols pixel coordinate for GDAL-readable dataset.

Run gdalinfo --formats on the command line to see all formats that GDAL can open.

Parameters:

  • imgpath : typing.Union[str, pathlib.Path]

    Path to raster image that is readable by GDAL

Returns:

  • typing.Tuple[int, int]

    center row/col coordinates.

{% endraw %} {% raw %}
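
A minimal sketch of how center coordinates could be obtained with the GDAL Python bindings follows. The use of integer division and both function names are assumptions for illustration; the library may round differently.

```python
from pathlib import Path
from typing import Tuple, Union


def center_pixel(n_rows: int, n_cols: int) -> Tuple[int, int]:
    # integer division picks the middle row and column
    return n_rows // 2, n_cols // 2


def gdal_center_coords_sketch(imgpath: Union[str, Path]) -> Tuple[int, int]:
    """Illustrative stand-in, not the library function itself."""
    # Imported lazily; requires the GDAL Python bindings.
    from osgeo import gdal

    dataset = gdal.Open(str(imgpath))
    if dataset is None:
        raise FileNotFoundError(f"GDAL could not open {imgpath}")
    # RasterYSize is the number of rows, RasterXSize the number of columns
    return center_pixel(dataset.RasterYSize, dataset.RasterXSize)
```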

file_variations[source]

file_variations(filename:Union[str, Path], extensions:list)

Create a variation of file names.

Generate a list of variations on a filename by replacing the extension with the provided list.

Adapted from T. Olsen's `file_variations` of the pysis module to use pathlib.

Parameters:

  • filename : typing.Union[str, pathlib.Path]

    The original filename to use as a base.

  • extensions : <class 'list'>

Returns:

  • <class 'list'>

    list of Paths

{% endraw %} {% raw %}
{% endraw %} {% raw %}
fname = "abc.txt"
{% endraw %} {% raw %}
extensions = ".cub .cal.cub .map.cal.cub".split()
{% endraw %} {% raw %}
file_variations(fname, extensions)
[Path('abc.cub'), Path('abc.cal.cub'), Path('abc.map.cal.cub')]
{% endraw %} {% raw %}
assert len(extensions) == len(file_variations(fname, extensions))
{% endraw %}
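
With pathlib, this kind of extension swapping can be sketched in one line per variation using `Path.with_suffix`. The `_sketch` name is hypothetical and this is not necessarily the library's implementation.

```python
from pathlib import Path


def file_variations_sketch(filename, extensions):
    """Illustrative stand-in, not the library function itself."""
    # with_suffix replaces the last suffix of the path with the new one
    return [Path(filename).with_suffix(ext) for ext in extensions]


variations = file_variations_sketch(
    "abc.txt", [".cub", ".cal.cub", ".map.cal.cub"]
)
```

Note that `with_suffix` only replaces the final suffix, so a base name like `abc.cal.cub` would keep its `.cal` part.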