| |
- cluster_for_separation(datasets, UTM=True, method='HDBSCAN', epsilon=10, min_cluster_size=2)
- Cluster the data for separation metrics, using the density-based clustering method of choice (DBSCAN or HDBSCAN)
Args:
datasets (list): list of movement period DataFrames
UTM (bool, optional): True if using UTM data, False if using GPS data. Defaults to True.
method (str, 'DBSCAN' or 'HDBSCAN', optional): clustering method. Defaults to 'HDBSCAN'.
epsilon (float, optional): epsilon for the clustering method, if applicable. This is the threshold distance for clusters to be separated, which prevents micro-clustering. Defaults to 10.
min_cluster_size (int, optional): minimum number of points to be considered a cluster. Clusters with fewer points than this are treated as outliers. Defaults to 2.
Returns:
all_membership_probs (list): list of DataFrames with cluster membership probabilities for each soldier, if method = 'HDBSCAN'
all_labels (list): list of DataFrames with cluster labels for each soldier at each timepoint
all_scores (list): list of Series of silhouette scores for each timepoint if multiple clusters are present, otherwise NaN
- get_outlier_time(all_labels)
- Get the amount of time each soldier is an outlier, based on the label outputs of cluster_for_separation
Args:
all_labels (list of DataFrames): list of clustering label dataframes
Returns:
outlier_times (list of Series): amount of time each soldier is considered an outlier (label = -1) for each movement period dataframe
- make_cluster_gifs(prepped_clust_dfs)
- Make GIFs from datasets that have been prepped for plotting
Args:
prepped_clust_dfs (list of DataFrames): list of dataframes that have been prepped for plotting (long form for seaborn)
Returns:
None: None
- prep_cluster_df(datasets, all_labels, change_units=True, decimate=0)
- Prepare DataFrames for plotting
Args:
datasets (list): list of movement period DataFrames
all_labels (list of DataFrames): list of clustering label dataframes
change_units (bool, optional): True if units should be changed (change if Long/Lat). Defaults to True.
decimate (int, optional): Decimation factor for the signal. Defaults to 0.
Returns:
prepped_clust_dfs (list of DataFrames): list of long-form DataFrames for seaborn plots (columns=['longitude','latitude','ID','time'])
|
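
To illustrate the idea behind get_outlier_time, here is a minimal sketch rather than the package's actual implementation. It assumes each label DataFrame has one row per timepoint and one column per soldier, that -1 marks an outlier (as in the description above), and that samples are evenly spaced. The function name `outlier_time_sketch` and the `sample_period_s` parameter are hypothetical and do not come from the package.

```python
import pandas as pd


def outlier_time_sketch(all_labels, sample_period_s=1.0):
    """Return, per label DataFrame, a Series of seconds each soldier spent as an outlier.

    Assumes rows are timepoints, columns are soldiers, and -1 is the outlier label.
    """
    outlier_times = []
    for labels in all_labels:
        # Count the timepoints where each soldier's label is -1,
        # then scale by the (assumed constant) sampling period.
        counts = (labels == -1).sum(axis=0)
        outlier_times.append(counts * sample_period_s)
    return outlier_times


# Toy labels for one movement period: 4 timepoints, 3 soldiers.
labels = pd.DataFrame(
    {"A": [0, 0, -1, -1], "B": [0, 0, 0, 0], "C": [-1, 1, 1, 1]}
)
times = outlier_time_sketch([labels])
print(times[0])
```

With real data, the input would be the all_labels list returned by cluster_for_separation, and the sampling period would need to match the actual time step of the movement period DataFrames.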