analytics_packages package

Submodules

analytics_packages.custom_pandas module

Note: these helpers are all very outdated; I wrote them several years ago. There are probably much better ways to achieve what you need. -James

analytics_packages.custom_pandas.add_to_bottom(df, column_names, values)[source]

Appends rows to the bottom of the dataframe, using ‘column_names’ for the columns and the 2D list ‘values’ for the row data. To append one dataframe to another, see append_to_df_with_df.
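The source of add_to_bottom is not shown here, so the following is only a minimal sketch of the behavior described above, written with plain pandas. The helper name add_rows_to_bottom and the sample data are hypothetical.

```python
import pandas as pd


def add_rows_to_bottom(df, column_names, values):
    """Hypothetical stand-in: append the rows in the 2D list `values` to `df`."""
    # Build a small dataframe from the 2D list, one inner list per new row.
    new_rows = pd.DataFrame(values, columns=column_names)
    # ignore_index=True renumbers the combined rows 0..n-1.
    return pd.concat([df, new_rows], ignore_index=True)


df = pd.DataFrame({"name": ["James"], "age": [21]})
result = add_rows_to_bottom(df, ["name", "age"], [["Michael", 25], ["Ana", 30]])
```

pd.concat is used because DataFrame.append was removed in pandas 2.0.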

analytics_packages.custom_pandas.and_gate_many_cols(df, columns, col_name)[source]
analytics_packages.custom_pandas.append_to_df_with_df(og_df, new_df, reset_ind=True)[source]

Appends ‘new_df’ to ‘og_df’ and returns the resulting dataframe

analytics_packages.custom_pandas.boxplot(df, column_with_values, group_by=None)[source]

Shows a boxplot of the values found in ‘column_with_values’, optionally grouped by another column

analytics_packages.custom_pandas.ceiling_filter(df, max_val, column)[source]

Returns df keeping only the rows whose value in ‘column’ is below ‘max_val’

analytics_packages.custom_pandas.check_column_in_list(df, column, list, new_column)[source]

Returns a dataframe with a boolean value in ‘new_column’ indicating whether each row's value in ‘column’ is one of the values in ‘list’

analytics_packages.custom_pandas.df_change_row_ind_col_value(df, index, column, new_val)[source]
analytics_packages.custom_pandas.df_datetime_to_time_cols(df, datetime_col, extra=False)[source]
analytics_packages.custom_pandas.df_replace(df, value_to_change, to_fill)[source]
analytics_packages.custom_pandas.dict_from_two_columns(df, key_col, val_col)[source]
analytics_packages.custom_pandas.dict_to_df(dictionary)[source]

Return a dataframe from ‘dictionary’

analytics_packages.custom_pandas.drop_these_cols(df, cols)[source]

Returns dataframe without columns ‘cols’

analytics_packages.custom_pandas.drop_these_rows(df, rows)[source]

Returns dataframe without ‘rows’ based on index value

analytics_packages.custom_pandas.fill_nans(df, value_to_fill)[source]
analytics_packages.custom_pandas.filter_df_by_dates(df, date_col_dt, lower_datetime, upper_datetime, low_inc=True, up_inc=False)[source]

Takes a pandas dataframe and returns it filtered to the rows whose ‘date_col_dt’ value falls between ‘lower_datetime’ and ‘upper_datetime’
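As a rough illustration, a date-range filter of this kind can be written as a boolean mask. This sketch assumes that low_inc and up_inc control whether each bound is inclusive, which is inferred from the parameter names only. The function name filter_by_dates is hypothetical.

```python
import pandas as pd


def filter_by_dates(df, date_col, lower, upper, low_inc=True, up_inc=False):
    """Hypothetical stand-in: keep rows whose date lies between the two bounds."""
    col = df[date_col]
    # Assumption: the *_inc flags switch between inclusive and exclusive bounds.
    above = col >= lower if low_inc else col > lower
    below = col <= upper if up_inc else col < upper
    return df[above & below]


df = pd.DataFrame(
    {
        "when": pd.to_datetime(["2020-01-01", "2020-01-15", "2020-02-01"]),
        "value": [1, 2, 3],
    }
)
lower, upper = pd.Timestamp("2020-01-01"), pd.Timestamp("2020-02-01")
default_result = filter_by_dates(df, "when", lower, upper)
closed_result = filter_by_dates(df, "when", lower, upper, up_inc=True)
```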

analytics_packages.custom_pandas.floor_filter(df, min_val, column)[source]

Returns df keeping only the rows whose value in ‘column’ is above ‘min_val’
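In current pandas, floor_filter and ceiling_filter both reduce to a one-line comparison. The sketch below assumes strict comparisons ("above" and "below"); whether the original helpers include the boundary value is not stated. The column name and data are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"price": [5, 10, 15, 20]})

# Floor filter: keep rows strictly above the minimum (assumed exclusive).
above_floor = df[df["price"] > 5]

# Ceiling filter: keep rows strictly below the maximum (assumed exclusive).
below_ceiling = df[df["price"] < 20]

# Both at once: combine the two masks with &.
in_band = df[(df["price"] > 5) & (df["price"] < 20)]
```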

analytics_packages.custom_pandas.get_date_and_time()[source]
analytics_packages.custom_pandas.get_df(file_name, **params)[source]

Reads a csv file from the local directory and returns a dataframe

analytics_packages.custom_pandas.get_unique_values(df, col)[source]
analytics_packages.custom_pandas.grab_rows_with_certain_values(df, column, values, return_not_in_values=False)[source]

Returns a version of the dataframe containing only the rows whose value in ‘column’ is found in the list ‘values’
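A plain-pandas sketch of the same idea uses Series.isin. It assumes that return_not_in_values=True returns the complementary rows, which is a guess based on the parameter name. The function name rows_with_values is hypothetical.

```python
import pandas as pd


def rows_with_values(df, column, values, return_not_in_values=False):
    """Hypothetical stand-in: keep rows whose `column` value is in `values`."""
    mask = df[column].isin(values)
    # Assumption: the flag flips the selection to the rows NOT in `values`.
    return df[~mask] if return_not_in_values else df[mask]


df = pd.DataFrame({"topping": ["Pepperoni", "Cheese", "Sausage"]})
meat = rows_with_values(df, "topping", ["Pepperoni", "Sausage"])
not_meat = rows_with_values(df, "topping", ["Pepperoni", "Sausage"], True)
```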

analytics_packages.custom_pandas.histogram(two_dim_list, x_axis_titles, legend_labels, x_label, y_label, graph_title, text_size=11, axesfont=26, titlesize=32, opacity=0.6)[source]

Prints a histogram. ‘two_dim_list’ should contain n lists of numerical values (one list per differently colored series to plot), each l items long. ‘x_axis_titles’ is a list of l strings. ‘legend_labels’ is a list of n strings, one for each list contained in ‘two_dim_list’.

Example:

two_dim_list = [ [.1, .4, .3, .2], [.2, .4, .3, .1], [.2, .3, .3, .2] ]
x_axis_titles = ['Pepperoni', 'Sausage', 'Cheese', 'Vegetable']
legend_labels = ['under 20 years old', '20-50', '50+']
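The inputs described above look like a grouped bar chart: one group per x-axis title and one bar per series. The sketch below shows one way to lay that out with matplotlib. It is not the original implementation, and both function names are hypothetical.

```python
def grouped_bar_positions(n_series, n_categories, group_width=0.8):
    """Return the bar width and, for each series, the x position of its bars."""
    bar_width = group_width / n_series
    positions = []
    for s in range(n_series):
        # Centre the group of bars on the integer category position.
        offset = (s - (n_series - 1) / 2) * bar_width
        positions.append([c + offset for c in range(n_categories)])
    return bar_width, positions


def plot_grouped_bars(two_dim_list, x_axis_titles, legend_labels, opacity=0.6):
    """Draw one semi-transparent bar series per inner list."""
    # Imported here so the layout helper above works without matplotlib.
    import matplotlib.pyplot as plt

    bar_width, positions = grouped_bar_positions(
        len(two_dim_list), len(x_axis_titles)
    )
    fig, ax = plt.subplots()
    for xs, heights, label in zip(positions, two_dim_list, legend_labels):
        ax.bar(xs, heights, width=bar_width, alpha=opacity, label=label)
    ax.set_xticks(list(range(len(x_axis_titles))))
    ax.set_xticklabels(x_axis_titles)
    ax.legend()
    return fig, ax
```

Calling plot_grouped_bars with the three example lists above, followed by matplotlib's plt.show(), should display four groups of three bars.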

analytics_packages.custom_pandas.keep_these_cols(df, cols)[source]

Returns dataframe with columns ‘cols’

analytics_packages.custom_pandas.keep_these_rows(df, rows)[source]

Returns dataframe with index values contained in ‘rows’

analytics_packages.custom_pandas.map_df_col_to_new_id(df, col, new_col_name, df2, id_col, map_col)[source]
analytics_packages.custom_pandas.map_df_column_to_dict(df, col, dict, new_col)[source]
analytics_packages.custom_pandas.move_last_column_to_first(df)[source]
analytics_packages.custom_pandas.multiple_filters(df, columns, two_dim_list)[source]

Example:

columns = ['Age', 'Name']
two_dim_list = [ [21, 25], ['James', 'Michael'] ]

This function sends back the df restricted to rows with Age values of 21 or 25 and Name values of James or Michael.
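The example above can be reproduced with one isin mask per column. This is a minimal sketch that assumes the per-column filters are combined with a logical AND; the name filter_many and the extra rows are hypothetical.

```python
import pandas as pd


def filter_many(df, columns, two_dim_list):
    """Hypothetical stand-in: apply one allowed-values list per column."""
    mask = pd.Series(True, index=df.index)
    for column, allowed in zip(columns, two_dim_list):
        # Assumption: a row must pass every column's filter to be kept.
        mask &= df[column].isin(allowed)
    return df[mask]


df = pd.DataFrame(
    {
        "Age": [21, 25, 30, 21],
        "Name": ["James", "Michael", "James", "Sara"],
    }
)
result = filter_many(df, ["Age", "Name"], [[21, 25], ["James", "Michael"]])
```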

analytics_packages.custom_pandas.new_df_with_value_in_col(df, col, val, opposite=False)[source]
analytics_packages.custom_pandas.prep_datetime(df, time_col, dt_col, format='%Y-%m-%d %H:%M:%S')[source]
analytics_packages.custom_pandas.rename_cols(df, existing, new)[source]

Renames the columns listed in ‘existing’ to the corresponding names in ‘new’

analytics_packages.custom_pandas.replace_in_df(dataframe, column, to_find, to_replace)[source]

Replaces every instance of ‘to_find’ (a string or list) found in ‘column’ of the dataframe with ‘to_replace’ (a string or list). Returns the dataframe.
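In current pandas, Series.replace accepts either a single value or two lists of equal length, which covers both cases described above. A short sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"city": ["NYC", "LA", "NYC"]})

# String to string: every "NYC" in the column becomes "New York".
single = df.assign(city=df["city"].replace("NYC", "New York"))

# List to list: values are paired up by position.
paired = df.assign(
    city=df["city"].replace(["NYC", "LA"], ["New York", "Los Angeles"])
)
```

DataFrame.assign returns a new dataframe, so df itself is left unchanged here. Whether replace_in_df modifies its input in place is not stated.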

analytics_packages.custom_pandas.sort_df(df, columns, ascend=True, na_pos='last')[source]
analytics_packages.custom_pandas.split_by_time_filter(df, how='hours', new_poss_values=[])[source]

Returns dfs which have been split based on hours/days/months etc. Refer to params.py -> time_splits for reference.

Example:

df1              df2
customer  hour   customer  hour
0         0      0         1
1         0      1         1
2         0      2         1
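The df1/df2 example can be reproduced with a groupby on the time column. This sketch assumes an integer ‘hour’ column already exists. The contents of params.py and its time_splits setting are not shown in this page, so they are not modeled here.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "customer": [0, 1, 2, 0, 1, 2],
        "hour": [0, 0, 0, 1, 1, 1],
    }
)

# One dataframe per distinct hour, in ascending hour order.
pieces = [group.reset_index(drop=True) for _, group in df.groupby("hour")]
df1, df2 = pieces
```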

analytics_packages.custom_pandas.split_df_into_equal_time(df, time_chunk, datetime_col, format='seconds')[source]

Returns dfs produced by splitting the dataframe into separate dfs by a time separator. Example: starting from time 0, separate into chunks of 3 weeks at a time.

ISOTIME
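One way to sketch the three-week example is to number each row by how many whole chunks have elapsed since the earliest timestamp, then group on that number. This assumes "time 0" means the earliest value in the datetime column and takes the chunk as a pandas Timedelta rather than the time_chunk/format pair of the original signature. The function name is hypothetical.

```python
import pandas as pd


def split_into_time_chunks(df, datetime_col, chunk):
    """Hypothetical stand-in: split `df` into consecutive chunks of length `chunk`."""
    # Assumption: time 0 is the earliest timestamp in the column.
    start = df[datetime_col].min()
    # Integer chunk number for every row: 0 for the first chunk, 1 for the next, ...
    chunk_number = ((df[datetime_col] - start) // chunk).rename("chunk")
    return [group for _, group in df.groupby(chunk_number)]


df = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2021-01-01", "2021-01-10", "2021-01-25", "2021-02-20"]
        ),
        "value": [1, 2, 3, 4],
    }
)
chunks = split_into_time_chunks(df, "ts", pd.Timedelta(weeks=3))
```

Chunks that contain no rows are simply absent from the returned list.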

analytics_packages.custom_pandas.value_counts_df(original_dataframe, column=None)[source]

Outputs a dataframe with the counts of the unique values for each column of the input dataframe. In the output, each column of unique values is paired with another column holding the counts for those values.
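The paired-column layout described above can be approximated by running value_counts on each column and placing the results side by side. This is only a sketch: the "_count" suffix and the name value_counts_table are assumptions, and the ‘column’ argument of the original is not modeled.

```python
import pandas as pd


def value_counts_table(df):
    """Hypothetical stand-in: unique values and their counts, column by column."""
    parts = []
    for column in df.columns:
        counts = df[column].value_counts()
        parts.append(
            pd.DataFrame(
                {
                    column: list(counts.index),           # the unique values
                    f"{column}_count": list(counts),      # how often each occurs
                }
            )
        )
    # Columns with fewer unique values are padded with NaN at the bottom.
    return pd.concat(parts, axis=1)


df = pd.DataFrame({"color": ["red", "blue", "red"], "size": ["S", "S", "S"]})
table = value_counts_table(df)
```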

analytics_packages.custom_pandas.value_counts_in_df(df, col)[source]

analytics_packages.custom_xlwings module

Note: these helpers are all very outdated; I wrote them several years ago. There are probably much better ways to achieve what you need. -James

analytics_packages.custom_xlwings.add_sheet(sheet_name, work_book)[source]

Adds a sheet to the workbook

analytics_packages.custom_xlwings.alpha_from_column_names(df, strings)[source]
analytics_packages.custom_xlwings.alpha_from_index(integer)[source]

Takes a 0-based index (integer) and returns the corresponding column header
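The index-to-header conversion is a bijective base-26 encoding (A to Z, then AA, AB, and so on). The sketch below shows the usual way to compute it; the function names are hypothetical, and treating the reverse direction as 0-based is an assumption about column_index_from_alphas.

```python
def column_letters(index):
    """Convert a 0-based column index to an Excel-style header (0 -> 'A')."""
    if index < 0:
        raise ValueError("index must be non-negative")
    letters = ""
    n = index + 1  # work in 1-based terms, where 1 is 'A' and 26 is 'Z'
    while n > 0:
        n, remainder = divmod(n - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def column_index(letters):
    """Convert an Excel-style header back to a 0-based index ('A' -> 0)."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n - 1
```

For example, column_letters(0) gives 'A', column_letters(25) gives 'Z', and column_letters(26) gives 'AA'.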

analytics_packages.custom_xlwings.alphas_from_index_list(ints)[source]
analytics_packages.custom_xlwings.change_cell_color(ws, top_left_cell, cell_color, bottom_right_cell=None)[source]

Changes a range of cells to a certain color

analytics_packages.custom_xlwings.clear_all(ws)[source]
analytics_packages.custom_xlwings.column_index_from_alphas(string)[source]
analytics_packages.custom_xlwings.combine_string_columns(df, col1, col2, new_column)[source]

Returns df with a new column containing the combined string of col1 and col2
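With plain pandas, combining two string columns is a single expression. This sketch joins the values directly with no separator, since the original does not say whether one is inserted. The column names are made up.

```python
import pandas as pd

df = pd.DataFrame({"first": ["James", "Ana"], "last": ["Smith", "Lee"]})

# astype(str) guards against non-string values such as numbers.
combined = df.assign(full=df["first"].astype(str) + df["last"].astype(str))
```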

analytics_packages.custom_xlwings.delete_sheet(ws)[source]
analytics_packages.custom_xlwings.df_from_rows(rows)[source]
analytics_packages.custom_xlwings.full_range(ws)[source]
analytics_packages.custom_xlwings.get_book_sheets(wb)[source]
analytics_packages.custom_xlwings.get_book_sheets_names(wb)[source]
analytics_packages.custom_xlwings.get_column(ws, col_index, nested=True)[source]

Gets a column from the worksheet

analytics_packages.custom_xlwings.get_column_headers_from_alpha(df, list_of_alphas)[source]
analytics_packages.custom_xlwings.get_df_from_ws(ws)[source]
analytics_packages.custom_xlwings.get_rows(ws, top_left_cell=(1, 1), bottom_right=None)[source]
analytics_packages.custom_xlwings.get_wb(book_name)[source]
analytics_packages.custom_xlwings.get_ws(wb, sheet='Sheet1')[source]
analytics_packages.custom_xlwings.keep_these_rows(df, locs)[source]
analytics_packages.custom_xlwings.remove_slash_from_ws_name(string, replace=True, char='-')[source]
analytics_packages.custom_xlwings.row_to_col(row)[source]
analytics_packages.custom_xlwings.sort_ws(ws, column_alphas)[source]

Takes the active worksheet and a list of column alphas and sorts the worksheet

analytics_packages.custom_xlwings.write_2d(ws, rows, top_left_cell=(1, 1))[source]
analytics_packages.custom_xlwings.write_df_col_to_ws(ws, df, col_index, col_name)[source]

Writes a df column to a certain column number in the worksheet

analytics_packages.custom_xlwings.write_df_to_ws(ws, df)[source]

Module contents