# default_exp tidyaddr

Notes on Security:

  • Colab runs in a free, secure, and private VM hosted by Google (a tiered service).
  • All data in the VM is lost when the VM shuts down, which happens a minute or so after the user leaves the Colab.

1. Upload a CSV file

  • Execute the next cell-block to run tidyAddr.
  • You will be prompted to upload a CSV file. It must contain exactly one column, labeled 'address'.

print(' ~~~~~~~~~~~ UPLOAD A FILE PLEASE ~~~~~~~~~~~')
import os
from google.colab import files
uploaded = files.upload()

uploadedFilename = "AFile_2020_addronly.csv" # d = list( uploaded.keys() )[0]

#export
# Input: A CSV with a single column titled 'address' (may require pre-processing to get it like this)
# Output: A CSV with a single column containing text that can be split into 5 columns in xl.
# - Needs to be re-merged with any other columns removed before tidy addring.
import os
def runTidyAddr( filename, newfilename ):
    print( filename, newfilename )
    print(' ~~~~~~~~~~~ Getting TIDYADDR ~~~~~~~~~~~ ')
    ! rm tidyaddr-js -r
    ! git clone https://github.com/BNIA/tidyaddr-js.git
    print(' ~~~~~~~~~~~ Installing TIDYADDR ~~~~~~~~~~~')
    ! echo ORIGINAL_DIRECTORY && echo $(ls)
    ! cp *.csv tidyaddr-js/ && cd tidyaddr-js && npm install
    print(' ~~~~~~~~~~~ Running TIDYADDR ~~~~~~~~~~~')
    ! echo TIDYADDR_DIRECTORY && cd tidyaddr-js && echo $(ls)
    txt = "cd tidyaddr-js && node tidyaddr.js clean-csv " + filename + " " + newfilename
    os.system(txt)

runTidyAddr( uploadedFilename, './'+uploadedFilename.replace(".", "_tidyaddred.") )

  1. The output will be a single text column. It can be split on ";" in Excel to break it into about 5 columns of data.
  2. This output (once split into columns) will need to be re-appended to the original dataset to produce a complete final version.
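
The two steps above can also be done in Python instead of Excel. The following is only a rough sketch: the ";" separator comes from the note above, but the function names, the sample rows, and the assumption that the tidied output keeps the same row order as the input (one output value per input row) are mine and are not confirmed by tidyaddr-js.

```python
def split_tidied(value, sep=";"):
    """Split one tidied output string into its component fields."""
    return [part.strip() for part in value.split(sep)]


def reappend(original_rows, tidied_values, sep=";"):
    """Append the split fields of each tidied value to the matching original row.

    ASSUMPTION: tidied_values is in the same order as original_rows,
    with exactly one tidied value per original row.
    """
    if len(original_rows) != len(tidied_values):
        raise ValueError("row counts differ; cannot align rows by position")
    return [
        list(row) + split_tidied(value, sep)
        for row, value in zip(original_rows, tidied_values)
    ]


# Hypothetical sample data for illustration only, not real tidyaddr output.
original = [["1", "100 n charles st"], ["2", "22 e. north ave"]]
tidied = ["100;N;CHARLES;ST;", "22;E;NORTH;AVE;"]

merged = reappend(original, tidied)
for row in merged:
    print(row)
```

Aligning by position is the simplest option, but it silently produces wrong results if either file was sorted or filtered in between, which is why the sketch at least refuses to merge when the row counts differ.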
ls

Debugging

The following section is Python code. It's not part of tidyAddr, but I use it to quickly inspect the dataframe in case there is any problem.

import pandas as pd
pd.read_csv('bnia_snap_for_2019.csv')

df = pd.read_csv('bnia_tanf_2019_20208_by_zipcodes.csv')
df.head(1)

ls
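
One problem worth ruling out early is an input file that does not match the single 'address' column requirement. The check below is a minimal sketch using only the standard library; `check_address_csv` and the sample strings are hypothetical names and data introduced for illustration, and are not part of tidyAddr.

```python
import csv
import io


def check_address_csv(text):
    """Return a list of problems found in CSV text; an empty list means none found."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    problems = []
    header = rows[0]
    # The header must be exactly one column named 'address'.
    if header != ["address"]:
        problems.append(f"header is {header!r}, expected ['address']")

    # Every data row should have exactly one field.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != 1:
            problems.append(f"row {row_no} has {len(row)} columns, expected 1")
    return problems


# Hypothetical sample inputs for illustration only.
good = 'address\n"100 N Charles St, Baltimore"\n'
bad = "Address,zip\n100 N Charles St,21201\n"

print(check_address_csv(good))
print(check_address_csv(bad))
```

To check a file on disk, read it first and pass the contents in, for example `check_address_csv(open(uploadedFilename, newline="").read())`. Note that an address containing a comma must be quoted in the CSV, otherwise it is read as two columns.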
