Commit c533682

Merge pull request #84 from MetropolitanTransportationCommission/master
Bring UDST up to date with MTC branch plus code debt payment
2 parents 71317d1 + 1d12275 commit c533682

108 files changed

+25759
-15360
lines changed

.travis.yml

+30
@@ -0,0 +1,30 @@
+language: python
+sudo: false
+python:
+- '2.7'
+install:
+- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
+  wget http://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
+  else
+  wget http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
+  fi
+- bash miniconda.sh -b -p $HOME/miniconda
+- export PATH="$HOME/miniconda/bin:$PATH"
+- hash -r
+- conda config --set always_yes yes --set changeps1 no
+- conda update -q conda
+# Useful for debugging any issues with conda
+- conda info -a
+# don't think we need all these packages, but copying from urbansim
+- >
+  conda create -q -c synthicity -n test-environment
+  python=$TRAVIS_PYTHON_VERSION
+  cytoolz ipython-notebook jinja2 matplotlib numpy pandas patsy pip scipy
+  statsmodels pytables pytest pyyaml pandana
+- source activate test-environment
+- pip install pep8
+- pip install -r requirements.txt
+script:
+- pep8 baus
+- pep8 scripts
+- py.test baus
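
The `script:` section of this file comes down to three commands. A minimal Python sketch for running the same checks locally, assuming `pep8` and `py.test` are installed and on the PATH. `CI_STEPS` and `run_ci_steps` are hypothetical names used for illustration and are not part of the repository:

```python
import subprocess

# Commands copied from the `script:` section of .travis.yml.
CI_STEPS = [
    ["pep8", "baus"],
    ["pep8", "scripts"],
    ["py.test", "baus"],
]


def run_ci_steps(steps=CI_STEPS, runner=subprocess.call):
    """Run each step in order and return the steps that exited non-zero.

    `runner` is injectable so the function can be exercised without the
    tools installed; by default it shells out via subprocess.call.
    """
    failures = []
    for argv in steps:
        if runner(argv) != 0:
            failures.append(argv)
    return failures
```

Calling `run_ci_steps()` from the repository root returns an empty list when all three checks pass.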

Makefile

-55
This file was deleted.

README.md

+37-111
@@ -1,126 +1,52 @@
-DRAFT Bay Area Urbansim Implementation
+DRAFT Bay Area UrbanSim (BAUS) Implementation
 =======
 
-This is the DRAFT UrbanSim implementation for the Bay Area. Documenation for the Bay Area model is available at http://metropolitantransportationcommission.github.io/baus_docs/ and documentation for the generic UrbanSim model is at https://udst.github.io/urbansim/index.html
+[![Build Status](https://travis-ci.org/MetropolitanTransportationCommission/bayarea_urbansim.svg?branch=master)](https://travis-ci.org/MetropolitanTransportationCommission/bayarea_urbansim)
 
-###Install Overview
-* https://mtcdrive.account.box.com/login
-* get anaconda (version as indicated in reqs below)
-* bash Anaconda2-4.0.0-Linux-x86_64.sh
-* yes to prepend install location to .bashrc
-* open new terminal
-* sudo apt-get update
-* sudo apt-get -y install git g++ python-dev unzip
-* git clone https://github.com/MetropolitanTransportationCommission/bayarea_urbansim.git
-* pip install -r requirements.txt (comment out pandana)
-* pip install pandana
-* get data
-* change RUNNUM so in 5000s etc
-* python run.py -s 4 & OR python all.py &
+This is the DRAFT UrbanSim implementation for the Bay Area. Policy documentation for the Bay Area model is available [here](http://data.mtc.ca.gov/bayarea_urbansim/) and documentation for the UrbanSim framework is available [here](https://udst.github.io/urbansim/).
 
-###Data
+### Install Overview
 
-We track the data for this project in the Makefile in this repository. The makefile will generally be the most up to date list of which data is needed, where it goes in the directory, etc.
+* Install Python for your OS ([Anaconda](https://www.continuum.io/downloads) highly suggested)
+* Clone this repository
+* Install dependencies using `pip install -r requirements.txt`
+* Get data using `python run.py -c --mode fetch-data` (you will need an appropriately configured AWS credentials file which you must get from your MTC contact)
+* Preprocess data using `python run.py -c --mode preprocessing`
+* Run a simulation using `python run.py -c` (default mode is simulation)
 
-To fetch data with [AWS CLI](https://aws.amazon.com/cli/) and Make, you can:
-`make data`.
+### An overview of run.py
+
+Run.py is a command line interface (cli) used to run Bay Area UrbanSim in various modes. These modes currently include:
 
-Below we provide a list to links of the data in the Makefile for convenience, but in general the makefile is what is being used to run simulations. If you find that something below is out of date w/r/t the makefile, please feel free to update it and submit a pull request.
+* estimation, which runs a series of models to save parameter estimates for all statistical models
+* simulation, which runs all models to create a simulated regional growth forecast
+* fetch_data, which downloads large data files from Amazon S3 as inputs for BAUS
+* preprocessing, which performas long-running data cleaning steps and writes newly cleaned data back to the binary h5 file for use in the other steps
+* baseyearsim which runs a "base year simulation" which summarizes the data before the simulation runs (during simulation, summaries are written after each year, so the first year's summaries are *after* the base year is finished - a base year simulation writes the summaries before any models have run)
 
-####Data necessary for run.py to run
+### Outputs from Simulation (written to the runs directory)
 
-These data should be in the data/ folder:
+ALL OUTPUT IN THIS DIRECTORY IS NOT OFFICIAL OUTPUT. PLEASE CONTACT MTC FOR OFFICIAL OUTPUTS OF THE LAST PLAN BAY AREA.
 
-https://s3.amazonaws.com/bayarea_urbansim/data/2015_06_01_osm_bayarea4326.h5
-https://s3.amazonaws.com/bayarea_urbansim/data/2015_08_03_tmnet.h5
-https://s3.amazonaws.com/bayarea_urbansim/data/2015_12_21_zoning_parcels.csv
-https://s3.amazonaws.com/bayarea_urbansim/data/02_01_2016_parcels_geography.csv
-https://s3.amazonaws.com/bayarea_urbansim/data/2015_08_29_costar.csv
-https://s3.amazonaws.com/bayarea_urbansim/data/2015_09_01_bayarea_v3.h5
-
-Because the hdf5 file used here contains one table with proprietary data, you will need to enter credentials to download it. You can request them from Tom Buckley([email protected]). Or if you already have access to Box, you can download the hdf5 file from there.
-
-####Data Description
-
-
-How To
-------
-####Set Up Simulation and Estimation
-Install dependencies using standard [pip](https://pip.pypa.io/en/latest/user_guide.html#requirements-files) requirements install:
-`pip install -r requirements.txt`
-You may also need to install pandana
-`pip install pandana`
-
-####Set up using a Virtual Machine
-For convenience, there is a [Vagrantfile](https://www.vagrantup.com/) and a `scripts/vagrant/bootstrap.sh` file. This is the recommended way to set up and run `Simulation.py` on Windows.
-
-####Enter Amazon Web Services credentials to fetch data.
-
-See [Installing](http://docs.aws.amazon.com/cli/latest/userguide/installing.html) and [configuring] (http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html)
-
-Each of the following just runs a different set of models for a different set of years.
-
-####Run a Simulation
-In the repository directory type `python run.py`
-
-####Estimate Regressions used in the Simulation
-In the repository directory edit `run.py` and set `MODE` to "estimation" and type `python run.py`
-
-####Run a Base Year Simulation
-In the repository directory edit `run.py` and set `MODE` to "baseyearsim" and type `python run.py`. A base year simulation is used to run a few models and make sure everything matches the first year of the control totals but not to add any new buildings. This is then used in comparison of the year 2040 to the base year for all future simulations (until the control totals change) and this mode is rerun.
-
-####Review Outputs from Simulation
-
-#####Runs Directory
-
-ALL OUTPUT IN THIS DIRECTORY IS CONSIDERED DRAFT. PLEASE CONTACT MTC FOR OFFICIAL FINAL OUTPUTS.
-
-`#` = a number that is updated in the RUNNUM file in the bayarea_urbansim directory each time you run Simulation.py.
+`[num]` = a positive integer used to identify each successive run. This number usually starts at 1 and increments each time run.py is called.
 
 Many files are output to the `runs/` directory. They are described below.
 
 filename |description
 ----------------------------|-----------
-run#_topsheet_2040 | An overall summary of various housing, employment, etc by regional planning area types
-run#_parcel_output.csv |csv of parcels that are built for review in Explorer
-run#_parcel_data_diff.csv |A CSV with parcel level output for *all* parcels with lat, lng and includes change in total_residential_units and change in total_job_spaces, as well as zoned capacity measures
-run#_simulation_output.json |summary by TAZ for review in Explorer (unix only)
-run#_taz_summaries |A CSV for [input to the MTC travel model](http://analytics.mtc.ca.gov/foswiki/UrbanSimTwo/OutputToTravelModel)
-run#_urban_footprint_summary | A CSV with A Summary of how close the scenario is to meeting [Performance Target 4](http://planbayarea.org/the-plan/plan-details/goals-and-targets.html)
-
-
-Browse results [here](http://urbanforecast.com/runs/)
-
-######Other Directories
-Below is an explanation of the directories in this repository not described above.
-
-configs/
-
-The YAML files in this directory allow you to configure UrbanSim by changing the keys and values of arguments taken by urbansim functions. See the [UrbanSim Defaults](https://udst.github.io/urbansim_defaults/) docs for more details.
-
-Note that even the values taken by data can be and are configured with these config files (e.g. values in `settings.yaml`).
-
-data_regeneration/
-
-The scripts in here can be used to re-create the data in the `data/` folder from source (various local, state, and federal sources). Use these to re-create the data here when source data change fundamentally.
-
-scripts/
-This is a good place to put scripts that can exist independently of the analysis environment here.
-
-####Parcel Geometries
-
-The parcel geometries are the basis of many operations in the simulation. For example, as one can see in [this pull request](https://github.com/MetropolitanTransportationCommission/bayarea_urbansim/pull/121), in order to add schedule real estate development projects to the list of projects that are included in the simulation, one must use an existing `geom_id`, which is a field on the parcels table added [here](https://github.com/MetropolitanTransportationCommission/bayarea_urbansim/blob/master/data_regeneration/match_aggregate.py#L775-L784).
-
-Parcel geometries are available at the following link:
-
-https://s3.amazonaws.com/bayarea_urbansim/data/09_01_2015_parcel_shareable.zip
-
-#####Geom ID
-
-Please be aware that many ArcGIS users have found that ArcGIS automatically converts and then rounds the `geom_id` column, effectively making it unusable. Therefore we recommend using QGIS, which does not exhibit this behavior with delimited files by default.
-
-Also, in Microsoft Excel, you will need to make sure that the data type of the `geom_id` column is set to `number` and that the number of decimal points is set to 0. Otherwise when you save the CSV again the `geom_id`s will be unusable.
-
-What is the `geom_id` field and why does it exist?
-
-In short, this is a legacy identifier. The `geom_id` field was introduced as a stable identifier for parcels across shapefiles, database tables, CSV's, and other data types. It is an integer because at some point there was a need to support integer only identifiers. It is not based on an Assessor's Parcel Numbers because there was a perception that those were inadequate. And it is based on the geometry of the parcel because many users have found that geometries are the most important feature of parcels.
+run[num]\_topsheet\_[year].csv | An overall summary of various housing and employment outcomes summarized by very coarse geographies.
+run[num]_parcel_output.csv | A csv of all new built space in the region. This has a few thousand rows and dozens of columns which contain various inputs and outputs, as well as debugging information which helps explain why each development was picked by UrbanSim.
+run[num]\_parcel_data\_[year].csv |A CSV with parcel level output for *all* parcels with lat, lng and includes change in total_residential_units and change in total_job_spaces, as well as zoned capacity measures.
+run[num]\_building_data\_[year].csv |The same as above but for buildings.
+run[num]\_taz\_summarie\s_[year].csv |A CSV for [input to the MTC travel model](http://analytics.mtc.ca.gov/foswiki/UrbanSimTwo/OutputToTravelModel)
+run[num]\_pda_summaries\_[year].csv, run[num]\_juris_summaries\_[year].csv, run[num]\_superdistrict_summaries\_[year].csv | Similar outputs to the taz summaries but for each of these geographies. Used for understanding the UrbanSim forecast at an aggregate level.
+run[runnum]_dropped_buildings.csv | A summary of buildings which were redeveloped during the simulated forecast.
+run[runnum]_simulation_output.json | Used by the web output viewer.
+
+
+### Directory structure
+
+* baus/ contains all the Python code which runs the BAUS model.
+* data/ contains BAUS inputs which are small enough to store and render in GitHub (large files are stored on Amazon S3) - this also contains lots of scenario inputs in the form of csv files. See the readme in the data directory for detailed docs on each file.
+* configs/ contains the model configuration files used by UrbanSim. This also contains settings.yaml which provides simulation inputs and settings in a non-tabular form.
+* scripts/ these are one-off scripts which are used to perform various input munging and output analysis tasks. See the docs in that directory for more information.
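
The mode-based command line that the new README describes for run.py could be wired up roughly as follows. This is a sketch under stated assumptions, not the repository's actual run.py: the mode names are taken from the README's list (which spells the download mode `fetch_data`, while the install steps write `fetch-data`), the `-s` option is inferred from its use in all.py, and the README does not say what `-c` does, so it is treated here as a plain boolean switch. `MODES` and `build_parser` are hypothetical names.

```python
import argparse

# Mode names as listed in the README's overview of run.py.
MODES = ["estimation", "simulation", "fetch_data", "preprocessing", "baseyearsim"]


def build_parser():
    """Return an argument parser mirroring the options named in the README."""
    parser = argparse.ArgumentParser(description="Run Bay Area UrbanSim (sketch)")
    # Assumption: -c is an on/off switch; its meaning is not given in the README.
    parser.add_argument("-c", action="store_true", dest="console")
    # Assumption: -s takes an integer scenario number, as all.py passes one.
    parser.add_argument("-s", type=int, dest="scenario", default=None)
    # The README states that simulation is the default mode.
    parser.add_argument("--mode", choices=MODES, default="simulation")
    return parser
```

With this sketch, `build_parser().parse_args(["-c", "--mode", "preprocessing"])` yields a namespace whose `mode` is `"preprocessing"`, and omitting `--mode` falls back to `"simulation"`.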

Vagrantfile

-25
This file was deleted.

all.py

+1-1
@@ -4,7 +4,7 @@
 
 # run a full package of scenarios
 
-for num in [0, 1, 2, 3, 4]:
+for num in [0, 1, 3, 4, 5]:
     os.system('python run.py -s %d' % num)
 
 with open('RUNNUM', 'r') as f:
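
In the lines shown, the return value of `os.system` is not checked, so a failed scenario run would go unnoticed. One possible alternative, sketched here with `subprocess.call` and an argument list instead of a formatted shell string. `SCENARIOS`, `scenario_command`, and `run_scenarios` are hypothetical names, and the sketch uses the current interpreter (`sys.executable`) rather than whatever `python` resolves to on the PATH:

```python
import subprocess
import sys

# Scenario numbers from the updated loop in all.py.
SCENARIOS = [0, 1, 3, 4, 5]


def scenario_command(num):
    """Build the argv for one scenario run (run.py -s N) using this interpreter."""
    return [sys.executable, "run.py", "-s", str(num)]


def run_scenarios(scenarios=SCENARIOS, runner=subprocess.call):
    """Run each scenario in turn and return a {scenario: exit code} mapping.

    `runner` is injectable so the loop can be tested without launching run.py.
    """
    return {num: runner(scenario_command(num)) for num in scenarios}
```

A caller could then inspect the returned mapping and stop or report when any exit code is non-zero.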

all2.py

-27
This file was deleted.
