Aim: quantify the increase in deaths due to COVID-19 in France at the department level.
- Death data from INSEE (French national statistics agency):
- Geographic reference data (commune and department):
- Population time series from INSEE:
- Population time series from INSEE - 2020-03-27 weekly update for COVID-19:
- Number of hospital admissions due to influenza, 2010-2020, from Santé publique France:
- = civil status register of deaths, for years yyyy (deaths recorded in those years)
- commune.csv = list of administrative area level 3 in France
- departments.csv = list of administrative area level 2 in France
- INSEE - year x dept x sex x age - population.csv = historical series of population in each department, by age class and sex
Civil status files:
- lastname Last name of dead person
- firstname First name of dead person
- sex Sex of dead person (M = Male / F = Female)
- birth_date_txt Birth Date in text format (format YYYYMMDD). MM or DD = 00 when unknown
- birth_commune_code Commune code of birth place (may not exist in the reference data for historical communes)
- birth_commune_name Commune name of birth place
- birth_country_name Country of birth
- death_date_txt Death date in text format (format YYYYMMDD). MM or DD = 00 when unknown
- death_commune_code Commune code of death place (may not exist in the reference data for historical communes)
- death_register_number Number of the death in the civil status register (not always a number)
- birth_date Valid birth date (YYYY-MM-DD format). Empty if invalid or unknown
- death_date Valid death date (YYYY-MM-DD format). Empty if invalid or unknown
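The relationship between the `_txt` fields and the validated date fields can be illustrated with a minimal sketch. The function name `parse_yyyymmdd` is hypothetical and this is not necessarily how the repository performs the conversion; it only shows one way to turn a YYYYMMDD string into a date and to reject values where MM or DD is 00.

```python
from datetime import date
from typing import Optional


def parse_yyyymmdd(txt: str) -> Optional[date]:
    """Return a date for a YYYYMMDD string, or None if it is invalid or partly unknown."""
    txt = txt.strip()
    if len(txt) != 8 or not txt.isdigit():
        return None
    year, month, day = int(txt[:4]), int(txt[4:6]), int(txt[6:])
    try:
        # date() raises ValueError for month or day 00 and for impossible dates
        return date(year, month, day)
    except ValueError:
        return None


print(parse_yyyymmdd("19450612"))  # 1945-06-12
print(parse_yyyymmdd("19450600"))  # None (day unknown)
```

A `None` result corresponds to the empty value described for birth_date and death_date.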
Geographical reference files:
- commune_code Code of commune (3rd level administrative subdivision of France)
- commune Name of commune (3rd level administrative subdivision of France)
- population Headcount of population as of 01/01/2020 (estimated)
- departement_code Code of department (2nd level administrative subdivision of France)
- departement Name of department (2nd level administrative subdivision of France)
- region_code Code of region (1st level administrative subdivision of France)
- region Name of region (1st level administrative subdivision of France)
- classe_age_5 age interval (5 years)
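As an illustration of the 5-year age interval, here is a small sketch that computes an age at death and assigns it to a class. The function names, the label format ("0-4", "5-9", ...) and the open-ended "95+" class are assumptions made for this example; the actual labels used in the INSEE population file may differ.

```python
from datetime import date


def age_at_death(birth: date, death: date) -> int:
    """Age in completed years on the day of death."""
    had_birthday = (death.month, death.day) >= (birth.month, birth.day)
    return death.year - birth.year - (0 if had_birthday else 1)


def age_class_5(age: int) -> str:
    """Assign an age to a 5-year interval; ages of 95 and over share one class (assumption)."""
    if age < 0:
        raise ValueError("age must be non-negative")
    lower = min(age // 5 * 5, 95)
    return "95+" if lower == 95 else f"{lower}-{lower + 4}"


age = age_at_death(date(1945, 6, 12), date(2020, 3, 27))
print(age, age_class_5(age))  # 74 70-74
```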
File : INSEE_deces_1990_2019_byweek.csv.gz
- Collection and concatenation of yearly death data (1990 to 2019)
- Removed " characters (replaced with white space)
- Cleaned dates: removed rows with invalid birth or death dates (0.72% of cases)
- Filtered to death dates > 1970
- Joined with the geographic reference data to associate each commune with its department
- Corrected the department for specific geographies (Lyon, Paris, Marseille, which have an arrondissement code instead of a commune code in the INSEE file)
- Grouped by death department, death date, sex, age, year of observation
- Years 1997 and 2003 should be removed from any analysis (the 1997 data seem invalid, and 2003 is an outlier due to the heat wave in France in August)
- Model by age group at the level of French departments
- Adjust the INSEE weekly data to estimate the effect of reporting lag (the delay between the occurrence of a death and its recording by INSEE)
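The filtering, join, arrondissement correction and grouping steps listed above can be sketched with pandas as follows. This is a simplified illustration under stated assumptions, not the repository's actual script: the function names are hypothetical, "> 1970" is read as "year after 1970", grouping by age is left out, and the column names are taken from the field lists in this README. The arrondissement code ranges are the official INSEE codes for Paris (75101 to 75120), Lyon (69381 to 69389) and Marseille (13201 to 13216).

```python
import pandas as pd


def fix_arrondissement(code: str) -> str:
    """Map Paris, Lyon and Marseille arrondissement codes to their commune code."""
    if code.isdigit():
        n = int(code)
        if 75101 <= n <= 75120:
            return "75056"  # Paris
        if 69381 <= n <= 69389:
            return "69123"  # Lyon
        if 13201 <= n <= 13216:
            return "13055"  # Marseille
    return code  # other codes (including Corsican 2A/2B codes) are unchanged


def deaths_by_department(deaths: pd.DataFrame, communes: pd.DataFrame) -> pd.DataFrame:
    """Count deaths per department, death date and sex (hypothetical helper)."""
    df = deaths.copy()
    df["death_date"] = pd.to_datetime(df["death_date"], errors="coerce")
    df = df.dropna(subset=["death_date"])      # drop invalid or unknown dates
    years = df["death_date"].dt.year
    df = df[years > 1970]                      # assumption: "> 1970" means the year
    df = df[~df["death_date"].dt.year.isin([1997, 2003])]  # years flagged above
    df["commune_code"] = df["death_commune_code"].map(fix_arrondissement)
    df = df.merge(
        communes[["commune_code", "departement_code"]],
        on="commune_code",
        how="left",
    )
    # Rows whose commune is missing from the reference data get a NaN
    # department and are dropped by groupby.
    return (
        df.groupby(["departement_code", "death_date", "sex"])
        .size()
        .reset_index(name="deaths")
    )
```

Calling `deaths_by_department(deaths, communes)` with the civil status rows and the commune reference table returns one row per department, date and sex with a `deaths` count.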
To generate a pickle file ready for analysis based on the years 2010 to 2019:
python -f INSEE_deces_2010_2019
This will create a pickle file in a preprocessed_data folder, containing a pandas dataframe aggregated by "departement" id number. weeknumber_year.ipynb is a simple example notebook to see what the data look like.
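Once the dataframe has been loaded (for instance with `pandas.read_pickle`), a weekly view can be obtained along the following lines. This is only a sketch: the column names `departement_code`, `death_date` and `deaths`, and the function name `weekly_deaths`, are assumptions, and the actual columns of the pickled dataframe may differ. ISO weeks are used, so the last days of December can belong to week 1 of the following year.

```python
import pandas as pd


def weekly_deaths(daily: pd.DataFrame) -> pd.DataFrame:
    """Sum daily death counts per department and ISO week (hypothetical helper)."""
    iso = daily["death_date"].dt.isocalendar()  # columns: year, week, day
    out = daily.assign(iso_year=iso["year"], iso_week=iso["week"])
    return out.groupby(
        ["departement_code", "iso_year", "iso_week"], as_index=False
    )["deaths"].sum()


daily = pd.DataFrame({
    "departement_code": ["75", "75", "75"],
    "death_date": pd.to_datetime(["2019-12-30", "2020-01-01", "2020-01-06"]),
    "deaths": [1, 2, 4],
})
print(weekly_deaths(daily))
```

In this toy input, 2019-12-30 and 2020-01-01 both fall in ISO week 1 of 2020, so their counts are added together, while 2020-01-06 starts week 2.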
You need to install git-lfs to retrieve the large data files:
- follow the "Installing Git Large File Storage" guide
- read "Configuring Git Large File Storage"
The following steps must be performed in an Anaconda prompt console or, alternatively, in a Windows command console that has executed the command that initializes the PATH so that the conda command is found.
Clone the repository:
$ git clone [email protected]:scrouzet/covid19-incrementality.git
$ cd covid19-incrementality
Create a virtual environment with all dependencies.
$ conda env create -f environment.yaml
Activate the environment and install this package (optionally with the flag).
$ conda activate covid19inc-env
$ pip install -e .
- Fork this repo
- Commit your code in your forked repo
- Open a pull request on the main repo and ask for code reviewers
- Address the review comments
For contributors
- Make a branch "feat-short_name_feature"
- Commit your code in this branch
- Open a pull request on the main repo and ask for code reviewers
- Address the review comments