R script for loading PATSTAT global (Autumn Edition 2018) into PostgreSQL

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

daniel_hain
Posts: 1
Joined: Wed Jan 30, 2019 10:45 am

R script for loading PATSTAT global (Autumn Edition 2018) into PostgreSQL

Post by daniel_hain » Wed Jan 30, 2019 11:54 am

Dear all,

I lately undertook an effort to create an R-Script that loads the current (Autumn 2018) version of PATSTAT global into a PostgreSQL database. It
- Creates all necessary tables (very close to the suggestion to be found in the PATSTAT SQL script)
- unzips the PATSTAT downloads one-by-one and load them into RAM
- Writes them into the corresponding database tables
- Sets all primary and foreign keys
- Defines a couple of (what I believe to be usefull) indicies
- Clusters the tables by their primary keys (not implemented yet).

The implementation in R has a couple of advantages>
- It facilitates the database setup for analysts which are not necessary used to the initial configuration of databases
- The optimized R ´fread´ function loads the tables way faster than the SQl native WRITE TABLE

It can be found here:

https://github.com/daniel-hain/PATSTAT-PostgreSQL

Comments and suggestions are always welcome. Hope it proves helpful for some.

Best
Daniel
Last edited by daniel_hain on Thu Jan 31, 2019 9:46 am, edited 1 time in total.


EPO / PATSTAT Support
Posts: 426
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: R script for loading PATSTAT global (Autumn Edition 2018) into PostgreSQL

Post by EPO / PATSTAT Support » Thu Jan 31, 2019 9:41 am

Hello Daniel,
many thanks for sharing your work with the community.
Geert BOEDT
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply