Only download newly added data from new version


tobias
Posts: 2
Joined: Sun May 19, 2019 4:15 pm

Only download newly added data from new version

Post by tobias » Sun May 19, 2019 4:59 pm

Hello,

Is there a way to only download the newly added data from a new PATSTAT version?

For example, a way to download only the rows that were added to the tls201 table in the Autumn 2018 dataset, i.e. the rows that were not included in the Spring 2018 dataset.

I tried using EARLIEST_PUBLN_DATE as a condition on the tls201 table, but this does not work: 261,696 of the 2,584,657 newly added entries are not captured by this condition. I took the most recent EARLIEST_PUBLN_DATE from the Spring 2018 version (2018-02-02) as the cutoff in the Autumn 2018 version, and also accounted for new entries with missing values (those with '9999-12-31'). Does anyone know where the mistake in this approach lies?

Thank you!


EPO / PATSTAT Support
Posts: 426
Joined: Thu Feb 22, 2007 5:33 pm

Re: Only download newly added data from new version

Post by EPO / PATSTAT Support » Mon May 20, 2019 7:46 am

Hi Tobias,

Your misconception seems to be the assumption that each new edition of PATSTAT is just the previous edition plus some additional data. This is far from the case. Here are some examples:
  • The EPO collects data from various offices, and the timing of these deliveries is often not guaranteed. Sometimes we receive data on patents published long ago, which we nevertheless integrate to improve completeness.
  • Corrections are made even to old data, and sometimes they are quite massive.
  • Adding new data might trigger changes to old data. For example, adding a new patent might create, split or combine DOCDB or INPADOC families.
  • Typically a new PATSTAT edition also has changes in the database schema, which of course affect old and new data.
In short: there is no practical way to update PATSTAT with only the difference between two editions. This is why we always deliver the complete data set. You need to replace your PATSTAT database with a newly loaded version.
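
To illustrate the points above, here is a minimal Python sketch that compares two editions by key instead of by date. It is not an official EPO tool. It assumes that appln_id identifies the same application in both editions, and the rows are made-up toy data, not real PATSTAT records. The helper names diff_editions and added_by_date are hypothetical.

```python
from datetime import date


def diff_editions(old_rows, new_rows, key="appln_id"):
    """Return (added, removed, changed) keys between two editions.

    Each edition is a list of dicts; `key` is assumed to identify the
    same record in both editions.
    """
    old = {row[key]: row for row in old_rows}
    new = {row[key]: row for row in new_rows}
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return added, removed, changed


def added_by_date(new_rows, cutoff):
    """Date-based guess at 'new' rows: earliest_publn_date after the cutoff.

    The placeholder 9999-12-31 also satisfies this condition.
    """
    return sorted(r["appln_id"] for r in new_rows
                  if r["earliest_publn_date"] > cutoff)


# Toy data (invented for illustration only).
old_edition = [
    {"appln_id": 1, "earliest_publn_date": date(2010, 5, 1), "docdb_family_id": 100},
    {"appln_id": 2, "earliest_publn_date": date(2018, 2, 2), "docdb_family_id": 200},
    {"appln_id": 3, "earliest_publn_date": date(2015, 7, 9), "docdb_family_id": 300},
]
new_edition = [
    # Existing row whose family changed after a related patent was added.
    {"appln_id": 1, "earliest_publn_date": date(2010, 5, 1), "docdb_family_id": 101},
    # Unchanged row.
    {"appln_id": 2, "earliest_publn_date": date(2018, 2, 2), "docdb_family_id": 200},
    # New row with a recent publication date.
    {"appln_id": 4, "earliest_publn_date": date(2018, 6, 1), "docdb_family_id": 400},
    # New row for an old publication that was delivered late.
    {"appln_id": 5, "earliest_publn_date": date(1999, 3, 15), "docdb_family_id": 101},
]

added, removed, changed = diff_editions(old_edition, new_edition)
print(added)    # [4, 5]
print(removed)  # [3]
print(changed)  # [1]
print(added_by_date(new_edition, date(2018, 2, 2)))  # [4]
```

In this toy case the date condition finds only row 4. It misses row 5 (an old publication delivered late) and says nothing about the changed row 1 or the removed row 3, which is why a date cutoff cannot reproduce the difference between two editions.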

Best regards,
Martin
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


tobias

Re: Only download newly added data from new version

Post by tobias » Tue May 21, 2019 12:00 pm

Hi Martin,

Thank you very much for the clarification!

Best,

Tobias

