Mapping data completeness of PATSTAT Global and INPADOC data (TLS231)

Post by EPO / PATSTAT Support » Wed Dec 19, 2018 11:21 am

When doing advanced statistical analysis, it is important to understand the coverage and content of the data you are working with. PATSTAT Global contains data from all over the world. Quality, timeliness and completeness vary a great deal depending on which data you select. Data from the main patent offices (EP, US, WO, JP, KR, CN, AU, EU countries, etc.) is more or less complete and of (very) good quality; notoriously missing is, for example, patent data from India. Some patent offices do not supply data items that are crucial for certain kinds of analysis (for example CN, KR, JP: hardly any data on the applicant country). This can affect data models, and researchers will have to develop their own methodologies to fill in or complete the data using other sources (or methods), or develop statistically sound approaches to obtain representative data sets.
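
As an illustration of how such a gap could be checked in a local copy, here is a minimal sketch (not an official PATSTAT query) that computes, per application authority, the share of applicant records without a country code. It runs against a tiny invented in-memory SQLite database. The table and column names (tls201_appln, tls206_person, tls207_pers_appln, appln_auth, person_ctry_code, applt_seq_nr) are assumed to follow the PATSTAT Global schema and should be checked against the data catalog of your edition. Treating an empty string or NULL as "missing" is also an assumption.

```python
import sqlite3

# Hypothetical miniature of three PATSTAT tables; only the columns used here.
SCHEMA = """
CREATE TABLE tls201_appln (appln_id INTEGER PRIMARY KEY, appln_auth TEXT);
CREATE TABLE tls206_person (person_id INTEGER PRIMARY KEY, person_ctry_code TEXT);
CREATE TABLE tls207_pers_appln (
    person_id INTEGER, appln_id INTEGER,
    applt_seq_nr INTEGER, invt_seq_nr INTEGER
);
"""

# Applicant links per authority, and how many of them lack a country code.
# Assumption: applt_seq_nr > 0 marks applicants; blank or NULL means missing.
MISSING_QUERY = """
SELECT a.appln_auth,
       COUNT(*) AS nb_links,
       SUM(CASE WHEN p.person_ctry_code IS NULL
                  OR TRIM(p.person_ctry_code) = '' THEN 1 ELSE 0 END) AS nb_missing
FROM tls201_appln AS a
JOIN tls207_pers_appln AS pa ON pa.appln_id = a.appln_id
JOIN tls206_person AS p ON p.person_id = pa.person_id
WHERE pa.applt_seq_nr > 0
GROUP BY a.appln_auth
ORDER BY a.appln_auth
"""


def build_person_demo_db():
    """Create an in-memory database filled with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO tls201_appln VALUES (?, ?)",
                     [(1, "EP"), (2, "JP")])
    conn.executemany("INSERT INTO tls206_person VALUES (?, ?)",
                     [(10, "DE"), (11, "FR"), (12, ""), (13, "JP"), (14, "")])
    conn.executemany("INSERT INTO tls207_pers_appln VALUES (?, ?, ?, ?)",
                     [(10, 1, 1, 0), (11, 1, 2, 0),
                      (12, 2, 1, 0), (13, 2, 2, 0),
                      (14, 2, 0, 1)])  # inventor only, ignored by the query
    conn.commit()
    return conn


def missing_country_share(conn):
    """Return {authority: share of applicant links without a country code}."""
    rows = conn.execute(MISSING_QUERY).fetchall()
    return {auth: missing / total for auth, total, missing in rows}


if __name__ == "__main__":
    print(missing_country_share(build_person_demo_db()))
```

On a real database the same SELECT statement can be run directly in the SQL client; the Python wrapper only serves to keep the example self-contained.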

To make it easier to identify the "white spots", the PATSTAT team has created a Tableau dashboard that maps the content and coverage of all the patents included in PATSTAT Global. The links below lead to the published dashboard; you do not need to install any special software.
If you have observations or suggestions for improvements, kindly let us know via patstat @ epo.org

Autumn 2023: https://public.tableau.com/app/profile/ ... STATGlobal
Spring 2023: https://public.tableau.com/app/profile/ ... STATGlobal
Autumn 2022: https://public.tableau.com/app/profile/ ... STATGlobal

The TLS231_INPADOC_LEGAL_EVENT table currently contains a whopping 360 million records covering legal status from 61 patent authorities, using 3,368 different legal event codes.
Researchers need a good understanding of the meaning of those codes before using them. Equally important is understanding the coverage and use of the codes: many legal event codes were used only sporadically, on a small number of applications, before disappearing into obscurity.
To shed some light on this topic, the Excel sheets attached here give, for every event code, the number of patents for which the code has been used, grouped by patent application year. This should give researchers a good first indication of whether the codes they intend to use are useful for the set of patents they want to analyse.
Comments are welcome.
Overview_legal_status_events_tls231_2023b.xlsx
Overview_legal_status_events_tls231_2023a.xlsx
Overview_legal_status_events_tls231_2022b.xlsx

(The Excel sheet also includes the query used to extract the overview table. You can use the Excel filters to narrow down and select event codes, event_category_code, event authority or any other relevant data item according to your needs.)
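
The query shipped in the Excel sheet is not reproduced here. As a rough illustration of the kind of aggregation described above (number of applications per event code and application year), the sketch below runs a comparable query on a tiny invented in-memory SQLite database. The column names (event_auth, event_code, appln_filing_year) are assumed to follow the PATSTAT Global schema, and the event codes "C1" and "C2" are made up for the example; please check names against the data catalog of your edition.

```python
import sqlite3

# Hypothetical miniature of two PATSTAT tables; only the columns used here.
SCHEMA = """
CREATE TABLE tls201_appln (
    appln_id INTEGER PRIMARY KEY,
    appln_auth TEXT,
    appln_filing_year INTEGER
);
CREATE TABLE tls231_inpadoc_legal_event (
    event_id INTEGER PRIMARY KEY,
    appln_id INTEGER,
    event_auth TEXT,
    event_code TEXT
);
"""

# Distinct applications per event authority, event code and filing year.
OVERVIEW_QUERY = """
SELECT e.event_auth,
       e.event_code,
       a.appln_filing_year,
       COUNT(DISTINCT e.appln_id) AS nb_applications
FROM tls231_inpadoc_legal_event AS e
JOIN tls201_appln AS a ON a.appln_id = e.appln_id
GROUP BY e.event_auth, e.event_code, a.appln_filing_year
ORDER BY e.event_auth, e.event_code, a.appln_filing_year
"""


def build_demo_db():
    """Create an in-memory database filled with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO tls201_appln VALUES (?, ?, ?)",
        [(1, "EP", 2010), (2, "EP", 2010), (3, "EP", 2011)],
    )
    conn.executemany(
        "INSERT INTO tls231_inpadoc_legal_event "
        "(appln_id, event_auth, event_code) VALUES (?, ?, ?)",
        [
            (1, "EP", "C1"), (1, "EP", "C1"),  # same code twice on one application
            (2, "EP", "C1"),
            (3, "EP", "C1"), (3, "EP", "C2"),
        ],
    )
    conn.commit()
    return conn


def event_code_overview(conn):
    """Return (event_auth, event_code, filing_year, nb_applications) rows."""
    return conn.execute(OVERVIEW_QUERY).fetchall()


if __name__ == "__main__":
    for row in event_code_overview(build_demo_db()):
        print(row)
```

COUNT(DISTINCT ...) is used so that an application carrying the same event code several times is counted once, which matches "number of patents" rather than "number of events". A code that shows up for only a handful of applications in a few filing years is a likely candidate for the sporadic use mentioned above.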

PATSTAT Register contains all the procedural and event data for patents filed at the EPO. It contains (more or less) the same data as the EPO Register (https://www.epo.org/en/searching-for-pa ... l/register). Some data attributes (such as citations) that are already included in PATSTAT Global have been omitted to avoid duplication.
Last edited by EPO / PATSTAT Support on Thu Oct 28, 2021 5:25 pm, edited 2 times in total.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org

