Dead links, wrong data?

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.

nico.rasters
Posts: 140
Joined: Wed Jul 08, 2009 5:51 pm

Dead links, wrong data?

Post by nico.rasters » Sun Jan 31, 2016 1:00 pm

http://www.epo.org/searching/data/data/ ... gular.html used to point to a Kind Code concordance list. It's now a 404 page. This link is mentioned on http://worldwide.espacenet.com/help?loc ... =kindcodes and in the Data Catalog 5.02. You can find the kind codes on https://www.epo.org/searching-for-paten ... gular.html

Another wild goose chase was my search for the PATSTAT Online login URL. The PATSTAT Online User Manual (which I now cannot find either) and Google point to http://www.epo.org/searching-for-patent ... tstat.html. However, the real link is https://data.epo.org/expert-services/start.html

Also, the Data Catalogue on http://www.epo.org/searching-for-patent ... tstat.html covers the Spring edition instead of Autumn.

In the Autumn version I noticed a new field in tls201_appln, namely earliest_filing_id. It appears to be referring to a priority patent, but this is not the case. For example docdb_family_id 45420505 has three applications 340572313, 375715005, 376610796 which all name themselves as earliest_filing_id. But according to tls204_appln_prior 376610796 is the priority. So for priorities I'd stick with http://gder.phpnet.org/rassenfosse/down ... nt_inv.sql
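
For anyone who wants to reproduce the comparison, here is a minimal sketch. It uses a toy in-memory SQLite stand-in for tls201_appln and tls204_appln_prior with only the columns needed, not a real PATSTAT connection. The three application rows come from the example above; the two priority links are an assumption, since all I stated is that 376610796 is the priority.

```python
import sqlite3

# Toy in-memory stand-in for two PATSTAT tables; only the columns used here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls201_appln (
    appln_id INTEGER PRIMARY KEY,
    docdb_family_id INTEGER,
    earliest_filing_id INTEGER
);
CREATE TABLE tls204_appln_prior (
    appln_id INTEGER,
    prior_appln_id INTEGER
);
-- The three applications from the post, each naming itself as earliest filing.
INSERT INTO tls201_appln VALUES
    (340572313, 45420505, 340572313),
    (375715005, 45420505, 375715005),
    (376610796, 45420505, 376610796);
-- ASSUMPTION: the other two applications claim 376610796 as their priority.
INSERT INTO tls204_appln_prior VALUES
    (340572313, 376610796),
    (375715005, 376610796);
""")

# List each family member with its earliest_filing_id and any Paris priority.
QUERY = """
SELECT a.appln_id, a.earliest_filing_id, p.prior_appln_id
FROM tls201_appln AS a
LEFT JOIN tls204_appln_prior AS p ON p.appln_id = a.appln_id
WHERE a.docdb_family_id = ?
ORDER BY a.appln_id;
"""
rows = conn.execute(QUERY, (45420505,)).fetchall()

# Keep the rows where a priority exists but differs from earliest_filing_id.
mismatches = [r for r in rows if r[2] is not None and r[2] != r[1]]

for row in rows:
    print(row)
```

The SQL inside QUERY should run largely unchanged against a real PATSTAT database, with the `?` placeholder adapted to your driver.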
________________________________________
Nico Doranov
Data Manager

Daigu Academic Services & Data Stewardship
http://www.daigu.nl/


nico.rasters
Posts: 140
Joined: Wed Jul 08, 2009 5:51 pm
Contact:

Re: Dead links, wrong data?

Post by nico.rasters » Sun Jan 31, 2016 1:27 pm

Also found (in Autumn 2015):
SELECT tls206_person.*
FROM tls207_pers_appln
INNER JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
WHERE tls207_pers_appln.appln_id = 339979633
ORDER BY doc_std_name_id;

Note how SIMPSON STEVEN LEWIS CHARLES occurs twice as doc_std_name, once erroneously for Julian Richard Davis.


mkracker
Posts: 120
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: Dead links, wrong data?

Post by mkracker » Wed Feb 10, 2016 1:34 pm

Dear Nico,

You really had bad luck. Just 2 days before your post, all the web pages of the EPO website related to patent information were relaunched, with a different structure and of course different URLs. Consequently, many URLs contained in already published documents became invalid. In the meantime, I have updated the most relevant documents (Data Catalogs, PATSTAT Online user manual), so they now contain working links.
The new website structure also has some advantages: you will now find all PATSTAT-related data in one place: http://www.epo.org/searching-for-patent ... tstat.html. Most documents, like the PATSTAT Online user manual, are now in the "Downloads" tab.
Sorry for the confusion during the transitional period.

Regarding the attribute EARLIEST_FILING_ID in table TLS201_APPLN: contrary to what you assumed, this attribute does not necessarily refer to a priority. The Data Catalog defines it as:
Derived from the tables
- TLS201_APPLN self-priority
- TLS201_APPLN PCT application
- TLS204_APPLN_PRIOR Paris Convention priority
- TLS205_TECH_REL technical relations
- TLS216_APPLN_CONTN application continuations

Comments: If multiple applications have been filed on the earliest filing date, then conceptually any of these applications can be regarded as the earliest application. Nevertheless, preference is given to the international application. In other words: If there is a PCT application which was filed on the earliest application date, then the APPLN_ID of this PCT application is taken as the EARLIEST_FILING_ID. Otherwise the application with the smallest APPLN_ID will be taken.
In short: there may be multiple related applications filed on the same earliest date, which is the case in your example. Surprisingly, quite a few applications (< 1%) are filed on the same day as their priority. We will analyse whether in these cases we should prefer the ID of a priority over a non-priority application.
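
The selection rule quoted above can be sketched roughly as follows. This is an illustration only, not the actual PATSTAT derivation: the `Appln` record, its `is_pct` flag and the function name are hypothetical, the input is assumed to be the already-collected set of related applications, and the tie-break among several PCT applications on the same date (smallest APPLN_ID) is an assumption, since the Data Catalog text does not cover that case.

```python
from datetime import date
from typing import NamedTuple


class Appln(NamedTuple):
    """Hypothetical minimal record for one related application."""
    appln_id: int
    filing_date: date
    is_pct: bool  # hypothetical flag: True for an international (PCT) application


def pick_earliest_filing(related: list[Appln]) -> int:
    """Return the APPLN_ID chosen as earliest filing among related applications."""
    earliest = min(a.filing_date for a in related)
    candidates = [a for a in related if a.filing_date == earliest]
    # Preference is given to a PCT application filed on the earliest date.
    pct = [a for a in candidates if a.is_pct]
    pool = pct or candidates
    # Otherwise (and, by assumption, among several PCTs) take the smallest ID.
    return min(a.appln_id for a in pool)


# Made-up toy IDs and dates, purely for illustration.
same_day = [
    Appln(30, date(2010, 1, 5), False),
    Appln(10, date(2010, 1, 5), False),
    Appln(20, date(2010, 1, 5), True),
    Appln(5, date(2011, 3, 1), False),
]
with_pct = pick_earliest_filing(same_day)
without_pct = pick_earliest_filing([a for a in same_day if not a.is_pct])
print(with_pct, without_pct)
```

With a PCT application on the earliest date, its ID wins even though a smaller ID exists on the same date; without one, the smallest ID on that date is taken.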

You also mentioned an example of a wrongly assigned DOC_STD_NAME. This is a known issue, which we unfortunately cannot solve. In section 8 "Known Deficiencies" of the Data Catalog it is written:
TLS206_PERSON / TLS906_PERSON: DOCDB standardized names:
Some DOCDB standardised names are wrongly assigned to persons of US patents, because the sequence of persons in the USPTO data source and that in DOCDB sometimes do not match correctly. There is no known fix. When working with US patent applicants or inventors, you should avoid using the DOCDB standardised name. Instead, you might consider other harmonized names available in table TLS906_PERSON.
As a workaround you might look at table TLS906_PERSON, which contains 2 more types of harmonized names.
-------------------------------------------
Martin Kracker / EPO

