TLS218_DOCDB_FAM


Post by nico.rasters » Sat Jan 04, 2014 7:02 pm

The TLS218_DOCDB_FAM table in PATSTAT October 2013 seems to be a bit off.
The documentation states that: "Generally speaking, if two applications claim exactly the same prior applications as priorities (these can be Paris Convention priorities or just technical relation priorities), then they are defined by the EPO as belonging to the same DOCDB simple family."

However, I found 4317547 families in which at least one of the members was also a priority. In other words, there's a family with members A and B, where A is the priority of B. See also below for a real example.

Code:

mysql> SELECT * FROM `tls218_docdb_fam` WHERE `docdb_family_id`=22576;
+----------+-----------------+
| appln_id | docdb_family_id |
+----------+-----------------+
|  2968193 |           22576 |
|  2977272 |           22576 |
+----------+-----------------+
2 rows in set (0.00 sec)

mysql> SELECT * FROM `tls204_appln_prior` WHERE `appln_id` IN (2968193, 2977272);
+----------+----------------+--------------------+
| appln_id | prior_appln_id | prior_appln_seq_nr |
+----------+----------------+--------------------+
|  2977272 |        2968193 |                  1 |
+----------+----------------+--------------------+
1 row in set (0.00 sec)

mysql> SELECT * FROM `tls205_tech_rel` WHERE `appln_id` IN (2968193, 2977272);
Empty set (0.03 sec)

There are also cases where the priorities do not match. See below for an example.
I will write a script later to count how many families are affected by this.

Code:

mysql> SELECT *
    -> FROM `tls218_docdb_fam`
    -> LEFT OUTER JOIN `tls204_appln_prior` ON `tls218_docdb_fam`.`appln_id` = `tls204_appln_prior`.`appln_id`
    -> LEFT OUTER JOIN `tls205_tech_rel` ON `tls218_docdb_fam`.`appln_id` = `tls205_tech_rel`.`appln_id`
    -> WHERE `docdb_family_id` =9834;
+----------+-----------------+----------+----------------+--------------------+----------+-------------------+
| appln_id | docdb_family_id | appln_id | prior_appln_id | prior_appln_seq_nr | appln_id | tech_rel_appln_id |
+----------+-----------------+----------+----------------+--------------------+----------+-------------------+
|  2416995 |            9834 |     NULL |           NULL |               NULL |     NULL |              NULL |
| 14298974 |            9834 |     NULL |           NULL |               NULL |     NULL |              NULL |
| 14298975 |            9834 | 14298975 |        5668479 |                  1 |     NULL |              NULL |
| 36709412 |            9834 |     NULL |           NULL |               NULL |     NULL |              NULL |
+----------+-----------------+----------+----------------+--------------------+----------+-------------------+
4 rows in set (0.00 sec)

Caveat: the problem could be with TLS204 and/or TLS205 instead.
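
For anyone who wants to reproduce the count of families in which one member is the priority of another member, here is a minimal sketch of one way to do it. It is not the exact query behind the figure above. It loads only the rows quoted in this post into an in-memory SQLite database (an assumption made so the example runs without a PATSTAT installation); the table and column names are the PATSTAT ones, while the Python variable names are made up for illustration.

```python
import sqlite3

# Toy in-memory copy of the rows quoted above; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls218_docdb_fam (appln_id INTEGER, docdb_family_id INTEGER);
CREATE TABLE tls204_appln_prior (appln_id INTEGER, prior_appln_id INTEGER,
                                 prior_appln_seq_nr INTEGER);
INSERT INTO tls218_docdb_fam VALUES
    (2968193, 22576), (2977272, 22576),
    (2416995, 9834), (14298974, 9834), (14298975, 9834), (36709412, 9834);
INSERT INTO tls204_appln_prior VALUES
    (2977272, 2968193, 1), (14298975, 5668479, 1);
""")

# Families in which one member claims another member as its priority.
query = """
SELECT DISTINCT member.docdb_family_id
FROM tls218_docdb_fam AS member
JOIN tls204_appln_prior AS prio
  ON prio.appln_id = member.appln_id
JOIN tls218_docdb_fam AS prior_member
  ON prior_member.appln_id = prio.prior_appln_id
 AND prior_member.docdb_family_id = member.docdb_family_id
ORDER BY member.docdb_family_id
"""
internal_priority_families = [row[0] for row in conn.execute(query)]
print(internal_priority_families)
```

On these toy rows only family 22576 is returned. Family 9834 is not, because its single priority (5668479) is not among the family members loaded here. Against the full tables, wrapping the same join in COUNT(DISTINCT ...) should give the number of affected families.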
________________________________________
Nico Doranov
Data Manager

Daigu Academic Services & Data Stewardship
http://www.daigu.nl/



Post by mkracker » Tue Jan 07, 2014 3:59 pm

PATSTAT does not compute the EPO simple families (table TLS218_DOCDB_FAM) by itself, but simply takes this information from the DOCDB backfile. Although the EPO simple family (also called DOCDB family) is very useful and widely used, the DOCDB manual states: "The business rules governing the family building process are internal to the EPO and not public." In fact, I do not know the rules either.

In short: PATSTAT takes the simple family as delivered in DOCDB for granted. Nevertheless, I checked your 2 examples (using PATSTAT Oct 2013). IMO the 2 families in your example are plausible and give no indication that the DOCDB family building went wrong.

1) DOCDB_FAMILY_ID = 9834
All 4 family members are regional / national applications of the same PCT application, which has APPLN_ID = 5668479.
To check, look at field INTERNAT_APPLN_ID of table TLS201_APPLN of these 4 applications.

2) DOCDB_FAMILY_ID = 22576
This family has 2 Belgian applications filed in 1926 and 1927. Their contents seem to be very similar or identical (e.g. they have the same title). If you look them up in Espacenet via their APPLN_NR_EPODOC (BE19260336310 and BE19270346786), you will receive the same document record. A strange family, but still somehow plausible.
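
The check suggested under 1) can be sketched as follows. This is only an illustration: the INTERNAT_APPLN_ID values in the toy TLS201_APPLN table are filled in from the statement above (all four members pointing to APPLN_ID 5668479), not copied from a PATSTAT extract, and SQLite is used in place of MySQL so that the example is self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls218_docdb_fam (appln_id INTEGER, docdb_family_id INTEGER);
CREATE TABLE tls201_appln (appln_id INTEGER, internat_appln_id INTEGER);
INSERT INTO tls218_docdb_fam VALUES
    (2416995, 9834), (14298974, 9834), (14298975, 9834), (36709412, 9834);
-- Assumed values, taken from the statement in the post, not from PATSTAT.
INSERT INTO tls201_appln VALUES
    (2416995, 5668479), (14298974, 5668479),
    (14298975, 5668479), (36709412, 5668479);
""")

# Count the members of a family and the distinct PCT parents they point to.
query = """
SELECT fam.docdb_family_id,
       COUNT(*) AS members,
       COUNT(DISTINCT appln.internat_appln_id) AS distinct_pct_parents
FROM tls218_docdb_fam AS fam
JOIN tls201_appln AS appln ON appln.appln_id = fam.appln_id
WHERE fam.docdb_family_id = ?
GROUP BY fam.docdb_family_id
"""
family_id, members, distinct_pct_parents = conn.execute(query, (9834,)).fetchone()
print(family_id, members, distinct_pct_parents)
```

A single distinct parent for all four members is what one would expect if the family is held together by the common PCT application rather than by rows in TLS204.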

Of course, DOCDB is not error-free. Everybody who detects a family mix-up is encouraged to report it, e.g. via the "Report data error" link in Espacenet.

---------------------------
Martin Kracker, EPO - PATSTAT
-------------------------------------------
Martin Kracker / EPO



Post by nico.rasters » Tue Jan 07, 2014 6:23 pm

Hi Martin,

My problem actually is with the statement that "if two applications claim exactly the same prior applications as priorities (these can be Paris Convention priorities or just technical relation priorities), then they are defined by the EPO as belonging to the same DOCDB simple family". Note the word "exactly".

In the case of family 22576 we see that appln_id 2977272 has priority appln_id 2968193, and the other family member, appln_id 2968193, should (according to the definition) also have priority appln_id 2968193. But I don't think it's possible to claim oneself as a priority, and we see that it doesn't do that. These two patents do not claim exactly the same prior applications (note that I checked TLS205 as well), and should therefore not be in the same family.

But maybe I am misreading the documentation. It does say "Generally speaking". So perhaps this simply means that in most cases the family members share the same priorities (and in other cases they don't). There are around 67 million families, and approximately 4.5 million records were a bit off. Less than 10%. So is the caveat in the "generally speaking"?



Post by mkracker » Wed Jan 08, 2014 11:08 am

The PATSTAT Data Catalog describes table TLS218_DOCDB_FAM quite correctly: "Generally speaking, if two applications claim exactly the same prior applications as priorities (.....), then they are defined by the EPO as belonging to the same DOCDB simple family. The EPO reserves the right to classify an application into a particular simple family irrespective of this general rule ....."

Besides this general rule, there are many particular rules to handle special situations. Just think about what applicants can do with various types of continuations. All these cases have to be treated in a sensible way.

Please note: In contrast to your assumption, we consider the filing date of the first filing as a self-priority.
A simple example: A first filing takes place in Germany in 2000. Its filing date (in year 2000) is regarded as a (self-)priority. Then in 2001 the invention is again filed at the EPO, claiming the German application as a priority. Because both applications have the same priority (in year 2000), they are in the same DOCDB family.
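
As a rough illustration of the self-priority idea, the sketch below treats every family member without a row in TLS204 as its own priority and then checks whether all members of a family end up with the same priority set. This is an assumption made for the example, not the EPO's actual family-building rule. It ignores TLS205 and TLS216, uses only the rows quoted earlier in this thread, and runs on an in-memory SQLite database instead of MySQL.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls218_docdb_fam (appln_id INTEGER, docdb_family_id INTEGER);
CREATE TABLE tls204_appln_prior (appln_id INTEGER, prior_appln_id INTEGER,
                                 prior_appln_seq_nr INTEGER);
INSERT INTO tls218_docdb_fam VALUES
    (2968193, 22576), (2977272, 22576),
    (2416995, 9834), (14298974, 9834), (14298975, 9834), (36709412, 9834);
INSERT INTO tls204_appln_prior VALUES
    (2977272, 2968193, 1), (14298975, 5668479, 1);
""")

# Assumption: a member with no TLS204 row counts as its own (self-)priority.
query = """
SELECT fam.docdb_family_id, fam.appln_id,
       COALESCE(prio.prior_appln_id, fam.appln_id) AS effective_prior
FROM tls218_docdb_fam AS fam
LEFT JOIN tls204_appln_prior AS prio ON prio.appln_id = fam.appln_id
"""
priorities = defaultdict(lambda: defaultdict(set))
for family_id, appln_id, effective_prior in conn.execute(query):
    priorities[family_id][appln_id].add(effective_prior)

# A family is consistent when every member has the same priority set.
consistent = {
    family_id: len({frozenset(prios) for prios in members.values()}) == 1
    for family_id, members in priorities.items()
}
print(consistent)
```

Under this assumption family 22576 comes out as consistent, because both members share the priority 2968193, just like the German and EPO filings in the example. Family 9834 does not, which fits the earlier remark that its members are linked through a common PCT application rather than through TLS204.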

--------------------------
Martin Kracker, EPO - PATSTAT



Post by nico.rasters » Sat Jan 11, 2014 5:47 pm

Thanks Martin.

Self-priority makes sense. I suppose these are excluded from TLS204 to save space? E.g. instead of including a record where appln_id=1234 and prior_appln_id=1234 we now have the rule that if appln_id isn't there, it's a priority. Instead we could have had the rule that if appln_id = prior_appln_id it's a self-priority. Makes more sense to me conceptually.

Actually, I always assumed that the priority itself should also be included in the DOCDB simple family, but I found plenty of families where this was not the case. Are perhaps those families wrong then? In other words, if we have patents A, B, and C where C is the priority for A and B... is C then (always) a self-priority and should it (always) be part of the family?
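
To make the A, B, C question concrete, here is a small sketch of how such cases could be listed: priority links where the priority sits in a different DOCDB family than the application claiming it. The appln_id and family values are hypothetical (A=1 and B=2 in family 100, C=3 in family 200), and SQLite is used instead of MySQL only to keep the example self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tls218_docdb_fam (appln_id INTEGER, docdb_family_id INTEGER);
CREATE TABLE tls204_appln_prior (appln_id INTEGER, prior_appln_id INTEGER,
                                 prior_appln_seq_nr INTEGER);
-- Hypothetical rows: A=1 and B=2 claim C=3, but C is in another family.
INSERT INTO tls218_docdb_fam VALUES (1, 100), (2, 100), (3, 200);
INSERT INTO tls204_appln_prior VALUES (1, 3, 1), (2, 3, 1);
""")

# Priority links whose priority belongs to a different family than the claimant.
query = """
SELECT DISTINCT claimant.docdb_family_id AS claimant_family,
       prio.prior_appln_id,
       prior_fam.docdb_family_id AS prior_family
FROM tls204_appln_prior AS prio
JOIN tls218_docdb_fam AS claimant ON claimant.appln_id = prio.appln_id
JOIN tls218_docdb_fam AS prior_fam ON prior_fam.appln_id = prio.prior_appln_id
WHERE prior_fam.docdb_family_id <> claimant.docdb_family_id
"""
outside_priorities = conn.execute(query).fetchall()
print(outside_priorities)
```

Each returned row names a family, a priority claimed by one of its members, and the other family in which that priority was placed.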



Post by nico.rasters » Sat Jan 25, 2014 9:36 pm

The documentation also gives the following description for DOCDB_FAMILY_ID and INPADOC_FAMILY_ID:
"Means that most probably the applications share exactly the same priorities (Paris Convention or technical relation) as in table TLS204_PRIOR_APPLN and TLS205_TECH_REL and tls216_APPLN_CONTN."

Table TLS216 is not mentioned in the explanation under 5.20 TLS218_DOCDB_FAM: Link between DOCDB family members. Probably it should be.

Until disproven, I will assume that a DOCDB family also includes the priorities, and that these priorities can be found in TLS204, TLS205, and TLS216.

In fact, I randomly stumbled upon a family which was based on two priorities that refer to each other as priorities. In other words, patents A and B both have priorities P1 and P2; P1 has priority P2, and P2 has priority P1. And thanks to the self-priority theory we can add P1->P1 and P2->P2. I guess this is the only way to conform to the "share exactly the same priorities" rule. The DOCDB family subsequently consists of A, B, P1, and P2.
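
The reasoning can be checked with a few lines of Python. This is only a sketch of my reading, not a confirmed EPO rule: it assumes that a self-priority is added for every application that is itself claimed as a priority (here P1 and P2), and the labels A, B, P1 and P2 are the placeholders used above.

```python
# Priority claims as described above: claimant -> claimed priorities.
claims = {
    "A": {"P1", "P2"},
    "B": {"P1", "P2"},
    "P1": {"P2"},
    "P2": {"P1"},
}

# Applications that are claimed as a priority by at least one application.
claimed = set().union(*claims.values())


def effective_priorities(appln):
    """Claimed priorities plus a self-priority if the application is itself a priority."""
    prios = set(claims.get(appln, set()))
    if appln in claimed:
        prios.add(appln)  # assumed self-priority (P1->P1, P2->P2)
    return frozenset(prios)


effective = {appln: effective_priorities(appln) for appln in claims}

# All four applications share one priority set if only one distinct set remains.
same_family = len(set(effective.values())) == 1
print(same_family)
```

With the self-priorities added, all four applications end up with the set {P1, P2}, so the check passes. Without them, P1 would only have {P2} and P2 only {P1}, and the sets would differ.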

