Database errors in the field nb_applicants


torben
Posts: 13
Joined: Thu Nov 12, 2015 9:59 am

Database errors in the field nb_applicants

Post by torben » Tue Mar 21, 2017 9:50 am

Hi,

I'm using the PATSTAT offline Autumn 2015 edition, and there appear to be database errors in the field nb_applicants and, correspondingly, applt_seq_nr. The problem is that the PATSTAT database lists more applicants than the original document does.

Take, for example, the US patent US8196979B2. PATSTAT lists 11 applicants (9 inventors and 2 companies), but the original document names only the 2 companies as assignees. I found this error only in US patent data.

I think that under US patent law it is not necessary to name a company when filing the application, so PATSTAT stores the inventors as applicants. After the grant has been published, the companies (the real applicants) are added.


Geert Boedt
Posts: 176
Joined: Tue Oct 19, 2004 10:36 am
Location: Vienna

Re: Database errors in the field nb_applicants

Post by Geert Boedt » Tue Mar 28, 2017 10:53 am

Hello Torben,
the US data is a bit particular. Your observation that the original document gives 9 inventors and 2 applicants is correct. It is customary that inventors are initially also applicants, after which a transfer of ownership is registered.
https://worldwide.espacenet.com/publica ... C=B2&ND=4#

But if you look at the PDF document, it specifies INID code 75, which WIPO ST.9 defines as "Name(s) of inventor(s) who is (are) also applicant(s)".
As the EPO works with the data delivered by the respective offices, we use these codes and WIPO definitions to assign applicant and inventor sequence numbers. The number of applicants and the number of inventors is then the MAX sequence number for each.
Here is the reference document:
03-09-01.pdf
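To illustrate the counting rule described above, here is a minimal sketch in Python using the standard sqlite3 module and a toy stand-in for tls207_pers_appln. The rows are invented for illustration and are not taken from PATSTAT; the applicant_only column is a hypothetical helper, not a PATSTAT attribute. It shows why the MAX of applt_seq_nr can be larger than the number of assignees once inventors tagged with INID 75 also receive an applicant sequence number.

```python
import sqlite3

# Toy stand-in for tls207_pers_appln; all rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tls207_pers_appln ("
    "person_id INTEGER, appln_id INTEGER, "
    "applt_seq_nr INTEGER, invt_seq_nr INTEGER)"
)
rows = [
    # Inventors who are also applicants (INID 75): both sequence numbers > 0.
    (1, 100, 1, 1),
    (2, 100, 2, 2),
    (3, 100, 3, 3),
    # Assignee companies: applicant sequence number only.
    (4, 100, 4, 0),
    (5, 100, 5, 0),
]
conn.executemany("INSERT INTO tls207_pers_appln VALUES (?, ?, ?, ?)", rows)

counts = conn.execute(
    """
    SELECT appln_id,
           MAX(applt_seq_nr) AS nb_applicants,   -- counts inventor-applicants too
           MAX(invt_seq_nr)  AS nb_inventors,
           SUM(CASE WHEN applt_seq_nr > 0 AND invt_seq_nr = 0
                    THEN 1 ELSE 0 END) AS applicant_only  -- hypothetical helper
    FROM tls207_pers_appln
    GROUP BY appln_id
    """
).fetchall()
print(counts)
```

In this toy data the application has 5 applicants by the MAX rule, but only 2 of them are applicants that are not also inventors.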
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


torben

Re: Database errors in the field nb_applicants

Post by torben » Wed Mar 29, 2017 10:11 am

Hi Geert,

thanks for your reply. I'm trying to identify cooperations via patents, so I'm using applt_seq_nr and invt_seq_nr to exclude inventors from the results. The big problem is that there are still a lot of "natural persons" in the results.

Is there a way to exclude all persons and receive only companies, or to retrieve the information in the field with INID code (73), "Name(s) of grantee(s), holder(s), assignee(s) or owner(s)"?


Geert Boedt

Re: Database errors in the field nb_applicants

Post by Geert Boedt » Wed Mar 29, 2017 10:39 am

Hello Torben,
you did not specify how you use the invt_seq_nr and applt_seq_nr to identify companies, but I assume you force invt_seq_nr = 0 to retain companies. That is a valid methodology.

Something like this:

Code:

-- Applicants of one application that are not also inventors (invt_seq_nr = 0)
SELECT DISTINCT tls201_appln.appln_id, appln_auth, appln_nr, appln_kind, appln_filing_date,
       tls207_pers_appln.applt_seq_nr, tls207_pers_appln.invt_seq_nr,
       tls206_person.person_id, person_name, psn_name, person_ctry_code, psn_sector
FROM tls201_appln
JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
WHERE tls201_appln.appln_id = 50000109 AND invt_seq_nr = 0
ORDER BY applt_seq_nr, invt_seq_nr

The disadvantage is that this approach excludes all applications where no company name is registered. You could of course add those separately via a UNION of 2 queries.
Another approach would be to use the psn_sector attribute. You will need to do some testing to see whether the quality of that indicator is good enough (or better) than using the sequence numbers. Try out the different methods, see how much noise you get, and if more precision is needed, clean the data further.
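One possible reading of the UNION suggestion, sketched in Python with the standard sqlite3 module and invented toy rows: the first branch keeps applicants that are not inventors, and the second branch adds the applicants of applications that have no such applicant at all. The NOT EXISTS formulation of the second branch is an assumption about what "add those separately" could look like, not a query from this thread.

```python
import sqlite3

# Toy stand-in for tls207_pers_appln; all rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tls207_pers_appln ("
    "person_id INTEGER, appln_id INTEGER, "
    "applt_seq_nr INTEGER, invt_seq_nr INTEGER)"
)
conn.executemany(
    "INSERT INTO tls207_pers_appln VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1, 1),  # inventor who is also applicant
        (2, 100, 2, 0),  # applicant only (e.g. a company)
        (3, 200, 1, 1),  # application 200 has no applicant-only entry
        (4, 200, 2, 2),
        (5, 200, 0, 3),  # inventor only, never an applicant
    ],
)

result = conn.execute(
    """
    -- Branch 1: applicants that are not also inventors.
    SELECT appln_id, person_id
    FROM tls207_pers_appln
    WHERE applt_seq_nr > 0 AND invt_seq_nr = 0
    UNION
    -- Branch 2 (assumed): applicants of applications without any
    -- applicant-only entry, which branch 1 would drop entirely.
    SELECT p.appln_id, p.person_id
    FROM tls207_pers_appln AS p
    WHERE p.applt_seq_nr > 0
      AND NOT EXISTS (
          SELECT 1
          FROM tls207_pers_appln AS c
          WHERE c.appln_id = p.appln_id
            AND c.applt_seq_nr > 0 AND c.invt_seq_nr = 0
      )
    ORDER BY appln_id, person_id
    """
).fetchall()
print(result)
```

Application 200 stays in the result through the second branch, at the cost of bringing natural persons back in, so the noise would still need checking.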

On your request for "a possibility to receive the information in field code (73) Name(s) of grantee(s), holder(s), assignee(s) or owner(s)": PATSTAT has the information as published on the document, and for US applications we also use the assignee data downloaded from the USPTO website. If there is no data tagged with INID 73, we cannot include it. We do not exclude available information; it is simply not available.
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


torben

Re: Database errors in the field nb_applicants

Post by torben » Thu Apr 27, 2017 8:40 am

Hi Geert,

thanks for your reply. Since last week I have had access to the PATSTAT 2016 Autumn edition, and I tried the psn_sector attribute. I combined the conditions (nb_applicants >= 3 and psn_sector) to get patents with at least 3 companies. The problem is that an inventor can also be an applicant. My results list only companies, but they also include patents with just one company (where nb_applicants = 3).

How can I count the total number of companies listed on a patent?


Geert Boedt

Re: Database errors in the field nb_applicants

Post by Geert Boedt » Thu Apr 27, 2017 10:51 am

Hello Torben,
you could use something like this. I joined with an extra table created in a subquery that selects only applications with at least 3 "company applicants", identified through the combination of the applicant and inventor sequence numbers and further restricted via the psn_sector attribute.

Code:

SELECT DISTINCT tls201_appln.appln_id, appln_auth, appln_nr, appln_kind, appln_filing_date,
       granted, docdb_family_id, docdb_family_size, nb_citing_docdb_fam,
       applt_seq_nr, invt_seq_nr, psn_name, person_ctry_code, more2.number_applt, psn_sector
FROM tls201_appln
JOIN tls224_appln_cpc ON tls201_appln.appln_id = tls224_appln_cpc.appln_id
JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
-- Subquery: applications with more than 2 applicants that are neither inventors nor individuals
JOIN (SELECT appln_id, COUNT(applt_seq_nr) AS number_applt
      FROM tls207_pers_appln
      JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
      WHERE applt_seq_nr > 0 AND invt_seq_nr = 0 AND psn_sector <> 'INDIVIDUAL'
      GROUP BY appln_id
      HAVING COUNT(applt_seq_nr) > 2) AS more2 ON tls201_appln.appln_id = more2.appln_id
WHERE LEFT(cpc_class_symbol, 4) = 'Y04S' AND applt_seq_nr > 0 AND invt_seq_nr = 0
ORDER BY appln_auth, appln_filing_date, tls201_appln.appln_id, applt_seq_nr
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna


torben

Re: Database errors in the field nb_applicants

Post by torben » Wed May 03, 2017 2:53 pm

Hi Geert,
thanks for your reply. This query was a nice hint, but I had to modify it slightly.
In the subquery I kept only the relevant values of the psn_sector attribute. You excluded only 'INDIVIDUAL', but I think two more values of psn_sector mainly comprise persons, namely 'UNKNOWN' and the empty entry.

Code:

where applt_seq_nr > 0 and invt_seq_nr = 0
  and psn_sector in ('COMPANY',
                     'COMPANY GOV NON-PROFIT',
                     'COMPANY GOV NON-PROFIT UNIVERSITY',
                     'COMPANY HOSPITAL',
                     'COMPANY UNIVERSITY',
                     'GOV NON-PROFIT',
                     'GOV NON-PROFIT HOSPITAL',
                     'GOV NON-PROFIT UNIVERSITY',
                     'HOSPITAL',
                     'UNIVERSITY',
                     'UNIVERSITY HOSPITAL')

I also added these conditions to the main query so that only the named companies on a patent are returned. Without them, I get a count variable that counts the companies on a patent, but the results also list the persons.
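The difference between the two filters can be sketched in Python with the standard sqlite3 module. The tables are toy stand-ins for tls206_person and tls207_pers_appln, every row is invented, and the function count_company_applicants is a hypothetical helper written for this illustration. It compares excluding only 'INDIVIDUAL' with keeping only the listed sector values.

```python
import sqlite3

# Toy stand-ins for tls206_person and tls207_pers_appln; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tls206_person (person_id INTEGER, psn_name TEXT, psn_sector TEXT);
    CREATE TABLE tls207_pers_appln (person_id INTEGER, appln_id INTEGER,
                                    applt_seq_nr INTEGER, invt_seq_nr INTEGER);
    """
)
conn.executemany(
    "INSERT INTO tls206_person VALUES (?, ?, ?)",
    [
        (1, "ACME", "COMPANY"),
        (2, "BETA", "COMPANY"),
        (3, "GAMMA UNIV", "UNIVERSITY"),
        (4, "DOE, JOHN", "INDIVIDUAL"),
        (5, "ROE, JANE", "UNKNOWN"),
        (6, "SMITH, ANN", ""),  # empty sector entry
        (7, "DELTA", "COMPANY"),
    ],
)
conn.executemany(
    "INSERT INTO tls207_pers_appln VALUES (?, ?, ?, ?)",
    [
        (1, 100, 1, 0), (2, 100, 2, 0), (3, 100, 3, 0),  # three organisations
        (7, 200, 1, 0), (4, 200, 2, 0), (5, 200, 3, 0), (6, 200, 4, 0),
    ],
)

COMPANY_SECTORS = (
    "COMPANY", "COMPANY GOV NON-PROFIT", "COMPANY GOV NON-PROFIT UNIVERSITY",
    "COMPANY HOSPITAL", "COMPANY UNIVERSITY", "GOV NON-PROFIT",
    "GOV NON-PROFIT HOSPITAL", "GOV NON-PROFIT UNIVERSITY", "HOSPITAL",
    "UNIVERSITY", "UNIVERSITY HOSPITAL",
)


def count_company_applicants(sector_condition, params=(), min_companies=3):
    """Hypothetical helper: count applicant-only persons per application."""
    query = f"""
        SELECT pa.appln_id, COUNT(*) AS number_applt
        FROM tls207_pers_appln AS pa
        JOIN tls206_person AS p ON pa.person_id = p.person_id
        WHERE pa.applt_seq_nr > 0 AND pa.invt_seq_nr = 0 AND {sector_condition}
        GROUP BY pa.appln_id
        HAVING COUNT(*) >= ?
        ORDER BY pa.appln_id
    """
    return conn.execute(query, (*params, min_companies)).fetchall()


# Excluding only 'INDIVIDUAL' still counts 'UNKNOWN' and empty sectors.
blacklist = count_company_applicants("p.psn_sector <> 'INDIVIDUAL'")

# Keeping only the listed sector values counts organisations alone.
placeholders = ", ".join("?" for _ in COMPANY_SECTORS)
whitelist = count_company_applicants(
    f"p.psn_sector IN ({placeholders})", COMPANY_SECTORS
)
print(blacklist, whitelist)
```

In the toy data, application 200 has one company plus persons with 'UNKNOWN' and empty sectors, so it passes the first filter but not the second.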

Thanks for your help ;).


Geert Boedt

Re: Database errors in the field nb_applicants

Post by Geert Boedt » Wed May 03, 2017 2:59 pm

Exactly;
that is the nice thing about PATSTAT;
you can tweak the data as much as you need/want ...
Geert
Best regards,

Geert Boedt
PATSTAT support
Business Use of Patent Information
EPO Vienna

