Annual Amount of Pat Family Applications by Country of Origi

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

deef
Posts: 5
Joined: Tue Jun 13, 2017 4:50 pm

Annual Amount of Pat Family Applications by Country of Origi

Post by deef » Tue Jun 13, 2017 5:22 pm

Hi all,

I'm attempting to make a correlation analysis between a country's R&D expenditures and innovative power. Obtaining the R&D data isn't much of a problem. Getting good and reliable data for the innovative power is a whole different story, though.

My approach to measure the innovative power of a country is by counting the "inventions" that origin from that respective country. The term "inventions" is where things get sketchy - at least for me. By browsing the discussion forums here as well as further researching on the EPO website and other sources, I came to the conclusion that the most reasonable way appears to be counting patent families as opposed to individual patents.

Now, I've come up with a SQL query for PATSTAT (2017 spring edition) which - from my understanding - collects the following data and appears to work fine:
  • The annual total of INPADOC applications
  • by residents of a certain country
  • for each of the years 1995 through 2015

Code: Select all

SELECT q.earliest_filing_year, COUNT(q.earliest_filing_year) AS cnt_A
FROM (
  SELECT  DISTINCT inpadoc_family_id, earliest_filing_year
  FROM tls201_appln a
  JOIN tls207_pers_appln pa ON a.appln_id = pa.appln_id
  JOIN tls206_person p ON pa.person_id = p.person_id
  WHERE earliest_filing_year BETWEEN 1995 AND 2015
  AND pa.applt_seq_nr > 0
  AND ipr_type = 'PI'
  AND person_ctry_code = 'DE' -- US, KR, (CN,) GB, ES, IT, FR, JP
) q
GROUP BY q.earliest_filing_year
This leaves me with several questions for more experienced users:
  • Does the query I posted above actually provide the data that I described in the list above?
  • Am I skipping crucial stuff or narrowing down the results too far with that query?
  • Is my approach to pick patent families over patents viable and valid to measure inventions?
  • Did I opt for the correct patent family definition? Would DOCDB be suited better for my purposes?
  • Am I missing something else?
I'm really struggling to get a resilient data base for my research so any sort of help would be greatly appreciated. If you need further information please let me know.

Thanks in advance!

Kind regards,
deef


mkracker
Posts: 120
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: Annual Amount of Pat Family Applications by Country of O

Post by mkracker » Mon Jun 19, 2017 8:57 pm

Hi deef,

You asked for feedback, so here are my thoughts. To be most useful, I am trying to be picky.
  • DOCDB families (aka simple families) are usually used for counting inventions. All applications of the same DOCDB family can be regarded as equivalent. In contrast the broader INPADOC family (aka extended family) is usually utilized to count technologies, because 2 applications of the same INPADOC family may be quite different.
  • What you will get with your query:
    - The annual total of (DOCDB or INPADOC) families
    - by country codes of the applicants (exactly: The applicants as published in the most current publication of the corresponding application; this is not necessarily the applicant who has filed the application in the first place)
    - for each of the years 1995 through 2015
    - Only patents are considered; e.g. utility models are ignored; this often makes perfect sense
  • Note for some applicants the country code may not be known. Check the coverage of those offices which are most relevant to you.
  • Note that numbers for the last years will be too low, because some applications might not have been published "yet" (= cut-off date for the 2017 Spring Edition was end of January 2017). You will see this effect easily when you sort by date.
You could also simplify your query, which will be easier to read. In your case there is no need for a nested sub-query in the FROM clause. This query gives the same result:

Code: Select all

SELECT earliest_filing_year, COUNT(DISTINCT inpadoc_family_id) AS cnt_A
FROM tls201_appln a
JOIN tls207_pers_appln pa ON a.appln_id = pa.appln_id
JOIN tls206_person p ON pa.person_id = p.person_id
WHERE earliest_filing_year BETWEEN 1995 AND 2015
AND pa.applt_seq_nr > 0
AND ipr_type = 'PI'
AND person_ctry_code = 'DE' -- US, KR, (CN,) GB, ES, IT, FR, JP
GROUP BY earliest_filing_year
ORDER BY earliest_filing_year desc
All the best for your project
Martin / EPO PATSTAT
-------------------------------------------
Martin Kracker / EPO


deef
Posts: 5
Joined: Tue Jun 13, 2017 4:50 pm

Re: Annual Amount of Pat Family Applications by Country of O

Post by deef » Tue Jun 20, 2017 8:49 am

Hi Martin,

Thank you very much for your very helpful response!

With regard to your explanation, I'll opt for DOCDB over INPADOC now since I'm actually interested in inventions. So thanks for clarification. I'll also have to trim the time period to 2014 as per your remark.

Anyways, I still have three questions if I may:
  • Is there any way to filter for the applicant who has filed the application in the first place? It's more of an aesthetic issue, though.
  • How can I find out the coverage of the offices as you mentioned above? I tried googling and searching the EPO website without any success.
  • I've come to think that including utility models might be reasonable for my research since not every invention is filed as a patent necessarily. I could imagine that also counting UMs might enhance the data quality subsequently - even if not available in the US. Any objections to this reasoning from a more experienced user?
Thank you for cleaning up my query as well. I wasn't familiar with using the COUNT() function that way. Very neat! :-)

Kind regards,
deef


mkracker
Posts: 120
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: Annual Amount of Pat Family Applications by Country of O

Post by mkracker » Tue Jun 20, 2017 3:45 pm

Hi,

1) To find the applicant who has filed the application in the first place:
In PATSTAT Biblio applicants (and likewise inventors) are connected in 2 ways to an application:

a) Most commonly, you use the latest applicants (as published). You get them by joining applicants with persons, i.e. TLS201_APPLN - TLS207_PERS_APPLN - TLS206_PERSON.

b) But applicants (here aka assignees, owners,... ) may change over time. So on each publication of an application a different set of applicants may have been published. You may get all applicants from the first publication to the last publication of a given application by joining TLS201_APPLN - TLS211_PAT_PUBLN - TLS227_PERS_PUBLN - TLS206_PERSON. If you restrict your result to the first publication, you will get the applicants who have filed in the first place (unless there was a change of applicants before the first publication). Note that you likely would need a nested sub-query to express all this in a single query.
This approach is a little bit more complex, but you can do an analysis of ownership over time.

2) Checking the data quality of PATSTAT, you should query PATSTAT directly. You will not find the real situation somewhere else. Below is a query which could serve as a starting point for an analysis you could do. Please adapt it to your needs.

Code: Select all

select appln_auth, count(*) as nbOfPersons
from tls201_appln a
join tls207_pers_appln pa on a.appln_id = pa.appln_id
join tls206_person p on pa.person_id = p.person_id
where appln_auth in -- ignore too small offices
	('AR','AT','AU','BE','BG','BR','CA','CH','CN','CO','CS','CZ','DD','DE','DK','EA','EP',
	'ES','FI','FR','GB','GR','HK','HR','HU','ID','IE','IL','IN','IT','JP','KR','MA','MX','MY','NL','NO','NZ',
	'PE','PL','PT','RO','RU','SE','SG','SI','SK','SU','TR','TW','UA','US','UY','YU','ZA')
and appln_kind <> 'W'       -- exclude filings at the Reveiving office
and internat_appln_id = 0   -- If analysis of national / regional filings only: exclude PCTs in the national/regional phase
and year(appln_filing_date) >= 2000  -- limit to newer filings; situation might be different for other periods
and person_ctry_code <> ''	-- retrieve only persons WITH a country code
and applt_seq_nr > 0		-- ignore persons which are only inventors
group by appln_auth
order by appln_auth
3) Include utility models or not? There is no single answer. There are offices where there are as many UMs as patents (e.g. CN), where there is no UM (e.g. EP) or where UMs are a minority (e. g. DE). Why not compute both, evaluate and then decide which approach makes more sense for you.

Best regards,
Martin / EPO PATSTAT
-------------------------------------------
Martin Kracker / EPO


deef
Posts: 5
Joined: Tue Jun 13, 2017 4:50 pm

Re: Annual Amount of Pat Family Applications by Country of O

Post by deef » Tue Jun 20, 2017 3:59 pm

Hi Martin,

Thank you for providing me with even more information. I'll see how far it takes me and report back later. I really appreciate all the help you and your team provide to help all of us out in the forums.

Kind regards,
deef


deef
Posts: 5
Joined: Tue Jun 13, 2017 4:50 pm

Re: Annual Amount of Pat Family Applications by Country of O

Post by deef » Thu Jun 22, 2017 3:27 pm

Hi again,

As for the coverage, I took the query suggested by Martin and modified it. Sadly, only 57% of the persons come with a country code. That's really frustrating since this potentially distorts the overall image significantly. In Japan, not even 1% of the persons that file applications have a country code. Am I doing something wrong? screenshot of some numbers if you're interested

This is my tweaked query for that:

Code: Select all

select appln_auth, count(*) as nbOfPersons
from tls201_appln a
join tls207_pers_appln pa on a.appln_id = pa.appln_id
join tls206_person p on pa.person_id = p.person_id
where appln_auth in -- ignore too small offices
   ('AR','AT','AU','BE','BG','BR','CA','CH','CN','CO','CS','CZ','DD','DE','DK','EA','EP',
   'ES','FI','FR','GB','GR','HK','HR','HU','ID','IE','IL','IN','IT','JP','KR','MA','MX','MY','NL','NO','NZ',
   'PE','PL','PT','RO','RU','SE','SG','SI','SK','SU','TR','TW','UA','US','UY','YU','ZA')
and appln_kind <> 'W'       -- exclude filings at the Receiving office
and year(appln_filing_date) >= 1995  -- limit to newer filings; situation might be different for other periods
and person_ctry_code <> ''   -- retrieve only persons WITH a country code
/* Note: I commented out the line above to get the total of persons per auth office */
and applt_seq_nr > 0      -- ignore persons which are only inventors
group by appln_auth
order by appln_auth
As for getting only the first publication/filing and the respective country code, I tried the query below. Contrary to what Martin suggested, this one doesn't include a Subquery which is why I report back here to double check. I tried various things to filter for the first publication of a docdb family using the attribute earliest_publn_year but without any success. Is my query viable? The numbers I get seem OK-ish, but given the coverage problem mentioned above I'm very doubtful about everything right now. :(

Code: Select all

SELECT earliest_filing_year AS Yr, COUNT(DISTINCT docdb_family_id) AS Cnt
FROM tls201_appln a
LEFT JOIN tls211_pat_publn b ON a.appln_id = b.appln_id
LEFT JOIN tls227_pers_publn pb ON b.pat_publn_id = pb.pat_publn_id
LEFT JOIN tls206_person p ON pb.person_id = p.person_id
WHERE earliest_filing_year BETWEEN 1995 AND 2015
AND pb.applt_seq_nr > 0
AND (ipr_type = 'PI' or ipr_type = 'UM')
AND person_ctry_code = 'DE' -- US, KR, (CN,) GB, ES, IT, FR, JP
AND a.appln_id = a.earliest_filing_id
GROUP BY earliest_filing_year
ORDER BY earliest_filing_year asc
Thanks in advance!

Best regards,
deef


mkracker
Posts: 120
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: Annual Amount of Pat Family Applications by Country of O

Post by mkracker » Fri Jun 23, 2017 4:59 pm

Hi deef,

Your query looks fine. You are counting all families where first filings have an applicant of a certain country published by one of its publications.
There is a slight difference to your earlier query, where you counted families which have family members (i.e. applications) whose most recent publications contain an applicant of a certain country.

And unfortunately you are right. We do not have the country code of applicants of Japanese patents.

Martin
-------------------------------------------
Martin Kracker / EPO


deef
Posts: 5
Joined: Tue Jun 13, 2017 4:50 pm

Re: Annual Amount of Pat Family Applications by Country of O

Post by deef » Fri Jun 23, 2017 6:00 pm

Hi Martin,

Thanks for verifying my query. I'll work with that one now.

Also thanks for confirming my findings concerning the lack of coverage esp. for Japan. Even though it's quite disappointing and I'll have to find a way to still get a representative set of data for my research. Maybe I'll have to limit the scope of the investigations.

Anyways, thank you very much for all your support, Martin. You helped me out a whole lot! Have a nice weekend.

Best regards,
deef


Post Reply