patents grouped by NUTS and IPC code by year

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Lorenzo68
Posts: 4
Joined: Mon Sep 23, 2019 10:38 pm

patents grouped by NUTS and IPC code by year

Post by Lorenzo68 » Mon Sep 23, 2019 11:03 pm

Dear Patstat support,

I am trying to obtain a dataset containing the number of patents in a specific range of years and relevant to a specific IPC code (sub-classes, groups and sub-groups) attributed to NUTS cod
es (regions), e.g G06N

I have so far used the following query

SELECT p2.doc_std_name_id, p2.doc_std_name, p2.psn_sector, p2.nuts AS nuts_DOC_STD_NAME_ID, count (*) AS Count_nuts_DOC_STD_NAME_ID FROM tls206_person p2 WHERE p2.doc_std_name_id IN ( SELECT p1.doc_std_name_id FROM tls206_person p1 WHERE p1.person_id IN ( SELECT ap.person_id FROM sub-classes, groups and sub-groups. tls201_appln a INNER JOIN tls207_pers_appln ap ON ap.appln_id = a.appln_id AND a.appln_filing_year BETWEEN 2011 AND 2018 INNER JOIN tls224_appln_cpc c ON c.appln_id = a.appln_id WHERE a.appln_id < 900000000 AND a.appln_kind IN ('A','W') AND ap.applt_seq_nr > 0 AND c.cpc_class_symbol LIKE 'G06N %' OR c.cpc_class_symbol LIKE 'B65D2%' GROUP BY ap.person_id ) GROUP BY p1.doc_std_name_id ) AND p2.nuts_level BETWEEN 1 AND 8 GROUP BY p2.doc_std_name_id, p2.doc_std_name, p2.psn_sector, p2.nuts ORDER BY p2.doc_std_name_id, Count_nuts_DOC_STD_NAME_ID DESC, nuts_DOC_STD_NAME_ID ASC

But I am not sure wether a query on how many times the associated psn_id is labelled with a specific NUTS code would work best
here the alternative query
SELECT p2.psn_id, p2.psn_name, p2.nuts AS nuts_hrm_l2_id, count (*) AS Count_nuts_hrm_l2_id FROM tls206_person p2 WHERE p2.psn_id IN ( SELECT p1.psn_id FROM tls206_person p1 WHERE p1.person_id IN ( SELECT ap.person_id FROM tls201_appln a INNER JOIN tls207_pers_appln ap ON ap.appln_id = a.appln_id INNER JOIN tls224_appln_cpc c ON c.appln_id = a.appln_id WHERE a.appln_id < 900000000 AND a.appln_kind IN ('A','W') AND ap.applt_seq_nr > 0 AND c.cpc_class_symbol LIKE 'Y0%' GROUP BY ap.person_id ) GROUP BY p1.psn_id ) AND p2.nuts_level BETWEEN 1 AND 8 GROUP BY p2.psn_id, p2.psn_name, p2.nuts ORDER BY p2.psn_id, Count_nuts_hrm_l2_id DESC, nuts_hrm_l2_id ASC

My aim is to detect the presence of patents attributed to NUTS regions within a specific lapse of time and pertaining to a specific technological domain, going into detail of IPC sub-classes, groups and sub-groups.

Thanks in advance for your help

best

Lorenzo


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: patents grouped by NUTS and IPC code by year

Post by EPO / PATSTAT Support » Tue Sep 24, 2019 9:47 am

Hello Lorenzo,
your first query has somewhere a syntax error.
Can you elaborate a bit on how you would like the output of your query to be ?
I noted the following criteria: filed between 2011 and 2018, application kind code A or W (PCT's), CPC_codes G06N or B65D2 .
But how would you like the "output" to look ?
Assuming that the result would be an "excel like" table, what level of aggregation do you want ?
What are the attributes in the columns, what level of details (grouping) do you have in the rows ?
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Lorenzo68
Posts: 4
Joined: Mon Sep 23, 2019 10:38 pm

Re: patents grouped by NUTS and IPC code by year

Post by Lorenzo68 » Tue Sep 24, 2019 10:02 am

Hello thanks for your reply
I am trying to develop a final dataset that contains patents aggregation (count by psn_id or doc_standard_name) by NUTs 3 level (rows) and possibly IPC subgroups codes as columns (e.g. G06N and B65D2). I would also appreciate the possibility to execute this query and obtain data by year (better) or time lag (e.g. 2012-2019).
My original question was whether harmonisation via doc-standard_name or Psn_id is better. Or whether an even better harmonisation would work in this case. But as I understand from your answer there is a way to generate a query closer to the final desired output, if I get it correctly. In any case, also a dataset with patents by row (psn_id or else) and IPC codes and nuts codes (both as a count) by column would work.
thanks a lot
best
Lorenzo


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: patents grouped by NUTS and IPC code by year

Post by EPO / PATSTAT Support » Tue Sep 24, 2019 1:57 pm

Hello Lorenzo,

Whether you use the DOCDB_STD_NAME or the PSN_NAME will not make much difference as NUTS codes (level3) are only assigned to applicants & inventors in the EU for which we have addresses available in PATSTAT. That limits it more or less to EP applications, for which both name harmonisation approaches will give very good results.
Here is the query using the PSN name:

Code: Select all

SELECT psn_name, appln_filing_year,tls904_nuts.nuts, tls904_nuts.nuts_label,
count(distinct( case when left(cpc_class_symbol ,4)=  'G06N' then tls201_appln.appln_id end)) as "G06N",
count(distinct( case when left(cpc_class_symbol ,5)=  'B65D2'  then tls201_appln.appln_id end )) as "B65D2"
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls224_appln_cpc on tls201_appln .appln_id = tls224_appln_cpc.appln_id and (left(cpc_class_symbol ,4)=  'G06N' or left(cpc_class_symbol ,5)=  'B65D2' )
join tls904_nuts on tls206_person.nuts = tls904_nuts.nuts
WHERE appln_filing_year between 2005 and 2018
and appln_auth = 'EP'
and applt_seq_nr > 0 and invt_seq_nr = 0 -- excluding pure inventors and inventor/applicants 
and person_ctry_code  in (select  ctry_code from tls801_country where eu_member = 'Y')
group by  psn_name, appln_filing_year,tls904_nuts.nuts, tls904_nuts.nuts_label
ORDER BY psn_name, appln_filing_year
Here is the code for using the DOCDB_STD_NAME:

Code: Select all

SELECT doc_std_name, appln_filing_year,tls904_nuts.nuts, tls904_nuts.nuts_label,
count(distinct( case when left(cpc_class_symbol ,4)=  'G06N' then tls201_appln.appln_id end)) as "G06N",
count(distinct( case when left(cpc_class_symbol ,5)=  'B65D2'  then tls201_appln.appln_id end )) as "B65D2"
FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls906_person ON tls207_pers_appln.person_id = tls906_person.person_id
JOIN tls224_appln_cpc on tls201_appln .appln_id = tls224_appln_cpc.appln_id and (left(cpc_class_symbol ,4)=  'G06N' or left(cpc_class_symbol ,5)=  'B65D2' )
join tls904_nuts on tls906_person.nuts = tls904_nuts.nuts
WHERE appln_filing_year between 2005 and 2018
--and appln_auth = 'EP'
and applt_seq_nr > 0 and invt_seq_nr = 0 -- excluding pure inventors and inventor/applicants 
and person_ctry_code  in (select  ctry_code from tls801_country where eu_member = 'Y')
group by  doc_std_name, appln_filing_year,tls904_nuts.nuts, tls904_nuts.nuts_label
ORDER BY doc_std_name, appln_filing_year
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Lorenzo68
Posts: 4
Joined: Mon Sep 23, 2019 10:38 pm

Re: patents grouped by NUTS and IPC code by year

Post by Lorenzo68 » Wed Sep 25, 2019 9:12 am

Dear Support team
The query works perfectly
thank you very much for your support
Lorenzo


Lorenzo68
Posts: 4
Joined: Mon Sep 23, 2019 10:38 pm

Re: patents grouped by NUTS and IPC code by year

Post by Lorenzo68 » Fri Sep 27, 2019 10:37 am

Dear support team
with reference to the query received in this thread I wonder if the syntax allows a further specification: the co-occurrence of two CPC codes in the same doc_standard_ name or associated with the same psn_name, always at NUTS level. I wonder if the point in the syntax to integrate this change is the following line subtituting AND for OR:
"JOIN tls224_appln_cpc on tls201_appln .appln_id = tls224_appln_cpc.appln_id and (left(cpc_class_symbol ,4)= 'G06N' and left(cpc_class_symbol ,4)= 'G06F' )"

thanks
Lorenzo


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: patents grouped by NUTS and IPC code by year

Post by EPO / PATSTAT Support » Tue Oct 01, 2019 4:18 pm

Hello Lorenzo,
I expanded the query a bit by adding an extra column based on the extra tailor made table (via LEFT JOIN !) which only gives the appln_id's from those applications that have both classification codes: "G06N_AND_G06F .
I also had to add the "G06F" classification on the join of the CPC table in order to include also those applications.
This is far from elegant programming, but it illustrates that it can be done.
On a local installed PATSTAT version, I would definitely work with intermediate temporary tables instead of adding more conditions through sub-queries as done here.
If you want to extend this with further CPC combinations, simply add the extra tables in the FROM part and also add extra columns (CASE...) in the SELECT part of the query.

Code: Select all

SELECT psn_name, appln_filing_year,tls904_nuts.nuts, tls904_nuts.nuts_label,
count(distinct( case when left(cpc_class_symbol ,4)=  'G06N' then tls201_appln.appln_id end)) as "G06N",
count(distinct( case when left(cpc_class_symbol ,5)=  'B65D2'  then tls201_appln.appln_id end )) as "B65D2",
count(distinct( case when grouped_ipc.appln_id is not null then tls201_appln.appln_id end )) as "G06N_AND_G06F"

FROM tls201_appln JOIN tls207_pers_appln ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person ON tls207_pers_appln.person_id = tls206_person.person_id
JOIN tls224_appln_cpc on tls201_appln .appln_id = tls224_appln_cpc.appln_id and (left(cpc_class_symbol ,4)=  'G06N' or left(cpc_class_symbol ,5)=  'B65D2'   
																						or left( cpc_class_symbol ,4)= 'G06F'   )
join tls904_nuts on tls206_person.nuts = tls904_nuts.nuts 
left join (select distinct TAB_G06N.appln_id from tls224_appln_cpc TAB_G06N  join tls224_appln_cpc  TAB_G06F  on TAB_G06N.appln_id = TAB_G06F.appln_id and left(TAB_G06N.cpc_class_symbol ,4) =  'G06N' and left(TAB_G06F.cpc_class_symbol ,4) =  'G06F') as grouped_IPC
on tls201_appln.appln_id = grouped_ipc.appln_id

WHERE appln_filing_year between 2005 and 2018
and appln_auth = 'EP'
and applt_seq_nr > 0 and invt_seq_nr = 0 -- excluding pure inventors and inventor/applicants 
and person_ctry_code  in (select  ctry_code from tls801_country where eu_member = 'Y')
group by  psn_name, appln_filing_year,tls904_nuts.nuts, tls904_nuts.nuts_label
ORDER BY psn_name, appln_filing_year
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply