IPC queries lead to empty row

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

alex_c
Posts: 1
Joined: Mon Feb 06, 2023 10:21 am

IPC queries lead to empty row

Post by alex_c » Mon Feb 06, 2023 11:24 am

Dear support team,

As a junior researcher, I am looking at the evolution of patents in the field of eco-innovation in china (and more specific localities in a second step). To find patents in the realm of eco-inno, I used a working by the OECD giving at the end a list of environmental-related IPC (https://read.oecd-ilibrary.org/environm ... w-en#page1).

The issue is that with the above code I developed, for some IPC I obtain usable results, but for other, the rows remain empty, and cannot figure out why.

Code: Select all

 SELECT   appln_filing_year, count (a.appln_id) as 'number of application' 
FROM tls201_appln a
JOIN tls209_appln_ipc i ON a.appln_id = i.appln_id
join tls224_appln_cpc c on a.appln_id = c.appln_id
join tls207_pers_appln x on a.appln_id = x.appln_id
join tls206_person pers on x.person_id = pers.person_id
--
WHERE (i.ipc_class_symbol = 'H05K  13/04' or i.ipc_class_symbol = 'B01D  53/34')
AND a.appln_filing_year BETWEEN 2000 AND 2019
and person_ctry_code ='cn'

GROUP BY a.appln_filing_year
ORDER BY a.appln_filing_year asc 
I would afterward add more IPC or CPC, but let only a few to be more clear. For instance, I obtain empty row (even not 0) with those ICP : B01D 53/36, Y02E 40/10 and so on ... (tough I look at the empty space before the slash)

Could you please enlighten me, or redirect me to an already discussed topic, I look after it on the forum but couldn't find anything.

If I may also ask you, how to identify Chinese cities in PATSTAT, I would be also very glad. I tried with the ZiP code or the address or other items from table 206, but that was not very conclusive.

I am sorry for the questions and I thank you very much.

Best,
Alex


Elizabethgabor
Posts: 5
Joined: Tue Feb 07, 2023 9:06 am

Re: IPC queries lead to empty row

Post by Elizabethgabor » Sat Feb 11, 2023 9:48 am

alex_c wrote:
Mon Feb 06, 2023 11:24 am
Dear support team,

As a junior researcher, I am looking at the evolution of patents in the field of eco-innovation in china (and more specific localities in a second step). To find patents in the realm of eco-inno, I used a working by the OECD giving at the end a list of environmental-related IPC (https://read.oecd-ilibrary.org/environm ... w-en#page1).

The issue is that with the above code I developed, for some IPC I obtain usable results, but for other, the rows remain empty, and cannot figure out why.

Code: Select all

 SELECT   appln_filing_year, count (a.appln_id) as 'number of application' 
FROM tls201_appln a
JOIN tls209_appln_ipc i ON a.appln_id = i.appln_id
join tls224_appln_cpc c on a.appln_id = c.appln_id
join tls207_pers_appln x on a.appln_id = x.appln_id
join tls206_person pers on x.person_id = pers.person_id
--
WHERE (i.ipc_class_symbol = 'H05K  13/04' or i.ipc_class_symbol = 'B01D  53/34')
AND a.appln_filing_year BETWEEN 2000 AND 2019
and person_ctry_code ='CN'

GROUP BY a.appln_filing_year
ORDER BY a.appln_filing_year asc 
I would afterwards add more IPC or CPC, but let only a few to be more clear. For instance, I obtain empty row (even not 0) with those ICP : B01D 53/36, Y02E 40/10 and so on ... (tough I look at the empty space before the slash)

Could you please enlighten me, or redirect me to an already discussed topic, I look after it on the forum but couldn't find anything.

If I may also ask you, how to identify Chinese cities in PATSTAT, I would be also very glad. I tried with the ZiP code or the address or other items from table 206, but that was not very conclusive.

I am sorry for the questions and I thank you very much.

Best,
Alex
You are running into an issue with your SQL query not returning any results for certain IPC codes. This could be because the IPC codes you are searching for are not present in the data or because the data is not correctly joined with the other tables.

Here are a few suggestions that might help you solve this issue:

Verify the IPC codes: Make sure that the IPC codes you are searching for are present in the data by running a separate query to check for the presence of these codes. You can do this by running a query like this:
less
Copy code

Code: Select all

SELECT i.ipc_class_symbol
FROM tls209_appln_ipc i
WHERE i.ipc_class_symbol IN ('H05K  13/04', 'B01D  53/34', 'B01D  53/36', 'Y02E  40/10')
This will give you a list of all the IPC codes present in the tls209_appln_ipc table and you can check if the codes you are searching for are present.

Check for data accuracy: Make sure that the data is correctly joined with the other tables. You can do this by running a query to check if there are any missing values in the joined tables. For example:
sql
Copy code

Code: Select all

SELECT a.appln_id
FROM tls201_appln a
LEFT JOIN tls209_appln_ipc i ON a.appln_id = i.appln_id
LEFT JOIN tls224_appln_cpc c on a.appln_id = c.appln_id
LEFT JOIN tls207_pers_appln x on a.appln_id = x.appln_id
LEFT JOIN tls206_person pers on x.person_id = pers.person_id
WHERE i.appln_id IS NULL OR c.appln_id IS NULL OR x.appln_id IS NULL OR pers.person_id IS NULL
This will show you any missing values in the joined tables and help you identify the source of the problem.

Regarding the second part of your question, identifying Chinese cities in PATSTAT can be a challenge as the data is typically stored at the country level. You may want to consider using the person_address column in the tls206_person table to extract city information, but keep in mind that the information may not be very accurate. Another option would be to use external data sources such as geographic information systems (GIS) to map the addresses to their corresponding cities.

I hope these suggestions help. Let me know if there's anything else I can assist with.


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: IPC queries lead to empty row

Post by EPO / PATSTAT Support » Tue Feb 14, 2023 2:49 pm

I would like to add a couple of items on Elizabeths' answer (which is correct).
The OECD document is from 2014; IPC and CPC codes migh have changed. But the Y-tags in the CPC scheme have absolutely changed and have been pruned. So you will also need to check if those Y-tags are still valid or maybe need to be changed or replaced by other classification(s).

In your SQL, you join tls201_appln with the IPC and CPC table although you don't use the CPC table neither CPC data in your WHERE or SELECT clause. So I would assume you don't need it. But if you join to both tables, the data base needs to have matching data in your WHERE clause for both tables. If there is no CPC data (whatever code), the application would not be taken into account.

One also need to be careful when using IPC and CPC codes.
i.ipc_class_symbol = 'H05K 13/04' is valid, and there are no subgroups.
You can observe that when you run the query below.

Code: Select all

SELECT distinct ipc_class_symbol
FROM tls209_appln_ipc
where ipc_class_symbol LIKE 'H05K  13/04%'
-- observer the % wildcard with the LIKE statement 
But if you run the same query on the CPC table, you will see that there are plenty of subsgroups in "H05K 13/04"; and you will need to use the LIKE statement to be sure you get all of them.
https://worldwide.espacenet.com/patent/ ... =H05K13/04

Code: Select all

SELECT distinct cpc_class_symbol
FROM tls224_appln_cpc
where cpc_class_symbol LIKE 'H05K  13/04%'
And similar for CPC "B01D 53/34"

Code: Select all

SELECT distinct cpc_class_symbol
FROM tls224_appln_cpc
where cpc_class_symbol LIKE 'B01D  53/34%'
But BEWARE of CPC "B01D 53/34". As CPC code, it has subgroups that do not start with "B01D 53/34"
See: https://worldwide.espacenet.com/patent/ ... =B01D53/34
ESPACENET has an elegant solution for this problem via the "/LOW" operator. I allows you to select all the codes
(subgroups) when you tick "B01D53/34". But in SQL you will need to be sure that all the subgroups are included in your query structure via "ORs" or "LIKE"; that means, checking them manually and adding them to your WHERE clause.
So taking all the above into account, have a look at this:

Code: Select all

SELECT   appln_filing_year, count (distinct (a.appln_id)) as 'number of application' 
--distinct to avoid double counts when an application has >1 CN inventor or applicant,
-- only single applications are counted.
FROM tls201_appln a
join tls207_pers_appln x on a.appln_id = x.appln_id
join tls206_person pers on x.person_id = pers.person_id
WHERE
(
a.appln_id in (Select appln_id from tls209_appln_ipc where ipc_class_symbol = 'H05K  13/04' 
or ipc_class_symbol in('B01D  53/34', 'B01D  53/38') or ipc_class_symbol like 'B01D  53/4%'or ipc_class_symbol like 'B01D  53/5%' or ipc_class_symbol like 'B01D  53/6%'
or ipc_class_symbol like 'B01D  53/7%' or ipc_class_symbol like 'B01D  53/8%' or ipc_class_symbol like 'B01D  53/9%')

or 

a.appln_id in (Select appln_id from tls224_appln_cpc where cpc_class_symbol LIKE 'H05K  13/04%' 
or cpc_class_symbol like  'B01D  53/34' or cpc_class_symbol =  'B01D  53/38' or  cpc_class_symbol like 'B01D  53/4%'or cpc_class_symbol like 'B01D  53/5%' or cpc_class_symbol like 'B01D  53/6%'
or cpc_class_symbol like 'B01D  53/7%' or cpc_class_symbol like 'B01D  53/8%' or cpc_class_symbol like 'B01D  53/9%' )
)
AND a.appln_filing_year BETWEEN 2000 AND 2019
and person_ctry_code ='CN'
GROUP BY a.appln_filing_year
ORDER BY a.appln_filing_year asc 
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


alex_c
Posts: 1
Joined: Mon Feb 06, 2023 10:21 am

Re: IPC queries lead to empty row

Post by alex_c » Thu Feb 16, 2023 2:40 pm

First thanks a lot both of you!!

I think I will definitively use your strategy for chinese cities Elizabeth, it will also allow me to classify applications per suburbs which can be interesting

Regarding the changes in ipc/cpc, do you have any document to keep track of the change in ipc cpc, because as the number of ipc/cpc is consequent a did a try using AI but would like to check myself too and couldn't find a document on. I found some documents with ipc but not precising the change accross years.

Plus, if I correctly understood, when I remove the cpc table, I could write the code as below, however when I am using the cpc, I should think about the subgroups and add a "%" or did I misunderstand something ?:

Thank you again

when use only ipc:

Code: Select all

SELECT   appln_filing_year, count (distinct (a.appln_id)) as 'number of application' 
--distinct to avoid double counts when an application has >1 CN inventor or applicant,
-- only single applications are counted.
FROM tls201_appln a
join tls207_pers_appln x on a.appln_id = x.appln_id
join tls206_person pers on x.person_id = pers.person_id
WHERE
(
a.appln_id in (Select appln_id from tls209_appln_ipc 

where (
---- Environmental management
-------1.1 Air pollution abatement
-------1.1.1 Emissions abatement from stationary sources
------- Post-combustion technologies

(ipc_class_symbol  between 'B01d  53/34' and  'B01d  53/72') 	
or (ipc_class_symbol like 'F23G   7/06') 
or (ipc_class_symbol  like 'F23J15')
or (ipc_class_symbol  like 'F42B 1/20')

----- Integrated technologies
or (ipc_class_symbol  like 'C21B   7/22')
or (ipc_class_symbol  like 'C21C   5/38')
or (ipc_class_symbol  like 'F23D  14/12')
or (ipc_class_symbol  like 'F23C  10/02')
or (ipc_class_symbol  like 'F23C10')
)
)
AND a.appln_filing_year BETWEEN 1985 AND 2019
and person_ctry_code ='CN'
GROUP BY a.appln_filing_year
ORDER BY a.appln_filing_year asc
when use ipc and cpc:

Code: Select all

SELECT appln_filing_year, COUNT(DISTINCT a.appln_id) AS 'number of applications'
FROM tls201_appln AS a
JOIN tls207_pers_appln AS x ON a.appln_id = x.appln_id
JOIN tls206_person AS pers ON x.person_id = pers.person_id
WHERE
(
a.appln_id IN (
SELECT appln_id
FROM tls209_appln_ipc
WHERE ipc_class_symbol LIKE 'H05K  13/04%'
OR ipc_class_symbol LIKE 'B01D  53/34%'
OR ipc_class_symbol LIKE 'B01D  53/38%'
OR ipc_class_symbol LIKE 'B01D  53/4%'
OR ipc_class_symbol LIKE 'B01D  53/5%'
OR ipc_class_symbol LIKE 'B01D  53/6%'
OR ipc_class_symbol LIKE 'B01D  53/7%'
OR ipc_class_symbol LIKE 'B01D  53/8%'
OR ipc_class_symbol LIKE 'B01D  53/9%'
)
OR
a.appln_id IN (
SELECT appln_id
FROM tls224_appln_cpc
WHERE cpc_class_symbol LIKE 'H05K  13/04%'
OR cpc_class_symbol LIKE 'B01D  53/34/LOW'
OR cpc_class_symbol LIKE 'B01D  53/38/LOW'
OR cpc_class_symbol LIKE 'B01D  53/4/LOW%'
OR cpc_class_symbol LIKE 'B01D  53/5/LOW%'
OR cpc_class_symbol LIKE 'B01D  53/6/LOW%'
OR cpc_class_symbol LIKE 'B01D  53/7/LOW%'
OR cpc_class_symbol LIKE 'B01D  53/8/LOW%'
OR cpc_class_symbol LIKE 'B01D  53/9/LOW%'
)
)
AND a.appln_filing_year BETWEEN 1985 AND 2019
AND pers.person_ctry_code = 'CN'
GROUP BY a.appln_filing_year
ORDER BY a.appln_filing_year ASC

  
Last edited by alex_c on Thu Feb 16, 2023 3:22 pm, edited 2 times in total.


Post Reply