PATSTAT Online Y02E SQL query

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

aroh
Posts: 4
Joined: Wed Feb 26, 2020 8:59 pm

PATSTAT Online Y02E SQL query

Post by aroh » Wed Feb 26, 2020 9:09 pm

Hello,
I am using the PATSTAT Online database 2019 Autumn edition. I want to extract patent data by linking the 4 tables - 201, 224, 207 and 206. I want to see country-wise patents related to wind energy filed for the time period 1999 - 2018 tagged using the Y02E scheme. I want information related appln_id, earliest_filing_date, person_country_code, apple_auth.

Code: Select all

SELECT DISTINCT a.appln_id, a.appln_auth, a.appln_nr, a.appln_kind, a.earliest_filing_date, b.cpc_class_symbol, c.invt_seq_nr, d.person_name, d.person_ctry_code
FROM tls201_appln a JOIN tls224_appln_cpc b ON a.appln_id = b.appln_id
LEFT OUTER JOIN tls207_pers_appln c on a.appln_id = c.appln_id
LEFT OUTER JOIN tls206_person d ON c.person_id = d.person_id
WHERE b.cpc_class_symbol = 'Y02E  10/7%' AND c.invt_seq_nr > 0
ORDER BY a.appln_auth, a.earliest_filing_date, a.appln_id, c.invt_seq_nr, d.person_ctry_code
This is the code I have used and I wanted to know the changes required to the code.
Thank you.


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: PATSTAT Online Y02E SQL query

Post by EPO / PATSTAT Support » Thu Feb 27, 2020 2:42 pm

Hello Aroh,
there is a small SQL syntax error in your query.
A wildcard % comes allways in combination with LIKE. And the number of digits before the / is exaclty eight.
So: b.cpc_class_symbol like 'Y02E 10/7%' will work.
There are 2 spaces beteen the E and the 10.
Instead of a wildcard, you could also use:
WHERE left (b.cpc_class_symbol, 10) = 'Y02E 10/7' AND c.invt_seq_nr > 0

I also added a STRING_AGG to group the CPC codes into 2 field, which groups the rows.

Code: Select all

SELECT DISTINCT a.appln_id, a.appln_auth, a.appln_nr, a.appln_kind, a.earliest_filing_date, 
STRING_AGG (cast( b.cpc_class_symbol as NVARCHAR(MAX)), '; ') cpc_GROUP,
 c.invt_seq_nr, d.person_name, d.person_ctry_code
FROM tls201_appln a JOIN tls224_appln_cpc b ON a.appln_id = b.appln_id
LEFT OUTER JOIN tls207_pers_appln c on a.appln_id = c.appln_id
LEFT OUTER JOIN tls206_person d ON c.person_id = d.person_id
WHERE left (b.cpc_class_symbol, 10) =  'Y02E  10/7' AND c.invt_seq_nr > 0
and appln_filing_year between 1999 and 2018 
group by a.appln_id, a.appln_auth, a.appln_nr, a.appln_kind, a.earliest_filing_date, 
 c.invt_seq_nr, d.person_name, d.person_ctry_code
ORDER BY a.appln_auth, a.earliest_filing_date,  a.appln_id, c.invt_seq_nr, d.person_ctry_code
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


aroh
Posts: 4
Joined: Wed Feb 26, 2020 8:59 pm

Re: PATSTAT Online Y02E SQL query

Post by aroh » Thu Feb 27, 2020 4:53 pm

Hello,
Thank you for your prompt response and help.
If I use - WHERE left (b.cpc_class_symbol, 10) = 'Y02E 10/7' AND c.invt_seq_nr > 0 , will it allow me to select all the patent applications tagged with Y02E10/70-766 under the Y02E tag for wind energy?
Also, can you please explain the function of STRING_AGG in the code? I haven't seen its use in the EPO document 'Using PATSTAT with SQL for beginners'. Thank you once again.


aroh
Posts: 4
Joined: Wed Feb 26, 2020 8:59 pm

Re: PATSTAT Online Y02E SQL query

Post by aroh » Thu Feb 27, 2020 5:15 pm

Hello,
This is in continuation to my earlier query. There are a lot of patent applications where the inventor country is not found. I saw the following code in the PATSTAT manual. Will the problem of not finding inventor country be solved using fractional count?

Code: Select all

SELECT d.person_ctry_code, SUM(tot_in_ctry/tot_in_patent) AS fractional_count
FROM 
         (SELECT 
                      t.appln_id, 
                      isnull (t1.person_ctry_code, ' ') AS person_ctry_code,
                      isnull (t1.tot_in_ctry, 1) AS tot_in_ctry,
                      isnull (t2.tot_in_patent, 1) AS tot_in_patent
                      FROM 
                      our_sample t                   
                 LEFT OUTER JOIN
           (SELECT
                     c.appln_id,
                     d.person_ctry_code,
                     COUNT (d.person_id) AS tot_in_ctry
            FROM
                      tls207_pers_appln c
                      INNER JOIN tls206_person d ON c.person_id = d.person_id 
           WHERE 
                     c.invt_seq_nr > 0
           GROUP BY c.appln_id, person_ctry_code
           ) t1 ON t.appln_id = t1.appln_id 
                     LEFT OUTER JOIN
           (SELECT
                     appln_id, MAX(invt_seq_nr) AS tot_in_patent
            FROM 
                    tls207_pers_appln
            GROUP BY appln_id HAVING MAX(invt_seq_nr) > 0
            )  t2 ON t.appln_id = t2.appln_id
            ) our_sample_with_country        
GROUP BY person_ctry_code
ORDER BY SUM(tot_in_ctry/tot_in_patent) DESC;


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: PATSTAT Online Y02E SQL query

Post by EPO / PATSTAT Support » Fri Feb 28, 2020 10:05 am

aroh wrote:
Thu Feb 27, 2020 4:53 pm
Hello,
Thank you for your prompt response and help.
If I use - WHERE left (b.cpc_class_symbol, 10) = 'Y02E 10/7' AND c.invt_seq_nr > 0 , will it allow me to select all the patent applications tagged with Y02E10/70-766 under the Y02E tag for wind energy?
Also, can you please explain the function of STRING_AGG in the code? I haven't seen its use in the EPO document 'Using PATSTAT with SQL for beginners'. Thank you once again.
Yes, Y02E10/766 will be included with the above WHERE clause. The "PATSTAT with SQL for beginners" only covers a fraction of the possible SQL functions. STRING_AGG is a fairly new function which comes in handy to group rows in the output. There are many good SQL-learning websites that provide sample queries. I often use https://www.w3schools.com/sql/default.asp to look up SQL syntax.
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: PATSTAT Online Y02E SQL query

Post by EPO / PATSTAT Support » Fri Feb 28, 2020 12:13 pm

aroh wrote:
Thu Feb 27, 2020 5:15 pm
Hello,
This is in continuation to my earlier query. There are a lot of patent applications where the inventor country is not found. I saw the following code in the PATSTAT manual. Will the problem of not finding inventor country be solved using fractional count?

Code: Select all

SELECT d.person_ctry_code, SUM(tot_in_ctry/tot_in_patent) AS fractional_count
FROM 
         (SELECT 
                      t.appln_id, 
                      isnull (t1.person_ctry_code, ' ') AS person_ctry_code,
                      isnull (t1.tot_in_ctry, 1) AS tot_in_ctry,
                      isnull (t2.tot_in_patent, 1) AS tot_in_patent
                      FROM 
                      our_sample t                   
                 LEFT OUTER JOIN
           (SELECT
                     c.appln_id,
                     d.person_ctry_code,
                     COUNT (d.person_id) AS tot_in_ctry
            FROM
                      tls207_pers_appln c
                      INNER JOIN tls206_person d ON c.person_id = d.person_id 
           WHERE 
                     c.invt_seq_nr > 0
           GROUP BY c.appln_id, person_ctry_code
           ) t1 ON t.appln_id = t1.appln_id 
                     LEFT OUTER JOIN
           (SELECT
                     appln_id, MAX(invt_seq_nr) AS tot_in_patent
            FROM 
                    tls207_pers_appln
            GROUP BY appln_id HAVING MAX(invt_seq_nr) > 0
            )  t2 ON t.appln_id = t2.appln_id
            ) our_sample_with_country        
GROUP BY person_ctry_code
ORDER BY SUM(tot_in_ctry/tot_in_patent) DESC;
What is the source document for the code ?
Maybe you could restrict the analysis to persons with no missing country data ? (which can indeed yield biased estimates as the missing data is not random)
Weighting to compensate for missing data is a possible solution, what approach to use is up to researchers (and the tools they have available...)
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


aroh
Posts: 4
Joined: Wed Feb 26, 2020 8:59 pm

Re: PATSTAT Online Y02E SQL query

Post by aroh » Fri Feb 28, 2020 3:03 pm

Hello,
Thank you for the information. I have used the paper titled "An Introduction to the Patstat Database with Example Queries", retrieved from https://onlinelibrary.wiley.com/doi/ful ... 8462.12073


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: PATSTAT Online Y02E SQL query

Post by EPO / PATSTAT Support » Fri Feb 28, 2020 4:35 pm

aroh wrote:
Fri Feb 28, 2020 3:03 pm
Hello,
Thank you for the information. I have used the paper titled "An Introduction to the Patstat Database with Example Queries", retrieved from https://onlinelibrary.wiley.com/doi/ful ... 8462.12073
No, the weighing in this code is not a sollution for missing person country codes.
Rather have a look at this - chapter 3.4.
Attachments
new patent indicator.pdf
(238.54 KiB) Downloaded 204 times
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Post Reply