Two counts for different 'where' statements

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

cako
Posts: 8
Joined: Thu Feb 11, 2016 7:08 pm

Two counts for different 'where' statements

Post by cako » Fri Feb 12, 2016 5:01 pm

Hi,
In essence I would like to know if it is possible in PATSTAT Online to create two counts, where each represents their own set of where statements?

As an example, I would like to investigate Vestas patents after 2010, here I would like to (in a single run) count the number of unique families in general and the number of families which have a granted document.
In the code below I have tried doing it by creating two tls201 tables, however, no matter what I do they get connected such that one influences the other.
Do you have any ideas or suggestions?

Thanks,
Carsten

Code: Select all

SELECT Count(DISTINCT tls201_appln.docdb_family_id)   AS families, 
       Count(DISTINCT tls201_appln_b.docdb_family_id) AS granted_families 
FROM   tls201_appln 
       JOIN tls207_pers_appln 
         ON tls201_appln.appln_id = tls207_pers_appln.appln_id 
       JOIN tls206_person 
         ON tls207_pers_appln.person_id = tls206_person.person_id 
       JOIN tls207_pers_appln AS tls207_pers_appln_b 
         ON tls206_person.person_id = tls207_pers_appln_b.person_id 
       JOIN tls201_appln AS tls201_appln_b 
         ON tls207_pers_appln_b.appln_id = tls201_appln_b.appln_id 
WHERE  tls201_appln.earliest_filing_year >= '2010' 
       AND tls201_appln.ipr_type = 'PI' 
       AND tls201_appln_b.earliest_filing_year >= '2010' 
       AND tls201_appln_b.ipr_type = 'PI' 
       AND tls201_appln_b.granted = '1' 
       AND ( person_name LIKE '%vestas%' )   


mkracker
Posts: 120
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: Two counts for different 'where' statements

Post by mkracker » Wed Feb 17, 2016 8:51 am

It is possible to create 2 totally independent counts in one query, thanks to T-SQLs (which is Microsoft's SQL dialect) ability to have sub queries not only in the WHERE or FROM clause, but also in the SELECT clause.
(Just out of curiosity: Why would you like to have both counts in the same query, and not in 2 different queries?)

Here is how it might look like:

Code: Select all

SELECT  
    (select count(distinct docdb_family_id)
    from tls201_appln a 
    join tls207_pers_appln pa on a.appln_id = pa.appln_id
    join tls206_person p on pa.person_id = p.person_id
    where earliest_filing_year >= 2010 
    and ipr_type = 'PI'
    and pa.applt_seq_nr > 0  -- only applicants
    and person_name LIKE '%vestas%' 
    ) AS families,
    
    (select count(DISTINCT docdb_family_id)
    from tls201_appln a 
    join tls207_pers_appln pa on a.appln_id = pa.appln_id
    join tls206_person p on pa.person_id = p.person_id
    where earliest_filing_year >= 2010 
    and ipr_type = 'PI'
    and pa.applt_seq_nr > 0  -- only applicants
    and person_name LIKE '%vestas%'
    and granted = 1
    ) AS granted_families
Some notes:
  • I added a condition to make sure that only applicants are searched
  • Numerical values need not be quoted
  • The solution I proposed Is IMO easier to understand, but bloated (almost the same logic repeated in both sub queries). Run time increases gracefully with increasing numbers or size of family, while the run time of your query with multiple joins seem to increase much faster. Just try "sony" instead of "vestas"
  • How to best retrieve persons has been discussed in recent posts. You may need to analyse names first, probably using one of the standardized names.
  • Because you analyse a rather recent time period (>2010), there may many families where no application yet has been granted, but will be in the near future.
-------------------------------------------
Martin Kracker / EPO


cako
Posts: 8
Joined: Thu Feb 11, 2016 7:08 pm

Re: Two counts for different 'where' statements

Post by cako » Thu Feb 18, 2016 9:38 am

Thank you very much for your reply, it was something along those lines I had been looking for.
I had however, hoped that I avoided simply copying and pasting it, as I thought that it would double the processing time.

The reason why I would want it in one Query is because I need to go through fairly many Companies (~ 20) - and exporting the results in patstat online is a fairly tedious process, so I would prefer not exporting 40 times.

Also thanks for your extra tips, I was not aware that you could limit the results to only applicants using the applt_seq_nr, and I will look into the names.


Post Reply