Applications in B25J/Y10S but not B60W/B60K/B60L

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Joachim
Posts: 7
Joined: Tue May 23, 2017 9:56 am

Applications in B25J/Y10S but not B60W/B60K/B60L

Post by Joachim » Tue May 29, 2018 8:56 am

Dear PATSTAT user forum,

I am looking into robot technologies and am starting out my analysis in the B25J and Y10S classes to start things off.

However, I am already now aware that for my purposes I am not interested in robotics technologies in relation to cars. Accordingly, I woulk like to eliminate the data subset being classified both in B25J/Y10S and one of B60W, B60K, B60L and B60Y.
For this purpose I used the NOT IN operator but when I use the statistics option in patstat online and select the parameters as CPC or IPC vs Applicant, there are still Bubbles corresponding to the classes I tried to eliminate (B60W, B60K, B60L and B60Y).

The database used is Patstat online spring 2018 edition and my search query is the following:

Select
tls201_appln.earliest_filing_year,
tls201_appln.appln_id

FROM tls201_appln
JOIN tls207_pers_appln
ON tls201_appln.appln_id = tls207_pers_appln.appln_id
JOIN tls206_person
ON tls206_person.person_id = tls207_pers_appln.person_id
JOIN tls226_person_orig
ON tls206_person.person_id = tls226_person_orig.person_id
JOIN tls209_appln_ipc
ON tls209_appln_ipc.appln_id = tls201_appln.appln_id
JOIN tls224_appln_cpc
ON tls224_appln_cpc.appln_id = tls201_appln.appln_id

WHERE
IPR_type = 'PI'
And tls201_appln.earliest_filing_year between '2010' and '2020'
AND (LEFT(ipc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'Y10S')
AND LEFT(ipc_class_symbol, 4) not in ('B60W','B60L','B60K')
AND LEFT(cpc_class_symbol, 4) not in ('B60W','B60L','B60K', 'B60Y')

GROUP BY
tls201_appln.earliest_filing_year,
tls201_appln.appln_id

ORDER by
tls201_appln.earliest_filing_year


I know that the 'NOT IN' reduced the total search hits, but I must either be missing something or the patstat statistics panel might not belimited to the particular dataset returned by the search query?

Can anyone offer any insights as to what might have gone wrong?

Best regards,
Joachim


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Applications in B25J/Y10S but not B60W/B60K/B60L

Post by EPO / PATSTAT Support » Tue May 29, 2018 9:58 am

Hello Joachim,

your query does not do what I think you want it to do.
Basically, what you intend to do is to exclude a subset from your applications that are defined by the first conditions "AND (LEFT(ipc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'Y10S') "

So you have to exclude a subset of applications, and not a subset of IPC or CPC symbols from the result list.
There are various ways of doing this. Using SQL "EXISTS" would be one, but I personally prefer the "IN" clause at application level. You also should keep in mind that when you join CPC as well as IPC tables, that conditions on both tables need to be fulfilled- meaning that your result list will have only applications that have CPC as well as IPC codes.
I think that what you want is that or the IPC or the CPC conditions are fulfilled. Therefore you should use a LEFT JOIN to link the classification tables.
Here is a re-worked query. Have a look at it, and see if it does what you want. (There might be other solutions.)

Code: Select all

Select
 tls201_appln.earliest_filing_year, 
 tls201_appln.appln_id

 FROM tls201_appln JOIN tls207_pers_appln
	ON tls201_appln.appln_id = tls207_pers_appln.appln_id
 JOIN tls226_person_orig
	 ON tls207_pers_appln.person_id = tls226_person_orig.person_id
 left JOIN tls209_appln_ipc
	ON tls209_appln_ipc.appln_id = tls201_appln.appln_id
 left JOIN tls224_appln_cpc 
	ON tls224_appln_cpc.appln_id = tls201_appln.appln_id

 WHERE
 IPR_type = 'PI'
 And tls201_appln.earliest_filing_year between '2010' and '2020'
 AND (LEFT(ipc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'Y10S') 
 
 
 AND tls201_appln.appln_id not in (Select appln_id from  tls209_appln_ipc where LEFT(ipc_class_symbol, 4) in ('B60W','B60L','B60K'))
 AND tls201_appln.appln_id not in (Select appln_id from  tls224_appln_cpc where LEFT(cpc_class_symbol, 4) in ('B60W','B60L','B60K', 'B60Y'))

 GROUP BY
 tls201_appln.earliest_filing_year,
 tls201_appln.appln_id

 ORDER by
 tls201_appln.earliest_filing_year
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Joachim
Posts: 7
Joined: Tue May 23, 2017 9:56 am

Re: Applications in B25J/Y10S but not B60W/B60K/B60L

Post by Joachim » Wed May 30, 2018 11:46 am

Thank you for the quick reply,

Your assumptions are correct, I am indeed interested in documents classified in either IPC B25J or CPC B25J/Y10S.

The re-worked query worked like a charm. It did not change the top 20 Applicants significantly, but certainly rearranged their internal rankings.

At the moment I am exploring my options, so I will likely change post several other questions along the way.

But until then, thank you again for your help.


Joachim
Posts: 7
Joined: Tue May 23, 2017 9:56 am

Re: Applications in B25J/Y10S but not B60W/B60K/B60L

Post by Joachim » Mon Jun 04, 2018 10:14 am

Ok, so after having played around a little with the query and specifying the Y10S to Y10S 901 (to exclude all the irrelevant US tech classes) I would like to take a look at geographical and Industrial sector distribution of robotics technology.
Namely, I would like to examine how the robotics industry in DK matches up against the world and if there are differences in filing patterns and industrial sector distribution.

For that purpose I am including NACE codes as well as HAN_NAME. I am unsure though how much I can trust the HAN_NAME as there appears to be quite a few NULL hits in the HAN_NAME groupings. Are there other/better options to group by applicant name?

Also, I would appreciate it if anyone could look through my query to see if the query itself is actually doing what I want it to do?


Select top 100
tls902_IPC_NACE2.nace2_descr,
count(distinct tls201_appln.appln_id)
--count(distinct tls201_appln.inpadoc_family_id)

FROM tls201_appln
left JOIN tls207_pers_appln
ON tls201_appln.appln_id = tls207_pers_appln.appln_id
left JOIN tls226_person_orig
ON tls207_pers_appln.person_id = tls226_person_orig.person_id
left JOIN tls206_person
ON tls207_pers_appln.person_id = tls206_person.person_id
LEFT JOIN tls229_appln_nace2
ON tls201_appln.appln_id = tls229_appln_nace2.appln_id
LEFT JOIN tls902_ipc_nace2
ON tls229_appln_nace2.nace2_code = tls902_ipc_nace2.nace2_code
left JOIN tls209_appln_ipc
ON tls209_appln_ipc.appln_id = tls201_appln.appln_id
left JOIN tls224_appln_cpc
ON tls224_appln_cpc.appln_id = tls201_appln.appln_id

WHERE
IPR_type = 'PI'
--AND tls201_appln.appln_auth = 'DK'
AND tls226_person_orig.person_ctry_code = 'DK'
AND tls207_pers_appln.applt_seq_nr >= 1
And tls201_appln.earliest_filing_year between '2000' and '2020'
AND (LEFT(ipc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 8) = 'Y10S 901')
--AND tls201_appln.appln_id not in (Select appln_id from tls209_appln_ipc where LEFT(ipc_class_symbol, 4) in ('B60W','B60L','B60K'))
--AND tls201_appln.appln_id not in (Select appln_id from tls224_appln_cpc where LEFT(cpc_class_symbol, 4) in ('B60W','B60L','B60K', 'B60Y'))

GROUP BY
tls902_IPC_NACE2.nace2_descr


ORDER by
count(distinct tls201_appln.appln_id) DESC
--count(distinct tls201_appln.inpadoc_family_id) DESC


-- tls201_appln.earliest_filing_year,


Also eventually I will be looking into which companies are actually filing in DK. However as DK is part of the EP cooperation it would not be particularly accurate to just look for appln_auth = ‘DK’ as I would miss out on all the EP applications validated in DK, is that correct?
Is there a way for me to include the EP’s that are validated in DK?
And is it possible to write a query that combines the validated EP’s and applications filed in DK into one result?

So to sum up;

1) Is there an alternative to HAN_NAME applicant harmonization that might produce fewer null results?
2) Is the query above suitable for recovering the number of applications with a DK applicant within B25J and Y10S 901 distributed across NACE sectors?
3) Is there a way to combine a count of applications filed in DK and a count of applications validated in DK through an EP application into one result?

The database used is Patstat online spring 2018 edition.

Thanks in advance,
Joachim


EPO / PATSTAT Support
Posts: 425
Joined: Thu Feb 22, 2007 5:33 pm
Contact:

Re: Applications in B25J/Y10S but not B60W/B60K/B60L

Post by EPO / PATSTAT Support » Fri Jun 15, 2018 9:19 am

Hello Joachim,
Your query seems fine to me as such. But I am not sure why you join with tls226 as well as 206 ? TLS226 is only useful for analysis where you need to have the "address data" more tagged in detail - for as far as such data is available.-
Keep in mind that the query produces a very general count, across all patent authorities. So it gives you absolutely no granularity based on application authority or filing year (and there will be nothing for 2018-2020).
You did not use the weight factors, which is fine as long as you keep in mind that the total numbers of applications are not the totals you will see in the result. But it gives a clear idea (unbiased by the number of assigned IPC codes) on how the industrial sectors compare to each other.

On the HAN names, you should normally not get any NULL values because all applicant names have a HAN name. I added a sample query (based on yours) to illustrate this for other readers.
You can easily change HAN_NAME through PSN, DOS_STD_NAME or the original PERSON_NAME to see the differences.

Code: Select all

Select top 100
han_name,
 count(distinct tls201_appln.appln_id)
FROM tls201_appln 
 left JOIN tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
 left JOIN tls906_person ON tls207_pers_appln.person_id = tls906_person.person_id 
 LEFT JOIN tls229_appln_nace2
 ON tls201_appln.appln_id = tls229_appln_nace2.appln_id
 LEFT JOIN tls902_ipc_nace2
 ON tls229_appln_nace2.nace2_code = tls902_ipc_nace2.nace2_code
 left JOIN tls209_appln_ipc
 ON tls209_appln_ipc.appln_id = tls201_appln.appln_id
 left JOIN tls224_appln_cpc 
 ON tls224_appln_cpc.appln_id = tls201_appln.appln_id
 WHERE
 IPR_type = 'PI'
 AND tls906_person.person_ctry_code = 'DK'
 AND tls207_pers_appln.applt_seq_nr >= 1
 And tls201_appln.earliest_filing_year between '2000' and '2020'
 AND (LEFT(ipc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 8) = 'Y10S 901') 
GROUP BY
 han_name
ORDER by
 count(distinct tls201_appln.appln_id) DESC
About missing out on EP granted applications that have entered DK national phase.
(Or double counting application.) If you don't limit your search by restricting counts to certain authorities, then you will always have (a lot of) double counts.
It is not wrong as such, but it takes into account the "geographical" coverage of the protection.
Imagine a DK applicant, filing a priority in Denmark, followed by a subsequent PCT filing at WIPO, followed by an EP filing (based on the PCT), followed by a granted patent, that enters national phase back in Denmark, and gets a new publication (with kind code T) --> the "invention" will be counted 4 times.
The double counting can always be avoided by counting families instead of applications. (see attached excel)

About the "undercounting" granted patents, this becomes even more complicated because patent authorities have their own (sometimes changing) rules on what will be "printed" or just "registered" once they have a granted EP or PCT patent. To complicate matters, one also has to take into account the "London agreement"... resulting in "not being printed unless...."

There is no short cut approach to count granted applications that will work for all countries, one has to look at country by country.
For granted EP and PCT patents in Denmark: Denmark will assign it's own application number / kind code combination to all granted EP and PCT applications.
So you can simply use " granted = 1 and appln_auth = 'DK' " as criteria.
See excel sheet for examples on the names and "granted" patents.
DK_robots.xlsx
(21.86 KiB) Downloaded 241 times
There is a short cut approach to count "families", by counting all families that have a granted "EP" applications for which renewal fees have been paid, as well as all the families that contain a "granted" application at national level. Here is the same example but for GB, if you run it, then you will see that the list contains granted EP applications for which renewal fees have been paid in GB, but you do not find any GB applications belonging to the same family. (because there is no available data) If you would use the same approach as for DK, then you would miss out on all gratend EP applications validated in the GB!

Code: Select all

Select distinct tls201_appln.appln_id,appln_auth,[appln_nr],[appln_kind],[appln_filing_date],[appln_nr_epodoc]
      ,[appln_nr_original],[ipr_type],[receiving_office],[internat_appln_id],[int_phase]
      ,[reg_phase],[nat_phase],[granted],[docdb_family_id],[inpadoc_family_id], fee_country
FROM tls201_appln 
 left JOIN tls207_pers_appln on tls201_appln.appln_id = tls207_pers_appln.appln_id
 left JOIN tls906_person ON tls207_pers_appln.person_id = tls906_person.person_id 
 LEFT JOIN tls229_appln_nace2
 ON tls201_appln.appln_id = tls229_appln_nace2.appln_id
 LEFT JOIN tls902_ipc_nace2
 ON tls229_appln_nace2.nace2_code = tls902_ipc_nace2.nace2_code
 left JOIN tls209_appln_ipc
 ON tls209_appln_ipc.appln_id = tls201_appln.appln_id
 left JOIN tls224_appln_cpc 
 ON tls224_appln_cpc.appln_id = tls201_appln.appln_id
 left join tls231_inpadoc_legal_event on tls201_appln.appln_id = tls231_inpadoc_legal_event.appln_id
 WHERE
 IPR_type = 'PI'
 AND tls906_person.person_ctry_code = 'GB'
 AND tls207_pers_appln.applt_seq_nr >= 1
 And tls201_appln.earliest_filing_year between '2000' and '2020'
 AND (LEFT(ipc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 4) = 'B25J' OR LEFT(cpc_class_symbol, 8) = 'Y10S 901') 
 and granted = 1	
 and (appln_auth = 'GB' or (appln_auth = 'EP' and fee_country = 'GB' and event_code = 'pgfp'))
 order by docdb_family_id
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


Joachim
Posts: 7
Joined: Tue May 23, 2017 9:56 am

Re: Applications in B25J/Y10S but not B60W/B60K/B60L

Post by Joachim » Tue Jun 19, 2018 8:22 am

Thank you very much for the very in-depth reply.

Regarding the query you are absolutely that there is no reason for joining both tls226 and tls206. I somehow missed that.
I do not really expect any results for 2018-2020, but just set a nice round number in the near future as upper limit. I am also a little bit wary about results from late 2016-2017 keeping the grace period of 18 months and supply of data from various sources for the database in mind.

As for the NACE codes, the idea was to examine which secondary industries, robot technologies are integrated in, or more specifically, being developed for integration in. For that purpose I wanted to avoid using the IPC weighting.

Your thoughts on extracting information on granted patents and granted EP’s is going to be extremely helpful and will save at least a couple of nights worth of sleep I think.

Thank you for your assistance!

Best Regards
Joachim


Post Reply