My usual approach (=setting up intermediate tables) will unfortunately not work in PATSTAT Online.
Lucky you though, in PATSTAT Online the tls206 table has been enriched with EEE-PPAT and OECD HAN standard names. Take your pick.
To narrow down on applicant names, you could use tls207 `applt_seq_nr`>0... but there are some patents that list the inventor as the applicant as well, so instead of using it as a search criteria just add it to your SELECT statement.
Keep in mind that while the availability of inventor information on average might not be so good, certain patents have a higher quality than others. So for example a European filing might have all the information, while some subsequent filing in the middle of nowhere will have no information. And this brings us to the concept of patent families. The same invention, but filed in various countries. Basically only one of the patents in the whole family needs to have detailed information. And if there is no information at all? Then you could simply assume that the inventor did not list his country because it is the same as the country of the applicant.
Let's assume that we've decided that Microsoft Corporation is doc_std_name_id 2292. There are actually more IDs that match Microsoft. Let's take 4537916 as well. You probably need to use a subquery. Running the query on tls209 -without distinct- gave me 117,929 results btw. Even after using distinct there are still far too many appln_ids to use WHERE `appln_id` IN (1, 2, 3, 4, ...). See below for an example.
SELECT SQL_NO_CACHE *
FROM `tls207_pers_appln`
INNER JOIN `tls206_person` ON `tls207_pers_appln`.`person_id` = `tls206_person`.`person_id`
WHERE `tls207_pers_appln`.`appln_id` IN
(SELECT DISTINCT `tls209_appln_ipc`.`appln_id`
FROM `tls209_appln_ipc`
INNER JOIN `tls207_pers_appln` ON `tls209_appln_ipc`.`appln_id` = `tls207_pers_appln`.`appln_id`
INNER JOIN `tls206_person` ON `tls207_pers_appln`.`person_id` = `tls206_person`.`person_id`
WHERE `doc_std_name_id` IN ( 2292, 4537916 ) AND `ipc_class_symbol` LIKE "G06F%");
The first 10 rows already look funny. Six inventors and they're all the same? It's interesting to note that person_id 263 is blank, except for doc_std_name_id 2041689. And that's a blank name. Shouldn't be in the database if you ask me. That's an error in their algorithm to distinguish persons. They should create a new person_id for each fully blank person, not assume that all blanks are equal.
Code: Select all
+-----------+-----------+--------------+-------------+-----------+------------------+-----------------+-----------------------+-------------------------------------+
| person_id | appln_id | applt_seq_nr | invt_seq_nr | person_id | person_ctry_code | doc_std_name_id | person_name | person_address |
+-----------+-----------+--------------+-------------+-----------+------------------+-----------------+-----------------------+-------------------------------------+
| 263 | 304883 | 0 | 1 | 263 | | 2041689 | | |
| 263 | 304883 | 0 | 2 | 263 | | 2041689 | | |
| 263 | 304883 | 0 | 3 | 263 | | 2041689 | | |
| 263 | 304883 | 0 | 4 | 263 | | 2041689 | | |
| 263 | 304883 | 0 | 5 | 263 | | 2041689 | | |
| 263 | 304883 | 0 | 6 | 263 | | 2041689 | | |
| 263 | 419191813 | 2 | 0 | 263 | | 2041689 | | |
| 2682 | 1224 | 1 | 0 | 2682 | US | 2292 | MICROSOFT CORPORATION | One Microsoft Way,Redmond, WA 98052 |
| 2682 | 95326 | 1 | 0 | 2682 | US | 2292 | MICROSOFT CORPORATION | One Microsoft Way,Redmond, WA 98052 |
| 2682 | 124940 | 1 | 0 | 2682 | US | 2292 | MICROSOFT CORPORATION | One Microsoft Way,Redmond, WA 98052 |
+-----------+-----------+--------------+-------------+-----------+------------------+-----------------+-----------------------+-------------------------------------+
10 rows in set (2.03 sec)
If you want me to check some things in PATSTAT October 2014, send me an email (
info@daigu.nl).