EXPORTING DATA TO EXCEL

Here you can post your opinions, ask questions and share experiences on the PATSTAT product line. Please always indicate the PATSTAT edition (e.g. 2015 Autumn Edition) and the database (e.g. PATSTAT Online, MySQL, MS SQL Server, ...) you are using.
Post Reply

Bernardo
Posts: 1
Joined: Sat Aug 27, 2016 6:21 pm

EXPORTING DATA TO EXCEL

Post by Bernardo » Sat Aug 27, 2016 6:49 pm

Hello, I am using the 2016 autumn version of PATSTAT, and as it is the first time I use this software or SQL language, a few issues have arisen.

A couple of questions here:
1) I am trying to find a way of exporting tables to Excel in order to create graphics and manipulate data as I wish. How may I do such export?

2) I am trying to extract a timeline of patent publications worldwide according to a certain IPC classification. I have counted patent publications which had the same publication day for acquiring a table containing patent publications, and the number of patent publications in a given day.

PATSTAT requires users to select appln_id in order to open the "results" and "statistics windows". If I select appln_id in the query, the sum of patent publication will not succeed, as there are many different patent appln_id for the same publn_date.
Is there any query to solve this problem?
I post hereby the query I have written:

select distinct a.appln_id,a.publn_date, count (a.publn_date)
from tls211_pat_publn a, tls209_appln_ipc b

where a.appln_id = b.appln_id
and publn_date <= '2015-12-31' and publn_date >= '2010-01-01'and publn_date<> '9999-12-31'
and (ipc_class_symbol = 'E04D 13/18'
or ipc_class_symbol = 'F26B 3/28'
or ipc_class_symbol like 'F03G 6/%'
or ipc_class_symbol like 'F24J 2/%'
or ipc_class_symbol ='C02F 1/14')
group by a.publn_date, a.appln_id
order by publn_date asc

Now the query which I believe gave me the best results, but without permission to access the statistics window:

select a.publn_date, count (a.publn_date)
from tls211_pat_publn a, tls209_appln_ipc b

where a.appln_id = b.appln_id
and publn_date <= '2015-12-31' and publn_date >= '2010-01-01'and publn_date<> '9999-12-31'
and (ipc_class_symbol = 'E04D 13/18'
or ipc_class_symbol = 'F26B 3/28'
or ipc_class_symbol like 'F03G 6/%'
or ipc_class_symbol like 'F24J 2/%'
or ipc_class_symbol ='C02F 1/14')
group by a.publn_date
order by publn_date asc

Thank you very much


mkracker
Posts: 120
Joined: Wed Sep 04, 2013 6:17 am
Location: Vienna

Re: EXPORTING DATA TO EXCEL

Post by mkracker » Tue Sep 20, 2016 8:36 pm

Hi Bernardo,

as a beginner of SQL you have done a great job. In a statistical analysis you typically want to count something, like applications published on a certain date or year. There are 2 approaches to do that and both require different queries which are - except for the SELECT and a GROUP BY clause, very much alike.

1) Do the counting in your query:
In your query you did this with

Code: Select all

SELECT a.publn_date, COUNT (a.publn_date)
...
GROUP BY a.publn_date
The disadvantage is that you do not see the details of the data but just the aggregation, so you might probably count something you did not expected and wanted (like utility models; or patents which have not been granted; or ..). You usually also cannot post-process the data, because due to the aggregation relevant details are already lost. So you might want to try this (first):

2) Get the relevant data first, do the counting later.
Leave the GROUP BY aside by now and put every data you need to verify the correctness of your query in the SELECT clause. Usually you will also include the APPLN_ID to uniquely identify an application. Now you have multiple options to do the aggregation:

a) By adding the aggregation in your query. See point 1) above.

b) By downloading the data and doing the analysis offline in your favourite analysis or visualisation tool. PATSTAT Online provides these 2 download options:
  • Download the result of your query, exactly as you see it in the Table Window. You get a flat table, which you can e. g. further manipulate or visualise in Excel etc.. This download works with every query (size permitting).
  • Download the result with the "Download Subset" feature of PATSTAT Online. Here the APPLN_ID must be part of your query result. On the other hand, any other attribute is optional. Based on your list of APPLN_IDs you can download the records of any table which is related to these APPLN_IDs. In fact, you download multiple tables (i.e. a database) with consistent data. As a result, you get either a MS Access database or set of flat CSV files which you again can import into your preferred tool..
c) Create charts in the Statistics Window of PATSTAT Online: Here the same requirement exist as for the "Subset Download": The APPLN_ID must exist, but all other data is automatically fetched by PATSTAT Online - you do not have to include them in your query.

Best of luck with your project!
-------------------------------------------
Martin Kracker / EPO


Post Reply