## NACE2 codes weights calculation



### NACE2 codes weights calculation

Hi all,

I would like to know how you calculate the weights assigned to NACE codes in the TLS229_APPLN_NACE2 table.

Regards,
Aris

EPO / PATSTAT Support

### Re: NACE2 codes weights calculation

Hello Aris,
Please see the concordance table between IPC and NACE2: concordance-table-between-ipc-and-nace2-9756
And the PATSTAT Data Catalog: DataCatalog_Global_v5.19.pdf

Geert Boedt
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org


### Re: NACE2 codes weights calculation

Thank you Geert for the reference.

The referenced documents do not explain how NACE codes are assigned to applications or how their weights are calculated.
Take, for example, the appln_id 437252193. The following query returns 9 NACE2 codes with different weights. Their weights sum to 1. It looks like an algorithm assigned those weights. Is there any document describing the algorithm/procedure?

``select * from tls229_appln_nace2 where appln_id = 437252193``

```
appln_id    nace2_code  weight
437252193   "20.1"      0.012931035
437252193   "26.2"      0.10344828
437252193   "26.3"      0.79310346
437252193   "26.4"      0.004310345
437252193   "26.5"      0.021551725
437252193   "26.51"     0.004310345
437252193   "26.52"     0.004310345
437252193   "26.7"      0.012931035
437252193   "28.23"     0.04310345
```
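
As a quick sanity check on the "weights sum to 1" observation, the nine values can be added up in Python. This is only a sketch over the numbers listed in this post. The denominator 232 is an observation from these nine values (each weight looks like a whole number of 232nds), not something taken from the PATSTAT documentation.

```python
# Weights copied from the query result for appln_id 437252193.
weights = {
    "20.1": 0.012931035,
    "26.2": 0.10344828,
    "26.3": 0.79310346,
    "26.4": 0.004310345,
    "26.5": 0.021551725,
    "26.51": 0.004310345,
    "26.52": 0.004310345,
    "26.7": 0.012931035,
    "28.23": 0.04310345,
}

# The sum should be 1 up to the rounding of the stored values.
total = sum(weights.values())

# Assumption: 232 is a common denominator guessed from the data above.
parts = {code: round(w * 232) for code, w in weights.items()}

print(f"sum of weights: {total:.6f}")
print("parts out of 232:", parts)
```

If the guess holds, the weights behave like normalised counts, which would be consistent with a ratio-based procedure.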
Regards,
Aris

EPO / PATSTAT Support

### Re: NACE2 codes weights calculation

The computation of the WEIGHT value is more complicated than the simple ratio used for a technology field, because each NACE2 code gets a weight that depends on several relations to the application:
• First, per application, each IPC symbol (TLS209_APPLN_IPC.IPC_CLASS_SYMBOL matching TLS902_IPC_NACE2.IPC) linked to a NACE2 code (TLS902_IPC_NACE2.NACE2_CODE) gets the default weight defined in TLS902_IPC_NACE2.NACE2_WEIGHT (usually 1.0).
• Some weights are then set to null when the same application also carries certain other IPCs (see TLS902_IPC_NACE2.NOT_WITH_IPC), unless it also carries certain further IPCs (see TLS902_IPC_NACE2.UNLESS_WITH_IPC).
• Finally, per application, each weight is converted to a ratio: the weight summed per NACE2 code divided by the sum of all weights for the application. This gives a proportional value between 0 and 1.0, so the TLS229_APPLN_NACE2.WEIGHT values of an application must always sum to 1.0, as you rightly stated.
Sample dummy case:
- An application (id="123") has the IPC symbols "A01N 1/00", "A01N 1/02" and "A22B 1/00".
- Those are related respectively to NACE2 codes "20.2" (with weight "1"), "20.2" (with weight "1"), and "28.9" (with weight "1").
- The TLS229_APPLN_NACE2 table will store the rows ("123", "20.2", "0.6666667") and ("123", "28.9", "0.3333333"), as NACE2 code "20.2" represents 2/3 of all weights ((1 + 1) / (1 + 1 + 1)).
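
The dummy case can be reproduced with a few lines of Python. This is a minimal sketch of the first and last steps only, not the actual PATSTAT implementation: the function name `nace2_weights` and the `concordance` dictionary are hypothetical stand-ins for TLS902_IPC_NACE2, the dictionary is keyed by full IPC symbol for simplicity, and the NOT_WITH_IPC / UNLESS_WITH_IPC exclusions are left out.

```python
from collections import defaultdict


def nace2_weights(ipc_symbols, concordance):
    """Return {nace2_code: normalised weight} for one application.

    ipc_symbols: the application's IPC symbols (as in TLS209_APPLN_IPC).
    concordance: hypothetical dict, IPC symbol -> (nace2_code, default_weight),
        standing in for TLS902_IPC_NACE2.
    The NOT_WITH_IPC / UNLESS_WITH_IPC rules are not modelled here.
    """
    raw = defaultdict(float)
    for symbol in ipc_symbols:
        if symbol in concordance:
            code, weight = concordance[symbol]
            raw[code] += weight  # sum the default weights per NACE2 code

    total = sum(raw.values())
    if total == 0:
        return {}  # no IPC symbol is linked to a NACE2 code
    # Divide by the application's total so the weights sum to 1.0.
    return {code: w / total for code, w in raw.items()}


concordance = {
    "A01N 1/00": ("20.2", 1.0),
    "A01N 1/02": ("20.2", 1.0),
    "A22B 1/00": ("28.9", 1.0),
}
result = nace2_weights(["A01N 1/00", "A01N 1/02", "A22B 1/00"], concordance)
for code, weight in sorted(result.items()):
    print(code, round(weight, 7))
```

Running it gives 2/3 for "20.2" and 1/3 for "28.9", matching the rows in the example.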

Because the concordance table is not strictly based on 4-character IPC codes, a single IPC symbol could generate 2 rows.
E.g. the symbol "A61K 8/00" in [tls209_appln_ipc] could be linked to both "A61K" and "A61K 8". We decided that only the most detailed IPC code is taken into account, in this case "A61K 8" (the link to "A61K" is then discarded).
In our view this was the best and most representative way to implement the methodology, and to my knowledge it has been accepted by the PATSTAT community. But nothing stops users from developing their own interpretation of the whitepaper. All the reference data is available in the PATSTAT tables.
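
The "most detailed IPC code wins" rule could be sketched as follows. This is an illustration under assumptions rather than the PATSTAT code: the helper names are hypothetical, IPC symbols are assumed to be written with a single space as in the example above (the stored PATSTAT format may be padded differently), and concordance entries are assumed to be either a class or subclass such as "A61K", a main group such as "A61K 8", or a full symbol.

```python
def entry_matches(entry, symbol):
    """True if the concordance entry covers the given IPC symbol."""
    if symbol == entry:
        return True
    if " " not in entry:
        # Class or subclass level, e.g. "A61K" covers "A61K 8/00".
        return symbol.startswith(entry)
    # Main-group level, e.g. "A61K 8" covers "A61K 8/00" but not "A61K 80/00".
    return symbol.startswith(entry + "/")


def most_detailed_match(symbol, entries):
    """Return the longest matching concordance entry, or None if none matches."""
    candidates = [e for e in entries if entry_matches(e, symbol)]
    return max(candidates, key=len) if candidates else None


entries = ["A61K", "A61K 8"]
print(most_detailed_match("A61K 8/00", entries))   # the "A61K" link is discarded
print(most_detailed_match("A61K 80/00", entries))  # only the subclass applies
```

Checking for the "/" after a main-group entry avoids treating "A61K 8" as a match for "A61K 80/00", which a plain string-prefix test would do.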
PATSTAT Support Team
EPO - Vienna
patstat @ epo.org