Shadow Concepts in PoolParty

This section contains a short guide on how to use the Shadow Concepts functionality in PoolParty.

The Shadow Concepts function complements the corpus management and term extraction functions in PoolParty.

These topics assume that you are familiar with the basic Corpus Management functions and ideas in PoolParty.

The Shadow Concepts function in PoolParty is based on the idea that co-occurrences of concepts can help to further refine extraction results for corpus documents. This helps search applications return more reliable results that cover a broader range of documents. You can use the interface described in this section to check the results of the co-occurrence calculation.

The co-occurrence calculation is the basis for all Shadow Concepts functions you can use.

The idea of Shadow Concepts can be described as follows:

  • They are concepts of your thesaurus.
  • During entity extraction in the corpus, concepts A, B and C often co-occur with concept D.
  • D will be suggested as a shadow concept for documents that contain A, B and C.
    • This means that a user might search for D in a search application.
    • A given document, however, contains only the concepts A, B and C.
    • This document will still be listed in the search results, since D is its 'shadow concept'.
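
The idea can be illustrated with a minimal Python sketch. It is not PoolParty's actual calculation, which this section does not specify. The function names, the `min_ratio` threshold and the rule "every concept in the document must co-occur with the candidate often enough" are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count concept frequencies and how often each pair shares a document."""
    concept_counts = Counter()
    pair_counts = Counter()
    for concepts in docs:
        concept_counts.update(concepts)
        # Sort so that each unordered pair is always stored under the same key.
        for a, b in combinations(sorted(concepts), 2):
            pair_counts[(a, b)] += 1
    return concept_counts, pair_counts


def suggest_shadow_concepts(doc_concepts, concept_counts, pair_counts, min_ratio=0.5):
    """Suggest concepts that are absent from a document but co-occur often
    with every concept it contains (hypothetical rule, not PoolParty's)."""
    shadows = set()
    if not doc_concepts:
        return shadows
    for candidate in concept_counts:
        if candidate in doc_concepts:
            continue
        # Share of the documents containing `present` that also contain `candidate`.
        ratios = [
            pair_counts[tuple(sorted((present, candidate)))] / concept_counts[present]
            for present in doc_concepts
            if concept_counts[present]
        ]
        if ratios and min(ratios) >= min_ratio:
            shadows.add(candidate)
    return shadows


# Toy corpus: each set holds the concepts extracted from one document.
corpus = [
    {"A", "B", "C", "D"},
    {"A", "B", "D"},
    {"A", "C", "D"},
    {"B", "C", "D"},
    {"A", "B", "C"},  # contains A, B and C, but not D
    {"E"},
]

concept_counts, pair_counts = cooccurrence_counts(corpus)
shadows = suggest_shadow_concepts({"A", "B", "C"}, concept_counts, pair_counts)
print(shadows)
```

In this toy corpus, D appears in three of the four documents that contain A, and likewise for B and C, so D is suggested for the document that contains only A, B and C. The unrelated concept E is not suggested.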


An example: texts that deal with Peru and the Inca culture often contain the term 'Machu Picchu' in close proximity to 'Peru'. The concept 'Machu Picchu' will therefore be suggested as a 'shadow concept'.

When a user searches for the concept 'Machu Picchu', documents that contain only the concept 'Peru' will still be found and listed, since 'Machu Picchu' is their shadow concept.
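
The search behavior in this example can be sketched as follows. This is a simplified illustration, not the way a PoolParty-backed search application is implemented. The document IDs, the `extracted` and `shadow` fields and the `search` function are hypothetical.

```python
# Hypothetical index: concepts extracted from each document, plus the
# shadow concepts assigned to it by a co-occurrence calculation.
documents = {
    "doc1": {"extracted": {"Peru", "Machu Picchu"}, "shadow": set()},
    "doc2": {"extracted": {"Peru"}, "shadow": {"Machu Picchu"}},
    "doc3": {"extracted": {"Chile"}, "shadow": set()},
}


def search(concept, documents):
    """Return IDs of documents that contain the concept directly
    or carry it as a shadow concept."""
    return sorted(
        doc_id
        for doc_id, concepts in documents.items()
        if concept in concepts["extracted"] or concept in concepts["shadow"]
    )


results = search("Machu Picchu", documents)
print(results)
```

Here `doc2` mentions only 'Peru', but it is still returned for 'Machu Picchu' because that concept is stored as its shadow concept. `doc3` is not returned.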

The schematic representation of the co-occurrence calculation for 'Machu Picchu' looks like this, with a short text example:

