Common Workflows#
Importing the Combined Nomenclature codes#
This functionality is built into pyst_client. py-semantic-taxonomy must be deployed and available:
from pyst_client.cn import CombinedNomenclatureLoader
CombinedNomenclatureLoader(
year=<year>,
api_key=<pyst-api-key>,
host=<host>,
sample=True or False
).write()
Where:
yearis a integer, like2024. If only installing the sample data, this should be 2024 or 2025 (you can also run both one after the other).api_keyis the write-enabled API key (PyST_auth_token) forpy-semantic-taxonomy.hostis the URL thatpy-semantic-taxonomyis running at, e.g. "http://localhost:8000" if running locallysampleis a boolean flag on whether only a sample of the available data should be imported. The full import takes more than an hour.
If you are testing locally using the default Docker containers for Postgres and Typesense, then the CombinedNomenclatureLoader can take these values:
CombinedNomenclatureLoader(
year=2024,
api_key="abc123", # Default API key; adjust if needed
host="http://127.0.0.1:8000", # Default Uvicorn port; adjust if needed
sample=True
).write()
Creating a new taxonomy#
PyST stores concept schemes, concepts, and relationships between concepts all in different places, so the creation of these objects needs to happen in a defined order:
- First, decide on the URL pattern you will use for concept schemes and concepts. A reasonable pattern is
https://<base_url>/<concept-scheme-notation>/<concept-notation>. Many of the EU semantic taxonomies use this patter, e.g.http://data.europa.eu/xsp/cn2025/970300000080. When following this pattern, the concept scheme notation should be different from version to version or year to year. - Second, create the concept scheme.
- Third, create the concepts. Although it is possible to provide relationship information among concepts inside the individual concept documents, this is recommended against, as concept creation requests are normally submitted in parallel, and we can't run graph integrity checks against unknown graph nodes.
- Finally, define relationships among concepts. It's best if each request to
relationshipscreates one relationship, and these can be chunked to run in parallel (asyncio doesn't seem to like it when thousands of tasks are submitted at once - its better to do 20 or 50 at a time).
Updating a Concept or a Concept relationship#
Best practice is to always record the who, what, why, and when of changes, which can be done by adding change, editorial, or history notes to the Concept.
Depending on your institutions review practices, you could also require that change suggestions include changing the Concept.status to draft, and that changes are only accepted after review, when the status could be changed to accepted.
Creating a new Correspondence#
Creating new Correspondence and ConceptAssociation objects should also follow a set order:
- First, decide on the URL pattern you will use, especially for
ConceptAssociations, which can be N-to-1. It would be nice if the URLs forCorrespondenceswere human readable, butConceptAssociationscould use computer-generated ids. - Second, create the
Correspondenceobject. This doesn't have any references toConceptAssociationobjects. - Third, create the
ConceptAssociationobjects. They don't have any information on membership in aCorrespondence. - Finally, link the
ConceptAssociationobjects to theCorrespondenceusing themade_ofendpoint. This can be done in one request as this endpoint takes a list of inputs.