Installing Superb Data Kraken
The Superb Data Kraken consists of multiple services, which combine to form the most amazing Data Kraken.
Note
- These instructions assume that the prerequisites are already met.
- You might want to manage your configuration globally, as many properties are required by various services.
Organizationmanager
To get an up and running instance of the organizationmanager the following steps are required:
- provide a PostgreSQL-database, configure via the following properties:
  - in config-map.yml adjust $(DATABASE_SERVER) and $(ORGAMANAGER_DATABASE) accordingly
  - in provided-secrets.yml adjust DATABASE_PASSWORD and DATABASE_USER (in the form of <USER>@<DATABASE_SERVER>) accordingly
- provide a Kafka-instance with the following topic: space-deleted (or any other topic configured as property organizationmanager.kafka.topic.space-deleted), configure via the following properties:
  - in provided-secrets.yml adjust KAFKA_SASL_JAAS_CONFIG (in the form of org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://<KAFKA_BROKER>/;SharedAccessKeyName=<KEY_NAME>;SharedAccessKey=<KEY>";) accordingly
- provide the following libraries locally:
- build a Docker-image and push it to the container-registry of your liking. Two options are provided:
  - with additional logging to Azure Application Insights (Dockerfile)
  - without Azure Application Insights (Dockerfile-no-appinsights)
If you choose to use the Docker-image with additional logging to Azure Application Insights, you need to provide extra properties in config-map.yml:
- APP_INSIGHTS_CONNECTION_STRING: the Connection-String of your Azure Application Insights instance
- APP_INSIGHTS_INSTRUMENTATION_KEY: the Instrumentation-Key of your Azure Application Insights instance
For additional configuration, please consider these properties within the kubernetes-folder:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- REALM: the specific realm set up with the OpenID Connect (OIDC) provider
- CLIENT_ID: the client-id within the defined realm (used for simplifying swagger-access)
- CLIENT_ID_CONFIDENTIAL: the id of the confidential client used for Service Account-access. This Service Account should have the following permissions:
  - get/update users in User-Management
  - get/create/update/delete roles in User-Management
- CLIENT_SECRET_CONFIDENTIAL: the secret of the confidential client used for Service Account-access
- LOG_LEVEL: the logging-level for organizationmanager
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
- DOMAIN: the domain organizationmanager should be available at
- KAFKA_BOOTSTRAP_SERVER: the Kafka-Bootstrap-Server
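The two secrets in provided-secrets.yml that follow a fixed string form (DATABASE_USER and KAFKA_SASL_JAAS_CONFIG) are easy to get wrong by hand. The following is a minimal Python sketch that assembles both strings in the forms given in this section. The function names and all example values are hypothetical and not part of the Superb Data Kraken code base.

```python
def database_user(user: str, database_server: str) -> str:
    """Return DATABASE_USER in the documented <USER>@<DATABASE_SERVER> form."""
    return f"{user}@{database_server}"


def kafka_sasl_jaas_config(kafka_broker: str, key_name: str, key: str) -> str:
    """Return KAFKA_SASL_JAAS_CONFIG in the documented PlainLoginModule form."""
    password = (
        f"Endpoint=sb://{kafka_broker}/;"
        f"SharedAccessKeyName={key_name};"
        f"SharedAccessKey={key}"
    )
    return (
        "org.apache.kafka.common.security.plain.PlainLoginModule required "
        f'username="$ConnectionString" password="{password}";'
    )


# Placeholder values, purely for illustration (assumptions, not real settings).
print(database_user("example-user", "example-database-server"))
print(kafka_sasl_jaas_config("example-broker", "example-key-name", "example-key"))
```

The resulting strings can then be pasted into provided-secrets.yml; how the values are encoded there depends on your setup and is not covered by this sketch.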
Storagemanager
To get an up and running instance of the storage-manager the following steps are required:
- provide the following libraries locally:
- build a Docker-image and push it to the container-registry of your liking. Two options are provided:
  - with additional logging to Azure Application Insights (Dockerfile)
  - without Azure Application Insights (Dockerfile-no-appinsights)
If you choose to use the Docker-image with additional logging to Azure Application Insights, you need to provide extra properties in config-map.yml:
- APP_INSIGHTS_CONNECTION_STRING: the Connection-String of your Azure Application Insights instance
- APP_INSIGHTS_INSTRUMENTATION_KEY: the Instrumentation-Key of your Azure Application Insights instance
For additional configuration, please consider these properties within the kubernetes-folder:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- REALM: the specific realm set up with the OpenID Connect (OIDC) provider
- CLIENT_ID: the client-id within the defined realm (used for simplifying swagger-access)
- LOG_LEVEL: the logging-level for storage-manager
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
- DOMAIN: the domain storage-manager should be available at
- RESOURCE_GROUP: the resource-group in Azure the storage-principal is permitted to (see Azure Subscription)
- AZURE_STORAGE_CLIENT_ID: application (client) ID of the storage-principal managed application (see Azure Subscription)
- AZURE_STORAGE_CLIENT_SECRET: application (client) secret of the storage-principal managed application (see Azure Subscription)
- AZURE_TENANT_ID: the tenant-id of the Azure Subscription
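Because the storage-manager needs a fairly long list of properties, a quick check for unset values before deploying can save a failed rollout. The sketch below uses the property names from the list above. The helper `missing_properties` is hypothetical, and treating every listed property except postfix as mandatory is an assumption, not something these instructions state.

```python
# Property names taken from the storage-manager list above.
# Assumption: all of them except the optional postfix must be set.
STORAGEMANAGER_PROPERTIES = [
    "REALM",
    "CLIENT_ID",
    "LOG_LEVEL",
    "CONTAINER_REGISTRY",
    "tagVersion",
    "DOMAIN",
    "RESOURCE_GROUP",
    "AZURE_STORAGE_CLIENT_ID",
    "AZURE_STORAGE_CLIENT_SECRET",
    "AZURE_TENANT_ID",
]


def missing_properties(values: dict[str, str], required: list[str]) -> list[str]:
    """Return the required property names that are absent or blank in values."""
    return [name for name in required if not values.get(name, "").strip()]


# Illustrative, incomplete configuration (placeholder values).
example_values = {"REALM": "example-realm", "CLIENT_ID": "example-client", "LOG_LEVEL": ""}
for name in missing_properties(example_values, STORAGEMANAGER_PROPERTIES):
    print(f"missing property: {name}")
```

The same helper works for the other services if you pass the property names listed in their sections.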
Accessmanager
To get an up and running instance of the accessmanager the following steps are required:
- provide a Kafka-instance with the following topic: accessmanager-commit (or any other topic configured as property accessmanager.topic.upload-complete), configure via the following properties:
  - in provided-secrets.yml adjust KAFKA_SASL_JAAS_CONFIG (in the form of org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://<KAFKA_BROKER>/;SharedAccessKeyName=<KEY_NAME>;SharedAccessKey=<KEY>";) accordingly
- provide the following libraries locally:
- build a Docker-image and push it to the container-registry of your liking. Two options are provided:
  - with additional logging to Azure Application Insights (Dockerfile)
  - without Azure Application Insights (Dockerfile-no-appinsights)
If you choose to use the Docker-image with additional logging to Azure Application Insights, you need to provide extra properties in config-map.yml:
- APP_INSIGHTS_CONNECTION_STRING: the Connection-String of your Azure Application Insights instance
- APP_INSIGHTS_INSTRUMENTATION_KEY: the Instrumentation-Key of your Azure Application Insights instance
For additional configuration, please consider these properties within the kubernetes-folder:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- REALM: the specific realm set up with the OpenID Connect (OIDC) provider
- CLIENT_ID: the client-id within the defined realm (used for simplifying swagger-access)
- LOG_LEVEL: the logging-level for accessmanager
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
- DOMAIN: the domain accessmanager should be available at
- RESOURCE_GROUP: the resource-group in Azure the storage-principal is permitted to (see Azure Subscription)
- AZURE_STORAGE_CLIENT_ID: application (client) ID of the storage-principal managed application (see Azure Subscription)
- AZURE_STORAGE_CLIENT_SECRET: application (client) secret of the storage-principal managed application (see Azure Subscription)
- AZURE_TENANT_ID: the tenant-id of the Azure Subscription
- KAFKA_BOOTSTRAP_SERVER: the Kafka-Bootstrap-Server
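The topic rule in this section (use accessmanager-commit unless the property accessmanager.topic.upload-complete names another topic) can be written as a small fallback lookup. The Python sketch below only illustrates that rule. The function `resolve_topic` and the plain dictionary of properties are assumptions for illustration and do not reflect how the accessmanager itself reads its configuration.

```python
def resolve_topic(properties: dict[str, str], property_name: str, default_topic: str) -> str:
    """Return the configured topic, or the default topic if the property is unset or blank."""
    configured = properties.get(property_name, "").strip()
    return configured or default_topic


# No override configured: the default topic name applies.
print(resolve_topic({}, "accessmanager.topic.upload-complete", "accessmanager-commit"))

# Override configured (hypothetical topic name): the property wins.
overrides = {"accessmanager.topic.upload-complete": "my-upload-topic"}
print(resolve_topic(overrides, "accessmanager.topic.upload-complete", "accessmanager-commit"))
```

Whichever name results is the topic you need to provide in your Kafka-instance, and other components that consume this topic have to use the same name.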
Metadata
To get an up and running instance of the metadata-service the following steps are required:
- provide a Kafka-instance with the following topics:
  - indexing-done (or any other topic configured as property metadata.topics.indexing-done-topic), which will be triggered once a new metadata-set is indexed
  - metadata-update (or any other topic configured as property metadata.topics.metadata-update-topic), which will be triggered once a metadata-set is updated
- configure the Kafka-instance via the following properties:
  - in provided-secrets.yml adjust KAFKA_SASL_JAAS_CONFIG (in the form of org.apache.kafka.common.security.plain.PlainLoginModule required username="$ConnectionString" password="Endpoint=sb://<KAFKA_BROKER>/;SharedAccessKeyName=<KEY_NAME>;SharedAccessKey=<KEY>";) accordingly
- provide the following libraries locally:
- build a Docker-image and push it to the container-registry of your liking. Two options are provided:
  - with additional logging to Azure Application Insights (Dockerfile)
  - without Azure Application Insights (Dockerfile-no-appinsights)
If you choose to use the Docker-image with additional logging to Azure Application Insights, you need to provide extra properties in config-map.yml:
- APP_INSIGHTS_CONNECTION_STRING: the Connection-String of your Azure Application Insights instance
- APP_INSIGHTS_INSTRUMENTATION_KEY: the Instrumentation-Key of your Azure Application Insights instance
For additional configuration, please consider these properties within the kubernetes-folder:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- REALM: the specific realm set up with the OpenID Connect (OIDC) provider
- CLIENT_ID: the client-id within the defined realm (used for simplifying swagger-access)
- CLIENT_ID_CONFIDENTIAL: the id of the confidential client used for Service Account-access. This Service Account should have the following permissions:
  - access to OpenSearch security-plugin (edit roles/rolesmappings/tenants)
  - update all indices in OpenSearch
- CLIENT_SECRET_CONFIDENTIAL: the secret of the confidential client used for Service Account-access
- LOG_LEVEL: the logging-level for metadata-service
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
- DOMAIN: the domain metadata-service should be available at
- ELASTICSEARCH_SERVICE: name of the kubernetes-service of the elasticsearch/opensearch-client-service
- ELASTICSEARCH_SECURITY_ENDPOINT: endpoint of the elasticsearch/opensearch security-plugin (might be /_plugins/_security/api for OpenSearch or /_opendistro/_security/api for Elasticsearch)
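The value of ELASTICSEARCH_SECURITY_ENDPOINT depends on which search engine backs the metadata-service. As a minimal sketch, the two endpoints named above can be kept in a lookup table. The function `security_endpoint` and the engine labels used as keys are hypothetical names chosen for this example.

```python
# Endpoints as listed for ELASTICSEARCH_SECURITY_ENDPOINT above.
# The dictionary keys are illustrative labels, not configuration values.
SECURITY_ENDPOINTS = {
    "opensearch": "/_plugins/_security/api",
    "elasticsearch": "/_opendistro/_security/api",
}


def security_endpoint(engine: str) -> str:
    """Return the security-plugin endpoint for the given engine label."""
    key = engine.strip().lower()
    if key not in SECURITY_ENDPOINTS:
        raise ValueError(f"unknown engine {engine!r}, expected one of {sorted(SECURITY_ENDPOINTS)}")
    return SECURITY_ENDPOINTS[key]


print(security_endpoint("OpenSearch"))
```

If your installation exposes the security-plugin under a different path, set that path directly instead of relying on this mapping.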
Search
To get an up and running instance of the search-service no explicit steps are required.
For configuration, please consider these properties within the kubernetes-folder:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- REALM: the specific realm set up with the OpenID Connect (OIDC) provider
- CLIENT_ID: the client-id within the defined realm (used for simplifying swagger-access)
- LOG_LEVEL: the logging-level for search
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
- DOMAIN: the domain search should be available at
- ELASTICSEARCH_SERVICE: name of the kubernetes-service of the elasticsearch/opensearch-client-service
Ingest
To get an up and running instance of ingest the following steps are required:
- provide an EventSource for your Kafka-topic defined in AccessManager (accessmanager-commit or any other topic configured as property accessmanager.topic.upload-complete). This EventSource is referenced by the ingest-Sensor:
  - EventSource accessmanager-commit (accessmanager-commit.eventsource.yaml):

        apiVersion: argoproj.io/v1alpha1
        kind: EventSource
        metadata:
          name: accessmanager-commit
        spec:
          template:
            affinity:
              nodeAffinity:
                preferredDuringSchedulingIgnoredDuringExecution:
                  - weight: 1
                    preference:
                      matchExpressions:
                        - key: agentpool
                          operator: In
                          values:
                            - userpool
            container:
              resources:
                requests:
                  cpu: 20m
                  memory: 200Mi
                limits:
                  cpu: 20m
                  memory: 200Mi
          azureEventsHub:
            accessmanager:
              fqdn: sdk-eventhub-dev.servicebus.windows.net
              sharedAccessKeyName:
                name: azure-event-source
                key: sharedAccessKeyName
              sharedAccessKey:
                name: azure-event-source
                key: sharedAccessKey
              hubName: accessmanager-commit

- build the Docker-images to the container-registry of your liking:
For additional configuration, please consider these properties within the kubernetes-folder:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
- DOMAIN: the domain the oidc-provider is available at
- SKIP_VALIDATE_ORGANIZATIONS: names of the organizations for which the validation-tree (basicmetadata, anonymize, enrichment and validate) should be skipped
- CLIENT_ID_CONFIDENTIAL: the id of the confidential client used for Service Account-access. This Service Account should have the following permissions:
  - update all indices in OpenSearch
  - update datasets within all spaces
- CLIENT_SECRET_CONFIDENTIAL: the secret of the confidential client used for Service Account-access
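To make the effect of SKIP_VALIDATE_ORGANIZATIONS concrete, here is a small Python sketch of the skip decision. It is not the ingest implementation: the comma-separated value format, the step order, and the function names `parse_skip_list` and `steps_for` are all assumptions made for this illustration. Only the four step names come from the property description above.

```python
# Step names from the description of SKIP_VALIDATE_ORGANIZATIONS above.
# Assumption: the steps run in the order in which they are listed there.
VALIDATION_TREE = ("basicmetadata", "anonymize", "enrichment", "validate")


def parse_skip_list(raw: str) -> set[str]:
    """Parse organization names, assuming a comma-separated value."""
    return {name.strip() for name in raw.split(",") if name.strip()}


def steps_for(organization: str, skip_organizations: set[str]) -> tuple[str, ...]:
    """Return the validation steps to run for an organization (none if it is skipped)."""
    return () if organization in skip_organizations else VALIDATION_TREE


# Hypothetical organization names, purely for illustration.
skip = parse_skip_list("org-a, org-b")
print(steps_for("org-a", skip))
print(steps_for("org-c", skip))
```

Check the files in the kubernetes-folder for the value format that is actually expected before relying on the comma-separated assumption.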
Worker
To get an up and running instance of the workers no explicit steps are required.
For configuration, please consider these properties within the argo-folders of the respective workers:
- postfix: a postfix you might want to add to your services (should be consistent across your installation, as cluster-internal domains are predefined with this postfix)
- CONTAINER_REGISTRY: the container-registry that stores the Docker-image
- tagVersion: the tag-version of the Docker-image
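CONTAINER_REGISTRY and tagVersion together determine which Docker-image is pulled. The sketch below shows how the two values combine into a standard image reference of the form registry/name:tag. The function `image_reference`, the image name, and the registry host are hypothetical. How the worker manifests actually compose the reference is not specified here.

```python
def image_reference(container_registry: str, image_name: str, tag_version: str) -> str:
    """Combine registry, image name and tag into a registry/name:tag reference."""
    registry = container_registry.rstrip("/")
    return f"{registry}/{image_name}:{tag_version}"


# Placeholder registry, image name and tag (assumptions for illustration).
print(image_reference("registry.example.com", "example-worker", "1.0.0"))
```

Using the same CONTAINER_REGISTRY and tagVersion values across all services keeps the deployed images consistent with what you built and pushed.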
UI
Will be defined soon.