Installing OpenTelemetry on Kubernetes

Posted by Polin on Sat, Dec 7, 2024

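In a modern microservice architecture, application observability is essential. OpenTelemetry is the open-source standard framework widely adopted across the industry for collecting distributed traces, metrics, and logs. This walkthrough shows how to install OpenTelemetry on Kubernetes and uses a sample application to demonstrate the data collection flow.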

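Prerequisites

Before you begin, make sure your macOS environment has the following software:

  1. Homebrew: the package manager for macOS, used to install the other required software.

  2. KinD: a Docker-based tool for running a local Kubernetes cluster, used to develop and test Kubernetes workloads locally. It can be installed with Homebrew: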

    brew install kind
    

    For more details, see the post "Creating a Kubernetes cluster locally with KinD".

  3. Helm: the package manager for Kubernetes applications, used to simplify deployment and management. It can be installed with Homebrew:

    brew install helm
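
The commands in the rest of this walkthrough also call kubectl, and KinD, being Docker-based, needs the docker CLI; neither appears in the list above. Below is a minimal sketch for checking that all of these tools are on the PATH before starting. The function names missing_tools and check_prerequisites are made up for this example, and the check only looks for the executables; it does not verify that Docker is actually running.

```shell
#!/bin/sh
# Print the name of every given tool that is not found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Check the tools this walkthrough calls (assumed list: docker, kind, helm, kubectl).
check_prerequisites() {
  missing=$(missing_tools docker kind helm kubectl)
  if [ -n "$missing" ]; then
    printf 'Missing tools:\n%s\n' "$missing" >&2
    return 1
  fi
  printf 'All prerequisites found.\n'
}
```

Calling check_prerequisites either confirms that everything was found or lists what still needs to be installed, for example with brew install.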
    

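Installing cert-manager

cert-manager is a dependency of the OpenTelemetry Operator: by default, the Operator relies on it to issue the certificates used by its admission webhooks. Make sure cert-manager is installed before installing the Operator.

The installation steps are as follows:

1. Add the cert-manager repository: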

helm repo add jetstack https://charts.jetstack.io

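On success, the command prints a message similar to the following:

"jetstack" has been added to your repositories

2. Update the Helm chart repository:

helm repo update jetstack

On success, the command prints a message similar to the following: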

Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈

3. Install cert-manager:

helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true

On success, the command prints a message similar to the following:

NAME: cert-manager
LAST DEPLOYED: Sat Dec  7 11:57:41 2024
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.16.2 has been deployed successfully!

In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).

More information on the different types of issuers and how to configure them
can be found in our documentation:

https://cert-manager.io/docs/configuration/

For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:

https://cert-manager.io/docs/usage/ingress/

4. Verify the installation

Check that cert-manager is running:

kubectl get pods -n cert-manager

On success, the command prints a message similar to the following:

NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-b6fd485d9-58qmt              1/1     Running   0          100s
cert-manager-cainjector-dcc5966bc-782sf   1/1     Running   0          100s
cert-manager-webhook-dfb76c7bd-2sggz      1/1     Running   0          100s
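
Listing the pods shows their state at a single moment. When scripting the installation, it may be more convenient to block until the Deployments report the Available condition, which kubectl wait can do. A minimal sketch follows; the function name wait_for_deployments and the 120s default timeout are assumptions for this example, and the KUBECTL variable exists only so the command can be previewed (for instance with KUBECTL=echo) without touching a cluster.

```shell
#!/bin/sh
# Block until every Deployment in the namespace is Available, or time out.
# Usage: wait_for_deployments <namespace> [timeout]
wait_for_deployments() {
  namespace=$1
  timeout=${2:-120s}   # assumed default; adjust for slower machines
  "${KUBECTL:-kubectl}" wait deployment --all \
    --for=condition=Available \
    --namespace "$namespace" \
    --timeout="$timeout"
}
```

Running wait_for_deployments cert-manager here, and wait_for_deployments opentelemetry-operator after the next section, gives a simple gate before moving on to the following step.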

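Installing the OpenTelemetry Operator

The OpenTelemetry Operator helps manage Collectors and automatically injects tracing configuration into workloads.

The installation steps are as follows:

1. Add the OpenTelemetry Operator repository:

First, add the OpenTelemetry Helm chart repository to your local Helm configuration: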

helm repo add open-telemetry \
  https://open-telemetry.github.io/opentelemetry-helm-charts

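On success, the command prints a message similar to the following:

"open-telemetry" has been added to your repositories

2. Update the Helm chart repository

Make sure you have the latest chart data:

helm repo update open-telemetry

On success, the command prints a message similar to the following: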

Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "open-telemetry" chart repository
Update Complete. ⎈Happy Helming!⎈

3. Install the OpenTelemetry Operator

Use the Helm chart to install the OpenTelemetry Operator into a dedicated namespace, opentelemetry-operator in this example:

helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator  \
  --create-namespace \
  --set "manager.collectorImage.repository=otel/opentelemetry-collector-k8s"

On success, the command prints a message similar to the following:

NAME: opentelemetry-operator
LAST DEPLOYED: Sat Dec  7 12:04:10 2024
NAMESPACE: opentelemetry-operator
STATUS: deployed
REVISION: 1
NOTES:
opentelemetry-operator has been installed. Check its status by running:
  kubectl --namespace opentelemetry-operator get pods -l "app.kubernetes.io/name=opentelemetry-operator"

Visit https://github.com/open-telemetry/opentelemetry-operator for instructions on how to create & configure OpenTelemetryCollector and Instrumentation custom resources by using the Operator.
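
Running helm install a second time fails once the release name is in use. If these steps are likely to be repeated, helm upgrade --install is one alternative: it installs the release when it is missing and upgrades it otherwise. The sketch below reuses the release name, chart, namespace, and --set value from the command above; the function name install_operator, the added --wait flag, and the HELM override variable are assumptions for this example.

```shell
#!/bin/sh
# Install or upgrade the OpenTelemetry Operator release idempotently.
# HELM can be overridden (for example HELM=echo) to preview the command.
install_operator() {
  "${HELM:-helm}" upgrade --install opentelemetry-operator \
    open-telemetry/opentelemetry-operator \
    --namespace opentelemetry-operator \
    --create-namespace \
    --set "manager.collectorImage.repository=otel/opentelemetry-collector-k8s" \
    --wait   # return only after the release's resources are ready
}
```

With --wait, Helm returns only after the release's resources are ready, so the verification step that follows should then show the Pod already running.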

4. Verify the deployment

Confirm that the Operator's Pod has started:

kubectl get pods -n opentelemetry-operator

On success, the command prints a message similar to the following:

NAME                                      READY   STATUS    RESTARTS   AGE
opentelemetry-operator-6c58f7d968-4tqjc   2/2     Running   0          3m28s

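Configuring the OpenTelemetry Collector

The Collector is a core component of OpenTelemetry, responsible for receiving, processing, and exporting observability data.

1. Create the Collector YAML file

Create a file named otel-collector.yaml: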

cat > otel-collector.yaml <<EOF
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: opentelemetry-operator
spec:
  mode: sidecar
  config:
    receivers:
      jaeger:
        protocols:
          thrift_compact:
    processors:
      batch: {}
    exporters:
      debug:
        verbosity: detailed # use the debug exporter in place of the logging exporter
    service:
      pipelines:
        traces:
          receivers: [jaeger]
          processors: [batch]
          exporters: [debug]
EOF
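
Before applying a manifest like this one, it can be checked with a server-side dry run. This sends the manifest through the API server, including admission webhooks, without persisting anything, so schema or validation errors should surface early. A minimal sketch follows; the function name validate_manifest and the KUBECTL override variable are assumptions for this example.

```shell
#!/bin/sh
# Validate a manifest against the cluster without creating any resource.
# Usage: validate_manifest <file>
validate_manifest() {
  "${KUBECTL:-kubectl}" apply --dry-run=server -f "$1"
}
```

For instance, validate_manifest otel-collector.yaml should report the resource as created (server dry run) when the manifest is accepted.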

2. Apply the YAML file

Apply this configuration:

kubectl apply -f otel-collector.yaml

On success, the command prints a message similar to the following:

opentelemetrycollector.opentelemetry.io/otel-collector created

3. Verify the deployment

Confirm that the collector resource exists:

kubectl get opentelemetrycollectors -n opentelemetry-operator

On success, the command prints a message similar to the following:

NAMESPACE                NAME             MODE      VERSION   READY   AGE   IMAGE   MANAGEMENT
opentelemetry-operator   otel-collector   sidecar   0.114.0           44s           managed

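Deploying a sample application and integrating OpenTelemetry

We use the sample application provided by jaegertracing. The following shows how to deploy it and integrate it with the OpenTelemetry Collector. The sidecar.opentelemetry.io/inject annotation refers to the Collector as namespace/name, which tells the Operator to inject that Collector into the Pod as a sidecar.

1. Create the sample YAML file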

cat > pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  annotations:
    sidecar.opentelemetry.io/inject: "opentelemetry-operator/otel-collector"
spec:
  containers:
  - name: myapp
    image: jaegertracing/vertx-create-span:operator-e2e-tests
    ports:
      - containerPort: 8080
        protocol: TCP
EOF

2. Apply the YAML file

Apply this configuration:

kubectl apply -f pod.yaml

On success, the command prints a message similar to the following:

pod/myapp created

3. Verify the deployment

1. Check that the Pod started normally

kubectl get pod

On success, the command prints a message similar to the following:

NAME    READY   STATUS    RESTARTS   AGE
myapp   2/2     Running   0          13s
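
A READY value of 2/2 suggests that the sidecar was injected next to the application container. To confirm which containers the Pod actually runs, the container names can be read with a JSONPath query. A minimal sketch follows; the function name pod_container_names and the KUBECTL override variable are assumptions for this example.

```shell
#!/bin/sh
# Print the names of all containers in a Pod, separated by spaces.
# Usage: pod_container_names <pod>
pod_container_names() {
  pod=$1
  "${KUBECTL:-kubectl}" get pod "$pod" \
    -o "jsonpath={.spec.containers[*].name}"
}
```

For the Pod in this walkthrough, pod_container_names myapp should list myapp and otc-container, the two container names used in the log commands later on.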

2. Forward port 8080 to test the connection directly

kubectl port-forward pods/myapp 8080:8080

On success, the command prints a message similar to the following:

Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080

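3. Test the connection

The port-forward command keeps running in the foreground, so open a separate terminal and test with curl:

curl localhost:8080

On a successful connection, a message similar to the following is displayed: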

Hello from Vert.x!

4. View the Pod's logs

View the log of the main application container:

kubectl logs pods/myapp -c myapp

On success, the command prints a message similar to the following:

2024-12-07 05:05:51 INFO  MainVerticle:31 - Agent at: null
2024-12-07 05:05:51 INFO  Configuration:248 - Initialized tracer=JaegerTracer(version=Java-0.35.5, serviceName=order, reporter=CompositeReporter(reporters=[RemoteReporter(sender=UdpSender(), closeEnqueueTimeout=1000), LoggingReporter(logger=org.slf4j.reload4j.Reload4jLoggerAdapter@4e594447)]), sampler=ConstSampler(decision=true, tags={sampler.type=const, sampler.param=true}), tags={hostname=myapp, jaeger.version=Java-0.35.5, ip=10.244.0.23}, zipkinSharedRpcSpan=false, expandExceptionLogs=false, useTraceId128Bit=false)
2024-12-07 05:05:51 INFO  Configuration:248 - Initialized tracer=JaegerTracer(version=Java-0.35.5, serviceName=inventory, reporter=CompositeReporter(reporters=[RemoteReporter(sender=UdpSender(), closeEnqueueTimeout=1000), LoggingReporter(logger=org.slf4j.reload4j.Reload4jLoggerAdapter@4e594447)]), sampler=ConstSampler(decision=true, tags={sampler.type=const, sampler.param=true}), tags={hostname=myapp, jaeger.version=Java-0.35.5, ip=10.244.0.23}, zipkinSharedRpcSpan=false, expandExceptionLogs=false, useTraceId128Bit=false)
2024-12-07 05:05:51 INFO  MainVerticle:49 - HTTP server started on port 8080
2024-12-07 05:05:59 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:381500d6b91fea96:4f22b3136d0cd1cb:1 - getAccountFromCache
2024-12-07 05:05:59 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:96b13d445554b087:4f22b3136d0cd1cb:1 - getAccountFromStorage
2024-12-07 05:05:59 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:4f22b3136d0cd1cb:65054449879d93f5:1 - getAccount
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:9683a29070bad470:36b4834ef81162cf:1 - chargeCreditCard
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:36b4834ef81162cf:65054449879d93f5:1 - submitOrder
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:65054449879d93f5:0:1 - requestStarted
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:bb8711f5897dac9c:36b4834ef81162cf:1 - dispatchEventToInventory
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:c5f2e7088058515d:36b4834ef81162cf:1 - changeOrderStatus
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:7870df4211fc981c:a5bc4913627f045e:1 - checkInventoryStatus
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:60f8553b097346a7:a5bc4913627f045e:1 - updateInventory
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:c5d3c88f157f99c1:a5bc4913627f045e:1 - prepareOrderManifest
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:a5bc4913627f045e:bb8711f5897dac9c:1 - receiveEvent

The log shows that every span was reported.
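
Each "Span reported" entry carries the Jaeger span context in the form trace-id:span-id:parent-span-id:flags, so parent/child relationships can be read straight from the application log. Below is a small sketch that splits one such value into labeled fields; the function name parse_span_context is invented for this example.

```shell
#!/bin/sh
# Split a Jaeger span context "trace:span:parent:flags" into labeled fields.
# Usage: parse_span_context <context>
parse_span_context() {
  printf '%s\n' "$1" | {
    IFS=: read -r trace span parent flags
    printf 'trace=%s span=%s parent=%s flags=%s\n' \
      "$trace" "$span" "$parent" "$flags"
  }
}
```

Applied to the first entry above, parse_span_context 65054449879d93f5:381500d6b91fea96:4f22b3136d0cd1cb:1 shows that getAccountFromCache (span 381500d6b91fea96) is a child of span 4f22b3136d0cd1cb, which the log identifies as getAccount.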

Next, view the sidecar's log:

kubectl logs pods/myapp -c otc-container

On success, the command prints a message similar to the following:

2024-12-07T05:05:51.060Z	info	service@v0.114.0/service.go:166	Setting up own telemetry...
2024-12-07T05:05:51.060Z	warn	service@v0.114.0/service.go:221	service::telemetry::metrics::address is being deprecated in favor of service::telemetry::metrics::readers
2024-12-07T05:05:51.060Z	info	telemetry/metrics.go:70	Serving metrics	{"address": "0.0.0.0:8888", "metrics level": "Normal"}
2024-12-07T05:05:51.061Z	info	builders/builders.go:26	Development component. May change in the future.	{"kind": "exporter", "data_type": "traces", "name": "debug"}
2024-12-07T05:05:51.062Z	warn	jaegerreceiver@v0.114.0/factory.go:49	jaeger receiver will deprecate Thrift-gen and replace it with Proto-gen to be compatbible to jaeger 1.42.0 and higher. See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18485 for more details.	{"kind": "receiver", "name": "jaeger", "data_type": "traces"}
2024-12-07T05:05:51.063Z	info	service@v0.114.0/service.go:238	Starting otelcol-k8s...	{"Version": "0.114.0", "NumCPU": 4}
2024-12-07T05:05:51.063Z	info	extensions/extensions.go:39	Starting extensions...
2024-12-07T05:05:51.065Z	info	service@v0.114.0/service.go:261	Everything is ready. Begin running and processing data.
2024-12-07T05:06:00.505Z	info	Traces	{"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 3}
2024-12-07T05:06:00.505Z	info	ResourceSpans #0
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(order)
     -> host.name: Str(myapp)
     -> opencensus.exporterversion: Str(Jaeger-Java-0.35.5)
     -> ip: Str(10.244.0.23)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope  
Span #0
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 4f22b3136d0cd1cb
    ID             : 381500d6b91fea96
    Name           : getAccountFromCache
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.551 +0000 UTC
    End time       : 2024-12-07 05:05:59.615371 +0000 UTC
    Status code    : Error
    Status message : 
Attributes:
     -> message: Str(Cache miss)
Span #1
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 4f22b3136d0cd1cb
    ID             : 96b13d445554b087
    Name           : getAccountFromStorage
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.616 +0000 UTC
    End time       : 2024-12-07 05:05:59.644533 +0000 UTC
    Status code    : Unset
    Status message : 
Span #2
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 65054449879d93f5
    ID             : 4f22b3136d0cd1cb
    Name           : getAccount
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.501 +0000 UTC
    End time       : 2024-12-07 05:05:59.645381 +0000 UTC
    Status code    : Unset
    Status message : 
	{"kind": "exporter", "data_type": "traces", "name": "debug"}
2024-12-07T05:06:01.509Z	info	Traces	{"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 2, "spans": 9}
2024-12-07T05:06:01.510Z	info	ResourceSpans #0
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(order)
     -> host.name: Str(myapp)
     -> opencensus.exporterversion: Str(Jaeger-Java-0.35.5)
     -> ip: Str(10.244.0.23)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope  
Span #0
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 36b4834ef81162cf
    ID             : 9683a29070bad470
    Name           : chargeCreditCard
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.646 +0000 UTC
    End time       : 2024-12-07 05:06:00.64654 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> card: Str(x123)
Span #1
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 65054449879d93f5
    ID             : 36b4834ef81162cf
    Name           : submitOrder
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.646 +0000 UTC
    End time       : 2024-12-07 05:06:00.650364 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> order-id: Str(c85b7644b6b5)
Span #2
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 
    ID             : 65054449879d93f5
    Name           : requestStarted
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.5 +0000 UTC
    End time       : 2024-12-07 05:06:00.656193 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> sampler.type: Str(const)
     -> account: Str(14a25bb1-c510-49b7-ac09-9f7f7f1cc354)
     -> sampler.param: Bool(true)
Span #3
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 36b4834ef81162cf
    ID             : bb8711f5897dac9c
    Name           : dispatchEventToInventory
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.651 +0000 UTC
    End time       : 2024-12-07 05:06:00.692278 +0000 UTC
    Status code    : Unset
    Status message : 
Span #4
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 36b4834ef81162cf
    ID             : c5f2e7088058515d
    Name           : changeOrderStatus
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.649 +0000 UTC
    End time       : 2024-12-07 05:06:00.720253 +0000 UTC
    Status code    : Unset
    Status message : 
ResourceSpans #1
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(inventory)
     -> host.name: Str(myapp)
     -> opencensus.exporterversion: Str(Jaeger-Java-0.35.5)
     -> ip: Str(10.244.0.23)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope  
Span #0
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : a5bc4913627f045e
    ID             : 7870df4211fc981c
    Name           : checkInventoryStatus
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.775 +0000 UTC
    End time       : 2024-12-07 05:06:00.872212 +0000 UTC
    Status code    : Unset
    Status message : 
Span #1
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : a5bc4913627f045e
    ID             : 60f8553b097346a7
    Name           : updateInventory
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.873 +0000 UTC
    End time       : 2024-12-07 05:06:00.897421 +0000 UTC
    Status code    : Error
    Status message : 
Attributes:
     -> message: Str(Cannot open connection to storage. Queueing update.)
Span #2
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : a5bc4913627f045e
    ID             : c5d3c88f157f99c1
    Name           : prepareOrderManifest
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.898 +0000 UTC
    End time       : 2024-12-07 05:06:00.970265 +0000 UTC
    Status code    : Unset
    Status message : 
Span #3
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : bb8711f5897dac9c
    ID             : a5bc4913627f045e
    Name           : receiveEvent
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.692 +0000 UTC
    End time       : 2024-12-07 05:06:00.970895 +0000 UTC
    Status code    : Unset
    Status message : 
	{"kind": "exporter", "data_type": "traces", "name": "debug"}

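Because the Collector is configured with the debug exporter, everything it exports is written to its own log.

As a result, the detailed traces can be read directly from the sidecar's log.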
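
The detailed output is long, so a condensed view can help when scanning for failures. The sketch below reduces the debug exporter output to one line per span containing its name and status code. It assumes the "Name :" and "Status code :" layout shown in the log above, which may differ in other Collector versions; the function name summarize_spans is made up for this example.

```shell
#!/bin/sh
# Read debug exporter output on stdin and print "<span name><TAB><status>".
summarize_spans() {
  awk '
    # Remember the most recent span name.
    /^ *Name +:/        { sub(/^ *Name +: */, ""); name = $0 }
    # Emit the name together with its status code.
    /^ *Status code +:/ { sub(/^ *Status code +: */, ""); print name "\t" $0 }
  '
}
```

It can be fed from the sidecar log, for example kubectl logs pods/myapp -c otc-container | summarize_spans, and piping the result through grep Error narrows it down to the failed spans.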

In real-world use, the Collector would instead forward the data to a centralized backend.
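
As one possible way to do that, the debug exporter could be swapped for the otlp exporter, which sends traces over OTLP/gRPC. The sketch below writes a variant of the earlier manifest with that change. The function name write_forwarding_collector, the default output filename, and the endpoint passed in are all placeholders, and tls.insecure is an assumption that only suits a local test cluster.

```shell
#!/bin/sh
# Write a Collector manifest that forwards traces to an OTLP/gRPC endpoint.
# Usage: write_forwarding_collector <host:port> [output-file]
write_forwarding_collector() {
  endpoint=$1
  outfile=${2:-otel-collector-forward.yaml}   # hypothetical filename
  cat > "$outfile" <<EOF
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: opentelemetry-operator
spec:
  mode: sidecar
  config:
    receivers:
      jaeger:
        protocols:
          thrift_compact: {}
    processors:
      batch: {}
    exporters:
      otlp:
        endpoint: "$endpoint"
        tls:
          insecure: true # assumption: plaintext traffic, test clusters only
    service:
      pipelines:
        traces:
          receivers: [jaeger]
          processors: [batch]
          exporters: [otlp]
EOF
}
```

After generating the file with the address of your own backend, it can be applied with kubectl apply -f in the same way as otel-collector.yaml.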

References

OpenTelemetry official website

OpenTelemetry Helm Charts on GitHub

Vert.x starter with Jaeger tracer sample application