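Installing OpenTelemetry on Kubernetes
In a modern microservice architecture, application observability is essential, and OpenTelemetry is the open-source standard framework the industry has adopted for collecting distributed traces, metrics, and logs. This walkthrough shows how to install OpenTelemetry on Kubernetes and uses a sample application to demonstrate how the data is collected.
Prerequisites
Before you begin, make sure your macOS environment has the following software: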
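- Homebrew: the package manager for macOS, used to install the other required software.
- KinD: a Docker-based tool for running a local Kubernetes cluster, used to develop and test Kubernetes workloads locally. Install it with Homebrew:
brew install kind
For more details, see the article "Creating a Kubernetes cluster locally with KinD".
- Helm: the package manager for Kubernetes applications, which simplifies deploying and managing them. Install it with Homebrew:
brew install helm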
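The commands in the rest of this walkthrough also call kubectl, and KinD needs a running Docker engine, so it can save time to confirm up front that everything is on the PATH. The following is a minimal sketch; the function name missing_tools and the exact tool list are assumptions made for this example, not part of any of the tools.

```shell
# Print the name of every tool from the argument list that is not on PATH.
# Prints nothing when all tools are available.
missing_tools() (
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
)

# Usage (assumed tool list for this walkthrough):
#   missing_tools brew docker kind helm kubectl
```

An empty result means every tool was found. If no cluster exists yet, `kind create cluster` creates one with the default name.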
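Installing Cert-Manager
Cert-Manager is a dependency of the OpenTelemetry Operator: it issues and manages the TLS certificates that the Operator's admission webhooks use. Make sure Cert-Manager is installed before installing the Operator.
The installation steps are as follows:
1. Add the Cert-Manager repository: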
helm repo add jetstack https://charts.jetstack.io
When the command succeeds, it prints a message similar to the following:
"jetstack" has been added to your repositories
2. Update the Helm chart repository:
helm repo update jetstack
When the command succeeds, it prints a message similar to the following:
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "jetstack" chart repository
Update Complete. ⎈Happy Helming!⎈
3. Install Cert-Manager:
helm install cert-manager jetstack/cert-manager \
  --namespace cert-manager \
  --create-namespace \
  --set crds.enabled=true
When the command succeeds, it prints a message similar to the following:
NAME: cert-manager
LAST DEPLOYED: Sat Dec  7 11:57:41 2024
NAMESPACE: cert-manager
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
cert-manager v1.16.2 has been deployed successfully!
In order to begin issuing certificates, you will need to set up a ClusterIssuer
or Issuer resource (for example, by creating a 'letsencrypt-staging' issuer).
More information on the different types of issuers and how to configure them
can be found in our documentation:
https://cert-manager.io/docs/configuration/
For information on how to configure cert-manager to automatically provision
Certificates for Ingress resources, take a look at the `ingress-shim`
documentation:
https://cert-manager.io/docs/usage/ingress/
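Running helm install a second time fails when the release already exists. When the steps need to be repeatable, for example in a setup script, `helm upgrade --install` can be used instead. The sketch below wraps it in a hypothetical function, install_cert_manager; the default version is an assumption taken from the v1.16.2 shown in the output above, so adjust it to the release you intend to run.

```shell
# Install cert-manager, or upgrade it in place if the release already exists.
# $1: chart version to pin (assumed default: v1.16.2, as seen in the output above)
install_cert_manager() {
  helm upgrade --install cert-manager jetstack/cert-manager \
    --namespace cert-manager \
    --create-namespace \
    --version "${1:-v1.16.2}" \
    --set crds.enabled=true \
    --wait   # block until the chart's resources report ready
}
```

Pinning the version keeps repeated runs from silently picking up a newer chart.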
4. Verify the installation
Check that Cert-Manager is running:
kubectl get pods -n cert-manager
When the command succeeds, it prints a message similar to the following:
NAME                                      READY   STATUS    RESTARTS   AGE
cert-manager-b6fd485d9-58qmt              1/1     Running   0          100s
cert-manager-cainjector-dcc5966bc-782sf   1/1     Running   0          100s
cert-manager-webhook-dfb76c7bd-2sggz      1/1     Running   0          100s
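Checking the table by eye works for three Pods, but a scripted check is handy when the installation is repeated. The sketch below is a hypothetical helper, pods_all_ready, that reads `kubectl get pods` output on standard input and succeeds only if every Pod is Running with all of its containers ready. It assumes the default column layout shown above.

```shell
# Succeed only if every row of `kubectl get pods` output (read from stdin)
# has STATUS "Running" and a READY value where both numbers match (1/1, 2/2).
pods_all_ready() {
  awk '
    NR == 1 { next }                 # skip the header row
    NF == 0 { next }                 # ignore blank lines
    {
      rows++
      split($2, ready, "/")          # READY column, e.g. "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) bad++
    }
    END {
      if (rows > 0 && bad == 0) exit 0
      exit 1
    }
  '
}
```

For example, `kubectl get pods -n cert-manager | pods_all_ready && echo ready`. kubectl also has a built-in alternative, `kubectl wait --for=condition=Ready pod --all -n cert-manager --timeout=120s`, which blocks until the Pods are ready or the timeout expires.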
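Installing the OpenTelemetry Operator
The OpenTelemetry Operator helps manage Collectors and automatically injects tracing configuration into workloads.
The installation steps are as follows:
1. Add the OpenTelemetry Operator repository:
First, add the OpenTelemetry Operator Helm chart repository to your local Helm configuration: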
helm repo add open-telemetry \
  https://open-telemetry.github.io/opentelemetry-helm-charts
When the command succeeds, it prints a message similar to the following:
"open-telemetry" has been added to your repositories
2. Update the Helm chart repository:
Make sure you have the latest chart data:
helm repo update open-telemetry
When the command succeeds, it prints a message similar to the following:
Hang tight while we grab the latest from your chart repositories...
...Successfully got an update from the "open-telemetry" chart repository
Update Complete. ⎈Happy Helming!⎈
3. Install the OpenTelemetry Operator:
Use the Helm chart to install the OpenTelemetry Operator into a dedicated namespace, here opentelemetry-operator:
helm install opentelemetry-operator open-telemetry/opentelemetry-operator \
  --namespace opentelemetry-operator  \
  --create-namespace \
  --set "manager.collectorImage.repository=otel/opentelemetry-collector-k8s"
When the command succeeds, it prints a message similar to the following:
NAME: opentelemetry-operator
LAST DEPLOYED: Sat Dec  7 12:04:10 2024
NAMESPACE: opentelemetry-operator
STATUS: deployed
REVISION: 1
NOTES:
opentelemetry-operator has been installed. Check its status by running:
  kubectl --namespace opentelemetry-operator get pods -l "app.kubernetes.io/name=opentelemetry-operator"
Visit https://github.com/open-telemetry/opentelemetry-operator for instructions on how to create & configure OpenTelemetryCollector and Instrumentation custom resources by using the Operator.
4. Verify the deployment
Confirm that the Operator's Pod has started:
kubectl get pods -n opentelemetry-operator
When the command succeeds, it prints a message similar to the following:
NAME                                      READY   STATUS    RESTARTS   AGE
opentelemetry-operator-6c58f7d968-4tqjc   2/2     Running   0          3m28s
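The installation notes mention two custom resources, OpenTelemetryCollector and Instrumentation. Creating an OpenTelemetryCollector only works once the Operator's CustomResourceDefinitions are registered, so a quick check could look like the sketch below. operator_crds_present is a made-up helper name, and the two CRD names are assumed here from the resource kinds, so compare them with the output of `kubectl get crd` on your cluster.

```shell
# Succeed only if both CRDs named in the Operator's install notes exist.
# The CRD names are an assumption derived from the resource kinds.
operator_crds_present() {
  kubectl get crd \
    opentelemetrycollectors.opentelemetry.io \
    instrumentations.opentelemetry.io >/dev/null 2>&1
}
```

For example, `operator_crds_present && echo "CRDs registered"`.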
Configuring the OpenTelemetry Collector
The Collector is the core component of OpenTelemetry; it receives, processes, and exports observability data.
1. Create the Collector YAML file
Create a file named otel-collector.yaml:
cat > otel-collector.yaml <<EOF
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: opentelemetry-operator
spec:
  mode: sidecar
  config:
    receivers:
      jaeger:
        protocols:
          thrift_compact:
    processors:
      batch: {}
    exporters:
      debug:
        verbosity: detailed # use the debug exporter in place of the logging exporter
    service:
      pipelines:
        traces:
          receivers: [jaeger]
          processors: [batch]
          exporters: [debug]
EOF
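One thing to watch with the `cat > file <<EOF` pattern: with an unquoted delimiter, the shell expands `$` expressions inside the body before the file is written. The configuration above contains none, so it is written unchanged, but Collector configurations often reference environment variables with the `${env:NAME}` syntax, and those need a quoted delimiter (`<<'EOF'`) to survive. The sketch below shows the difference; OTLP_HOST and both function names are hypothetical and used only for illustration.

```shell
# Unquoted delimiter: the shell substitutes $OTLP_HOST before cat sees it.
render_expanded() {
  cat <<EOF
endpoint: ${OTLP_HOST}:4317
EOF
}

# Quoted delimiter: the body is passed through unchanged, so the Collector
# (not the shell) resolves ${env:OTLP_HOST} when it loads the configuration.
render_literal() {
  cat <<'EOF'
endpoint: ${env:OTLP_HOST}:4317
EOF
}
```

With OTLP_HOST set to collector.example, render_expanded prints `endpoint: collector.example:4317`, while render_literal prints the `${env:OTLP_HOST}` reference untouched.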
2. Apply the YAML file
Apply this configuration:
kubectl apply -f otel-collector.yaml
When the command succeeds, it prints a message similar to the following:
opentelemetrycollector.opentelemetry.io/otel-collector created
3. Verify the deployment
Confirm that the collector exists:
kubectl get opentelemetrycollectors -n opentelemetry-operator
When the command succeeds, it prints a message similar to the following:
NAMESPACE                NAME             MODE      VERSION   READY   AGE   IMAGE   MANAGEMENT
opentelemetry-operator   otel-collector   sidecar   0.114.0           44s           managed
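Deploying a sample application and integrating OpenTelemetry
We use a sample image provided by jaegertracing. The following shows how to deploy it and integrate it with the OpenTelemetry Collector.
1. Create the sample YAML file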
cat > pod.yaml <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: myapp
  annotations:
    sidecar.opentelemetry.io/inject: "opentelemetry-operator/otel-collector"
spec:
  containers:
  - name: myapp
    image: jaegertracing/vertx-create-span:operator-e2e-tests
    ports:
      - containerPort: 8080
        protocol: TCP
EOF
2. Apply the YAML file
Apply this configuration:
kubectl apply -f pod.yaml
When the command succeeds, it prints a message similar to the following:
pod/myapp created
3. Verify the deployment
1. Check that the Pod started correctly:
kubectl get pod
When the command succeeds, it prints a message similar to the following:
NAME    READY   STATUS    RESTARTS   AGE
myapp   2/2     Running   0          13s
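READY shows 2/2 although pod.yaml declares a single container: the second one is the Collector sidecar that the Operator injected because of the sidecar.opentelemetry.io/inject annotation. To check this explicitly, something like the following could be used. has_collector_sidecar is a hypothetical helper, and otc-container is the sidecar container name used by the log command further down in this walkthrough.

```shell
# Succeed if the named Pod contains the Operator-injected sidecar container.
# $1: Pod name (looked up in the current namespace)
has_collector_sidecar() (
  pod="$1"
  # List the container names of the Pod, separated by spaces.
  names="$(kubectl get pod "$pod" -o jsonpath='{.spec.containers[*].name}')" || exit 1
  for name in $names; do
    [ "$name" = "otc-container" ] && exit 0
  done
  exit 1
)
```

For example, `has_collector_sidecar myapp && echo "sidecar injected"`. If the check fails, the annotation value (namespace/name of the OpenTelemetryCollector) is the first thing to compare against the applied collector.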
2. Forward port 8080 to test the connection directly:
kubectl port-forward pods/myapp 8080:8080
When the command succeeds, it prints a message similar to the following:
Forwarding from 127.0.0.1:8080 -> 8080
Forwarding from [::1]:8080 -> 8080
3. Test the connection
With the port-forward still running, use cURL from a second terminal:
curl localhost:8080
When the connection succeeds, it prints a message similar to the following:
Hello from Vert.x!
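4. View the Pod's logs
View the logs of the application container:
kubectl logs pods/myapp -c myapp
When the command succeeds, it prints a message similar to the following: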
2024-12-07 05:05:51 INFO  MainVerticle:31 - Agent at: null
2024-12-07 05:05:51 INFO  Configuration:248 - Initialized tracer=JaegerTracer(version=Java-0.35.5, serviceName=order, reporter=CompositeReporter(reporters=[RemoteReporter(sender=UdpSender(), closeEnqueueTimeout=1000), LoggingReporter(logger=org.slf4j.reload4j.Reload4jLoggerAdapter@4e594447)]), sampler=ConstSampler(decision=true, tags={sampler.type=const, sampler.param=true}), tags={hostname=myapp, jaeger.version=Java-0.35.5, ip=10.244.0.23}, zipkinSharedRpcSpan=false, expandExceptionLogs=false, useTraceId128Bit=false)
2024-12-07 05:05:51 INFO  Configuration:248 - Initialized tracer=JaegerTracer(version=Java-0.35.5, serviceName=inventory, reporter=CompositeReporter(reporters=[RemoteReporter(sender=UdpSender(), closeEnqueueTimeout=1000), LoggingReporter(logger=org.slf4j.reload4j.Reload4jLoggerAdapter@4e594447)]), sampler=ConstSampler(decision=true, tags={sampler.type=const, sampler.param=true}), tags={hostname=myapp, jaeger.version=Java-0.35.5, ip=10.244.0.23}, zipkinSharedRpcSpan=false, expandExceptionLogs=false, useTraceId128Bit=false)
2024-12-07 05:05:51 INFO  MainVerticle:49 - HTTP server started on port 8080
2024-12-07 05:05:59 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:381500d6b91fea96:4f22b3136d0cd1cb:1 - getAccountFromCache
2024-12-07 05:05:59 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:96b13d445554b087:4f22b3136d0cd1cb:1 - getAccountFromStorage
2024-12-07 05:05:59 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:4f22b3136d0cd1cb:65054449879d93f5:1 - getAccount
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:9683a29070bad470:36b4834ef81162cf:1 - chargeCreditCard
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:36b4834ef81162cf:65054449879d93f5:1 - submitOrder
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:65054449879d93f5:0:1 - requestStarted
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:bb8711f5897dac9c:36b4834ef81162cf:1 - dispatchEventToInventory
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:c5f2e7088058515d:36b4834ef81162cf:1 - changeOrderStatus
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:7870df4211fc981c:a5bc4913627f045e:1 - checkInventoryStatus
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:60f8553b097346a7:a5bc4913627f045e:1 - updateInventory
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:c5d3c88f157f99c1:a5bc4913627f045e:1 - prepareOrderManifest
2024-12-07 05:06:00 INFO  LoggingReporter:43 - Span reported: 65054449879d93f5:a5bc4913627f045e:bb8711f5897dac9c:1 - receiveEvent
The log contains a "Span reported" entry for each span the application created.
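Each "Span reported" entry carries the Jaeger span context in the form traceId:spanId:parentSpanId:flags, followed by the operation name. A root span has parent ID 0, as in the requestStarted entry. The sketch below splits one log line into those fields; parse_span_line is a hypothetical helper written for this walkthrough, not part of any Jaeger tooling.

```shell
# Split one "Span reported" log line into its Jaeger span-context fields.
# The context is printed as traceId:spanId:parentSpanId:flags.
parse_span_line() (
  line="$1"
  context="${line##*Span reported: }"   # drop timestamp and logger prefix
  name="${context#* - }"                # operation name after " - "
  context="${context%% - *}"            # keep only the colon-separated IDs
  IFS=: read -r trace_id span_id parent_id flags <<EOF
$context
EOF
  printf 'trace=%s span=%s parent=%s flags=%s name=%s\n' \
    "$trace_id" "$span_id" "$parent_id" "$flags" "$name"
)
```

Applied to the getAccountFromCache line above, it prints `trace=65054449879d93f5 span=381500d6b91fea96 parent=4f22b3136d0cd1cb flags=1 name=getAccountFromCache`. The parent ID 4f22b3136d0cd1cb is the span ID of getAccount, which is how the entries link into a single trace.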
Next, view the sidecar's logs:
kubectl logs pods/myapp -c otc-container
2024-12-07T05:05:51.060Z	info	service@v0.114.0/service.go:166	Setting up own telemetry...
2024-12-07T05:05:51.060Z	warn	service@v0.114.0/service.go:221	service::telemetry::metrics::address is being deprecated in favor of service::telemetry::metrics::readers
2024-12-07T05:05:51.060Z	info	telemetry/metrics.go:70	Serving metrics	{"address": "0.0.0.0:8888", "metrics level": "Normal"}
2024-12-07T05:05:51.061Z	info	builders/builders.go:26	Development component. May change in the future.	{"kind": "exporter", "data_type": "traces", "name": "debug"}
2024-12-07T05:05:51.062Z	warn	jaegerreceiver@v0.114.0/factory.go:49	jaeger receiver will deprecate Thrift-gen and replace it with Proto-gen to be compatbible to jaeger 1.42.0 and higher. See https://github.com/open-telemetry/opentelemetry-collector-contrib/pull/18485 for more details.	{"kind": "receiver", "name": "jaeger", "data_type": "traces"}
2024-12-07T05:05:51.063Z	info	service@v0.114.0/service.go:238	Starting otelcol-k8s...	{"Version": "0.114.0", "NumCPU": 4}
2024-12-07T05:05:51.063Z	info	extensions/extensions.go:39	Starting extensions...
2024-12-07T05:05:51.065Z	info	service@v0.114.0/service.go:261	Everything is ready. Begin running and processing data.
2024-12-07T05:06:00.505Z	info	Traces	{"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 1, "spans": 3}
2024-12-07T05:06:00.505Z	info	ResourceSpans #0
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(order)
     -> host.name: Str(myapp)
     -> opencensus.exporterversion: Str(Jaeger-Java-0.35.5)
     -> ip: Str(10.244.0.23)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope  
Span #0
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 4f22b3136d0cd1cb
    ID             : 381500d6b91fea96
    Name           : getAccountFromCache
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.551 +0000 UTC
    End time       : 2024-12-07 05:05:59.615371 +0000 UTC
    Status code    : Error
    Status message : 
Attributes:
     -> message: Str(Cache miss)
Span #1
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 4f22b3136d0cd1cb
    ID             : 96b13d445554b087
    Name           : getAccountFromStorage
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.616 +0000 UTC
    End time       : 2024-12-07 05:05:59.644533 +0000 UTC
    Status code    : Unset
    Status message : 
Span #2
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 65054449879d93f5
    ID             : 4f22b3136d0cd1cb
    Name           : getAccount
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.501 +0000 UTC
    End time       : 2024-12-07 05:05:59.645381 +0000 UTC
    Status code    : Unset
    Status message : 
	{"kind": "exporter", "data_type": "traces", "name": "debug"}
2024-12-07T05:06:01.509Z	info	Traces	{"kind": "exporter", "data_type": "traces", "name": "debug", "resource spans": 2, "spans": 9}
2024-12-07T05:06:01.510Z	info	ResourceSpans #0
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(order)
     -> host.name: Str(myapp)
     -> opencensus.exporterversion: Str(Jaeger-Java-0.35.5)
     -> ip: Str(10.244.0.23)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope  
Span #0
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 36b4834ef81162cf
    ID             : 9683a29070bad470
    Name           : chargeCreditCard
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.646 +0000 UTC
    End time       : 2024-12-07 05:06:00.64654 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> card: Str(x123)
Span #1
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 65054449879d93f5
    ID             : 36b4834ef81162cf
    Name           : submitOrder
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.646 +0000 UTC
    End time       : 2024-12-07 05:06:00.650364 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> order-id: Str(c85b7644b6b5)
Span #2
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 
    ID             : 65054449879d93f5
    Name           : requestStarted
    Kind           : Unspecified
    Start time     : 2024-12-07 05:05:59.5 +0000 UTC
    End time       : 2024-12-07 05:06:00.656193 +0000 UTC
    Status code    : Unset
    Status message : 
Attributes:
     -> sampler.type: Str(const)
     -> account: Str(14a25bb1-c510-49b7-ac09-9f7f7f1cc354)
     -> sampler.param: Bool(true)
Span #3
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 36b4834ef81162cf
    ID             : bb8711f5897dac9c
    Name           : dispatchEventToInventory
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.651 +0000 UTC
    End time       : 2024-12-07 05:06:00.692278 +0000 UTC
    Status code    : Unset
    Status message : 
Span #4
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : 36b4834ef81162cf
    ID             : c5f2e7088058515d
    Name           : changeOrderStatus
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.649 +0000 UTC
    End time       : 2024-12-07 05:06:00.720253 +0000 UTC
    Status code    : Unset
    Status message : 
ResourceSpans #1
Resource SchemaURL: 
Resource attributes:
     -> service.name: Str(inventory)
     -> host.name: Str(myapp)
     -> opencensus.exporterversion: Str(Jaeger-Java-0.35.5)
     -> ip: Str(10.244.0.23)
ScopeSpans #0
ScopeSpans SchemaURL: 
InstrumentationScope  
Span #0
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : a5bc4913627f045e
    ID             : 7870df4211fc981c
    Name           : checkInventoryStatus
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.775 +0000 UTC
    End time       : 2024-12-07 05:06:00.872212 +0000 UTC
    Status code    : Unset
    Status message : 
Span #1
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : a5bc4913627f045e
    ID             : 60f8553b097346a7
    Name           : updateInventory
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.873 +0000 UTC
    End time       : 2024-12-07 05:06:00.897421 +0000 UTC
    Status code    : Error
    Status message : 
Attributes:
     -> message: Str(Cannot open connection to storage. Queueing update.)
Span #2
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : a5bc4913627f045e
    ID             : c5d3c88f157f99c1
    Name           : prepareOrderManifest
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.898 +0000 UTC
    End time       : 2024-12-07 05:06:00.970265 +0000 UTC
    Status code    : Unset
    Status message : 
Span #3
    Trace ID       : 000000000000000065054449879d93f5
    Parent ID      : bb8711f5897dac9c
    ID             : a5bc4913627f045e
    Name           : receiveEvent
    Kind           : Unspecified
    Start time     : 2024-12-07 05:06:00.692 +0000 UTC
    End time       : 2024-12-07 05:06:00.970895 +0000 UTC
    Status code    : Unset
    Status message : 
	{"kind": "exporter", "data_type": "traces", "name": "debug"}
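Because the Collector's pipeline uses the debug exporter, exported data is written to the Collector's own log, so the detailed traces are visible directly in the sidecar's log.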
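Two details help when cross-checking the two logs. First, the application reports 64-bit trace IDs (16 hex digits), while the Collector prints them left-padded with zeros to 128 bits (32 hex digits). Second, each exported span starts with a line beginning with "Span #", so the spans can be counted. The helpers below are a small sketch with made-up names, pad_trace_id and count_exported_spans.

```shell
# Left-pad a 16-hex-digit Jaeger trace ID with zeros to the 32-digit form
# that the Collector's debug exporter prints.
pad_trace_id() {
  printf '%32s\n' "$1" | tr ' ' '0'
}

# Count the spans in debug-exporter output read from stdin.
# Only lines that start with "Span #" are counted, so the
# "ResourceSpans #" and "ScopeSpans #" headers are ignored.
count_exported_spans() {
  grep -c '^Span #'
}
```

`pad_trace_id 65054449879d93f5` yields the Trace ID value shown in the sidecar log. Piping the sidecar output shown above through count_exported_spans gives 12 (3 spans in the first batch and 9 in the second), which matches the 12 "Span reported" lines in the application log.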
In real-world use, the pipeline would export to a centralized backend.
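As a rough idea of what that could look like, the sketch below prints a variant of the collector manifest that keeps the debug exporter and adds an otlp exporter to the traces pipeline. Everything specific here is an assumption: render_collector_with_otlp is a hypothetical helper, the endpoint is a placeholder you pass in, plaintext gRPC (insecure: true) is assumed to be acceptable inside the cluster, and the Collector image in use is assumed to include the otlp exporter.

```shell
# Print an OpenTelemetryCollector manifest whose traces pipeline also
# forwards to an OTLP backend.
# $1: backend endpoint as host:port (placeholder, supplied by the caller)
render_collector_with_otlp() (
  endpoint="$1"
  cat <<EOF
apiVersion: opentelemetry.io/v1beta1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
  namespace: opentelemetry-operator
spec:
  mode: sidecar
  config:
    receivers:
      jaeger:
        protocols:
          thrift_compact: {}
    processors:
      batch: {}
    exporters:
      debug:
        verbosity: detailed
      otlp:
        endpoint: "${endpoint}"
        tls:
          insecure: true   # assumption: plaintext gRPC inside the cluster
    service:
      pipelines:
        traces:
          receivers: [jaeger]
          processors: [batch]
          exporters: [debug, otlp]
EOF
)
```

The output can be reviewed first and then piped to `kubectl apply -f -`. Sidecars that were already injected keep their old configuration until the Pod is recreated, so the sample Pod would need to be deleted and applied again.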