CUR2.0 を Athenaで利用する

Posted on 2024/08/14

ToC

Cost Usage Report 2.0

AWS Summit Tokyo 2024にて、AWS利用料に関連するセッションがたくさんありました。
CUR(Cost Usage Report)の活用の話のなかで、「CUR2.0がリリースされた」との話を聞いたため、これまで利用していたCUR(Legacy CURと呼ぶらしい…)との違いを確認して、CUR2.0を活用してみようと考えました。
これまでLegacy CURを出力して、Athenaにて集計等を実施しコストの計算を行っていたので、まずはCUR2.0でも同様にAthenaでクエリーができることを確認してみたいと思います。

S3のバケットポリシー設定

CUR2.0のファイルを出力するためには、既存のCURに加えてコスト管理データエクスポート機能(bcm-data-exports)がS3バケットにファイル出力できるように設定追加する必要があります。
Cloud Formationで追加する際には、下記のようなPolicyとなります。 また、コンソールから設定する際には、CUR2.0のData Exportの出力先の設定時に必要となるPolicyが表示されます。そのため、コンソールにて追加することでも問題ありません。

- Sid: "CUR20Policy"
  Principal:
    Service: 
      - !Sub "billingreports.${AWS::URLSuffix}"
      - !Sub "bcm-data-exports.${AWS::URLSuffix}"
  Action:
    - s3:PutObject
    - s3:GetBucketPolicy
  Effect: "Allow"
  Resource: 
    - !Sub "arn:${AWS::Partition}:s3:::${BucketName}"
    - !Sub "arn:${AWS::Partition}:s3:::${BucketName}/*"
Condition:
  StringLike:
  aws:SourceAccount: !Ref "AWS::AccountId"
  aws:SourceArn: 
    - !Sub "arn:${AWS::Partition}:cur:${AWS::Region}:${AWS::AccountId}:definition/*"
    - !Sub "arn:${AWS::Partition}:bcm-data-exports:${AWS::Region}:${AWS::AccountId}:export/*"

Data Exportの設定

AWSコンソールによる設定

Billing and Cost Managementの画面から、Cost Analysis > Data Exportsの順に選択すると現在出力されているExportの一覧が表示されます。 CUR2.0がリリースされる以前から利用していたCURは、TypeはLegacy CUR exportとして表示されました。
新しく出力するため、Createボタンを選択して出力種別をStandard data exportとして出力頻度や出力先の情報を設定していきます。
今回は、日次の頻度で、ファイルはParquet形式としてS3に出力し、そのファイルに対してAthenaでクエリーを実行できるようにしたいと思います。なお、CUR2.0のファイルがS3に出力されるのは、設定してから24時間程度かかるようですので、気長に待ちましょう…

/posts/2024/07/img/cur20_hu2ec1820cfd71376b58bd333f72ac7f3c_181572_700x0_resize_lanczos_3.png

Cloud Formationによる設定

Cloud Formationもサポートされているようですので、AWSコンソールによる設定と同じ条件で出力してみようと思います。
ドキュメントを参照しただけでは、はじめQueryStatementへの設定値がわかりませんでした。マネジメントコンソールの設定画面を見ていたところPreview queryというボタンがあったため設定値がわかりました。個人的には、CloudFormationにて設定した方がマネジメントコンソールでの設定のように出力先のS3バケットのPolicyを強制的に書き換える必要がないので良いようにも思います。

なお、テンプレートのDeletionPolicyについては、CloudFormationのImport Resourceを試した都合上、コメントとして残してます。Arnの指定は、AWSドキュメントを参照してください。

BillingReportDaily:
  Type: AWS::BCMDataExports::Export
  # DeletionPolicy: Retain
  Properties:
    Export:
      Name: "CostUsageReport20"
      DataQuery: 
        QueryStatement: 'SELECT bill_bill_type, bill_billing_entity, bill_billing_period_end_date, bill_billing_period_start_date, bill_invoice_id, bill_invoicing_entity, bill_payer_account_id, bill_payer_account_name, cost_category, discount, discount_bundled_discount, discount_total_discount, identity_line_item_id, identity_time_interval, line_item_availability_zone, line_item_blended_cost, line_item_blended_rate, line_item_currency_code, line_item_legal_entity, line_item_line_item_description, line_item_line_item_type, line_item_net_unblended_cost, line_item_net_unblended_rate, line_item_normalization_factor, line_item_normalized_usage_amount, line_item_operation, line_item_product_code, line_item_resource_id, line_item_tax_type, line_item_unblended_cost, line_item_unblended_rate, line_item_usage_account_id, line_item_usage_account_name, line_item_usage_amount, line_item_usage_end_date, line_item_usage_start_date, line_item_usage_type, pricing_currency, pricing_lease_contract_length, pricing_offering_class, pricing_public_on_demand_cost, pricing_public_on_demand_rate, pricing_purchase_option, pricing_rate_code, pricing_rate_id, pricing_term, pricing_unit, product, product_comment, product_fee_code, product_fee_description, product_from_location, product_from_location_type, product_from_region_code, product_instance_family, product_instance_type, product_instancesku, product_location, product_location_type, product_operation, product_pricing_unit, product_product_family, product_region_code, product_servicecode, product_sku, product_to_location, product_to_location_type, product_to_region_code, product_usagetype, reservation_amortized_upfront_cost_for_usage, reservation_amortized_upfront_fee_for_billing_period, reservation_availability_zone, reservation_effective_cost, reservation_end_time, reservation_modification_status, reservation_net_amortized_upfront_cost_for_usage, reservation_net_amortized_upfront_fee_for_billing_period, reservation_net_effective_cost, reservation_net_recurring_fee_for_usage, reservation_net_unused_amortized_upfront_fee_for_billing_period, reservation_net_unused_recurring_fee, reservation_net_upfront_value, reservation_normalized_units_per_reservation, reservation_number_of_reservations, reservation_recurring_fee_for_usage, reservation_reservation_a_r_n, reservation_start_time, reservation_subscription_id, reservation_total_reserved_normalized_units, reservation_total_reserved_units, reservation_units_per_reservation, reservation_unused_amortized_upfront_fee_for_billing_period, reservation_unused_normalized_unit_quantity, reservation_unused_quantity, reservation_unused_recurring_fee, reservation_upfront_value, resource_tags, savings_plan_amortized_upfront_commitment_for_billing_period, savings_plan_end_time, savings_plan_instance_type_family, savings_plan_net_amortized_upfront_commitment_for_billing_period, savings_plan_net_recurring_commitment_for_billing_period, savings_plan_net_savings_plan_effective_cost, savings_plan_offering_type, savings_plan_payment_option, savings_plan_purchase_term, savings_plan_recurring_commitment_for_billing_period, savings_plan_region, savings_plan_savings_plan_a_r_n, savings_plan_savings_plan_effective_cost, savings_plan_savings_plan_rate, savings_plan_start_time, savings_plan_total_commitment_to_date, savings_plan_used_commitment FROM COST_AND_USAGE_REPORT'
        TableConfigurations: 
          COST_AND_USAGE_REPORT: 
            INCLUDE_MANUAL_DISCOUNT_COMPATIBILITY: "FALSE"
            INCLUDE_RESOURCES: "TRUE"
            INCLUDE_SPLIT_COST_ALLOCATION_DATA: "FALSE"
            TIME_GRANULARITY: "DAILY"
      DestinationConfigurations:
        S3Destination:
          S3Bucket: "[Your S3 Bucket]"
          S3Region: "us-east-1"
          S3Prefix: "[Your S3 Prefix Key]"
          S3OutputConfigurations:
            Compression: "PARQUET"
            Format: "PARQUET"
            OutputType: "CUSTOM"
            Overwrite: "OVERWRITE_REPORT"
      RefreshCadence: 
        Frequency: "SYNCHRONOUS"

Athenaで利用するDatatable

最後にAthenaクエリを実行するためのData TableをCloud Formationで定義します。
少し長いので表示は折りたたんでいます。

コードを展開する
  BillingCURTable:
    Type: AWS::Glue::Table
    Properties:
      CatalogId: !Ref AWS::AccountId
      DatabaseName: !Ref BillingDatabase
      TableInput:
        Name: !Sub 'cost_usage_report'
        Description: 'AWS Cost Usage Report 2.0 table'
        TableType: EXTERNAL_TABLE
        Parameters:
          serialization.format: 1
          projection.enabled: 'true'
          projection.billing_period.type: date
          projection.billing_period.range: 2024-07,NOW
          projection.billing_period.format: yyyy-MM
          projection.billing_period.interval: '1'
          projection.billing_period.interval.unit: MONTHS
          storage.location.template: !Sub 's3://[Your S3 Bucket and Prefix Key]/data/BILLING_PERIOD=${!billing_period}'
        StorageDescriptor:
          Location: !Sub 's3://[Your S3 Bucket and Prefix Key]/data'
          Columns:
            - Name: bill_bill_type
              Type: string
            - Name: bill_billing_entity
              Type: string
            - Name: bill_billing_period_end_date
              Type: timestamp
            - Name: bill_billing_period_start_date
              Type: timestamp
            - Name: bill_invoice_id
              Type: string
            - Name: bill_invoicing_entity
              Type: string
            - Name: bill_payer_account_id
              Type: string
            - Name: bill_payer_account_name
              Type: string
            - Name: cost_category
              Type: map<string,string>
            - Name: discount
              Type: map<string,string>
            - Name: discount_bundled_discount
              Type: string
            - Name: discount_total_discount
              Type: string
            - Name: identity_line_item_id
              Type: string
            - Name: identity_time_interval
              Type: string
            - Name: line_item_availability_zone
              Type: string
            - Name: line_item_blended_cost
              Type: double
            - Name: line_item_blended_rate
              Type: string
            - Name: line_item_currency_code
              Type: string
            - Name: line_item_legal_entity
              Type: string
            - Name: line_item_line_item_description
              Type: string
            - Name: line_item_line_item_type
              Type: string
            - Name: line_item_net_unblended_cost
              Type: string
            - Name: line_item_net_unblended_rate
              Type: string
            - Name: line_item_normalization_factor
              Type: double
            - Name: line_item_normalized_usage_amount
              Type: double
            - Name: line_item_operation
              Type: string
            - Name: line_item_product_code
              Type: string
            - Name: line_item_resource_id
              Type: string
            - Name: line_item_tax_type
              Type: string
            - Name: line_item_unblended_cost
              Type: double
            - Name: line_item_unblended_rate
              Type: string
            - Name: line_item_usage_account_id
              Type: string
            - Name: line_item_usage_account_name
              Type: string
            - Name: line_item_usage_amount
              Type: double
            - Name: line_item_usage_end_date
              Type: timestamp
            - Name: line_item_usage_start_date
              Type: timestamp
            - Name: line_item_usage_type
              Type: string
            - Name: pricing_currency
              Type: string
            - Name: pricing_lease_contract_length
              Type: string
            - Name: pricing_offering_class
              Type: string
            - Name: pricing_public_on_demand_cost
              Type: double
            - Name: pricing_public_on_demand_rate
              Type: string
            - Name: pricing_purchase_option
              Type: string
            - Name: pricing_rate_code
              Type: string
            - Name: pricing_rate_id
              Type: string
            - Name: pricing_term
              Type: string
            - Name: pricing_unit
              Type: string
            - Name: product
              Type: map<string,string>
            - Name: product_comment
              Type: string
            - Name: product_fee_code
              Type: string
            - Name: product_fee_description
              Type: string
            - Name: product_from_location
              Type: string
            - Name: product_from_location_type
              Type: string
            - Name: product_from_region_code
              Type: string
            - Name: product_instance_family
              Type: string
            - Name: product_instance_type
              Type: string
            - Name: product_instancesku
              Type: string
            - Name: product_location
              Type: string
            - Name: product_location_type
              Type: string
            - Name: product_operation
              Type: string
            - Name: product_pricing_unit
              Type: string
            - Name: product_product_family
              Type: string
            - Name: product_region_code
              Type: string
            - Name: product_servicecode
              Type: string
            - Name: product_sku
              Type: string
            - Name: product_to_location
              Type: string
            - Name: product_to_location_type
              Type: string
            - Name: product_to_region_code
              Type: string
            - Name: product_usagetype
              Type: string
            - Name: reservation_amortized_upfront_cost_for_usage
              Type: double
            - Name: reservation_amortized_upfront_fee_for_billing_period
              Type: double
            - Name: reservation_availability_zone
              Type: string
            - Name: reservation_effective_cost
              Type: double
            - Name: reservation_end_time
              Type: string
            - Name: reservation_modification_status
              Type: string
            - Name: reservation_net_amortized_upfront_cost_for_usage
              Type: string
            - Name: reservation_net_amortized_upfront_fee_for_billing_period
              Type: string
            - Name: reservation_net_effective_cost
              Type: string
            - Name: reservation_net_recurring_fee_for_usage
              Type: string
            - Name: reservation_net_unused_amortized_upfront_fee_for_billing_period
              Type: string
            - Name: reservation_net_unused_recurring_fee
              Type: string
            - Name: reservation_net_upfront_value
              Type: string
            - Name: reservation_normalized_units_per_reservation
              Type: string
            - Name: reservation_number_of_reservations
              Type: string
            - Name: reservation_recurring_fee_for_usage
              Type: double
            - Name: reservation_reservation_a_r_n
              Type: string
            - Name: reservation_start_time
              Type: string
            - Name: reservation_subscription_id
              Type: string
            - Name: reservation_total_reserved_normalized_units
              Type: string
            - Name: reservation_total_reserved_units
              Type: string
            - Name: reservation_units_per_reservation
              Type: string
            - Name: reservation_unused_amortized_upfront_fee_for_billing_period
              Type: double
            - Name: reservation_unused_normalized_unit_quantity
              Type: double
            - Name: reservation_unused_quantity
              Type: double
            - Name: reservation_unused_recurring_fee
              Type: double
            - Name: reservation_upfront_value
              Type: double
            - Name: resource_tags
              Type: map<string,string>
            - Name: savings_plan_amortized_upfront_commitment_for_billing_period
              Type: double
            - Name: savings_plan_end_time
              Type: string
            - Name: savings_plan_instance_type_family
              Type: string
            - Name: savings_plan_net_amortized_upfront_commitment_for_billing_period
              Type: string
            - Name: savings_plan_net_recurring_commitment_for_billing_period
              Type: string
            - Name: savings_plan_net_savings_plan_effective_cost
              Type: string
            - Name: savings_plan_offering_type
              Type: string
            - Name: savings_plan_payment_option
              Type: string
            - Name: savings_plan_purchase_term
              Type: string
            - Name: savings_plan_recurring_commitment_for_billing_period
              Type: double
            - Name: savings_plan_region
              Type: string
            - Name: savings_plan_savings_plan_a_r_n
              Type: string
            - Name: savings_plan_savings_plan_effective_cost
              Type: double
            - Name: savings_plan_savings_plan_rate
              Type: double
            - Name: savings_plan_start_time
              Type: string
            - Name: savings_plan_total_commitment_to_date
              Type: double
            - Name: savings_plan_used_commitment
              Type: double
          SerdeInfo:
            SerializationLibrary: org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe
          InputFormat:  org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat
          OutputFormat: org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat
          Parameters:
            parquet.compression: SNAPPY
        PartitionKeys:
          - Name: billing_period
            Type: string

CUR2.0にクエリを実行

Data Tableを作成すると、Athenaでクエリーを実行できるようになります。 列数が114もあるため、詳細はAWSのドキュメントを参考にしてカラムの情報を確認してください。
Map型のカラムが定義されているなど、データ型もLegacy CURからは変更点がありそうです。


参照