Modeling Data¶
Installation¶
dbt --version
Core:
- installed: 1.9.4
- latest: 1.9.4 - Up to date!
Plugins:
- bigquery: 1.9.1 - Up to date!
Sources¶
_sources.yml
version: 2
sources:
- name: feast
description: >
This source contains the raw data for the fraud detection project. It includes
transaction records, user account features, and labels indicating whether a
transaction was fraudulent.
project: feast-oss
dataset: fraud_tutorial
meta:
domain: fraud
owner: "@kclai"
tags:
- fraud
- pii
tables:
- name: transactions
- name: user_account_features
- name: user_has_fraudulent_transactions
Staging¶
- naming convention:
stg__<src>__<tbl>
stg__feast__transactions¶
stg__feast__transactions.sql
select
src_account as src_account_id,
dest_account as dest_account_id,
amount,
case
when is_fraud = 1 then true
when is_fraud = 0 then false
else null
end as is_fraud,
timestamp as created_at
from {{ source('feast', 'transactions') }}
stg__feast__transactions.yml
version: 2
models:
- name: stg__feast__transactions
description: >
This staging model standardizes raw transaction data for downstream fraud analysis.
It includes source and destination accounts, transaction amount, timestamp, and
a binary fraud label. Used as a foundational layer for detecting fraudulent behavior.
config:
event_time: created_at
contract:
enforced: true
columns:
- name: src_account_id
data_type: string
description: >
The unique identifier of the source account that initiated the transaction.
- name: dest_account_id
data_type: string
description: >
The unique identifier of the destination account that received the transaction.
- name: amount
data_type: float
description: >
The monetary amount of the transaction. Assumes a consistent currency.
- name: is_fraud
data_type: boolean
description: >
A boolean flag indicating whether the transaction was identified as fraudulent.
`true` indicates a confirmed fraudulent transaction.
- name: created_at
data_type: timestamp
description: >
The timestamp when the transaction was created. Used for time-based analysis and
feature extraction (e.g., fraud trends over time).
stg__feast__acct_fraud_7d¶
stg__feast__acct_fraud_7d.sql
select
user_id as account_id,
case
when user_has_fraudulent_transactions_7d = 1 then true
when user_has_fraudulent_transactions_7d = 0 then false
else null
end as has_fraud_7d,
feature_timestamp
from {{ source('feast', 'user_has_fraudulent_transactions') }}
stg__feast__acct_fraud_7d.yml
version: 2
models:
- name: stg__feast__acct_fraud_7d
description: >
This staging model identifies whether a user has been involved in any fraudulent transactions
within a specified time window (7 days). It is typically used as a feature in fraud detection
pipelines or real-time inference systems.
config:
contract:
enforced: true
columns:
- name: account_id
data_type: string
description: >
The unique identifier of the user. This is used as the primary key to track fraud status
per user.
- name: has_fraud_7d
data_type: boolean
description: >
A binary flag (true or false) indicating whether the user has had any fraudulent transactions
in the past 7 days. Can be used as a feature in ML models.
- name: feature_timestamp
data_type: timestamp
description: >
The timestamp representing the point-in-time when the feature was calculated. Useful for
point-in-time correctness in feature stores or backtesting.
stg__feast__acct_profiles¶
stg__feast__acct_profiles.sql
select
user_id as account_id,
credit_score,
account_age_days,
case
when user_has_2fa_installed = 1 then true
when user_has_2fa_installed = 0 then false
else null
end as has_2fa_installed,
feature_timestamp
from {{ source('feast', 'user_account_features') }}
stg__feast__acct_profiles.yml
version: 2
models:
- name: stg__feast__acct_profiles
description: ""
config:
contract:
enforced: true
columns:
- name: account_id
data_type: string
description: ""
- name: credit_score
data_type: int
description: ""
- name: account_age_days
data_type: int
description: ""
- name: has_2fa_installed
data_type: boolean
description: ""
- name: feature_timestamp
data_type: timestamp
description: ""
Intermediate¶
int__feast__acct_num_txns¶
int__feast__acct_num_txns.sql
{{ config(
materialized='incremental',
incremental_strategy='microbatch',
event_time='transaction_date',
begin='2025-04-20',
batch_size='day',
partition_by={
"field": "transaction_date",
"data_type": "date",
"granularity": "day"
},
on_schema_change='append_new_columns'
) }}
with transactions as (
-- this ref will be auto-filtered
select
{{ dbt_utils.star(
ref('fraud', 'stg__feast__transactions')
) }}
from {{ ref('fraud', 'stg__feast__transactions') }}
)
select
src_account_id as account_id,
date(created_at) as transaction_date,
count(*) as num_transactions,
sum(amount) as total_amount,
sum(case when amount > 0 then amount else 0 end) as total_deposits,
from transactions
group by account_id, date(created_at)
int__feast__acct_num_txns__extented¶
int__feast__acct_num_txns__extented.sql
select
*,
sum(num_transactions) over (
partition by account_id
order by unix_date(transaction_date)
range between 6 preceding
and current row
) as num_transactions_7d
from {{ ref('fraud', 'int__feast__acct_num_txns') }}
Marts¶
feat_acct_num_txns_7d¶
feat_acct_num_txns_7d.sql
select
account_id as entity_id,
num_transactions_7d,
cast(transaction_date as timestamp) as feature_timestamp
from {{ ref("fraud", "int__feast__acct_num_txns__extented") }}
feat_acct_fraud_7d¶
feat_acct_fraud_7d.sql
select
account_id as entity_id,
has_fraud_7d,
feature_timestamp
from {{ ref('fraud', 'stg__feast__acct_fraud_7d') }}
feat_acct_profiles¶
feat_acct_profiles.sql
select
account_id as entity_id,
credit_score,
account_age_days,
has_2fa_installed,
feature_timestamp
from {{ ref('fraud', 'stg__feast__acct_profiles') }}
Exposures¶
exposures.yml
version: 2
exposures:
- name: account_features
description:
This exposure contains features related to account activity and fraud detection.
type: ml
maturity: medium
owner:
name: KC Lai
email: kuanchoulai10@gmail.com
label: Account-related features
url: https://example.com
depends_on:
- ref('feat_acct_fraud_7d')
- ref('feat_acct_num_txns_7d')
- ref('feat_acct_profiles')