Commit ff8ea73

Add support for Clickhouse (#10)
Clickhouse can now be optionally enabled by setting `enable_clickhouse = true`. As in our CloudFormation stack, it is a single EC2 instance. **I no longer create an NLB or Autoscaling group for this like in CloudFormation.** Instead the lambdas will point directly at the EC2 instance. If we ever change or replace the EC2 host, all references will automatically update to use the new IP; the NLB wasn't adding any value in this case. I believe I also have a way to preserve the Clickhouse metadata EBS volume between replacements. This will allow us to change things like the AMI or other params and recreate a new Clickhouse instance with a short downtime, but without losing any data.

The variables `use_external_clickhouse_address` in the root module, and `external_clickhouse_s3_bucket_name` and `clickhouse_instance_count` in the submodule, exist for one specific customer use case and shouldn't be used by anyone else.

Other changes:

* Allow passing `custom_domain` and `custom_certificate_arn` for CloudFront in the root module
* Randomize the RDS final snapshot suffix (this supports create/destroy/create/destroy CI workflows)
2 parents f86f6aa + e98e080 commit ff8ea73
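Enabling the new option from the root module might look like the following sketch; the module source path and the example domain/certificate values are placeholders, not part of this commit:

```hcl
module "braintrust-data-plane" {
  source = "./modules/braintrust-data-plane" # illustrative path

  # Opt in to the single-instance Clickhouse host added in this commit
  enable_clickhouse = true

  # Optional CloudFront customization, also added in this commit
  custom_domain          = "braintrust.example.com"
  custom_certificate_arn = "arn:aws:acm:us-east-1:123456789012:certificate/EXAMPLE"
}
```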

21 files changed: +529 −41 lines


.gitignore

Lines changed: 8 additions & 6 deletions
```diff
@@ -10,8 +10,8 @@ crash.log
 crash.*.log
 
 # Exclude all .tfvars files, which are likely to contain sensitive data, such as
-# password, private keys, and other secrets. These should not be part of version
-# control as they are data points which are potentially sensitive and subject
+# password, private keys, and other secrets. These should not be part of version
+# control as they are data points which are potentially sensitive and subject
 # to change depending on the environment.
 *.tfvars
 *.tfvars.json
@@ -36,13 +36,15 @@ override.tf.json
 .terraformrc
 terraform.rc
 
-# Do not commit the terraform.tf and provider.tf files. This module should not have concrete
+# Do not commit the terraform.tf and provider.tf files. This module should not have concrete
 # backend or provider configs. Users will set that up themselves.
-terraform.tf
-provider.tf
+/terraform.tf
+/provider.tf
+/modules/*/terraform.tf
+/modules/*/provider.tf
 
 # Ignore the braintrust-sandbox directory as it is used for internal testing
 braintrust-sandbox/*
 
 # Ignore IntelliJ IDEA project files
-.idea
+.idea
```

README.md

Lines changed: 0 additions & 2 deletions
```diff
@@ -2,8 +2,6 @@
 
 This module is used to create the VPC, Databases, Lambdas, and associated resources for the self-hosted Braintrust data plane.
 
-**NOTE: This module is not yet ready for production use. It is still under development.**
-
 ## How to use this module
 
 To use this module, **copy the [`examples/braintrust-data-plane`](examples/braintrust-data-plane) directory to a new Terraform directory in your own repository**. Follow the instructions in the [`README.md`](examples/braintrust-data-plane/README.md) file in that directory to configure the module for your environment.
```

examples/braintrust-data-plane/main.tf

Lines changed: 6 additions & 0 deletions
```diff
@@ -52,6 +52,12 @@ module "braintrust-data-plane" {
   # List of origins to whitelist for CORS
   # whitelisted_origins = []
 
+  # Custom domain name for the CloudFront distribution
+  # custom_domain = null
+
+  # ARN of the ACM certificate for the custom domain
+  # custom_certificate_arn = null
+
   # The maximum number of requests per user allowed in the time frame specified by outbound_rate_limit_window_minutes. Setting to 0 will disable rate limits
   # outbound_rate_limit_max_requests = 0
 
```

main.tf

Lines changed: 19 additions & 4 deletions
```diff
@@ -89,17 +89,18 @@ module "services" {
   postgres_host = module.database.postgres_database_address
   redis_host    = module.redis.redis_endpoint
   redis_port    = module.redis.redis_port
-  # TODO: Brainstore
-  # brainstore_hostname = var.brainstore_hostname
-  # brainstore_port = var.brainstore_port
-  # brainstore_s3_bucket_name = var.brainstore_s3_bucket_name
+
+  clickhouse_host      = try(module.clickhouse[0].clickhouse_instance_private_ip, null)
+  clickhouse_secret_id = try(module.clickhouse[0].clickhouse_secret_id, null)
 
   # Service configuration
   braintrust_org_name                 = var.braintrust_org_name
   api_handler_provisioned_concurrency = var.api_handler_provisioned_concurrency
   whitelisted_origins                 = var.whitelisted_origins
   outbound_rate_limit_window_minutes  = var.outbound_rate_limit_window_minutes
   outbound_rate_limit_max_requests    = var.outbound_rate_limit_max_requests
+  custom_domain                       = var.custom_domain
+  custom_certificate_arn              = var.custom_certificate_arn
 
   # Networking
   service_security_group_ids = [module.main_vpc.default_security_group_id]
@@ -121,3 +122,17 @@ module "services" {
 
   kms_key_arn = local.kms_key_arn
 }
+
+module "clickhouse" {
+  source = "./modules/clickhouse"
+  count  = var.enable_clickhouse ? 1 : 0
+
+  deployment_name                  = var.deployment_name
+  clickhouse_instance_count        = var.use_external_clickhouse_address ? 0 : 1
+  clickhouse_instance_type         = var.clickhouse_instance_type
+  clickhouse_metadata_storage_size = var.clickhouse_metadata_storage_size
+  clickhouse_subnet_id             = module.main_vpc.private_subnet_1_id
+  clickhouse_security_group_ids    = [module.main_vpc.default_security_group_id]
+
+  kms_key_arn = local.kms_key_arn
+}
```

modules/clickhouse/iam.tf

Lines changed: 50 additions & 0 deletions
```hcl
resource "aws_iam_instance_profile" "clickhouse" {
  name = "${var.deployment_name}-ClickhouseInstanceProfile"
  role = aws_iam_role.clickhouse.name
}

resource "aws_iam_role" "clickhouse" {
  name = "${var.deployment_name}-ClickhouseRole"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Principal = {
        Service = "ec2.amazonaws.com"
      }
      Action = "sts:AssumeRole"
    }]
  })
}

resource "aws_iam_role_policy" "clickhouse_secret_access" {
  name = "AccessSecret"
  role = aws_iam_role.clickhouse.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = "secretsmanager:GetSecretValue"
      Resource = aws_secretsmanager_secret.clickhouse_secret.arn
    }]
  })
}

resource "aws_iam_role_policy" "clickhouse_s3_access" {
  name = "AccessS3Bucket"
  role = aws_iam_role.clickhouse.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect = "Allow"
      Action = "s3:*"
      Resource = [
        "arn:aws:s3:::${local.clickhouse_bucket_name}",
        "arn:aws:s3:::${local.clickhouse_bucket_name}/*"
      ]
    }]
  })
}
```
modules/clickhouse/main.tf

Lines changed: 118 additions & 0 deletions
```hcl
locals {
  clickhouse_bucket_name = var.external_clickhouse_s3_bucket_name == null ? aws_s3_bucket.clickhouse_s3_bucket[0].id : var.external_clickhouse_s3_bucket_name
}

data "aws_region" "current" {}

data "aws_ami" "amazon_linux_2" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

data "aws_subnet" "clickhouse_subnet" {
  id = var.clickhouse_subnet_id
}

resource "aws_launch_template" "clickhouse" {
  count         = var.clickhouse_instance_count
  name          = "${var.deployment_name}-clickhouse"
  image_id      = data.aws_ami.amazon_linux_2.id
  instance_type = var.clickhouse_instance_type

  iam_instance_profile {
    name = aws_iam_instance_profile.clickhouse.name
  }

  metadata_options {
    http_endpoint               = "enabled"
    http_tokens                 = "required"
    http_put_response_hop_limit = 1
  }

  network_interfaces {
    delete_on_termination = true
    security_groups       = var.clickhouse_security_group_ids
    subnet_id             = var.clickhouse_subnet_id
  }

  block_device_mappings {
    device_name = "/dev/xvda"
    ebs {
      volume_size           = 128
      volume_type           = "gp3"
      encrypted             = true
      kms_key_id            = var.kms_key_arn
      delete_on_termination = true
    }
  }

  key_name = var.clickhouse_instance_key_pair_name

  user_data = base64encode(templatefile("${path.module}/user_data.sh", {
    aws_region                   = data.aws_region.current.name
    s3_bucket_name               = local.clickhouse_bucket_name
    clickhouse_secret_id         = aws_secretsmanager_secret.clickhouse_secret.arn
    clickhouse_secret_version_id = aws_secretsmanager_secret_version.clickhouse_secret.version_id
  }))

  tag_specifications {
    resource_type = "instance"
    tags = {
      Name = "${var.deployment_name}-clickhouse"
    }
  }

  tag_specifications {
    resource_type = "volume"
    tags = {
      Name = "${var.deployment_name}-clickhouse"
    }
  }

  tag_specifications {
    resource_type = "network-interface"
    tags = {
      Name = "${var.deployment_name}-clickhouse"
    }
  }
}

resource "aws_instance" "clickhouse" {
  count = var.clickhouse_instance_count
  launch_template {
    id      = aws_launch_template.clickhouse[0].id
    version = "$Latest"
  }
}

# EBS volume for Clickhouse metadata. This needs to be preserved across instances.
resource "aws_ebs_volume" "clickhouse_metadata" {
  count             = var.clickhouse_instance_count
  availability_zone = data.aws_subnet.clickhouse_subnet.availability_zone
  size              = var.clickhouse_metadata_storage_size
  type              = "gp3"
  encrypted         = true
  kms_key_id        = var.kms_key_arn

  tags = {
    Name = "${var.deployment_name}-clickhouse-metadata"
  }
  lifecycle {
    prevent_destroy = true
  }
}

resource "aws_volume_attachment" "clickhouse_metadata" {
  count       = var.clickhouse_instance_count
  device_name = "/dev/sdf"
  volume_id   = aws_ebs_volume.clickhouse_metadata[0].id
  instance_id = aws_instance.clickhouse[0].id

  # This is a workaround to ensure the volume is attached after the instance is created.
  depends_on = [aws_instance.clickhouse]
}
```
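Because the metadata volume is a separate resource guarded by `prevent_destroy`, the instance itself can be recreated (for example after an AMI change) while the volume survives and is re-attached by `aws_volume_attachment`. One way to force that recreation, sketched here under the assumption that Clickhouse is enabled in the root module, is Terraform's `-replace` flag:

```shell
# Recreate only the Clickhouse EC2 instance; aws_ebs_volume.clickhouse_metadata
# is left untouched and re-attached to the replacement instance.
terraform apply -replace='module.clickhouse[0].aws_instance.clickhouse[0]'
```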

modules/clickhouse/outputs.tf

Lines changed: 23 additions & 0 deletions
```hcl
output "clickhouse_instance_id" {
  value = try(aws_instance.clickhouse[0].id, null)
}

output "clickhouse_instance_private_ip" {
  value = try(aws_instance.clickhouse[0].private_ip, null)
}

output "clickhouse_secret_arn" {
  value = aws_secretsmanager_secret.clickhouse_secret.arn
}

output "clickhouse_s3_bucket_name" {
  value = try(aws_s3_bucket.clickhouse_s3_bucket[0].id, var.external_clickhouse_s3_bucket_name)
}

output "clickhouse_secret_version_id" {
  # This is unfortunately needed because the AWSCURRENT version stage appears to be eventually consistent.
  # If you try to get the secret right after creating it, you will get an error saying AWSCURRENT doesn't exist.
  # So instead you must point directly at the version_id.
  description = "The ID of the secret version"
  value       = aws_secretsmanager_secret_version.clickhouse_secret.version_id
}
```
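A consumer that needs the password immediately after creation can pin the exact version instead of relying on the AWSCURRENT stage. A hypothetical sketch (the consuming module and output wiring shown here are illustrative, not part of this commit):

```hcl
# Read the Clickhouse secret by explicit version_id rather than the
# AWSCURRENT stage, which can lag right after the secret is created.
data "aws_secretsmanager_secret_version" "clickhouse" {
  secret_id  = module.clickhouse[0].clickhouse_secret_arn
  version_id = module.clickhouse[0].clickhouse_secret_version_id
}
```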

modules/clickhouse/s3.tf

Lines changed: 22 additions & 0 deletions
```hcl
resource "aws_s3_bucket" "clickhouse_s3_bucket" {
  count         = var.external_clickhouse_s3_bucket_name == null ? 1 : 0
  bucket_prefix = "${var.deployment_name}-clickhouse"

  lifecycle {
    # S3 does not support renaming buckets
    ignore_changes = [bucket_prefix]
  }
}

resource "aws_s3_bucket_server_side_encryption_configuration" "clickhouse_s3_bucket" {
  count  = var.external_clickhouse_s3_bucket_name == null ? 1 : 0
  bucket = aws_s3_bucket.clickhouse_s3_bucket[0].id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm     = var.kms_key_arn != null ? "aws:kms" : "AES256"
      kms_master_key_id = var.kms_key_arn
    }
    bucket_key_enabled = var.kms_key_arn != null
  }
}
```

modules/clickhouse/secrets.tf

Lines changed: 19 additions & 0 deletions
```hcl
resource "aws_secretsmanager_secret" "clickhouse_secret" {
  name_prefix = "${var.deployment_name}/ClickhouseSecret-"
  kms_key_id  = var.kms_key_arn
}

data "aws_secretsmanager_random_password" "clickhouse_secret" {
  exclude_characters  = "\"'@/\\"
  exclude_punctuation = true
  password_length     = 32
}

resource "aws_secretsmanager_secret_version" "clickhouse_secret" {
  secret_id     = aws_secretsmanager_secret.clickhouse_secret.id
  secret_string = data.aws_secretsmanager_random_password.clickhouse_secret.random_password

  lifecycle {
    ignore_changes = [secret_string]
  }
}
```
