+Elementary Cloud is designed with the core principle of least privilege.
+Our cloud service does not require permissions to access customer data.
+Therefore, we ask you to create a dedicated role for Elementary with `read only` access limited to the Elementary schema in your data warehouse.
+
+As long as you follow the onboarding process instructions, Elementary Cloud cannot read data from your warehouse that does not reside in the Elementary schema.
+This ensures that Elementary Cloud will not mistakenly access your data, and minimizes the risk in case of a data breach.
+Our product and architecture are always evolving, but our commitment to secure design remains.
+
+
+## How it works
+
+1. You install the Elementary dbt package in your dbt project and configure it to write to its own schema, the Elementary schema.
+2. The package writes test results, run results, logs and metadata to the Elementary schema.
+3. The cloud service only requires `read access` to the Elementary schema, not to schemas where your sensitive data is stored.
+4. The cloud service connects to sync the Elementary schema using an **encrypted connection** and a **static IP address** that you will need to add to your allowlist.
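As a rough illustration of step 1, the package's models can be routed to their own schema from `dbt_project.yml`. This is a minimal sketch, and the schema name `elementary` is an assumption rather than something stated in this diff; use whatever name your onboarding instructions specify.

```yml
# dbt_project.yml (sketch; the schema name "elementary" is an assumption)
models:
  elementary:
    # Write all Elementary package models to a dedicated schema,
    # so the read-only role only needs access to this one schema.
    +schema: "elementary"
```

With dbt's default schema naming, a custom schema is appended to the target schema, so this would typically resolve to `<target_schema>_elementary`.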
+The Elementary schema stores only metadata, aggregated metrics and logs.
+You can find the details of the tables [here](/guides/modules-overview/dbt-package).
+
+The only exception is `test_results_samples`, which can be disabled. This feature shows a sample of a few raw failed rows for failed tests, to help you triage and understand the problem.
+To avoid this sampling, set the var `test_sample_rows_count: 0` in your `dbt_project.yml` (default is 5 sample rows).
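For reference, the setting described above would look roughly like this in `dbt_project.yml`. It is a minimal sketch that only uses the var name and value given in the text.

```yml
# dbt_project.yml
vars:
  # Disable sampling of raw failed rows (the default is 5 sample rows)
  test_sample_rows_count: 0
```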
+
+
+## Secrets and data protection
+
+- **Tokens and credentials** - For customer secrets (tokens and credentials) we use [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html). Secrets Manager uses envelope encryption with AWS KMS keys and data keys to protect each secret value. Whenever the secret value in a secret changes, Secrets Manager generates a new data key to protect it. The data key is encrypted under a KMS key and stored in the metadata of the secret. [See this link for more details](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html).
+- **Customer data (Elementary schema replica)** - The synced customer data is encrypted at rest using server-side encryption (AES-256).
+
+## Compliance
+
+[Contact us](mailto:legal@elementary-data.com) for auditing reports and penetration testing results.
+
+## Have more questions?
+
+We would be happy to answer!
+Reach out to us by [email](mailto:legal@elementary-data.com) or on [Slack](https://join.slack.com/t/elementary-community/shared_invite/zt-1b9vogqmq-y~IRhc2396CbHNBXLsrXcA).
docs/guides/add-elementary-tests.mdx
Lines changed: 32 additions & 23 deletions
@@ -2,8 +2,7 @@
 title: "Add anomaly detection tests"
 ---
 
-After you [install the dbt package](/quickstart#install-the-dbt-package), you can add Elementary data anomaly detection
-tests.
+After you [install the dbt package](/quickstart#install-the-dbt-package), you can add Elementary data anomaly detection tests.
 
 ## Data anomaly detection dbt tests
 
@@ -27,13 +26,13 @@ alt="Demo"
 Monitors the row count of your table over time per time bucket (if configured without `timestamp_column`, will count table total rows).
 
 Upon running the test, your data is split into time buckets (daily by default, configurable with the `time bucket`
-field), and then we compute the row count per bucket for the last `days_back` days (by default 14).
+field), and then we compute the row count per bucket for the last [`days_back`](/guides/anomaly-detection-configuration/days-back) days (by default 14).
 
 The test then compares the row count of buckets within the detection period (last 2 days by default, controlled by the
 `backfill_days` var), and compares it with the row count of the previous time buckets.
 If there were any anomalies during the detection period, the test will fail.
 
-For advanced configuration of Elementary anomaly tests, please click [here](/guides/add-elementary-tests#advanced-configuration-for-your-elementary-tests)
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
 
 
 <CodeGroup>
@@ -96,7 +95,7 @@ The test then compares the freshness of buckets within the detection period (las
 `backfill_days`var), and compares it with the freshness of the previous time buckets.
 If there were any anomalies during the detection period, the test will fail.
 
-For advanced configuration of Elementary anomaly tests, please click [here](/guides/add-elementary-tests#advanced-configuration-for-your-elementary-tests)
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
 
 <CodeGroup>
 
@@ -150,7 +149,7 @@ timestamp ("now") and the most recent event timestamp.
 - If both an `event_timestamp_column` and an `update_timestamp_column` are provided, the test will measure over time
 the difference between these two columns.
 
-For advanced configuration of Elementary anomaly tests, please click [here](/guides/add-elementary-tests#advanced-configuration-for-your-elementary-tests)
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
 
 <CodeGroup>
 
@@ -199,7 +198,7 @@ It is best to configure it on low-cardinality fields.
 The test counts rows grouped by given columns/expressions, and can be configured using the `dimensions`
 and `where_expression` keys.
 
-For advanced configuration of Elementary anomaly tests, please click [here](/guides/add-elementary-tests#advanced-configuration-for-your-elementary-tests)
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
 
 <CodeGroup>
 
@@ -265,7 +264,7 @@ Executes column level monitors and anomaly detection on all the columns of the t
 are [detailed here](/guides/data-anomaly-detection#tests-and-monitors-types) and can be configured using
 the `all_columns_anomalies` key.
 
-For advanced configuration of Elementary anomaly tests, please click [here](/guides/add-elementary-tests#advanced-configuration-for-your-elementary-tests)
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
 
 <CodeGroup>
 
@@ -323,10 +322,13 @@ Executes column level monitors and anomaly detection. Specific monitors
 are [detailed here](/guides/data-anomaly-detection#tests-and-monitors-types) and can be configured using
 the `column_anomalies` key.
 
-For advanced configuration of Elementary anomaly tests, please click [here](/guides/add-elementary-tests#advanced-configuration-for-your-elementary-tests)
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
 
 <CodeGroup>
 
+
+For advanced configuration of Elementary anomaly tests, refer to [tests configuration](/guides/elementary-tests-configuration).
+
 ```yml Models
 version: 2
 
@@ -563,28 +565,35 @@ sources:
 
 ## Configure your elementary anomaly detection tests
 
-The elementary anomaly detection tests described above can work out-of-the-box with default configuration. However,
-we support additional configuration that can be used to customize their behavior, depending on your needs.
-Read all about data anomaly detection tests configuration [here](/guides/elementary-tests-configuration).
+<Tip>If your data set has a timestamp column that represents the creation time of a field, it is highly recommended to configure it as a `timestamp_column`.</Tip>
+
+To support different types of data sets, the tests have configuration that can be used to customize their behavior.
+Read more about [data anomaly detection tests configuration here](/guides/elementary-tests-configuration).
 
-We recommend adding a tag to the tests so you could execute these in a dedicated run using the selection
-parameter `--select tag:elementary`.
-If you wish to only be warned on anomalies, configure the severity of the tests to warn.
+We recommend adding a tag to the tests so you can execute them in a dedicated run using the selection parameter `--select tag:elementary`.
+If you wish to only be warned on anomalies, configure the `severity` of the tests to `warn`.
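The tag and severity recommendation above could be expressed roughly as follows. This is a sketch, not the project's canonical example: the model name `my_model` and the column `updated_at` are hypothetical, and `volume_anomalies` is used only as a representative Elementary test.

```yml
version: 2

models:
  - name: my_model # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: updated_at # hypothetical column name
          # Tag the test so it can be selected in a dedicated run
          tags: ["elementary"]
          config:
            # Report anomalies as warnings instead of failures
            severity: warn
```

A dedicated run would then use the selector from the text, for example `dbt test --select tag:elementary`.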
 
 
 ## What happens on each test?
 
-Upon running a test, your data is split into time buckets based on the `time_bucket` field and is limited by
-the `days_back` var. The test then compares a certain metric (e.g. row count) of the buckets that are within the detection
-period (`backfill_days`) to the row count of all the previous time buckets within the `days_back` period.
+Upon running a test, your data is split into time buckets based on the [`time_bucket`](/guides/anomaly-detection-configuration/time-bucket) field and is limited by
+the [`days_back`](/guides/anomaly-detection-configuration/days-back) var. The test then compares a certain metric (e.g. row count) of the buckets that are within the detection
+period ([`backfill_days`](/guides/anomaly-detection-configuration/backfill-days)) to the same metric in all the previous time buckets within the [`days_back`](/guides/anomaly-detection-configuration/days-back) period.
 If there were any anomalies in the detection period, the test will fail.
-On each test elementary package executes the relevant monitors, and searches for anomalies by comparing to historical
-metrics.
-At the end of the `dbt test` run, all results and collected metrics are merged into the elementary models.
+On each test, the Elementary package executes the relevant monitors and searches for anomalies by comparing to historical metrics.
+To learn more, refer to [core concepts](/guides/how-anomaly-detection-works).
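To make the three settings concrete, a test that spells out the defaults mentioned in this guide (daily buckets, 14 days back, a 2-day detection period) might look like the sketch below. The model and column names are hypothetical, and the exact placement of these keys may differ between package versions, so treat this as an assumption to check against the linked configuration pages.

```yml
version: 2

models:
  - name: my_model # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: created_at # hypothetical column name
          # Split the data into daily buckets
          time_bucket:
            period: day
            count: 1
          # Only buckets from the last 14 days are considered
          days_back: 14
          # Buckets in the last 2 days form the detection period
          backfill_days: 2
```

With these values, the metric in each of the two most recent daily buckets is compared against the earlier buckets in the 14-day window.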
+
 
 ## What does it mean when a test fails?
 
-When a test fail, it means that an anomaly was detected on this metric and dataset. To learn more, refer
-to [anomaly detection](/guides/data-anomaly-detection#anomaly-detection).
+When a test fails, it means that an anomaly was detected on this metric and dataset. To learn more, refer to [core concepts](/guides/how-anomaly-detection-works) and [anomaly detection](/guides/data-anomaly-detection).
+If `backfill_days` is set to 2, only data points in the last 2 days will be included in the detection period and could be flagged as anomalous.
+If `backfill_days` is set to 7, the detection period will be 7 days long.
+
+For incremental models, this is also the period for re-calculating metrics.
+If metrics for buckets in the backfill days were already calculated, Elementary will overwrite them. The reason is to account for recent backfills of data, if there were any.
+This configuration should be changed according to your data delays.
+
+- _Default: 2_
+- _Relevant tests: Anomaly detection tests with `timestamp_column`_
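If your data regularly arrives up to a week late, the detection period could be widened for the whole project. The sketch below assumes `backfill_days` can be set as a global var in `dbt_project.yml`, which is how the test docs in this PR describe it ("the `backfill_days` var"); the value 7 is only an example.

```yml
# dbt_project.yml (sketch; assumes backfill_days is read as a project var)
vars:
  # Detection period of 7 days; metrics for these buckets are
  # re-calculated and overwritten on each run of incremental models
  backfill_days: 7
```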