Datasets are named, user-managed lists of string values (addresses, contract IDs, user IDs, tokens, etc.) that you reference from a Beam JavaScript transform via beam.contains(). Membership changes via the Datasets API are reflected immediately in deployed pipelines — no redeploy needed. Use datasets when your filter list is large, changes frequently, or is shared across multiple pipelines. For small, static allowlists, a set filter is simpler.

Using beam.contains() in a transform

Inside a JavaScript (v8) transform, call beam.contains() with the dataset name and the candidate values from the current record. It returns true if any of the values are in the dataset.
function transform(input) {
  if (beam.contains("DATASET_NAME", [input.field1, input.field2])) {
    return input;
  }
  return null; // drop records that don't match
}
| Argument | Type | Description |
| --- | --- | --- |
| DATASET_NAME | string | The dataset name as registered via Create dataset. Must belong to the authenticated API key. |
| [input.field1, input.field2] | string[] | Array of values from the current record to check for membership. Returns true on the first match. |
Values are normalized to lowercase on insert. Compare lowercase values from your record (e.g. input.from_address.toLowerCase()) to avoid missing matches on mixed-case fields like EVM addresses.
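
One way to apply this consistently is a small helper that prepares candidate values before they reach beam.contains(). The sketch below is illustrative only: normalizeCandidates is a hypothetical name, not part of Beam, and dropping empty or non-string values is an assumption about what you want rather than documented behavior.

```javascript
// Hypothetical helper (not part of the Beam API): prepare record fields
// for a membership check by lowercasing strings and dropping unusable values.
function normalizeCandidates(values) {
  return values
    .filter((v) => typeof v === "string" && v.length > 0) // skip null, undefined, "", non-strings
    .map((v) => v.toLowerCase()); // dataset values are stored lowercase
}

// Inside a transform, where `beam` is provided by the runtime, this could be used as:
//   beam.contains("watchlist-wallets",
//     normalizeCandidates([input.from_address, input.to_address]))
const sample = normalizeCandidates(["0xAbC123", null, "", "0xdef456"]);
console.log(sample); // [ '0xabc123', '0xdef456' ]
```

Keeping the normalization in one place means every transform that checks the same dataset lowercases its inputs the same way.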

Example — filter transfers where either side is in a watchlist

function transform(input) {
  const from = (input.from_address || "").toLowerCase();
  const to = (input.to_address || "").toLowerCase();
  if (beam.contains("watchlist-wallets", [from, to])) {
    return input;
  }
  return null;
}

Example — tag records that touch a known contract set

function transform(input) {
  const to = (input.to_address || "").toLowerCase();
  if (beam.contains("known-defi-contracts", [to])) {
    input.is_defi = true;
  }
  return input;
}
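
To exercise transforms like these outside a pipeline, you can stand in for the runtime's beam object with a local stub. This is a testing sketch only: makeBeamStub and the sample records are hypothetical, and the stub merely imitates the behavior described above (values lowercased on insert, true if any candidate is a member). How the real runtime treats an unknown dataset name is not stated here, so the stub's choice to return false is an assumption.

```javascript
// Hypothetical local stand-in for the runtime-provided `beam` object.
// `datasets` maps a dataset name to an array of member values.
function makeBeamStub(datasets) {
  const sets = new Map(
    Object.entries(datasets).map(([name, values]) => [
      name,
      new Set(values.map((v) => v.toLowerCase())), // mimic lowercase-on-insert
    ])
  );
  return {
    contains(name, candidates) {
      const set = sets.get(name);
      if (!set) return false; // assumption: an unknown dataset behaves as empty
      return candidates.some((c) => set.has(c)); // true if any candidate is a member
    },
  };
}

const beam = makeBeamStub({ "watchlist-wallets": ["0xAAA1", "0xbbb2"] });

// Same logic as the watchlist example above.
function transform(input) {
  const from = (input.from_address || "").toLowerCase();
  const to = (input.to_address || "").toLowerCase();
  return beam.contains("watchlist-wallets", [from, to]) ? input : null;
}

const kept = transform({ from_address: "0xAAA1", to_address: "0xccc3" });
const dropped = transform({ from_address: "0xddd4", to_address: "0xccc3" });
console.log(kept !== null, dropped === null); // true true
```

Because the stub stores only lowercase values, it also surfaces the case-sensitivity pitfall: a candidate passed without toLowerCase() will not match.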

Typical workflow

1. Create the dataset

Call Create dataset with a unique name. The response includes a dataset_id that you can use with the ID-based endpoints.

2. Populate it

Bulk-load values with Add entries (by ID) or Add entries by name. Each request accepts up to 250,000 values; split larger uploads into chunks and run at most 5 requests concurrently.

3. Reference it from a transform

In a JavaScript transform in your pipeline config, call beam.contains("DATASET_NAME", [input.field1, input.field2]). Deploy the pipeline once via Deploy.

4. Update membership live

Add or remove values via Add entries and Remove entries at any time. Lookups in the running pipeline pick up the change immediately, with no redeploy.
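
The chunking rule in step 2 can be sketched as follows. The two limits (250,000 values per request, 5 concurrent requests) come from the text above; everything else is an assumption. In particular, sendChunk is a hypothetical callback that you would implement against the Add entries endpoint, since request details are not covered on this page.

```javascript
const MAX_VALUES_PER_REQUEST = 250000; // per-request limit stated above
const MAX_CONCURRENT_REQUESTS = 5; // concurrency limit stated above

// Split a list of values into arrays no larger than `size`.
function chunk(values, size) {
  const chunks = [];
  for (let i = 0; i < values.length; i += size) {
    chunks.push(values.slice(i, i + size));
  }
  return chunks;
}

// Upload all values with at most `concurrency` requests in flight.
// `sendChunk(chunkValues, index)` is a hypothetical async function that
// performs one Add entries call; it is not defined by this sketch.
async function uploadAll(
  values,
  sendChunk,
  size = MAX_VALUES_PER_REQUEST,
  concurrency = MAX_CONCURRENT_REQUESTS
) {
  const chunks = chunk(values, size);
  let next = 0;

  // Each worker repeatedly claims the next unsent chunk until none remain.
  async function worker() {
    while (next < chunks.length) {
      const index = next++; // no await between the check and the increment
      await sendChunk(chunks[index], index);
    }
  }

  const workerCount = Math.min(concurrency, chunks.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return chunks.length; // number of requests issued
}
```

A fixed pool of workers keeps the number of in-flight requests bounded without any extra library. This sketch has no retry logic: if one sendChunk call rejects, uploadAll rejects, so you may want to wrap sendChunk with your own retry policy.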

Endpoints

Create dataset

Create a new dataset by name.

List datasets

List all datasets owned by the authenticated user.

Delete dataset

Permanently delete a dataset and all its entries.

Add entries

Add values to a dataset by dataset_id.

Add entries by name

Add values to a dataset by name — convenient when you only track names in your code.

List entries

Page through the values in a dataset.

Remove entries

Remove specific values from a dataset.

Check entry

Check whether a single value exists in a dataset.

Datasets vs. filter values

Both back fast set-membership lookups, but they’re wired in differently:
| | Datasets | Filter values |
| --- | --- | --- |
| Referenced from | JavaScript transforms via beam.contains() | Set filter transforms (declarative) |
| Scope | Reusable across pipelines and transforms | Bound to a specific set filter transform |
| Lookup shape | Check N record fields against one named list | Check one record field against the set |
| Best for | Cross-pipeline allowlists/denylists, dynamic enrichment logic | Simple “field X must be in this list” filters |
See Filter Values for the set-filter alternative.