curl --request POST \
--url https://api.trieve.ai/api/chunks/scroll \
--header 'Authorization: <api-key>' \
--header 'Content-Type: application/json' \
--header 'TR-Dataset: <tr-dataset>' \
--data '{
"filters": {
"must": [
{
"field": "tag_set",
"match_all": [
"A",
"B"
]
},
{
"field": "num_value",
"range": {
"gte": 10,
"lte": 25
}
}
]
},
"offset_chunk_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
"page_size": 1,
"sort_by": {
"field": "<string>",
"direction": "desc",
"prefetch_amount": 1
}
}'
{
"chunks": [
{
"chunk_html": "<p>Hello, world!</p>",
"created_at": "2021-01-01 00:00:00.000",
"dataset_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
"id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
"link": "https://trieve.ai",
"metadata": {
"key": "value"
},
"tag_set": "[tag1,tag2]",
"time_stamp": "2021-01-01 00:00:00.000",
"tracking_id": "e3e3e3e3-e3e3-e3e3-e3e3-e3e3e3e3e3e3",
"updated_at": "2021-01-01 00:00:00.000",
"weight": 0.5
}
]
}
Get paginated chunks from your dataset with filters and custom sorting. If sort_by is not specified, the results are sorted by chunk id in ascending order. sort_by and offset_chunk_id cannot be used together; to scroll with sort_by, add a must_not filter containing the ids you have already seen. A must_not filter can hold at most 1000 ids at a time.
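The sorted-scroll workaround described above can be sketched as a small payload builder. This is a minimal illustration, not the official client: the function name build_sorted_scroll_payload is hypothetical, and the {"ids": [...]} shape of the must_not condition is an assumption based on the statement that filters can be built from chunk ids.

```python
from typing import Any, Optional

MAX_MUST_NOT_IDS = 1000  # limit stated in the endpoint description


def build_sorted_scroll_payload(
    sort_field: str,
    seen_ids: list[str],
    direction: str = "desc",
    page_size: int = 10,
    filters: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a scroll request body that sorts and skips already-seen chunks."""
    if len(seen_ids) > MAX_MUST_NOT_IDS:
        raise ValueError("a must_not filter accepts at most 1000 ids at a time")

    # Copy the caller's clauses so the input dict is never mutated.
    merged = {key: list(value) for key, value in (filters or {}).items()}
    if seen_ids:
        # Assumed shape: an {"ids": [...]} condition excludes chunks by id.
        merged.setdefault("must_not", []).append({"ids": list(seen_ids)})

    payload: dict[str, Any] = {
        "page_size": page_size,
        "sort_by": {"field": sort_field, "direction": direction},
    }
    if merged:
        payload["filters"] = merged
    # offset_chunk_id is never set here: it cannot be combined with sort_by.
    return payload
```

After each response, append the returned chunk ids to seen_ids and build the next payload. Once more than 1000 ids have been seen, this approach no longer fits in a single must_not filter.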
The dataset id or tracking_id to use for the request. We assume you intend to use an id if the value is a valid uuid.
JSON request payload to scroll through chunks (chunks)
ChunkFilter is a JSON object which can be used to filter chunks. This is useful when you want to filter chunks by arbitrary metadata. Unlike tag filtering, filtering on metadata carries a performance hit.
All of these field conditions have to match for the chunk to be included in the result set.
Filters can be constructed using either fields on the chunk objects, ids or tracking ids of chunks, and finally ids or tracking ids of groups.
Field is the name of the field to filter on. Commonly used fields are timestamp, link, tag_set, location, num_value, group_ids, and group_tracking_ids. The field value will be used to check for an exact substring match on the metadata values for each existing chunk. This is useful when you want to filter chunks by arbitrary metadata. To access fields inside the metadata that you provide with the chunk, prefix the field name with "metadata." (for example, metadata.key1).
Boolean is a true or false value for a field. This only works for boolean fields. Specify it when you want to match values that are true or false.
DateRange is a JSON object which can be used to filter chunks by a range of dates. This leverages the time_stamp field on chunks in your dataset. You can specify this if you want values in a certain range. You must provide ISO 8601 combined date and time without timezone.
{
"gt": "2021-01-01 00:00:00.000",
"gte": "2021-01-01 00:00:00.000",
"lt": "2021-01-01 00:00:00.000",
"lte": "2021-01-01 00:00:00.000"
}
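To produce timestamps in the same shape as the example above, a naive datetime can be formatted with millisecond precision. The sketch below is illustrative only: the helper names are hypothetical, and the "date_range" key used for the condition is an assumption that does not appear in this page's examples.

```python
from datetime import datetime


def format_timestamp(value: datetime) -> str:
    """Format a naive datetime as 'YYYY-MM-DD HH:MM:SS.mmm'."""
    if value.tzinfo is not None:
        raise ValueError("expected a datetime without timezone information")
    # %f yields microseconds (6 digits); drop the last 3 to keep milliseconds.
    return value.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


def date_range_condition(start: datetime, end: datetime) -> dict:
    """Build a condition selecting chunks with start <= time_stamp < end."""
    return {
        "field": "time_stamp",
        # Assumed key name for the DateRange object.
        "date_range": {"gte": format_timestamp(start), "lt": format_timestamp(end)},
    }
```

Using gte for the lower bound and lt for the upper bound gives half-open intervals, so consecutive ranges do not select the same chunk twice.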
Match all lets you pass in an array of values that will return results if all of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
Match any lets you pass in an array of values that will return results if any of the items match. The match value will be used to check for an exact substring match on the metadata values for each existing chunk. If both match_all and match_any are provided, the match_any condition will be used.
{
"field": "metadata.key1",
"match": ["value1", "value2"],
"range": { "gt": 0, "gte": 0, "lt": 1, "lte": 1 }
}
None of these field conditions can match for the chunk to be included in the result set.
Each condition takes the same attributes as described above.
Only one of these field conditions has to match for the chunk to be included in the result set.
Each condition takes the same attributes as described above.
{
"must": [
{
"field": "tag_set",
"match_all": ["A", "B"]
},
{
"field": "num_value",
"range": { "gte": 10, "lte": 25 }
}
]
}
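How the three clauses combine can be illustrated with a local evaluator over plain dictionaries. This is only a sketch of the documented semantics, not the server's implementation: it handles match_all, match_any, and range on list-valued and numeric fields, the function names are hypothetical, and the "should" key for the "only one has to match" clause is an assumption.

```python
from typing import Any


def condition_matches(chunk: dict[str, Any], cond: dict[str, Any]) -> bool:
    """Check one field condition (match_any, match_all, or range) against a chunk."""
    value = chunk.get(cond["field"])
    if "match_any" in cond:
        # Documented precedence: match_any is used when both are provided.
        return isinstance(value, list) and any(v in value for v in cond["match_any"])
    if "match_all" in cond:
        return isinstance(value, list) and all(v in value for v in cond["match_all"])
    if "range" in cond:
        if not isinstance(value, (int, float)):
            return False
        r = cond["range"]
        return (
            ("gt" not in r or value > r["gt"])
            and ("gte" not in r or value >= r["gte"])
            and ("lt" not in r or value < r["lt"])
            and ("lte" not in r or value <= r["lte"])
        )
    return False


def chunk_passes(chunk: dict[str, Any], filters: dict[str, Any]) -> bool:
    """must: all match; must_not: none match; should: at least one matches."""
    must = filters.get("must", [])
    must_not = filters.get("must_not", [])
    should = filters.get("should", [])  # assumed key name
    return (
        all(condition_matches(chunk, c) for c in must)
        and not any(condition_matches(chunk, c) for c in must_not)
        and (not should or any(condition_matches(chunk, c) for c in should))
    )
```

With the example filter above, a chunk tagged A and B with a num_value of 12 passes, while a chunk missing tag B or with a num_value outside 10 to 25 does not.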
Offset chunk id is the id of the chunk to start the page from. If not specified, this defaults to the first chunk in the dataset sorted by id ascending.
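Scrolling by offset_chunk_id can be sketched as a loop that feeds the last id of each page back in as the next offset. The code below is a minimal illustration under stated assumptions: fetch_page is a hypothetical callable that you supply (it would POST to the scroll endpoint and return the "chunks" list), and because this page does not say whether the offset chunk itself is returned again, the loop skips ids it has already yielded.

```python
from typing import Any, Callable, Iterator, Optional

# Hypothetical signature: (offset_chunk_id, page_size) -> list of chunk dicts.
FetchPage = Callable[[Optional[str], int], list[dict[str, Any]]]


def scroll_all(fetch_page: FetchPage, page_size: int = 10) -> Iterator[dict[str, Any]]:
    """Yield every chunk once, advancing offset_chunk_id page by page."""
    offset: Optional[str] = None
    seen: set[str] = set()
    while True:
        page = fetch_page(offset, page_size)
        fresh = [c for c in page if c["id"] not in seen]
        if not fresh:
            return  # empty page, or only already-seen chunks: nothing left
        for chunk in fresh:
            seen.add(chunk["id"])
            yield chunk
        # Continue from the last chunk of this page.
        offset = page[-1]["id"]
```

If the offset turns out to be inclusive, use a page_size of at least 2; otherwise every page would contain only the offset chunk and the loop would stop after the first result.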
Page size is the number of chunks to fetch. This can be used to fetch more than 10 chunks at a time. Must be greater than or equal to 0.
Field to sort by. This has to be a numeric field with a Qdrant Range index on it, for example num_value or timestamp.
Sort direction. Available options: desc, asc.
How many results to pull in before the sort. Must be greater than or equal to 0.
Number of chunks equivalent to page_size starting from offset_chunk_id.
Timestamp of the creation of the chunk
ID of the dataset which the chunk belongs to
Unique identifier of the chunk, auto-generated uuid created by Trieve
Timestamp of the last update of the chunk
Weight of the chunk, can be any float. Used as a multiplier on a chunk's relevance score for ranking purposes.
HTML content of the chunk, can also be an arbitrary string which is not HTML
Image URLs of the chunk, can be any list of strings. Used for image search and RAG.
Link to the chunk, should be a URL
Metadata of the chunk, can be any JSON object
Numeric value of the chunk, can be any float. Can represent the most relevant numeric value of the chunk, such as a price, quantity in stock, rating, etc.
Tag set of the chunk, can be any list of strings. Used for tag-filtered searches.
Timestamp of the chunk, can be any timestamp. Specified by the user.
Tracking ID of the chunk, can be any string, determined by the user. Tracking IDs are unique identifiers for chunks within a dataset. They are designed to match the unique identifier of the chunk in the user's system.
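The response fields listed above can be mapped onto a typed structure on the client side. The sketch below is illustrative: the Chunk dataclass and parse_chunk are hypothetical names, only a subset of fields is modeled, and it assumes timestamps arrive in the "YYYY-MM-DD HH:MM:SS.mmm" form shown in the example response and that id, dataset_id, created_at, updated_at, and weight are always present.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Optional

# Assumed timestamp layout, taken from the example response.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


@dataclass
class Chunk:
    id: str
    dataset_id: str
    created_at: datetime
    updated_at: datetime
    weight: float
    chunk_html: Optional[str] = None
    link: Optional[str] = None
    tracking_id: Optional[str] = None
    metadata: dict[str, Any] = field(default_factory=dict)


def parse_chunk(raw: dict[str, Any]) -> Chunk:
    """Convert one entry of the response's "chunks" array into a Chunk."""
    return Chunk(
        id=raw["id"],
        dataset_id=raw["dataset_id"],
        created_at=datetime.strptime(raw["created_at"], TIMESTAMP_FORMAT),
        updated_at=datetime.strptime(raw["updated_at"], TIMESTAMP_FORMAT),
        weight=float(raw["weight"]),
        chunk_html=raw.get("chunk_html"),
        link=raw.get("link"),
        tracking_id=raw.get("tracking_id"),
        metadata=raw.get("metadata") or {},
    )
```

Optional fields fall back to None (or an empty dict for metadata), so a chunk stored without a link or tracking id still parses.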