Creating Query Request
Please read general information about Query Api first.
In this page you will learn about all the objects that you can use to constract a query request. These objects can be used to generate simple or very complex queries based on your use-case. So not all of them are required in every request. Here is a basic example from Quick Start page;
curl --request POST 'https://api.eventbunker.io/v1/query/json' \
--header 'x-eventbunker-query-key: <YOUR-QUERY-API-KEY>' \
--header 'Content-Type: application/json' \
--data-raw '{
"time_period": "minus_5_minutes",
"group_by": ["example_string_field"],
"computations": [
{ "field_name": "example_number_field", "function": "sum" }
],
"filters": [
{ "field_name": "event_name", "op": "equals", "value": "my_first_event_name" },
{ "field_name": "customer_identifier", "op": "equals", "value": "your_app_customer_id" }
],
"order_by": [{ "field_name": "example_string_field", "direction": "ASC" }]
}'
If you know little bit about SQL, you can recognize this query string below as a transformed version of request above;
SELECT example_string_field, sum(example_number_field)
FROM my_data_table
WHERE event_name = 'my_first_event_name'
and customer_identifier = 'your_app_customer_id'
and event_time between now()-5m and now()
group by example_string_field
order by example_string_field ASC
The example above could be total number of pageview events by category for a user's activity in your application in last 5 minutes ordered by category name. You can extend this query using more query elements which are described below this page to many use-cases. Such as pageviews in last week group by day and category to learn which categories that this user was interested in every day most.
Query Endpoints
Json Data Endpoint;
POST https://api.eventbunker.io/v1/query/json
Embedded Chart Iframe Endpoint;
POST https://api.eventbunker.io/v1/query/iframe
Query Request Fields
Group By
group_by
is used get drill-down information in the result dataset. You can use any data field except date type fields to group computation results. To group by date, you can use interval
query element described below. You can combine limit
and order_by
with group_by
to control number and order of result dataset in case grouping field produces too many small groups that you do not need.
group_by
is an optional element in string array type inside query json. You can use minimum 1, maximum all the fields in the event data you ingested. If you want to reference any computations
field in the group by, then you must use it's alias
name in the array since this is a dynamic value and alias is the name of the field in that query context.
"group_by": ["example_field_name_A", "example_field_name_B"]
Order By
order_by
is used get result dataset in desired order by one or more fields. You can use any data field which is referenced in that query to order results. Remember, if a field does not exist in resultset, this field can not be used in order_by
as well. You can order by fields that are referenced in group_by
and computations
in ascending(ASC) or descending(DESC) order. Order of the field names in the array determines the priority of ordering.
order_by
is an optional element in object array type inside query json. Each object has field_name
and direction
as shown in the example. Similar to group_by
usage, if you want to reference any computations
field in the order by, then you must use it's alias
name in the array since this is a dynamic value and alias is the name of the field in that query context. order_by
is usually combined with limit
and group_by
shape the resultset.
"order_by": [
{ "field_name": "example_field_name_A", "direction": "ASC" },
{ "field_name": "example_field_name_B", "direction": "DESC" }
]
Limit
limit
is used to limit resultset object count. For example if you want to top 10 results from query, you can combine order by
and limit : 10
to that easily.
limit
is an optional element in number type inside query json.
"limit": 10
Interval
interval
is used to get result dataset divided in time buckets. You can use minute
, hour
, day
, week
, month
, quarter
, year
to get a result object calculated for each time bucket.
For example a query of pageview events in last 3 months for everyday can be achieved by passing "interval": "day"
and "time_period": "last_3_months"
. Since we could not use arbitrary groups bys in sql query, interval element is here to group by arbitrary time buckets.
interval
is an optional element in string type inside query json. If used, timezone
parameter value also applies to interval time.
"interval": "day"
Response json will contain eb_time_bucket
as time field, ISO string type, like example below;
{
"event_name": "page_view",
"eb_time_bucket": "2022-01-04T15:00:00+01:00",
"count": 123
}
Fill Zero
fill_zero
is used to get 0 values on numerical values in result dataset when interval
for that specific time bucket does not exist. Like example in interval
, if you do not have any pageview event in a specific day, then by using "fill_zero": true
in query json, you can still get every 90 days in result set for 3 months, with missing days filled for you automatically as 0 values. This is a convenience when you use result dataset to create charts as missing data points can be problem or looks bad.
fill_zero
is an optional element in bool type inside query json. Default value is True. It can only be used if interval
is used in a query.
"fill_zero": true
Timezone
timezone
is used to include time zone information in time_period
filtering. All data is kept in database in UTC format, if you write data into EventBunker with timezone information present in your event's timestamp
then you can use timezone
element to set time zone information during query time to get consistent results. Otherwise you could be writing data in UTC+3 in your event tracking and querying UTC date, thus you will get inconsistent result. You can set timezone
information different per query request, allowing you to accommodate all of your users whether they are in US, EU or Asia.
timezone
is an optional element in string type inside query json. Valid values are the standards defined in IANA timezone database and widely used in all programming languages. You can see full list here in wikipedia.
"timezone": "Europe/Amsterdam"
Time Period
time_period
is used to filter events by time. It has two options, either a start and end date combination or a pre-defined formatted relative date string.
If you use start and end date filtering feature, you must pass 2 date strings [start_date, end_date]
in ISO-8601 format, similar to but not limited to examples below. Please check links for Javascript and Python examples. If the date format is wrong, you will get an error from api. While filtering timestamp, start_date
is inclusive, end_date
value is exclusive, so it is until that time but not equal to end date.
"time_period": [start_date, end_date]
"time_period": ["2021-12-25", "2021-12-26"]
// OR
"time_period": ["2021-12-25T21:48:00", "2021-12-26T21:48:00"]
// OR
"time_period": ["2021-12-25T21:48:00+04:00", "2021-12-26T21:48:00+04:00"]
Second options is passing relative date string. You can use any numerical value combined with time frames which are listed here: second(s)
, minute(s)
, hour(s)
, day(s),
week(s)
, month(s)
, quarter(s)
, year(s)
. You can use both singular (minute
) or plural (minutes
) versions, api accepts both interchangeably for convenience.
The difference between minus
, current
and last
is that;
minus means exactly current timestamps in seconds minus given amount of time period, so if current time is
2021-12-25T21:48:13
then"minus_30_minutes"
will select events between2021-12-25T21:18:13
and current time.current means the time frame that begins from truncated current timestamp minus given time period until now. For example
current_1_day
means that today from00:00:00
tonow
.last means the time frame that completed periods of begin and end date. Now is not included in here. For example if now is
14:39:00
thenlast_1_hour
means period from13:00:00
to14:00:00
.
As you can see that current [hour or day or month] is not included when last
is used, but it is included when current
is used. Refer the detailed example timestamp values table below for comparison.
Usage examples;
"time_period": "minus_3_days"
// OR
"time_period": "minus_30_seconds"
// OR
"time_period": "current_1_hour"
// OR
"time_period": "current_1_quarters"
// OR
"time_period": "last_7_days"
// OR
"time_period": "last_3_weeks"
Relative Timestamp Examples
Let's assume current time is 2021-12-25T21:18:13
and see minus
, current
and last
usage examples' actual start and end dates which will be used in querying events by api.
Example | Start Timestamp | End Timestamp (excluding) |
---|---|---|
minus_2_hours | 2021-12-25T19:18:13 | 2021-12-25T21:18:13 |
current_2_hours | 2021-12-25T20:00:00 | 2021-12-25T21:18:13 |
last_2_hours | 2021-12-25T19:00:00 | 2021-12-25T21:00:00 |
minus_1_month | 2021-11-25T21:18:13 | 2021-12-25T21:18:13 |
current_1_month | 2021-12-01T00:00:00 | 2021-12-25T21:18:13 |
last_1_month | 2021-11-01T00:00:00 | 2021-12-01T00:00:00 |
Computations
computations
is main aggregation element in the query. You can calculate simple sum, average, min, max or more complex computations using any field in the event data.
computations
is an optional element in object array type inside query json. If nothing provided you will get count
of rows by default in query response. Each computation consists of field_name
, alias
and function
fields, although alias
is optional, highly recommended to use alias to get predictable field names in query result json. Also if you use a computation in group_by
or order_by
then you have to pass alias
's value into those elements as field_name
. If alias
is not provided then by default computed columns name becomes "function_field_name", which can cause error if same name is an actual field which also used in query too.
"computations": [
{ "field_name": "price", "function": "avg" },
{ "field_name": "order_item", "alias": "total_num_items", "function": "sum" }
]
# then order by can be like
"order_by": [ { "field_name": "total_num_items", "direction": "DESC" }]
Available Computation Functions List
Function | Applicable Data Type | Description |
---|---|---|
count | All | event count |
count_unique | All | distinct count by an event field. Similar to count (distinct customer_id) in normal sql. |
sum | Numerical | Total value in aggregated event field. |
min | Numerical, String | Minimum value in aggregated event field. |
max | Numerical, String | Maximum value in aggregated event field. |
avg | Numerical | Average value in aggregated event field. |
any | All | Used to select first encountered value in aggregated event field. Useful for some group by cases. |
p_50 | Numerical | Median, quantile, 50 percentile of the event field. |
p_75 | Numerical | 75 percentile of the event field. |
p_90 | Numerical | 90 percentile of the event field. |
p_95 | Numerical | 95 percentile of the event field. |
p_99 | Numerical | 99 percentile of the event field. |
Filters
filters
is used to include selection criterias in the query. For event time filtering you have to use time_period
and for everything else you can add any field in event data as a filter to narrow down event's data.
filters
is an optional element in object array type inside query json. Each filter consists of field_name
, op
and value
fields, you can find supported operators in the table below.
Example usage;
"filters": [
{ "field_name": "event_name", "op": "equals", "value": "page_view" },
{ "field_name": "page", "op": "not_equals", "value": "home" }
]
Available Filters List
Filter | Applicable Data Type | Description |
---|---|---|
equals | All | Equal, (sql = ) |
not_equals | All | Not Equal, (sql != ) |
greater_than | Numerical, String | Value Greater Than, (sql > ) |
greater_than_or_equal | Numerical, String | Value Greater Than or Equal, (sql >= ) |
less_than | Numerical, String | Value Less Than, (sql < ) |
less_than_or_equal | Numerical, String | Value Less Than or Equal, (sql <= ) |
exists | All | Field exists in event data, (sql is not null ) |
not_exists | All | Field does not exist in event data, (sql is null ) |
contains | String | Value contains string part, (sql like % ) |
not_contains | String | Value does not contain string part, (sql not like % ) |
has | Array | Array has this value in it, example ["blue","yellow"] has "blue" |