Query Syntax¶
This page describes Verdict’s query syntax. In general, Verdict follows the syntax of the standard SQL. There are some temporary limitations (highlighted using blue boxes below), which will be gradually lifted in future versions.
Query¶
The Verdict query must be an aggregate query with optional groupby, orderby, and limit clauses.
query := SELECT aggregate_function, ...
FROM relation
[GROUP BY base_attr, ...]
[ORDER BY alias, ...]
[LIMIT int]
alias := str
base_attr := str
Example:
SELECT sum(price) as price_sum, count(*) as c
FROM hive.tpch_sf100.lineitem_premerged
GROUP BY l_linestatus
ORDER BY price_sum
LIMIT 5
Note
If the groupby clause is present, the grouping columns will be prepended in the result set. In the future, this behavior will be changed to follow the standard SQL semantics.
Relation¶
A relation can be a base table, joins of relations, or subqueries.
relation := base_table |
relation join_expr |
(SELECT attr_alias, ...
FROM relation
[WHERE predicate]
[GROUP BY attr, ...]) alias
join_expr := join_type relation ON base_attr = base_attr
join_type := INNER JOIN |
LEFT JOIN |
RIGHT JOIN |
OUTER JOIN
attr_alias := attr [AS] alias
Examples of the relation:
-- example 1
(
select
l_returnflag,
l_quantity,
l_extendedprice,
l_discount,
l_extendedprice,
l_extendedprice * (1 - l_discount) disc_price,
l_extendedprice * (1 - l_discount) * (1 + l_tax) charge,
l_returnflag,
l_linestatus
from
hive.tpch_tiny.lineitem_premerged
where
l_shipdate <= date '1998-12-01'
) t1
-- example 2
(
select
l_orderkey,
l_extendedprice * (1 - l_discount) revenue,
o_orderdate,
o_shippriority
from
hive.tpch_tiny.lineitem_premerged l
inner join hive.tpch_tiny.orders_premerged o
on l_orderkey = o_orderkey
where
c_mktsegment = 'BUILDING'
and o_orderdate < date '1995-03-22'
and l_shipdate > date '1995-03-22'
) t1
Note that in the above example, t1 is the alias of the subquery relation. If the alias
is omitted the same name is assigned for base attributes and an arbitrary name is assigned for
derived attributes (e.g., l_extendedprice * (1 - l_discount)).
Note
The join type must be equijoin (whether it be inner, left, or right). The attribute that appears on the left-hand side of the equality sign is assumed to the attribute in the left join table. The similar rule applies for the right attribute.
Attribute¶
An attribute can be a base attribute or some functions of it.
attr := base_attr |
constant |
scalar_function |
aggregate_function
constant := int |
str |
date '0000-00-00' |
timestamp '0000-00-00 00:00:00'
predicate := logical_expr |
comparative_expr
logical_expr := predicate AND predicate |
predicate OR predicate |
NOT predicate
comparative_expr := attr > attr |
attr < attr |
attr >= attr |
attr <= attr |
attr <> attr |
attr in [ constant, ... ]
We describe more details about functions in the subsequent sections.
Scalar Functions¶
A scalar function is the function that produces an output value for each input value.
scalar_function := math_function |
string_function
Mathematical Functions¶
math_function := attr + attr |
attr - attr |
attr * attr |
attr / attr |
floor(attr) |
ceil(attr) |
round(attr)
String Functions¶
string_function := SUBSTR(attr, start, length) |
TO_STRING(attr) |
CAST(attr AS VARCHAR) |
CONCAT(attr, attr) |
LENGTH(attr) |
REPLACE(old, new) |
UPPER(attr) |
LOWER(attr) |
STARTSWITH(attr, pattern) |
CONTAINS(attr, pattern) |
ENDSWITH(attr, pattern)
Note
We are adding more scalar functions.
Aggregate Functions¶
An aggregate function is a function that produces a single row given multiple rows.
aggregate_function := COUNT(*) |
SUM(base_attr) |
AVG(base_attr)
Note
To use a derived attribute within aggregate functions, you can first create new attributes using subqueries, then attribute those new attributes.