PSL Reference

PSL can be used to define entities on two abstraction layers:

  • Relational (see: Physical Mapping) - defines schema tables that can be used in SQL queries. The relational model can span multiple data sources and provides uniform SQL access to your entire data landscape.

  • Multi-dimensional (see: Logical Model) - defines dimensions and measures on top of schema tables. It allows you to run aggregate and record type queries.

A simple example:

schema MySchema(dataSource = "Test") {
measure unitSales(column = "sales.unit_sales")
dimension Customer(table = "customer") {
attribute city(column = "city")
}
table sales(physicalTable = "SALES_FACT_1998") {
column customer_id(expression = "CUSTOMER_ID", tableRef = "customer.customer_id")
column unit_sales(expression = "UNIT_SALES")
}
table customer(physicalTable = "CUSTOMER") {
column customer_id(expression = "CUSTOMER_ID")
column city(expression = "CITY")
}
}

Now it is possible to query this schema for the measure unitSales aggregated by the dimension attribute Customer.city.

General Syntactic Structure

PSL definitions have the following structure:

<definition> := <type> <name> <parameters>? <body>?
<parameters> := '(' <param> (',' <param>)* ')'
<param> := <paramName> '=' <paramValue>
<body> := '{' <definition>* '}'

Schema

Top level element defining the complete data model.

Syntax

'schema' <name> <parameters>? <body>?

Parameters

Name

Description

dataSource

Default data source for this schema. Data sources are referred to by name inside the project containing this schema.

Body

Contains the following definitions:

Examples

schema MySchema(dataSource = "pg_prod") {
dimension Time
}

Schema Import/Expose

The following entities can be imported/exposed: schemas, dimensions, measures and tables.

The Import definition imports names inside the current schema but does not expose them in the schema interface. Imported definitions can be used, for example, in conforms or other places where a reference to an existing definition is necessary.

Expose definition exposes the imported definitions in the schema interface.

Each imported name can have an alias. If an alias is not provided the entity will be imported with last name component in the qualified name (i.e. unqualified name).

Syntax

'import' IDENTIFIER
'expose' <entity_type> <qualifiedName> [ <alias> ]
'expose' <field_type>s (<wildcard>|<scope>) with (<prefix>|<namespace>)
Where:
<field_type> := 'dimension' | 'measure' | 'table'
<entity_type> := 'schema' | <field_type>
<wildcard> := qualifiedName.*
<scope> := qualifiedName.{<qualifiedName> (',' <qualifiedName>)*}
<pefix> := prefix IDENTIFIER
<namespace> := namespace <qualifiedName>
<qualifiedName> := IDENTIFIER ('.' IDENTIFIER)*
<alias> := as IDENTIFIER

Examples

/* 1. Importing */
import Sales // importing schema
import Inventory
schema Foodmart {
dimension Store {
conforms Sales.Store // using imported name
conforms Inventory.Store
}
measure sales {
conforms Sales.sales // using imported name
}
}
/* 2. Expose */
import Sales
schema Foodmart {
expose schema Sales // exposing schema
expose schema Inventory as Inv
expose measure Sales.storeSales
expose dimension Sales.Store_D as Store
expose tables Sales.*
expose measures Sales.{m1, m2}
expose dimensions Sales.* with prefix Foo_
expose dimensions Sales.* with namespace Cusomer.Public
expose dimensions Sales.{d1, d2} with namespace Cusomer.Public
expose dimensions Sales.{d1, d2} with prefix Foo_
/*
* Exposed names: Store, Inv, storeSales, d1, d2, m1, m2, all tables and dimensions from Sales schema, etc.,
* are accessible directly in queries to the Foodmart schema or when importing/exporting Foodmart schema itself.
*/
dimension ConformingStore {
conforms Store // exported names can be used inside this schema as well
conforms InvStore
}
}

Dimension

Defines a set of related attributes which can be queried. Attributes can be further grouped in a dimension by hierarchies and levels.

A dimension defines a namespace for the set of elements it contains. The dimension name is used to prefix attribute names when querying. For example:

dimension Customer {
attribute name
}

Provides a dimension attribute called Customer.name that can be used for querying.

Syntax

'dimension' <name> <parameters>? <body>?

Parameters

Name

Description

table

Default table for this dimension.

Body

Contains the following definitions:

Examples

/*
* 1. Two different hierarchies
* First hierarchy here is the default.
* Levels have the following qualified names:
* Customer.ageGroup, Customer.age, Customer.customer,
* Customer.Geo.country, Customer.Geo.city, Customer.Geo.customer
*/
dimension Customer {
hierarchy {
level ageGroup
level age
level customer
}
hierarchy Geo {
level country
level city
level customer
}
}
/*
* 2. Plain dimension structure
* In this case there are no hierarchical relationships inside the dimension
* Attributes have the following qualified names:
* Person.id, Person.firstName, Person.lastName
*/
dimension Person {
attribute id
attribute firstName
attribute lastName
}

Attribute

Defines data attribute that can be queried using Contiamo.

Syntax

'attribute' <name> <parameters>? <body>?

Parameters

Name

Description

table

table for this attribute.

column

Reference to the column for this attribute. Can be either in form of table.column or just column. In the latter case the default table is used to locate referenced column. If the column parameter is not provided, the attribute name is used instead.

columns

Array of equivalent references to columns for this attribute. Each element in the array corresponds to the rules described for the column definition above.

expression

castType

Specifies SQL type to which the output value for this level is converted.

Body

Contains the following definitions:

Examples

attribute Email(column = "user_email")

Hierarchy (optional)

Defines an optionally named group of levels. The order of levels inside a hierarchy defines hierarchical relationships from general to specific. Levels defined before are more broad than levels defined after. Hierarchies are used to group levels and attributes for organizational purposes. They have no impact on query execution.

Syntax

'hierarchy' <name>? <parameters>? <body>

Note that the name is optional. In such a case, this is the default hierarchy inside a dimension. Only one hierarchy inside a dimension is allowed to have an empty name.

Parameters

Name

Description

table

Default table for this hierarchy.

Body

Contains the following definitions:

Examples

hierarchy {
level Country
level Region
level City
}

Level (optional)

Defines groups of attributes. Levels as well as attributes represent specific data values. Unlike dimensions and hierarchies that cannot be queried directly and provide models, levels and attributes can be used in Contiamo queries.

In terms of modeling, the main difference between levels and attributes is that level is used to model hierarchies, while attribute does not participate in hierarchical relationships.

Syntax

'level' <name> <parameters>? <body>?

Parameters

Name

Description

table

Table for this level.

column

Reference to the column for this level. Can be either in form of table.column or just column. In the latter case the default table is used to locate referenced column. If column parameter is not provided, level name is used instead.

columns

Array of equivalent references to columns for this level. Each element in the array corresponds to the rules described for the column definition above.

expression

castType

Specifies SQL type to which the output value for this level is converted.

Body

Contains the following definitions:

Examples

level Country(columns = ["country_id", "stats.country_id"]) {
attribute name
attribute isoCountryCode
}

Measure

Measure is an aggregatable piece of data.

Measures can be defined in 3 different flavors:

  • aggregated measure - measure that optionally applies filtering to dimensions and then aggregates underlying columns directly

  • filtered measure - measure that references other measures and applies filtering on dimensions afterwards

  • calculated measure - measure that implements some arithmetic expression on top of other measures

Syntax

'measure' <name> <parameters>? <body>?

Parameters

Name

Description

aggregate

Aggregate function to use for this measure. Supported functions are: 'sum', 'avg', 'count', 'distinctCount', 'approxDistinctCount', 'max', 'min'. By default set to 'sum'.

column

Reference to the column for this measure in form of table.column.

columns

Array of equivalent references to columns for this measure. Each element in the array corresponds to the rules described for column definition above.

measure

Reference to other measure to base this value on. Used in filtered measures.

filter

Filter to be applied when calculating this measure.

calculation

Arithmetic expression on other measures. Used in calculated measures.

Use the following rules to determine measure type:

  • If neither measure nor calculation are defined = aggregated measure. Aggregation function determined either from aggregate parameter or "sum" by default.

  • If measure is defined = filtered measure. In this case neither calculation nor conforms should be defined.

  • If calculation = calculated measure. In this case aggregate, measure, filter and conforms are prohibited.

Body

Can contain the following definitions:

Examples

// aggregated measure
measure unitSales(aggregate="sum", column = "stats.unit_sales")
// filtered measure
measure unitSalesx(measure = "unitSales", filter = "Customer.cars < 3")
// aggregated measure with filter
measure numChildren(filter = "Customer.cars < 3")
// calculated measure
measure calc (calculation = "storeSales + 100000 / (storeCost + storeSales) + 1")

Table

A schema table can either be a direct mapping to a physical table, or a transformation combining multiple physical tables (which can reside in different data sources). Schema tables can be used in SQL queries and referred to in dimension and measure definitions.

By default the schema table name is the same as the physical table name. To combine multiple physical tables, or perform other transformations at query time, a standard SQL query can be defined with the sql parameter. Table names in that query must be namespaced by the name of the data source.

Syntax

'table' <name> <parameters>? <body>?

Parameters

Name

Description

dataSource

The name of the data source for this table. Can reference data sources from this schema's project.

physicalTable

The name of underlying physical table for this schema table definition. Schema table name is used by default if this parameter is not provided.

sql

Defines a SQL statement. Cannot be used with physicalTable. Supports standard SQL.

Body

Can contain the following definitions:

Examples

// 1. Table with body. Physical table name is also "sales_fact_1998"
table sales_fact_1998 {
column customer_id (tableRef = "customer.customer_id")
column unit_sales
}
// 2. Using SQL to join sales to customer profiles which reside in different data sources
table sales_with_time(sql = """select * from "foodmart"."sales_fact_1998" sf8 join "customer_warehouse"."customer_profile" profile on (sf8.customer_id = profile.cust_id)""") {
columns *
}

Column

Column defines an elementary piece of data inside a schema table. This definition is also used to link the physical data structure with logical elements such as dimensions and measures, as well as to define relations between schema tables.

Syntax

'column' <name> <parameters>?

Parameters

Name

Description

expression

SQL expression for this column. Expressions must be written in standard SQL.

tableRef

Links to other schema columns. Schema columns shall be addressed using <table>.<column> notation. This attribute defines joining relations between tables.

tableRefs

Array of links to other schema columns. Each element of this array contains a definition conforming to the rules for the tableRef parameter defined above.

Examples

table sales_fact_1998 {
// Provides link to another table/column which can be used to join target table when necessary
column customer_id (tableRef = "customer.customer_id")
// Column references multiple tables
column stat_id (tableRefs = ["inventoryStats.stat_id", "salesStats.stat_id"])
column unit_sales
}
table calendar {
// An example of using expression
column day_of_week(expression = """extract(day from "the_date")""")
}

Columns

This is a shorthand for being able to easily expose all columns in the physical table.

Syntax

'columns' *

Parameters

N/A

Examples

table sales_fact_1998 {
// make all columns in the physical table available
columns *
}
table customer {
columns *
// overrides the definion of "customer_id" provided by "columns *"
column customer_id (tableRef = "customer.customer_id")
// adds an additional column based on an expression
column name(expression="fname || lname")
}

Filter

Filter expressions have their own SQL-like syntax. A schema can have at most one filter.

Syntax

'filter' '"'<expression> (<boolOp> <expression>)*'"'
<expression> := <dimAttr> <operator> <value>
<boolOp> := 'and' | 'or'

Expression

Name

Description

dimAttr

refers to a dimension attribute to filter by

operator

one of: =, !=, <, >, <=, >=, in, notIn, like

value

A dimension value (or values) to match

Examples

filter "Date.Date > '2017-01-01'"
filter """
Store.region.city in ('Tacoma', 'Seattle') and Store.store.type = 'Supermarket'
"""

Conforms

Used to identify that multiple measures, dimensions, levels, or attributes are the same (conforming) across tables and data sources. The conforms relation is mono-directional, i.e., if some element defined as conforming to another element (conforming element) is queried, the result returned can only be found by the means of the element to which it conforms (conform element). To rephrase, when querying for the conforming element, the result returned is retrieved via the conform element.

Syntax

'conforms' <qualifiedName>

Examples

import Sales.Store
import Inventory.{Store => InvStore}
import Sales.sales
import Inventory.{warehouseSales => invSales}
schema Foodmart {
dimension ConformingStore {
// defines conformance to 2 dimensions from other schemas
// ConformingStore dimension will have intersection of hierarchies of dimension it conforms to
conforms Store
conforms InvStore
}
measure totalSales {
// defines conformance to measure from other schema
conforms sales
}
measure warehouseSales {
// defines conformance to measure from other schema
conforms invSales
}
}
schema Foodmart {
import {GoogleAnalytics => ga, FacebookStats => fb}
// In this case we explicitly define a set of attributes for a dimension.
// If any defined attribute exists in one of the conformed dimensions, this will define conformance of attributes.
dimension Date {
conforms ga.Time
conforms fb.Time
attribute year
attribute day
attribute weekday
attribute date
}
}

Metadata

Defines metadata for different types of elements. Metadata is not directly processed by Contiamo but stored for reference and for use in other tools. Metadata can be introspected to extract additional configuration or knowledge about the schema.

Syntax

'metadata' <parameters>
Where:
<parameters> := '(' <param> (',' <param>)* ')'
<param> := <name> '=' (<simpleValue> | <arrayValue>)
<arrayValue> := '[' <simpleValue> (',' <simpleValue>)* ']'
<simpleValue> := BOOL | NUMBER | STRING

Parameter names here are arbitrary. They can be interpreted by third-party tools.

Examples

schema Foodmart {
dimension D {
metadata(d1 = ["d1", "dd1"], d2 = "d2")
attribute A {
metadata(a1 = ["a1", "aa1"], a2 = "a2")
}
}
measure M {
metadata(m1 = ["m1", "mm1"], m2 = "m2")
}
}