Dataverse Elastic Tables Deep Dive Tutorial

Dynamics Chronicles

958

Like ( 0 )

Report

Dataverse Elastic Tables Deep Dive Tutorial joao.neto Thu, 06/27/2024 - 14:46

Body

Dataverse Elastic Tables Deep Dive Tutorial

Intro

Said to be the "Super Heros" of the information world, that they're super fast and can store lots of information and that they're a game changer for storing massive loads of data.

I went to undercover what's behind the scenes of this Elastic Tables feature and check if the rumours are fake or not!

Principles and Architecture

Microsoft use Cosmos DB instead of azure SQL to store data, meaning they have a special column called Partition ID that organizes information in a "folders" like structure.

Dataverse Elastic Tables Deep Dive Tutorial

Partitioning and Horizontal Scaling:

Elastic tables use Azure Cosmos DB partitioning to scale individual tables
Each elastic table contains a system-defined Partition Id column
Rows are divided into distinct subsets (logical partitions) based on the value of the partitionid column, like storing rows per in folder by its partitionId
Choosing a consistent partitioning strategy is crucial for optimal performance

Unique References and Partitioning:

Unique references for elastic tables combine the primary key and the partitionid value
There is no limit on the number of logical partitions
If you don’t set a partitionid value, it remains null
Custom partitionid values allow multiple records with the same primary key but different partition IDs
PartitionId cannot be changed after creating a row

Consistency level

Elastic table supports strong consistency during a logical session. A logical session is a connection between a client and Dataverse. When a client performs a write operation on an elastic table, it receives a session token that uniquely identifies the logical session. To have strong consistency, you must maintain the logical session context by including the session token with all subsequent requests.

Session tokens ensure that all the read operations that are performed during the same logical session context return the most recent write that was made during that logical session. In other words, session tokens ensure that reads always honor the read-your-writes and write-follows-reads guarantees during a logical session. If a different logical session performs a write operation, other logical sessions might not immediately detect those changes.

Transactional behavior

For example, you have a synchronous plug-in step that is registered on the PostOperation stage of the Create message on an elastic table. In this case, an error that occurs in the plug-in does not roll back the record that is created in Dataverse. You should always avoid intentionally canceling any operation by throwing InvalidPluginExecutionException in the PreOperation or PostOperation synchronous stage. If the error is thrown after the Main operation, the request returns an error, but the data operation succeeds. Any write operations that are started in the PreOperation stage succeed.

However, you should always apply validation rules in a plug-in that is registered for the PreValidation synchronous stage. Validation is the purpose of this stage. Even when you use elastic tables, the request returns an error, and the data operation won't begin.

Expire data by using Time to live

Dataverse automatically includes a Time to live integer column with elastic tables. This column has the schema name TTLInSeconds and the logical name ttlinseconds.

When a value is set in this column, it defines the amount of time, in seconds, before the row expires and is automatically deleted from database. If no value is set, the record persists indefinitely, just like standard tables.

Flexible schema with JSON columns

Elastic tables enable you to store and query data with varying structures, without the need for predefined schemas or migrations. There's no need to write custom code to map the imported data into a fixed schema. More information: Developer guide: Query JSON columns in elastic tables Elastic tables enable you to store and query data with varying structures, without the need for predefined schemas or migrations. There's no need to write custom code to map the imported data into a fixed schema. More information: Developer guide: Query JSON columns in elastic tables.

Limitations

Table features currently not supported with elastic tables:

Business rules
Charts
Business process flows
One Dataverse connector for Power BI
Many-to-many (N:N) relationships to standard tables
Alternate key
Duplicate detection
Calculated and rollup columns
Currency columns
Column comparison in queries
Table sharing
Composite indexes
Cascade operations: Delete, Reparent, Assign, Share, Unshare
Ordering on lookup columns
Aggregate queries:
- Distinct value of attribute1 while orderby on attribute2 value
- Pagination when having multiple distincts
- Distinct with multiple order by
- Order by and group by together
- Group by on link entity (left outer join)
- Distinct on user owned tables 
Table connections
Access teams
Queues
Attachment

Currently, you can't add the following types of columns:

Money (MoneyAttributeMetadata)
MultiSelectPicklist (MultiSelectPicklistAttributeMetadata)
State (StateAttributeMetadata)
Status (StatusAttributeMetadata)
Image (ImageAttributeMetadata)
Calculated, Rollup, or Formula Columns
Currency
Lookup based on the Customer option

Relationships with elastic tables:

Doesn't support creating many-to-many relationships
One-to-many relationships are supported for elastic tables with the following limitations:
- Cascading isn't supported. Cascading behavior must be set to Cascade.None when the relationship is created
- Formatted values for lookup columns aren't returned when the following conditions are true
- The table that is retrieved is a standard table, and the lookup refers to an elastic table
You're using a custom elastic table partitionid value. In other words, the partitionid value is set to something other than the default value (null).
Elastic tables support one-to-many relationships, and related rows can be retrieved when a record is retrieved. Related records can't be included in a query

Also:

Duplicate detection or adding activities aren't supported
Return related rows in a query
Elastic tables don't currently support returning related rows when a query is run. If you try to return related rows
Deep insert is not supported with elastic tables. Each related record needs to be created independently
Elastic tables don't support multi-record transactions
Elastic tables also don't support grouping requests in a single database transaction that uses the SDK ExecuteTransactionRequest class or in a Web API $batch operation changeset. Currently, these operations succeed but aren't atomic

Hands-on

After the theoretical side of the elastic tables let's demystify them by putting our hands-on creating and playing with some of the standard features.

Creating an elastic table

On the Power Apps maker portal inside your solution create a new table and then select the type elastic.

Dataverse Elastic Tables Deep Dive Tutorial

In the table properties as shown in the image below we can check that some of the default table features aren't available for the elastic tables (creating activities, duplicate detection rules, connection, SharePoint document management, ...)

Dataverse Elastic Tables Deep Dive Tutorial

Exploring the elastic table Columns

Creating a new column on the elastic table has pretty much the same look & feel except for the missing options in the data type choice list (Currency, Formula, Customer Lookup).

A new option arises with the elastic tables within the data type single line of text we can now chose a new format, the JSON.

Dataverse Elastic Tables Deep Dive Tutorial

We now have our table created and let's see the aspect of some of the default columns present in it with our JSON Column in the image below.

Two new columns are present:

Partition ID: as explained above it defines the logical compartiment to store the record, it cannot be changed after the record creation
Time to Live: measured in seconds and determine the time the record will remain in the table

Elastic Table View

Considerations around the partitionid column that should meet these criteria:

The values in it don't change. After a row is created that has a partitionid value, you can't change it
It has a high cardinality value. In other words, the property should have a wide range of possible values. Each logical partition can store 20 gigabytes (GB) of data. Therefore, by choosing a partitionid value that has a wide range of possible values, you ensure that the table can scale without reaching limits for any specific logical partition
It spreads data as evenly as possible among all logical partitions
No values are larger than 1,024 bytes
No values contain slashes (/), angle brackets (), asterisks (*), percent signs (%), ampersands (&), colons (:), backslashes (\), question marks (?), or plus signs (+). These characters aren't supported for alternate keys

If a partitionid value isn't specified for a row, Dataverse uses the primary key value as the default partitionid value. For write-heavy tables of any size, or for cases where rows are mostly retrieved by using the primary key, the primary key is a great choice for the partitionid column.

When querying the JSON column:

Using the the default filter in the table view only uses standard text filtering options so here we will only be able to check for the presence of strings inside the JSON but no more than this.

Dataverse Elastic Tables Deep Dive Tutorial

However great news comes with You can use the ExecuteCosmosSQLQuery message that enables to run any Cosmos SQL query directly against your elastic table and filter rows based on properties inside JSON.

So let's try it in our created table and column !

Using the WebAPI we create the following http request, also available using the Dataverse SDK for .NET:

/api/data/v9.2/ExecuteCosmosSqlQuery(
    QueryText=@p1,
    EntityLogicalName=@p2,
    QueryParameters=@p3)?
    @p1='select c.props.sc_name as name, c.props.sc_jsoncolumn as json, c.props.sc_jsoncolumn.age as age 
    from c where c.props.sc_jsoncolumn.car=@car and c.props.sc_jsoncolumn.age >= @age'
    &@p2='sc_elasticpoc'
    &@p3={'Keys':['@car', '@age'],'Values':[{'Type':'System.String','Value':'Mustang'}
    , {'Type':'System.Int32','Value': '20'}]}

where "sc_elasticpoc" is our table, the "sc_jsoncolumn" is our JSON column and c.props is the prefix on the schema name of the columns. In this prefix, c is an alias or shorthand notation for the elastic table that is being queried.

And the result is:

{
 "@odata.context":"https://orgf9a090a4.crm17.dynamics.com/api/data/v9.2/$metadata#expando/$entity"
 ,"@odata.type":"#Microsoft.Dynamics.CRM.expando","PagingCookie":"","HasMore":false,"Result@odata.type":"#Collection(Microsoft.Dynamics.CRM.expando)"

,"Result":
 [{"@odata.type":"#Microsoft.Dynamics.CRM.expando","name":"elastic 2","age":20
,"json": {"@odata.type":"#Microsoft.Dynamics.CRM.expando","name":"Paul","age":20,"car":"Mustang"}}]
}

Final considerations between Elastic and Standard tables:

Performance:

As some announcements might let us think that elastic tables offer super performance over standard tables there are some studies that shows that's not quite the reality like in the Piotr Gaszewski blog (https://piotrgaszewski.hashnode.dev/dataverse-elastic-tables) that states:

Standard Create, Update, Delete operations: takes the same or a little bit more time to execute in elastic tables than in standard ones
Multiple Create, Update, Delete operations: takes significantly less time to execute in elastic tables than in standard ones (1/2 to 1/10)
Retrieve data operations: takes 2 to 3 times longer in elastic tables than in standard ones

So elastic tables offers significantly less performance execution times when using single atomic operation, however when working with multiple records on the same request it offers some advantages.

Service API Limits:

As per MS, Elastic Tables are still exposed to the Dataverse API throttling limits and bulk operation messages should be used to increase the throughput for high volume operations.

Is security model user compatible:

MS says yes that Elastic tables adhere to the Dataverse security model using

an User or Organisation level ownership
field level security

And yes audit is also available for elastic tables!

Conclusion:

Elastic tables enlarges the table types offer in dataverse, but it's to take with caution in my opinion because these pretty tables aren't the way to go to standard development in a relation database context, since table lookups aren't available ootb and since many other table features aren't also available which reduces a lot the scope of action of these tables.

Regarding performance finally they're also not the way to go if you search for increasing CRUD performances, except using multiple CRUD operations.

So my final take on the elastic tables would be for storing large datasets of data and mainly for IoT data that would managed by a service layer capable of using multiple CRUD operations and then consuming the data on a business intelligence scheme (as an example only).