Editorials

Use Partitioned Views to Separate Authorization for Data Access in Relational Databases

After reading my editorial about separating concerns of authorization and execution of application logic, Peter observes a similar need within a relational database. He writes:

I have a small custom made CMS application. In this I use stored procedures to which I pass the current user and in which I join web page content tables with user access right tables to return only the data the user is allowed to see/edit/… This might be a faillure regarding "seperation of concerns", but at the other hand … with this filter approach I prevent returning all content to the application and applying the filtering there based on an authorization API. If I do not filter in the stored procedure I cause network traffic of data that will not be used. In addition there might be a small risk that malware/network sniffers catch up that data. What do you suggest? Should I create a kind of TSQL API to check access rights inside stored procedures??

As I understand things, the data contained within your database can be filtered by the access rights of a user. If that filter data is in a separate table, you have made the necessary separation in a relational way. That seems really powerful.

In fact, you could easily filter that data using a view which joins the necessary data to filter the rows for a user. Then, you could request the data using an ORM tool, passing the user criteria as a filter.

Entity Framework might look like the following if you had a view that combined purchase orders with users called UserPurchaseOrders which is called in C# with a userId parameter:

MyRepository.Configuration.UserPurchaseOrders.Where(row=> row.UserId==userId)

The downside of this method is that each row now contains a UserId when retrieved from the database.

Another technique that is old school, and not often implemented anymore, is to use data partitioning. In order to implement this technique, you have a separate view for every unique partition. If you wish to filter all purchase orders for every user, you would create a schema for every user. In every schema, you create a view with the same name. Each user has it’s own instance of a view “PurchaseOrders”. The view is defined exactly the same for every user with the exception that there is a filter in the where clause for the specific user id.

Then when you write a query, SELECT * FROM PurchaseOrders, the values returned automatically filter based on the user currently logged into the database, and the view defiened specifically for them.

As you can see, this is VERY intensive way to partition the data. If you need to be user specific, it requires you create a separate schema for each user, and as many views as the tables you are partitioning. For this reason, partitioning is not often used in web applications, because we frequently implement our own filtering in the application, and have only one defined User for the entire application. In this kind of environment, the solution you have implemented is probably the most efficient to create and maintain.

If you have a less granular filter than users, you can still use partitioning. You still have multiple schemas and views. You simply have only as many as the data you wish to partition. In this case, if you had four divisions in your company, you could have five schemas and five views. Then the data is automatically filtered for each user based on the schema to which they are assigned. You would have one schema with views filtering for a single region. Then a final schema without a filter, returning all records regardless of region, able te report against the whole enterprise.

Since the views in all schemas have the same name, your query doesn’t know which one is used. The filter is applied in a transparent way, thus separating the concerns.

While I’m on the subject, you can create a view that does vertical partitioning as well. A common example of this is information concerning a person in a company. You can separate salary data, and make it only available to the Human Resources department by placing that data in a separate table with a one to one relationship. Then when someone needs to query the users for a department in the company they don’t see salary data.

Using vertical partitioning, you instead put all the data in a single table, and then create a different schema for HR than for the rest of the company. You create a view called Persons for the whole company, and do not select restricted columns in that view, such as salary. So, you could write a query “SELECT * FROM Persons” and you would not see the salary. Then you create a view for the HR schema, and include the Salary column. If you are a member of the HR schema, when you execute the same query, the database will instead use the HR schema, and return the Salary column in addition.

Vertical partitioning usually does not rely on implicit view resolution because the app will have errors if the necessary columns aren’t present, or have to have error handling for missing columns if the current user does not have adequate rights. So, the app would execute the query SELECT * FROM HR.Persons, explicitly specifying the need for the Persons view including protected columns.

Schemas are very powerful tools often not used today, or used in a very simplistic manner, because they require a fancy user interface for non-technical people to use them, or an Administrator to create and maintain this separation. Moreover, it is not always transportable from one data engine to another.

That’s enough on that topic for today. So, now you are aware that there are tools you can use built into your relational database engine, because all the major ones support schemas in a common way. Leave a comment if you’d like more info, or wish to share how you have used schemas in your systems.

Cheers,

Ben