p411 Rafique PDF
Multi-Tenant SaaS
quirements of multiple tenants simultaneously.

The remainder of this paper is organized as follows: Section 2 provides the necessary background information and introduces a motivating example for the paper. Section 3 presents our analysis of the main requirements, while Section 4 outlines our ideas on addressing these requirements. Section 5 connects and contrasts our work with other related research in this area. Finally, Section 6 concludes the paper and indicates directions for future work.

2. BACKGROUND

In the context of a multi-tenant SaaS application, different tenants commonly impose different requirements on the application¹, and these requirements also affect the data storage tier of the application. There is therefore a need to customize the multi-tenant SaaS application at run time to meet these different non-functional requirements.

¹ Usually expressed in Service Level Agreements (SLAs).

This section first outlines and discusses a number of current and timely trends in the context of multi-tenant Software-as-a-Service (SaaS) applications. The research goals presented here are strongly motivated by these trends. Section 2.1 introduces the trend of offering storage as a service. Secondly, the data tier of multi-tenant SaaS applications is increasingly being structured as a federated storage architecture, which we discuss in Section 2.2. Then, Section 2.3 introduces the motivating example for the paper.

2.1 Cloud Storage

Cloud providers offer online and on-demand services that can elastically scale up (or down) dynamically as demand increases (or decreases). Cloud data storage is one of the prominent services of cloud providers, which predominantly allows data owners to store their data in the cloud.

From the point of view of the SaaS application provider, however, the selection of a single cloud storage provider in practice is a difficult decision: (i) application providers face a lack of trust in the cloud storage provider and are reluctant to share sensitive application data; (ii) they and their customer organizations (tenants) may require different service levels (guarantees as to data availability, performance, and responsiveness), and technologies for different types of data; and (iii) as market conditions change, they may opportunistically want to switch cloud storage providers but be confronted with a situation of provider lock-in.

2.2 Federated Storage Architecture

To address these concerns, SaaS providers are increasingly leveraging a combination of different cloud storage resources, technologies, and providers in a so-called federated storage architecture. A federated storage architecture combines different storage resources (private and cloud resources) and allows the data storage needs of a single application provider to be met by combining different cloud storage offerings. This, however, comes at the cost of increased complexity and maintenance, which is commonly addressed in the application-level middleware [2, 19, 14].

2.3 Multi-Tenant Log Management Case

As a running example for this paper, we present a multi-tenant Log Management-as-a-Service (LMaaS) application [14, 15]. This SaaS application provides log management facilities to its customer organizations (tenants), for example, banks, supermarkets, hospitals, telecom operators, etc. The application focuses on storing large amounts of heterogeneous data: raw log entries, archived logs, log metadata, historical logs, incident reports, and time series data, and is successful in doing so by using a federated storage architecture, and by applying multi-tenancy, i.e. sharing these storage resources maximally among tenants.

In the case of the LMaaS application, log aggregation components are installed at the tenants' side, which collect and stream log events to the LMaaS application. Figure 1 illustrates three such events, each sent by a different tenant organization, which are stored in a single Log database table.

ID  DeviceID  DeviceName       DeviceType  …  Tenant
1   401       BRI-Router-001   ciscortr    …  1
2   701       BRI-special-001  cisco-ace   …  2
3   301       CAN-PIX-FW-001   Pix7        …  3

Figure 1: Log table for storing event information.

The table holds a chunk of log data, identified by an ID attribute, which uniquely identifies each row in the Log table. The (DeviceID, DeviceName, DeviceType, and ...) attributes hold information about the device that generated the log event. The Tenant attribute refers to the tenant for which the event is generated.

However, as mentioned above, different tenants may have different data storage requirements. As an example from the log management application, we contrast three tenant organizations, a financial agency (i.e. a bank), a medical institution (i.e. a hospital), and an SME, that each impose different requirements when it comes to data confidentiality: clearly, stricter regulations on data confidentiality apply to the financial and medical institutions than to the SME.

To illustrate with the data presented in Figure 1: as the tenant with id 1 is a financial agency, even the meta-data about the device is considered highly sensitive. Similarly, as tenants with id 2 and id 3 both represent medical institutions, only a part of the data should be considered to be sensitive: for the tenant with id 2, only the (DeviceID and DeviceName) attributes hold sensitive information, whereas for the tenant with id 3, the (DeviceID, DeviceName, and DeviceType) attributes contain confidential information.

3. REQUIREMENTS

In this section, we introduce four key requirements for data encryption solutions in the specific context of multi-tenant SaaS applications, as introduced in the previous section.

R1. Encryption at Differing Levels of Granularity: From the perspective of data confidentiality, different tenants impose different, sometimes even contrasting confidentiality requirements, and this in turn may affect the level of granularity at which data encryption is
to be applied². As such, specific data encryption support is required for applying encryption at differing levels of granularity at run time for multiple tenants.

For example, as strict confidentiality requirements apply to the tenant with id 1, full entity-encryption is clearly the most suited option. Similarly, as tenants with ids 2 and 3 operate with relatively relaxed confidentiality requirements, instead of applying full entity-encryption, partial data encryption (i.e. only encrypting the sensitive information) is considered a more appropriate strategy.

² Note that differences in search requirements may also affect the level of granularity at which encryption is to be applied.

R2. Application Tier Encryption: In a federated storage architecture that may include different public cloud providers, storage requires sending the application data over uncontrolled communication channels such as the public Internet. The lack of control over, and of trust in, these third-party cloud storage providers forces these confidentiality requirements to be dealt with within the application tier itself.

R3. Generic, Transparent, and Reusable Solution: Dealing with the confidentiality requirements of different tenants introduces substantial application complexity. In addition, the ability to change these requirements at run time, for example to accommodate new tenant requirements or changes in the federated storage architecture, is a strong motivation to externalize this solution from the main application code.

More concretely, this requirement is the need for a generic and reusable solution, which (i) externalizes the encryption logic from the application, accomplishing a clear separation of concerns (i.e. being able to change the encryption logic without changing the application code); and (ii) provides tenant administrators and SaaS application operators with advanced configuration and management facilities, such as data storage policies, cryptographic key management infrastructure, and configuration dashboards to select different algorithms for data encryption (e.g., AES, RSA, etc.).

R4. Scalable Data Encryption: Encryption impacts application performance significantly [17], and scalability is a key concern, especially in terms of the number of tenants and storage nodes.

The next section outlines our ideas on how to address these requirements, more specifically by leveraging the data model flexibility features of columnar NoSQL databases.

4. DISCUSSION

There is an increasing research interest in solving these requirements at the level of advanced data access middleware platforms [8, 11]. These solutions as such address R2 and R3 (partially), but not R1 and R4. Especially notable is the rise of Object-NoSQL Data Mappers (ONDMs), which apply the principles of the widely popular Object-Relational Mapping (ORM) frameworks in a NoSQL context [6, 7, 22].

The data mapping strategies (i.e. how to map in-memory objects to in-table rows) built into these systems, however, rely extensively on the assumption of fixed database tables. In our example of Figure 1, this naive strategy would lead to the definition of three different database tables, one for each tenant, and as such these platforms do not address R4. A negative consequence of scattering data of the same type over different database tables is, for example, that the back-end NoSQL databases (which are by design distributed databases) might fragment this data over multiple database nodes, thereby negatively affecting scalability and overall performance, for example when performing queries over all Log entries.

To address R1 and R4, we envision more efficient data mapping strategies for Object-NoSQL data mappers (ONDMs). By leveraging the flexibility of columnar NoSQL databases (i.e. there is no fixed schema according to which data objects must be structured), data objects that are encrypted differently can still be stored within a single database table. This will avoid data fragmentation across multiple database rows and multiple database nodes and will allow NoSQL databases to treat this data more efficiently.

5. RELATED WORK

Recently, there has been extensive research highlighting the security issues in NoSQL databases [13, 18, 21]. To this end, several research contributions [10, 20, 22] have been made, which provide encryption support for NoSQL databases at different levels to protect outsourced data.

Existing state-of-practice solutions [8, 11] that support encryption at the middleware level either (i) offer limited support for encryption, where only attributes with specific data types can be encrypted, or (ii) provide solution-specific data types to be used in the application to encrypt sensitive data. Moreover, they operate on a fixed data model and provide no flexibility to support encryption at various levels of granularity.

A number of libraries are available in different programming languages that provide encryption support at the application level. For example, one of the easiest ways to encrypt sensitive data in Java is by using the custom data types provided by Java Simplified Encryption (Jasypt) [8]. However, these libraries have various limitations: (i) they need to be configured to specify sensitive data at deployment time; and (ii) they do not support encryption at various levels of granularity that can also be altered at run time.

6. CONCLUSION

Outsourcing data to third-party cloud storage providers offers a wide array of clear benefits over hosting data in costly on-premise data centers. In practice, however, data confidentiality considerations often prohibit outsourcing confidential application data to external and often untrusted storage providers.

This paper argues that data confidentiality considerations of multi-tenant SaaS applications must be supported within the application-level middleware by utilizing NoSQL databases. We envision an efficient data mapping
strategy that leverages the flexibility of columnar NoSQL databases such as Apache Cassandra to support efficient, dynamic, and scalable data encryption that can be enacted at different levels of granularity.

This work fits into our ongoing research on application-level middleware for federated data storage architectures in support of multi-tenant SaaS applications. As this research is ongoing, future work involves implementing the proposed data mapping strategy and validating it in a prototype.

Acknowledgments. This research is partially funded by the Research Fund KU Leuven (project GOA/14/003 - ADDIS), the SBO DeCoMAdS project, and the iMinds SeClosed project.

7. REFERENCES

[1] D. Agrawal, S. Das, and A. El Abbadi. Big data and cloud computing: Current state and future opportunities. In Proceedings of the 14th International Conference on Extending Database Technology, EDBT/ICDT '11, pages 530–533, New York, NY, USA, 2011. ACM.
[2] D. Bermbach, M. Klems, S. Tai, and M. Menzel. Metastorage: A federated cloud storage system to manage consistency-latency tradeoffs. In 2011 IEEE 4th International Conference on Cloud Computing, pages 452–459, July 2011.
[3] G. DeCandia et al. Dynamo: Amazon's highly available key-value store. ACM SIGOPS Operating Systems Review, 41(6):205–220, 2007.
[4] K. Grolinger, W. A. Higashino, A. Tiwari, and M. A. Capretz. Data management in cloud environments: NoSQL and NewSQL data stores. Journal of Cloud Computing: Advances, Systems and Applications, 2(1):1, 2013.
[5] J. Hu and A. Klein. A benchmark of transparent data encryption for migration of web applications in the cloud. In Dependable, Autonomic and Secure Computing, 2009. DASC '09. Eighth IEEE International Conference on, pages 735–740, Dec 2009.
[6] M. Huber, M. Gabel, M. Schulze, and A. Bieber. Cumulus4j: A Provably Secure Database Abstraction Layer, pages 180–193. Springer Berlin Heidelberg, Berlin, Heidelberg, 2013.
[7] Impetus. A JPA 2.1 compliant Polyglot Object-Datastore Mapping Library for NoSQL Datastores. https://github.com/impetus-opensource/Kundera, 2016. [Last visited on December 02, 2016].
[8] Jasypt. Java Simplified Encryption. http://www.jasypt.org/, 2016. [Last visited on July 19, 2016].
[9] L. M. Kaufman. Data security in the world of cloud computing. IEEE Security & Privacy, 7(4):61–64, July 2009.
[10] L. Liu and J. Gai. A new lightweight database encryption scheme transparent to applications. In 2008 6th IEEE International Conference on Industrial Informatics, pages 135–140, July 2008.
[11] K. Lorey, E. Buchmann, and K. Böhm. TEAL: Transparent Encryption for the Database Abstraction Layer. In Proceedings of the CAiSE 16 Forum at the 28th International Conference on Advanced Information Systems Engineering, New York, NY, USA, 2016.
[12] S. Malkowski et al. Empirical analysis of database server scalability using an n-tier benchmark with read-intensive workload. In Proceedings of the 2010 ACM Symposium on Applied Computing, SAC '10, pages 1680–1687, New York, NY, USA, 2010. ACM.
[13] L. Okman et al. Security issues in NoSQL databases. In 2011 IEEE 10th International Conference on Trust, Security and Privacy in Computing and Communications, pages 541–547. IEEE, 2011.
[14] A. Rafique, D. Van Landuyt, B. Lagaisse, and W. Joosen. Policy-driven data management middleware for multi-cloud storage in multi-tenant SaaS. In 2015 IEEE/ACM 2nd International Symposium on Big Data Computing (BDC), pages 78–84, Dec 2015.
[15] A. Rafique, D. Van Landuyt, B. Lagaisse, and W. Joosen. On the performance impact of data access middleware for NoSQL data stores. IEEE Transactions on Cloud Computing, PP(99):1–1, 2016.
[16] A. Rafique, S. Walraven, B. Lagaisse, T. Desair, and W. Joosen. Towards portability and interoperability support in middleware for hybrid clouds. In Computer Communications Workshops (INFOCOM WKSHPS), 2014 IEEE Conference on, pages 7–12. IEEE, 2014.
[17] S. Q. Ren, S. H. Zhang, Y. Z. Chen, M. R. Felipe, Y. J. Ha, and K. M. M. Aung. Empirical study of accelerating data protection for multi-tenant storage. Advances in Information Sciences and Service Sciences, 5(13):19–25, Aug 2013.
[18] A. Ron, A. Shulman-Peleg, and A. Puzanov. Analysis and mitigation of NoSQL injections. IEEE Security & Privacy, 14(2):30–39, Mar 2016.
[19] S. Seshadri, L. Liu, B. F. Cooper, L. Chiu, K. Gupta, and P. Muench. A fault-tolerant middleware architecture for high-availability storage services. In IEEE International Conference on Services Computing (SCC 2007), pages 286–293, July 2007.
[20] V. Sidorov et al. Transparent data encryption for data-in-use and data-at-rest in a cloud-based database-as-a-service solution. In 2015 IEEE World Congress on Services, pages 221–228, June 2015.
[21] D. S. Terzi, R. Terzi, and S. Sagiroglu. A survey on security and privacy issues in big data. In 10th International Conference for Internet Technology and Secured Transactions, pages 202–207, Dec 2015.
[22] X. Tian, B. Huang, and M. Wu. A transparent middleware for encrypting data in MongoDB. In Electronics, Computer and Applications, 2014 IEEE Workshop on, pages 906–909, May 2014.
[23] T. Waage and L. Wiese. Foundations and Practice of Security: 7th International Symposium, FPS 2014, Montreal, QC, Canada, November 3-5, 2014. Revised Selected Papers, chapter Benchmarking Encrypted Data Storage in HBase and Cassandra with YCSB, pages 311–325. Cham, 2015.
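
Illustrative sketch. To make the data mapping strategy envisioned in Section 4 more concrete, the following minimal Python sketch stores the three log events of Figure 1 in one schemaless table while encrypting a different set of attributes per tenant (R1). It is not the implementation proposed in this paper, which is left as future work: the policy format, the enc_ column prefix, the helper names, and the per-tenant keys are all hypothetical, and the toy keystream cipher built from hashlib only stands in for a real algorithm such as AES and must not be used to protect data.

```python
import hashlib
from typing import Any, Dict, Set

# Hypothetical per-tenant policy: attributes to encrypt ("*" = full entity encryption).
POLICY: Dict[int, Set[str]] = {
    1: {"*"},
    2: {"DeviceID", "DeviceName"},
    3: {"DeviceID", "DeviceName", "DeviceType"},
}
# Assumption: these stay in plaintext so a row can be located and its key selected.
ALWAYS_PLAIN = {"ID", "Tenant"}
# Hypothetical per-tenant keys; a real system would use a key management service.
KEYS: Dict[int, bytes] = {1: b"key-tenant-1", 2: b"key-tenant-2", 3: b"key-tenant-3"}

# The three events of Figure 1 (the elided "..." attributes are left out).
ROWS = [
    {"ID": 1, "DeviceID": 401, "DeviceName": "BRI-Router-001", "DeviceType": "ciscortr", "Tenant": 1},
    {"ID": 2, "DeviceID": 701, "DeviceName": "BRI-special-001", "DeviceType": "cisco-ace", "Tenant": 2},
    {"ID": 3, "DeviceID": 301, "DeviceName": "CAN-PIX-FW-001", "DeviceType": "Pix7", "Tenant": 3},
]


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256 in counter mode. NOT secure; placeholder for AES."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_encrypt(key: bytes, nonce: bytes, plaintext: str) -> str:
    data = plaintext.encode("utf-8")
    stream = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream)).hex()


def toy_decrypt(key: bytes, nonce: bytes, token: str) -> str:
    data = bytes.fromhex(token)
    stream = _keystream(key, nonce, len(data))
    return bytes(a ^ b for a, b in zip(data, stream)).decode("utf-8")


def sensitive_columns(row: Dict[str, Any]) -> Set[str]:
    """Resolve the tenant's policy against the attributes present in this row."""
    wanted = POLICY.get(row["Tenant"], set())
    candidates = set(row) - ALWAYS_PLAIN
    return candidates if "*" in wanted else candidates & wanted


def to_stored_row(row: Dict[str, Any], keys: Dict[int, bytes]) -> Dict[str, Any]:
    """Map an in-memory object to a stored row; encrypted columns get an enc_ prefix."""
    key = keys[row["Tenant"]]
    secret = sensitive_columns(row)
    stored: Dict[str, Any] = {}
    for name, value in row.items():
        if name in secret:
            nonce = f"{row['ID']}:{name}".encode()  # distinct per row and column
            stored["enc_" + name] = toy_encrypt(key, nonce, str(value))
        else:
            stored[name] = value
    return stored


def from_stored_row(stored: Dict[str, Any], keys: Dict[int, bytes]) -> Dict[str, Any]:
    """Inverse mapping; decrypted values come back as strings in this sketch."""
    key = keys[stored["Tenant"]]
    row: Dict[str, Any] = {}
    for name, value in stored.items():
        if name.startswith("enc_"):
            plain_name = name[len("enc_"):]
            nonce = f"{stored['ID']}:{plain_name}".encode()
            row[plain_name] = toy_decrypt(key, nonce, value)
        else:
            row[name] = value
    return row


# One table (here a dict keyed by ID) holds rows whose column sets differ per tenant.
LOG_TABLE = {row["ID"]: to_stored_row(row, KEYS) for row in ROWS}
```

In this sketch all three events end up in the same LOG_TABLE even though each row carries a different mix of plaintext and enc_-prefixed columns, which is the property a schema-flexible columnar store would be relied on to provide. Because every value is serialized with str() before encryption, a real mapper would additionally have to record type information to restore, for example, an integer DeviceID.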