Over the past two years at RMS®, we have started putting building blocks in place for the RMS Risk Data Lake™. It builds on top of a typical data lake architecture to go beyond what a “vanilla” data lake can do. In this blog, I will explore what a risk-focused data lake is and why it is critical for new risk insights.

At RMS, we started by building a platform and applications that help deliver exposure and loss analytics: Risk Modeler™ software and the ExposureIQ™, TreatyIQ™, SiteIQ™, and UnderwriteIQ™ applications, among others. These applications deal with known datasets and common paths.

RMS Intelligent Risk Platform

But for exploratory analyses, we need to go beyond the applications and open up ad hoc risk analytics. This requires unifying distinct datasets that represent risk and then programmatically animating those datasets to help answer risk-related questions. I’ll touch on some of these exploratory risk analytics, but before that, let’s start with some basic definitions.

While the “data lake” has been around for some time, the term might still be unfamiliar to many. A data lake helps simplify analytics by bringing large, distinct sources of data together under one architecture to drive the extraction of new insights. Nowadays, many companies maintain more than one data lake – they might have one focused on customer- or marketing-related insights, another on security and compliance, product analytics, and so on. Recently, some vendors have been using the terms “lake house” and “data mesh,” which combine elements of data lakes, data warehouses, and federated querying.

Whatever the nuanced name used, the implementation of a data lake boils down to:

- Unifying various data formats and structures (streaming and batched inputs; structured, semistructured, and unstructured data; relational, nested, and columnar formats such as CSV, JSON, Parquet, Avro, etc.)
- Building a data catalog over the data in the lake (using a service such as AWS Glue, Alation, etc.)
- Combining all this with an engine to query, transform, connect, and enrich data and extract new insights (using a service such as Apache Spark, Presto, etc.)

So what are the essential components of the RMS Risk Data Lake, and how is it different from a vanilla data lake? The short answer is that what we are building at RMS is an “applied” data lake designed to make risk analytics simpler for data engineers, data scientists, actuaries, data analysts, and developers. A few important attributes push our Risk Data Lake over and above a vanilla data lake: a defined risk schema, risk data preparation utilities, risk microservices, and access to third-party risk data.

Wildly varying data structures offer a good starting point, and one basic premise of a data lake is its ability to work with any structure. That’s good on paper, but without some structure around the core risk objects, it will be hard to formulate risk questions.
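To make the “unify various formats” idea concrete, here is a minimal sketch in plain Python (standard library only). The file contents, field names (`policy_id`, `region`, `tiv`), and the aggregation query are hypothetical illustrations invented for this example – they are not RMS Risk Data Lake APIs. In a real lake, an engine such as Apache Spark would do this at scale.

```python
import csv
import io
import json

# Hypothetical inputs: the same kind of exposure record arriving in two formats.
csv_data = "policy_id,region,tiv\nP1,EU,1000000\nP2,US,2500000\n"
json_data = '[{"policy_id": "P3", "region": "US", "tiv": 500000}]'

def from_csv(text):
    # Normalize CSV rows into plain dicts with typed fields.
    return [
        {"policy_id": r["policy_id"], "region": r["region"], "tiv": float(r["tiv"])}
        for r in csv.DictReader(io.StringIO(text))
    ]

def from_json(text):
    # Normalize JSON records into the same shape as the CSV rows.
    return [
        {"policy_id": r["policy_id"], "region": r["region"], "tiv": float(r["tiv"])}
        for r in json.loads(text)
    ]

# Unified view: one list of records, regardless of the source format.
records = from_csv(csv_data) + from_json(json_data)

# A simple "query" over the unified records: total insured value by region.
tiv_by_region = {}
for r in records:
    tiv_by_region[r["region"]] = tiv_by_region.get(r["region"], 0.0) + r["tiv"]
```

The point of the sketch is the normalization step: once every source is mapped onto one common record shape, queries no longer care where the data came from – which is exactly the simplification a data lake engine provides.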