ODPi Announces Egeria for Open Sharing, Exchange and Governance of Metadata
The Linux Foundation | 27 August 2018
Industry’s First Open Metadata Standard Helps Organizations Better Understand, Manage and Gain Value from Data
Vancouver, BC, Canada – August 27, 2018 – Open Source Summit North America — ODPi, a nonprofit organization accelerating the open ecosystem of big data solutions, today announced Egeria, a new project from ODPi that supports the free flow of metadata between different technologies and vendor offerings. Egeria enables organizations to locate, manage and use their data more effectively.
Last year’s ODPi white paper on “The Year of Enterprise-wide Production Hadoop” found that Data Governance and Security were the biggest blocking factors to enabling enterprises to take big data into true production. Recent data privacy regulations such as GDPR have brought these concerns to the forefront, and enterprises around the globe need a standard for ensuring that data provenance and management are clear and consistent across the enterprise. Egeria enables this, as the only open source driven solution designed to set a standard for leveraging metadata in line-of-business applications and for enabling metadata repositories to federate across the enterprise.
“A consistent view on data across the entire landscape is essential for any organisation that wants to become data driven. Not just where the data is, but also the quality, the ownership, and the full lineage across the entire set of technologies used,” said Ferd Scheepers, chief information architect, ING. “The open metadata standard delivered by Egeria delivers this consistent view across all the technologies, while reducing the cost of metadata capture, and the management challenges of working with various data tool vendors.”
Egeria is built on open standards and delivered under the Apache 2.0 open source license. The ODPi Egeria project creates a set of open APIs, types and interchange protocols to allow all metadata repositories to share and exchange metadata. From this common base, it adds governance, discovery and access frameworks for automating the collection, management and use of metadata across an enterprise. The result is an enterprise catalog of data resources that are transparently assessed, governed and used in order to deliver maximum value to the enterprise.
“Egeria’s open source metadata management presents an exciting opportunity to rethink both management and governance of data to provide greater trust and flexibility in how we all share and consume data,” said John Mertic, director of program management, ODPi. “Egeria’s open governance model allows our community and practitioners to develop and evolve the base for use in any offerings and deployments.”
IBM and ING, vendors and end users collaborated on the first Egeria release, which was initially incubated as part of the Apache Atlas project (an open source metadata repository designed for the Apache Hadoop ecosystem). IBM and ING jump-started Egeria with a significant code donation. ODPi members and end users are actively collaborating to expand the Egeria code base with standard integration points between metadata repositories and the line-of-business tools that use data. An Apache Atlas patch is available for immediate use, and an Egeria proof of concept is complete for IBM’s InfoSphere Information Governance Catalog.
“Changing the availability and the quality of metadata will in turn improve the agility of the data scientist, as well as the transparency of the results they produce,” said Mandy Chessell, distinguished engineer and master inventor, IBM. “Egeria simplifies metadata capture and management to create a consistent view of data across all tools an organization may use.”
Egeria Project Objectives
The Egeria project focuses on three areas: Automation, Business Value and Connectivity.
- Automation — Providing an API for components that capture metadata from data platforms as data sources are created and changed. This metadata is stored in the metadata repository and results in notifications to alert governance and discovery services about the new/changed data source. It provides frameworks and servers to host bespoke components that automate the capture of detailed metadata and the actions necessary to govern data and its related assets.
- Business Value — Open metadata and governance provides specialized access services and user interfaces for key data roles such as CDO, Data Scientist, Developer, DevOps Operator, Asset Owner, and Applications. This enables metadata to directly support the work of people in the organization. The access services can also be used by tools from different vendors to deliver business value with open metadata.
- Connectivity — Connectivity enables a peer-to-peer Metadata Highway, offering open metadata exchange, linking and federation between heterogeneous metadata repositories.
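The automation and connectivity objectives above can be pictured with a small sketch. This is not Egeria's API: every name here (AssetMetadata, MetadataRepository, subscribe, save, federated_lookup) and the sample storage path are hypothetical, chosen only to illustrate the flow the objectives describe, where captured metadata is stored, interested services are notified of new or changed data sources, and a lookup can span several peer repositories.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class AssetMetadata:
    """Hypothetical description of a data source (not an Egeria type)."""
    name: str
    owner: str
    location: str


@dataclass
class MetadataRepository:
    """Toy in-memory repository that notifies listeners on every save."""
    assets: Dict[str, AssetMetadata] = field(default_factory=dict)
    listeners: List[Callable[[str, AssetMetadata], None]] = field(default_factory=list)

    def subscribe(self, listener: Callable[[str, AssetMetadata], None]) -> None:
        # Governance or discovery services would register here.
        self.listeners.append(listener)

    def save(self, asset: AssetMetadata) -> str:
        # Classify the event before storing, then alert every listener.
        event = "changed" if asset.name in self.assets else "new"
        self.assets[asset.name] = asset
        for listener in self.listeners:
            listener(event, asset)
        return event


def federated_lookup(name: str,
                     repositories: List[MetadataRepository]) -> Optional[AssetMetadata]:
    """Ask each peer repository in turn; return the first match, else None."""
    for repo in repositories:
        if name in repo.assets:
            return repo.assets[name]
    return None


# Usage: one repository receives captured metadata, a second peer holds none.
received = []
repo_a = MetadataRepository()
repo_b = MetadataRepository()
repo_a.subscribe(lambda event, asset: received.append((event, asset.name)))

repo_a.save(AssetMetadata("customers", "sales", "hdfs://cluster/customers"))
repo_a.save(AssetMetadata("customers", "finance", "hdfs://cluster/customers"))

found = federated_lookup("customers", [repo_b, repo_a])
```

In this sketch the listener sees one "new" event followed by one "changed" event, and the lookup finds the asset even though the first repository queried does not hold it. A real deployment would exchange metadata over the project's open APIs and interchange protocols rather than in-process calls.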
“As a leader in advanced analytics, SAS understands the value of full transparency related to data provenance and governance,” said Craig Rubendall, vice president of Platform R&D, SAS. “Egeria and its open approach to metadata management and integration only underscores further the need for metadata standards to promote responsible data exchange across varied technology environments.”
Additional Resources
- Read about Egeria 1.0 on the ODPi Blog
- Get a first-hand look at Egeria at the Open Source Summit August 28: Egeria Open Metadata & Governance Workshop
- Get involved at the Egeria GitHub page
- Learn About ODPi Membership
About ODPi
ODPi is a nonprofit organization committed to simplification and standardization of the big data ecosystem. As a shared industry effort, ODPi members represent big data technology, solution provider and end user organizations focused on promoting and advancing the state of big data technologies for the enterprise. For more information about ODPi, please visit: http://www.ODPi.org
###
About The Linux Foundation
The Linux Foundation is the world’s leading home for collaboration on open source software, hardware, standards, and data. Linux Foundation projects are critical to the world’s infrastructure including Linux, Kubernetes, Node.js, ONAP, OpenChain, OpenSSF, PyTorch, RISC-V, SPDX, Zephyr, and more. The Linux Foundation focuses on leveraging best practices and addressing the needs of contributors, users, and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org. The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see its trademark usage page: www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.