Graviti leads community of developers and data scientists to create data standards and formats that enable contributions by anyone
Napa Valley, Calif., Linux Foundation Membership Summit, November 2, 2021 — The Linux Foundation, the nonprofit organization enabling mass innovation through open source, today announced the new Project OpenBytes spearheaded by Graviti. Project OpenBytes is dedicated to making open data more available and accessible through the creation of data standards and formats.
Edward Cui is the founder of Graviti and a former machine learning expert within Uber’s Advanced Technologies Group. “For a long time, scores of AI projects were held up by a general lack of high-quality data from real use cases,” Cui said. “Acquiring higher quality data is paramount if AI development is to progress. To accomplish that, an open data community built on collaboration and innovation is urgently needed. Graviti believes it’s our social responsibility to play our part.”
By creating an open data standard and format, Project OpenBytes can reduce data contributors’ liability risks. Dataset holders are often reluctant to share their datasets publicly because they are unfamiliar with the various data licenses. When data contributors understand that their ownership of the data is well protected and that their data will not be misused, more open data becomes accessible.
Project OpenBytes will also create a standard format for data published, shared, and exchanged on its open platform. A unified format will help data contributors and consumers easily find the data they need and make collaboration easier. Together, these functions will make high-quality data more available and accessible, which benefits the whole AI community and saves the money and labor now spent on repetitive data collection.
“Project OpenBytes and community will benefit all AI developers, both academic and professional and at both large and small enterprises, by enabling access to more high-quality open datasets and making AI deployment faster and easier,” said Mike Dolan, general manager and senior vice president of Projects at the Linux Foundation.
The largest tech companies have already realized the potential of open data and how it can lead to novel academic machine learning breakthroughs and generate significant business value. However, there is not yet a well-established open data community in which various organizations collaborate under neutral and transparent governance. Under the governance of the Linux Foundation, OpenBytes aims to create data standards and formats, enable contributions of good-quality data and, more importantly, be governed in a collaborative and transparent way.
For more information, please visit https://www.openbytes.io
Supporting Quotes
ElectrifAi
“As one of the earliest AI/ML companies in the U.S., ElectrifAi is happy to support the OpenBytes project. We believe OpenBytes will help in the sharing of trusted datasets and accelerate practical AI/ML to solve real business problems,” said Luming Wang, CTO, ElectrifAi.
Jina AI
“The future of software is being eaten by open source, as well as data-sharing. OpenBytes’ announcement is a great signal for all developers on the accessibility of datasets. We are very excited to see standardized datasets available to a broader community, which will massively benefit AI engineers,” said Bing He, Co-founder & COO at Jina AI.
Motional
“Project OpenBytes will be essential to establish a vibrant open source dataset community. At Motional we are happy to contribute our freely available nuScenes and nuPlan datasets to this community. By standardizing datasets and licenses, we are making an important step towards interoperable machine learning systems and in particular safer autonomous vehicles,” said Holger Caesar, Data-Algorithms Team Lead at Motional.
Predibase
“At Predibase, we’re building the open source Ludwig AI project to make state-of-the-art deep learning accessible to everyone, but the biggest barrier to tackling more tasks has always been the lack of standards for training datasets over unstructured data like text and images. Project OpenBytes provides a common structure to unstructured data that makes it possible for low-code deep learning tools like Ludwig to automate a host of advanced computer vision, NLP, and other machine learning tasks that previously required bespoke solutions. I’m excited to see how the combination of OpenBytes and Ludwig can enable data scientists and ML engineers to spend less time figuring out how to stitch data and models together, and more time solving their business problems.”
Zilliz
“Data is crucial to the success of any Artificial Intelligence project. By sharing open datasets, Project OpenBytes will help more developers to understand, develop, and adopt AI/ML technologies. Project OpenBytes will be a fundamental component of the open-source AI ecosystem. At Zilliz, we are glad to participate and make contributions to this significant initiative,” said Jun Gu, Partner of Zilliz.
About the Linux Foundation
Founded in 2000, the Linux Foundation is supported by more than 1,000 members and is the world’s leading home for collaboration on open source software, open standards, open data, and open hardware. The Linux Foundation’s projects are critical to the world’s infrastructure, including Linux, Kubernetes, Node.js, and more. The Linux Foundation’s methodology focuses on leveraging best practices and addressing the needs of contributors, users and solution providers to create sustainable models for open collaboration. For more information, please visit us at linuxfoundation.org.
###
The Linux Foundation has registered trademarks and uses trademarks. For a list of trademarks of The Linux Foundation, please see our trademark usage page: https://www.linuxfoundation.org/trademark-usage. Linux is a registered trademark of Linus Torvalds.
Media Contact
Jennifer Cloer
Story Changes Culture
503-867-2304