Proof of Training Data

The Role of Blockchain in Building Trust and Security in AI Systems.

NOTE: In February 2024, CAD token was rebranded to CADAI token.

Introduction

Artificial Intelligence (AI) has become omnipresent, touching almost every aspect of our daily lives - from healthcare to transportation to the arts. One of the most dynamic areas where AI has shown immense potential is in computer-aided design (CAD). Despite the monumental advances, one key challenge remains: trust. How do you know that the AI system you're using is safe, transparent, and reliable? Enter CADAICO and its AI assistant for CAD. This article explores the integral role of blockchain technology in establishing trust, enhancing security, and facilitating AI systems like CADAI.

_________________________________________________________________________

The Need for Trust and Security in AI Systems

While AI models are incredibly powerful, they can sometimes resemble a black box shrouded in complexity. Concerns are often raised about the integrity of the data flowing into these models and the potential risks of unauthorized access. These concerns are even more pronounced in areas such as CAD, where the data is highly specialized and often proprietary. Security breaches can result in significant economic and intellectual losses, while data integrity issues can lead to devastating miscalculations.

The most prominent example of this lack of trust is ChatGPT, together with the unresolved regulatory questions around it. Surveys have reported that around 11% of the data entered into ChatGPT is confidential, which has led some companies to ban its use in order to protect their IP.

Blockchain offers a promising solution for increasing trust and transparency in AI systems, as any data that goes into the AI model can be publicly verified. Ultimately, it provides a trusted source of information to prove what data is being used and how. For this purpose, CADAICO is developing a novel algorithm called "Proof of Training Data".

The CADAICO Platform

Founded with a vision to revolutionize the CAD sector, CADAICO wants to set a new industry standard with its CADAI platform: a robust, trustworthy, and efficient AI assistant for CAD tasks. The platform is designed to foster an ecosystem that brings together different stakeholders - designers, developers, and companies. However, the real strength of CADAI lies in its underlying architecture, which is powered by blockchain technology.

Why Blockchain?

Blockchain technology offers a unique combination of characteristics such as immutability, transparency, and security - features that are invaluable in addressing the trust deficit plaguing AI systems. Unlike traditional centralized databases, which are susceptible to hacking, corruption, and unauthorized access, a decentralized blockchain network ensures that data is transparent yet secure.

Core Benefits of CADAICO's Blockchain Approach

One of the key technical foundations of blockchain is the use of Merkle trees and cryptographic hash functions to make blocks of data tamper-evident. Each block commits to the hash of the block before it, so once a data point is recorded in a block and added to the chain, changing it would mean recalculating that block and all subsequent blocks and having the network accept the rewritten history, which is impractical on a sufficiently decentralized network. This data integrity is particularly important for CADAI, as it keeps all design iterations, specifications, and user activities transparent and makes any later alteration detectable.
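To make the Merkle tree idea concrete, here is a minimal sketch in Python. It is not CADAICO's implementation: the function names, the sample design labels, and the choice to duplicate the last node on odd levels are all assumptions made for illustration. It shows how a set of records folds into one root hash, and how changing any single record changes that root.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold a list of records into a single 32-byte root hash."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = [sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last node
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical design records; the labels are illustrative only.
designs = [b"bracket rev A", b"bracket rev B", b"housing rev A"]
root = merkle_root(designs)

# Altering one record produces a different root, so the change is detectable.
tampered = [b"bracket rev A", b"bracket rev X", b"housing rev A"]
assert merkle_root(tampered) != root
```

Only the root needs to be stored in a block header; anyone holding the records can recompute it and compare.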

Algorithmic Approach: How "Proof of Training Data" Works

The "Proof of Training Data" algorithm is designed to record critical data points and metadata on the blockchain. This includes, but is not limited to, the origin of the data, timestamps of when the data was used for training, quality metrics, and changes in the performance of the AI model after training. Smart contracts embedded in the blockchain automatically validate the data against pre-defined quality metrics before it is used for training, ensuring data integrity.

While the algorithm is still being tested and will not be publicly available until the release of the CADAI model, a basic conceptual approach can be seen below:

Data Onboarding

As training data enters the system, a unique cryptographic hash is generated for each data point and recorded on the blockchain.
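As the algorithm itself is not yet public, the following is only a sketch of what per-data-point hashing could look like. It assumes a data point can be represented as a JSON-serializable record and that SHA-256 is the hash function; the `fingerprint` helper and the field names are hypothetical.

```python
import hashlib
import json


def fingerprint(data_point: dict) -> str:
    """Return the SHA-256 hex digest of a canonical JSON encoding of the record."""
    # Sorting keys and fixing separators makes the encoding deterministic,
    # so the same record always yields the same hash.
    canonical = json.dumps(data_point, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical training sample; field names are illustrative only.
sample = {"source": "designer-42", "format": "STEP", "part": "bracket", "units": "mm"}
reordered = {"units": "mm", "part": "bracket", "format": "STEP", "source": "designer-42"}

assert fingerprint(sample) == fingerprint(reordered)           # key order is irrelevant
assert fingerprint(sample) != fingerprint({**sample, "units": "in"})  # any change shows
```

Recording only the hash on-chain keeps proprietary design data off the public ledger while still letting its owner prove later what was submitted.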

Smart Contracts for Quality Check

Before data is used, smart contracts validate it based on pre-defined metrics such as data completeness, data consistency and data relevance.
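The kind of logic such a contract might encode can be modeled off-chain. The sketch below is plain Python rather than an on-chain contract language, and every rule in it (the required fields, the allowed units, the accepted formats) is an assumption chosen for illustration, not CADAICO's actual metrics.

```python
from dataclasses import dataclass

# Assumed schema and rule sets, for illustration only.
REQUIRED_FIELDS = ("source", "format", "units", "geometry")
ALLOWED_UNITS = {"mm", "cm", "m", "in"}
RELEVANT_FORMATS = {"STEP", "IGES", "STL"}


@dataclass
class CheckResult:
    accepted: bool
    failures: list


def validate(data_point: dict) -> CheckResult:
    failures = []
    # Completeness: every required field is present and non-empty.
    missing = [f for f in REQUIRED_FIELDS if not data_point.get(f)]
    if missing:
        failures.append("incomplete: missing " + ", ".join(missing))
    # Consistency: units, when given, come from a known set.
    units = data_point.get("units")
    if units and units not in ALLOWED_UNITS:
        failures.append(f"inconsistent: unknown units {units!r}")
    # Relevance: only CAD exchange formats are in scope.
    fmt = data_point.get("format")
    if fmt and fmt not in RELEVANT_FORMATS:
        failures.append(f"irrelevant: unsupported format {fmt!r}")
    return CheckResult(accepted=not failures, failures=failures)


good = {"source": "designer-42", "format": "STEP", "units": "mm", "geometry": "solid"}
bad = {"source": "designer-42", "format": "PNG", "units": "mm"}

assert validate(good).accepted
assert not validate(bad).accepted
```

Returning the list of failed checks, rather than a bare yes or no, gives contributors a reason for each rejection.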

Training and Logging

As the AI model is trained, metadata such as model accuracy, the algorithm used and performance metrics are recorded on the blockchain.

Data Usage Transparency

All transactions related to the use of data for training purposes are time-stamped and recorded, making it transparent how data is used in the context of training the CADAI model.
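Time-stamped, append-only recording can be illustrated with a small hash-chained log. This is a single-process sketch, not a distributed ledger: the `TrainingLog` class, the event names, and the payload values (including the accuracy figure) are made up for the example.

```python
import hashlib
import json
import time


class TrainingLog:
    """Append-only log in which every entry commits to the previous one."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        encoded = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(encoded).hexdigest()

    def append(self, event: str, payload: dict, timestamp=None) -> dict:
        body = {
            "event": event,
            "payload": payload,
            "timestamp": time.time() if timestamp is None else timestamp,
            "prev_hash": self.entries[-1]["hash"] if self.entries else self.GENESIS,
        }
        entry = {**body, "hash": self._digest(body)}
        self.entries.append(entry)
        return entry

    def is_intact(self) -> bool:
        """Recompute every hash and check each link to the previous entry."""
        prev = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(body):
                return False
            prev = entry["hash"]
        return True


log = TrainingLog()
log.append("data_used", {"data_hash": "placeholder-hash", "run": "run-001"})
log.append("training_finished", {"run": "run-001", "algorithm": "example", "accuracy": 0.91})
assert log.is_intact()

log.entries[0]["payload"]["run"] = "run-999"  # attempt to rewrite history
assert not log.is_intact()
```

On a real blockchain the same linkage is enforced by many independent nodes, which is what makes a rewrite impractical rather than merely detectable.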

Public Verification

The blockchain allows any user to verify the origin of the data and its subsequent use, thereby establishing the legitimacy of the data.
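One common way to let a third party verify that a specific data point belongs to a recorded dataset, without publishing the whole dataset, is a Merkle inclusion proof. The sketch below assumes that approach; whether "Proof of Training Data" uses it is not stated in this article, and all function names are hypothetical.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from the leaf hashes up to the root."""
    levels = [[h(leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        level = list(levels[-1])
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumed convention: duplicate the last node
        levels.append([h(level[i] + level[i + 1]) for i in range(0, len(level), 2)])
    return levels


def make_proof(leaves, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    proof = []
    for level in build_levels(leaves)[:-1]:
        if len(level) % 2 == 1:
            level = level + [level[-1]]
        sibling = index ^ 1  # neighbor within the same pair
        proof.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return proof


def verify(leaf, proof, root):
    """Recompute the path to the root using only the leaf and its proof."""
    node = h(leaf)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


dataset = [b"sample-0", b"sample-1", b"sample-2"]
root = build_levels(dataset)[-1][0]   # the value that would be published on-chain
proof = make_proof(dataset, 2)

assert verify(b"sample-2", proof, root)
assert not verify(b"sample-X", proof, root)
```

The verifier needs only the published root, the data point, and a proof whose length grows logarithmically with the dataset size.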

Token-Based Incentives

CADAICO's native CAD token can be used to incentivize quality data contribution and to compensate those who contribute to validating transactions on the network.

Increased Efficiency

CADAICO uses smart contracts for data validation. These self-executing contracts, with terms written directly into the code, enable automatic verification of data quality. For example, when a user uploads a new design to CADAI, a smart contract can instantly check it against pre-defined quality metrics before it's accepted into the system. This type of real-time, transparent validation method significantly increases operational efficiency.

Data Distribution and Transaction Settlement

Blockchain becomes a powerful tool in a multi-stakeholder platform by efficiently tracking transactions. It offers low-cost, near-instantaneous settlement on a global scale, while securely recording all data transfers. The CADAICO blockchain uses a consensus algorithm based on Byzantine Fault Tolerance (BFT). In an ecosystem with multiple actors - designers, developers, users - achieving consensus on transactions is critical. BFT algorithms guarantee that the system will reach consensus as long as the share of faulty or malicious nodes stays below a pre-determined threshold (fewer than one third of all nodes in classical BFT protocols).
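For classical BFT protocols, the relationship between network size and fault tolerance is simple arithmetic: n nodes tolerate f Byzantine nodes only if n >= 3f + 1. The helper functions below work through that bound; it is an assumption that CADAICO's consensus follows the classical bound, since the article does not give the exact fraction.

```python
def max_faulty(n: int) -> int:
    """Largest number of Byzantine nodes f that n nodes tolerate (n >= 3f + 1)."""
    if n < 1:
        raise ValueError("need at least one node")
    return (n - 1) // 3


def quorum(n: int) -> int:
    """Smallest vote count q such that any two quorums share at least
    f + 1 nodes, and therefore at least one honest node: 2q - n >= f + 1."""
    return (n + max_faulty(n)) // 2 + 1


for n in (4, 7, 10):
    print(f"{n} nodes: tolerates {max_faulty(n)} faulty, quorum of {quorum(n)}")
```

With 4 nodes, for instance, one may be faulty and 3 matching votes are needed; adding nodes up to 6 does not raise the tolerance, which only increases again at 7.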

Secure Access Management

CADAICO uses Role-Based Access Control (RBAC) to manage licenses, applications, and data. Each role has a unique cryptographic key pair, and permissions are verified using digital signatures. This allows the blockchain to efficiently manage who has access to different types of data, whether it's a user accessing design templates or an administrator changing system parameters.

The CAD Token: Fueling the Ecosystem

The CAD token serves multiple purposes, from transaction currency to governance token. Technically, it follows the ERC-20 standard, which makes it highly interoperable and easy to integrate into decentralized applications (dApps). Future plans include a migration to a dedicated standard tailored to the CADAICO blockchain. For more information on the CAD token, you can read the CADAICO - $CAD Token Economy paper we published.
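For readers unfamiliar with ERC-20, the standard defines a small interface: totalSupply, balanceOf, transfer, approve, allowance, and transferFrom. The sketch below models that bookkeeping in memory using Python naming. It is not the CAD token contract: real ERC-20 contracts run on-chain, take the caller from the transaction rather than as an argument, and emit Transfer and Approval events. The account names and amounts are made up.

```python
class Erc20Model:
    """In-memory model of ERC-20 balance and allowance bookkeeping."""

    def __init__(self, initial_holder: str, supply: int):
        self._total_supply = supply
        self._balances = {initial_holder: supply}
        self._allowances = {}  # (owner, spender) -> remaining amount

    def total_supply(self) -> int:
        return self._total_supply

    def balance_of(self, owner: str) -> int:
        return self._balances.get(owner, 0)

    def allowance(self, owner: str, spender: str) -> int:
        return self._allowances.get((owner, spender), 0)

    def _move(self, src: str, dst: str, amount: int) -> None:
        if amount < 0 or self.balance_of(src) < amount:
            raise ValueError("insufficient balance")
        self._balances[src] = self.balance_of(src) - amount
        self._balances[dst] = self.balance_of(dst) + amount

    def transfer(self, sender: str, to: str, amount: int) -> None:
        self._move(sender, to, amount)

    def approve(self, owner: str, spender: str, amount: int) -> None:
        self._allowances[(owner, spender)] = amount

    def transfer_from(self, spender: str, owner: str, to: str, amount: int) -> None:
        if self.allowance(owner, spender) < amount:
            raise PermissionError("allowance exceeded")
        self._move(owner, to, amount)
        self._allowances[(owner, spender)] = self.allowance(owner, spender) - amount


token = Erc20Model("treasury", 1000)
token.transfer("treasury", "alice", 100)
token.approve("alice", "bob", 40)                 # alice lets bob spend up to 40
token.transfer_from("bob", "alice", "carol", 30)  # bob moves 30 of alice's tokens

assert token.balance_of("alice") == 70
assert token.allowance("alice", "bob") == 10
```

The approve and transferFrom pair is what allows dApps to move tokens on a user's behalf, which is the main reason the standard integrates so easily.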