Technical Documentation
I. Overview
Agent Workstation (AW) is a distributed AI computing product developed by Earos, designed to provide users with an open environment for running distributed AI workloads and executing intelligent tasks efficiently in a decentralized manner. The AW system is currently built on a network of cloud service provider servers, with a future vision of enabling decentralized operation on personal computers, thereby creating a globally distributed computing network.
II. Technical Architecture
1. Basic Network
To ensure a high-availability AI computation network, the foundation architecture of AW is currently built on servers provided by cloud computing providers like AWS, Google Cloud, and Azure. These computing resources are seamlessly integrated with platform software through advanced virtualization technologies, delivering stable and high-performance compute resources to users.
Virtualization: Containerization and orchestration technologies such as Docker and Kubernetes are used for efficient management and allocation of compute nodes.
Dynamic Resource Scaling: Adaptive scaling of compute resources in accordance with user demand to maximize resource utilization.
In the future, AW will establish a fully decentralized computing network in which users can run the AW program on their personal computers, contributing idle computing resources to the network.
Home Node Deployment: Provision of a lightweight client application that facilitates seamless node deployment by end-users, requiring no specialized technical expertise.
Incentive Structure: Utilization of blockchain technology to accurately log computational contributions and reward participants with Eos tokens, thereby enhancing network engagement and participation.
2. Intelligent Scheduling
The AW platform employs intelligent scheduling algorithms for optimal task allocation, ensuring efficient utilization of computational resources while minimizing latency.
Task Allocation Algorithm: A matching algorithm that pairs tasks with nodes based on complexity and performance metrics, prioritizing allocation to the most capable nodes.
Fault Tolerance and Redundancy: Implementation of multi-node redundancy and automated failover protocols to ensure high availability and reliability of computational tasks.
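The matching step described above can be sketched as a capability filter followed by a load-based ranking. The `Node` and `Task` fields and the least-loaded scoring rule below are illustrative assumptions, not AW's actual scheduler:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    node_id: str
    cpu_cores: int
    memory_gb: int
    load: float  # current utilization in [0, 1]

@dataclass
class Task:
    task_id: str
    cpu_cores: int
    memory_gb: int

def match_task(task: Task, nodes: list[Node]) -> Optional[Node]:
    """Return the least-loaded node that satisfies the task's resource needs."""
    eligible = [n for n in nodes
                if n.cpu_cores >= task.cpu_cores and n.memory_gb >= task.memory_gb]
    if not eligible:
        return None  # no capable node; caller may queue or reject the task
    return min(eligible, key=lambda n: n.load)
```

In this sketch, ineligible nodes are filtered out first, so an idle but under-provisioned node is never chosen over a busier node that can actually run the task.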
3. Core Modules
Task Scheduling Module
Allocates AI computational tasks to cloud or user nodes for execution, with real-time monitoring of task completion status.
Reward Distribution Module
Calculates and distributes token rewards based on computational contributions, supporting a transparent mechanism with on-chain records.
User Management Module
Provides functionalities for user account registration, computational resource purchase, and task submission.
Security Module
Ensures the safety of user data and computational resources through encrypted communication and data isolation techniques.
API Module
Offers an open API interface to facilitate integration and utilization of AW's computational services by third-party developers.
III. Operational Principles
Agent Workstation (AW) is a distributed AI computing platform whose core operational principles revolve around cloud infrastructure and decentralized node computing. It employs an intelligent scheduling system to efficiently allocate computational resources to user tasks, ultimately achieving a distributed and decentralized AI computing network. Below are the detailed technical workings:
1. Computational Network Architecture
1.1 Cloud Infrastructure
At the current stage, AW leverages high-performance computing clusters from cloud service providers such as AWS, Google Cloud, and Azure to provide stable computational resources. The platform utilizes containerization technologies (such as Docker and Kubernetes) to deploy and manage computing nodes, ensuring stability and scalability during task execution.
Containerization Technologies: Each user task is allocated to run in an isolated computing environment, preventing resource contention and conflicts between tasks.
Dynamic Resource Allocation: Based on the computational requirements of user-submitted tasks (such as model size and runtime), the system dynamically allocates computational resources to optimize resource utilization.
1.2 Decentralized Node Computing (Coming soon)
As technology evolves, AW will implement a decentralized node computing model. Users will be able to run a lightweight client on their personal computers, becoming an integral part of the computational network.
User Nodes: Once connected to the AW network, user devices will share their idle computational resources to participate in distributed task execution.
Distributed Network Topology: A distributed network will be constructed using P2P (peer-to-peer) communication technology, enabling efficient task distribution and node management.
Incentive Mechanism: Blockchain technology will be employed to record the computational contributions of each node, with rewards distributed in Eos tokens to encourage network expansion.
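The contribution-based reward split described above might look like the following pro-rata sketch. The notion of "compute units" and the `allocate_rewards` helper are hypothetical; the document does not specify the actual on-chain accounting:

```python
def allocate_rewards(contributions: dict[str, float], pool: float) -> dict[str, float]:
    """Split a token pool pro-rata by each node's contributed compute units."""
    total = sum(contributions.values())
    if total == 0:
        # no recorded work this period; nothing to distribute
        return {node: 0.0 for node in contributions}
    return {node: pool * units / total for node, units in contributions.items()}
```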
2. Task Scheduling and Execution
2.1 Task Allocation Mechanism
The AW platform employs an intelligent scheduling system to allocate user-submitted tasks to the optimal nodes for execution:
Resource Matching: The system selects the best computational nodes based on task requirements such as memory, processing power, and bandwidth. Complex tasks can be decomposed into multiple subtasks and distributed across different nodes for parallel processing.
Load Balancing: Load-balancing algorithms (such as round-robin and least-connections) are used to distribute work among nodes, preventing any single node from becoming overloaded.
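A minimal least-connections balancer, one of the strategies named above, can be sketched as follows; the counter-based bookkeeping and node identifiers are illustrative:

```python
class LeastConnectionsBalancer:
    """Pick the node with the fewest active tasks; release when a task finishes."""

    def __init__(self, node_ids: list[str]):
        self.active = {node_id: 0 for node_id in node_ids}

    def acquire(self) -> str:
        # choose the node currently running the fewest tasks
        node = min(self.active, key=self.active.get)
        self.active[node] += 1
        return node

    def release(self, node: str) -> None:
        # a task on this node finished; free one connection slot
        self.active[node] -= 1
```

Unlike round-robin, this strategy adapts to tasks of uneven duration: a node stuck on a long job stops receiving new work until it catches up.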
2.2 Execution and Monitoring
Once a task begins, the system conducts real-time monitoring of the computation process to ensure the accuracy of results and the efficiency of task completion:
Real-Time Monitoring: A log collection and analysis system (e.g., ELK Stack) is employed to monitor task status, including resource utilization of computing nodes (CPU, memory, bandwidth, etc.).
Task Restart Mechanism: If a task is interrupted due to node failure during execution, the system automatically migrates the task to other nodes for re-execution, ensuring task completion.
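The restart mechanism above can be sketched as sequential migration across candidate nodes. The `execute` callback and the use of `RuntimeError` to signal node failure are assumptions made for illustration:

```python
from typing import Callable

def run_with_failover(task: str, nodes: list[str],
                      execute: Callable[[str, str], str],
                      max_attempts: int = 3) -> str:
    """Try the task on successive nodes, migrating on failure."""
    last_error = None
    for node in nodes[:max_attempts]:
        try:
            return execute(task, node)
        except RuntimeError as err:  # treat a runtime error as node failure
            last_error = err  # migrate to the next candidate node
    raise RuntimeError(f"task {task!r} failed on all attempted nodes") from last_error
```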
3. Result Verification and Output
3.1 Data Integrity Verification
To ensure the accuracy of computation results, the AW platform employs a data redundancy and verification mechanism:
Multi-Node Verification: Results for the same task are computed by multiple nodes, and their output consistency is compared.
On-Chain Recording: Computation results and task completion status are recorded on the blockchain in the form of hash values, ensuring immutability and public verifiability.
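Multi-node verification by output comparison might be sketched as a majority vote over SHA-256 digests of each node's result. The strict-majority threshold is an assumption, since the document does not specify the consensus rule:

```python
import hashlib
from collections import Counter
from typing import Optional

def result_hash(result: bytes) -> str:
    """Hash a node's raw output; this digest is what would be recorded on-chain."""
    return hashlib.sha256(result).hexdigest()

def verify_by_majority(results: list[bytes]) -> Optional[str]:
    """Accept the result whose hash a strict majority of nodes agrees on."""
    hashes = [result_hash(r) for r in results]
    digest, votes = Counter(hashes).most_common(1)[0]
    if votes * 2 > len(results):
        return digest
    return None  # no consensus; the task would need re-execution
```

Comparing fixed-size digests rather than full outputs keeps the on-chain record small while still making any divergence between nodes detectable.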
3.2 Result Delivery
Upon task completion, users can retrieve the computation results via the platform API or client download. For data requiring long-term storage, AW provides cloud storage support.
4. Incentives and Economic Model
AW encourages user participation in the computational network through a distributed incentive mechanism:
Computational Rewards: The system allocates Eos token rewards based on the computational capacity and runtime provided by user nodes, with all reward information recorded on-chain for transparency.
Task Pricing: Users pay computational fees when submitting tasks, with costs dynamically adjusted based on the resource requirements and execution time of the tasks.
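Dynamic pricing by resource requirements and execution time could be sketched as a linear rate model; all unit rates below are made-up placeholders, not AW's actual fee schedule:

```python
def task_price(cpu_core_hours: float, gpu_hours: float, gb_hours: float,
               cpu_rate: float = 0.05, gpu_rate: float = 1.20,
               mem_rate: float = 0.01) -> float:
    """Price a task as resource usage times per-unit rates (rates are illustrative)."""
    return round(cpu_core_hours * cpu_rate
                 + gpu_hours * gpu_rate
                 + gb_hours * mem_rate, 2)
```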
5. Future Evolution of the Decentralized Computing Network
5.1 Support for Edge Computing
In the future, AW will expand into the edge computing domain, facilitating the integration of diverse computational resources such as Internet of Things (IoT) devices and edge servers to build a global and efficient computing network.
5.2 Integration of Distributed Storage and Computing
By leveraging distributed storage technologies like IPFS, AW will provide an integrated solution for data storage and AI computing, catering to the diverse needs of complex tasks.
5.3 Community Governance
AW will introduce a community governance mechanism through a Decentralized Autonomous Organization (DAO), allowing users to actively participate in decision-making regarding the platform's development direction.
6. Security and Privacy Protection
Data Isolation: Each task operates within an independent virtual computing environment, ensuring data isolation and user privacy.
Encrypted Communication: Data security during transmission is ensured through encryption technologies such as SSL/TLS.
Access Control: A Role-Based Access Control (RBAC) mechanism is implemented to ensure that user tasks and data are accessible only to authorized personnel.
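A minimal RBAC check along these lines keeps a permission set per role and grants an action only if the caller's role includes it; the role names and actions below are illustrative assumptions:

```python
# Hypothetical role-to-permission mapping; not AW's actual policy.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "admin":     {"submit_task", "view_result", "manage_users"},
    "developer": {"submit_task", "view_result"},
    "viewer":    {"view_result"},
}

def is_allowed(role: str, action: str) -> bool:
    """Return True only if the role's permission set contains the action."""
    return action in ROLE_PERMISSIONS.get(role, set())
```

Unknown roles fall through to an empty permission set, so access is denied by default rather than granted.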
IV. Functional Features
- Data Transparency and Credibility:
On-Chain Transparency: Users can query detailed records of computational contributions and token rewards via the blockchain.
Open Computational Data: The platform plans to publish real-time computational data, including computational scale and node distribution, to showcase overall platform performance.
Community Oversight Mechanism: Implements a community proposal and voting system to enhance user participation and transparency.
V. Application Scenarios
AI Model Training:
Users can execute deep learning model training tasks on the AW platform, efficiently achieving their training objectives.
AI Service Deployment:
Provides computational support, facilitating users in deploying AI applications to the cloud or local networks.
Decentralized Application (dApp) Support:
Offers AI computational services for Web3 developers, fostering innovation and development in decentralized applications.
VI. Plans
Q1 2025: Launch of the decentralized node application, initiating a global distributed computing network.
Q2 2025: Expansion of AI application scenarios to support a broader range of models and task types.
Q3 2025: Promote global community collaboration to attract more users and developers to participate in platform development.