The Haypenny vision is for everybody on Earth to trade goods and services using digital currencies every day, many times a day, at a per-transaction cost low enough that the engine can offer free transactions across the internet.
To realize that vision, we invented a new paradigm for value transfer, block-split-combine, and developed an entirely new kind of transaction engine that can cost-effectively scale to millions of transactions per second with absolute transaction integrity, multi-site redundancy, and consistently low latency.
The system described here achieves the lowest possible cost per transaction for an ultra-high-scale system: two network operations and two non-volatile storage operations per transaction side, per data center location (typically three locations, called asynchronously).
To achieve this, we made key system and application design decisions: a greatly simplified Haypenny financial transaction model consisting of only two basic operations; a four-tier system model of client, real-time, near-real-time, and offline support systems; and an entirely purpose-built software system composed of several scratch-built components.
The resulting system has been benchmarked and cost-modeled to show that it can scale to its initial goal of one million peak transactions per second and 100 billion transactions per month for approximately $0.000004 per monthly transaction in total data center fees. This is achieved while writing each transaction to multiple, decentralized data stores in real time and memorializing each transaction forever in indelible WORM storage within approximately 30 seconds.
The first step in creating an ultra-high capacity, ultra-low cost per transaction system is to carefully define the application model in a way that reduces system complexity and facilitates key optimizations.
The core function of the Haypenny system is to enable the transfer of numerically delineated value from one entity to another. The Haypenny system pares this functionality down to its absolute essence, defining only one core system object--called a "block"--consisting of an identifier, a balance, and a currency identifier (also known as the "realm" or "coin id").
A Block has two possible operations: split and combine. A split operation subtracts an amount from one block's balance and creates a new block holding that amount; a combine operation takes two blocks, decrements the balance of one of the blocks, and adds that amount to the balance of the other.
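A minimal sketch of these two operations in C follows. The field names and sizes, and the identifier-generating helper, are illustrative assumptions for this document, not Haypenny's production data layout.

    #include <stdint.h>

    /* Illustrative block record: an identifier, a balance, and a
       currency ("realm") identifier, per the definition above.          */
    typedef struct {
        uint8_t  id[16];     /* block identifier (128 bits assumed here)   */
        uint64_t balance;    /* balance in smallest currency units         */
        uint32_t realm;      /* currency identifier ("realm" / "coin id")  */
    } block_t;

    /* Hypothetical helper that fills 'id' with a fresh unique identifier. */
    extern void new_block_id(uint8_t id[16]);

    /* Split: subtract 'amount' from 'src' and create a new block holding it. */
    int block_split(block_t *src, uint64_t amount, block_t *out_new) {
        if (amount == 0 || amount > src->balance)
            return -1;                     /* insufficient balance          */
        src->balance -= amount;
        new_block_id(out_new->id);
        out_new->balance = amount;
        out_new->realm   = src->realm;     /* same currency as the source   */
        return 0;
    }

    /* Combine: move 'amount' from 'src' into 'dst' (the "combine block"). */
    int block_combine(block_t *src, block_t *dst, uint64_t amount) {
        if (src->realm != dst->realm || amount == 0 || amount > src->balance)
            return -1;
        src->balance -= amount;
        dst->balance += amount;
        return 0;
    }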
The Haypenny paradigm has no notion of identity outside of one's knowledge of the block identifier, and is thus akin to a digital form of physical cash: possession of the identifier alone confers the value.
The Haypenny system is broken up into four execution tiers, each with their own distinct response time and uptime requirements: client, real-time, near-real-time, and offline.
This segmentation allows the most difficult technical problem, the real-time transaction request response, to be completely isolated and pared down to only the requirements for responding to the request.
The system's four tiers, therefore, support each other: the near-real-time systems offload the real-time systems, and the offline systems offload the near-real-time systems. Client systems also serve to simplify the task for the real-time systems and the system's API is designed to facilitate this.
The real-time and near-real-time systems are treated as a set and are duplicated in multiple geographically separate data centers, typically at least three locations for "N+2" redundancy, e.g. the ability to safely continue real-time operations even if an entire data center is disrupted.
Here are the details of the four types of subsystems:
The Haypenny system is completely "API driven" from the front-end standpoint in that no end-user display content is delivered from any Haypenny systems. Rather, content is served entirely through either a content delivery network or a mobile app store. Front-end content, therefore, is a non-issue for scalability and cost, since this approach supports a virtually limitless amount of traffic at very low cost.
Haypenny's real-time systems provide a REST API that allows Haypenny transactions to be created and collected. This API is provided over HTTPS, as is typical for services of this kind.
Haypenny's real-time systems include:
While it is the most generic component of the Haypenny system, this component is critically important, not only for the necessary task of dividing requests among a pool of Front-end Processors, but also for providing a "pinhole" security approach that stops probing attacks on the Front-end Processors and aids in mitigating distributed denial-of-service (DDoS) attacks.
This is a pool of systems that unpack and interpret API calls, make calls to the Data Table Processors, and return a properly formatted reply to the caller. Calls to the Data Table Processors are performed asynchronously and in parallel, so the overall latency to the data store is bounded by the single slowest call.
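Although the Front-end Processors themselves are written in Java (as noted below), the fan-out pattern can be sketched in C using poll(): both requests are issued back to back and the caller waits until both replies arrive, so the total wait is bounded by the slower of the two calls. Socket setup, message framing, and error handling are deliberately simplified assumptions.

    #include <poll.h>
    #include <unistd.h>

    /* Send one request to each of two Data Table Processor sockets, then
       wait until both replies have arrived. Assumes connected sockets and
       that each reply fits in a single read (a simplification).          */
    int fanout_two(int fd_a, int fd_b,
                   const char *req_a, size_t len_a,
                   const char *req_b, size_t len_b,
                   char *resp_a, size_t cap_a,
                   char *resp_b, size_t cap_b)
    {
        if (write(fd_a, req_a, len_a) < 0) return -1;
        if (write(fd_b, req_b, len_b) < 0) return -1;

        struct pollfd pfds[2] = {
            { .fd = fd_a, .events = POLLIN },
            { .fd = fd_b, .events = POLLIN },
        };
        int pending = 2;

        while (pending > 0) {
            if (poll(pfds, 2, 5000) <= 0)      /* 5 s timeout for the example */
                return -1;
            if (pfds[0].revents & POLLIN) {
                if (read(fd_a, resp_a, cap_a) <= 0) return -1;
                pfds[0].fd = -1;               /* done: poll() ignores -1 fds */
                pending--;
            }
            if (pfds[1].revents & POLLIN) {
                if (read(fd_b, resp_b, cap_b) <= 0) return -1;
                pfds[1].fd = -1;
                pending--;
            }
        }
        return 0;
    }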
The Front-end Processors are written in Java and use the Jetty HTTP server framework. The decision to go with this framework rather than pure C was made deliberately: in benchmarks, and in terms of overhead alone, Jetty responded to HTTP requests within 20% of the performance of the fastest pure C framework. That overhead, however, represents only about 20% of the overall API call time, the balance being taken up by the IO to the Data Table Processors. A move to pure C and a relatively less-tested HTTP server framework would therefore yield only approximately 4% lower latency. Java, on the other hand, offers significant advantages over C in code safety, maintainability, and the availability of already-tested, secure libraries for various operations.
Front-end Processors are responsible for all "housekeeping" tasks, including system metrics. These systems do not track individual requests: access logging for system control and monitoring purposes is left to the load balancer component (non-transactional, "imperfect" tracking), while the Data Table Processors handle the primary transaction logging.
The Data Table Processors are written in C and connect to the Front-end Processors using an open TCP/IP socket. They maintain a memory-based hash table of their blocks.
Data Table Processors are arranged in a sharding approach wherein each instance contains blocks and transactions of a certain modulo. These systems hold all Haypenny block identifiers in existence, and grow both "vertically" (data size) and "horizontally" (request volume) by adding shards.
Data Table Processors store a Haypenny block very efficiently: the core data structure for a Block is 36 bytes, including hashtable overhead. Hence, even a relatively modest system by today's standards can store billions of Haypenny blocks, and larger systems and additional shards will allow for trillions of active blocks, which is far above long-term projections.
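The 36-byte figure and the modulo-based shard placement can be illustrated with one hypothetical layout. The actual field sizes and the way the modulo is derived from the identifier are not specified here, so the sketch below (which hashes the identifier before taking the modulo) is an assumption-laden illustration only.

    #include <stdint.h>
    #include <stdio.h>

    /* One hypothetical in-memory layout consistent with the stated
       36 bytes per block, including hash-table overhead:
       16 + 8 + 4 + 8 = 36 bytes (GCC/Clang packing shown).              */
    struct block_entry {
        uint8_t  id[16];              /* block identifier (size assumed)  */
        uint64_t balance;             /* balance in smallest units        */
        uint32_t realm;               /* currency ("realm") identifier    */
        struct block_entry *next;     /* hash-bucket chain pointer        */
    } __attribute__((packed));

    /* Illustrative shard selection: hash the identifier, then take the
       modulo over the number of shards.                                  */
    static unsigned shard_for(const uint8_t id[16], unsigned shard_count) {
        uint64_t h = 0xcbf29ce484222325ULL;        /* FNV-1a 64-bit basis  */
        for (int i = 0; i < 16; i++)
            h = (h ^ id[i]) * 0x100000001b3ULL;    /* FNV-1a 64-bit prime  */
        return (unsigned)(h % shard_count);
    }

    int main(void) {
        printf("sizeof(struct block_entry) = %zu\n", sizeof(struct block_entry));
        printf("shard for an all-zero id, 16 shards: %u\n",
               shard_for((const uint8_t[16]){0}, 16));
        return 0;
    }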
The Data Table Processor also handles memorializing each transaction component to non-volatile, decentralized storage to ensure absolute data integrity. This subsystem consists of a rotating set of reused log files that ensure a single operation incurs no more than a single IO to non-volatile storage. A transaction is not reported as completed to the end user until the data is physically written (not merely cached) to redundant non-volatile storage, both locally and on a server in at least one physically separate geographic location.
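A minimal sketch of the acknowledge-only-after-a-hard-write rule, assuming a POSIX file API with O_DSYNC so that each log append is a single synchronous storage IO; the real log record format, rotation logic, and remote replication path are not shown.

    #define _POSIX_C_SOURCE 200809L
    #include <fcntl.h>
    #include <unistd.h>

    /* Open the current log segment. O_DSYNC makes each write() return only
       after the data is on stable storage, so one append is one hard IO.  */
    int open_log_segment(const char *path) {
        return open(path, O_WRONLY | O_CREAT | O_APPEND | O_DSYNC, 0600);
    }

    /* Append one transaction record; only when this returns 0 may the
       caller acknowledge the transaction to the end user.                 */
    int log_append_durable(int log_fd, const void *record, size_t len) {
        ssize_t written = write(log_fd, record, len);
        return (written >= 0 && (size_t)written == len) ? 0 : -1;
    }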
This system is the most "conventional" of all Haypenny's components, as it is a front-end HTTP system handling APIs connected to an off-the-shelf RDBMS. Its purpose is to handle metadata services for the system.
Unlike the Haypenny transaction systems, this system has a much easier set of requirements:
The overall design approach for the Haypenny real-time systems is to perform the absolute minimum amount of processing possible on the real-time systems such that they can return their response as quickly as possible. This is enabled by Haypenny's near-real-time systems that run in the background and on separate hardware systems.
Near-real-time systems handle the other half of the high-scale, ultra-low-latency transaction logging system: rotating logs on each Data Table Processor system when they are full and moving the data to permanent locations on the network, including disaster recovery sites. This tier also creates periodic per-shard snapshots that allow near-instant recovery after a single-system failure.
The other effect of a centralized near-real-time system is that it can coordinate transaction rollbacks across shards. Split and Combine operations each consist of two Data Table Processor calls, potentially to two different shards. If one shard fails to memorialize the operation after the first call succeeded, the operation must be rolled back on the first shard. This is normally coordinated by the Front-end Processor, but that system could itself fail at the exact moment a rollback becomes necessary (indeed, if a Front-end Processor fails, there is a high chance it will fail between the two calls to the Data Table Processors, since this is an IO wait state). Because of this, part of the near-real-time system's job is to ensure that transactions always come in pairs; if they do not, it rolls back the operation as necessary to maintain system integrity.
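The pair check can be sketched as follows. The log-entry fields and the rollback hook are assumptions for illustration, and a production version would use an indexed structure rather than this quadratic scan.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stddef.h>

    /* Every split or combine should appear as exactly two log entries
       (one per shard/side), identified here by a shared operation id
       and a side number; entries with no partner are rolled back.       */
    typedef struct {
        uint64_t op_id;   /* id shared by both halves of one operation   */
        uint8_t  side;    /* 1 = first shard call, 2 = second shard call */
    } log_entry_t;

    /* Hypothetical hook that undoes the surviving half on its shard. */
    extern void rollback_half(const log_entry_t *orphan);

    void reconcile_pairs(const log_entry_t *entries, size_t n) {
        for (size_t i = 0; i < n; i++) {
            bool paired = false;
            for (size_t j = 0; j < n; j++) {
                if (i != j &&
                    entries[j].op_id == entries[i].op_id &&
                    entries[j].side  != entries[i].side) {
                    paired = true;
                    break;
                }
            }
            if (!paired)
                rollback_half(&entries[i]);   /* orphaned half: roll back */
        }
    }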
Haypenny's offline systems include a data-warehouse-style database that holds an (encrypted) copy of all transactions, blocks, and other system metadata for use in offline processing. This system is connected to applications that perform auditing of transactions (and allow external auditors to do the same), analysis of the system's usage for system management purposes*, and other analysis related to the running of the service, such as editorial control over metadata.
The offline system realm also includes the final storage location for the transaction log: a Write Once Read Many (WORM) storage mechanism that complies with FINRA Rule 4511 and SEC Rule 17a-4(f). All transactions are written to this medium within approximately 30 seconds of their execution on the real-time systems.
(*Currently, over 40 different application-level metrics are gathered from each Front-end Processor every 10 seconds and this data is sent to the data warehouse).
The management of a system of this kind (with potentially hundreds of server nodes) must necessarily be fully automated; for the Haypenny system, that automation is called "HayMan".
HayMan is a management console and set of supporting servers that allow system personnel to manage every aspect of the Haypenny system, such as deploying new versions of subsystem software, starting and stopping processing pools, troubleshooting issues for each process, and viewing statistics for each relevant process. This automation allows, for instance, the deployment of an entire subsystem pool with a single click.
The Haypenny system requires a secure, robust and scalable hardware and networking infrastructure. Haypenny chose Amazon Web Services (AWS) as its infrastructure provider after carefully considering the alternatives. The winning factors for the AWS decision included:
The Haypenny transaction system consists of two write operations through a total of three APIs.
The transaction write operations are Split and Combine. These two operations are conceptual mirrors of each other.
The Split operation takes an existing block and an (optional) amount as input and returns a new block.
The Combine operation takes an existing block and another block (the "combine block"), along with an (optional) unit amount:
The Haypenny API is a REST API, and returned values are in JSON format. The calls consist of the following:
The system includes a concept called an "IBlock", which is short for "Info Block". An Info Block is a 44-character string provided automatically every time the tx.Info API call is made for a Block. That string can be used in subsequent tx.Info calls instead of the normal block string (with the "iblock" parameter instead of the "block" parameter), and the call will return the same block information as if it were called with the original block string.
This mechanism is useful for securely verifying the balance of a block as a third party, or for securely allowing a Combine operation into a block, e.g. for a client-side-only payment.
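As an illustration of the read path only, a third party holding an IBlock could verify a balance with a call along the lines of the libcurl sketch below. The host name, URL shape, and use of an HTTP GET are hypothetical; only the tx.Info call name and the "iblock" parameter come from the description above.

    #include <stdio.h>
    #include <curl/curl.h>

    int main(void) {
        curl_global_init(CURL_GLOBAL_DEFAULT);
        CURL *curl = curl_easy_init();
        if (!curl)
            return 1;

        /* Example only: the real host, path, and IBlock value will differ. */
        const char *url =
            "https://api.example.invalid/tx.Info?iblock=EXAMPLE_44_CHAR_IBLOCK_STRING";

        curl_easy_setopt(curl, CURLOPT_URL, url);
        CURLcode rc = curl_easy_perform(curl);   /* JSON reply goes to stdout */
        if (rc != CURLE_OK)
            fprintf(stderr, "tx.Info failed: %s\n", curl_easy_strerror(rc));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return rc == CURLE_OK ? 0 : 1;
    }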
The Haypenny paradigm treats Block identifiers as the equivalent of physical cash and thus the system's security profile must be at the level of businesses dealing with large amounts of cash, which is to say that the system must maintain an extremely diligent level of threat detection and mitigation.
The Haypenny approach to creating a secure system for customers is fivefold:
Haypenny starts its approach to securing its systems at the enterprise level: it will adhere to standards appropriate to financial institutions and select only contracted providers (viz. Amazon AWS) that adhere to these standards as well.
These standards call for, among other things:
Amazon AWS, Haypenny's cloud computing provider, itself adheres to these standards and hosts a number of prominent financial institutions that hold certifications for these standards.
Block IDs (exposed externally as a "block string") are stored in Haypenny's system in an encrypted state using an industry standard encryption method (AES). This means Haypenny personnel never see "live" Block IDs.
Periodically (on the order of once per day), various forms of analysis are performed on Haypenny's offline systems, analyzing all system transactions and balances as a whole. This analysis allows the detection of many forms of DDoS attack and of attempts to use the system fraudulently or inappropriately, and it also provides a mechanism to double-check all transactions and block balances.
After the offline balances are verified and reconciled against the permanent record of transactions, the online systems (Data Table Processors) are verified as having the correct balances for all blocks. This mechanism employs a strategy whereby a batch of blocks is halted for trading, updates to the offline systems are completed, and then each block balance is checked against the real-time system with an ordinary user-level (tx.Info) call.
Along with this checking, the entire system is occasionally halted temporarily (on the order of one minute) so that the process can verify that the active block count in the real-time systems aligns with the block count in the final system of record. The combination of these two audits ensures that no spurious blocks can be stored on the real-time systems.
Because transactions travel over the Internet, there is always a possibility of network failure between an end-user client and Haypenny servers. A mechanism must therefore be available to ensure that a user can recover a lost Block when a call executes successfully but its response is never received.
The assumptions behind this approach are:
Because of these assumptions, the mechanism to handle this case resides in the offline system tier.
For Split operations, the client creates a "secret number" that is presumed to be a cryptographically secure pseudo-random number. It is 64 bits in size.
The "secret number" is included in the Split call, and that number is logged along with the rest of the transaction data.
In the event that a user does not obtain a response from the server due to a network error (i.e., the operation completed on the Haypenny system but the response was not received), the user may make an API call providing the secret number and the Block string used in the original call, and subsequently receive the lost Block string in an email.
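A client-side sketch of generating the secret, assuming a Linux client using getrandom(); other platforms would draw from their own cryptographically secure source.

    #include <stdint.h>
    #include <stdio.h>
    #include <sys/random.h>

    /* Draw the 64-bit "secret number" from the OS CSPRNG before issuing
       a Split call.                                                      */
    int make_split_secret(uint64_t *secret) {
        return getrandom(secret, sizeof *secret, 0) == sizeof *secret ? 0 : -1;
    }

    int main(void) {
        uint64_t secret;
        if (make_split_secret(&secret) != 0)
            return 1;
        /* The secret accompanies the Split call and is logged with the
           transaction, so a lost Block can later be recovered.           */
        printf("secret=%016llx\n", (unsigned long long)secret);
        return 0;
    }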
The first step in making a system perform is to define exactly what "performance" means in a given context.
The Haypenny system's primary performance metric is internal transaction latency. This metric is defined as the amount of time elapsed from when a user's request reaches Haypenny servers to when the response is initiated. (It is worth noting that internet latency is not taken into consideration in the Haypenny design, as it is both a straightforward problem and not entirely under Haypenny's control.)
Besides the time metric itself, another implied metric is consistency of response. From a user-experience standpoint, users often remember only the slowest response time for a given service, even if that time is unusual. As such, our latency metric is qualified: the highest latency within a given test is the one by which we measure our performance goals.
While low latency is desirable for the end user, beyond a certain point further reductions are no longer perceptible. Latency also determines the cost of the system, however: the longer resources are tied up responding to a request, the more it will cost.
A higher total time taken for each request on the Front-end Processor means more threads are required to service the same aggregate number of requests per second. An average response time of 1 ms means each thread can service 1,000 requests a second, so a Front-end Processor requires 20 threads to service 20,000 API calls per second, the design goal for the Front-end Processor. If the response time increases to 4 ms, then 80 threads are required, and so on. Threads do not scale linearly because of shared-resource contention, so only a finite number of threads is practical.
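The thread arithmetic above reduces to a one-line formula, ignoring the contention effects just noted:

    #include <math.h>     /* link with -lm */
    #include <stdio.h>

    /* Threads needed to sustain a target request rate at a given average
       per-request latency.                                               */
    static double threads_needed(double requests_per_sec, double latency_ms) {
        return ceil(requests_per_sec * (latency_ms / 1000.0));
    }

    int main(void) {
        /* Figures from the text: a 20,000 call/s Front-end Processor goal. */
        printf("1 ms -> %.0f threads\n", threads_needed(20000.0, 1.0));  /* 20 */
        printf("4 ms -> %.0f threads\n", threads_needed(20000.0, 4.0));  /* 80 */
        return 0;
    }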
The Haypenny system is designed to be a primary means of small payments for every single user of the Internet, every day. The system's architecture, therefore, includes the following assumptions about the ultimate scaling requirements:
Using the above assumptions, the following top-level scaling requirements were used in defining the architecture for the Haypenny system:
A key enabler of performance is to track as many metrics as possible. The Haypenny engine tracks over 40 real-time metrics covering everything from application-level activity to internal backend memory usage at a granular (object) level. All API calls to Haypenny are individually benchmarked, and a recent running-average execution time is kept. All metrics are continuously gathered by near-real-time systems and saved to a database for later reporting.
The Haypenny system has been extensively measured under a contrived load of random Split and Combine operations on a large number of Blocks.
Besides application-level benchmarks, measurements were taken at the various layers to isolate specific bits of software and hardware response times. Knowing the "absolute hardware speed" of the systems involved and comparing them to high-level benchmarks isolates the exact performance of the software itself and allows performance to be extrapolated to heavier loads.
Modern cloud computing provides a roughly linear way of scaling resources (within a very specific context, with caveats). You can purchase systems with various performance ratings, and those ratings can be tested with low-level benchmarks. Thus, if a ratio between the low-level benchmarks and the application-level benchmarks is established, a cost-performance curve can be created.
The Haypenny system's primary performance limitations ultimately come from two kinds of IO: network IO and non-volatile storage IO. A Haypenny Block operation (split or combine) requires four of each kind of IO: two network IOs to the Data Table Processors, two writes to non-volatile storage per Data Table Processor call, and one network IO per call to remote storage. Based on the cloud hardware purchased and how systems are configured, network IO operations can range from 20µs to 500µs, which can be measured separately. Non-volatile storage "hard write" operations (i.e., not buffered in any way) can range from approximately 50µs to 500µs depending on concurrent load. (The maximum estimates depend on the physical distance between data centers; the maximum numbers shown here assume a typical distribution of data centers within 100 kilometers of each other.)
Haypenny benchmarks revealed the following: