Author: Ming Wen
Apache APISIX is a high-performance, scalable microservices API gateway built on Nginx and etcd. Compared with traditional API gateways, APISIX offers dynamic routing, plugin hot reloading, and gRPC protocol transcoding, which makes it especially suitable for API management in a microservices architecture.
Apache APISIX is a fast-growing open source project. After being open sourced on June 6th, 2019, it quickly gained attention and interest from developers, and was included in the CNCF landscape a month later. Apache APISIX now has more than 1800 stars and nearly 60 contributors on GitHub, and a developer community of more than 1000 people.
Since it was open sourced, APISIX has released one version per month. To ensure the quality and stability of the code, it sticks to test-driven development and automated CI/CD.
Microservices API gateway
The API gateway is not a new concept; it has existed for more than ten years. As the entry point for Internet traffic, the API gateway processes business-related requests in a uniform way, which makes APIs more secure, faster and more accurate. Here are its traditional features:
- Reverse proxy and load balancing, which is consistent with the function of Nginx.
- Supporting dynamic functions such as dynamic upstream, dynamic SSL certificates, and dynamic rate limiting, which are not available in the open source version of Nginx.
- Supporting active and passive upstream health checks and circuit breaking. APISIX builds on the API gateway and extends it into a lifecycle management platform.
In recent years, business-related traffic is no longer initiated only by PC clients and browsers. More traffic comes from mobile phones, IoT devices, and so on, and it will keep increasing as 5G becomes popular. At the same time, with the shift to microservices architecture, traffic between services has also begun to grow explosively. In this new situation, there are new requirements for the API gateway:
- Being cloud native friendly, with a light architecture that is easy to containerize;
- Connecting with Prometheus, Zipkin, SkyWalking and other monitoring components;
- Supporting proxying of gRPC, Dubbo, WebSocket, MQTT and other protocols, as well as protocol conversion between HTTP and gRPC, in order to adapt to a wider range of scenarios;
- Performing identity verification and interfacing with the identity authentication services provided by Auth0 and Okta, to prioritize the security of traffic;
- Executing user functions dynamically, in a serverless way, to make the edge nodes of the gateway more flexible;
- Supporting plugin hot reloading, so that the service does not need to be reloaded when plugins are added, deleted or modified;
- Not locking users in, and supporting hybrid cloud deployment architectures;
- Finally, the gateway node itself must be stateless so that it can be scaled out and in freely.
With these functions, microservices only need to focus on the business itself. Business-related management functions, such as service discovery, circuit breaking, identity authentication, rate limiting, statistics and performance analysis, can all be handled at the gateway. From this perspective, the API gateway can not only replace all the functions of Nginx to handle north-south traffic, but also play the role of the Istio control plane and the Envoy data plane to handle east-west traffic.
There are already many gateway products to choose from. Why do we still explore this area? Here is our analysis of existing products:
- Industry leaders: mostly based on Java + JS, closed source, with poor performance and no room for custom development.
Most of the industry leaders’ technical solutions are based on Java and JS because they all started more than a decade ago, when Java was the only option. I believe that if Alibaba were starting now, it would also make different technology choices, but for applications designed to be widely used in that era, only Java was available. And if you wanted to make your API gateway product dynamic, there was basically only one choice: JS. So the final combination was Java + JS.
- Industry visionaries: most are based on OpenResty or Golang, and few are open source.
The technical solutions adopted by the visionaries in the Gartner reports are mostly based on OpenResty and Golang. The code bases of these industry visionaries are relatively large, which often means that the structure is complicated, and in the end they turn out to be really inefficient.
- Opportunity for Apache APISIX: lightweight, extreme performance and hot-pluggable plugins.
From the beginning, we realized that we had to be ten times better than the visionaries before we could succeed. So the product should be lightweight, with extreme performance, and ideally with a complete plugin ecosystem.
The development of Apache APISIX
In early April 2019, we started writing the first line of code. The product is called APISIX, so we chose to open source it on June 6th to make the date easy for everyone to remember. In July, APISIX entered the landscape of the CNCF, which is currently the most famous software foundation. In August, we had our first commercial user; with the APISIX kernel, we helped that user double its QPS in the first two weeks.
In September, APISIX officially went live at the open source user ke.com. It now processes 500 million requests of daily traffic while using only about 2% CPU.
In September we contacted Apache and prepared the donation, which was completed in October. This should be the first Apache project donated by a startup in China. It usually takes years for a project to enter Apache, but we made it within only a month.
In October, we achieved full platform support. In addition to common operating systems, APISIX also supports the two major architectures, x86 and ARM64, and we have completed regression testing of all cases.
APISIX is a test-driven project with a test coverage of more than 80%. As long as all the test cases pass, APISIX can be used normally in production.
The strengths of Apache APISIX
The following are the strengths of Apache APISIX, most of which are unmatched by competitors:
- The first API gateway project of the ASF.
- Excellent dynamic forwarding performance that ranks No. 1 in the test report.
- Lowest average request latency: 200 μs.
- Plugins, whether new or old, can be hot loaded and unloaded, which is unique among competitors.
- Plugins can be called at any phase of Nginx.
- All components are plugins, including the router itself, which is unique in the industry.
- Full support for ARM64.
- Full support for IPv6.
- Dynamic forwarding of the IoT protocol MQTT.
- Runs on either OpenResty or Tengine.
- High-performance validator: jsonschema (the highest-performance validator).
Evolution of API Gateway Product
As shown in the figures above, Figure ① is the initial product form of the gateway: the client is on the left, the service is on the right, and the gateway sits between them. As services are aggregated and classified (in Figure ②, the services are divided into two categories), the importance of the API gateway becomes clear: it must be transparent to the outside and distribute traffic according to users’ requests. At this point the API gateway becomes a single point of failure, which leads to the pattern shown in Figure ③. There are two API gateways, both of which can access any of the backend service clusters and back each other up. This is a highly available pattern because the client can send requests to any gateway. In Figure ④, the API Gateway is in charge of forwarding traffic, etcd is responsible for configuration storage, and the API Gateway Admin is the console for administrators, so it is far from enough for the API Gateway alone to be highly available.
Users can only be truly reassured if the API Gateway, the configuration center, and the control center all fully support high availability. As a microservices API gateway, it needs to be deployed flexibly: the API Gateway, etcd, and the management console each need to scale to any number of instances. This poses a great challenge for our open source version. What should the installation form look like?
What we need is for users to be able to deploy all three forms in the picture above: Admin, Gateway, and Gateway + Admin. All In One is the first solution. That is, there is only one “Gateway + Admin” package. When users need to deploy Gateway and Admin separately, they only need to modify the configuration and choose whether to enable Admin.
We simply distinguish the types of nodes through configuration. Any node can contain one part, Admin or Gateway, or both. In this way, users can easily achieve high availability, elastic scaling, distribution, clustering, and automatic failover.
Basic Architecture of API Gateway
The basic architecture of the API gateway is introduced below:
During the whole process, the administrator tells the gateway what to do through the Admin API, which is often called the control plane. In contrast, the data plane is the part that processes the real requests from users. Following the administrator’s rules, each request is matched against the routes to find its configuration, then the plugins in that configuration are executed and the request is forwarded to the specified upstream.
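The split between control plane and data plane can be illustrated with a minimal Python sketch. This is not APISIX code: the `Gateway` and `Route` classes, the `admin_add_route` and `handle` methods, and the `add_request_id` plugin are all hypothetical names chosen for the illustration, and route matching is reduced to a longest-prefix comparison.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# In this sketch a plugin is just a function that may modify the request.
Plugin = Callable[[dict], None]


@dataclass
class Route:
    uri_prefix: str                      # hypothetical: match on URI prefix only
    upstream: str                        # name of the upstream to forward to
    plugins: List[Plugin] = field(default_factory=list)


class Gateway:
    def __init__(self) -> None:
        self.routes: List[Route] = []

    def admin_add_route(self, route: Route) -> None:
        """Control plane: the administrator tells the gateway what to do."""
        self.routes.append(route)

    def handle(self, request: dict) -> Optional[str]:
        """Data plane: match a route, run its plugins, pick the upstream."""
        candidates = [r for r in self.routes
                      if request["uri"].startswith(r.uri_prefix)]
        if not candidates:
            return None                  # no route matched
        route = max(candidates, key=lambda r: len(r.uri_prefix))
        for plugin in route.plugins:
            plugin(request)
        return route.upstream


def add_request_id(request: dict) -> None:
    # Hypothetical plugin: attach a fixed request id header.
    request.setdefault("headers", {})["X-Request-Id"] = "demo-1"


gateway = Gateway()
gateway.admin_add_route(Route("/", "default-upstream"))
gateway.admin_add_route(Route("/orders", "orders-upstream", [add_request_id]))

request = {"uri": "/orders/42"}
upstream = gateway.handle(request)       # plugins run before forwarding
```

In a real gateway the forwarding step would proxy the request to the upstream; here `handle` only returns the upstream name so that the flow stays visible.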
There are three basic issues involved:
- Router: to match users’ requests; it needs to be feature-rich and high-performance.
- Validator: to verify whether the data submitted by the user is valid; it needs to be universal and high-performance.
- Configuration Center: to store configuration; it needs to be easy to use and support incremental subscription.
If these three basic requirements are met, the quality of the gateway is basically acceptable.
Technology selection of Apache APISIX
Key concept: What aspects need to be considered when selecting technology?
- Configuration Center: high availability, incremental subscription and historical records.
- Language or development platform: dynamic capabilities, high performance and a rich ecosystem around the gateway.
- Data verification: an open standard with an established ecosystem.
- Extra strengths: an excellent router.
- Shortcut for selection: use existing material from Gartner reports for analysis and comparison.
Configuration Center: etcd
We did not choose a traditional relational database, but etcd as the configuration center. The following factors were mainly considered at that time:
- Cluster support
- Historical modification records available
- Storage support, as the storage of some data is conditional
- Sub-millisecond change notification
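To make the ideas of historical records and incremental change notification concrete, here is a toy in-memory store in Python. It is only a sketch under stated assumptions: it does not use or imitate the real etcd client API, and `ConfigStore`, `put`, `delete`, `watch` and the `after_revision` parameter are hypothetical names. Every write gets a new revision number, the history is kept, and a watcher can ask for only the changes made after a revision it has already seen.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (revision, key, value); value is None when the key was deleted.
Event = Tuple[int, str, Optional[str]]


class ConfigStore:
    """Toy configuration center: revisions, history and watch callbacks."""

    def __init__(self) -> None:
        self.revision = 0
        self.data: Dict[str, str] = {}
        self.history: List[Event] = []
        self.watchers: List[Callable[[Event], None]] = []

    def put(self, key: str, value: str) -> int:
        self.revision += 1
        self.data[key] = value
        self._record((self.revision, key, value))
        return self.revision

    def delete(self, key: str) -> int:
        self.revision += 1
        self.data.pop(key, None)
        self._record((self.revision, key, None))
        return self.revision

    def _record(self, event: Event) -> None:
        self.history.append(event)       # keep the modification history
        for notify in self.watchers:     # push the change to subscribers
            notify(event)

    def watch(self, callback: Callable[[Event], None],
              after_revision: int = 0) -> None:
        """Replay events newer than after_revision, then subscribe."""
        for event in self.history:
            if event[0] > after_revision:
                callback(event)
        self.watchers.append(callback)


store = ConfigStore()
store.put("/routes/1", "upstream-a")

seen: List[Event] = []
store.watch(seen.append)                 # receives the full history first
store.put("/routes/1", "upstream-b")
store.delete("/routes/1")

late: List[Event] = []
store.watch(late.append, after_revision=2)   # only changes after revision 2
```

A gateway node that remembers the last revision it applied can reconnect and receive only what it missed, instead of reloading the whole configuration.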
After analysis, we found that etcd met our needs exactly. When we finally saw the official list of reasons to choose etcd (as shown below), we knew that we had chosen the right one.
Language and development platform: OpenResty
Basically, there are only two choices of development platform for a new API gateway: one is Lua, that is, OpenResty, and the other is Golang. Golang is a static language and its dynamic capabilities are not as good as Lua’s, so we chose OpenResty in the end. I have been involved in the OpenResty community since 2014, so I know it well. Since this is a brand-new project, we based it directly on the latest versions: OpenResty >= 1.15.8 or Tengine >= 2.3.2. Both are based on Nginx, and either of them can run APISIX.
A more general-purpose language is needed to draw on the surrounding ecosystem, and in this regard Lua is not comparable with C/C++. Commonly, we bridge the gap by calling C/C++ dynamic libraries or Golang-based libraries. From this perspective, choosing OpenResty as the base platform for developing APISIX is workable, because we don’t need to worry about a lack of peripheral libraries. In addition, OpenResty has been used more and more in API gateways in recent years, so there are many ready-made components available, and APISIX may only need to integrate them. However, during the integration process, we found that the open source versions of some projects were not well written, so we did a lot of work to improve them.
Data verification: jsonschema
The jsonschema data validation specification ranks first in Google search results. In other words, since there is already a validation specification that ranks first, we don’t have to invent another one ourselves, so we chose the jsonschema standard. Implementations of this standard cover mainstream languages such as C, Java and JS, and benchmark results are officially provided. We pay special attention to performance in all our selections, so a ready-made benchmark framework and results are very welcome.
Of course, we experienced some setbacks in practice: at first we wanted to pick an implementation from the official jsonschema list, but we didn’t find a suitable one because the degree of coupling was relatively high.
The first option we found was lua-rapidjson, which was not on the official recommendation list of jsonschema but was open sourced by Tencent. We are doing an open source project, so being simple and easy to use is what we pursue. But rapidjson is demanding to compile, and it is a C/C++ implementation, which is a relatively big problem for us. Besides, rapidjson only supports 95% of draft4, and some features are not supported, such as the frequently used default. So we modified an open source solution and implemented the new iresty/jsonschema. We mainly added the following:
- Support OpenResty at runtime
- Fully support draft4
- Fully support draft 6, draft 7
This library works the way a compiler does. We benchmarked it with a simple object that has two fields, a string and an int, validating it repeatedly. After a million runs, we compared the running times and found that the performance of iresty/jsonschema is 5–10 times that of lua-rapidjson and 500–1000 times that of gojsonschema (Golang).
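The compiler-like idea can be sketched in Python: walk the schema once, build a validator closure, and reuse that closure for every value. This is an illustration only, not the iresty/jsonschema implementation. It assumes a tiny subset of the specification (`type`, `required`, `properties` and `default`), assumes every sub-schema declares a `type`, and `compile_schema` is a hypothetical name.

```python
from typing import Any, Callable, Dict, Tuple

TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    # bool is a subclass of int in Python, so it is excluded explicitly.
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "object": lambda v: isinstance(v, dict),
}

Validator = Callable[[Any], Tuple[bool, str]]


def compile_schema(schema: Dict[str, Any]) -> Validator:
    """Walk the schema once and return a closure that validates values."""
    type_name = schema["type"]
    check_type = TYPE_CHECKS[type_name]
    required = list(schema.get("required", []))
    # Sub-schemas are compiled here, once, not on every validation call.
    props = [(name, "default" in sub, sub.get("default"), compile_schema(sub))
             for name, sub in schema.get("properties", {}).items()]

    def validate(value: Any) -> Tuple[bool, str]:
        if not check_type(value):
            return False, f"expected {type_name}"
        for name in required:
            if name not in value:
                return False, f"missing required field {name!r}"
        for name, has_default, default, validate_field in props:
            if name not in value:
                if has_default:
                    value[name] = default    # fill in the schema default
                continue
            ok, err = validate_field(value[name])
            if not ok:
                return False, f"{name}: {err}"
        return True, ""

    return validate


# An object with two fields, a string and an integer.
validate = compile_schema({
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "weight": {"type": "integer", "default": 1},
    },
})

conf = {"name": "node-a"}
ok, err = validate(conf)                 # also fills in the default weight
```

Because the schema is interpreted only once, repeated validation of the same kind of object pays only for the checks themselves, which is where a compiled validator can gain over one that re-reads the schema on every call.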
Route: Why was the lua-resty-radixtree route made?
Routing is the key to the API gateway. Without a high-performance router, there is no fast matching process, and the performance of the API gateway cannot be improved either. At the same time, the route matching conditions must be flexible and powerful enough. In addition to the most basic uri and host, other matching conditions such as IP address, request parameters, request headers, and cookies are also required.
I thought these would be enough, but users of open source projects still had other needs. Therefore, I added custom functions. Users can write Lua scripts, which again takes advantage of Lua’s dynamic nature. In other words, when some logic is particularly difficult to express or not yet supported, users can work around it by writing custom functions as matching rules.
lua-resty-radixtree brings together all of these characteristics. At present, lua-resty-radixtree can reach millions of matches per second on a single core. Compared with the previously selected libr3, the performance of radixtree is improved by at least an order of magnitude. What’s more, it can reference any built-in Nginx variable, and indexes can be created freely, which makes it easy to match on uri or on host + uri.
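The combination of prefix matching, extra conditions and custom functions can be sketched in Python. This is not the lua-resty-radixtree API. For brevity it uses a plain character trie rather than a compressed radix tree, and `Route`, `Router`, `filter_fn` and the upstream names are hypothetical. A uri ending in `*` is treated as a prefix; exact matches are tried first, then prefixes from longest to shortest.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Route:
    uri: str                                  # exact path, or prefix if it ends with "*"
    upstream: str
    host: Optional[str] = None                # optional host condition
    methods: Optional[List[str]] = None       # optional HTTP method condition
    filter_fn: Optional[Callable[[dict], bool]] = None  # user-defined rule

    def accepts(self, req: dict) -> bool:
        if self.host is not None and req.get("host") != self.host:
            return False
        if self.methods is not None and req.get("method") not in self.methods:
            return False
        if self.filter_fn is not None and not self.filter_fn(req):
            return False
        return True


class _Node:
    def __init__(self) -> None:
        self.children: Dict[str, "_Node"] = {}
        self.exact: List[Route] = []
        self.prefix: List[Route] = []


class Router:
    def __init__(self) -> None:
        self.root = _Node()

    def add(self, route: Route) -> None:
        is_prefix = route.uri.endswith("*")
        path = route.uri[:-1] if is_prefix else route.uri
        node = self.root
        for ch in path:
            node = node.children.setdefault(ch, _Node())
        (node.prefix if is_prefix else node.exact).append(route)

    def match(self, req: dict) -> Optional[str]:
        node = self.root
        candidates = list(node.prefix)        # prefix routes, shortest first
        reached_end = True
        for ch in req["uri"]:
            node = node.children.get(ch)
            if node is None:
                reached_end = False
                break
            candidates.extend(node.prefix)
        exact = node.exact if reached_end else []
        # Exact routes first, then the longest prefix; for prefixes of equal
        # length, the most recently added route is tried first.
        for route in list(exact) + candidates[::-1]:
            if route.accepts(req):
                return route.upstream
        return None


router = Router()
router.add(Route("/*", "default"))
router.add(Route("/orders/*", "orders"))
router.add(Route("/orders/*", "orders-beta",
                 filter_fn=lambda r: r.get("headers", {}).get("x-beta") == "1"))
router.add(Route("/health", "health", methods=["GET"]))

plain = router.match({"uri": "/orders/7", "method": "GET"})
beta = router.match({"uri": "/orders/7", "method": "GET",
                     "headers": {"x-beta": "1"}})
```

The `filter_fn` hook plays the role of the custom function described above: a rule that is awkward to express declaratively is written as ordinary code and evaluated only after the uri has already narrowed down the candidates.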
At this point the three selections were settled: lua-resty-radixtree as the router, iresty/jsonschema as the validator, and etcd as the configuration center. The prototype of Apache APISIX came into being.
Architecture of Apache APISIX
As shown above, this is the current architecture of Apache APISIX: the administrator is on the left, and users’ requests are on the right. After the administrator writes the configuration into etcd, a user’s request reaches APISIX, which performs route matching and obtains a result. The match may be a specific microservice or a serverless function.
As shown in the figure above, the software layers of APISIX do not follow the traditional layer-by-layer nesting approach; there are only a base layer and a business layer. The base layer is completely separated from the APISIX kernel and has no business binding, so you can use it in any OpenResty project.
Plugins can be hot-plugged without restarting the service. Common plugins such as rate limiting, identity authentication, request rewriting, URI redirection, OpenTracing and serverless are built in. APISIX’s plugin support differs from that of its competitors in the following ways:
- “Inherit” on demand;
- Plugins can be mounted at all phases of Nginx;
- Plugins can be hot loaded and unloaded;
- Plugins can be temporarily disabled.
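Three of the points above (mounting at any phase, hot loading and unloading, and temporary disabling) can be sketched as a small in-process registry in Python. This is not how APISIX implements plugins: `PluginRegistry` and its methods are hypothetical, the handlers are placeholders, and the phase list is a simplified assumption modeled on OpenResty phase names.

```python
from typing import Callable, Dict, List, Set

# Simplified assumption: a fixed list of request-processing phases.
PHASES = ("rewrite", "access", "header_filter", "body_filter", "log")

Handler = Callable[[dict], None]


class PluginRegistry:
    def __init__(self) -> None:
        self.plugins: Dict[str, Dict[str, Handler]] = {}
        self.disabled: Set[str] = set()

    def load(self, name: str, handlers: Dict[str, Handler]) -> None:
        """Hot load: register or replace a plugin while requests keep flowing."""
        unknown = set(handlers) - set(PHASES)
        if unknown:
            raise ValueError(f"unknown phases: {sorted(unknown)}")
        self.plugins[name] = dict(handlers)

    def unload(self, name: str) -> None:
        """Hot unload: remove the plugin without restarting anything."""
        self.plugins.pop(name, None)
        self.disabled.discard(name)

    def disable(self, name: str) -> None:
        self.disabled.add(name)           # keep it loaded, skip it for now

    def enable(self, name: str) -> None:
        self.disabled.discard(name)

    def run_phase(self, phase: str, ctx: dict) -> List[str]:
        """Run every enabled plugin that mounts a handler on this phase."""
        ran = []
        for name, handlers in list(self.plugins.items()):
            if name in self.disabled or phase not in handlers:
                continue
            handlers[phase](ctx)
            ran.append(name)
        return ran


registry = PluginRegistry()
registry.load("limit-count",
              {"access": lambda ctx: ctx.setdefault("seen", []).append("limit-count")})
registry.load("logger",
              {"log": lambda ctx: ctx.setdefault("seen", []).append("logger")})

ctx: dict = {}
first = registry.run_phase("access", ctx)     # limit-count runs
registry.disable("limit-count")
second = registry.run_phase("access", ctx)    # skipped while disabled
registry.enable("limit-count")
registry.unload("logger")
third = registry.run_phase("log", ctx)        # logger is gone
```

Since each request looks the registry up at every phase, a change made by `load`, `unload` or `disable` takes effect on the next request without any restart.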
Conclusion: there are three strengths of Apache APISIX:
- Configuration storage and distribution are based on etcd, which keeps the core lean.
- lua-resty-radixtree: high-performance prefix matching based on a radix tree.
- High-performance base library apisix/core: enhanced fetching of Nginx variables; optimized error log and table pool.
Apache APISIX performance test
Apache APISIX currently has more than 30 functions, which basically surpasses most open source competitors. Generally speaking, introducing dozens of functions like those mentioned above comes with a decline in performance. But how much is it?
As shown above, I wrote an empty service for the test. There is no business logic in it: some ngx_lua variables are passed to an empty function, fake_fetch, and the same is done in the following phases (http filter, log, etc.) without any calculation. Then I benchmarked APISIX against this empty service.
The comparison showed that the performance of APISIX decreased by only 15%, which means that if you accept a 15% performance decrease, you can enjoy all the functions mentioned earlier. On our compute instance on Alibaba Cloud, it runs 23–24k QPS on a single core and 68k QPS on 4 cores.
You are welcome to search for APISIX on GitHub to learn more.